Review Data Ingestion and Modeling Phases
After completing this unit, you’ll be able to:
- Examine how data is ingested into Customer 360 Audiences.
- Apply basic data modeling concepts to your account.
Phase 1: Data Ingestion
In the overview, we indicated that data is first ingested from the source and stored in our system in a data source object, but we didn’t get into the details of how you actually connect to and access data in the source system. Data is retrieved from the source by way of a Connector. Connectors establish communication with the servers where your data sources reside, so that your data can be accessed continually. Data Streams complement the connectors: they dictate how often and when those connections are established, and they also populate the data into the data source object once the connector gains access.
The following connectors are available to Customer 360 Audiences users.
- Marketing Cloud Email Studio
- Marketing Cloud MobileConnect
- Marketing Cloud MobilePush
- Marketing Cloud Data Extensions
- Sales and Service Cloud
- AWS S3
Marketing Cloud Email Studio, MobileConnect, and MobilePush
Customer 360 Audiences provides starter data bundles that give you predefined data sets for email and mobile. Because these are all known system tables, these bundles take you all the way from step (1) importing the data set as-is to step (2) introducing it automatically to the data model layer, which means that within a few clicks, you are ready to get to work on your business use cases. The behavioral, engagement-oriented data sets retrieved by these connectors are refreshed hourly; the profile data sets are refreshed daily.
Marketing Cloud Data Extensions
You can also access custom data sets via Marketing Cloud Data Extensions. For example, you can use this connector to ingest ecommerce or survey data that you’ve already imported into Marketing Cloud. Simply provision your Marketing Cloud instance in Customer 360 Audiences, after which you’ll see the list of Data Extensions that can be brought in. Depending on how you choose to export your data extension from Marketing Cloud—Full Refresh or New/Updated Data Only—the connector retrieves the data daily for the former option or hourly for the latter. Keep in mind that unlike the starter data bundles, which both (1) import the data and (2) model the data for you, with this connector you must complete the modeling step yourself, since the data set is custom.
Sales and Service Cloud
Once you authenticate your Sales and Service Cloud instance, you can choose one object per data stream to connect to your Customer 360 Audiences account, either by selecting from a list of available objects or by searching. The data is refreshed hourly, and once a week there’s also a full refresh of the data.
AWS S3
This option creates a data stream from data stored on an Amazon Web Services S3 location. The connector accommodates custom data sets, and you have the option to retrieve data hourly, daily, weekly, or monthly. As with other custom data sets, the connector completes the (1) import step, and you subsequently (2) map the data to the model.
For any one of these connectors, the Refresh History tab is a good resource to validate that the data is being retrieved at the expected cadence and without errors. Should there be a retrieval error, the Status column (shown in the following image) provides more information about the error.
Extend Using Formula Fields
Connectors fetch the original shape of the data by retrieving the full source field list, and you can create additional calculated fields if you choose. For example, if the connector retrieves an age field as a raw number and you want to band the data into age groups like 18–24 years, 25–34 years, 35–44 years, and 45+ years, you can achieve that by adding a new formula field (a combination of IF statements and AND/OR operators) to the data source object, derived from the age source field.
There are several formula functions you can use. They fall into four categories.
- Text manipulation
- For example: EXTRACT(), FIND(), LEFT(), SUBSTITUTE()
- Type conversions
- For example: ABS(), MD5(), NUMBER(), PARSEDATE()
- Date calculations
- For example: DATE(), DATEDIFF(), DAYPRECISION()
- Logical expressions
- For example: IF(), AND(), OR(), NOT()
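As a sketch of how the earlier age-banding example might look, here is a nested formula using only the IF() and AND() functions listed above. Treat it as illustrative: the field name age is hypothetical, and the exact syntax for referencing source fields depends on your setup.

```
IF(age <= 24, "18-24 years",
  IF(AND(age >= 25, age <= 34), "25-34 years",
    IF(AND(age >= 35, age <= 44), "35-44 years",
      "45+ years")))
```

Each IF() returns its second argument when the condition is true; otherwise it falls through to the next band, so the final branch catches everyone 45 and older.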
Phase 2: Data Modeling
We mentioned that once all the data streams are ingested into the system, there’s a source-to-target mapping experience that utilizes the Cloud Information Model (CIM) to normalize the data sources. For example, you can use the CIM’s notion of an Individual ID to tag the source field corresponding to the individual who purchased a device (one data stream), called about a service issue (another data stream), received a replacement (yet another data stream)—and then review every event in the customer journey (yep, one more data stream). Data mapping helps you draw the lines between the applicable fields in the data sources to help tie everything together. Pay special attention to attributes like names, email addresses, and phone numbers (or similar identifiers). This information helps you link an individual’s data together and ultimately build a unified profile of the customer. But everything has a place and a connection to something else—you just need to draw the line.
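To make the mapping idea concrete, here is a minimal sketch in plain Python (not a Customer 360 Audiences API) of what source-to-target mapping accomplishes: each stream names its identifier field differently, and tagging each one as the Individual ID lets you stitch every event into one profile. All stream, field, and ID names here are invented for illustration.

```python
# Hypothetical records from three separate data streams.
# Each stream uses a different name for its identifier field.
purchases = [{"buyer_id": "IND-1", "event": "purchased device"}]
service_calls = [{"caller_id": "IND-1", "event": "called about service issue"}]
replacements = [{"recipient_id": "IND-1", "event": "received replacement"}]

# Source-to-target mapping: which source field in each stream
# corresponds to the model's notion of an Individual ID.
individual_id_field = {
    "purchases": "buyer_id",
    "service_calls": "caller_id",
    "replacements": "recipient_id",
}

def unify(streams):
    """Group every event under the Individual ID its stream maps to."""
    profiles = {}
    for stream_name, records in streams.items():
        id_field = individual_id_field[stream_name]
        for record in records:
            individual = record[id_field]
            profiles.setdefault(individual, []).append(record["event"])
    return profiles

profiles = unify({
    "purchases": purchases,
    "service_calls": service_calls,
    "replacements": replacements,
})
# profiles["IND-1"] now holds all three events for that individual.
```

Without the mapping table, nothing ties buyer_id, caller_id, and recipient_id together; with it, the three streams resolve to a single customer.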
The CIM is designed to be extensible both by adding more custom attributes to an existing standard object and by adding more custom objects. When you utilize standard objects, relationships between objects light up automatically when the fields relating the two objects are both mapped. In a later unit, we walk you through an example of when you might need to define the relationship between objects in cases where you have added custom objects to your model.
Now that you’re familiar with the basic concepts of data ingestion and modeling, you’re ready to move on to concrete examples.