Clean Your Data
Learning Objectives
After completing this unit, you’ll be able to:
- Explain the importance of database hygiene.
- Improve your data and follow best practices.
Squeaky Clean Data
No matter how prepared you are for an emergency, communications won’t reach the intended user if you have bad data. According to Harvard Business Review, bad data costs US companies $3 trillion per year in time and resources spent working around or fixing that data. The cost is especially visible in email marketing, where you may be paying for contacts you can’t send to because the information you hold about them is wrong. Take a data assessment to check how your company is doing, or simply make the fair assumption that your data hygiene can always improve.
Database of Record
So where to begin? Let’s start with your database of record (DBOR), the place where you store the most accurate and up-to-date information about your customers. Step one is to make sure everyone (and we mean everyone) agrees on and, more importantly, uses your DBOR as the source of truth for customer information. Data will always change because life happens (customers move, change names, and so on). Bad data happens when companies make a small change in one place as a quick fix but don’t take the time to update the DBOR.
For example, Cumulus Bank uses Sales Cloud as its DBOR. One of Cumulus’s branch managers finds a typo in the data he is importing into Marketing Cloud Engagement. Instead of logging in to Sales Cloud to correct the customer’s original record, the manager fixes his spreadsheet and uploads it into Marketing Cloud Engagement. It might not seem like a big deal, but over time these small, local changes add up and the DBOR drifts away from the data actually in use.
Make Data Hygiene a Priority
It’s time to make data hygiene a priority in your day-to-day workflow. Let’s review some possible risks and solutions to improve the quality of your data.
Situation | Risk | Action Steps |
---|---|---|
Customer’s Info Changes | | |
Inactive Subscribers | | |
Opted Out Subscribers | | |
MC Connect Sync | | |
All About the Data
Clean data is great. Fast processing of that data is even better. So let’s review some best practices in data storage that can help with account optimization and email processing times.
- Establish naming conventions and limit the number of data extensions in your account.
While there is no set limit to the number of data extensions in an account, an excessive number can degrade account performance for both UI and API activities. Implement naming conventions to reduce unnecessary data sources and to tidy up your account. Your naming conventions can reflect how you organize your data: by campaign, by date, or by whatever makes sense to your organization. Just be consistent, and review the conventions often.
Suggestion: To help identify data extensions that are temporary, use a naming convention, like _DELETEME at the end of the extension name for easy cleanup. Another suggestion would be to identify data extensions that are for tests versus final, for example _QA or _FINAL.
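As a minimal sketch of how such a convention can support cleanup, the Python below groups data extension names by their suffix. The suffixes come from the suggestion above. The function name `classify_data_extensions` and the sample names are hypothetical, and the sketch assumes you already have a list of data extension names exported from your account by some other means.

```python
from collections import defaultdict

# Suffixes taken from the naming suggestion above; adjust to your own convention.
SUFFIXES = ("_DELETEME", "_QA", "_FINAL")


def classify_data_extensions(names):
    """Group data extension names by the convention suffix they end with.

    Names that match no suffix are collected under "UNLABELED" so they
    can be reviewed by hand.
    """
    groups = defaultdict(list)
    for name in names:
        upper = name.upper()
        for suffix in SUFFIXES:
            if upper.endswith(suffix):
                groups[suffix].append(name)
                break
        else:
            # No suffix matched this name.
            groups["UNLABELED"].append(name)
    return dict(groups)


# Hypothetical sample names, for illustration only.
sample = ["Spring_Promo_FINAL", "Spring_Promo_QA", "Temp_Import_DELETEME", "Customers"]
groups = classify_data_extensions(sample)
print(groups["_DELETEME"])
```

Anything that lands in the `_DELETEME` group is a cleanup candidate, and anything under `UNLABELED` is a name that does not yet follow the convention.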
- Limit the amount and content types of stored data.
To improve SQL performance and data processing times, limit the amount of data stored in the system. Only store data you use for campaign segmentation and personalization. And only store sensitive and identifying data (like date of birth or Social Security number) if it is essential. Just because you can import any type of data into Marketing Cloud Engagement doesn’t mean you should.
- Limit column length and use the correct data type.
Set each data extension column’s length based on the maximum size of the data it will store. Also use the data type that matches the data: for example, use a date type rather than a text type when storing a birthday. One exception: SubscriberKey is stored in the system as a text column, even if you are using a numeric value.
Example: When storing a two-digit state code, the column should be limited to 2 characters rather than leaving the default of 50 characters.
- Limit overall table size.
For peak performance, keep the total size of a table row under 8,000 characters. The total is the sum of all column lengths in the data extension.
Example: 20 columns of 100 characters (2,000 in total) is fine, while 8 columns of 1,000 characters (8,000 in total) is not under the limit.
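The arithmetic behind this guideline can be sketched in a few lines of Python. The 8,000 figure comes from the text above; the function names and the column names are hypothetical, and the check uses declared column lengths only.

```python
ROW_LIMIT = 8000  # guideline from the text: keep the summed column lengths under this


def total_row_length(columns):
    """Return the sum of declared column lengths for a data extension.

    `columns` maps a column name to its declared maximum length.
    """
    return sum(columns.values())


def within_guideline(columns, limit=ROW_LIMIT):
    """True when the summed column lengths stay strictly under the limit."""
    return total_row_length(columns) < limit


# The two cases from the example above, with hypothetical column names.
twenty_small = {f"col_{i}": 100 for i in range(20)}
eight_large = {f"col_{i}": 1000 for i in range(8)}

print(total_row_length(twenty_small), within_guideline(twenty_small))  # 2000 True
print(total_row_length(eight_large), within_guideline(eight_large))    # 8000 False
```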
- Create a primary key for any data extensions that are updated or added to.
The primary key is used to define a unique row in this particular data extension. Primary keys are required for imports or queries that are not using the overwrite option. One exception: Do not add primary keys to triggered send data extensions because your systems may attempt to resend the same data twice during a retry.
Examples: Subscriber_Key for sendable data extensions; Product_ID for a product lookup table; ZipCode for a region lookup table based on zip codes
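To illustrate why an add-and-update import needs a primary key, here is a minimal sketch in plain Python. It is a generic illustration, not Marketing Cloud Engagement’s implementation. The `upsert` function and the sample product rows are hypothetical; `Product_ID` is borrowed from the examples above.

```python
def upsert(table, incoming, key):
    """Add new rows and replace existing ones, matching on the primary key.

    `table` maps each primary key value to its row (a dict).
    Without a key there would be no way to tell an update from a new row.
    """
    for row in incoming:
        table[row[key]] = row  # same key: replace; unseen key: add
    return table


# Hypothetical product lookup table keyed on Product_ID.
products = {"P1": {"Product_ID": "P1", "Name": "Checking"}}
incoming = [
    {"Product_ID": "P1", "Name": "Premium Checking"},  # update
    {"Product_ID": "P2", "Name": "Savings"},           # add
]
upsert(products, incoming, key="Product_ID")
print(len(products), products["P1"]["Name"])  # 2 Premium Checking
```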
- Only update with new data versus a full data refresh.
To help improve data processing time and reduce file size, use the add and update data extension functions versus a data overwrite.
Example: Cumulus Bank uploads a daily file of customer activity into Marketing Cloud Engagement. To help with processing, the company’s digital marketer asks the IT team to provide a smaller daily file that includes only the delta, that is, the new or changed records.
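One way an IT team might produce such a delta file is sketched below, assuming both the previous and current extracts fit in memory as lists of dictionaries. The function name `compute_delta`, the `City` field, and the sample rows are hypothetical. Deleted records are not handled here.

```python
def compute_delta(previous, current, key="Subscriber_Key"):
    """Return rows in `current` that are new or changed compared with `previous`.

    Both arguments are lists of dicts; `key` names the primary key field.
    Rows that were removed from `current` are not reported.
    """
    previous_by_key = {row[key]: row for row in previous}
    return [row for row in current if previous_by_key.get(row[key]) != row]


yesterday = [
    {"Subscriber_Key": "001", "City": "Austin"},
    {"Subscriber_Key": "002", "City": "Denver"},
]
today = [
    {"Subscriber_Key": "001", "City": "Austin"},   # unchanged
    {"Subscriber_Key": "002", "City": "Boulder"},  # changed
    {"Subscriber_Key": "003", "City": "Miami"},    # new
]

delta = compute_delta(yesterday, today)
print([row["Subscriber_Key"] for row in delta])  # ['002', '003']
```

Only the two rows in `delta` would need to be uploaded with an add-and-update import, instead of overwriting the full table.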
- Use shared data extensions across business units.
Rather than query or create multiple copies of data that can cause inconsistencies, use shared data extensions to make data available to other business units.
Example: Cumulus Bank’s parent account (its corporate account) has a shared data extension for high-risk customers that is shared with all business units (bank branches). The corporate account maintains that data extension with the newest information so that bank branches have the most up-to-date information when they need it.
Data Retention
One of the easiest options for ongoing data hygiene is to create a retention plan to limit the number of data extensions in your account and the amount of data you store. When you create a data extension in Marketing Cloud Engagement, you can choose how you want to apply data retention by selecting to delete specific data or the entire data extension.
Follow this advice.
- Allow data to expire when it is no longer needed.
- Regularly and automatically remove data extensions that are no longer needed.
Prepped and Ready
In addition to saving your company money, reviewing data hygiene and storage is an important aspect of emergency preparedness. It’s like checking the expiration date of the food in your emergency kit. Just remember to focus on the following key principles.
- Evaluate your account regularly. Conduct an audit of your Marketing Cloud Engagement account and revisit it often.
- Commit to data hygiene. Remove data that is incorrect, incomplete, improperly formatted, or duplicated in your company’s database of record (DBOR).
- Scrub your send list. Make sure to review your subscriber list regularly and only send to engaged subscribers.
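As a rough sketch of the last principle, the Python below filters a subscriber list down to recently engaged subscribers. What counts as "engaged" is an assumption here: a hypothetical `last_engaged` date within a 180-day window. Choose a definition and window that fit your own sending program.

```python
from datetime import date, timedelta


def engaged_subscribers(subscribers, today, window_days=180):
    """Keep subscribers whose last engagement falls within the window.

    Subscribers with no recorded engagement (None) are excluded.
    """
    cutoff = today - timedelta(days=window_days)
    return [
        s for s in subscribers
        if s["last_engaged"] is not None and s["last_engaged"] >= cutoff
    ]


# Hypothetical subscriber rows, for illustration only.
subscribers = [
    {"Subscriber_Key": "001", "last_engaged": date(2024, 5, 1)},
    {"Subscriber_Key": "002", "last_engaged": date(2023, 1, 15)},
    {"Subscriber_Key": "003", "last_engaged": None},
]
send_list = engaged_subscribers(subscribers, today=date(2024, 6, 1))
print([s["Subscriber_Key"] for s in send_list])  # ['001']
```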
Great job, prepper! You have reviewed your account, identified risks that need to be addressed, and are prepared for an account emergency. Better yet, let’s hope your diligence avoids emergencies altogether!
Resources
- Salesforce Help: Best Practices for Data Extensions and Query Activities
- External: Impact of Bad B2B Marketing Data
- External: Bad Data Costs the US $3 Trillion Per Year
- External: Assess Whether You Have a Data Quality Problem
- Trailhead: Marketing Cloud Engagement Data Management