Clean Your Data

Learning Objectives

After completing this unit, you’ll be able to:

  • Explain the importance of database hygiene.
  • Improve your data and follow best practices.

Squeaky Clean Data 

No matter how prepared you are for an emergency, communications won’t get to the intended user if you have bad data. According to Harvard Business Review, bad data costs US companies $3 trillion a year in wasted time and resources spent working around or fixing that data. This is especially true in email marketing if you are paying for contacts that you can’t send to because you have bad information about them. Take a data assessment to check how your company is doing, or make the fair assumption that your data hygiene can always improve.

Database of Record

So where to begin? Let’s start by talking about your database of record (DBOR), the place where you store the most accurate and up-to-date information about your customers. Step one is to make sure everyone (and we mean everyone) agrees on and, more importantly, uses your DBOR as their source of truth about customer information. Data will always change because life happens (customers move, change names, and so on). Bad data happens when companies make small changes in one place to get a quick fix, but they don’t take the time to update the DBOR.

For example, Cumulus Bank uses Sales Cloud as its DBOR. One of Cumulus’s branch managers found a typo in the data he is importing into Marketing Cloud. Instead of logging into Sales Cloud to ensure a customer’s original record is correct, the manager makes an update to his spreadsheet before uploading it into Marketing Cloud. While it might not seem like a big deal, these little changes over time can make a big impact. 

Make Data Hygiene a Priority

It’s time to make data hygiene a priority in your day-to-day workflow. Let’s review some possible risks and solutions to improve the quality of your data. 

Customer’s Info Changes

Risks
  • Duplicate records in Marketing Cloud
  • Unable to send an email to a subscriber due to bad info
Action Steps
  • Update DBOR daily. Create an automation to update your database of record when a customer’s data changes in Marketing Cloud.
  • Update external systems. If Marketing Cloud is the DBOR, create an automation that sends an updated file to an external FTP. This can also be done via API.

Inactive Subscribers

Risks
  • Deliverability issues caused by email service providers (ESPs) flagging your account
Action Steps
  • Try a reengagement campaign. It's worth trying to get unengaged subscribers engaged with your emails. If they don’t respond, remove them from your list.
  • Change email frequency. Send your inactive subscribers fewer messages.
  • Test subject lines. Consider trying different subject lines.
  • Test an offer. Try sending a winback email with a great offer.

Opted Out Subscribers

Risks
  • Processing times increase when the system has to review opted out email addresses before a send
Action Steps
  • Scrub lists. Remove opted out subscribers from your sending data extension.

MC Connect Sync

Risks
  • Processing times increase when the system has to sync contacts that won’t be used in Marketing Cloud
Action Steps
  • Delete contacts. Any contacts that aren’t needed in Sales Cloud should be deleted before syncing with Marketing Cloud.
  • Adjust sync settings. Only sync contacts or information that will be used in sends or personalization.
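
To make the "scrub lists" and "duplicate records" items concrete, here is a minimal Python sketch of the logic outside of Marketing Cloud. The function name, the field names (subscriber_key, email), and the sample rows are hypothetical and only for illustration. In a real account, this kind of cleanup would normally run through your own automation or query tooling.

```python
def scrub_send_list(subscribers, opted_out_emails):
    """Return subscribers with opted-out addresses and duplicate keys removed.

    Field names (subscriber_key, email) are illustrative assumptions.
    """
    # Normalize addresses so "Ben@example.com" matches "ben@example.com".
    blocked = {email.strip().lower() for email in opted_out_emails}
    seen_keys = set()
    cleaned = []
    for row in subscribers:
        email = row["email"].strip().lower()
        if email in blocked or row["subscriber_key"] in seen_keys:
            continue  # skip opted-out addresses and repeated keys
        seen_keys.add(row["subscriber_key"])
        cleaned.append(row)
    return cleaned


send_list = [
    {"subscriber_key": "001", "email": "ana@example.com"},
    {"subscriber_key": "002", "email": "Ben@example.com"},
    {"subscriber_key": "001", "email": "ana@example.com"},  # duplicate record
]
print(scrub_send_list(send_list, ["ben@example.com"]))
```

Only the first record survives: the second address is opted out and the third repeats an existing key.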

All About the Data

Clean data is great. Fast processing of that data is even better. So let’s review some best practices in data storage that can help with account optimization and email processing times.

  • Establish naming conventions and limit the number of data extensions in your account.

While there is no set limit to the number of data extensions in an account, an excessive number can degrade account performance for both UI and API activities. Implement naming conventions to reduce unnecessary data sources and to tidy up your account. Your naming conventions can reflect how you organize your data: by campaign, by date, or by whatever makes sense to your organization. Just be consistent and review them often.

Suggestion: To help identify data extensions that are temporary, use a naming convention, like _DELETEME at the end of the extension name for easy cleanup. Another suggestion would be to identify data extensions that are for tests versus final, for example _QA or _FINAL.
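
A suffix convention like this is easy to check in a script. The following Python sketch shows one way to classify names during an audit. The function name and the sample data extension names are hypothetical. Only the _DELETEME, _QA, and _FINAL suffixes come from the suggestion above.

```python
def classify_data_extension(name):
    """Classify a data extension name by its suffix (case-insensitive)."""
    upper = name.upper()
    if upper.endswith("_DELETEME"):
        return "temporary"
    if upper.endswith("_QA"):
        return "test"
    if upper.endswith("_FINAL"):
        return "final"
    return "unlabeled"


# Hypothetical names, as they might appear in an exported list.
names = ["Spring_Promo_DELETEME", "Welcome_Series_QA", "Welcome_Series_FINAL", "Old_Import"]
for name in names:
    print(name, "->", classify_data_extension(name))
```

Names that come back as "unlabeled" are good candidates for review the next time you audit the account.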

  • Limit the amount and content types of stored data.

To improve SQL performance and data processing times, limit the amount of data stored in the system. Only store data you use for campaign segmentation and personalization. And only store sensitive and identifying data (like date of birth or Social Security number) if it is essential. Just because you can import any type of data into Marketing Cloud doesn’t mean you should.

  • Limit column length and use the correct data type.

Set each data extension column’s length based on the maximum size of the data to be stored in it. And use the correct type for the data stored. For example, use a date type rather than a text type if storing a birthday. One exception: SubscriberKey is stored in the system as a text column, even if you are using a numeric value.

Example: When storing a two-digit state code, the column should be limited to 2 characters rather than leaving the default of 50 characters. 
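
If you are not sure how long a column needs to be, you can measure a sample of the data before creating the data extension. This Python sketch reports the longest value seen per column. The function name, column names, and sample rows are hypothetical, and a sample only gives a lower bound, so leave headroom where longer values are plausible.

```python
def longest_value_per_column(rows):
    """Return the longest observed value length for each column."""
    lengths = {}
    for row in rows:
        for column, value in row.items():
            lengths[column] = max(lengths.get(column, 0), len(str(value)))
    return lengths


sample_rows = [
    {"state": "CA", "city": "Sacramento"},
    {"state": "NY", "city": "Albany"},
]
print(longest_value_per_column(sample_rows))
```

Here the state column never needs more than 2 characters, which matches the two-character state code example.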

  • Limit overall table size.

For peak performance, keep the total size of the table under 8,000 characters. Total size is determined by the sum of all column lengths in the data extension.

Example: 20 columns of 100 characters (2,000 total) is fine, while 8 columns of 1,000 characters (8,000 total) is not under the limit.
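
The arithmetic is simple enough to check in a few lines. This Python sketch sums the column lengths and compares the total to the 8,000-character guideline. It treats "under 8,000" as a strict less-than, which is an assumption, and the function and constant names are illustrative.

```python
MAX_TOTAL_LENGTH = 8000  # guideline from the text, in characters


def total_column_length(column_lengths):
    """Sum the defined lengths of all columns in a data extension."""
    return sum(column_lengths)


def is_under_limit(column_lengths, limit=MAX_TOTAL_LENGTH):
    """True when the summed column lengths stay strictly under the limit."""
    return total_column_length(column_lengths) < limit


narrow_table = [100] * 20   # 20 columns of 100 characters
wide_table = [1000] * 8     # 8 columns of 1,000 characters

print(total_column_length(narrow_table), is_under_limit(narrow_table))
print(total_column_length(wide_table), is_under_limit(wide_table))
```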

  • Create a primary key for any data extensions that are updated or added to.

The primary key is used to define a unique row in this particular data extension. Primary keys are required for imports or queries that are not using the overwrite option. One exception: Do not add primary keys to triggered send data extensions because your systems may attempt to resend the same data twice during a retry.

Examples: Subscriber_Key for sendable data extensions; Product_ID for a product lookup table; ZipCode for a region lookup table based on zip codes
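
To see why a primary key matters for add-and-update operations, consider this small Python sketch. It is not how Marketing Cloud is implemented. It only illustrates the idea that a key lets incoming rows replace matching rows instead of piling up as duplicates. The function name and the price field are hypothetical. Product_ID is taken from the examples above.

```python
def add_and_update(table, incoming, key):
    """Add new rows and update existing ones, matching rows on the primary key."""
    merged = {row[key]: row for row in table}
    for row in incoming:
        # Existing fields are kept unless the incoming row overrides them.
        merged[row[key]] = {**merged.get(row[key], {}), **row}
    return list(merged.values())


products = [{"Product_ID": "P1", "price": 10}]
updates = [
    {"Product_ID": "P1", "price": 12},  # matches an existing key: update
    {"Product_ID": "P2", "price": 5},   # new key: add
]
print(add_and_update(products, updates, key="Product_ID"))
```

Without a key to match on, the P1 update would have no row to replace and the table would end up with two P1 records.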

  • Only update with new data versus a full data refresh.

To help improve data processing time and reduce file size, use the add and update data extension functions rather than a data overwrite.

Example: Cumulus Bank uploads a daily file of customer activity into Marketing Cloud. To help with processing, the company’s digital marketer asks the IT team to provide a smaller file each day that includes only the delta, or changed records.
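
A delta file like the one in this example could be produced by comparing today’s extract with yesterday’s. The Python sketch below is one minimal way an IT team might do that, assuming both extracts fit in memory and share a key column. The function name, field names, and sample records are hypothetical.

```python
def delta_records(previous, current, key):
    """Return rows from current that are new or changed compared with previous."""
    previous_by_key = {row[key]: row for row in previous}
    return [row for row in current if previous_by_key.get(row[key]) != row]


yesterday = [
    {"customer_id": "C1", "city": "Austin"},
    {"customer_id": "C2", "city": "Denver"},
]
today = [
    {"customer_id": "C1", "city": "Austin"},   # unchanged: left out
    {"customer_id": "C2", "city": "Boulder"},  # changed: included
    {"customer_id": "C3", "city": "Tampa"},    # new: included
]
print(delta_records(yesterday, today, key="customer_id"))
```

Note that this sketch does not report records that were deleted since yesterday. Handling those would need a separate step.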

  • Use shared data extensions across business units.

Rather than query or create multiple copies of data that can cause inconsistencies, use shared data extensions to make data available to other business units. 

Example: Cumulus Bank’s parent account (its corporate account) has a shared data extension for high-risk customers that is shared with all business units (bank branches). The corporate account maintains that data extension with the newest information so that bank branches have the most up-to-date information when they need it. 


Want to learn more about sending optimization? Review the Email Send Speed Optimization badge.

Data Retention

One of the easiest options for ongoing data hygiene is to create a retention plan to limit the number of data extensions in your account and the amount of data you store. When you create a data extension in Marketing Cloud, you can choose how you want to apply data retention by selecting to delete specific data or the entire data extension. 

Follow this advice. 

  • Allow data to expire when it is no longer needed.
  • Regularly and automatically remove data extensions that are no longer needed.
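
In Marketing Cloud, retention is a setting on the data extension rather than something you code. Still, the underlying rule is easy to picture. This Python sketch illustrates a simple "keep rows newer than N days" policy. The function name, the created field, and the dates are hypothetical and chosen only for the example.

```python
from datetime import date, timedelta


def rows_to_keep(rows, today, retention_days):
    """Keep rows created on or after the cutoff date; older rows expire."""
    cutoff = today - timedelta(days=retention_days)
    return [row for row in rows if row["created"] >= cutoff]


rows = [
    {"id": 1, "created": date(2024, 6, 15)},
    {"id": 2, "created": date(2024, 4, 1)},  # older than 30 days: expires
]
kept = rows_to_keep(rows, today=date(2024, 6, 30), retention_days=30)
print([row["id"] for row in kept])
```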

Learn more about managing your data in the module, Marketing Cloud Data Management.  

Prepped and Ready

In addition to saving your company money, reviewing data hygiene and storage is an important aspect of emergency preparedness. It’s like checking the expiration date of the food in your emergency kit. Just remember to focus on the following key principles.

  • Evaluate your account regularly. Conduct an audit of your Marketing Cloud account and revisit it often.
  • Commit to data hygiene. Remove data that is incorrect, incomplete, improperly formatted, or duplicated in your company’s database of record (DBOR).
  • Scrub your send list. Make sure to review your subscriber list regularly and only send to engaged subscribers.

Great job, prepper! You have reviewed your account, identified risks that need to be addressed, and are prepared for an account emergency. Better yet, let’s hope your diligence avoids emergencies altogether! 

