Query Data with SQL
Learning Objectives
After completing this unit, you’ll be able to:
- Define SQL and why it is used.
- Use SQL within Marketing Cloud Engagement.
- Identify best practices for using SQL.
What Is SQL?
You’ve moved in to Marketing Cloud Engagement, unpacked, and even started to arrange your space. Now for the finishing touches. As we mentioned earlier, Marketing Cloud Engagement data extensions and Contact Builder use a relational database. So, you need a good way to get to all of that rich data. Enter: Structured Query Language (or SQL), a domain-specific language that can do just that. In Marketing Cloud Engagement, an SQL query activity is used to execute queries and retrieve data for reporting or segmenting audiences.
Use SQL in Marketing Cloud Engagement
First, let’s review how a few specific SQL commands are used in Marketing Cloud Engagement.
- SELECT: command to locate data on a database
- FROM: location where data reside within Marketing Cloud Engagement (usually a data extension)
- JOIN: allows the query to search multiple tables and/or data extensions
- WHERE: used to filter out data you do/don’t want
Here’s an example.
SELECT emailaddress as ‘Email_Address’, favoritecolor as ‘Favorite_Color’ FROM [MasterData Extension] m INNER JOIN JuneMarketingSend j ON m.emailaddress = j.emailaddress WHERE m.emailaddress is not NULL
Need a translation? Running this query retrieves email and favorite color from the master data extension, and joins it with any matching email addresses from the June marketing send. Additionally, it only returns emails that are not missing. Also notice the m and the j. These are SQL aliases that are used to give a table, or a field in a table, a temporary name. This is especially helpful when you have a common field name between data extensions, like CustomerID or EmailAddress. You can also see it written like the following.
SELECT o.OrderID, o.OrderDate, c.CustomerName FROM Customers AS c, Orders AS o WHERE c.CustomerID=o.CustomerID
SQL Data Views
System data views are Salesforce-created queries that you can use to find information about your subscribers. Just remember, you won’t be able to make changes to these precreated data views. Here are some of the most commonly used views.
Data View |
Query |
Information |
---|---|---|
Bounce |
_bounce |
Identify and then suppress commonly bounced email addresses from sends. |
Click and Open |
_Click _Open |
Query these data views in Automation Studio to view click and open data for emails from your Marketing Cloud Engagement account. Helps to identify additional messaging opportunities by indicating subscriber engagement on a specific JobID. |
Complaint |
_Complaint |
Query this data view in Automation Studio to view complaints data related to emails from your Marketing Cloud Engagement account. Use to prune your lists to ensure more accurate audiences and improve deliverability. |
Journey |
_Journey |
Find a journey's status, created and last modified date, and other general journey information using this Journey Builder data view. |
So how do you use these data views? First create a data extension to store the data you need from these views, then navigate to Automation Studio to create a query activity based on the desired data view table. After you complete and run the automation, the output is stored in your created data extension.
Here is a query example that pulls data from the complaint data view for the last 6 months and provides SubscriberKey, JobIDs, and Bounce Reasons.
SELECT SubscriberKey,JobID,SMTPBounceReason FROM _Bounce
Joining Data with SQL
Beyond data views, SQL queries can also be customized to solve for a variety of business needs. Say you want to send an email that has data stored in the send log and data found in your customer data extension. To craft these queries, you’ll want to understand the concept of joins—or taking data from one source and joining it with data from another source.
SQL joins can be complex, so let’s review some join scenarios using a Venn diagram.
Let’s start with the top row, known as the left and right joins.
Diagram 1: Left Outer Join
You want all records from data extension A, plus matching records in the data extension B.
SELECT * FROM DataExtension A LEFT JOIN DataExtension B ON A.Field = B.Field
Sample Result
Data Extension A |
Data Extension B |
||
---|---|---|---|
Name |
Email |
Name |
Email |
Joe |
joe@email.com |
Joe |
joe@email.com |
Jenn |
jenn@email.com |
null |
null |
Justin |
justin@email.com |
null |
null |
Diagram 2: Left Outer Join with Exclusions
In this join, we exclude the records we don’t want from data extension B.
SELECT * FROM DataExtension A LEFT JOIN DataExtension B ON A.Field = B.Field WHERE B.Field IS NULL
Sample Result
Data Extension A |
Data Extension B |
||
---|---|---|---|
Name |
Email |
Name |
Email |
Joe |
joe@email.com |
null |
null |
Jenn |
jenn@email.com |
null |
null |
Justin |
justin@email.com |
null |
null |
Diagram 3 and 4: Right Joins
As for the right joins, they’re the reverse of the previous scenarios.
Diagram | Code |
---|---|
3 |
SELECT * FROM DataExtension A RIGHT JOIN DataExtension B ON A.Field = B.Field |
4 |
SELECT * FROM DataExtension A RIGHT JOIN DataExtension B ON A.Field = B.Field WHERE A.Field IS NULL |
Now let’s look at inner and outer joins.
Diagram 5: Inner Join
This query returns records that have matching values in both data extensions.
SELECT * FROM DataExtension A INNER JOIN DataExtension B ON A.Field = B.Field
Sample Result
Data Extension A |
Data Extension B |
||
---|---|---|---|
Name |
Email |
Name |
Email |
Joe |
joe@email.com |
Joe |
joe@email.com |
Jonathan |
jonathan@email.com |
Jonathan |
jonathan@email.com |
Diagram 6: Full Outer Join
Full outer joins return all matching records from both data extensions.
SELECT * FROM DataExtension A FULL OUTER JOIN DataExtension B ON A.Field = B.Field
Sample Result
Data Extension A | Data Extension B | ||
---|---|---|---|
Name |
Email |
Name |
Email |
Joe |
joe@email.com |
Joe |
joe@email.com |
Jenn |
jenn@email.com |
null |
null |
Justin |
justin@email.com |
null |
null |
null |
null |
Jennifer |
jennifer@email.com |
Jonathan |
jonathan@email.com |
Jonathan |
jonathan@email.com |
Diagram 7: Outer Join
Returns all matching records from both data extensions, minus the records we don’t want.
SELECT * FROM DataExtension A FULL OUTER JOIN DataExtension B ON A.Field = B.Field WHERE A.Field IS NULL OR B.Field IS NULL
Sample Result
Data Extension A |
Data Extension B |
||
---|---|---|---|
Name |
Email |
Name |
Email |
Joe |
joe@email.com |
null |
null |
Jenn |
jenn@email.com |
null |
null |
Justin |
justin@email.com |
null |
null |
null |
null |
Jennifer |
jennifer@email.com |
null |
null |
Jonathan |
jonathan@email.com |
See SQL in Action
Cloud Kicks has decided to send an exclusive offer to customers who have opened an email in the last 6 months and have made a purchase during that time. Solution Architect Maggie Quinn needs to complete this task by using a SQL query. Let’s follow along.
Maggie starts by creating a data extension to store the information about the customers that meet this criteria. She then navigates to Automation Studio and follows these steps.
- Click New Automation.
- Select a starting source from Schedule or File Drop.
- Drag SQL Query to workflow and click Choose. (You can select an existing query or create a new query.)
- Select Create New Query Activity.
- Add query properties: name, external key, folder location, and description.
- Next, build your SQL query.
Maggie uses this query.
SELECT c.EmailAddress, c.CustomerID, c.First_Name FROM Customers c INNER JOIN Purchases p ON c.CustomerID = p.CustomerID WHERE [Purchase Date] > DateAdd(month, -6, GetDate()) AND [Open Date] > DateAdd(month, -6, GetDate())
Once completed, she follows the remaining steps.
- Click Validate Syntax.
- Choose the data extension created to store the query results.
- Choose the data action the query activity performs: Append, Update, or Overwrite.
- Click Save.
Once she is ready, Maggie can run the automation.
Common SQL Queries
Parent Accounts
You can query data extensions in the Shared Data Extension folder from the parent account by adding the prefix ent. to the data extension name in the query.
SELECT EmailAddress FROM ent.MasterDataExtension
All the Data
Sometimes you want everything.
SELECT * FROM MasterDataExtension
Use SELECT *
very carefully as this can cause slowness in the system as it is processing a large amount of data. The larger the dataset, the greater time and effort the system has to use to obtain the data. Queries time-out after 30 minutes, and the more data to process, the more likely the query can time out. To help with this, it is more effective to provide the exact fields you are looking for in the query.
SELECT field1, field2, field3 FROM MasterDataExtension
SQL is something you can spend years mastering, but for now, you have the basics. We hope you—and your data—are feeling settled in your new home. And we can’t wait to see what you achieve with the power of your Marketing Cloud Engagement data!