Create Lookups with the Dataset Builder
After completing this unit, you’ll be able to:
- Identify the dataset grain.
- Create a dataset using a lookup.
- Define the appropriate number of base-level datasets.
Welcome to Cloud Kicks
Cloud Kicks is a popular athletic and lifestyle brand for athletes, celebrities, and everyday people who want to stay in shape and look stylish.
Cloud Kicks’ primary goal is to maintain and expand the relationships it’s built. Tableau CRM helps Cloud Kicks with this. Jose Figueroa, the Cloud Kicks sales manager, appreciates Tableau CRM’s reporting and visualizations that enable him to stay on top of activities at each stage of the sales pipeline.Candace is Cloud Kicks’ top sales rep and uses Tableau CRM to prioritize opportunity interactions and nurture them into new accounts. Candace wants to be able to view dashboards and interact with visualizations. She wants to easily get counts of opportunities, amounts while adding filters, groupings, and formulas. Jose’s and Candace’s explorations often require interactions among related objects, such as Account and User. As the Cloud Kicks Tableau CRM admin, you want to make their work easier so they don’t need to wait for you or someone on your team to create the lenses they need.
In this module you go through the steps to build the datasets that will help Jose and Candace. It’s important that the datasets contain accurate representations without any duplication or omission of important information.
Get Familiar with Building Datasets
A Tableau CRM dataset is a collection of related data that can be created from many sources, including Salesforce objects, external data sources, and other datasets. Datasets are the foundation to analysis, dashboards, models, and predictions. Building datasets according to an expected business need is a critical step. Tools that you use to create datasets include the Dataset Builder, which incorporates the instructions for building datasets using a dataflow, and Data Prep, which cleans and transforms data using a recipe.
This module teaches best practices for creating a dataset and assumes you’re familiar with dataset construction using both of these methods. For step-by-step, hands-on practice with Data Prep and the Dataset Builder, try the Tableau CRM Data Integration Basics and Integrate Local and Remote Data in Tableau CRM modules also in Trailhead.
What Is Your Dataset About?
You and your team meet to discuss what you’ve done to support Jose and Candace. You come away with learning that they’re all about analyzing opportunity pipeline, such as amount by account, owner, and close date.Sales’ data comes from Salesforce objects that include Opportunity, Cases, Accounts, Campaigns, and so on.
- Defining what the new dataset is about. You decide to select the Opportunity object and its fields as the grain.
- Linking the Account object to Opportunity so Jose and Candace can analyze insights by customer accounts. For this step, you add the Account object and selected fields based on the AccountId, a field that Account and Opportunity share.
- Linking the User object to Opportunity so Jose and Candace can view activity by opportunity owner. For this step you’ll add the User object plus selected fields based on the OwnerId field, a field that User and Opportunity share. You could also add the account owner by linking the Account and User objects.
It’s important to establish the correct relationships among datasets to get the expected outcome. Dataset relationships can be implemented as either lookups or joins. However, don’t expect both operations to result in the same conclusion. It’s important to understand what each transformation delivers.
Creating lookups with the Dataset Builder
A lookup, for our purpose, is to find a single matching record. Opportunity and Account are used in this example, with AccountId as the common key between both objects.
The recipe starts with one Opportunity row with AccountId as its unique identifier.
The row initiates a “lookup” to one (and only one) matching AccountId from the Account object (right side of the lookup). If there are more than one matching Account record for that particular AccountId, the lookup brings back one random record and no more for a single lookup.Since a dataset’s accuracy starts with selecting the object that meets your objective as the grain, it’s important to understand the role each object has in your lookup task. The left side object defines the grain, and it remains unique as more and more fields are added from related objects or datasets coming from the right side of the lookup. Here’s an example of the final result of a dataset. Remember we started with the Opportunity object as the lowest grain.
The count of rows and sum of amount are both true values since every row of the Opportunity dataset is a unique opportunity.
Why the Grain Matters
Let’s see what would happen if a right dataset is a lower grain or different grain from the left dataset in a single lookup?
Consider the following scenario where we want to augment Opportunity to Accounts. A lookup is initiated from one Account row (on the left) to multiple matching Opportunities (on the right).
By looking at the datasets you can see there is:
- One matching Opportunity row for Abbott358 Inc in the Opportunity dataset.
- Two matching Opportunity rows for account Adkins907 Inc.
- Three matching Opportunity rows for Aguilar870 Inc.
Here is the result of a single lookup from the Account dataset to Opportunity dataset:
Do you notice what happened?
- For account Abbot358 Inc, we got the one matching opportunity row.
- For account Adams25 Inc, we had no matching opportunity row, hence we kept the initial grain (the account Adams25 Inc), and all the opportunity values at null or zero.
- For account Adkins907 Inc, we got only one row (out of the two). We lost an opportunity!
- Similarly, for account Aguilar870 Inc, we got only one row (out of three) and lost the details of the other two opportunities!
This is why you have to be very careful with lookups to datasets that have a different or lower grain. Later in this module, we discuss the join as another possible approach.
Data for the Task or Ask
It’s perfectly acceptable to have multiple “base level” datasets for each of the main objects used by the business. For example, Cloud Kicks will need a base dataset for Opportunity, User, and Account to start with. From there you’ll determine how to combine them in the data layer or as needed in the dashboard layer.
Let’s say you’ve been asked to build datasets to help your users answer queries such as, opportunities by account and products by account. How many base objects are needed as shown in Illustration 1.
You’ll have more options and flexibility with four base-level datasets, as shown in Illustration 2. If a user requires comprehensive Products analysis, the red circle illustrates how you can create a new Products dataset using related data by way of lookups from the Opportunity, Account, and User datasets.
The green circle illustrates the needed datasets for an Opportunities dataset using lookups from Account, and User. The blue circle is an instance when a user wants to do basic analysis just on all Accounts only, and the purple circle is just for Users.
If your user is also requesting an all Cases dataset by user and account, the number of base datasets increases to five, as highlighted by the magenta dotted line in Illustration 3.
Starting with the lowest grain, or what the dataset is about, ensures an accurate target dataset with unique rows. Lookups return all rows from the lowest grain and a single, matching row from a related dataset. Evaluate results when using a lookup between datasets that have a different grain to avoid skewed results.