
Extract Salesforce Data into Analytics

Learning Objectives

After completing this unit, you’ll be able to:
  • Explain what a dataflow is.
  • Create a dataset with Salesforce data using the dataset builder and the dataflow.
  • Monitor the dataflow and verify a new dataset.

Extract Salesforce Data Overview

You just took the first step to get the data you need into Analytics for the sales leadership team. They want to present their performance results when the executive team meets with the new CEO. But the CSV data is just a small part of what they need. Most of the data that interests them is already inside Salesforce—in the Opportunity, Account, and User objects. Now you bring this Salesforce data into Analytics.

Here’s where you are on the data journey.

Data journey map with the Salesforce object extraction process highlighted

Say Hello to the Dataflow

You use the dataflow to extract data from Salesforce objects. The dataflow is a set of instructions in JavaScript Object Notation (JSON) that runs to extract data and create datasets. These instructions specify which objects and fields you want to extract data from and the names of the datasets you want to create. The dataflow also has other uses, such as joining data together, but for now we just focus on its extraction skills.

Maybe you’re thinking, JSON—ack! Do we have to write code? The good news is that you don’t really need to know anything about JSON. Analytics has tools that write these instructions for you, and you’re going to be using one of them.
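Just so the JSON doesn’t feel like a mystery, here’s a minimal, hypothetical sketch of what dataflow instructions look like: one node that extracts fields from the Opportunity object and one that registers the result as a dataset. The node names here are invented for illustration; the tools generate the real ones for you.

```json
{
  "Extract_Opportunity": {
    "action": "sfdcDigest",
    "parameters": {
      "object": "Opportunity",
      "fields": [
        { "name": "Name" },
        { "name": "Amount" },
        { "name": "StageName" }
      ]
    }
  },
  "Register_Opportunities": {
    "action": "sfdcRegister",
    "parameters": {
      "alias": "Opportunities",
      "name": "Opportunities",
      "source": "Extract_Opportunity"
    }
  }
}
```

Each key is a node, each node has an action, and nodes refer to each other by name. That’s the whole idea.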

A dataflow is not single use. You can use it to create lots of datasets from lots of different objects at the same time. You can also schedule it to run regularly to keep the datasets up to date. Since there’s a chance the dataflow is already in use, it’s a good idea to make a backup before you add new instructions. The dataflow in your Developer Edition org isn’t already in use, but let’s back it up anyway to see how it’s done.

  1. In Analytics, click the gear icon and then click Data Manager. The data manager opens in a new browser tab.
  2. In the data manager, click the Dataflows & Recipes tab.
  3. On the right of the Default Salesforce Dataflow, click the menu button and select Run Now. This runs the dataflow.
  4. Click Monitor to check the dataflow run's progress. Uh oh—the Default Salesforce Dataflow run failed. Based on the error message, Data Sync has not pulled fresh data from Salesforce into Analytics. Let's manually sync the data.
    1. Click the Connect tab. 
    2. To the right of the SFDC_LOCAL connection, click the menu button and select Run Now. This runs Data Sync for the entire connection, pulling data for all the listed objects from your local Salesforce org into Analytics.
    3. Click the Monitor tab to track progress of the data sync. Refresh the list until all objects sync with a Successful status message. Now there is a fresh copy of the Salesforce object data in Analytics for the dataflow to use. Let's get back to the dataflow!
  5. Click the Dataflows & Recipes tab.
  6. On the right of the Default Salesforce Dataflow, click the menu button and select Run Now.
  7. After the dataflow run completes, click the menu button again and select Download.
  8. Save the JSON file locally and keep it as a backup of your existing dataflow. To go back to this version of the dataflow later, repeat these steps and choose Upload in step 3.
  9. Click the gear icon and then click Analytics Studio.

Create a Dataset with the Dataset Builder

Dataset builder—sounds like something that builds datasets, right? Well, kind of. In reality, the dataset builder generates the JSON instructions needed to build a dataset and adds the instructions to your dataflow. The dataflow then does the actual building.

Before you use the dataset builder, consider how the Salesforce objects that you’re extracting are related. As a Salesforce admin, you know that opportunities have lookup relationships to accounts and users. When you create opportunity records in Salesforce, you’re not just entering opportunity field values; you’re also “looking up” field values in the related account and user records.

Object relationships
Opportunity record

When you create a dataset you do something similar, but instead of records, you create rows. To help you, Analytics first asks you for the root object. This is simply the lowest object in the hierarchy of objects you’re extracting. In our case, the Opportunity object is the root object. You can only include related objects above the root object in the dataset. For example, if you select Account as the root object, you can include related objects such as User and Parent Account, but not Opportunity because it’s lower.



Data geeks refer to the root as the grain of a dataset—the unit of data in each row. In our dataset, each row is an opportunity, so the opportunity record is the grain. It’s an important concept when you’re joining data together from different sources, as you discover later.
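To make the grain concrete, here’s what one row of the finished dataset might look like, with the related account and owner fields flattened onto the same opportunity row. The values and the exact field labels are invented for illustration; your dataset’s field names depend on what you select in the dataset builder.

```json
{
  "Name": "Acme - 200 Widgets",
  "Amount": 50000,
  "StageName": "Closed Won",
  "CloseDate": "2024-06-30",
  "Account.Name": "Acme",
  "Account.Industry": "Manufacturing",
  "Owner.Name": "Ada Balewa",
  "Owner.Title": "Account Executive"
}
```

One opportunity, one row: that’s the grain.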

  1. In the Analytics Studio, click Create and select Dataset.
  2. Click Salesforce Data.
  3. Enter the dataset name. Be as descriptive as you can here so that other people know what you’ve created. Name this dataset Opportunities with Accounts and Users.
  4. From the Select dataflow... picklist, select Default Salesforce Dataflow. The dataflow that you select here will create the dataset for you. If your org has more than one dataflow, you can choose which one to use.
  5. Click Next. You see the dataset builder with a list of Salesforce objects. The object you select here is the root object.
  6. In the Pick a Salesforce object to start search box, enter opp. The list shows matching objects.
  7. In the object list, click Opportunity. The root object you selected appears on the dataset builder canvas.
  8. Hover over the object and click the plus (+). You see a list of the object’s fields.
  9. Click to select each field that you need. If you can’t see a field, start entering its name in the Search by name or metadata search box to filter the list. Select these fields:
    1. Amount
    2. Close Date
    3. Created Date
    4. Name
    5. Stage
  10. By default, the fields are listed in alphabetical order by name. To reverse the order, click the NAME column header, or to sort by field type click the TYPE column header.
  11. To open a list of objects related to your root object, at the top of the field list, click the Relationships tab. This is where you can select the fields you need from the related accounts and users. Remember, only related objects above the root object are available here. If you select a different root object, you see a different list of related objects.
  12. In front of the Account ID relationship, click Join. You see the Account object on the dataset builder canvas.
  13. In front of the Owner ID relationship, click Join. You see the User object on the canvas.

The next step is to select the fields you want from each of the related objects.

  1. To open a list of account fields, hover over the Account object and click the plus (+).
  2. Select these fields:
    1. Account Name
    2. Billing City
    3. Billing Country
    4. Industry
    5. SIC Code
  3. Notice the Relationships tab at the top of the field list. You’d use this if you wanted to include fields from objects related to the Account object, such as Parent Account.
  4. To hide the field list, to the right of the Account object, click the close (×) button.
  5. Repeat steps 1–4 to select these fields from the User object:
    1. Full Name
    2. Title

Let’s quickly recap what you’ve done here. You selected your root object, Opportunity, and the fields you need. You also selected the related Account and User objects and the fields you need from those. When you create the dataset, each row represents an opportunity, with fields from the related account and owner user records.
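If you’re curious what the dataset builder adds to the dataflow on your behalf, it’s roughly this shape: a digest node for each object, an augment node that joins the related object onto the root on the lookup field, and a register node at the end. This is a simplified, hypothetical sketch with made-up node names and only the Account join shown (the User join looks similar); download the dataflow JSON to see the real generated version.

```json
{
  "Extract_Opportunity": {
    "action": "sfdcDigest",
    "parameters": {
      "object": "Opportunity",
      "fields": [
        { "name": "Name" }, { "name": "Amount" }, { "name": "StageName" },
        { "name": "CloseDate" }, { "name": "CreatedDate" },
        { "name": "AccountId" }, { "name": "OwnerId" }
      ]
    }
  },
  "Extract_Account": {
    "action": "sfdcDigest",
    "parameters": {
      "object": "Account",
      "fields": [
        { "name": "Id" }, { "name": "Name" }, { "name": "Industry" },
        { "name": "BillingCity" }, { "name": "BillingCountry" }
      ]
    }
  },
  "Join_Account": {
    "action": "augment",
    "parameters": {
      "left": "Extract_Opportunity",
      "left_key": [ "AccountId" ],
      "right": "Extract_Account",
      "right_key": [ "Id" ],
      "right_select": [ "Name", "Industry", "BillingCity", "BillingCountry" ],
      "relationship": "Account"
    }
  },
  "Register_Dataset": {
    "action": "sfdcRegister",
    "parameters": {
      "alias": "Opportunities_with_Accounts_and_Users",
      "name": "Opportunities with Accounts and Users",
      "source": "Join_Account"
    }
  }
}
```

Notice how the augment node joins on the lookup field (AccountId) and prefixes the borrowed fields with the relationship name, which is why you see fields like Account.Name in the dataset.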

Opportunity row

Let’s finish up by adding these instructions to the dataflow and creating the dataset.

  1. Click Next. The dataflow that you selected earlier opens in the dataflow editor in the background. Look closely and you can see that your dataset building instructions appear as boxes, or “nodes,” in the editor.
  2. From the Select an app for your dataset picklist, select Sales Performance Datasets. You can edit the dataflow here to make adjustments, but yours is ready to go.
  3. Click Create Dataset. If all goes well, you see a notification that the dataset has been queued to be created. Behind the scenes, Analytics runs the dataflow to create the dataset.
  4. Click the Go to the data monitor link and let’s see how the dataflow is doing.

Monitor the Dataflow and Verify the New Dataset

  1. On the Monitor tab of the data manager, click the DATAFLOWS subtab. The Dataflows subtab lets you see just your dataflow jobs.
  2. To refresh the view, at the top-right of the monitor, click the refresh button. The dataflow icon shows the status of the dataflow. If all’s well, it’s a checkmark and you see “Successful” when you hover over it. If the status shows as Queued or Running, continue refreshing the view until the dataflow finishes.
  3. To see a list of all the times it’s run, click the dropdown in front of the dataflow.
  4. Click the dropdown in front of the most recent run. You see a list of all the JSON instructions the dataflow has performed.
Now let’s check the dataset itself.
  1. On the left of the data manager, click the Data tab.
  2. On the right of the Opportunities with Accounts and Users dataset, click the menu button and select Explore. If you don’t see the dataset, try refreshing your browser.
  3. On the left of the new lens, under Bar Length, click the Add a measure (+) button. The opportunity Amount field is here as a measure.
  4. Under Bars, click the Add a group (+) button. The Close Date and Created Date are available here for grouping, along with all the other dimension fields that you selected in the dataset builder.

Schedule the Dataflow

When you create a dataset with the dataset builder, Analytics runs the dataflow for you the first time. After this, you can start the dataflow manually or schedule it to run on a recurring basis. Scheduling is a great way to keep the dataset fresh with the latest data from your Salesforce objects.

  1. In the data manager, click the Dataflows & Recipes tab.
  2. On the right of the Default Salesforce Dataflow, click the menu button and select Schedule. The schedule settings for the dataflow appear.
  3. Schedule the dataflow to run every 24 hours at 12:00 AM each weekday by selecting these settings.
    1. Schedule by: Hour
    2. Start at: 12:00 AM
    3. Run every: 24 Hours
    4. Select days: M, Tu, W, Th, F
  4. This schedule ensures that your explorers have fresh data each morning with their coffee. It also prevents the dataflow from running during business hours. You don’t want your users to see differences in dashboards depending on the time of day.
  5. Click Save.


Your dataflow extracts data from your org each time it runs. The extract time is short for this simple dataflow, but it grows as your dataflows multiply and become more complex. You can speed up your dataflows by scheduling the data extract to run beforehand with Data Sync. Schedule the sync on the Connect tab of the data manager. Check out the Resources for a link to instructions.

Great job! You’ve successfully created two datasets. The SIC Descriptions dataset contains data extracted from a CSV file. The Opportunities with Accounts and Users dataset has data extracted from objects in your Salesforce org. And you’ve scheduled the dataflow to ensure that you always have fresh Salesforce data. The next step is to join data from these two datasets. You want all this data available to explore from a single dataset. Onwards to the data preparation stage!