
Extract Salesforce Data into Analytics

Learning Objectives

After completing this unit, you’ll be able to:
  • Explain what a dataflow is.
  • Create a dataset for Salesforce data using the dataset builder and the dataflow.
  • Monitor the dataflow and verify a new dataset.

Extract Salesforce Data Overview

You just took the first step to get the data you need into Analytics for the sales leadership team. They want to present their performance results when the executive team meets with the new CEO. But the CSV data is just a small part of what they need. Most of the data that interests them is already inside Salesforce—in the Opportunity, Account, and User objects. Now you bring this Salesforce data into Analytics.

Here’s where you are on the data journey.

Data journey map with the Salesforce object extraction process highlighted

Say Hello to the Dataflow

You use the dataflow to extract data from Salesforce objects. The dataflow is a set of instructions in JavaScript Object Notation (JSON) that runs to extract data and create datasets. These instructions specify which objects and fields you want to extract data from and the names of the datasets you want to create. The dataflow also has other uses, such as joining data together, but for now we just focus on its extraction skills.
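To make "a set of instructions in JSON" a little less abstract, here is a minimal Python sketch of what a tiny dataflow definition might look like. The node names (`Extract_Opportunity`, `Register_Opportunity`) are made up for illustration. The `sfdcDigest` and `sfdcRegister` action names and their parameters are assumptions about the dataflow JSON format, so compare them against a dataflow file downloaded from your own org before relying on them.

```python
import json

# Hypothetical dataflow definition: each top-level key is a node,
# and each node has an action plus the parameters for that action.
dataflow = {
    "Extract_Opportunity": {
        "action": "sfdcDigest",  # assumed action name for extracting an object
        "parameters": {
            "object": "Opportunity",
            "fields": [{"name": "Id"}, {"name": "Name"}, {"name": "Amount"}],
        },
    },
    "Register_Opportunity": {
        "action": "sfdcRegister",  # assumed action name for creating a dataset
        "parameters": {
            "source": "Extract_Opportunity",
            "alias": "opportunities",
            "name": "Opportunities",
        },
    },
}


def extracted_objects(definition):
    """Return the Salesforce objects that the extraction nodes read from."""
    return sorted(
        node["parameters"]["object"]
        for node in definition.values()
        if node["action"] == "sfdcDigest"
    )


# Serialize the definition the way it would be stored in a JSON file.
dataflow_json = json.dumps(dataflow, indent=2)
print(dataflow_json)
print(extracted_objects(dataflow))
```

The point of the sketch is the shape: one node names the object and fields to extract, and another names the dataset to create from that extraction.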

Maybe you’re thinking, JSON—ack! Do we have to write code? The good news is that you don’t really need to know anything about JSON. Analytics has tools that write these instructions for you, and you’re going to be using one of them.

A dataflow is not single use. You can use it to create lots of datasets from lots of different objects at the same time. You can also schedule it to run regularly to keep the datasets up to date. Since there’s a chance the dataflow is already in use, it’s a good idea to make a backup before you add new instructions. The dataflow in your Developer Edition org isn’t already in use, but let’s back it up anyway to see how it’s done.

  1. In Analytics, click the gear icon and then click Data Manager.

    The data manager opens in a new browser tab.

  2. In the data manager, click the Dataflows & Recipes tab.
  3. On the right of the Default Salesforce Dataflow, click the dataflow menu button and select Download.
    Monitor dataflow view
  4. Save the JSON file locally and keep it as a backup of your existing dataflow. To go back to this version of the dataflow later, repeat these steps and choose Upload in step 3.
  5. Click the gear icon and then click Analytics Studio.
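Once the file is on your machine, it can help to confirm that the download is valid JSON and to keep a timestamped copy, so that several backups don't overwrite each other. The following is a minimal sketch of that idea. The file name `default_salesforce_dataflow.json` and the `backups` folder are hypothetical, and the helper `back_up_dataflow` is not part of any Salesforce tool.

```python
import json
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def back_up_dataflow(downloaded_file, backup_dir, now=None):
    """Validate a downloaded dataflow file and copy it to a timestamped backup."""
    source = Path(downloaded_file)
    # Parsing first means a truncated or corrupted download fails loudly.
    with source.open(encoding="utf-8") as handle:
        json.load(handle)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = Path(backup_dir) / f"{source.stem}-{stamp}{source.suffix}"
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, target)
    return target


# Demonstration with a throwaway directory and an empty dataflow definition.
with tempfile.TemporaryDirectory() as workdir:
    download = Path(workdir) / "default_salesforce_dataflow.json"  # hypothetical name
    download.write_text("{}", encoding="utf-8")
    backup = back_up_dataflow(
        download,
        Path(workdir) / "backups",
        now=datetime(2024, 1, 15, 9, 30, 0),
    )
    backup_name = backup.name
    backup_exists = backup.exists()
    print(backup_name)
```

Passing `now` explicitly is only there to make the demonstration repeatable; in everyday use you would leave it out and let the current time be used.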

Create a Dataset with the Dataset Builder

Dataset builder—sounds like something that builds datasets, right? Well, kind of. In reality, the dataset builder generates the JSON instructions needed to build a dataset and adds the instructions to your dataflow. The dataflow then does the actual building.
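To picture the division of labor, here is a rough Python sketch of the kind of work a dataset builder does: it turns your choices (root object, fields, relationships) into nodes and adds them to an existing dataflow. Everything here is an assumption made for illustration: the helper names `build_nodes` and `add_to_dataflow`, the node naming scheme, the `sfdcDigest`, `augment`, and `sfdcRegister` actions with their parameters, and the field API names. It is not how Analytics itself generates the JSON.

```python
def build_nodes(dataset_name, root, fields, joins):
    """Generate extraction, join, and registration nodes for one dataset.

    joins maps a relationship field on the root (for example "AccountId")
    to a (related_object, related_fields) pair.
    """
    nodes = {}
    root_node = f"Extract_{root}"
    # The root extraction needs the relationship fields so it can be joined.
    root_fields = ["Id", *fields, *joins]
    nodes[root_node] = {
        "action": "sfdcDigest",
        "parameters": {"object": root, "fields": [{"name": f} for f in root_fields]},
    }
    source = root_node
    for relationship, (related, related_fields) in joins.items():
        related_node = f"Extract_{related}"
        nodes[related_node] = {
            "action": "sfdcDigest",
            "parameters": {
                "object": related,
                "fields": [{"name": f} for f in ["Id", *related_fields]],
            },
        }
        join_node = f"Join_{root}_{related}"
        nodes[join_node] = {
            "action": "augment",
            "parameters": {
                "left": source,
                "left_key": [relationship],
                "right": related_node,
                "right_key": ["Id"],
                "right_select": list(related_fields),
                "relationship": relationship,
            },
        }
        source = join_node  # the next join builds on this one
    nodes[f"Register_{root}"] = {
        "action": "sfdcRegister",
        "parameters": {
            "source": source,
            "name": dataset_name,
            "alias": dataset_name.replace(" ", "_"),
        },
    }
    return nodes


def add_to_dataflow(dataflow, new_nodes):
    """Return a copy of the dataflow with new nodes added, refusing name clashes."""
    clashes = sorted(set(dataflow) & set(new_nodes))
    if clashes:
        raise ValueError(f"node names already in use: {clashes}")
    return {**dataflow, **new_nodes}


nodes = build_nodes(
    "Opportunities with Accounts and Users",
    "Opportunity",
    ["Amount", "CloseDate", "CreatedDate", "Name", "StageName"],
    {
        "AccountId": ("Account", ["Name", "BillingCity", "Industry"]),
        "OwnerId": ("User", ["Name", "Title"]),
    },
)
updated_dataflow = add_to_dataflow({}, nodes)
print(list(updated_dataflow))
```

Notice that nothing in the sketch touches any data. It only produces instructions, which is the distinction this section draws between the dataset builder and the dataflow.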

Before you use the dataset builder, consider how the Salesforce objects that you’re extracting are related. As a Salesforce admin, you know that opportunities have lookup relationships to accounts and users. When you create opportunity records in Salesforce, you're entering opportunity field values; you’re also “looking up” field values in the related account and user records.

Object relationships Opportunity record

When you create a dataset you do something similar, but instead of records, you create rows. To help you, Analytics first asks you for the root object. This is simply the lowest object in the hierarchy of objects you’re extracting. In our case, the Opportunity object is the root object. You can only include related objects above the root object in the dataset. For example, if you select Account as the root object, you can include related objects such as User and Parent Account, but not Opportunity because it’s lower.

Note

Data geeks refer to the root as the grain of a dataset—the unit of data in each row. In our dataset, each row is an opportunity, so the opportunity record is the grain. It’s an important concept when you’re joining data together from different sources, as you discover later.
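The "only objects above the root" rule can be sketched in a few lines of Python. The lookup map below is a simplified, hypothetical picture of the relationships described here (for instance, Parent Account is treated as its own entry), not something read from a real org.

```python
# Hypothetical lookup map: each object lists the objects it looks up to.
LOOKUPS = {
    "Opportunity": ["Account", "User"],
    "Account": ["User", "Parent Account"],
    "User": [],
    "Parent Account": [],
}


def objects_above(root, lookups=LOOKUPS):
    """Return every object reachable from root by following lookups upward."""
    seen = []
    queue = list(lookups.get(root, []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(lookups.get(current, []))
    return seen


print(objects_above("Opportunity"))
print(objects_above("Account"))
```

With Account as the root, Opportunity never shows up in the result, because no lookup leads from Account down to Opportunity.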

  1. In the Analytics Studio, click Create and select Dataset.
  2. Click Salesforce Data.
  3. Enter the dataset name. Be as descriptive as you can here so that other people know what you’ve created. Name this dataset Opportunities with Accounts and Users.
  4. From the Select dataflow... picklist, select Default Salesforce Dataflow.
    The dataflow that you select here will create the dataset for you. If your org has more than one dataflow, you can choose which one to use.
  5. Click Next.
    You see the dataset builder with a list of Salesforce objects. The object you select here is the root object.
  6. In the Pick a Salesforce object to start search box, enter opp.

    The list shows matching objects.

  7. In the object list, click Opportunity.

    The root object you selected appears on the dataset builder canvas.

  8. Hover over the object and click the plus (+).
    Opportunity root object

    You see a list of the object’s fields.

  9. Click to select each field that you need. If you can’t see a field, start entering its name in the Search by name or metadata search box to filter the list. Select these fields:
    1. Amount
    2. Close Date
    3. Created Date
    4. Name
    5. Stage
    Tip

    By default, the fields are listed in alphabetical order by name. To reverse the order, click the NAME column header, or to sort by field type click the TYPE column header.

  10. To open a list of objects related to your root object, at the top of the field list, click the Relationships tab. This is where you can select the fields you need from the related accounts and users.
    Select relationships
    Note

    Remember, only related objects above the root object are available here. If you select a different root object, you see a different list of related objects.

  11. In front of the Account ID relationship, click Join.

    You see the Account object on the dataset builder canvas.

    Account relationship
  12. In front of the Owner ID relationship, click Join. You see the User object on the canvas.

The next step is to select the fields you want from each of the related objects.

  1. To open a list of account fields, hover over the Account object and click the plus (+).
    Related objects in dataset builder
  2. Select these fields:
    1. Account Name
    2. Billing City
    3. Billing Country
    4. Industry
    5. SIC Code
    Tip

    Notice the Relationships tab at the top of the field list. Use this tab if you want to include fields from objects related to the Account object, such as Parent Account.

  3. To hide the field list, to the right of the Account object, click the close button (the cross).
  4. Repeat steps 1–3 to select these fields from the User object:
    1. Full Name
    2. Title

Let’s quickly recap what you’ve done here. You selected your root object, Opportunity, and the fields you need. You also selected the related Account and User objects and the fields you need from those. When you create the dataset, each row represents an opportunity, with fields from the related account and owner user records.

Opportunity row
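The recap above can be sketched in Python as a simple lookup join: one output row per opportunity, with the related account and owner fields copied in. The sample records and the `Account.` and `Owner.` field prefixes are invented for illustration, and the column names in your real dataset may differ.

```python
opportunities = [
    {"Id": "006A", "Name": "Big Deal", "Amount": 50000,
     "AccountId": "001A", "OwnerId": "005A"},
    {"Id": "006B", "Name": "Small Deal", "Amount": 5000,
     "AccountId": "001A", "OwnerId": "005B"},
]
accounts = {"001A": {"Name": "Acme", "Industry": "Manufacturing"}}
users = {
    "005A": {"Name": "Ana Ruiz", "Title": "Account Executive"},
    "005B": {"Name": "Ben Cho", "Title": "Sales Rep"},
}


def build_rows(opportunities, accounts, users):
    """Return one row per opportunity with account and owner fields looked up."""
    relationships = (
        ("Account", accounts, "AccountId"),
        ("Owner", users, "OwnerId"),
    )
    rows = []
    for opportunity in opportunities:
        row = dict(opportunity)
        for prefix, records, key in relationships:
            # A missing related record simply contributes no extra fields.
            related = records.get(opportunity[key], {})
            for field, value in related.items():
                row[f"{prefix}.{field}"] = value
        rows.append(row)
    return rows


rows = build_rows(opportunities, accounts, users)
for row in rows:
    print(row)
```

Both opportunities belong to the same account, yet the result still has two rows. That is the grain at work: the row count follows the root object, not the related objects.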

Let’s finish up by adding these instructions to the dataflow and creating the dataset.

  1. Click Next.

    The dataflow that you selected earlier opens in the dataflow editor in the background. Look closely and you can see that your dataset building instructions appear as boxes, or “nodes” in the editor.

  2. From the Select an app for your dataset picklist, select Sales Performance Datasets.

    You can edit the dataflow here to make adjustments, but yours is ready to go.

  3. Click Create Dataset.

    If all goes well, you see a notification that the dataset has been queued to be created. Behind the scenes, Analytics runs the dataflow to create the dataset.

  4. Click the Go to the data monitor link and let’s see how the dataflow is doing.

Monitor the Dataflow and Verify the New Dataset

  1. On the Monitor tab of the data manager, click the DATAFLOWS subtab.
    Dataflows subtab selected on Monitor tab in data manager
    The Dataflows subtab lets you see just your dataflow jobs.
  2. To refresh the view, at the top-right of the monitor, click the refresh button.

    The dataflow icon shows the status of the dataflow. If all’s well, it’s a checkmark and you see “Successful” when you hover over it.

    Dataflow successful
    Tip

    If the status shows as Queued or Running, continue refreshing the view until the dataflow finishes.

  3. To see a list of all the times it’s run, click the plus (+) in front of the dataflow.
  4. Click the plus (+) in front of the most recent run.
    You see a list of all the JSON instructions the dataflow has performed.
    Dataflow nodes
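Clicking refresh until the status changes is a manual version of polling. Here is a minimal Python sketch of the same pattern. The `get_status` callable is a stand-in: this unit does not describe any API for reading dataflow status, so the example replays a fixed list of statuses instead of calling Salesforce.

```python
import time


def wait_for_dataflow(get_status, poll_seconds=5.0, max_polls=60, sleep=time.sleep):
    """Poll get_status() until the job leaves the Queued and Running states."""
    for _ in range(max_polls):
        status = get_status()
        if status not in ("Queued", "Running"):
            return status
        sleep(poll_seconds)
    raise TimeoutError("dataflow did not finish within the polling budget")


# Stand-in for a real status check: replays a fixed sequence of statuses.
fake_statuses = iter(["Queued", "Running", "Running", "Successful"])
final_status = wait_for_dataflow(
    lambda: next(fake_statuses),
    sleep=lambda seconds: None,  # skip real waiting in the demonstration
)
print(final_status)
```

The `max_polls` limit is a design choice worth keeping in any real version, so that a stuck job produces an error instead of an endless wait.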
Now let’s check the dataset itself.
  1. To return to the Analytics Studio, click the gear icon and then click Analytics Studio.
  2. At the top of the Analytics Studio, click Datasets.
  3. Click the dataset you just created: Opportunities with Accounts and Users.
    Tip

    If you don’t see the dataset, try refreshing your browser.

  4. On the left of the new lens, under Bar Length, click the Add a measure (+) button.

    The opportunity Amount field is here as a measure.

  5. Under Bars, click the Add a group (+) button.

    The Close Date and Created Date are available here for grouping, along with all the other dimension fields that you selected in the dataset builder.
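If you are curious what a lens does with a measure and a group, the sketch below shows the idea in Python: add up the Amount measure for each value of a grouping, here the year of the Close Date. The sample rows and the helper `sum_by_group` are invented for illustration and say nothing about how Analytics runs its queries internally.

```python
from collections import defaultdict
from datetime import date

rows = [
    {"Name": "Big Deal", "Amount": 50000, "CloseDate": date(2023, 11, 30)},
    {"Name": "Small Deal", "Amount": 5000, "CloseDate": date(2024, 2, 14)},
    {"Name": "Renewal", "Amount": 12000, "CloseDate": date(2024, 6, 1)},
]


def sum_by_group(rows, measure, group_key):
    """Sum a measure field for each distinct value returned by group_key."""
    totals = defaultdict(int)
    for row in rows:
        totals[group_key(row)] += row[measure]
    return dict(totals)


amount_by_close_year = sum_by_group(
    rows, "Amount", lambda row: row["CloseDate"].year
)
print(amount_by_close_year)
```

Each key in the result corresponds to one bar in the chart, and each total to that bar's length.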

Schedule the Dataflow

When you create a dataset with the dataset builder, Analytics runs the dataflow for you the first time. After this, you can start the dataflow manually or schedule it to run on a recurring basis. Scheduling is a great way to keep the dataset fresh with the latest data from your Salesforce objects.

Let’s get the dataset up and running first, then you can schedule future runs.

  1. In the data manager, click the Dataflows & Recipes tab.
  2. On the right of the Default Salesforce Dataflow, click the menu button and select Schedule.
    The schedule settings for the dataflow appear.
    Dataflow schedule settings screen
  3. Schedule the dataflow to run every 24 hours at 12:00 AM each weekday by selecting these settings.
    1. Schedule by: Hour
    2. Start at: 12:00 am
    3. Run every: 24 Hours
    4. Select days: M, Tu, W, Th, F
    This schedule ensures that your explorers have fresh data each morning with their coffee. It also prevents the dataflow from running during business hours. You don’t want your users to see differences in dashboards depending on the time of day.
  4. Click Save.
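To check your understanding of these settings, here is a small Python sketch that lists the upcoming run times for a "midnight on weekdays" schedule. It only models the settings chosen above. The helper `next_runs` is hypothetical, it ignores time zones, and it is not how Analytics computes its schedule.

```python
from datetime import datetime, timedelta

# Monday is 0 and Friday is 4, matching the selected days M, Tu, W, Th, F.
WEEKDAYS = {0, 1, 2, 3, 4}


def next_runs(after, count, run_days=WEEKDAYS):
    """Return the next `count` midnight run times strictly after `after`."""
    runs = []
    candidate = after.replace(hour=0, minute=0, second=0, microsecond=0)
    while len(runs) < count:
        candidate += timedelta(days=1)
        if candidate.weekday() in run_days:
            runs.append(candidate)
    return runs


# Starting from a Friday morning, the weekend is skipped.
upcoming = next_runs(datetime(2024, 1, 5, 9, 0), count=3)
for run in upcoming:
    print(run.isoformat())
```

Starting from Friday, January 5, 2024, the next runs fall on the following Monday, Tuesday, and Wednesday, which shows how the day selection keeps the dataflow idle over the weekend.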

Great job! You’ve successfully created two datasets. The SIC Descriptions dataset contains data extracted from a CSV file. The Opportunities with Accounts and Users dataset has data extracted from objects in your Salesforce org. And you’ve scheduled the dataflow to ensure that you always have fresh Salesforce data. The next step is to join data from these two datasets. You want all this data available to explore from a single dataset. Onwards to the data preparation stage!
