Kick Off the Data Journey

Learning Objectives

After completing this unit, you’ll be able to:
  • Explain what a dataset is and the advantages it brings to data exploration.
  • Describe the high-level process of creating a dataset.
  • Identify your data requirements and plan your data integration.

Becoming an Analytics Data Expert

You’re the Salesforce admin at DTC Electronics, and the VP of Sales just called you with an urgent request. The sales leadership team needs information and they need it fast: The new CEO of DTC just called a meeting for next week to review sales performance. The VP of Sales wants a set of dashboards showing performance by rep and region that she can use to run the meeting.

At a high level, your job is to take the data that the VP of Sales needs from wherever it is and put it in a dataset.

[Figure: Source data to dataset]

Usually, data integration like this is an onerous task, requiring the skills of a data scientist. Analytics makes the work much easier, giving Salesforce admins a powerful way to bring data in—even without that PhD in data science!

This module explains the main tools for bringing data into Analytics and preparing it for use. You walk through the process of bringing both external and Salesforce data into Analytics. You also learn how to prepare the data so you can build those dashboards that sales leadership needs—and so they can later explore the data on their own.

First, let’s cover some key concepts you need to understand to become an Analytics data expert.

What Is a Dataset?

Think of a dataset as a box of data. Here at Salesforce we see it as a hexagonal purple box, but hey, that’s just us. Anyone with access to this box can open it and explore the data.

A dataset can contain data from a single Salesforce object, such as opportunities. Or it can contain data combined from different objects, such as opportunities, accounts, and users, and data from external sources, such as financial data. You can also create a dataset by combining data from other datasets, each in turn containing data from multiple sources and datasets. Wow!

If you’ve used Salesforce before, you know you can go to the report builder, build a report, and view it right away. No dataset required. So why do you need a dataset in Analytics? There are several great reasons.

Datasets Are Faster

Data geeks would say that Salesforce data is normalized, which means that related data is stored in separate objects and has to be “joined” together when you need it. When you run a Salesforce report, the report engine has to pull in data from these objects as it assembles the report. Joining data is OK for a few thousand rows, but if you’re dealing with millions of rows, this process can take a while.

Analytics datasets pull all the data together first, so it’s almost instantaneously available when you open an Analytics dashboard. Those same data geeks would say that Analytics data is denormalized, meaning that some of the preprocessing work is already done.
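To make the normalized-versus-denormalized idea concrete, here’s a minimal Python sketch. It isn’t how Analytics works internally. The sample records, the IDs, and the `denormalize` helper are all made up for illustration. The point is that the joins happen once, up front, so a later query only has to read flat rows.

```python
# Hypothetical sample records; IDs and values are invented for illustration.
accounts = {
    "A1": {"Name": "Acme", "Industry": "Manufacturing"},
    "A2": {"Name": "Globex", "Industry": "Energy"},
}
users = {
    "U1": {"Name": "Dana Lee", "Title": "Account Executive"},
}
opportunities = [
    {"Name": "Acme - 100 Widgets", "Amount": 50000, "AccountId": "A1", "OwnerId": "U1"},
    {"Name": "Globex - Renewal", "Amount": 12000, "AccountId": "A2", "OwnerId": "U1"},
]


def denormalize(opportunities, accounts, users):
    """Join each opportunity to its account and owner once, up front."""
    rows = []
    for opp in opportunities:
        account = accounts[opp["AccountId"]]
        owner = users[opp["OwnerId"]]
        rows.append({
            "Name": opp["Name"],
            "Amount": opp["Amount"],
            "Account.Name": account["Name"],
            "Account.Industry": account["Industry"],
            "Owner.Name": owner["Name"],
            "Owner.Title": owner["Title"],
        })
    return rows


flat_rows = denormalize(opportunities, accounts, users)

# Later queries read the flat rows directly; no lookups are needed.
total_by_industry = {}
for row in flat_rows:
    industry = row["Account.Industry"]
    total_by_industry[industry] = total_by_industry.get(industry, 0) + row["Amount"]

print(total_by_industry)
```

In this toy version the join cost is paid once in `denormalize`, and the aggregation afterward never touches the account or user records again.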

Datasets Make Queries Faster

Analytics datasets compress and index their contents, so querying is super fast.

Datasets Combine Salesforce Data and External Data

Analytics datasets let you combine such things as your Salesforce opportunity data with financial, quota, or demographic data from another system or source. This type of data combination is not possible with a Salesforce report without doing a ton of work with custom objects.

How Do You Create a Dataset?

There are two main stages in creating a dataset: extraction and preparation. Extraction, as painful as it can sound, is simply the process of bringing data into Analytics. Preparation involves getting that data into a form that’s meaningful to the people exploring it. To compare the process to cooking, extraction is taking the ingredients from your cupboards, and preparation is putting them together to make, say, a stew.

[Figure: Extract and prepare]

There are various ways you can extract data into Analytics. You can bring in external data through a CSV file using the CSV uploader, or bring it in using connectors or the Analytics API. For Salesforce data, you can use a powerful tool called the dataflow. For preparation, you can also use the dataflow—or you can get cooking with the dataset recipe tool.

[Figure: Extract and prepare tools]

You extract and prepare data later in this module, where we tell you more about these tools.

Plan the Data Journey

Now that you’re familiar with the key concepts, let’s talk about the planning you do before you try to bring data into Analytics. There are two steps to planning: identifying your data requirements and mapping the data’s journey through Analytics.

Step 1: Identify Your Data Requirements

When you identify data requirements, think about what data you need, where it’s located, and if it needs to be combined with other data. Fortunately, your VP of Sales has provided a list of fields that the sales leaders want to see when they explore opportunities in Analytics.

Name            Account Name       SIC Description
Created Date    Industry           Opportunity Owner Name
Close Date      Billing City       Opportunity Owner Title
Amount          Billing Country
Stage           SIC Code

Diligent admin that you are, you spend some time poking around in Salesforce. You soon realize that these fields don’t all come from a single object. There are opportunity fields, yes, but you also need to pull in fields from the Account and User objects. And SIC Descriptions? That data isn’t in Salesforce. Looks like you have to get it from somewhere else. When pushed, the VP of Sales recommends that you reach out to the Sales Operations team. As luck would have it, they can give you a CSV file of SIC code descriptions.

Your poking around has paid off. You’ve identified the fields you need and where they’re going to come from.

Opportunity     Account            User         CSV File
Name            Name               Full Name    SIC Description
Created Date    Industry           Title
Close Date      SIC Code
Amount          Billing City
Stage           Billing Country
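If you like to keep this kind of planning in a form you can check, the field list and source breakdown above can be written down as a small Python mapping. This is only a planning aid, not anything Analytics requires. The variable names are hypothetical, and the pairing of each requested field with a source field is our reading of the two lists above.

```python
# Fields the VP of Sales requested, using the names from the list above.
requested_fields = [
    "Name", "Created Date", "Close Date", "Amount", "Stage",
    "Account Name", "Industry", "Billing City", "Billing Country", "SIC Code",
    "SIC Description", "Opportunity Owner Name", "Opportunity Owner Title",
]

# Where each requested field comes from: (source, field name in that source).
field_sources = {
    "Name": ("Opportunity", "Name"),
    "Created Date": ("Opportunity", "Created Date"),
    "Close Date": ("Opportunity", "Close Date"),
    "Amount": ("Opportunity", "Amount"),
    "Stage": ("Opportunity", "Stage"),
    "Account Name": ("Account", "Name"),
    "Industry": ("Account", "Industry"),
    "SIC Code": ("Account", "SIC Code"),
    "Billing City": ("Account", "Billing City"),
    "Billing Country": ("Account", "Billing Country"),
    "Opportunity Owner Name": ("User", "Full Name"),
    "Opportunity Owner Title": ("User", "Title"),
    "SIC Description": ("CSV File", "SIC Description"),
}

# Any requested field without a known source still needs tracking down.
missing = [field for field in requested_fields if field not in field_sources]

# Group the fields by source to see what each extraction has to bring in.
fields_by_source = {}
for source, field in field_sources.values():
    fields_by_source.setdefault(source, []).append(field)

print("Missing sources:", missing)
for source, fields in fields_by_source.items():
    print(source, "->", fields)
```

An empty `missing` list means every requested field has a home, and the grouped output tells you which sources you have to extract from.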

Step 2: Map the Data Journey

Once you’ve identified each source of data, you can start thinking about the journey: What route will the data take and when will it start? Timing is important here, as you have data arriving from various sources that you need to combine. If data isn’t available at the right time, you can’t combine it. To help you, Analytics lets you schedule your extracts and preparations so you extract the data when it’s freshest and have it when you need it for preparation. The act of mapping the data journey also breaks down the process into a series of steps for you to follow.

Since we haven’t yet covered the extract and prepare tools, we’ve mapped the data journey for you for this example. The map shows you steps you take as you go through this module.

Because the SIC descriptions are in a CSV file, you use the CSV uploader to extract them. The remaining data is all in Salesforce, so you use the dataflow to extract it. You can run these two extractions in any order, as long as all the data is extracted before you get to the prepare stage.

The dataflow also does some of the preparation for you by joining together the fields from the three Salesforce objects. Finally, you use a dataset recipe to add the SIC descriptions.
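The last step, adding SIC descriptions to rows that already carry a SIC code, is a lookup by key. Here’s a rough Python sketch of that logic. It is not a dataflow or recipe definition, and Analytics doesn’t ask you to write code like this. The CSV column names, the sample rows, and the variable names are assumptions made for the example.

```python
import csv
import io

# Stand-in for the CSV file from Sales Operations; the column names and
# rows are assumptions made for this example.
sic_csv = io.StringIO(
    "SIC Code,SIC Description\n"
    "3571,Electronic Computers\n"
    "7372,Prepackaged Software\n"
)

# Stand-in for the output of the dataflow: opportunity rows already
# joined with account and user fields (only a few fields shown).
opportunity_rows = [
    {"Name": "Opp 1", "Amount": 50000, "SIC Code": "3571"},
    {"Name": "Opp 2", "Amount": 12000, "SIC Code": "7372"},
    {"Name": "Opp 3", "Amount": 8000, "SIC Code": "9999"},  # no match in the CSV
]

# "Extract": read the CSV into a lookup keyed by SIC code.
sic_descriptions = {
    row["SIC Code"]: row["SIC Description"] for row in csv.DictReader(sic_csv)
}

# "Prepare": add the description to each row, leaving it empty when
# the code has no match.
final_rows = [
    {**row, "SIC Description": sic_descriptions.get(row["SIC Code"], "")}
    for row in opportunity_rows
]

for row in final_rows:
    print(row["Name"], row["SIC Code"], row["SIC Description"])
```

The sketch also shows why timing matters: the lookup can only run once both the joined Salesforce rows and the CSV contents are available.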

Here’s the data journey we’ve mapped for you.

[Figure: Data Journey Map]

When you do your own data integrations in Analytics, we recommend that you map the data journey in a similar way.

OK, so let’s get started. First up, we get you set up with an org so that you can follow along, and then you run through the process of extracting external data.

Try Analytics with a Developer Edition Org

A free Developer Edition org is a safe environment where you can practice the skills you’re learning. You definitely need one as you work through the challenges here on Trailhead.

Important

For this trail, you can’t use an existing Developer Edition org or a Trailhead Playground org. You have to use a special Analytics-enabled Developer Edition org instead. You must sign up for this special Developer Edition because it comes with a limited Analytics Platform license and contains sample data required for this Analytics trail.

Let’s get you set up so you can log in and start working with Analytics.

  1. Go to developer.salesforce.com/promotions/orgs/wave-de.
  2. Fill out the form using an active email address. Your username must also look like an email address and be unique, but it doesn’t need to be a valid email account. For example, your username can be yourname@waverocks.de, or you can put in your company name.
  3. After you fill out the form, click Sign me up. A confirmation message appears, asking you to check your email.
  4. When you receive the activation email, open it and click Verify Account.
  5. Complete your registration, and set your password and challenge question.
    Tip

    Write down or remember your credentials. To log in and play, go to login.salesforce.com.

  6. Click Change Password.

    You’ll be logged in to your new Analytics Developer Edition org. If you see the Welcome to Lightning Experience window, close it.

Way to go! You now have a Salesforce org with DTC Electronics sales data.
