
Explore the Data Set

Learning Objectives

After completing this unit, you’ll be able to:

  • Set up a Tableau Public account.
  • Explain the raw data used in this trail.
  • Define key terminology about the Tableau data model.
Note

This module is for training purposes only. It does not indicate that Love Productions, Ltd., in relation to The Great British Bake Off, endorses Salesforce or its services.

Learn Data Analysis with Tableau Public

When it comes to using Tableau, the best way to learn data analysis is to analyze data. And the best way to analyze data is to have questions you’re curious about. This module is part of a trail where you learn how to use Tableau. And we whipped up something appetizing—a data set that focuses on The Great British Bake Off (or Great British Baking Show).

You don’t need to be a fan of the show or a baker to understand the data, of course. Just know it’s a reality show where contestants—the bakers—face three challenges per episode: the Signature Bake, the Technical Challenge, and the Showstopper. At the end of each episode, based on how they did in those three bakes, a baker is eliminated, and another baker is named Star Baker. The final episode is a showdown between the top three bakers.

So let’s see what’s on the menu today! Let’s first go over the equipment you need (a Tableau Public account), the ingredients (details about the data set), and techniques to get familiar with (data relationships).

Prepare Your Equipment: Tableau Public

The Tableau interactive units you work on use Tableau Public’s web authoring interface as the playground. Once you connect your account, there’s no need to go to another tab or switch to another window. You’re right there! So let’s get you ready by creating a Tableau Public account.

If you already have a Tableau Public account, feel free to skip to the next section.

A Tableau Public account is completely free. For a more in-depth introduction to Tableau Public, check out Data Storytelling with Tableau Public.

  1. Sign up for a Tableau Public account.
    • Fill in the form with your information, including a strong password.
    • Then, click CREATE MY ACCOUNT.
    • Make sure you have your username and password handy as you use them to connect your playground in the next unit.
    • You’ll receive an activation email from @tableau.com.
  2. Activate your account via the email you receive. This is necessary before you can link your account to the playground.

By creating an account, you're joining a community of inspiring data enthusiasts there to support you on your learning journey.

Gather the Ingredients: The Data Set

As any good baker knows, it's important to have all your ingredients assembled before you start mixing.

  1. Download the zip file containing the data files you'll use throughout the trail.
  2. Extract the files so they’re easy to access.

There are five .csv files:

  • Bakers
  • ChallengeBakes
  • Episodes
  • Outcomes
  • Seasons

Let's explore what kind of data is in each table. For full field definitions, refer to the data dictionary on Tableau Public.
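If you'd like to check the field and row counts for yourself once the files are extracted, a minimal Python sketch like the one below can do it. The `.csv` file names and the `bake_off_data` folder are assumptions about how the zip unpacks, and `summarize_csv` and `summarize_folder` are hypothetical helper names, not part of the trail.

```python
import csv
from pathlib import Path

# Table names come from the list above. The ".csv" file names are an
# assumption about how the extracted files are named.
TABLES = ["Bakers", "ChallengeBakes", "Episodes", "Outcomes", "Seasons"]


def summarize_csv(path):
    """Return (field_count, row_count) for a CSV file with one header row."""
    # utf-8-sig tolerates a byte-order mark, which some exports include.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        # Count data rows, skipping completely blank lines.
        row_count = sum(1 for row in reader if row)
    return len(header), row_count


def summarize_folder(folder):
    """Summarize every expected table that is present in the folder."""
    summary = {}
    for name in TABLES:
        path = Path(folder) / f"{name}.csv"
        if path.exists():
            summary[name] = summarize_csv(path)
    return summary


if __name__ == "__main__":
    # "bake_off_data" is a hypothetical folder name; use your own location.
    for table, (fields, rows) in summarize_folder("bake_off_data").items():
        print(f"{table}: {fields} fields, {rows} rows")
```

The row count excludes the header line, so it should be comparable to the row counts quoted for each table.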

The Bakers table has 5 fields and 168 rows. It contains data about each contestant, such as their name, age when they were on the show, and a link to a headshot.

A spreadsheet view of the Bakers.csv table with the field names highlighted.

The ChallengeBakes table has 8 fields and 1003 rows. It contains data about what each contestant baked in each episode and how they did in the Technical Challenge.

A spreadsheet view of the ChallengeBakes.csv table with the field names highlighted.

The Episodes table has 12 fields and 134 rows. It contains information about each episode, such as what baking challenges were assigned and the theme.

Note

Since this is a training data set, we’ve included made-up metrics about viewership and ratings.

A spreadsheet view of the Episodes.csv table with the field names highlighted.

The Outcomes table has 9 fields and 964 rows. It contains information about how each baker did in each episode. The final episodes are not present as they have a different outcome format.

A spreadsheet view of the Outcomes.csv table with the field names highlighted.

The Seasons table has 10 fields and 56 rows. It contains information about each season's judges, hosts, winner, and network, and about how the season was listed on various streaming platforms. Each season appears in four rows because of how the Hosts and Judges columns are structured: there is one row for each of the four unique Host and Judge combinations in a season.

A spreadsheet view of the Seasons.csv table with the field names highlighted.
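To see why the Seasons table has four rows per season, here is a small sketch of the arithmetic. It assumes two hosts and two judges per season, which is one way to arrive at four combinations. The names are placeholders, and the count of 14 seasons is simply 56 rows divided by 4.

```python
from itertools import product

# Placeholder names: the text only says each season has four
# Host/Judge combinations, so two hosts and two judges are assumed.
hosts = ["Host A", "Host B"]
judges = ["Judge A", "Judge B"]


def season_rows(season, hosts, judges):
    """Build one row per Host/Judge combination for a single season."""
    return [
        {"Season": season, "Host": host, "Judge": judge}
        for host, judge in product(hosts, judges)
    ]


# 56 rows / 4 combinations per season = 14 seasons.
rows = [row for season in range(1, 15) for row in season_rows(season, hosts, judges)]

print(len(rows))                            # 56 rows in total
print(len({row["Season"] for row in rows}))  # 14 distinct seasons
```

The practical takeaway is that counting rows in this table overstates the number of seasons, so count distinct Season values instead.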

Review Your Techniques: Joins and Relationships

Because the data is stored in multiple tables, you need to build a data model that tells Tableau how the tables connect to each other. There are several options to combine tables, including unions, various kinds of joins, and relationships.

Unions merge tables of data by adding new rows across the same column structure. A new column is added to track the original table names.

This is like using the append operation in Excel to add new data to the bottom of a spreadsheet.

Two tables with three rows of data, and their unioned result showing six rows, and a new column indicating the original tables’ names
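As a rough illustration of the idea, and not of how Tableau implements it, a union can be sketched in a few lines of Python. The sample rows and the `Table Name` column label are made up for the example.

```python
def union(tables):
    """Stack tables that share the same columns.

    `tables` maps a table name to a list of row dictionaries. Each output
    row gains a "Table Name" column recording where it came from.
    """
    result = []
    for table_name, rows in tables.items():
        for row in rows:
            result.append({**row, "Table Name": table_name})
    return result


# Hypothetical sample data: two tables with the same column structure.
tables = {
    "Table1": [
        {"Name": "Ana", "Bakes": 3},
        {"Name": "Ben", "Bakes": 5},
        {"Name": "Cal", "Bakes": 2},
    ],
    "Table2": [
        {"Name": "Dee", "Bakes": 4},
        {"Name": "Eli", "Bakes": 1},
        {"Name": "Fay", "Bakes": 6},
    ],
}

unioned = union(tables)
print(len(unioned))  # 6 rows: three from each table
```

The column count stays the same apart from the added source column. Only the number of rows grows.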

Joins merge tables of data by adding new columns. This is similar to a VLOOKUP in Excel.

Two tables of data and four sample result tables illustrating an inner join with fewer rows, an outer join with nulls in two columns, and a left and right join with nulls

In this example, the tables are joined on the Name column. If the name is the same in both tables, the rows match.

  • Inner joins keep rows that have the same name in both tables. The joined table has no nulls and drops rows with mismatched values.
  • Outer joins keep all rows from both tables with nulls for mismatched names. No rows are dropped.
  • Left joins keep all rows from the left table and bring in columns from the right table with nulls for mismatched names. Rows for names that are only in the right table are dropped.
  • Right joins keep all rows from the right table and bring in columns from the left table with nulls for mismatched names. Rows for names that are only in the left table are dropped.
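The four join types can be compared side by side with a minimal Python sketch. It is only an illustration of the row-matching logic, with made-up sample data, and it assumes the key column is the only column name the two tables share. `None` plays the role of null.

```python
def other_columns(rows, key):
    """List the non-key column names in the order they first appear."""
    seen = []
    for row in rows:
        for column in row:
            if column != key and column not in seen:
                seen.append(column)
    return seen


def join(left, right, key, how="inner"):
    """Join two lists of row dictionaries on a single key column."""
    if how not in ("inner", "left", "right", "outer"):
        raise ValueError(f"unknown join type: {how}")
    left_cols = other_columns(left, key)
    right_cols = other_columns(right, key)

    # Index the right table by key so matches are quick to find.
    right_by_key = {}
    for row in right:
        right_by_key.setdefault(row[key], []).append(row)

    result, matched_keys = [], set()
    for left_row in left:
        matches = right_by_key.get(left_row[key], [])
        if matches:
            matched_keys.add(left_row[key])
            result.extend({**left_row, **right_row} for right_row in matches)
        elif how in ("left", "outer"):
            # Keep the unmatched left row, with nulls for the right columns.
            result.append({**left_row, **dict.fromkeys(right_cols)})
    if how in ("right", "outer"):
        for right_row in right:
            if right_row[key] not in matched_keys:
                # Keep the unmatched right row, with nulls for the left columns.
                result.append({key: right_row[key], **dict.fromkeys(left_cols), **right_row})
    return result


# Hypothetical sample data: "Cal" is only on the left, "Dee" only on the right.
left = [{"Name": "Ana", "Age": 31}, {"Name": "Ben", "Age": 45}, {"Name": "Cal", "Age": 27}]
right = [{"Name": "Ana", "City": "Leeds"}, {"Name": "Ben", "City": "Bath"}, {"Name": "Dee", "City": "York"}]

for how in ("inner", "left", "right", "outer"):
    print(how, len(join(left, right, "Name", how)))
# inner 2, left 3, right 3, outer 4
```

With this sample, the inner join drops both Cal and Dee, the left join keeps Cal with a null City, the right join keeps Dee with a null Age, and the outer join keeps everyone.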

Relationships are the default method to combine data in Tableau. Relationships are built using relationship clauses that establish which fields connect which tables. (If you’re familiar with joins, the relationship clause is analogous to a join clause.)

Setting up a relationship defines how two tables could be joined, but rather than merging the tables right away like a join or a union, a relationship simply holds the information. A related data source stays very flexible and dynamic because the data isn't combined ahead of time into a single, fixed configuration. As you do your analysis, Tableau uses that relationship information to automatically create the necessary joins behind the scenes as you use the data source. Pretty cool, huh?
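One way to picture the difference is the conceptual sketch below. It is not Tableau's implementation. It only shows a relationship being stored as information and consulted later, when a view actually needs fields from both tables. The field names in `field_owner` are hypothetical, and the sketch ignores the case where two tables connect only through a third.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Stored information about how two tables could be joined."""
    left_table: str
    right_table: str
    keys: tuple  # field names that connect the two tables


def joins_for_view(fields_in_view, field_owner, relationships):
    """Return only the relationships whose two tables are both used by the view."""
    needed_tables = {field_owner[field] for field in fields_in_view}
    return [
        rel for rel in relationships
        if rel.left_table in needed_tables and rel.right_table in needed_tables
    ]


relationships = [
    Relationship("ChallengeBakes", "Bakers", ("Baker", "Season")),
    Relationship("Episodes", "Seasons", ("Season",)),
]

# Hypothetical field names, mapped to the table that holds them.
field_owner = {
    "Age": "Bakers",
    "Technical Rank": "ChallengeBakes",
    "Network": "Seasons",
}

# A view that uses fields from two related tables needs one join.
print(joins_for_view(["Age", "Technical Rank"], field_owner, relationships))

# A view that uses a single table needs no join at all.
print(joins_for_view(["Age"], field_owner, relationships))  # []
```

Nothing is merged when the relationships are defined. The joins are worked out per view, which is what keeps the data source flexible.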

Preview the Data Model

Here's a sneak peek at the data model you build in the next few units:

A schematic showing the relationships between the tables in the data set. Episodes is to the far left, with two branches. One is a relationship to Seasons on the Season field. The other branch from Episodes is to ChallengeBakes on the Season and Episode fields. ChallengeBakes also has two branches. One is a relationship to the Bakers table on both the Baker and Season fields. The other is to Outcomes on the Baker field and SeasonEpisode field.

Don't worry, you'll build it step by step as you go.
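If it helps to see the model as data, the schematic can be written down as a list of table pairs and the fields that connect them. The sketch below is only a study aid. The `path_between` helper is hypothetical, and it uses a breadth-first search to show which tables sit between any two others.

```python
from collections import deque

# Relationships as described in the schematic: (table, table, connecting fields).
RELATIONSHIPS = [
    ("Episodes", "Seasons", ("Season",)),
    ("Episodes", "ChallengeBakes", ("Season", "Episode")),
    ("ChallengeBakes", "Bakers", ("Baker", "Season")),
    ("ChallengeBakes", "Outcomes", ("Baker", "SeasonEpisode")),
]


def neighbors(table):
    """Yield every table directly related to the given table."""
    for first, second, _fields in RELATIONSHIPS:
        if first == table:
            yield second
        elif second == table:
            yield first


def path_between(start, goal):
    """Return the shortest chain of related tables from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for table in neighbors(path[-1]):
            if table not in visited:
                visited.add(table)
                queue.append(path + [table])
    return None


print(path_between("Seasons", "Bakers"))
# ['Seasons', 'Episodes', 'ChallengeBakes', 'Bakers']
```

For example, Seasons and Bakers are not directly related. They connect through Episodes and ChallengeBakes.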

In the next unit, you get hands-on with these techniques in Tableau Public. Make sure you have your Tableau Public account credentials ready so you can link your playground.

Note

Keep in Mind for This Module

In the following interactive units, you get hands-on with Tableau Public. There are a few differences between Tableau Public’s free capabilities and the capabilities in the purchased editions of Tableau. They mainly concern which data sources you can use, privacy and governance, and the differences between authoring on the web and authoring in Tableau Desktop. The interactive units may contain visuals or directions that aren’t a perfect match for other parts of the Tableau platform, and the resources here may refer to functionality and UI that don’t exactly match Tableau Public.

