Explore the Data Set
Learning Objectives
After completing this unit, you’ll be able to:
- Set up a Tableau Public account.
- Explain the raw data used in this trail.
- Define key terminology about the Tableau data model.
Learn Data Analysis with Tableau Public
When it comes to using Tableau, the best way to learn data analysis is to analyze data. And the best way to analyze data is to have questions you’re curious about. This module is part of a trail where you learn how to use Tableau. And we whipped up something appetizing—a data set that focuses on The Great British Bake Off (or Great British Baking Show).
You don’t need to be a fan of the show or a baker to understand the data, of course. Just know it’s a reality show where contestants—the bakers—face three challenges per episode: the Signature Bake, the Technical Challenge, and the Showstopper. At the end of each episode, based on how they did in those three bakes, a baker is eliminated, and another baker is named Star Baker. The final episode is a showdown between the top three bakers.
So let’s see what’s on the menu today! Let’s first go over the equipment you need (a Tableau Public account), the ingredients (details about the data set), and techniques to get familiar with (data relationships).
Prepare Your Equipment: Tableau Public
The Tableau interactive units you work on use Tableau Public’s web authoring interface as the playground. Once you connect your account, there’s no need to go to another tab or switch to another window. You’re right there! So let’s get you ready by creating a Tableau Public account.
If you already have a Tableau Public account, feel free to skip to the next section.
A Tableau Public account is completely free. For a more in-depth introduction to Tableau Public, check out Data Storytelling with Tableau Public.
- Sign up for a Tableau Public account.
- Fill in the form with your information, including a strong password.
- Then, click CREATE MY ACCOUNT.
- Keep your username and password handy, because you use them to connect your playground in the next unit.
- You’ll receive an activation email from @tableau.com.
- Activate your account via the email you receive. This is necessary before you can link your account to the playground.
By creating an account, you're joining a community of inspiring data enthusiasts there to support you on your learning journey.
Gather the Ingredients: The Data Set
As any good baker knows, it's important to have all your ingredients assembled before you start mixing.
- Download the data files zip you'll use throughout the trail.
- Extract the files so they’re easy to access.
There are five .csv files:
- Bakers
- ChallengeBakes
- Episodes
- Outcomes
- Seasons
Let's explore what kind of data is in each table. For a full description of every field, refer to the data dictionary on Tableau Public.
The Bakers table has 5 fields and 168 rows. It contains data about each contestant, such as their name, age when they were on the show, and a link to a headshot.
The ChallengeBakes table has 8 fields and 1,003 rows. It contains data about what each contestant baked in each episode and how they did in the Technical Challenge.
The Episodes table has 12 fields and 134 rows. It contains information about each episode, such as what baking challenges were assigned and the theme.
The Outcomes table has 9 fields and 964 rows. It contains information about how each baker did in each episode. The final episodes are not present as they have a different outcome format.
The Seasons table has 10 fields and 56 rows. It contains information about each season's judges, hosts, winner, and network, and how the season was listed on various streaming platforms. Each season has four rows because the Hosts and Judges columns are structured so that there is one row for each of the four unique combinations of Host and Judge in that season.
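To see why a season can take up four rows, here is a minimal Python sketch. It assumes each season has two hosts and two judges, which is one way to arrive at four combinations. The names are placeholders, not values from the Seasons table.

```python
from itertools import product

# Placeholder names (assumption): the real values live in the Seasons table.
hosts = ["Host A", "Host B"]
judges = ["Judge A", "Judge B"]

# One row per unique (host, judge) pairing for a single season.
rows_for_one_season = [
    {"Season": 1, "Host": host, "Judge": judge}
    for host, judge in product(hosts, judges)
]

rows_per_season = len(rows_for_one_season)

# If every season follows the same pattern, the 56 rows in the table
# imply this many seasons.
seasons_implied = 56 // rows_per_season

for row in rows_for_one_season:
    print(row)
print(rows_per_season, "rows per season,", seasons_implied, "seasons implied")
```

Keep this repetition in mind when you count or sum anything from the Seasons table, because season-level values appear once per combination rather than once per season.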
Review Your Techniques: Joins and Relationships
Because the data is stored in multiple tables, you need to build a data model that tells Tableau how the tables connect to each other. There are several options to combine tables, including unions, various kinds of joins, and relationships.
Unions merge tables of data by adding new rows across the same column structure. A new column is added to track the original table names.
This is like using the append operation in Excel to add new data to the bottom of a spreadsheet.
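If it helps to see the idea outside of Tableau, here is a rough Python sketch of a union. The sample rows are made up, the `union` function is a hypothetical helper, and the "Table Name" column name is an assumption for this example.

```python
def union(tables):
    """Stack rows from tables that share the same column structure.

    `tables` maps a table name to a list of row dictionaries. Each output
    row gets an extra "Table Name" field that records where it came from.
    """
    combined = []
    for table_name, rows in tables.items():
        for row in rows:
            combined.append({**row, "Table Name": table_name})
    return combined


# Made-up sample data, not taken from the trail's data set.
january = [{"Name": "Ana", "Bakes": 3}, {"Name": "Ben", "Bakes": 5}]
february = [{"Name": "Ana", "Bakes": 4}]

unioned = union({"January": january, "February": february})
for row in unioned:
    print(row)
```

The result has the same columns as the inputs plus the tracking column, and its row count is the sum of the input row counts.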
Joins merge tables of data by adding new columns. This is similar to a VLOOKUP in Excel.
In this example, the tables are joined on the Name column. If the name is the same in both tables, the rows match.
- Inner joins keep rows that have the same name in both tables. The joined table has no nulls and drops rows with mismatched values.
- Outer joins keep all rows from both tables with nulls for mismatched names. No rows are dropped.
- Left joins keep all rows from the left table and bring in columns from the right table with nulls for mismatched names. Rows for names that are only in the right table are dropped.
- Right joins keep all rows from the right table and bring in columns from the left table with nulls for mismatched names. Rows for names that are only in the left table are dropped.
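The four join types can be compared side by side in a small Python sketch. This is an illustration only, not how Tableau performs joins. The `join` function and the sample rows are hypothetical, and the sketch assumes that each name appears at most once per table, that both tables are non-empty, and that the tables share no column other than Name. Python's `None` stands in for null.

```python
def join(left, right, key, how="inner"):
    """Join two lists of row dictionaries on `key`.

    `how` is one of "inner", "left", "right", or "outer".
    """
    left_by_key = {row[key]: row for row in left}
    right_by_key = {row[key]: row for row in right}
    left_cols = [col for col in left[0] if col != key]
    right_cols = [col for col in right[0] if col != key]

    if how == "inner":
        keys = [k for k in left_by_key if k in right_by_key]
    elif how == "left":
        keys = list(left_by_key)
    elif how == "right":
        keys = list(right_by_key)
    elif how == "outer":
        keys = list(left_by_key) + [k for k in right_by_key if k not in left_by_key]
    else:
        raise ValueError(f"unknown join type: {how}")

    joined = []
    for k in keys:
        row = {key: k}
        # A missing match on either side is filled with None (null).
        for col in left_cols:
            row[col] = left_by_key.get(k, {}).get(col)
        for col in right_cols:
            row[col] = right_by_key.get(k, {}).get(col)
        joined.append(row)
    return joined


# Made-up sample data: Ben is only in the left table, Cleo only in the right.
ages = [{"Name": "Ana", "Age": 31}, {"Name": "Ben", "Age": 45}]
bakes = [{"Name": "Ana", "Bake": "Scones"}, {"Name": "Cleo", "Bake": "Trifle"}]

inner_rows = join(ages, bakes, "Name", "inner")
left_rows = join(ages, bakes, "Name", "left")
right_rows = join(ages, bakes, "Name", "right")
outer_rows = join(ages, bakes, "Name", "outer")

for label, rows in [("inner", inner_rows), ("left", left_rows),
                    ("right", right_rows), ("outer", outer_rows)]:
    print(label, rows)
```

With this sample, only Ana survives the inner join, Ben appears with a null Bake in the left join, Cleo appears with a null Age in the right join, and the outer join keeps all three names.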
Relationships are the default method to combine data in Tableau. Relationships are built using relationship clauses that establish which fields connect which tables. (If you’re familiar with joins, the relationship clause is analogous to a join clause.)
Setting up a relationship defines how two tables could be joined, but rather than merging the tables right away like a join or a union, a relationship simply holds the information. A related data source stays very flexible and dynamic because the data isn't combined ahead of time into a single, fixed configuration. As you do your analysis, Tableau uses that relationship information to automatically create the necessary joins behind the scenes as you use the data source. Pretty cool, huh?
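One way to picture the difference is the conceptual Python sketch below. It is an analogy, not Tableau's actual implementation. The `Relationship` class, its `query` method, and the sample rows are all hypothetical, and the sketch always uses an inner join, which is a simplification. The point is that the tables stay separate, and a join happens only when a question needs fields from both.

```python
class Relationship:
    """Holds how two tables connect without merging them up front."""

    def __init__(self, left, right, key):
        self.left = left
        self.right = right
        self.key = key
        self.joins_performed = 0  # counts how often a join was really needed

    def query(self, fields):
        """Return rows with only the requested fields, joining only if needed."""
        wanted = set(fields)
        if wanted <= set(self.left[0]):
            source = self.left       # answerable from the left table alone
        elif wanted <= set(self.right[0]):
            source = self.right      # answerable from the right table alone
        else:
            source = self._join()    # fields span both tables
        return [{f: row[f] for f in fields} for row in source]

    def _join(self):
        self.joins_performed += 1
        right_by_key = {}
        for row in self.right:
            right_by_key.setdefault(row[self.key], []).append(row)
        joined = []
        for left_row in self.left:
            for right_row in right_by_key.get(left_row[self.key], []):
                joined.append({**left_row, **right_row})
        return joined


# Made-up sample data, not taken from the trail's data set.
bakers = [{"Baker": "Ana", "Age": 31}, {"Baker": "Ben", "Age": 45}]
bakes = [
    {"Baker": "Ana", "Bake": "Scones"},
    {"Baker": "Ana", "Bake": "Trifle"},
    {"Baker": "Ben", "Bake": "Focaccia"},
]

model = Relationship(bakers, bakes, key="Baker")

ages = model.query(["Baker", "Age"])      # uses the bakers table only
joins_after_first = model.joins_performed

age_and_bake = model.query(["Age", "Bake"])  # needs both tables
joins_after_second = model.joins_performed

print(ages)
print(age_and_bake)
print(joins_after_first, joins_after_second)
```

In this sketch the first question returns one row per baker with no duplication, because the tables were never merged. Only the second question triggers a join.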
Preview the Data Model
Here's a sneak peek at the data model you build in the next few units:
Don't worry, you'll build it step by step as you go.
In the next unit, you get hands-on with these techniques in Tableau Public. Make sure you have your Tableau Public account credentials ready so you can link your playground.