The Ultimate Tableau Workbook v0.4 - Compressed
The Ultimate Tableau Workbook v0.4 - Compressed
The Ultimate Tableau Workbook v0.4 - Compressed
Ultimate
Tableau
Workbook
Get Up Get Started In 4 Hours
By: Shashank Kalanithi
Framing the Course
Thank you for downloading my
Tableau Workbook. My hope is
that by the time you’re done
reading this you’ll know enough
to start figuring out Tableau on
your own. This course requires
no prior experience and is
designed to get you off the
ground.
This course comes with an
accompanying video located
here
https://www.youtube.com/watch?
v=Gl 2lg-TtRJo&t.
I believe that there is no better
way to truly learn a concept than
to apply it as soon as you learn
it. To help you with this, we’ll be
applying the concepts we learn
step by step as we go along.
Framing the Course
This course won’t make you
an expert at Tableau but
should get you off the
ground.
Like a plane, I believe that one
of the hardest parts of learning
is just starting, and I hope to
help you take off and become
self-sufficient.
Ke
You’ll notice a colored bar at
the top of each page. The color
of this bar represents the
purpose of the page as shown
below.
Shashank Kalanithi
https://commons.wikimedia.org/wiki/File:Anscombe%27s_quartet_3.svg#/media/File:Anscombe's_quartet_3.svg
What Makes Tableau so
Great
Tableau is operating in a crowded space and
needs to differentiate itself as a best-in-class
tool in order to stand out from the crowd.
Gartner has placed Tableau in the “Leader”
section of their Magic Quadrant for the last
eight years consecutively with consistent marks
given to its “Ease of visual exploration and data
manipulation” and “Customer enthusiasm”.
• Customer Enthusiasm: You only need to attend one Tableau
user group to see how well-liked the tool is by its users. This
enthusiasm ensures that answers to questions on the forum
are forthcoming and that support for problems is easy to
obtain. Tableau Conference 2019 had over 20,000 attendees
from all over the world who came to learn about the newest
release, connect with other Tableau users, and apply new
and innovative tricks and tips to visualize and manage
their data.
Shashank Kalanithi
https://www.gartner.com/doc/reprints?id=1-1YBTIWVR&ct=200211&st=sb
Career Opportunities
with Tableau
Tableau is a tool that is widely used
across the world and job opportunities
that directly work with Tableau are
plentiful and well-paying.
Who I’ve Seen Use
Tableau
Account Manager
• Explore areas of biggest concern when
servicing an account
Supply Chain Director
• Automate the weekly report out to a C-Level
executive
Warranty Manager
• Monitor a specialized warranty metric to ensure
product quality was maintained
Data Analyst
• Drive insights, clean data, explore data,
communicate findings
Shashank Kalanithi
Ease of visual exploration
and VizQL
Tableau was formed in the early 2000’s as
a spinoff of a project given to Stanford by
the Department of Defense. The project
yielded a new language called VizQL™.
VizQL™ forms the core of Tableau and
was revolutionary in how it was able to
translate dragging and dropping
elements on the screen into fully
interactive visualizations that allowed
the user to more easily translate thought
into action. As you create a Tableau
visualization,
you’ll realize that the software is designed
to easily allow you to dive deeper and
deeper into the data and uncover patterns
that you might have not seen before.
Additionally VizQL™ unifies the process of
creating visualizations so that you can
quickly change how you want to visualize
the data without fundamentally changing
anything about the structure of your data.
Shashank Kalanithi
The Hyper Database
In version 10.5 Tableau released the Hyper
database system. This was the product of an
acquisition that Tableau had made in 2016
from the Technical University of Munich.
Traditionally database engines (the underlying
software that reads, deletes, creates, and
updates the database) optimized for retrieving
the data from the database because that was
by far the slowest part of any data process.
As modern computer hardware had gotten
exponentially cheaper (CPUs and memory
specifically), the importance of both retrieving
and processing the data has become more
important. Hyper takes advantage of the
dozens of cores on modern computers and
more efficient query optimization to read,
write, update, and process data faster than
most databases out there.
As a tool, Tableau not only allows for the
presentation of data, but for the analysis of
data. For anyone who has worked in analytics,
pulling and manipulating considerable
amounts of data can not only slow down the
analytics process but also mess up your train
of thought. Having a fast database makes
analysis not only fast, but enjoyable
The Hyper Database
Shashank Kalanithi
The Hyper Database
Hyper has four main design principles:
•One System
• Is a general purpose database system that
combines transaction processing, data
ingestion, and data analytics
•One State
• Transaction processing, data ingestion, and
analytics are all on the same state so real-
time analytics can be performed
•No Tradeoffs
• Has SQL language support
• Has transactional guarantees
•No Delays
• Scales with available hardware resources
Shashank Kalanithi
The Hyper Database
Hyper compiles SQL code to Machine code and
executes the machine code
• Traditional databases use interpreters
instead of compilers
• You would normally write in Machine Code using C,
but compiling C code to Machine Code can be slow
• Tableau solves this by generating LLVM (low
level virtual machine) code instead which
compiles to Machine Code more quickly
• Tableau writes its own VM for short running queries
• Compilation is attractive now because historically,
memory was so expensive that just retrieving the
data was the slowest step, therefore efficient
processing of the data once it was obtained was
pointless
• Query optimization is important but is inherently
hard to figure out
• It's hard to compute
• Very few researchers globally specialize in
query optimization because it's such a hard
field to do research in
• In buying Hyper, Tableau had access to Dr. Thomas
Neumann who was one of the few people in the
world doing research in query optimization
• Modern CPUs have many cores
• Traditional parallelization only scales to a few cores
• They're not good at load balancing
• Hyper uses morsel-driven parallelization to
utilize a large number of cores
• Hyper divides the processing task into very small
components (morsels) and each core just takes on
a morsel whenever it can so you're always using as
many cores as possible
Shashank Kalanithi
How to Install Tableau
Shashank Kalanithi
Connect to
When you open Tableau, you will be greeted with a screen similar
to the one shown below
1. This is a sample of the data sources that you’ll be able to
connect to, look for the type of data source that you want
and click on it to connect to it
• You can also copy data from another application and paste it here to
quickly create a data source on the fly. (Just hit CTRL + V while on this
screen or on a blank sheet)
2. These are your previous workbooks. As you create more,
they will be shown here for easy access
3. Content to get you started with Tableau. There is a
limitless supply of tutorials for free at kb.tableau.com
4. This is where the “Viz of the Day” normally appears. These are
curated
visualizations that are made in Tableau and released on Tableau
Public
You can return to this screen or connect to another data source
at any time by clicking on the Tableau logo on the upper left of
the screen
2 3
4
Connect to
1. Click on “Microsoft Excel” under the
“Connect” menu
2. Select the workbook you want
to connect to: Chicago Filming
Dataset
1
2
Connect to
A Tableau data source can only be constructed
a table at a time. As such, when you import an
Excel document with multiple sheets you’ll be
asked to select one at a time to import. For now
we’ll just import a single sheet.
3. Drag the sheet “Chicago Filming Permits onto the
area marked “Drag sheets here”
• When we start creating more complicated data sources
the area in orange is where you’ll import all of your data
and form your joins and unions.
3
Tableau Data Source
Page
When you connect to most data sources in
Tableau you’ll be brought to this screen, this
is the Data Source Page.
Creating a data source is the first step of any
visualization and importing data is the first
step of creating a data source. If you’re
lucky, most data you import will be in a
format you can immediately use to start
working, but if that’s not the case, this is the
page you will use to get your data to a
workable format.
Tableau Data Source
Page
1. This is a list of your connections. A
connection would be something like a
database or an Excel workbook. This
is the first step in importing data to
create your data source.
2. This is a list of sheets associated with
your
current connection
4
3 6
1
5
2
7
Tableau Data Source
Page
3. This refresh button allows you to
refresh a data source that is live or
that has been modified
4. This dropdown allows you to toggle
through your connections
5. This is your canvas which allows
you to create your data source by
bringing in different sheets and
combining them
4
3 6
1
5
2
7
Tableau Data Source
Page
6. This is where you set your “Data
Source Filters”. These are filters
that limit your data before you
perform any other calculations on
it and can help improve
performance
7. This is the Data Grid and by default
displays the first 1,000 rows of your
data. You can change the textbox in
blue to display more data
4
3 6
1
5
2
7
Connect to
Now that we’ve connected to our first data source,
let’s
go to our first sheet and make a visualization
4. Select “Sheet 1” near the bottom of the screen to
go
to your workspace
4
Connect to
Your screen should look something like this.
Tableau
The Tableau workspace can be a little
intimidating at first glance but is
actually quite easy to understand. Here
I’ll give you a quick overview of what is
what on the Tableau Workspace.
Tableau
1. Here is your list of columns split into two
categories “Dimensions” and “Measures”
2. This is where your sheets and
dashboards will show up
3. Here is a list of all of the data sources
you’re connected to
4 8
3 6
1 7
2
Tableau
4. Here are some of the basic controls for
your Tableau workspace
•Open the Connect Menu to connect to other
data
sources
•Undo and Redo. Tableau has unlimited
undos and redos for a given work session
(every time you open a workbook)
•Add a data source
•Create a new sheet/dashboard
•Duplicate your current sheet or dashboard
4 8
3 6
1 7
2
Manipulating our Data
Let’s get back to our data.
Although Tableau intelligently handles data
imports, sometimes data is improperly
recognized. When this
happens it’s important for you to understand
your data well enough to recognize the error so
that it can be corrected.
1. Drag the Measure [Wards] from the
Measures section to the Dimensions
section (highlighted in blue)
1
Dimensions vs. Measures
-4 Kinds of Data-
What did we just do? We changed one of
the fields (columns) in our dataset from
a Measure to a Dimension.
Data in Tableau is classified as either a:
• Dimension
• Dimensions are any qualitative data
• Examples of qualitative data include names, colors, and cities
• This is mostly data that cannot be or is not numerical
• Measure
• Quantitative data, anything that can be numerically measured
• Population or profit
Discrete Dimensions
• Discrete
• Continuous
Continuous Measures
Measures s
• Discrete
Continuous Dimensions
Discrete Measures
3
Making Your First
Visualization: Bar Chart
Although basic, Bar Charts are some of the most
effective visualizations you’ll have in your toolbox.
Don’t skip on a bar chart just because it might
not be as visually stunning as another
visualization you have
1. Double-click “Sheet 1” and rename to “Applications
by
Ward Crosstab”, then right-click and click “Duplicate”
2. Rename your new sheet to “Applications by
Ward Bar Graph”
3. Click the “Show Me” button
4. Click on the bar graph option
5. Sort the data in descending order by the year 2019
3
5
4
1 2
Making Your First
Visualization: Bar Chart
Congratulations, in a few clicks you’ve created a
completely different type of chart. The ability to
easily duplicate sheets and see what charts
can be made will allow you to try all the
different charts in Tableau in rapid succession
to find what works best for you.
3
5
4
1 2
Making Your First
Visualization: Geospatial
1. Create a new sheet by clicking on the
new sheet button
2. Drag [Latitude] and [Longitude] into rows and
columns respectively
• Don’t use [Latitude (generated)] or [Longitude
(generated)]
3. Drag [Application Number] into the Details box on
the Marks card
4. Select “Add All Members”
2 1
Making Your First
Visualization: Geospatial
Once you have something like this you’re ready
to move onto the next step, if your graph looks
different the next few pages can help you
troubleshoot
Making Your First
Visualization: Geospatial
If you see something like the graph below,
don’t worry, that most likely just means that
you reversed the [Latitude] and [Longitude].
There are two ways to fix this:
1. Click on the “Swap Rows and Columns” button on
the toolbar.
This will swap the [Latitude] and [Longitude] in
the Rows and Columns shelves which will allow
Tableau to recognize that you’re trying to make
a map not a scatterplot
1
Making Your First
Visualization: Geospatial
This is the second way to fix the graph.
1. Click on the “Symbol Maps” visualization
type under the “Show Me” menu
1
Making Your First
Visualization: Geospatial
2. Drag the [Application Number] Measure
from the Rows Shelf where Tableau put
back to the Details Card where we want it
This will explicitly tell Tableau what
visualization you’re looking for and it will
reorganize your data to produce that
visualization
2
Making Your First
Visualization: Geospatial
Make sure your graph now looks like this
Making Your First
Visualization: Geospatial
5. Drag [Wards] into colors
• Again click on “Add All Members” to bring in all
50
wards
6. Rename this sheet to
“Application by Ward Map”
2
Making a
1. Click on the “New Dashboard” button
2. Navigate to the new
Dashboard you created
1
Making a
You might notice that the space dedicated
to your dashboard (outlined in blue) might
be small or awkwardly sized (depending on
how Tableau was used/configured on your
machine. This is how you change that:
3. Under “Size” > “Range” select “Automatic”
This will dynamically resize the space your
dashboard
takes depending on the size of the device it’s viewed
on
3
Making a
4. Double click the “Applications by
Ward Crosstab” and Applications
by Ward Map” sheet on the left
side of the window to add them
to the dashboard
1
How to Save Your Workbook
Nice! You’ve created your first
Dashboard. Now you might want to
save your work, and if you’re using the
full edition of Tableau it’s as easy as
clicking on “File” > “Save As” and
saving your work. On Tableau Public,
you only have the option of saving to
Tableau’s Public cloud. This means that
anything you create on Tableau Public
can be seen and shared with anyone.
Sheets, Dashboards,
and Stories oh my!
You might have noticed the three
buttons near the bottom of the screen
when you were creating your
dashboard. These are the three basic
workspaces that you can use in
Tableau
Sheets, Dashboards,
and Stories oh my!
• Sheet
• Used to create a single metric
• You must create sheets in order to use Dashboards
• Dashboard
• Houses multiple Sheets
• Great for tracking metrics
• Story
• Houses multiple Sheets and/or Dashboards
• Usually used for your final product
• Acts as a presentation, is used to show a data story
• Helps you present one of Seven Data Stories
Seven Data Stories
Change Over Time
• One of the most common stories, great for
tracking performance
• Company profit over time
Drill Down
• Focuses on a potential cause of a phenomenon
Zoom Out
• Explains how something affects the bigger picture
Contrast
• Shows the difference between two categories
Intersection
• How do two seemingly unrelated categories relate
• Is there a correlation between the
temperature in a region and the number of
batteries sold
Factors
• Divides a phenomenon into categories
• Sales by battery category
Outliers
• Helps show and/or explain an anomaly
• Does a certain distributor buy more of a
certain battery or not
https://help.tableau.com/current/pro/desktop/en-us/story_best_practices.htm
Section 1:
Congratulations! You’ve
completed the first part of this
course. I believe there is no
better way to learn than to do
and hopefully the exercises
you’ve done will give you a
feel for how to use Tableau.
As you use it for your own
purposes you’ll start to
formulate questions that can
generally be answered by
going to:
kb.tableau.com
This is the official Tableau
knowledge base and contains
all of their documentation for
Section 1:
users.
Section 1:
This is everything that we’ve learned so far!
• What is Tableau
• The Importance of Data Visualization
• What Makes Tableau so Great
• Ease of Visual Exploration and VizQL
• Hyper Database System
• How to Install Tableau
• Connecting to Static Data
• Anatomy of the Tableau Data Page
• Anatomy of the Tableau Workspace
• Cleaning Data
• Dimensions vs. Measures
• Tables
• Bar Charts
• Geospatial Data
• Basic Troubleshooting
• Resizing Dashboards
• Saving your Work
• Sheets, Dashboards, Stories
• Seven Data Stories
Objectives: Section
!/vizhome/Shash
ankTableauCourseSection1/Das
hbo ard1
Objectives: Section
Shashank Kalanithi
Calculated
There isn’t a field in the dataset that
tells
us if a film is a Big Budget film or an
Independent film, so we’ll have to
make our own field to determine
this. This new field we’re creating is
called a “Calculated Field” because
it is populated by values calculated
from other fields.
3
4
5
Calculated
We know that Big Budget films get charged
$250 per day of shooting while
Independents get charged $25 per day
of shooting. We also know what fees a
movie was charged through the
column [Total Fees]. Films can also
have some fees waived which must be
taken into account through the
[Waived Fees] column. What we need
to figure out now is how many days a
movie was shooting for.
3
4
5
Calculated
1. Create a new sheet
2. Under “Analysis” click “Create Calculated
Field”
3. Name your new calculated field, “Days
Spent Shooting”
4. Use the following formula
• DATEDIFF('day', [Application Start Date],
MAX([Application End Date], [Application
Expired Date])) + 1
5. If you don’t see a gray pop-up on the
side of the Calculated Field window, then
click this arrow to reveal it
3
4
5
Calculated
You can see that a pop-up with the
description of “DATEDIFF” pops up.
DATEDIFF is a function that is designed
to calculate the difference between two
dates using whatever time period you
want. (‘years’, ‘months’, ‘days’ etc.)
You can see the bolded text in the
description below shows you the
different parameters you need to fill in
this function with.
Calculated
DATEDIFF('day', [Application Start Date],
MAX([Application End Date], [Application Expired
Date])) + 1
2
3
4
Calculated
([Total Fees] - ZN([Waived Fees]))/[Days Spent
Shooting]
2
3
4
Calculated
IF [Fees Per Day] = 25 THEN "Independent“
ELSEIF [Fees Per Day] = 250 THEN "Big
Budget" ELSE NULL
END
1 4
2
3
Exercis
Using what you’ve learned, recreate
the
sheet shown below
Exercis
How to Filter
Click on the 2019 value under
“Application
Start Date” and select “Keep Only”
How to Filter
You could also create a filter by
dragging
the 2019 value from the “Dimensions”
shelf to the “Filter” card and
selecting the blue “year” and
“2019-”
1 2
How to Filter
You could also create a filter by
dragging
the 2019 value from the “Dimensions”
shelf to the “Filter” card and
selecting the blue “year” and
“2019-”
Changing Data
1. Right-click the date pill under
Columns
2. Select the first Month option
2
Line Graph
1. Under “Show Me” click on the first
line
graph option
1
Warning! Dates and
Line Graphs
Be careful which set of dates you use when
you’re creating line graphs.
If you look carefully you might notice
that there are two sets of date values
when we right- clicked on our date pill.
The top half highlighted in blue are
discrete values and the bottom half
highlighted in red are continuous.
Warning! Dates and
Line Graphs
Why is this important? To illustrate:
1. Duplicate this sheet so you don’t lose
your
work
2. Right-click your filter
3. Select “Edit Filter…”
4. Select the year “2018” so that you
include
data from 2018 and 2019 now
5. Select “OK”
Warning! Dates and
Line Graphs
Why is this important? To illustrate:
1. Right-click your filter
2. Select “Edit Filter…”
3. Select the year “2018” so that you
include
data from 2018 and 2019 now
4. Select “OK”
Warning! Dates and
Line Graphs
You might notice that the x-axis of the graph
has not changed at all. How can this
be? We’ve added an entire year’s worth
of values to our graph. Right-click the
date pill and you’ll see why.
Warning! Dates and
Line Graphs
Like we mentioned earlier, the Blue and
Red boxes represent whether Tableau
graphs the dates as Discrete or
Continuous respectively. But what about
the black box?
The black box just changes how the values
are represented on the axis. To more clearly
illustrate, let’s change the value in the
black box to be “Discrete”
Warning! Dates and
Line Graphs
You’ll notice that only the axis changed, we’re
still only seeing 12 x-values for two years of
data. This is because we chose “Month” from
the “Discrete” section therefore Tableau is
creating the months as a categorical
variable, meaning it’s adding all of the
January’s together and all of the February’s
together without regard to year. As you can
expect this isn’t what we really want. On this
sheet continue to play around with these
settings to see if they help you understanding.
Line
1. When you’re done experimenting go
back to the sheet “Big
Budget vs Independent
Movies by Year”
2. It would be helpful if the axis
had labels corresponding to the
names of months. Right click the
date pill and select “Discrete”
2
Line
3. As an option if you want you
can use this dropdown to tell
Tableau how to fit this
visualization onto the screen
3
Section 2:
Great job! It looks like over the year 2019
Chicago brought in some good business
with hundreds of permits being given to
Big Budget movies. This concludes Section
2 of the course. Go ahead and save your
work. I will leave a copy of this workbook
here:
https://
public.tableau.com/profile/shashank.kalani
thi!/vizhome/ShashankTableauCourseSecti
on2/B igBudgetvsIndependentMoviesbyYear
Section 2:
• Calculated Fields
• Date Calculations
• If-Then statements
• Filtering Data
• Line Graphs
• Continuous vs. Discrete Dates
Section 3:
5
6
4
Connect to
7. Drag in the sheet labeled “Non-
English Players”
8. Click “OK”
3
Potential
If you find that the window you’re
supposed to drag “Non-English
Players” to disappears when you
try and drag the sheet in, it’s just
hiding behind the main Tableau
window. Move that window and
you’ll be able to drag it in as
expected.
2
Potential 3
Union
What we just did was union our data. You will start to notice that
Tableau borrows the terminology for a lot of its concepts from SQL
and Excel. This makes the tool really easy to pick up if you’re
familiar with these concepts and the skills you learn here easy to
transition to SQL and Excel if you’re just learning them now.
Put simply, a union (SQL Union) is the process of stacking two
tables on top of one another. You will usually do this when your
data is split up into multiple sections like an excel spreadsheet of
a year’s sales split by month.
As you can see from below, Tableau stacks the columns based on
column name, and will put in null values wherever a value for a
given column doesn’t exist
Union
al Data
although an ID number is numerical,
you can’t perform math on it so you
should classify it as a string
5
Aliase
Sometimes data we import is
encoded in a way that won’t make
sense for a visualization. When this
happens a solution is to change the
aliases of the data. Changing aliases
does not change the underlying
data, just how it appears in your
visualizations.
1. Right-click “Skill Moves” and
click “Aliases…”
Aliase
2. Use the “Value (Alias)” column to fill in the
aliases
as shown below:
1. Weak Skill
2. Below Average
3. Average Skill
4. Above Average
5. High Skill
3. Click “OK”
Now whenever the value 1 appears in relation
to [Skill Moves] Tableau will instead show “Weak
Skill”, i.e. an Alias
2 3
Aggregation
3. Drag:
•Height (Inches) into Columns
•Weight into Rows
•Right-click the measures
4. Change the aggregation of
both your Height and Weight
from Sum to Average
1
Aggregatio
Aggregations are one of the most
important and most complicated
data- related concepts for newbies
to understand. The basic idea is
that you might want to collapse or
expand a table in different ways
and aggregations tell Tableau what
to do with your Measures when you
do that. Let’s look at a simple
example.
2020 2 2
2021 2 1
Aggregation: ATTR
ATTR is a very special aggregation function. It
will return a value if there is only one for every
row in a group, otherwise it returns an
asterisk“*”. This is a really useful function to
use in calculated fields where you cannot
normally combine aggregated and non-
aggregated values.
Yea ATTR(MONTH)
r
2020 *
2021 January
Scatterplot
Let’s see if there is any relation
between
player position and compensation.
1. Drag “Position” into “Colors” on the
Marks card
• If you get a warning, select “Add all
Members”
Scatterplot
We have our scatterplot, but it’s almost
impossible to tell what’s what. Let’s see
if
we can change the axes to help with
that
Editing Axis
1. Right click the y-axis and click on the
“Edit Axis” option
2. This dialog box allows you to
manipulate the axes as you please,
uncheck the “Include Zero” box
3. You can also change the title here
if you want to
4. Go ahead and follow the same
process for the x-axis to remove
the 0 value
3
Axes Don’t Have to Include
There’s a common misconception that the y-
axis needs to start at 0. While messing with
the y-axis is one of the easiest ways to Lie with
Statistics, oftentimes you need to adjust the
axes depending on the audience. The Fox
News example below is particularly egregious
example of fiddling with the y- axis to prove
a point (They even moved the axis to
the left hand side so you’re likely to miss it
entirely).
The axis starts at 34% to make the ~5%
marginal increase seem much larger than it
actually is.
Axes Don’t Have to Include
You should always make an effort to represent
your data accurately, but it can be just as
important to consider the context surrounding
your visualization.
An example of this would be a visualization I
needed to put together to illustrate how profits
changed during the 2019-2020 Coronavirus
Pandemic for our execs.
While profits did in fact drop, you could only
visualize the difference and analyze it if you
zoomed in on the y- axis, if I pegged the axis to
0 then you wouldn’t think the pandemic did
anything to our profit.
The two graphs below show the exact same
data, but to the informed audience (such as
your company execs) the one on the right-hand
side gives you more information. As a
visualization professional, you need to digest
data, and communicate information.
Ja
nu
ar
y
Fe
br
ua
ry
Ma
rc
h
Ap
ril
Moving the Graph
While conducting an exploratory analysis, you
might find that you need to manually
manipulate the graph. Here are a few
commands that allow you to do so.
• You can use Ctrl + Scroll to zoom in and out of a
viz
• You can use Shift + Drag to pan around a viz
• Double-click your current sheet and rename it to
“Average Measurements by Position”
• You can also right-click any axis and click “Edit
Axis”
Grouping Your
Sometimes the level of detail you
have is too granular, in these
situations grouping the data might
help you out. In the example below, all
these positions listed individually don’t
really help us out that much.
Grouping data adds a column that
categorizes
the data the way you want
1. Right-click [Position] and click “Group…”
Grouping Your
2. Ctrl + Select the positions that go
together and click on “Group”
•Make sure you check the “Include Other”
checkbox
3. Click “OK”
4. Click on “Data Source”
3
4
Joins and Importing
Unusual Data
They say that Data Scientists spend about 70-
80% of their time just gathering and cleaning
data. Oftentimes this happens just because
data comes in inconvenient formats. Today,
we’ll be working with one such format in the
form of a table inside a PDF. Tableau has the
ability to help us parse through this kind of data
quite easily.
We don’t have access to some very basic player
compensation info. Unfortunately this data is in a
PDF format which would normally make it very
hard to work with, but not with the power of
Tableau!
Joins and Importing
Unusual Data
1. Using the “Add” button above our first data
source, import the PDF “Data to Join”
2. On the dialog box that pops up, click “All” to
import
all of the pages in the PDF
Joins and Importing
Unusual Data
3. Click on the first Page, hold onto
the Shift key on your keyboard
and scroll down to select the last
Page, drag all of these Pages to
the right of “English Players+”
Joins and Importing
Unusual Data
4. Click on the Venn Diagram
between the two data
connections
5. Select the Left Join option
6. Go back to the sheet we were
working on
6
Joins
What we just did was join our data. Joins are
another concept that Tableau borrows from
SQL. In the way that Unions combine two tables
vertically, Joins combine two tables horizontally.
For a join, like a Union you have to have at least
two tables, what we call our Left Table and our
Right Table. You (mostly) have to have at least
one matching column between the two tables,
and you will match rows from these columns.
The most common way to visualize the types of
Joins are through Venn Diagrams.
ID Left Table
Color Model ID Right
Car Table
Inner Join
Left Join
Right Join
Full Join
Joins
You’ll mostly be sticking to Left and Inner Joins.
It’s worth your time to learn more about Joins
because they are some of the most powerful
tools you can use to manipulate data. I use
Joins basically every single day in my work. For
this course we’re going to stick with relatively
simple Joins.
Joins: Inner Join
We’re now going to do something called an
Inner Join on the [ID] column which will only
output exact matches from the [ID] column in
our output.
8
Changing the Default
Number Format
9. Click on “Currency (Custom)”
10. Enter a € symbol under “Prefix”
• You can get this symbol by opening
a word document and clicking Ctrl +
Alt + E and copying the Euro symbol
into Tableau
11. Repeat this for the “Wage” and
“Value”
columns of data
9
Exercis
Now let’s see if we can visualize
how much each [Club] spends on
their players by [Position (group)].
Try and recreate this visualization
Exercis
Here’s how you would do it
Exercis
You’ll notice that the aggregation
of our data is CNTD, or Count
Distinct. This means that the
length of the bars corresponds to
the number of distinct values in
that group. The default
aggregation of this data is CNTD
instead of SUM because we
converted it from a string. Right-
click and change it to SUM
Exercis
Then you just need to sort the data
in descending order using the
button at the top
Exercis
Let’s rename the visualization to:
Player
Value by Club
Marks
The Marks Card is one of the key ways you’ll
change your visualizations in Tableau. Put
generally, the Columns and Rows shelf is where
you place your data to create the basic form of
your visual, and you change the details using
the Marks Card.
Marks are how your rows are represented in
Tableau, they are the individual “points” on your
visualization. If you want to get a better idea of
what a Mark is, try changing the Mark type
using the dropdown highlighted in blue.
Marks
If you want to change any individual
aspect of your visualization, the
Marks Card should be your first
stop. From here you can change
colors, how much detail is in your
visualization, whether labels show
up, and what the tooltip that comes
up when you hover over a mark
says.
End of Graph
Bars are great, but this graph would
benefit from having labels on it at least
for the full value of the bar
Using what we learned about the Marks
card
you might want to add labels to the chart
Dragging the [Value] measure to the
Label Marks Card (and changing the
aggregation) gives us something that
looks like the below.
This isn’t particularly useful, let’s see if we
can get the labels at the end of the graph.
End of Graph
1. Click on the “Analytics” tab on the
upper
left of the interface
2. Double-click on “Reference Line”
3. Select the “Per Cell” option
4. Aggregate by Sum
5. Fill the label with Values
6. Hit OK
7. Remove the [Value] measure
from the Marks Card
1
3
4
2
5
6
End of Graph
Here’s what we end up with.
When we put the [Value] column in the
“Label” part of the Marks Card, Tableau
tried to label at the lowest level of
aggregation on the chart which was the
player [Position (group)] level. What we
did to fix this was to add a line at the end
of the graph that was aggregated at the
[Club] level and then labeled that line.
Adding an Average
This visualization is great and
clearly communicates where teams
are relative to each other, but
what about relative to the
average?
1. Drag “Average Line” to the
“Table” section of the menu
(Under the “Analytics” menu
1
Adding an Average
2. Right-click the average line and
select
“Edit…”
Adding an Average
3. Select “Custom”
4. Type in: <Computation>: <Value>
5. Hit OK
6. Click on the Connect Menu icon
3 4
5
Import Spatial
We happen to have some data on
stadium locations. Let’s see if we
can bring this in and correlate it to
the numerical data
we’ve compiled.
1. Select “Spatial File”
2. Import the “Data to Blend” dataset
Import Spatial
3. Right-click on Name
4. Click Rename and name it “Club
• Rename Description: Stadium
5. Create a new Sheet and navigate to
it
Spatial
The file we just imported is
something called a .KML file.
This is a spatial file type that
stores information about
geography.
Spatial files can store three
basic geospatial data types:
•Points
•Lines
•Polygons
You can use these to create
visualizations that would
otherwise be very difficult to
create.
Data
Now we need to relate our spatial
dataset
to our player data.
1. Click on Data on the menu tab and
select “Edit Relationships”
Data
You’ll see that from your primary
source to your secondary source,
they are automatically linked on the
[Club] column.
This is the menu where you would
define what columns are linked
together if you needed to.
• Press “OK” or exit out of this menu
Data
Like Joins and Unions, the Data Blend is a way to
combine data from multiple sources.
A Data Blend emulates a traditional Left Join, the
difference is that a Join will combine the data
row-by- row then aggregate the resultant table,
whereas in a Data Blend, the tables will be
aggregated as necessary for the view, then
combined.
One important difference with a Data Blend vs a
Left Join is that a Data Blend will not duplicate
results, instead, if there is a many-to-one
relationship between the two data sources that
you’re combining, then Tableau will output an
asterisk “*”.
A Data Blend will never output more rows than exist
in
the “Left” or Primary table.
The next page will compare a Left Join to a
Data Blend where we have multiple matches to
illustrate the differences between the two.
Data
You’ll notice in this Left Join the data duplicates as
we’d
expect it to. (Highlighted in Orange)
3
2
1
Geospatial
4. Select the English Players+ data
connection
• Look at the blue box and you’ll see that
Tableau automatically linked this data
source using the [Club] field
5. Drag [Value] to “Size” under the Marks
Card
• Make sure the aggregation is a SUM
6. Click on Size and edit the slider until
you
find a size you’re comfortable with
4
5
Geospatial
You can use the Color section of
the Marks card to edit the opacity,
colors, and the border of the Marks
on your visualization
Generally I like to use 75% opacity
and make my borders black. I find
this tends to make your
visualizations pop a little more but
experiment and see what works
best for you.
Parameters and
Parameter
1. Drag [Club] under the English
Players++ dataset to the
Filters card
2. Select the “Top” tab
3. Select the “By field” radio button
4. On the second dropdown, select
“Create a New Parameter”
2
3 4
1
Parameters and
Parameter
5. Change the
Parameter’s Value Display As
name to “Top
Teams by 1 Top Team
Player Value”
6. Change “Allowable 5 Top Five Teams
values:” 10 Top Ten Teams
7. Copy the
values in this 20 Top Twenty Teams
table into the
“List of values”
section
8. Click “OK”
7
Parameters and
Parameter
9. Select [Value] and “Sum” on the
next
two drop downs
10. Click “OK”
9
Parameters and
Parameter
11. Right click your new
Parameter (below Measures), and
make sure that “Show Parameter
Control” is selected
12. Name your new sheet “Top
Teams by Player Value Map”
10
Parameters
Parameters are constants that
can be used all over Tableau.
You can use it to dynamically
change a calculation, to filter
data by measures, or allow your
audience to define certain criteria
when exploring your data.
In the example we just finished,
we used a Parameter to allow
the user to change which teams
they saw on the visualization.
When setting our filter normally,
we’d have to select a constant to
filter on, but because we
connected it to a Parameter, we
change that constant as we
please.
Choropleth
1. Create a new sheet
2. Right-click [Nationality] and
select Geographic Role and
Select
“Country/Region”
1
Choropleth
3. Double-click on [Nationality]
4. Under Show Me click on
the Choropleth Map
5. Drag [Value] to Colors
6. Change the aggregation to SUM for
colors if you need to
3
Correcting
1. Click on the “6 Unknown” icon on
the
bottom right
2. Click on “Edit Locations”
1
Correcting
Tableau is very strict in how it
recognizes geographic locations so
this is something you’ll find yourself
doing often
3. Correct Macedonia and
Korea Republic
4. Click “OK”
4
Correcting
5. Name your sheet “Map of
Countries by Player Value”
6. Create a new sheet
5 & 6
Exercise
Now let’s see if we can create a
visualization of the top five
players by [Position (group)]
Recreate the visualization below.
Using Index to Rank
1. Create a calculated field called
“Index”
2. The formula is “Index()”
Using Index to Rank
3. Right-click [Index] to convert the
measure into a discrete value
Using Index to Rank
4. Place [Index] in between [Position]
and [Name] in your rows shelf
Using Index to Rank
5. Right-click [Index] in the Rows shelf and select
“Edit Table Calculation…”
6. Under “Compute Using” click on
“Specific Dimensions”
7. Choose “Restarting every” [Position (group)]
8. Make sure all of the columns are selected
9. Under “Sort Order”, select “Custom” and sort in
descending order by [Value] and aggregate by
SUM
8
7
9
Using Index to Rank
10. Drag [Index] from the Rows Shelf
to
the Filters Card
11. Select numbers 1 through 5 and
hit
“OK”
10
11
Using Index to Rank
What we just did was sort all of
the data in our visualization in
descending order by [Position
(group)] and then attach an index
which counted each position. We
then filtered on this index which
allowed us to only include the top 5
values per [Position (group)]
10
11
Dashboard
1. Go to the “Map of Countries by
Player
Value”, right click the sheet
tab and select “Duplicate as
Crosstab”
Dashboard
2. Bring the [Club] and [Name] of
the players into the “Rows”
shelf in the order shown
below
Dashboard
3. Create a new Dashboard and set the
size to “Automatic”
Dashboard
4. Recreate the Dashboard below by
dragging in the map and the crosstab
Dashboard
5. Click “Actions” option under the
“Dashboard” menu item
Dashboard
6. Under “Add Actions” select “Filter”
Dashboard
7. Under the “Source Sheets” select
the
“Map of Countries by Player Value”
8. Under the “Target Sheets” select
the “Map of Countries by Player
Value (2)”
9. Make sure the action runs on
“Select”
10. Under “Target Filters” select
“Selected Fields” and use the
[Nationality] as the column to
filter on
7 9
10
10
Dashboard
You’ll notice that as we click around
to different countries on the map
the values in the table change
Dashboard
In Tableau, Dashboards are a
collection of sheets that when
viewed together give the user
a fuller picture of the data
being viewed. As such, it
makes sense that actions
taken on one sheet in a
dashboard would affect other
parts of the dashboard. This is
where dashboard actions come
in handy.
Dashboard actions allow us to
define how an action such as
selecting or highlighting a
mark will affect the
visualization such as through
filtering.
Dashboard
You’ll notice that as we
click around to different
countries on the map the
values in the table
change
Differentiating
You might notice that the colors on
this map are not well
differentiated, let’s see if we can
change that
1. Click on the drop-down for the color
legend and select “Edit Colors…”
Differentiating
The problem here is that the range of SUM[Value]
is too great here. We’ll address that in a moment,
for now let’s change the following settings to see
if any change takes place
2. Click on the “Advanced” option
3. Set the Start and End values to be
10,000,000 and 500,000,000 respectively.
The Center value will automatically reset
4. Select the “Stepped Color” option and click “Apply”
You might notice that there isn’t a large difference
in the graph. We can more clearly see the nations
that have a lot of high payed players (Brazil,
France, Spain), but most of the map is still
highlighted ineffectively.
4
2
3
Creating a
5. Go back to “Map of Countries by
Player Value”
6. Right-click [Nationality], and select
“Create”>”Set…”
5
Creating a
7. Name the Set “Top 10 Countries by
Salary Value” then the “Top” tab
8. Select the “By field” option.
“Top” “10” [Value] “Sum” and
hit “OK”
8
Creating a
9. Drag your newly created Set to the
filter card
10. Go back to the “Edit Colors” menu
10
9
Creating a
9. Drag your newly created Set to the
filter card
Set
In Tableau there are generally
multiple ways to accomplish a
desired result. A set provides
a very easy way to add a
custom field that defines a
subset of data as either “In” or
“Out” of the set. In this case
we use it to limit our map to
only countries with large
combined player salaries. As
you might have figured, we
could have also accomplished
this result using an Index, a set
is just another way to do so.
Section 3:
Congratulations! This is the end of
Section 3. You now have several
visualizations you can use to
analyze player salaries. The point of
this section was not so much to
answer a question as it was to
show you how you could do the
same thing multiple ways in
Tableau.
Section 3:
In Section 3 you learned how to:
•Connect to PDF data
•Union Data
•Join Data
•Blend Data
•Alias Data
•Split Columns
•Define Data Types
•Aggregate data
• SUM
• AVG
• COUNT and COUNTD
•Edit Axes
•Make Scatterplots
•Group Data
•Use the Marks Card
•Add Labels to the end of a Bar Graph
•Create Parameters and Parameter Actions
•Correct Locations
•Create Dashboard Actions
•Create Sets
End of
Section 1 Stories
Section 2:
• What is Tableau • Seven Data
Stories • Calculated Fields
• The Importance of • Date Calculations
Data Visualization • If-Then statements
• What Makes Tableau • Filtering Data
so
Great • Line Graphs
• Ease of Visual • Continuous vs.
Exploration and Discrete Dates
VizQL
• Hyper Database
System
• How to Install
Tableau
• Connecting to Static
Data
• Anatomy of the
Tableau Data Page
• Anatomy of the
Tableau
Workspace
• Cleaning Data
• Dimensions
vs. Measures
• Tables
• Bar Charts
• Geospatial Data
• Basic Troubleshooting
• Resizing Dashboards
• Saving our Work
• Sheets, Dashboards,
End of
Section 3:
• Connect to PDF data
• Union Data
• Join Data
• Blend Data
• Alias Data
• Split Columns
• Define Data Types
• Aggregate data
• SUM
• AVG
• COUNT and COUNTD
• Edit Axes
• Make Scatterplots
• Group Data
• Use the Marks Card
• Add Labels to the end of
a Bar Graph
• Create Parameters
and Parameter
Actions
• Correct Locations
• Create
Dashboard
Actions
• Create Sets
Resource
•kb.tableau.com
• Tableau’s online documentation. Information
on all the features and examples are
available here. Most questions about
Tableau can be answered here or on
Tableau’s forums
•Storytelling With Data
• A great book to get you started with
learning how to present data. It is platform
agnostic but everything here can be easily
applied to Tableau
•Makeover Monday
• This is a global movement where every
Monday a new dataset along with
visualization is released and participants are
asked to redesign the visualization usually
using Tableau
•The Big Book of Dashboards
• Probably the best resource I know to help
build Tableau dashboards specifically. Has
lots of tips on specific improvements you
can make to your dashboards.
Data
Over time you’ll find that a large proportion of your
time will be dedicated to gathering a cleaning data.
Developing a process to do this effectively and
efficiently will save you a lot of time in the future.
This is a checklist of things to look out for when
forming your data sources.
This list is my no means comprehensive but hopefully will
help save you some time and improve the quality of
your deliverables