The Ultimate Tableau Workbook v0.4 - Compressed

Download as docx, pdf, or txt
Download as docx, pdf, or txt
You are on page 1of 214

The

Ultimate
Tableau
Workbook
Get Up Get Started In 4 Hours
By: Shashank Kalanithi
Framing the Course
Thank you for downloading my
Tableau Workbook. My hope is
that by the time you’re done
reading this you’ll know enough
to start figuring out Tableau on
your own. This course requires
no prior experience and is
designed to get you off the
ground.
This course comes with an
accompanying video located
here
https://www.youtube.com/watch?
v=Gl 2lg-TtRJo&t.
I believe that there is no better
way to truly learn a concept than
to apply it as soon as you learn
it. To help you with this, we’ll be
applying the concepts we learn
step by step as we go along.
Framing the Course
This course won’t make you
an expert at Tableau but
should get you off the
ground.
Like a plane, I believe that one
of the hardest parts of learning
is just starting, and I hope to
help you take off and become
self-sufficient.
Ke
You’ll notice a colored bar at
the top of each page. The color
of this bar represents the
purpose of the page as shown
below.

Orange is for information about the


course

Blue is for instructions in the


guides

Yellow is for warnings and


extra information you
should consider

Red is for nerding out, the content here is purely


for your enjoyment. You will not miss anything
Ke mandatory by skipping these sections.
About the
Although there are several
people who have helped me
on my journey, I would like to
dedicate this book to my Mom
who taught me that no matter
your experience it’s never to
late to start a new journey in
life.

As for myself, I’m a graduate


from Emory University
currently working as a Senior
Data Analyst in Dallas, TX. I
hope to become a data
scientist in the near future.
What is Tableau
Tableau is a software company that has
created a host of tools to explore, visualize,
and present data. Their most popular product
is Tableau Desktop which is a tool that allows
you to quickly connect, visualize, and
distribute any data you might have. At their
core, the tools are supposed to help you
derive insights from your data.
Tableau’s current product offerings include:
• Tableau Desktop: Tableau’s core product and
what most people are referring to when they
say “Tableau”. This tool allows you to connect
to to most types of data and create
professional-grade visualizations
• Tableau Desktop Public Edition: A free version of the
desktop software that allows you most of the
functionality of the full version of Tableau Desktop,
but all visualizations have to be published on
Tableau’s online public library
• Tableau Prep: A data cleansing software, Prep
was designed to allow users to easily and
visually inspect datasets and clean them
systematically before they are visualized in
Tableau Desktop
• Tableau Server/Online: Allows companies to
distribute visualizations made on Tableau
Desktop throughout the organization. Also
has limited visualization creation abilities
inbuilt
• Tableau Data Management
• Tableau Server Management
The Importance of Data
Visualization
The importance of data isn’t lost on
anyone. Read the news, talk to your
superior, or just look at the most valuable
companies in the world today. It is not an
exaggeration to say that Data is the Oil of
the 21st Century. Many an individual take
this as gospel and leave it at that without
more deeply considering the implications
of this metaphor.
Like oil, data is not particularly valuable
until it has been enriched. John D.
Rockefeller did not become the richest man
alive simply by mining oil, in fact that
formed a small portion of his business
portfolio. He realized that mining oil was
hard work and heavily based on the luck of
finding a big enough gusher before your
capital ran out. The real money to be made
was in refining oil, he could get oil from any
number of suppliers, let them assume the
risk of not finding anything, refine it, and
sell it at much higher prices than he bought
it for. Data is much the same way. Data
without insights is as useless to you and me
as crude freshly pumped from the ground.
https://commons.wikimedia.org/wiki/File:Anscombe%27s_quartet_3.svg#/media/File:Anscombe's_quartet_3.svg
The Importance of Data
Visualization
This is where data visualization
comes in. Humans are hard-
wired to look for patterns in
nature, it’s what’s helped us
simplify the infinitely complex
world around us into
something that we can
actually manage to understand
and control. Data visualization
is simply the process of
abstracting away the
complexity of raw data and
organizing it into a form which
we can more easily consume.
Shashank Kalanithi
https://commons.wikimedia.org/wiki/File:Anscombe%27s_quartet_3.svg#/media/File:Anscombe's_quartet_3.svg
The Importance of Data
Visualization
To more clearly illustrate my point, look at the
illustration below. Also known as Anscombe’s
quartet, all of the below graphs have the same
values for measures of central tendency (mean,
median, mode). If one were to simply calculate
these metrics and not visualize the data, one
could easily come to the wrong conclusion that
the distributions for the data would all be the
same.

Shashank Kalanithi
https://commons.wikimedia.org/wiki/File:Anscombe%27s_quartet_3.svg#/media/File:Anscombe's_quartet_3.svg
What Makes Tableau so
Great
Tableau is operating in a crowded space and
needs to differentiate itself as a best-in-class
tool in order to stand out from the crowd.
Gartner has placed Tableau in the “Leader”
section of their Magic Quadrant for the last
eight years consecutively with consistent marks
given to its “Ease of visual exploration and data
manipulation” and “Customer enthusiasm”.
• Customer Enthusiasm: You only need to attend one Tableau
user group to see how well-liked the tool is by its users. This
enthusiasm ensures that answers to questions on the forum
are forthcoming and that support for problems is easy to
obtain. Tableau Conference 2019 had over 20,000 attendees
from all over the world who came to learn about the newest
release, connect with other Tableau users, and apply new
and innovative tricks and tips to visualize and manage
their data.

Shashank Kalanithi
https://www.gartner.com/doc/reprints?id=1-1YBTIWVR&ct=200211&st=sb
Career Opportunities
with Tableau
Tableau is a tool that is widely used
across the world and job opportunities
that directly work with Tableau are
plentiful and well-paying.
Who I’ve Seen Use
Tableau
Account Manager
• Explore areas of biggest concern when
servicing an account
Supply Chain Director
• Automate the weekly report out to a C-Level
executive
Warranty Manager
• Monitor a specialized warranty metric to ensure
product quality was maintained
Data Analyst
• Drive insights, clean data, explore data,
communicate findings

Shashank Kalanithi
Ease of visual exploration
and VizQL
Tableau was formed in the early 2000’s as
a spinoff of a project given to Stanford by
the Department of Defense. The project
yielded a new language called VizQL™.
VizQL™ forms the core of Tableau and
was revolutionary in how it was able to
translate dragging and dropping
elements on the screen into fully
interactive visualizations that allowed
the user to more easily translate thought
into action. As you create a Tableau
visualization,
you’ll realize that the software is designed
to easily allow you to dive deeper and
deeper into the data and uncover patterns
that you might have not seen before.
Additionally VizQL™ unifies the process of
creating visualizations so that you can
quickly change how you want to visualize
the data without fundamentally changing
anything about the structure of your data.
Shashank Kalanithi
The Hyper Database
In version 10.5 Tableau released the Hyper
database system. This was the product of an
acquisition that Tableau had made in 2016
from the Technical University of Munich.
Traditionally database engines (the underlying
software that reads, deletes, creates, and
updates the database) optimized for retrieving
the data from the database because that was
by far the slowest part of any data process.
As modern computer hardware had gotten
exponentially cheaper (CPUs and memory
specifically), the importance of both retrieving
and processing the data has become more
important. Hyper takes advantage of the
dozens of cores on modern computers and
more efficient query optimization to read,
write, update, and process data faster than
most databases out there.
As a tool, Tableau not only allows for the
presentation of data, but for the analysis of
data. For anyone who has worked in analytics,
pulling and manipulating considerable
amounts of data can not only slow down the
analytics process but also mess up your train
of thought. Having a fast database makes
analysis not only fast, but enjoyable
The Hyper Database
Shashank Kalanithi
The Hyper Database
Hyper has four main design principles:
•One System
• Is a general purpose database system that
combines transaction processing, data
ingestion, and data analytics
•One State
• Transaction processing, data ingestion, and
analytics are all on the same state so real-
time analytics can be performed
•No Tradeoffs
• Has SQL language support
• Has transactional guarantees
•No Delays
• Scales with available hardware resources

Shashank Kalanithi
The Hyper Database
Hyper compiles SQL code to Machine code and
executes the machine code
• Traditional databases use interpreters
instead of compilers
• You would normally write in Machine Code using C,
but compiling C code to Machine Code can be slow
• Tableau solves this by generating LLVM (low
level virtual machine) code instead which
compiles to Machine Code more quickly
• Tableau writes its own VM for short running queries
• Compilation is attractive now because historically,
memory was so expensive that just retrieving the
data was the slowest step, therefore efficient
processing of the data once it was obtained was
pointless
• Query optimization is important but is inherently
hard to figure out
• It's hard to compute
• Very few researchers globally specialize in
query optimization because it's such a hard
field to do research in
• In buying Hyper, Tableau had access to Dr. Thomas
Neumann who was one of the few people in the
world doing research in query optimization
• Modern CPUs have many cores
• Traditional parallelization only scales to a few cores
• They're not good at load balancing
• Hyper uses morsel-driven parallelization to
utilize a large number of cores
• Hyper divides the processing task into very small
components (morsels) and each core just takes on
a morsel whenever it can so you're always using as
many cores as possible

Shashank Kalanithi
How to Install Tableau

There are two versions of Tableau


Desktop that you can download in order
to complete the exercises in this book:
• Tableau Desktop: This is the full version of
Tableau’s Desktop app and can be found
here:
https://www.tableau.com/products/desktop/d
ownl oadYou can use it as part of a two
week free trial, or activate it by connecting
to your organization’s Tableau Server or
inputting your license key (usually provided
by your IT department)
• Tableau Desktop Public: This is the
completely free version of Tableau desktop.
It is limited in the variety of data sources
that it can access and that all visualizations
created on it can only be saved
on Tableau’s public gallery. It can be
downloaded
here: https://public.tableau.com/en-us/s/download
Shashank Kalanithi
Objectives: Section 1
For the first two parts of this
course we will be analyzing
data from the Chicago
Department of Transportation
to see where most filming is
done in the city.

The dataset is available here:


https://bit.ly/shashanktableaucourse

Shashank Kalanithi
Connect to
When you open Tableau, you will be greeted with a screen similar
to the one shown below
1. This is a sample of the data sources that you’ll be able to
connect to, look for the type of data source that you want
and click on it to connect to it
• You can also copy data from another application and paste it here to
quickly create a data source on the fly. (Just hit CTRL + V while on this
screen or on a blank sheet)
2. These are your previous workbooks. As you create more,
they will be shown here for easy access
3. Content to get you started with Tableau. There is a
limitless supply of tutorials for free at kb.tableau.com
4. This is where the “Viz of the Day” normally appears. These are
curated
visualizations that are made in Tableau and released on Tableau
Public
You can return to this screen or connect to another data source
at any time by clicking on the Tableau logo on the upper left of
the screen

2 3

4
Connect to
1. Click on “Microsoft Excel” under the
“Connect” menu
2. Select the workbook you want
to connect to: Chicago Filming
Dataset

1
2
Connect to
A Tableau data source can only be constructed
a table at a time. As such, when you import an
Excel document with multiple sheets you’ll be
asked to select one at a time to import. For now
we’ll just import a single sheet.
3. Drag the sheet “Chicago Filming Permits onto the
area marked “Drag sheets here”
• When we start creating more complicated data sources
the area in orange is where you’ll import all of your data
and form your joins and unions.

3
Tableau Data Source
Page
When you connect to most data sources in
Tableau you’ll be brought to this screen, this
is the Data Source Page.
Creating a data source is the first step of any
visualization and importing data is the first
step of creating a data source. If you’re
lucky, most data you import will be in a
format you can immediately use to start
working, but if that’s not the case, this is the
page you will use to get your data to a
workable format.
Tableau Data Source
Page
1. This is a list of your connections. A
connection would be something like a
database or an Excel workbook. This
is the first step in importing data to
create your data source.
2. This is a list of sheets associated with
your
current connection
4
3 6
1

5
2

7
Tableau Data Source
Page
3. This refresh button allows you to
refresh a data source that is live or
that has been modified
4. This dropdown allows you to toggle
through your connections
5. This is your canvas which allows
you to create your data source by
bringing in different sheets and
combining them
4
3 6
1

5
2

7
Tableau Data Source
Page
6. This is where you set your “Data
Source Filters”. These are filters
that limit your data before you
perform any other calculations on
it and can help improve
performance
7. This is the Data Grid and by default
displays the first 1,000 rows of your
data. You can change the textbox in
blue to display more data
4
3 6
1

5
2

7
Connect to
Now that we’ve connected to our first data source,
let’s
go to our first sheet and make a visualization
4. Select “Sheet 1” near the bottom of the screen to
go
to your workspace

4
Connect to
Your screen should look something like this.
Tableau
The Tableau workspace can be a little
intimidating at first glance but is
actually quite easy to understand. Here
I’ll give you a quick overview of what is
what on the Tableau Workspace.
Tableau
1. Here is your list of columns split into two
categories “Dimensions” and “Measures”
2. This is where your sheets and
dashboards will show up
3. Here is a list of all of the data sources
you’re connected to

4 8
3 6

1 7
2
Tableau
4. Here are some of the basic controls for
your Tableau workspace
•Open the Connect Menu to connect to other
data
sources
•Undo and Redo. Tableau has unlimited
undos and redos for a given work session
(every time you open a workbook)
•Add a data source
•Create a new sheet/dashboard
•Duplicate your current sheet or dashboard
4 8
3 6

1 7
2
Manipulating our Data
Let’s get back to our data.
Although Tableau intelligently handles data
imports, sometimes data is improperly
recognized. When this
happens it’s important for you to understand
your data well enough to recognize the error so
that it can be corrected.
1. Drag the Measure [Wards] from the
Measures section to the Dimensions
section (highlighted in blue)

1
Dimensions vs. Measures
-4 Kinds of Data-
What did we just do? We changed one of
the fields (columns) in our dataset from
a Measure to a Dimension.
Data in Tableau is classified as either a:
• Dimension
• Dimensions are any qualitative data
• Examples of qualitative data include names, colors, and cities
• This is mostly data that cannot be or is not numerical
• Measure
• Quantitative data, anything that can be numerically measured
• Population or profit

Additionally, data can be either:


• Discrete
• Represented by a blue pill or font in Tableau
• Discrete data is data that cannot be infinitely divided like names or
zip
codes, there is no 75251.5 zip code
• Continuous
• Represented by a green pill or font in Tableau
• Continuous data is data that can be (theoretically) infinitely divided
such as profits or latitudes/longitudes
Dimensions vs. Measures
-4 Kinds of Data-
Given that we can divide data into
these four types, we therefore have:

Discrete Dimensions
• Discrete
• Continuous
Continuous Measures
Measures s
• Discrete
Continuous Dimensions

Discrete Measures

The most common forms of data are


Discrete Dimensions and Continuous
Measures. If you think about it this makes
sense, most things that can be measured
like profits and latitudes can be divided
infinitely while most dimensions, which
can be thought of as ways to categorize
data are not infinitely divisible.
Making Your First
Visualization: Table
The most basic visualization in Tableau is the
Table. If you’ve ever used a pivot table in Excel
then you’ll notice that the interface in Tableau
actually will imitate certain behaviors of a pivot
table.
1. Drag the [Application Start Date] Dimension to ‘Columns’
2. Drag the [Wards] dimension to ‘Rows’
3. Drag the [Number of Records] Measure to ‘Text’
under the ‘Marks Card’
Congratulations you’ve made your first
visualization in Tableau. Although tables are
necessary and very useful, they are generally not
the best way to visualize your data. Next I’ll show
you how easy Tableau makes it to iterate though
multiple visualizations.

3
Making Your First
Visualization: Bar Chart
Although basic, Bar Charts are some of the most
effective visualizations you’ll have in your toolbox.
Don’t skip on a bar chart just because it might
not be as visually stunning as another
visualization you have
1. Double-click “Sheet 1” and rename to “Applications
by
Ward Crosstab”, then right-click and click “Duplicate”
2. Rename your new sheet to “Applications by
Ward Bar Graph”
3. Click the “Show Me” button
4. Click on the bar graph option
5. Sort the data in descending order by the year 2019

3
5
4
1 2
Making Your First
Visualization: Bar Chart
Congratulations, in a few clicks you’ve created a
completely different type of chart. The ability to
easily duplicate sheets and see what charts
can be made will allow you to try all the
different charts in Tableau in rapid succession
to find what works best for you.

3
5
4

1 2
Making Your First
Visualization: Geospatial
1. Create a new sheet by clicking on the
new sheet button
2. Drag [Latitude] and [Longitude] into rows and
columns respectively
• Don’t use [Latitude (generated)] or [Longitude
(generated)]
3. Drag [Application Number] into the Details box on
the Marks card
4. Select “Add All Members”

2 1
Making Your First
Visualization: Geospatial
Once you have something like this you’re ready
to move onto the next step, if your graph looks
different the next few pages can help you
troubleshoot
Making Your First
Visualization: Geospatial
If you see something like the graph below,
don’t worry, that most likely just means that
you reversed the [Latitude] and [Longitude].
There are two ways to fix this:
1. Click on the “Swap Rows and Columns” button on
the toolbar.
This will swap the [Latitude] and [Longitude] in
the Rows and Columns shelves which will allow
Tableau to recognize that you’re trying to make
a map not a scatterplot

1
Making Your First
Visualization: Geospatial
This is the second way to fix the graph.
1. Click on the “Symbol Maps” visualization
type under the “Show Me” menu

1
Making Your First
Visualization: Geospatial
2. Drag the [Application Number] Measure
from the Rows Shelf where Tableau put
back to the Details Card where we want it
This will explicitly tell Tableau what
visualization you’re looking for and it will
reorganize your data to produce that
visualization

2
Making Your First
Visualization: Geospatial
Make sure your graph now looks like this
Making Your First
Visualization: Geospatial
5. Drag [Wards] into colors
• Again click on “Add All Members” to bring in all
50
wards
6. Rename this sheet to
“Application by Ward Map”

2
Making a
1. Click on the “New Dashboard” button
2. Navigate to the new
Dashboard you created

1
Making a
You might notice that the space dedicated
to your dashboard (outlined in blue) might
be small or awkwardly sized (depending on
how Tableau was used/configured on your
machine. This is how you change that:
3. Under “Size” > “Range” select “Automatic”
This will dynamically resize the space your
dashboard
takes depending on the size of the device it’s viewed
on

3
Making a
4. Double click the “Applications by
Ward Crosstab” and Applications
by Ward Map” sheet on the left
side of the window to add them
to the dashboard

1
How to Save Your Workbook
Nice! You’ve created your first
Dashboard. Now you might want to
save your work, and if you’re using the
full edition of Tableau it’s as easy as
clicking on “File” > “Save As” and
saving your work. On Tableau Public,
you only have the option of saving to
Tableau’s Public cloud. This means that
anything you create on Tableau Public
can be seen and shared with anyone.
Sheets, Dashboards,
and Stories oh my!
You might have noticed the three
buttons near the bottom of the screen
when you were creating your
dashboard. These are the three basic
workspaces that you can use in
Tableau
Sheets, Dashboards,
and Stories oh my!
• Sheet
• Used to create a single metric
• You must create sheets in order to use Dashboards
• Dashboard
• Houses multiple Sheets
• Great for tracking metrics
• Story
• Houses multiple Sheets and/or Dashboards
• Usually used for your final product
• Acts as a presentation, is used to show a data story
• Helps you present one of Seven Data Stories
Seven Data Stories
Change Over Time
• One of the most common stories, great for
tracking performance
• Company profit over time
Drill Down
• Focuses on a potential cause of a phenomenon
Zoom Out
• Explains how something affects the bigger picture
Contrast
• Shows the difference between two categories
Intersection
• How do two seemingly unrelated categories relate
• Is there a correlation between the
temperature in a region and the number of
batteries sold
Factors
• Divides a phenomenon into categories
• Sales by battery category
Outliers
• Helps show and/or explain an anomaly
• Does a certain distributor buy more of a
certain battery or not

https://help.tableau.com/current/pro/desktop/en-us/story_best_practices.htm
Section 1:

Congratulations! You’ve
completed the first part of this
course. I believe there is no
better way to learn than to do
and hopefully the exercises
you’ve done will give you a
feel for how to use Tableau.
As you use it for your own
purposes you’ll start to
formulate questions that can
generally be answered by
going to:
kb.tableau.com
This is the official Tableau
knowledge base and contains
all of their documentation for
Section 1:
users.
Section 1:
This is everything that we’ve learned so far!

• What is Tableau
• The Importance of Data Visualization
• What Makes Tableau so Great
• Ease of Visual Exploration and VizQL
• Hyper Database System
• How to Install Tableau
• Connecting to Static Data
• Anatomy of the Tableau Data Page
• Anatomy of the Tableau Workspace
• Cleaning Data
• Dimensions vs. Measures
• Tables
• Bar Charts
• Geospatial Data
• Basic Troubleshooting
• Resizing Dashboards
• Saving your Work
• Sheets, Dashboards, Stories
• Seven Data Stories
Objectives: Section

We’re now going to try and


create
more advanced visualizations.
We’ll continue to use the same
dataset.
In Chicago, independent films
only need to pay a permit
price of $25 per day per
location, whereas big budget
films need to pay $250 per day
per location. We’re going to
use this information to try and
classify permits as either “Big
Budget” or “Independent”.
Objectives: Section
Shashank Kalanithi
Objectives: Section

This section assumes you’ve


completed Section 1 and
have your workbook saves
from there. If you don’t you
can download a copy I have
from here:
https://public.tableau.com/
profile/s hashank.kalanithi

!/vizhome/Shash
ankTableauCourseSection1/Das
hbo ard1
Objectives: Section
Shashank Kalanithi
Calculated
There isn’t a field in the dataset that
tells
us if a film is a Big Budget film or an
Independent film, so we’ll have to
make our own field to determine
this. This new field we’re creating is
called a “Calculated Field” because
it is populated by values calculated
from other fields.

3
4

5
Calculated
We know that Big Budget films get charged
$250 per day of shooting while
Independents get charged $25 per day
of shooting. We also know what fees a
movie was charged through the
column [Total Fees]. Films can also
have some fees waived which must be
taken into account through the
[Waived Fees] column. What we need
to figure out now is how many days a
movie was shooting for.

3
4

5
Calculated
1. Create a new sheet
2. Under “Analysis” click “Create Calculated
Field”
3. Name your new calculated field, “Days
Spent Shooting”
4. Use the following formula
• DATEDIFF('day', [Application Start Date],
MAX([Application End Date], [Application
Expired Date])) + 1
5. If you don’t see a gray pop-up on the
side of the Calculated Field window, then
click this arrow to reveal it

3
4

5
Calculated
You can see that a pop-up with the
description of “DATEDIFF” pops up.
DATEDIFF is a function that is designed
to calculate the difference between two
dates using whatever time period you
want. (‘years’, ‘months’, ‘days’ etc.)
You can see the bolded text in the
description below shows you the
different parameters you need to fill in
this function with.
Calculated
DATEDIFF('day', [Application Start Date],
MAX([Application End Date], [Application Expired
Date])) + 1

The yellow represents the base


formula, all of the other components
are there to serve this overall formula.
In our dataset, go row-by-row and
find
This the
what difference between
the difference two our
between
dates
This and
startisdate
the add
and
start 1 to that.
end
datedate
of We
should
our addbe 1
calculation,
because
calculated
the day if in.
a film
For was filmed
us, we’d only
like to on
th the application is to start
the 8 is
which , then
usuallythethestart and
first end
day of date
th
shooting. , and the DATEDIFF is 0
are the 8
Calculated
DATEDIFF('day', [Application Start Date],
MAX([Application End Date], [Application Expired
Date])) + 1

This part represents our end date,


the date that shooting stopped. In
this dataset like many you’ll
encounter, the information is not
clear and the ending date is
whichever comes later,
[Application End Date] or
[Application Expired Date].
Awesome! You’ve created your
first calculated field. We have two
more to create just remember
that calculated fields are some of
the most powerful parts of
Tableau and any time spent
learning new formulas and tricks
with calculated fields is time well
spent.
Calculated
Now that we know the number of days
that were spent shooting and the fees
we need to figure out the fees per day.
1. Create another calculated field
2. Name it “Fees Per Day”
3. Use the following formula
• ([Total Fees] - ZN([Waived
Fees]))/[Days Spent Shooting]
4. Click “OK”

2
3

4
Calculated
([Total Fees] - ZN([Waived Fees]))/[Days Spent
Shooting]

Alright onto number two!


The basic formula here is to divide
the fees charged by the number of
days spend shooting which we
calculated in our first calculated field.
This ZN formula is a weird but very
useful one. Basically it means Zero if
Null. For the field [Waived Fees]
there are what we call Null values. A
Null value means no data
exists in that cell (record). This isn’t the
same as a 0 or a blank as those are
statements of value (0 doesn’t mean no
data). The reason we need the ZN
formula is because if a record in the
[Waived Fees] column is Null and we
try and subtract it from [Total Fees]
then we’ll get a Null value, ZN will
replace the Null with 0 so we get the
real value we’re looking for.
Calculated
([Total Fees] - ZN([Waived Fees]))/[Days Spent
Shooting]

An example of this would be:


[Total Fees] = 100
[Waived Fees] =
Null
[Total Fees] – [Waived Fees] = Null
This is not what we’re going for. If
the [Total Fees] is 100 and there are
no [Waived Fees] then we need an
answer of
0.Therefore our formula
evaluates as shown below:
[Total Fees] – ZN([Waived Fees]) = 100
Calculated
Finally we need to take the information from the last
calculated field and determine what type of movie
we’re looking at.
1. Create another calculated field
2. Name it “Independent or Big Budget”
3. Use the following formula
• IF [Fees Per Day] = 25 THEN "Independent“
ELSEIF [Fees Per Day] = 250 THEN "Big
Budget" ELSE NULL
END
4. Click “OK”

2
3

4
Calculated
IF [Fees Per Day] = 25 THEN "Independent“
ELSEIF [Fees Per Day] = 250 THEN "Big
Budget" ELSE NULL
END

Now we need to output a label to


help us determine whether a film is
a Big Budget or an Independent
film. An If-Then statement is a
simple way to achieve the desired
result.
If-Then statements follow this
basic structure:
If (CONDITION1) Then (ACTION1)
ElseIf (CONDITION2) Then
(ACTION2) Else (ACTION3)
End
Calculated
IF [Fees Per Day] = 25 THEN "Independent“
ELSEIF [Fees Per Day] = 250 THEN "Big
Budget" ELSE NULL
END

You can have any many “Elseifs” in your


statements although you’re
discouraged from making these too
complicated as that can make edits to
your logic hard to execute on.
This part is saying that “If the [Fees
Per Day] are 25 then this calculated
field equal “Independent””
“If the [Fees Per Day] are 250 then this
calculated field equal “Big Budget””
You should always use a final “Else” stateme
This tells Tableau you’re done with
your statement and that there are no
other conditions.
Calculated Fields
What did we just do? We created three
calculated fields.
Calculated fields are one of the most
powerful tools in Tableau, they essentially
allow you to add another column to your
dataset that you can use to expand what your
visualization actually shows
When you create a calculated field, the little
icon that denotes the field type will have a
little “=“ icon next to it which tells you that
it’s a calculated field.
Calculated Fields
To create a calculated field in the menu bar
select “Analysis” then “Create Calculated
Field…”, you’ll get an editor like the one
shown below

1. This is where you change the name of your


calculated field
2. This is the editor that you put your
calculation into
3. This will tell you if you calculation
was done correctly or if it has errors
4. This area gives you a list, description, and
example of every function available in
Tableau. If you don’t see this section, click
on the arrow inside the blue box below

1 4
2

3
Exercis
Using what you’ve learned, recreate
the
sheet shown below
Exercis
How to Filter
Click on the 2019 value under
“Application
Start Date” and select “Keep Only”
How to Filter
You could also create a filter by
dragging
the 2019 value from the “Dimensions”
shelf to the “Filter” card and
selecting the blue “year” and
“2019-”

1 2
How to Filter
You could also create a filter by
dragging
the 2019 value from the “Dimensions”
shelf to the “Filter” card and
selecting the blue “year” and
“2019-”
Changing Data
1. Right-click the date pill under
Columns
2. Select the first Month option

2
Line Graph
1. Under “Show Me” click on the first
line
graph option

1
Warning! Dates and
Line Graphs
Be careful which set of dates you use when
you’re creating line graphs.
If you look carefully you might notice
that there are two sets of date values
when we right- clicked on our date pill.
The top half highlighted in blue are
discrete values and the bottom half
highlighted in red are continuous.
Warning! Dates and
Line Graphs
Why is this important? To illustrate:
1. Duplicate this sheet so you don’t lose
your
work
2. Right-click your filter
3. Select “Edit Filter…”
4. Select the year “2018” so that you
include
data from 2018 and 2019 now
5. Select “OK”
Warning! Dates and
Line Graphs
Why is this important? To illustrate:
1. Right-click your filter
2. Select “Edit Filter…”
3. Select the year “2018” so that you
include
data from 2018 and 2019 now
4. Select “OK”
Warning! Dates and
Line Graphs
You might notice that the x-axis of the graph
has not changed at all. How can this
be? We’ve added an entire year’s worth
of values to our graph. Right-click the
date pill and you’ll see why.
Warning! Dates and
Line Graphs
Like we mentioned earlier, the Blue and
Red boxes represent whether Tableau
graphs the dates as Discrete or
Continuous respectively. But what about
the black box?
The black box just changes how the values
are represented on the axis. To more clearly
illustrate, let’s change the value in the
black box to be “Discrete”
Warning! Dates and
Line Graphs
You’ll notice that only the axis changed, we’re
still only seeing 12 x-values for two years of
data. This is because we chose “Month” from
the “Discrete” section therefore Tableau is
creating the months as a categorical
variable, meaning it’s adding all of the
January’s together and all of the February’s
together without regard to year. As you can
expect this isn’t what we really want. On this
sheet continue to play around with these
settings to see if they help you understanding.
Line
1. When you’re done experimenting go
back to the sheet “Big
Budget vs Independent
Movies by Year”
2. It would be helpful if the axis
had labels corresponding to the
names of months. Right click the
date pill and select “Discrete”

2
Line
3. As an option if you want you
can use this dropdown to tell
Tableau how to fit this
visualization onto the screen

3
Section 2:
Great job! It looks like over the year 2019
Chicago brought in some good business
with hundreds of permits being given to
Big Budget movies. This concludes Section
2 of the course. Go ahead and save your
work. I will leave a copy of this workbook
here:
https://
public.tableau.com/profile/shashank.kalani
thi!/vizhome/ShashankTableauCourseSecti
on2/B igBudgetvsIndependentMoviesbyYear
Section 2:

This is everything that we’ve


learned in Section 2!

• Calculated Fields
• Date Calculations
• If-Then statements
• Filtering Data
• Line Graphs
• Continuous vs. Discrete Dates
Section 3:

For this part of the course


we’re going to be looking at
some player stats from the
English Premier League.
The English Premier League is
the highest level of soccer
(football) played in England
and one of the most watched
sports leagues in the world.
We’re most interested in plotting
data on player salaries.
Connect to
First, let’s see if we can get some basic
information on players.
1. Open a new Tableau workbook
2. In the connect menu, click on
“Microsoft Excel”
3. Import the “Player Data” workbook
Connect to
It looks like the data that we have has
split players that are from England with
those who aren’t from England, let’s see
if we can combine the two datasets.
4. Drag in the sheet labeled “English Players”
5. Right-click on the sheet you just
dragged or click on the downward
facing arrow
6. Click on “Convert to Union”

5
6
4
Connect to
7. Drag in the sheet labeled “Non-
English Players”
8. Click “OK”

3
Potential
If you find that the window you’re
supposed to drag “Non-English
Players” to disappears when you
try and drag the sheet in, it’s just
hiding behind the main Tableau
window. Move that window and
you’ll be able to drag it in as
expected.

2
Potential 3
Union
What we just did was union our data. You will start to notice that
Tableau borrows the terminology for a lot of its concepts from SQL
and Excel. This makes the tool really easy to pick up if you’re
familiar with these concepts and the skills you learn here easy to
transition to SQL and Excel if you’re just learning them now.
Put simply, a union (SQL Union) is the process of stacking two
tables on top of one another. You will usually do this when your
data is split up into multiple sections like an excel spreadsheet of
a year’s sales split by month.
As you can see from below, Tableau stacks the columns based on
column name, and will put in null values wherever a value for a
given column doesn’t exist

Style Color Model Style Color Model Car

Big Blue 30X Medium Chartreuse 123H Atoyot


Small Red 50X
Puny Gamboge 313J Drof

Union

Style Color Model Car


Big Blue 30X Null
Small Red 50X Null
Medium Chartreuse 123H Atoyot
Puny Gamboge 313J Drof
Union
The Premier League is very
diverse and although it’s an
English league it has players
from all over the world.
Because of this the data for
both player sets is separated
and we have to combine them
somehow. Because the table
structures are almost exactly
the same we can easily
combine the two groups
together through a Union.
Inspect the
Before you ever start creating
visualizations
you should always inspect your data.
If you look under your Dimensions you
might notice a Dimension that should
probably a Measure
At this end of this booklet I’ve compiled
a small checklist that will help you
inspect your data for common errors
that I have encountered in my career.
Inspect the
If you guessed [Height] then you were
right! It’s a string a Dimension which
doesn’t sound right. Drag it into the
Rows shelf so we can take a look at it.
You’ll notice that it’s in a format that
won’t let us do much with it. Let’s
convert it to a Measure, this will
require us to convert it to either feet
or inches. Let’s go with inches
1. Click on the “Data Source” tab
Splitting
If you scroll to the right side of your data
pane, you’ll notice that the “height”
column is formatted in a way that will
make it difficult to analyze
As a string we can’t perform normal math
on
this column
Because there are two units in one column
“feet and inches” it’ll make it even more
difficult to perform math on this data,
we’ll want to convert this to just a
single unit column
Splitting
1. Right click on the “Height” column
and
click “Custom-Split”
Splitting
2. In the textbox input an apostrophe
as
the separator
•This is telling Tableau to split this
column when it sees an apostrophe
3. Select “All” under “Split Off”
•This tells Tableau to split off as
many columns as there are
apostrophes per record
4. Click “OK” and return to Sheet 1
Splitting
8. Right-click the Height – Split 1
• Do the same for Height – Split 2
9. Change the data type to
“Number (whole)”
• Do the same for Height – Split 2
10. Drag Height – Split 1 from
Dimensions to Measures
• Do the same for Height – Split 2
Data
You might have noticed I the little
icons next to the names of your
Measures and Dimensions. Here is
an overview of what the different
datatypes are and when they
should be used

String A string is a catchall


datatype because it can
Data include just about any
textual data
(Text)
Numeric Numerical data is any data that math
can be performed on. Be warned, that

al Data
although an ID number is numerical,
you can’t perform math on it so you
should classify it as a string

You could almost have an


Date Data entire course discussing the
intricacies of dates. For more
info click here.

Date Time Date with time included as


well, you generally want to
Data avoid these if they aren’t
necessary.

Boolean True/False values. Well designed


datasets will have these for
Data common filters like Current FY
vs prior FY
(True/False
Data
)
Geospati This can be in the form of place
names (postal codes, states,
al Data countries, cities) or geospatial
objects
Manipulating
We have converted and brought in the
constituent parts that we’ll need to create
a
single field for height
1. Click on “Analysis”
2. Click on “Create Calculated Field”
3. Name the new field “Height (Inches)”
4. ([Height - Split 1] * 12) +
[Height - Split 2]
5. Click on “OK”
1

5
Aliase
Sometimes data we import is
encoded in a way that won’t make
sense for a visualization. When this
happens a solution is to change the
aliases of the data. Changing aliases
does not change the underlying
data, just how it appears in your
visualizations.
1. Right-click “Skill Moves” and
click “Aliases…”
Aliase
2. Use the “Value (Alias)” column to fill in the
aliases
as shown below:
1. Weak Skill
2. Below Average
3. Average Skill
4. Above Average
5. High Skill
3. Click “OK”
Now whenever the value 1 appears in relation
to [Skill Moves] Tableau will instead show “Weak
Skill”, i.e. an Alias

2 3
Aggregation
3. Drag:
•Height (Inches) into Columns
•Weight into Rows
•Right-click the measures
4. Change the aggregation of
both your Height and Weight
from Sum to Average

1
Aggregatio
Aggregations are one of the most
important and most complicated
data- related concepts for newbies
to understand. The basic idea is
that you might want to collapse or
expand a table in different ways
and aggregations tell Tableau what
to do with your Measures when you
do that. Let’s look at a simple
example.

Year Month Sales Revenue

2020 January 10 $100


2020 February 12 $120
2021 January 14 $150
2021 January 13 $120
Aggregation:
Take a look at the table below, you’ll notice
that it has numerical data by year and month.
Say we wanted to look at the data by year,
we’d need to tell Tableau how
to group (Aggregate) the numerical values. The
default way is to SUM it, meaning just add all
the values within their respective groups, in
this case the [Year] column.

When we group by [Year] then we will need


to sum all of the corresponding [Sales] and
[Revenue] columns.

Yea Month Sales Revenue


r
2020 January 10 $100
2020 February 12 $120

2021 January 14 $150


2021 January 13 $120

Yea SUM(Sales) SUM(Revenue


r )
Aggregation:
2020 22 $220
2021 27 $270
Aggregation:
There are numerous ways to aggregate your
data, over the next few pages I’ll go over the
most common ones I’ve used.

AVG or average, just takes the average of all the


data
you’re aggregating.

Here we’ll be averaging the [Revenue] column and


summing the [Sales] column

Yea Month Sales Revenue


r
2020 January 10 $100
2020 February 12 $120
2021 January 14 $150
2021 January 13 $120

Yea SUM(Sales) AVG(Revenue


r )
2020 22 $110
Aggregation:
2021 27 $135
Aggregation: MAX/MIN
MAX and MIN look for the smallest value per
each of your groupings and outputs that value.

Here we’ll be maxing the [Revenue] column and


minning the [Sales] column

Yea Month Sales Revenue


r
2020 January 10 $100
2020 February 12 $120
2021 January 14 $150
2021 January 13 $120

Yea MIN(Sales) MAX(Revenue


r )
2020 10 $120
2021 13 $150
Aggregation:
COUNT/COUNTD
COUNT, counts the number of rows in a group
while COUNTD will count the number of unique
rows in a group.

Here we’ll be counting the [Month] column and


also countd’ing (pronounced “count-dee-ing”) the
[Month] column

Yea Month Sales Revenue


r
2020 January 10 $100
2020 February 12 $120
2021 January 14 $150
2021 January 13 $120

Yea COUNT(MONTH) COUNTD(MONTH


r )

2020 2 2
2021 2 1
Aggregation: ATTR
ATTR is a very special aggregation function. It
will return a value if there is only one for every
row in a group, otherwise it returns an
asterisk“*”. This is a really useful function to
use in calculated fields where you cannot
normally combine aggregated and non-
aggregated values.

Here we’ll be taking the attribute of the [Month]


column.

Yea Month Sales Revenue


r
2020 January 10 $100
2020 February 12 $120
2021 January 14 $150
2021 January 13 $120

Yea ATTR(MONTH)
r
2020 *
2021 January
Scatterplot
Let’s see if there is any relation
between
player position and compensation.
1. Drag “Position” into “Colors” on the
Marks card
• If you get a warning, select “Add all
Members”
Scatterplot
We have our scatterplot, but it’s almost
impossible to tell what’s what. Let’s see
if
we can change the axes to help with
that
Editing Axis
1. Right click the y-axis and click on the
“Edit Axis” option
2. This dialog box allows you to
manipulate the axes as you please,
uncheck the “Include Zero” box
3. You can also change the title here
if you want to
4. Go ahead and follow the same
process for the x-axis to remove
the 0 value

3
Axes Don’t Have to Include
There’s a common misconception that the y-
axis needs to start at 0. While messing with
the y-axis is one of the easiest ways to Lie with
Statistics, oftentimes you need to adjust the
axes depending on the audience. The Fox
News example below is particularly egregious
example of fiddling with the y- axis to prove
a point (They even moved the axis to
the left hand side so you’re likely to miss it
entirely).
The axis starts at 34% to make the ~5%
marginal increase seem much larger than it
actually is.
Axes Don’t Have to Include
You should always make an effort to represent
your data accurately, but it can be just as
important to consider the context surrounding
your visualization.
An example of this would be a visualization I
needed to put together to illustrate how profits
changed during the 2019-2020 Coronavirus
Pandemic for our execs.
While profits did in fact drop, you could only
visualize the difference and analyze it if you
zoomed in on the y- axis, if I pegged the axis to
0 then you wouldn’t think the pandemic did
anything to our profit.
The two graphs below show the exact same
data, but to the informed audience (such as
your company execs) the one on the right-hand
side gives you more information. As a
visualization professional, you need to digest
data, and communicate information.

2 uary February March


0 April
1
Axis
0
5
J
4 a
3 n
Axes Don’t Have to Include
A
4.2 dj
4.1 us
4 te
3.9 d
3.8 A
3.7 xi
s

Ja
nu
ar
y

Fe
br
ua
ry

Ma
rc
h

Ap
ril
Moving the Graph
While conducting an exploratory analysis, you
might find that you need to manually
manipulate the graph. Here are a few
commands that allow you to do so.
• You can use Ctrl + Scroll to zoom in and out of a
viz
• You can use Shift + Drag to pan around a viz
• Double-click your current sheet and rename it to
“Average Measurements by Position”
• You can also right-click any axis and click “Edit
Axis”
Grouping Your
Sometimes the level of detail you
have is too granular, in these
situations grouping the data might
help you out. In the example below, all
these positions listed individually don’t
really help us out that much.
Grouping data adds a column that
categorizes
the data the way you want
1. Right-click [Position] and click “Group…”
Grouping Your
2. Ctrl + Select the positions that go
together and click on “Group”
•Make sure you check the “Include Other”
checkbox
3. Click “OK”
4. Click on “Data Source”

3
4
Joins and Importing
Unusual Data
They say that Data Scientists spend about 70-
80% of their time just gathering and cleaning
data. Oftentimes this happens just because
data comes in inconvenient formats. Today,
we’ll be working with one such format in the
form of a table inside a PDF. Tableau has the
ability to help us parse through this kind of data
quite easily.
We don’t have access to some very basic player
compensation info. Unfortunately this data is in a
PDF format which would normally make it very
hard to work with, but not with the power of
Tableau!
Joins and Importing
Unusual Data
1. Using the “Add” button above our first data
source, import the PDF “Data to Join”
2. On the dialog box that pops up, click “All” to
import
all of the pages in the PDF
Joins and Importing
Unusual Data
3. Click on the first Page, hold onto
the Shift key on your keyboard
and scroll down to select the last
Page, drag all of these Pages to
the right of “English Players+”
Joins and Importing
Unusual Data
4. Click on the Venn Diagram
between the two data
connections
5. Select the Left Join option
6. Go back to the sheet we were
working on

6
Joins
What we just did was join our data. Joins are
another concept that Tableau borrows from
SQL. In the way that Unions combine two tables
vertically, Joins combine two tables horizontally.
For a join, like a Union you have to have at least
two tables, what we call our Left Table and our
Right Table. You (mostly) have to have at least
one matching column between the two tables,
and you will match rows from these columns.
The most common way to visualize the types of
Joins are through Venn Diagrams.

ID Left Table
Color Model ID Right
Car Table

1 Blue 30X 1 Atoyot

3 Red 50X 2 Drof


Joins
There are four basic joins that you can use.

Inner Join

Left Join

Right Join

Full Join
Joins
You’ll mostly be sticking to Left and Inner Joins.
It’s worth your time to learn more about Joins
because they are some of the most powerful
tools you can use to manipulate data. I use
Joins basically every single day in my work. For
this course we’re going to stick with relatively
simple Joins.
Joins: Inner Join
We’re now going to do something called an
Inner Join on the [ID] column which will only
output exact matches from the [ID] column in
our output.

ID Color Model ID Car

1 Blue 30X 1 Atoyot


Join
3 Red 50X 2 Drof

ID Color Model ID Car


(Right)
1 Blue 30X 1 Atoyot
Joins: Left Join
A Left Join keeps all of the data from your Left table
and whatever matches from the Right table.

ID Color Model ID Car

1 Blue 30X 1 Atoyot


Join
3 Red 50X 2 Drof

ID Color Model ID Car


(Right)
1 Blue 30X 1 Atoyot
3 Red 50X Null Null
Joins: Right Join
A Right Join does the exact opposite and keeps
everything from your Right table while only
bringing in the matches from the Left table.

ID Color Model ID Car

1 Blue 30X 1 Atoyot


Join
3 Red 50X 2 Drof

ID Color Model ID Car


(Right)
1 Blue 30X 1 Atoyot
Null Null Null 2 Drof
Joins: Full Join
A Full Join brings in everything from both
tables and matches whatever will match from
the columns you specify.

ID Color Model ID Car

1 Blue 30X 1 Atoyot


Join
3 Red 50X 2 Drof

ID Color Model ID Car


(Right)
1 Blue 30X 1 Atoyot
Null Null Null 2 Drof
3 Red 50X Null Null
Joins: Common Join
Joins can get a bit tricky because of the potential
for gotchas when joining two tables. The most
common one is row duplication where you
accidentally duplicate
rows because the columns you’re matching on have
multiple potential matches .
In the example below we’re going to try an
Inner Join. You’ll notice the columns in Orange
were duplicated.

ID Color Model ID Car

1 Blue 30X 1 Atoyot


Join
3 Red 50X 1 Adnoh

ID Color Model ID Car


(Right)
1 Blue 30X 1 Atoyot
1 Blue 30X 1 Adnoh
Joins: Common Join
This isn’t an error per se but it is something to
watch out for as it can cause you to duplicate
data you don’t intend to duplicate.

ID Color Model ID Car

1 Blue 30X 1 Atoyot


Join
3 Red 50X 1 Adnoh

ID Color Model ID Car


(Right)
1 Blue 30X 1 Atoyot
1 Blue 30X 1 Adnoh
Joins and Importing
Unusual Data
7. Value, Wage, and Release Clause
are all amounts in Euros so
make sure to change them to
numbers and move them to
Measures
8. Right click one of the new
values and click Default
Properties > Number Format

8
Changing the Default
Number Format
9. Click on “Currency (Custom)”
10. Enter a € symbol under “Prefix”
• You can get this symbol by opening
a word document and clicking Ctrl +
Alt + E and copying the Euro symbol
into Tableau
11. Repeat this for the “Wage” and
“Value”
columns of data

9
Exercis
Now let’s see if we can visualize
how much each [Club] spends on
their players by [Position (group)].
Try and recreate this visualization
Exercis
Here’s how you would do it
Exercis
You’ll notice that the aggregation
of our data is CNTD, or Count
Distinct. This means that the
length of the bars corresponds to
the number of distinct values in
that group. The default
aggregation of this data is CNTD
instead of SUM because we
converted it from a string. Right-
click and change it to SUM
Exercis
Then you just need to sort the data
in descending order using the
button at the top
Exercis
Let’s rename the visualization to:
Player
Value by Club
Marks
The Marks Card is one of the key ways you’ll
change your visualizations in Tableau. Put
generally, the Columns and Rows shelf is where
you place your data to create the basic form of
your visual, and you change the details using
the Marks Card.
Marks are how your rows are represented in
Tableau, they are the individual “points” on your
visualization. If you want to get a better idea of
what a Mark is, try changing the Mark type
using the dropdown highlighted in blue.
Marks
If you want to change any individual
aspect of your visualization, the
Marks Card should be your first
stop. From here you can change
colors, how much detail is in your
visualization, whether labels show
up, and what the tooltip that comes
up when you hover over a mark
says.
End of Graph
Bars are great, but this graph would
benefit from having labels on it at least
for the full value of the bar
Using what we learned about the Marks
card
you might want to add labels to the chart
Dragging the [Value] measure to the
Label Marks Card (and changing the
aggregation) gives us something that
looks like the below.
This isn’t particularly useful, let’s see if we
can get the labels at the end of the graph.
End of Graph
1. Click on the “Analytics” tab on the
upper
left of the interface
2. Double-click on “Reference Line”
3. Select the “Per Cell” option
4. Aggregate by Sum
5. Fill the label with Values
6. Hit OK
7. Remove the [Value] measure
from the Marks Card
1

3
4
2
5

6
End of Graph
Here’s what we end up with.
When we put the [Value] column in the
“Label” part of the Marks Card, Tableau
tried to label at the lowest level of
aggregation on the chart which was the
player [Position (group)] level. What we
did to fix this was to add a line at the end
of the graph that was aggregated at the
[Club] level and then labeled that line.
Adding an Average
This visualization is great and
clearly communicates where teams
are relative to each other, but
what about relative to the
average?
1. Drag “Average Line” to the
“Table” section of the menu
(Under the “Analytics” menu

1
Adding an Average
2. Right-click the average line and
select
“Edit…”
Adding an Average
3. Select “Custom”
4. Type in: <Computation>: <Value>
5. Hit OK
6. Click on the Connect Menu icon

3 4

5
Import Spatial
We happen to have some data on
stadium locations. Let’s see if we
can bring this in and correlate it to
the numerical data
we’ve compiled.
1. Select “Spatial File”
2. Import the “Data to Blend” dataset
Import Spatial
3. Right-click on Name
4. Click Rename and name it “Club
• Rename Description: Stadium
5. Create a new Sheet and navigate to
it
Spatial
The file we just imported is
something called a .KML file.
This is a spatial file type that
stores information about
geography.
Spatial files can store three
basic geospatial data types:
•Points
•Lines
•Polygons
You can use these to create
visualizations that would
otherwise be very difficult to
create.
Data
Now we need to relate our spatial
dataset
to our player data.
1. Click on Data on the menu tab and
select “Edit Relationships”
Data
You’ll see that from your primary
source to your secondary source,
they are automatically linked on the
[Club] column.
This is the menu where you would
define what columns are linked
together if you needed to.
• Press “OK” or exit out of this menu
Data
Like Joins and Unions, the Data Blend is a way to
combine data from multiple sources.
A Data Blend emulates a traditional Left Join, the
difference is that a Join will combine the data
row-by- row then aggregate the resultant table,
whereas in a Data Blend, the tables will be
aggregated as necessary for the view, then
combined.
One important difference with a Data Blend vs a
Left Join is that a Data Blend will not duplicate
results, instead, if there is a many-to-one
relationship between the two data sources that
you’re combining, then Tableau will output an
asterisk “*”.
A Data Blend will never output more rows than exist
in
the “Left” or Primary table.
The next page will compare a Left Join to a
Data Blend where we have multiple matches to
illustrate the differences between the two.
Data
You’ll notice in this Left Join the data duplicates as
we’d
expect it to. (Highlighted in Orange)

ID Color Model ID Car


Left Join
1 Blue 30X 1 Atoyot

3 Red 50X 1 Adnoh


ID Color Model ID Car
(Right)
3 Ustahiad
1 Blue 30X 1 Atoyot
1 Blue 30X 1 Adnoh
3 Red 50X 3 Ustahiad
Data
In this Data Blend, because we don’t have a
single clean match for [ID] = 1 in our right
table, Tableau doesn’t know which value to use
and will output an asterisk instead.

ID Color Model ID Car


Data Blend
1 Blue 30X 1 Atoyot

3 Red 50X 1 Adnoh


ID Color Model Car
3 Ustahiad
1 Blue 30X *
3 Red 50X Ustahiad
Data
In Tableau you’ll know what fields
you’re linking a blend on based on a
chain symbol next to the field on
the secondary source. A red link
means that that fields are linked, a
grey link means they aren’t being
actively linked now. You can click
on this icon to toggle this setting.
Geospatial
1. Double-click [Geometry]
2. Drag [Club] to the Details
section of the Marks card
3. Under the Show Me menu,
change the graph to be a
symbol map

3
2

1
Geospatial
4. Select the English Players+ data
connection
• Look at the blue box and you’ll see that
Tableau automatically linked this data
source using the [Club] field
5. Drag [Value] to “Size” under the Marks
Card
• Make sure the aggregation is a SUM
6. Click on Size and edit the slider until
you
find a size you’re comfortable with
4

5
Geospatial
You can use the Color section of
the Marks card to edit the opacity,
colors, and the border of the Marks
on your visualization
Generally I like to use 75% opacity
and make my borders black. I find
this tends to make your
visualizations pop a little more but
experiment and see what works
best for you.
Parameters and
Parameter
1. Drag [Club] under the English
Players++ dataset to the
Filters card
2. Select the “Top” tab
3. Select the “By field” radio button
4. On the second dropdown, select
“Create a New Parameter”

2
3 4

1
Parameters and
Parameter
5. Change the
Parameter’s Value Display As
name to “Top
Teams by 1 Top Team
Player Value”
6. Change “Allowable 5 Top Five Teams
values:” 10 Top Ten Teams
7. Copy the
values in this 20 Top Twenty Teams
table into the
“List of values”
section
8. Click “OK”

7
Parameters and
Parameter
9. Select [Value] and “Sum” on the
next
two drop downs
10. Click “OK”

9
Parameters and
Parameter
11. Right click your new
Parameter (below Measures), and
make sure that “Show Parameter
Control” is selected
12. Name your new sheet “Top
Teams by Player Value Map”

10
Parameters
Parameters are constants that
can be used all over Tableau.
You can use it to dynamically
change a calculation, to filter
data by measures, or allow your
audience to define certain criteria
when exploring your data.
In the example we just finished,
we used a Parameter to allow
the user to change which teams
they saw on the visualization.
When setting our filter normally,
we’d have to select a constant to
filter on, but because we
connected it to a Parameter, we
change that constant as we
please.
Choropleth
1. Create a new sheet
2. Right-click [Nationality] and
select Geographic Role and
Select
“Country/Region”

1
Choropleth
3. Double-click on [Nationality]
4. Under Show Me click on
the Choropleth Map
5. Drag [Value] to Colors
6. Change the aggregation to SUM for
colors if you need to

3
Correcting
1. Click on the “6 Unknown” icon on
the
bottom right
2. Click on “Edit Locations”

1
Correcting
Tableau is very strict in how it
recognizes geographic locations so
this is something you’ll find yourself
doing often
3. Correct Macedonia and
Korea Republic
4. Click “OK”

4
Correcting
5. Name your sheet “Map of
Countries by Player Value”
6. Create a new sheet

5 & 6
Exercise
Now let’s see if we can create a
visualization of the top five
players by [Position (group)]
Recreate the visualization below.
Using Index to Rank
1. Create a calculated field called
“Index”
2. The formula is “Index()”
Using Index to Rank
3. Right-click [Index] to convert the
measure into a discrete value
Using Index to Rank
4. Place [Index] in between [Position]
and [Name] in your rows shelf
Using Index to Rank
5. Right-click [Index] in the Rows shelf and select
“Edit Table Calculation…”
6. Under “Compute Using” click on
“Specific Dimensions”
7. Choose “Restarting every” [Position (group)]
8. Make sure all of the columns are selected
9. Under “Sort Order”, select “Custom” and sort in
descending order by [Value] and aggregate by
SUM

8
7

9
Using Index to Rank
10. Drag [Index] from the Rows Shelf
to
the Filters Card
11. Select numbers 1 through 5 and
hit
“OK”

10

11
Using Index to Rank
What we just did was sort all of
the data in our visualization in
descending order by [Position
(group)] and then attach an index
which counted each position. We
then filtered on this index which
allowed us to only include the top 5
values per [Position (group)]

10

11
Dashboard
1. Go to the “Map of Countries by
Player
Value”, right click the sheet
tab and select “Duplicate as
Crosstab”
Dashboard
2. Bring the [Club] and [Name] of
the players into the “Rows”
shelf in the order shown
below
Dashboard
3. Create a new Dashboard and set the
size to “Automatic”
Dashboard
4. Recreate the Dashboard below by
dragging in the map and the crosstab
Dashboard
5. Click “Actions” option under the
“Dashboard” menu item
Dashboard
6. Under “Add Actions” select “Filter”
Dashboard
7. Under the “Source Sheets” select
the
“Map of Countries by Player Value”
8. Under the “Target Sheets” select
the “Map of Countries by Player
Value (2)”
9. Make sure the action runs on
“Select”
10. Under “Target Filters” select
“Selected Fields” and use the
[Nationality] as the column to
filter on

7 9
10

10
Dashboard
You’ll notice that as we click around
to different countries on the map
the values in the table change
Dashboard
In Tableau, Dashboards are a
collection of sheets that when
viewed together give the user
a fuller picture of the data
being viewed. As such, it
makes sense that actions
taken on one sheet in a
dashboard would affect other
parts of the dashboard. This is
where dashboard actions come
in handy.
Dashboard actions allow us to
define how an action such as
selecting or highlighting a
mark will affect the
visualization such as through
filtering.
Dashboard
You’ll notice that as we
click around to different
countries on the map the
values in the table
change
Differentiating
You might notice that the colors on
this map are not well
differentiated, let’s see if we can
change that
1. Click on the drop-down for the color
legend and select “Edit Colors…”
Differentiating
The problem here is that the range of SUM[Value]
is too great here. We’ll address that in a moment,
for now let’s change the following settings to see
if any change takes place
2. Click on the “Advanced” option
3. Set the Start and End values to be
10,000,000 and 500,000,000 respectively.
The Center value will automatically reset
4. Select the “Stepped Color” option and click “Apply”
You might notice that there isn’t a large difference
in the graph. We can more clearly see the nations
that have a lot of high payed players (Brazil,
France, Spain), but most of the map is still
highlighted ineffectively.

4
2

3
Creating a
5. Go back to “Map of Countries by
Player Value”
6. Right-click [Nationality], and select
“Create”>”Set…”

5
Creating a
7. Name the Set “Top 10 Countries by
Salary Value” then the “Top” tab
8. Select the “By field” option.
“Top” “10” [Value] “Sum” and
hit “OK”

8
Creating a
9. Drag your newly created Set to the
filter card
10. Go back to the “Edit Colors” menu

10

9
Creating a
9. Drag your newly created Set to the
filter card
Set
In Tableau there are generally
multiple ways to accomplish a
desired result. A set provides
a very easy way to add a
custom field that defines a
subset of data as either “In” or
“Out” of the set. In this case
we use it to limit our map to
only countries with large
combined player salaries. As
you might have figured, we
could have also accomplished
this result using an Index, a set
is just another way to do so.
Section 3:
Congratulations! This is the end of
Section 3. You now have several
visualizations you can use to
analyze player salaries. The point of
this section was not so much to
answer a question as it was to
show you how you could do the
same thing multiple ways in
Tableau.
Section 3:
In Section 3 you learned how to:
•Connect to PDF data
•Union Data
•Join Data
•Blend Data
•Alias Data
•Split Columns
•Define Data Types
•Aggregate data
• SUM
• AVG
• COUNT and COUNTD
•Edit Axes
•Make Scatterplots
•Group Data
•Use the Marks Card
•Add Labels to the end of a Bar Graph
•Create Parameters and Parameter Actions
•Correct Locations
•Create Dashboard Actions
•Create Sets
End of
Section 1 Stories
Section 2:
• What is Tableau • Seven Data
Stories • Calculated Fields
• The Importance of • Date Calculations
Data Visualization • If-Then statements
• What Makes Tableau • Filtering Data
so
Great • Line Graphs
• Ease of Visual • Continuous vs.
Exploration and Discrete Dates
VizQL
• Hyper Database
System
• How to Install
Tableau
• Connecting to Static
Data
• Anatomy of the
Tableau Data Page
• Anatomy of the
Tableau
Workspace
• Cleaning Data
• Dimensions
vs. Measures
• Tables
• Bar Charts
• Geospatial Data
• Basic Troubleshooting
• Resizing Dashboards
• Saving our Work
• Sheets, Dashboards,
End of
Section 3:
• Connect to PDF data
• Union Data
• Join Data
• Blend Data
• Alias Data
• Split Columns
• Define Data Types
• Aggregate data
• SUM
• AVG
• COUNT and COUNTD
• Edit Axes
• Make Scatterplots
• Group Data
• Use the Marks Card
• Add Labels to the end of
a Bar Graph
• Create Parameters
and Parameter
Actions
• Correct Locations
• Create
Dashboard
Actions
• Create Sets
Resource
•kb.tableau.com
• Tableau’s online documentation. Information
on all the features and examples are
available here. Most questions about
Tableau can be answered here or on
Tableau’s forums
•Storytelling With Data
• A great book to get you started with
learning how to present data. It is platform
agnostic but everything here can be easily
applied to Tableau
•Makeover Monday
• This is a global movement where every
Monday a new dataset along with
visualization is released and participants are
asked to redesign the visualization usually
using Tableau
•The Big Book of Dashboards
• Probably the best resource I know to help
build Tableau dashboards specifically. Has
lots of tips on specific improvements you
can make to your dashboards.
Data
Over time you’ll find that a large proportion of your
time will be dedicated to gathering a cleaning data.
Developing a process to do this effectively and
efficiently will save you a lot of time in the future.
This is a checklist of things to look out for when
forming your data sources.
This list is my no means comprehensive but hopefully will
help save you some time and improve the quality of
your deliverables

 What is the question  Should I Alias any fields


I’m  What filters can I
trying to answer apply at the get go
 Is all my data in the  What calculated
right fields do I need and
format do I have all the
 Do I have too much information to make
data them
 Is there any  Is my data
information I can Join aggregated at a level
or Blend to improve that allows me to
my analysis provide useful
 Do I need to split insights
any columns
 Are all my fields
logically
named

You might also like