Tableau Performance Checklist
What / Who is concerned? / How
Workbook Optimizer: a one-shot report, available at publish time.
Performance Recording.
http_requests reporting (Tableau Server Repository).
To start recording performance:
Help > Settings and Performance > Start Performance Recording
To stop recording and view a temporary workbook containing the results of the recording session:
Help > Settings and Performance > Stop Performance Recording
You can now view the performance workbook and begin your analysis.
On Tableau Server, you can instead append :record_performance=yes to the view URL, for example:
http://10.32.139.22/#/views/Coffee_Sales2013/USSalesMarginsByAreaCode?:record_performance=yes&:iid=1
Then load the view, or click the Refresh button in the toolbar.
View a Performance Recording
Click Performance to open a performance workbook. This is an up-to-the-minute snapshot of performance
data. You can continue taking additional snapshots as you continue working with the view; the performance
data is cumulative.
Move to a different page or remove :record_performance=yes from the URL to stop recording.
The Workbook Optimizer is available at the end of the "Publish Workflow" window and at the end of the Publish menu.
The http_requests table (and the associated view) from the workgroup database in the Tableau Server Repository can also help. It is used in the Tableau Server Insights data sources available here: https://github.com/tableau/community-tableau-server-insights
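One way to exploit those rows: export them from the Repository and rank views by average request duration. A minimal sketch; the column names (currentsheet, created_at, completed_at) are taken from the http_requests table but may differ between Server versions, so treat them as an assumption:

```python
from datetime import datetime

def slowest_views(requests, top=5):
    """Average request duration per view from exported http_requests rows.
    Each row is assumed to carry 'currentsheet', 'created_at', 'completed_at'."""
    totals = {}
    for r in requests:
        dur = (r["completed_at"] - r["created_at"]).total_seconds()
        count, total = totals.get(r["currentsheet"], (0, 0.0))
        totals[r["currentsheet"]] = (count + 1, total + dur)
    avgs = {view: total / count for view, (count, total) in totals.items()}
    return sorted(avgs.items(), key=lambda kv: kv[1], reverse=True)[:top]

# Illustrative fake rows standing in for a Repository export:
rows = [
    {"currentsheet": "Sales/Overview",
     "created_at": datetime(2024, 1, 1, 10, 0, 0),
     "completed_at": datetime(2024, 1, 1, 10, 0, 6)},
    {"currentsheet": "Sales/Detail",
     "created_at": datetime(2024, 1, 1, 10, 1, 0),
     "completed_at": datetime(2024, 1, 1, 10, 1, 2)},
]
ranking = slowest_views(rows)
```

The ranking gives you the views to investigate first with a performance recording.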
Pro / Cons & limits / References

Very easy to use, with a good, detailed "one-shot" report. Be careful: you won't see the details if the data source is published on a server (in that case, download the data source and open it on your desktop). https://help.tableau.com/current/

A good starting point for insights. The guidelines are necessarily general and may not apply in every situation; these suggestions are a starting point only, so always frame your decisions in the context of your environment and the goals of your workbook.
Item
Machine: Server / Desktop / Cloud
Tableau Version
Respect of requirements
Consider upgrading Tableau to the latest release, especially if you're more than a year behind. Software editors frequently ship optimizations (cache management, improved queries, etc.).
Please check the Tableau requirements:
https://www.tableau.com/products/techspecs
https://help.tableau.com/current/server/en-us/server_hardware_min.htm
Hard drive type
Not all SSDs are equal: PCIe 6 bandwidth is 8x higher than PCIe 3.
The network can hurt the performance of your dashboard in several ways, from firewalls to VPN bottlenecks, both in upload (the query you send) and in download (the query results you receive). If you use a VPN, try to reproduce the issue without it; if you're on Desktop, try to reproduce it on Server, and so on. Some tools are designed specifically for this, but even a simple ping can help.
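The ping idea can be automated: time a few round trips against the server and compare the numbers with and without the VPN. A minimal sketch; the probe callable is a placeholder you replace with a real request (e.g. a small HTTP GET against the Tableau Server):

```python
import statistics
import time

def probe_latency(probe, samples=5):
    """Time `probe` (any callable performing one round trip) several times
    and summarize the results in milliseconds."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        probe()
        timings.append((time.perf_counter() - start) * 1000.0)
    return {"min_ms": min(timings),
            "median_ms": statistics.median(timings),
            "max_ms": max(timings)}

# Placeholder probe; swap in a real network call for actual measurements.
summary = probe_latency(lambda: None)
```

A large gap between min and max, or between the two environments, points at the network rather than the workbook.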
Relationship/join/union
blending
Number of connections
Custom SQL
Row-Level Security
This part is probably the hardest, because your data model is above all linked to a functional need. This document gives general advice that must be applied with precaution.
Comment
Depending on your data source, extracts in Hyper (avoid .tde) can improve your performance by several orders of magnitude. However, other considerations come into play, such as data duplication (and the resources it requires), the management of reload tasks, etc. Rule of thumb: with a "non-analytic / non-MPP / transactional" database or a file, extracts will be great. If you have a dedicated analytic database, try to use it, for architecture, governance and monitoring reasons.
Union: precompute it as much as you can (before Tableau). Relationships are computed only when needed, contrary to joins (which are always computed), so they usually provide better performance (joins can still be needed for functional reasons). Be careful about the field types you use for joins (roughly, a boolean is faster than an integer, which is faster than a string, and so on) and try to use only one field as the join key (e.g., you can precompute a single-field key).
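The single-field key advice can be sketched as a pre-processing step run before the data reaches Tableau. The field names below are illustrative; the point is mapping each distinct combination of join fields to one small integer surrogate key:

```python
def build_surrogate_keys(rows, key_fields, key_name="sk"):
    """Replace a multi-field join key with one small integer key, so that
    Tableau (or the database) joins on a single numeric column."""
    mapping = {}
    for row in rows:
        composite = tuple(row[f] for f in key_fields)
        row[key_name] = mapping.setdefault(composite, len(mapping))
    return mapping

facts = [{"year": 2024, "region": "EU", "amount": 10.0},
         {"year": 2024, "region": "US", "amount": 7.5},
         {"year": 2024, "region": "EU", "amount": 3.0}]
key_map = build_surrogate_keys(facts, ["year", "region"])
# Reuse key_map to stamp the same keys onto the dimension table.
```

Apply the returned mapping to the other side of the join so both tables share the same single-column key.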
If you know the shape of your data, these options can let Tableau optimize the query (for example by removing useless joins). Do not use them if you're not sure: they can lead to wrong results.
General idea: the more joins you have in your model, the more CPU time you spend computing them; the fewer joins you have, the more memory you use (which can also mean more calculation time, depending on your calculations and the database you use). Rule of thumb: a snowflake schema is inefficient. The star schema vs. big fact table question cannot be answered without knowing the exact database and calculations, and even then tests can be required. One idea: on a column-store database with no row-level calculation on dimensions (like a calculation on the product field to build a group), your model will work fine with a single fact table.
Mixing data from several sources is usually to be avoided, since Tableau has to redo the blend every time you act on the view, "downloading" the blending keys each time.
Avoid multiple connections as much as you can, for the same reasons as blending, especially with live queries.
Avoid Custom SQL as much as you can; Tableau will wrap it in sub/nested queries. Alternative: views in the database. If you can't avoid it, write an efficient query without useless clauses (often: ORDER BY).
If you use a reference table to link your data to users/groups of users, don't forget that this table is... data, and the same rules as above apply. In particular, join on numbers, and keep your permission table in the same database/type as the data, especially with live queries (e.g., with a live query, don't blend an Excel permission table with a Hive table; load the permission table into Hive instead).
References
https://help.tableau.com/current/pro/desktop/en-us/datasource_relationships_perfoptions.htm
Items
Field types are OK
Move as much calculation as you can upstream of the Tableau viz (into the database, into the file... with the help of ELT/ETL/data-prep tools). E.g., left(date,4) => create a Year field in the data; Unit Price * Quantity => create an Amount field.
Please note that you can retrieve calculations:
- by using c: in the search bar of your field panel (available everywhere you're editing a viz)
- by downloading your workbook and then using this Python script: https://github.com/scinana/tableauCalculationExport/
- by extracting them with Alteryx, cf. https://www.theinformationlab.co.uk/2016/06/07/extract-calculated-fields-tableau-alteryx/
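The "deport the calculation" examples above can be sketched as a tiny upstream step. Field names are illustrative; the idea is that Year and Amount arrive already materialized instead of being computed row by row in the viz:

```python
from datetime import date

def precompute_fields(rows):
    """Materialize upstream what would otherwise be row-level Tableau
    calculations: left(date,4) becomes a Year column,
    Unit Price * Quantity becomes an Amount column."""
    for row in rows:
        row["year"] = row["order_date"].year
        row["amount"] = row["unit_price"] * row["quantity"]
    return rows

sales = [{"order_date": date(2024, 3, 1), "unit_price": 2.5, "quantity": 4}]
precompute_fields(sales)
```

In practice this lives in your ELT/data-prep job or in a database view, not in a one-off script.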
IFs are avoided as much as possible (try boolean logic, CASE WHEN syntax, etc.); when one cannot be avoided, its condition must be fast to compute.
LODs are among the worst calculations in terms of performance. Avoid them as much as you can; when you have to use one, use only the fields required and only for the rows needed. Alternative: a table with several precomputed levels of granularity.
Even if they seem practical, Tableau "groups" often lead to poor performance, worse than a dedicated calculated field (you can do the same with a "CASE myfield WHEN value THEN my_group" syntax), or better still a dedicated field in the data.
In terms of rows and columns, limit the data to the strict minimum: select columns, filter data... and aggregate it at the right level! These optimizations are mandatory with live queries, but they also matter when you use an extract.
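The "aggregate it at the right level" advice can be sketched as a pre-aggregation step upstream of Tableau (normally a GROUP BY in the database; field names here are illustrative):

```python
from collections import defaultdict

def pre_aggregate(rows, dims, measure):
    """Aggregate to the grain the dashboard actually needs before the data
    reaches Tableau: fewer rows means faster queries, live or extract."""
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)

rows = [{"year": 2024, "region": "EU", "amount": 10.0},
        {"year": 2024, "region": "EU", "amount": 5.0},
        {"year": 2024, "region": "US", "amount": 7.0}]
by_year_region = pre_aggregate(rows, ["year", "region"], "amount")
```

If the dashboard only ever shows year x region, shipping those two rows instead of the raw detail is the whole point.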
Database General
Item
Database version
Statistics
Partitions
Database specific
Indexes
Vertica projections
use of Hive/Impala/…
external/internal?
bucket
size of container
several queue
Comment
Database developers make a lot of performance improvements over time. Try to run the most recent version, or at least one no more than a year old. Usually, the newer, the faster.
There is only a small probability you can choose your database. However, MPP and column-store databases are known to be very efficient for dataviz-like queries (huge number of rows, high aggregation): Vertica, MonetDB, ClickHouse...
Database-side view of field types: same as the data and calculation sections, but with an added focus on length (e.g., varchar(xxx), int/bigint...).
Usually there are several kinds of statistics: table, column, even partition. Statistics store some metrics and help build the query execution plan. Ideally, update the statistics every time the table is updated (of course, when it's a dedicated table, not a transactional one).
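The "update statistics after each reload" step can be illustrated with SQLite, whose ANALYZE statement fills the sqlite_stat1 table used by the query planner (the statement name and behavior differ per database, so check your engine's documentation):

```python
import sqlite3

# Reload a dedicated table, then refresh the optimizer statistics.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (day TEXT, amount REAL)")
con.execute("CREATE INDEX idx_sales_day ON sales (day)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("2024-01-01", 10.0), ("2024-01-02", 20.0)])
con.execute("ANALYZE")  # gathers the stats the query planner relies on
stats = con.execute("SELECT * FROM sqlite_stat1").fetchall()
```

In a real pipeline this ANALYZE (or the engine's equivalent, e.g. UPDATE STATISTICS) runs as the last step of the reload job.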
Partitions must relate to dimension fields that are actually used, not to technical fields (it's not an "ETL" table!). E.g., if you always filter by date, a partition by date can be a good idea.
PK/FK constraints will help optimize the query plan, especially join operations.
Caution: this is an old performance trick; it may no longer be up to date.
Context filter
Tooltips
Design phase
Sample
By default, a quick filter will query all values in the database. You can use Tableau's Order of Operations to reduce that by choosing "all values in context" or "only relevant values" (note that this may cause two queries instead of one, so test it!). Try to limit the number of quick filters, especially on high-cardinality fields. Alternative: action filters.
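Whether a field is a high-cardinality quick-filter candidate is easy to measure before adding the filter. A minimal sketch over rows already pulled from the source (normally you would run a COUNT(DISTINCT ...) in the database instead; field names are illustrative):

```python
def cardinality(rows, field):
    """Count distinct values of a field: a quick filter on a high-cardinality
    field forces Tableau to fetch every one of them, so measure first."""
    return len({row[field] for row in rows})

rows = [{"customer_id": i, "region": "EU" if i % 2 else "US"}
        for i in range(1000)]
```

Here a quick filter on region (2 values) is cheap, while one on customer_id (1000 values) is a candidate for an action filter instead.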
Use context filters when the filter reduces the data by at least 1/10 (rule of thumb), e.g., you always filter by date and the date is also a partition key. Be careful: context filters can have an impact on LODs (especially FIXED), top N, etc. (cf. Tableau's Order of Operations).
Instead of building two tables, can you merge them? Instead of two separate pie charts on two sheets, can you build one pie chart with a column field, so that there is only one sheet? Can you split your dashboard into several dashboards? The idea is to reduce the number of queries per dashboard.
You can design your viz with a subset of the data in order to iterate faster, without being slowed down by performance. However, do not forget to stress-test your work with real volumes.
References
https://help.tableau.com/current/server/en-us/data_acceleration.htm
https://help.tableau.com/current/pro/desktop/en-us/extracting_data.htm
https://help.tableau.com/current/pro/desktop/en-us/datasource_relationships_perfoptions.htm
https://interworks.com/blog/bfair/2015/02/23/tableau-performance-checklist
https://help.tableau.com/current/pro/desktop/en-us/perf_record_create_desktop.htm
https://help.tableau.com/current/pro/desktop/en-us/wbo_streamline.htm
Disclaimer
Change log
The information and principles in this document come mostly from theory, but also from experience. Despite all precautions, we cannot guarantee that the advice given here is always correct or will remain so in the future. Take it with a grain of salt, test, iterate. And do not hesitate to get back to us with your ideas!