Multivariate Forecasting in Tableau With R - R-Bloggers


Multivariate Forecasting in Tableau with R
August 2, 2016
By Bora Beran


(This article was first published on R Bora Beran, and kindly contributed to R-bloggers)


Since version 8.0, it has been very easy to generate forecasts in
Tableau using exponential smoothing. But in some cases you may want to
enrich your forecasts with external variables. For example, you may have
the government's forecast for population growth, your own hiring plans,
upcoming holidays*, or planned marketing activities, all of which could
have varying levels of impact on your forecasts. Or you may just want to
simulate different scenarios and ask what-if questions, e.g. what would
my sales look like if I hired 10 more sales representatives? If interest
rates went up by X percent, how would that impact my profits?

Let's see how we can tackle both use cases with the help of
Autoregressive Integrated Moving Average with eXogenous variables
(ARIMAX) models in R's forecast package.

*In some cases seasonality may be sufficient to capture weekly cycles, but not
moving events like Easter, Chinese New Year, Ramadan, Thanksgiving, Labor Day etc.

Handling special events

In the image below, the observed/historical demand is shown in blue.
At the bottom you can see the holidays as green bars with a height of 1
and non-holidays as bars with a height of 0. This representation is
often referred to as dummy encoding.
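
As a minimal sketch of dummy encoding in R (the series length of 120 matches the scripts in this post, but the holiday positions below are made up for illustration and are not taken from the workbook):

```r
# Hypothetical setup: 120 periods, of which the first 99 are history.
n_periods <- 120

# Assumed positions: two past holidays and one upcoming holiday.
holiday_positions <- c(30, 78, 110)

# Dummy encoding: 1 on a holiday, 0 on every other period.
holiday <- integer(n_periods)
holiday[holiday_positions] <- 1L

# Count of non-holiday (0) and holiday (1) periods.
table(holiday)
```

A vector like this is what would be passed to R as the second argument (.arg2) in the holiday-aware script later in this section.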

Our data contains 2 holidays that happened in the past and 1 upcoming
holiday. One can clearly see that the holidays cause noticeable spikes
in demand. For a store owner who doesn't want to miss the next holiday
opportunity by running out of stock early, it is very valuable to
incorporate this important piece of information in the demand forecast.

The formula for the forecast shown with the red line (which doesn't
take holidays into account) looks like the following:

SCRIPT_REAL("library(forecast);
append(.arg1[1:99],forecast(ts(.arg1[1:99], frequency=12), h=21)$mean)",

The first 99 points cover the historical data while the last 21 are
what's being predicted. The data follows a 12-period cycle. The second
line of R code appends the predicted values to the reported values to
generate the full series.
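
To try the same steps outside Tableau, a rough standalone sketch follows. The demand values are simulated here as a stand-in for .arg1 (the measure Tableau passes to R), so the numbers are not those of the workbook.

```r
library(forecast)

set.seed(42)
# Simulated stand-in for .arg1: 120 values with a 12-period cycle.
# Only the first 99 are treated as observed history.
demand <- 100 + 10 * sin(2 * pi * (1:120) / 12) + rnorm(120, sd = 2)

# Declare the 12-period cycle and forecast 21 steps ahead.
history <- ts(demand[1:99], frequency = 12)
fc <- forecast(history, h = 21)

# Same shape as the Tableau result: 99 observed values followed by
# 21 predicted values, one per row of the view.
full_series <- append(demand[1:99], fc$mean)
length(full_series)
```

Tableau expects the script to return one value per row it sent, which is why the predictions are appended to the history rather than returned on their own.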
The formula for the forecast shown with the green line (which
incorporates the holidays) looks like the following:

SCRIPT_REAL("library(forecast);
data <- data.frame(demand=.arg1,holiday=.arg2);
training <-data[1:99,];
model<-auto.arima(training[,1],xreg=training[,2]);
append(training[,1],forecast(model,xreg=.arg2[100:120])$mean)",

In this script, exogenous regressors are passed to the function using the
xreg argument. It can be clearly seen that both the spike due to holiday
and the regular seasonality in demand are represented by this model. If
there are multiple events with potentially different effects (positive vs.
negative, different magnitudes etc.), they can be handled using the
same method if added as separate variables.
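
A sketch of the multiple-event case, assuming two hypothetical events with opposite effects (a holiday that lifts demand and a store closure that lowers it). The data, event positions and column names are invented for illustration. The regressors are passed as a numeric matrix with one column per event:

```r
library(forecast)

set.seed(1)
n <- 120

# Hypothetical dummy-encoded events (positions are made up).
holiday <- as.numeric(seq_len(n) %in% c(30, 78, 110))
closure <- as.numeric(seq_len(n) %in% c(50, 105))

# Simulated demand: 12-period cycle, +40 on holidays, -25 on closures.
demand <- 100 + 10 * sin(2 * pi * seq_len(n) / 12) +
  40 * holiday - 25 * closure + rnorm(n, sd = 2)

# One column per event.
xreg_all <- cbind(holiday = holiday, closure = closure)

# Fit on the first 99 periods, forecast the remaining 21 using the
# known future values of both event indicators.
train_y <- ts(demand[1:99], frequency = 12)
model <- auto.arima(train_y, xreg = xreg_all[1:99, , drop = FALSE])
fc <- forecast(model, xreg = xreg_all[100:120, , drop = FALSE])

# Estimated effect of each event on demand.
coef(model)[c("holiday", "closure")]
```

Each event gets its own coefficient, so a positive and a negative effect can be estimated side by side.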

What-If Analysis

Now that we have covered handling events as additional regressors, let's
talk about how we can apply the same methodology to do what-if analysis.

Below you can see three time series: Sales and two economic indicators.
Assume that the two indicators contain the current best estimates of
behavior for the next few days, but Sales for the same period is unknown
and can be estimated using the two economic indicators.

What makes this visualization more interesting is that you can also
adjust the value of the economic indicators and the time frame to which
these overrides apply. For example, in the image below, indicator X has
been increased by 15 units (dark blue peak) for the period between April
13th and April 25th, while indicator Y has been reduced by 20 units (dark
orange dip) for the period between April 20th and May 1st. The impact can
be clearly seen in the dark green portion of the line in the first chart.
By adjusting the parameters in the dashboard, one can perform what-if
analysis and understand the impact of likely future events, best/worst
case scenarios etc.

What if analysis with time series forecasting

Let's take a look at the R script that is used to create this view.

SCRIPT_REAL("library(forecast);
training <- ts(data.frame(sales=.arg1,indX=.arg2,indY=.arg3)[1:100,],frequency=7);
model<-auto.arima(training[,1],xreg=training[,-1]);
whatifdata <- data.frame(indX=.arg2,indY=.arg3)[101:120,];
append(training[,1],forecast(model,xreg=whatifdata)$mean)",
AVG([Sales]),AVG([Economic Indicator X]),AVG([Economic Indicator Y]))

As you can see, the formula is very similar to the earlier examples. In
this demo dataset, the first 100 rows are used for model fitting, while
the last 20 hold the sales forecast together with its inputs: the
what-if values of the Economic Indicator X and Y fields, which are
computed from the parameter entries.
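
The what-if mechanics can be sketched outside Tableau as follows. The data is simulated, and the override (indicator X raised by 15 units for future periods 5 to 15) is a hypothetical stand-in for the dashboard parameters. The future regressors are passed as a numeric matrix, on the assumption that recent versions of the forecast package may reject a data frame for xreg.

```r
library(forecast)

set.seed(7)
n <- 120

# Simulated stand-ins for the three measures passed from Tableau.
ind_x <- 50 + cumsum(rnorm(n, sd = 1))
ind_y <- 80 + cumsum(rnorm(n, sd = 1))
sales <- 200 + 2 * ind_x - 1.5 * ind_y +
  5 * sin(2 * pi * seq_len(n) / 7) + rnorm(n, sd = 2)

# First 100 rows are used for fitting, with a 7-period cycle.
train <- ts(cbind(sales = sales, indX = ind_x, indY = ind_y)[1:100, ],
            frequency = 7)
model <- auto.arima(train[, "sales"], xreg = train[, c("indX", "indY")])

# Baseline future inputs, and a copy with a hypothetical override.
future <- cbind(indX = ind_x, indY = ind_y)[101:120, ]
whatif <- future
whatif[5:15, "indX"] <- whatif[5:15, "indX"] + 15

base_fc <- forecast(model, xreg = future)$mean
whatif_fc <- forecast(model, xreg = whatif)$mean

# Change in the point forecast caused by the override.
effect <- as.numeric(whatif_fc - base_fc)
round(effect, 2)
```

In this kind of model the regressors enter the point forecast linearly, so the override shifts the forecast by the indicator's coefficient times 15 in the overridden periods and leaves the other periods unchanged.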

You can find all of these examples in the Tableau workbook published
HERE.

In the sample workbook, I also provided a sheet that compares the
ARIMAX result to multiple linear regression to give a better sense of
what you're getting out of applying this particular method.
