Multivariate Forecasting in Tableau With R - R-Bloggers
Multivariate Forecasting in Tableau with R
August 2, 2016
By Bora Beran
(This article was first published on R Bora Beran, and kindly contributed to R-bloggers)
Let's see how we can tackle both use cases with the help of
Autoregressive Integrated Moving Average with eXogenous variables
(ARIMAX) models in R's forecast package.
*In some cases seasonality may be sufficient to capture weekly cycles, but it cannot capture
moving events like Easter, Chinese New Year, Ramadan, Thanksgiving, Labor Day etc.
Our data contains 2 holidays that happened in the past and 1 upcoming
holiday. One can clearly see that the holidays cause noticeable
spikes in demand. For a store owner who doesn't want to miss the next
holiday opportunity by running out of stock early, it is very valuable to
incorporate this information in the demand forecast.
The formula for the forecast shown with the red line (which doesn't
take holidays into account) looks like the following:
SCRIPT_REAL("library(forecast);
append(.arg1[1:99],forecast(ts(.arg1[1:99], frequency=12), h=21)$mean)",
The first 99 points cover the historical data, while the last 21 are
what's being predicted. The data follows a 12-period cycle. The second
line of R code appends the predicted values to the reported values to
generate the full series of 120 points.
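To try the same idea outside Tableau, here is a minimal sketch in plain R under stated assumptions. The demand values are simulated rather than taken from the workbook, and base R's arima() with a hand-picked seasonal order stands in for the automatic model selection that forecast() performs, so the script runs without extra packages. The names demand, fit and full_series are illustrative only.

```r
# Simulated stand-in for the 99 historical demand values
# (assumption: this is NOT the data from the article's workbook).
set.seed(42)
n_hist <- 99   # historical points
h <- 21        # points to predict
t_idx <- seq_len(n_hist)
demand <- 100 + 10 * sin(2 * pi * t_idx / 12) + rnorm(n_hist, sd = 2)

# Declare the 12-period cycle, as the Tableau formula does with frequency=12.
y <- ts(demand, frequency = 12)

# Seasonal ARIMA with a fixed order (assumption); forecast() in the
# article picks a model automatically instead.
fit <- arima(y, order = c(1, 0, 0),
             seasonal = list(order = c(0, 1, 1), period = 12))

# Predict the next 21 periods and append them to the history,
# giving one value per row (99 + 21 = 120).
pred <- predict(fit, n.ahead = h)$pred
full_series <- append(demand, as.numeric(pred))
length(full_series)
```

Appending the predictions to the history mirrors the Tableau formula, which returns the reported values followed by the forecast as a single series.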
The formula for the forecast shown with the green line (which
incorporates the holidays) looks like the following:
SCRIPT_REAL("library(forecast);
data <- data.frame(demand=.arg1,holiday=.arg2);
training <-data[1:99,];
model<-auto.arima(training[,1],xreg=training[,2]);
append(training[,1],forecast(model,xreg=.arg2[100:120])$mean)",
In this script, exogenous regressors are passed to the function using the
xreg argument: the holiday flags for the first 99 points when fitting the
model, and those for points 100 to 120 when forecasting. It can be clearly
seen that both the spike due to the holiday and the regular seasonality in
demand are represented by this model. If there are multiple events with
potentially different effects (positive vs. negative, different
magnitudes etc.), they can be handled using the same method if added as
separate variables.
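The point about separate variables can be sketched as follows. This is an illustration under assumptions, not the article's workbook: the data is simulated, the second event (closure) is hypothetical, and base R's arima() with a fixed order replaces auto.arima() so that no extra package is needed. Each event gets its own 0/1 column in the xreg matrix and therefore its own coefficient.

```r
set.seed(1)
n_hist <- 99
n_total <- 120
t_idx <- seq_len(n_total)

# Two event flags with opposite effects (positions are made up):
# 'holiday' raises demand, the hypothetical 'closure' lowers it.
holiday <- numeric(n_total); holiday[c(30, 70, 110)] <- 1
closure <- numeric(n_total); closure[c(50, 85, 105)] <- 1
events <- cbind(holiday = holiday, closure = closure)

# Simulated demand: 12-period cycle plus event effects plus noise.
demand <- 100 + 10 * sin(2 * pi * t_idx / 12) +
  40 * holiday - 30 * closure + rnorm(n_total, sd = 2)

hist_rows <- 1:n_hist
future_rows <- (n_hist + 1):n_total

# Fit on history only; one regression coefficient per event column.
fit <- arima(ts(demand[hist_rows], frequency = 12),
             order = c(1, 0, 0),
             seasonal = list(order = c(0, 1, 1), period = 12),
             xreg = events[hist_rows, , drop = FALSE])

# Forecast using the known future event flags.
pred <- predict(fit, n.ahead = length(future_rows),
                newxreg = events[future_rows, , drop = FALSE])$pred
full_series <- append(demand[hist_rows], as.numeric(pred))

# Estimated effect of each event.
coef(fit)[c("holiday", "closure")]
```

Because the future flags are supplied through newxreg, the forecast reflects the upcoming holiday and closure even though neither has happened yet in the training window.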
What-If Analysis
Below you can see three time series: Sales and 2 economic indicators.
Assume that the two indicators contain the current best estimates of
their behavior for the next few days. Sales for the same period is
unknown, but it can be estimated using the two economic indicators.
What makes this visualization more interesting is that you can also
adjust the values of the economic indicators and the time frame these
overrides apply to. For example, in the image below, indicator X has been
increased by 15 units (dark blue peak) for the period between April 13th
and April 25th, while indicator Y has been reduced by 20 units (dark
orange dip) for the period between April 20th and May 1st. The impact can
be clearly seen in the dark green portion of the line in the first chart.
By adjusting the parameters in the dashboard, one can perform what-if
analysis and understand the impact of likely future events, best/worst
case scenarios etc.
Let's take a look at the R script that is used to create this view.
SCRIPT_REAL("library(forecast);
training <- ts(data.frame(sales=.arg1,indX=.arg2,indY=.arg3)[1:100,],frequency=7);
model<-auto.arima(training[,1],xreg=training[,-1]);
whatifdata <- as.matrix(data.frame(indX=.arg2,indY=.arg3)[101:120,]);
append(training[,1],forecast(model,xreg=whatifdata)$mean)",
AVG([Sales]),AVG([Economic Indicator X]),AVG([Economic Indicator Y]))
As you can see, the formula is very similar to the earlier examples. In
this demo dataset, the first 100 rows are used for model fitting. The
last 20 rows contain the sales forecast along with its inputs: the
what-if values defined in the Economic Indicator X and Y fields as a
function of the parameter entries.
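The what-if mechanics can be reproduced in plain R as a rough sketch. Everything here is an assumption made for illustration: the series are simulated, the override windows and the apply_override helper are hypothetical stand-ins for the dashboard parameters, and base R's arima() with a fixed order is used in place of auto.arima(). The model is fitted once, then forecast twice, with the baseline indicators and with the overridden ones.

```r
set.seed(7)
n_hist <- 100
n_total <- 120
hist_rows <- 1:n_hist
future_rows <- (n_hist + 1):n_total

# Simulated indicators for all 120 rows; the last 20 play the role of
# "current best estimates" (assumption: not the article's data).
ind_x <- 50 + cumsum(rnorm(n_total))
ind_y <- 30 + cumsum(rnorm(n_total))
# Sales are only known for the first 100 rows.
sales <- 200 + 2 * ind_x[hist_rows] - 1.5 * ind_y[hist_rows] +
  rnorm(n_hist, sd = 3)

# Fit on history with both indicators as exogenous regressors.
fit <- arima(ts(sales, frequency = 7), order = c(1, 0, 0),
             xreg = cbind(indX = ind_x[hist_rows], indY = ind_y[hist_rows]))

# Hypothetical helper: shift an indicator by 'delta' on the given rows.
apply_override <- function(x, rows, delta) {
  x[rows] <- x[rows] + delta
  x
}

baseline <- cbind(indX = ind_x[future_rows], indY = ind_y[future_rows])
whatif <- cbind(indX = apply_override(ind_x, 104:112, 15)[future_rows],
                indY = apply_override(ind_y, 108:116, -20)[future_rows])

base_fc <- predict(fit, n.ahead = 20, newxreg = baseline)$pred
whatif_fc <- predict(fit, n.ahead = 20, newxreg = whatif)$pred

# Difference between the two scenarios, one value per forecast row.
impact <- as.numeric(whatif_fc - base_fc)
full_series <- append(sales, as.numeric(whatif_fc))
```

Rows outside the override windows show no impact, while rows inside them move by the override size times the fitted coefficient of the corresponding indicator. That is the effect the dark green portion of the chart visualizes.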
You can find all of these examples in the Tableau workbook published
HERE.