$\begingroup$

I'm trying to understand a paper about electric load forecasting, but I'm struggling with the concepts in it, especially the SARIMAX model. This model is used to predict the load and relies on many statistical concepts that I do not understand (I'm an undergrad computer science student; you can consider me a layperson in statistics). I don't need to understand completely how it works, but I'd like to at least understand intuitively what is happening.

I've been trying to split SARIMAX into smaller pieces and trying to understand each of these pieces separately and then putting them together. Can you guys help me? Here's what I have so far.

I started with AR and MA.

AR: Autoregressive. I have learned what a regression is, and from my understanding, it simply answers the question: given a set of values / points, how can I find a model that explains these values? So we have, for example, the linear regression, which tries to find a line that can explain all these points. An autoregression is a regression that tries to explain the values using their previous values.

MA: Moving Average. I'm actually quite lost here. I know what a moving average is, but the moving average model doesn't seem to have anything to do with the "normal" moving average. The formula for the model looks suspiciously similar to AR, and I can't make sense of any of the explanations I find on the internet. What is the purpose of MA? What's the difference between MA and AR?

So now we have ARMA. The I then comes from Integrated, which, as far as I understand, simply allows the ARMA model to have a trend, either increasing or decreasing. (Is this equivalent to saying ARIMA allows the series to be non-stationary?)

Now comes the S, from seasonal, which adds periodicity to ARIMA. In the case of load forecasting, for example, it basically says that the load looks very similar every day at 6 PM.

Finally the X, from exogenous variables, which basically allows external variables to be considered in the model, such as weather forecasts.

So we finally have SARIMAX! Are my explanations OK? Note that these explanations don't need to be rigorously correct. Can someone explain to me what MA does intuitively?

$\endgroup$

1 Answer

$\begingroup$

As you noted, (1) an AR model relates the value of an observation $x$ at time $t$ to its previous values, with some error. For an AR(1): $$ x_t = \phi x_{t-1} + \varepsilon_t $$ Let's substitute in $ x_{t-1} $, and then $ x_{t-2} $: $$\begin{aligned} x_t &= \phi (\phi x_{t-2} + \varepsilon_{t-1}) + \varepsilon_t \\ &= \phi^2x_{t-2} + \phi\varepsilon_{t-1} + \varepsilon_t \\ &= \phi^3x_{t-3} + \phi^2\varepsilon_{t-2} + \phi\varepsilon_{t-1} + \varepsilon_t \end{aligned} $$ Continuing for $n$ steps: $$ x_t = \phi^nx_{t-n} + \phi^{n-1}\varepsilon_{t-n+1} + \dots + \phi\varepsilon_{t-1}+ \varepsilon_t $$ If $|\phi| < 1$ (the stationarity condition), the leading term $\phi^n x_{t-n}$ vanishes as $n \to \infty$, leaving only an infinite weighted sum of past shocks. In the same way, you can write any stationary AR($p$) as an MA($\infty$), though with $p>1$ the algebra gets messy because the terms pile up on top of one another.
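
If it helps to see the substitution numerically, here is a minimal sketch in plain Python. The coefficient `phi = 0.7`, the series length, and the starting convention $x_0 = \varepsilon_0$ are arbitrary choices made for illustration, not anything taken from the paper. It builds an AR(1) series recursively and then rebuilds the last value purely from the shocks.

```python
import random

random.seed(0)
phi = 0.7        # illustrative AR(1) coefficient; |phi| < 1 keeps the series stationary
n_steps = 200    # illustrative series length

# Draw the shocks, then build the AR(1) series recursively, starting from x_0 = eps_0.
eps = [random.gauss(0.0, 1.0) for _ in range(n_steps)]
x = [eps[0]]
for t in range(1, n_steps):
    x.append(phi * x[t - 1] + eps[t])

def ma_representation(t):
    """Rebuild x_t from shocks alone: sum over k of phi^k * eps_{t-k}."""
    return sum(phi ** k * eps[t - k] for k in range(t + 1))

t = n_steps - 1
recursive_value = x[t]
unrolled_value = ma_representation(t)
print(recursive_value, unrolled_value)  # the two agree up to floating-point rounding
```

The weight on a shock that happened $k$ periods ago is $\phi^k$, so old shocks never quite disappear; they just fade geometrically.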

Having seen that, let's rephrase our definition (1) now. An AR process relates the value of an observation $x$ at time $t$ to an infinite sequence of decaying error shocks $\varepsilon$ from prior time periods (that we don't directly observe).

So what an MA process is might be clearer now. (2) An MA($q$) process relates the value of an observation $x$ at time $t$ to just the $q$ most recent error shocks from prior periods (which we don't directly observe), and the coefficients on those shocks are free to take any values rather than being tied to the exponential decay implicit in an AR model. As you note, it has nothing to do with the usual "moving average" concept.
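
One way to picture the difference is to follow a single unit shock through each model. The sketch below is only an illustration: the helper names and the coefficient values (`phi=0.5`, `thetas=[-0.4, 0.9]`) are made up, and the MA model is written with the convention $x_t = \varepsilon_t + \theta_1\varepsilon_{t-1} + \dots + \theta_q\varepsilon_{t-q}$.

```python
def ar1_impulse_response(phi, horizon):
    """Effect of one unit shock at time 0 on x_0, x_1, ... for an AR(1)."""
    response = []
    x_prev = 0.0
    for t in range(horizon):
        shock = 1.0 if t == 0 else 0.0
        x_prev = phi * x_prev + shock
        response.append(x_prev)
    return response

def ma_impulse_response(thetas, horizon):
    """Effect of one unit shock at time 0 on x_0, x_1, ... for an MA(q),
    using x_t = eps_t + theta_1 * eps_{t-1} + ... + theta_q * eps_{t-q}."""
    weights = [1.0] + list(thetas)
    shocks = [1.0] + [0.0] * (horizon - 1)
    response = []
    for t in range(horizon):
        value = sum(w * shocks[t - j] for j, w in enumerate(weights) if t - j >= 0)
        response.append(value)
    return response

ar_response = ar1_impulse_response(phi=0.5, horizon=6)
ma_response = ma_impulse_response(thetas=[-0.4, 0.9], horizon=6)
print(ar_response)  # [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
print(ma_response)  # [1.0, -0.4, 0.9, 0.0, 0.0, 0.0]
```

In the AR(1) the shock fades geometrically and never fully vanishes. In the MA(2) it is felt for exactly two more periods, with whatever weights you choose, and is then gone completely.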

With some conditions on the coefficients $\theta_1...\theta_q$ of an MA($q$) process, we can actually do something very similar to what I showed for an AR process above, that is, write the MA($q$) as an AR($\infty$). So it's just as valid to restate (2) to say an MA process relates the value of an observation $x$ at time $t$ to a decaying sequence of all prior values of $x$.
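
As a sketch of that inversion for the simplest case, take an MA(1) written as $x_t = \varepsilon_t + \theta\varepsilon_{t-1}$ (the sign convention here is my assumption; some texts use a minus). Solve for the shock and substitute repeatedly: $$\begin{aligned} \varepsilon_t &= x_t - \theta\varepsilon_{t-1} \\ &= x_t - \theta(x_{t-1} - \theta\varepsilon_{t-2}) \\ &= x_t - \theta x_{t-1} + \theta^2 x_{t-2} - \theta^3\varepsilon_{t-3} \end{aligned} $$ If $|\theta| < 1$, the leftover shock term vanishes as the substitution continues, which gives $$ x_t = \theta x_{t-1} - \theta^2 x_{t-2} + \theta^3 x_{t-3} - \dots + \varepsilon_t $$ That is an AR($\infty$) whose coefficients decay geometrically, and $|\theta| < 1$ is the condition on the coefficient for this MA(1) case.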

So an ARMA model just combines those two ideas, relating $x_t$ to both an infinite decaying sequence of shocks (the AR part) and a finite, freely weighted sequence of $q$ shocks (the MA part). ARIMA just adds differencing to the mix: you run ARMA on $x_t - x_{t-1}$ (or on further differences, if needed) to remove the trend, as you noted.
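
To make the differencing step concrete, here is a small sketch with made-up numbers: a straight-line trend of slope 2 plus a wiggle that repeats every 4 steps. The helper `difference` is a hypothetical name for illustration, not a library function.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both terms exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# A deterministic upward trend plus a small repeating wiggle (made-up numbers).
wiggle = [0, 1, 0, -1]
trended = [2 * t + wiggle[t % 4] for t in range(12)]

differenced = difference(trended)
seasonally_differenced = difference(trended, lag=4)

print(trended)                 # [0, 3, 4, 5, 8, 11, 12, 13, 16, 19, 20, 21]
print(differenced)             # [3, 1, 1, 3, 3, 1, 1, 3, 3, 1, 1]
print(seasonally_differenced)  # [8, 8, 8, 8, 8, 8, 8, 8]
```

After first differencing, the values no longer climb; they hover around the slope. Differencing at a lag equal to the season length is the seasonal analogue, and here it removes the repeating pattern as well.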

$\endgroup$
  • $\begingroup$ Hi Affine, thanks for the fast reply! Can I say MA is like an AR for the error? $\endgroup$
    – Clash
    Commented Jun 19, 2013 at 6:53
  • $\begingroup$ Sort of. The key idea is that AR can be transformed into a decaying MA of infinite length, and vice versa. So any intuitive meaning you assign to one - AR = relate current observation to $p$ previous observations - can be assigned to the other - MA = relate current observation to all previous observations. Or as I approached it originally in my answer - AR = relate current observation to all previous error "shocks", MA = relate current observation to $q$ previous error "shocks". $\endgroup$
    – Affine
    Commented Jun 20, 2013 at 1:49
  • $\begingroup$ Shouldn't the subscript on epsilon be $t-n-1$? $\endgroup$
    – Y. Z.
    Commented Aug 10, 2020 at 3:30
