UniTS


Introduction

When dealing with time-series data we often resort to modelling it in a univariate framework, meaning we build one model for one time-series, without reference to any other variables. It is, of course, understood that in an economic context no time-series evolves without being influenced by other series, and therefore univariate models should not be understood as even remotely realistic representations of the true dynamics.

However, such univariate models have been found to be useful for at least two reasons:

  1. To be used as a forecasting tool (mainly in the short term).
  2. To describe some important dynamic features of the time-series. In particular, the most common tests for stationarity are couched in a univariate time-series modelling framework.

ARMA models

Two basic linear models are commonly used in time-series modelling: autoregressive (AR) and moving average (MA) models. Combinations of the two are called ARMA models. Below is the general form of an ARMA(p,q) model.

[math]y_{t}=\alpha +\phi _{1}y_{t-1}+...+\phi _{p}y_{t-p}+\varepsilon _{t}+\theta_{1}\varepsilon _{t-1}+...+\theta _{q}\varepsilon _{t-q}[/math]

For notational ease we shall collect all parameters in a [math]((p+q+1) \times 1)[/math] vector [math]\beta = (\alpha \ \phi_1 \ ... \ \phi_p \ \theta _{1} \ ... \ \theta _{q})'[/math]. It is straightforward to extend the above model to also include further (weakly exogenous) variables. Such models are often called ARMAX models and the MATLAB code used below allows for this extension in a very straightforward manner. These models should only be applied to time series [math]y_t[/math] that are stationary; the stationarity of a time-series should therefore be established prior to applying this modelling approach.

Parameter estimation

Parameter estimation of ARMA(p,0) models, i.e. AR(p) models, is extremely straightforward as it can be achieved by standard OLS estimation. Say you have your favourite OLS estimation procedure (e.g. OLSest.m from this page). You will then need to define your vector with observations of the dependent variable (dep) and a matrix with the explanatory variables in its columns (exp), which are used as input into your OLS function. Both of these have to be constructed from your one vector of time-series data. Let's say your time-series is saved in the vector data, that this vector has 315 observations and that you want to estimate an AR(3) model:

[math]data_{t}=\alpha +\phi _{1}data_{t-1}+\phi _{2}data_{t-2}+\phi _{3}data_{t-3}+\varepsilon _{t}[/math]

You therefore have four explanatory variables: the constant, [math]data_{t-1}[/math], [math]data_{t-2}[/math] and [math]data_{t-3}[/math], meaning that your matrix exp should have four columns. It is important that the dep and exp variables have an identical number of rows and that they are lined up appropriately: in the row in which we have [math]data_{10}[/math] in the dep variable, we should have a 1 for the constant and [math]data_9[/math], [math]data_8[/math] and [math]data_7[/math] in the exp variable.

This implies that the first row of dep should contain [math]data_4[/math] and the first row of exp should contain a 1, [math]data_3[/math], [math]data_2[/math] and [math]data_1[/math]. Altogether the input variables should look as follows:

[math]dep = \left( \begin{array}{c} data_4 \\ data_5 \\ \vdots \\ data_{314} \\ data_{315} \\ \end{array} \right); \quad exp = \left( \begin{array}{cccc} 1 & data_3 & data_2 & data_1 \\ 1 & data_4 & data_3 & data_2 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & data_{313} & data_{312} & data_{311} \\ 1 & data_{314} & data_{313} & data_{312} \\ \end{array} \right)[/math]

and these are a [math](312 \times 1)[/math] vector and a [math](312 \times 4)[/math] matrix respectively. In MATLAB we achieve this as follows, assuming the variable data is a [math](315 \times 1)[/math] vector with all observations:

dep = data(4:end);                                                  % y_t for t = 4,...,315
exp = [ones(size(dep)) data(3:end-1) data(2:end-2) data(1:end-3)];  % constant and lags 1 to 3

All that is left to do is to feed these variables into the OLS estimation function; the resulting [math](4 \times 1)[/math] estimated parameter vector contains the OLS estimates of [math](\alpha \ \phi_1 \ \phi_2 \ \phi_3)'[/math].
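As a minimal sketch, assuming that your OLS function follows the pattern of OLSest.m with the coefficient vector as its first output (the exact signature may differ):

b = OLSest(dep,exp);   % hypothetical call - check your function's actual signature
% alternatively, MATLAB's backslash operator delivers the same OLS estimates:
b = exp\dep;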

Estimation of MA(q), and by implication ARMA(p,q), models where [math]q\gt 0[/math] is less straightforward, as the [math]\varepsilon _{t}[/math]s are unobserved. This has the consequence that there is no neat analytical formula (unlike for OLS estimation) which can be used to obtain the optimal parameters. What is required is a recursive calculation of all error terms (this calculation will, of course, depend on the parameter vector [math]\beta[/math] used). As there is no analytical solution we have to resort to a nonlinear (or trial and error) optimisation strategy (see NonlinOptTheory for details).
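To illustrate the recursive error calculation, here is a minimal sketch for an ARMA(1,1) model; the function name arma11_ssr and the zero initialisation of the pre-sample residual are our own (hypothetical) choices:

function ssr = arma11_ssr(beta,y)
% beta = [alpha; phi1; theta1]; y = (n x 1) data vector
n   = length(y);
eps = zeros(n,1);                % residuals; eps(1) is initialised at zero
for t = 2:n                      % recursively back out the error terms
    eps(t) = y(t) - beta(1) - beta(2)*y(t-1) - beta(3)*eps(t-1);
end
ssr = eps(2:n)'*eps(2:n);        % sum of squared residuals

A call like betahat = fminsearch(@(b) arma11_ssr(b,y),zeros(3,1)) would then deliver least-squares parameter estimates via MATLAB's built-in simplex optimiser.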

MATLAB implementation

To estimate AR, MA or ARMA models we recommend the use of pre-written functions, in particular the functions provided through the MFE toolbox [1]. The function written to estimate any type of ARMAX(p,q) model is parameters = armaxfilter(data,1,P,Q). For all the details and possible outputs that can be provided by this function (e.g. log-likelihood, estimated residuals, AIC etc.) see doc armaxfilter. The first input is the time-series vector, the second input is equal to "1" if you want to include a constant ("0" otherwise), the third input, P, is a vector containing all the AR lags you want to include (see example below), and the last input, Q, is a vector that contains all the MA lags you want to include.

Assume you have a time-series saved in y. In order to estimate an AR(3) model (as above), we would call

pars = armaxfilter(y,1,(1:3),[])

This code will return the estimated constant (in pars(1)) and the three AR coefficients (in pars(2:end)). You can see that the last input was an empty vector ([ ]), indicating that we did not want to include any MA terms (an AR(3) is equivalent to an ARMA(3,0)). If you wanted to add MA terms then you would indicate which terms you wanted in that last input. For example, to estimate an ARMA(3,2) model the relevant function call would be

pars = armaxfilter(y,1,(1:3),(1:2))
const = pars(1);    % extract constant
arpar = pars(2:4);  % extract AR parameters
mapar = pars(5:6);  % extract MA parameters

The results vector pars contains the estimated parameters in the following order: constant, AR parameters, MA parameters. Please consult the help for the armaxfilter function to see how to include exogenous variables (the X in ARMAX) in the estimation.

Model identification

A practical problem when using univariate time-series models is to identify the correct order of the ARMA(p,q) model, i.e. the values for [math]p[/math] and [math]q[/math] to be used. The most common approach is to use information criteria to identify the best model. The idea behind information criteria is the following. The more explanatory variables we use (larger [math]p[/math] and [math]q[/math]), the better we will fit the data. However, it has been shown that this does not necessarily translate into better forecast performance, and hence we want to trade off this improved in-sample fit against the cost of including more explanatory variables to achieve it.

This trade-off is formalised in the form of information criteria. The in-sample fit is measured by [math]s_{\widehat{\varepsilon }}^{2}[/math], the estimated residual variance of the model, and the number of parameters, which in the case of [math]ARMA(p,q)[/math] models is [math]k=p+q+1[/math], assuming that a constant is included. The sample size used in the estimation is [math]n[/math]. The Akaike Information Criterion (AIC) is:

[math]AIC\left( k\right) =\log \left( s_{\widehat{\varepsilon }}^{2}\right) +2k/n[/math]

and the Schwarz (Bayes) Information Criterion (SIC) is:

[math]SIC\left( k\right) =\log \left( s_{\widehat{\varepsilon }}^{2}\right) +k\log \left( n\right) /n[/math]

The idea is that you calculate these statistics for different models and then choose the model that delivers the smallest value. There are different arguments for why you should use one or the other information criterion (and more than just these two are available), but in practical terms it is useful to know that the AIC criterion will in general tend to favour models with more explanatory variables compared to the SIC.
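As a minimal sketch of these two formulas in MATLAB, assuming ehat contains the [math]n[/math] estimated residuals of a model with [math]k[/math] parameters:

n   = length(ehat);           % sample size
s2  = (ehat'*ehat)/n;         % estimated residual variance
AIC = log(s2) + 2*k/n;
SIC = log(s2) + k*log(n)/n;

In practice the loop shown below retrieves these statistics directly from armaxfilter.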

One last note on the use of these information criteria is required. When you compare time-series models, different lag lengths also imply different effective sample sizes, as longer lags use up more initial observations. When you estimate time-series models with different lag lengths you should therefore ensure that they are all estimated on the same effective sample size.

MATLAB implementation

We just stated that we want to find the ARMA(p,q) model that minimises the information criterion of our choice. When doing so we need to decide what the maximum values for [math]p[/math] and [math]q[/math] should be. Let's say it was maxlag = 4. Then we would have to estimate 25 possible models, as [math]p[/math] and [math]q[/math] can each take values from 0 to 4. The way to implement this is with a loop. Here is the general structure:

%% Identify the best lag orders p and q using SIC and AIC
%   ARMA(p,q) models with p,q = 0,1,...,maxlag

maxlag   = 4;                             % we search over lag orders p,q = 0,1,...,4
save_AIC = zeros(maxlag+1,maxlag+1);      % save AIC for all combinations of p and q
save_SIC = zeros(maxlag+1,maxlag+1);      % save SIC for all combinations of p and q

% Estimate each ARMA(p,q) model and save its AIC and SIC
for i = 0:maxlag
    for j = 0:maxlag

        % define required lags
        P = (1:i);      % if i==0 this creates an empty vector, as required
        Q = (1:j);

        % we need the 5th output of armaxfilter as that is a structure
        % which contains the AIC and SIC
        [b, LL, errors, SEregression, di] = armaxfilter(y,1,P,Q,[],[],[],maxlag);

        save_AIC(i+1,j+1) = di.AIC;
        save_SIC(i+1,j+1) = di.SBIC;

    end
end

This code starts by defining the maximum lag, followed by defining storage matrices into which we will save the calculated AIC and SIC criteria. You can then see two nested loops. Inside the loops we first define the lags in the way in which they are needed by the armaxfilter function. Then we call the armaxfilter function, which does all the hard work. There are two noteworthy differences to the way we called the armaxfilter function previously. First, we ask it to return five outputs: [b, LL, errors, SEregression, di]. This is required because the fifth output, di, is a structure which contains the SIC and AIC statistics (in di.SBIC and di.AIC).

Second, we also deliver more inputs, namely eight: (y,1,P,Q,[],[],[],maxlag). We are familiar with the first four, and inputs 5, 6 and 7 are left empty ([ ]). You can consult help armaxfilter to figure out exactly what the (unused) inputs 5 to 7 are for. But the eighth input is used, and we deliver maxlag to the function. This input ensures that all the models in these loops are evaluated on the same set of observations. Consider two different models, ARMA(2,1) and ARMA(4,4). For the former we lose two observations, but for the latter we lose four. That means that, if we just estimated these models, the ARMA(2,1) model would be evaluated on two more observations than the ARMA(4,4) model. But this also means that when we compare information criteria for these two models, their respective [math]s_{\widehat{\varepsilon }}^{2}[/math] are based on slightly different sets of observations and therefore we cannot really compare them anymore. The eighth input (maxlag) simply excludes the first maxlag observations from the estimation. As long as none of the AR or MA lags is larger than maxlag, this ensures that all models are evaluated on the same range of observations.
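Once the loops have filled the two matrices, the lag orders that minimise, say, the SIC can be recovered as follows (recall that row i+1 and column j+1 correspond to p = i and q = j):

[minval,idx] = min(save_SIC(:));              % smallest SIC across all models
[row,col]    = ind2sub(size(save_SIC),idx);   % translate into row/column indices
p_best       = row - 1;                       % rows correspond to p = 0,...,maxlag
q_best       = col - 1;                       % columns correspond to q = 0,...,maxlag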

Forecasting

Once model parameters are estimated (using armaxfilter as discussed above) we can use the estimated parameter vector to forecast the series. Let’s assume that we estimated an ARMA(2,1) model (with a constant),

[math]y_{t}= \alpha +\phi _{1}y_{t-1}+\phi _{2}y_{t-2}+\varepsilon _{t}+\theta_{1}\varepsilon _{t-1}[/math]

Let's assume that we used [math]T[/math] observations of [math]y_{t}[/math] to obtain the parameter estimates. We can then use the available information to obtain [math]E[y_{T+1}|I_T][/math], where [math]I_T[/math] contains all information available at time [math]T[/math]. The one-step ahead forecast can then be obtained from the following:

[math]E[y_{T+1}|I_T]= E[\widehat{\alpha} +\widehat{\phi}_{1}y_{T}+\widehat{\phi}_{2}y_{T-1}+\varepsilon_{T+1}+\widehat{\theta}_{1}\widehat{\varepsilon}_{T}|I_T]\\ = \widehat{\alpha} +\widehat{\phi}_{1}y_{T}+\widehat{\phi}_{2}y_{T-1}+\widehat{\theta}_{1}\widehat{\varepsilon}_{T}[/math]

courtesy of the assumption [math]E[\varepsilon _{T+1}|I_T]=0[/math] and the fact that all other terms are known at time [math]T[/math]. The term [math]\widehat{\varepsilon}_{T}[/math] is the estimated model residual at time [math]T[/math].
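In MATLAB terms, and assuming that pars and the residual vector errors were obtained from an armaxfilter call for this ARMA(2,1) model (so that errors(end) is the residual for time [math]T[/math]), the one-step ahead forecast is simply:

yhat1 = pars(1) + pars(2)*y(end) + pars(3)*y(end-1) + pars(4)*errors(end);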

If we wanted a two-step ahead forecast we would apply the same approach and calculate

[math]E[y_{T+2}|I_T]= E[\widehat{\alpha} +\widehat{\phi}_{1}y_{T+1}+\widehat{\phi}_{2}y_{T}+\varepsilon_{T+2}+\widehat{\theta}_{1}\varepsilon_{T+1}|I_T]\\ = \widehat{\alpha} +\widehat{\phi}_{1}E[y_{T+1}|I_T]+\widehat{\phi}_{2}y_{T}[/math]

as [math]E[\varepsilon _{T+2}|I_T]=E[\varepsilon _{T+1}|I_T]=0[/math] by assumption. For [math]E[y_{T+1}|I_T][/math] we can substitute the one-step ahead forecast obtained in the previous step. This makes it obvious that multiple-step ahead forecasting in this context has to be done in a recursive manner.

We can also recognise that forecasting the MA part of a univariate time-series process becomes trivial once we go beyond [math]q[/math] (the MA order) steps ahead, as all the error terms entering the forecast then have a conditional expectation of zero.
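For the ARMA(2,1) example this is immediate at three steps ahead, a horizon beyond [math]q=1[/math], where the MA term has dropped out entirely:

[math]E[y_{T+3}|I_T]= \widehat{\alpha} +\widehat{\phi}_{1}E[y_{T+2}|I_T]+\widehat{\phi}_{2}E[y_{T+1}|I_T][/math]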

MATLAB implementation

For the MATLAB implementation of this procedure assume that the vector pars contains the following parameter estimates

[math]\text{pars} = \left(\begin{array}{c} \widehat{\alpha} \\ \widehat{\phi _{1}} \\ \widehat{\phi _{2}} \\ \widehat{\theta_{1}} \end{array} \right)[/math]

Further assume that we used the [math](n\times1)[/math] vector data to obtain these parameter estimates and that we now want to forecast the series [math]k[/math] steps ahead, i.e. up to time [math]n+k[/math].

We shall write a function that requires the following seven inputs:

function yf = armaforc(y,k,p,q,al,ph,th)
% Function produces forecasts from an ARMA(p,q) process
% input:    y, data series from which to forecast
%               assumes that the last obs in the vector is the
%               most current observation
%           k, number of steps to forecast ahead
%           p, AR order
%           q, MA order
%           al, constant parameter
%           ph, AR parameters [ph(1); ph(2); ...; ph(p)]
%           th, MA parameters [th(1); th(2); ...; th(q)]
% output:   yf, (n+k)x1 vector where the last k obs are the forecasts

The output will be a [math]((n+k)\times1)[/math] vector that contains the original [math]n[/math] observations followed by the [math]k[/math] forecasts.

Inside the function we first need to calculate the estimated in-sample residuals, [math]\widehat{\varepsilon}_{t}[/math] for [math]t=1,...,n[/math], as we need them for the MA part of the forecasts. Once that is done we merely need to walk through the recursive forecasting approach described above.

% This first part is administrative and facilitates some of the following calculations
n       = size(y,1);        % length of series
maxlag  = max([p;q]);       % find max lag
ph      = ph(:);            % ensure (possibly empty) column vectors so that the
th      = th(:);            % inner products below also work when p or q is 0
um      = al/(1-sum(ph));   % unconditional mean (for reference)
fph     = flipud(ph);       % reversed AR parameter vector
fth     = flipud(th);       % reversed MA parameter vector
yf      = zeros(n+k,1);     % in here we save the forecasts
yf(1:n) = y;                % save actual observations in yf
eps     = zeros(n+k,1);     % in here we save the estimated residuals

The only part here that is perhaps not obvious is the fact that we reverse the order of the AR and MA parameters such that the parameter referring to the longest lag is on top of the vectors fph and fth. The reason for this will become apparent in the recursions.

% we produce the estimated residuals, assuming that the pre-sample residuals are zero
% only needed if q > 0
if q > 0
    for i = maxlag+1:n      % start after the longest lag so that all indices are valid
        eps(i)  =   y(i) - al - y(i-p:i-1)'*fph - eps(i-q:i-1)'*fth;
    end
end

What follows are the recursions. We produce different recursions depending on whether AR and/or MA parts are present. That makes reading the recursions a little easier, but you could also just write one recursion.

if p>0 && q==0      % AR model

    for i = n+1:n+k
        yf(i) = al + yf(i-p:i-1)'*fph;
    end

elseif p>0 && q>0   % ARMA model

    for i = n+1:n+k
        yf(i) = al + yf(i-p:i-1)'*fph + eps(i-q:i-1)'*fth;
    end

elseif p==0 && q>0  % MA model

    for i = n+1:n+k
        yf(i) = al + eps(i-q:i-1)'*fth;
    end

else                % White noise model

    for i = n+1:n+k
        yf(i) = al;
    end
end

Whatever recursion was used, we now have a [math]((n+k)\times1)[/math] vector yf that contains our [math]k[/math] forecasts as the last [math]k[/math] elements in yf.

In our case we would call this function as datafor = armaforc(data,3,2,1,pars(1),pars(2:3),pars(4)) if we wanted a three-step ahead forecast from the ARMA(2,1) model, with the results stored in datafor.

Footnotes

  1. This means that you need to have this toolbox downloaded and made available on your MATLAB path - use FILE - Set Path. Alternatively, use the technique demonstrated in this video clip.