RobInf


Introduction

This section covers the calculation of heteroskedasticity and autocorrelation robust standard errors in the context of OLS regressions. The result will be an alternative to our OLSest function, which we shall call OLShac. Like the OLSest function it will report (unchanged) parameter estimates, but it will also report standard errors for these parameter estimates that are (asymptotically) valid even if the Gauss-Markov (GM) assumptions are breached because of heteroskedasticity and/or autocorrelation.

Theory

The main issue addressed here is that the distribution of the OLS parameter estimates depends on whether the assumptions of homoskedasticity and absence of autocorrelation are met. Consider the basic regression model with the [math](n \times k)[/math] matrix of explanatory variables, [math]\mathbf{X}[/math]:

[math]\mathbf{y}=\mathbf{X\beta }+\mathbf{u}.[/math]

If all GM assumptions are met then the distribution of the OLS parameter estimate [math]\widehat{\mathbf{\beta }}\mathbf{=}\left( \mathbf{X}^{\prime }\mathbf{X}\right) ^{-1}\mathbf{X}^{\prime }\mathbf{y}[/math] is

[math]\begin{aligned} \widehat{\mathbf{\beta }} &\sim& N(\mathbf{\beta},Var(\widehat{\mathbf{\beta}}))\\ Var(\widehat{\mathbf{\beta }}) &=& \sigma^2 \left( \mathbf{X}^{\prime }\mathbf{X}\right) ^{-1}\end{aligned}[/math]

where [math]\sigma^2[/math] is the error variance. If, however, either the homoskedasticity or the no-autocorrelation assumption is breached then the correct variance formula is

[math]Var\left( \widehat{\mathbf{\beta }}\right) =\left( \mathbf{X}^{\prime }\mathbf{X}\right) ^{-1}\mathbf{X}^{\prime }\Omega \mathbf{X}\left( \mathbf{X}^{\prime }\mathbf{X}\right) ^{-1},[/math]

where [math]\Omega = Var(\mathbf{u})[/math] is the [math](n \times n)[/math] variance-covariance matrix of the error terms.

When estimating [math]Var\left( \widehat{\mathbf{\beta }}_{OLS}\right)[/math] we will need to replace [math]\sigma^2[/math] or [math]\Omega[/math] with sample estimates. For [math]\sigma^2[/math] this will be the sample variance of the estimated regression residuals [math]\widehat{\mathbf{u}}[/math], [math]s^2 = \widehat{\mathbf{u}}'\widehat{\mathbf{u}}/(n-k)[/math]. When doing this we obtain the usual OLS standard errors, the ones calculated in the OLSest function.
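To fix ideas, a minimal sketch of this conventional calculation in MATLAB might look as follows (assuming y holds the dependent variable and x the regressor matrix; the variable names are chosen to match the code used later in this section):

[n,k]   = size(x);               % number of observations and regressors
xxi     = inv(x'*x);             % (X'X)^(-1)
bhat    = xxi*(x'*y);            % OLS parameter estimates
res     = y - x*bhat;            % estimated regression residuals
s2      = (res'*res)/(n-k);      % sample variance of the residuals
ols_vcm = s2*xxi;                % conventional variance-covariance matrix
ols_se  = sqrt(diag(ols_vcm));   % usual OLS standard errors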

The real issue arises when making allowance for heteroskedasticity and/or autocorrelation, as we then need to estimate [math]\Omega[/math]. If the error terms are heteroskedastic we need White standard errors, and if they are autocorrelated we need Newey-West standard errors. As it turns out, White standard errors are a special case of Newey-West standard errors and therefore we will only discuss the latter.

Newey-West standard errors

Recall that we need to estimate

[math]Var\left( \widehat{\mathbf{\beta }}\right) =\left( \mathbf{X}^{\prime }\mathbf{X}\right) ^{-1}\mathbf{X}^{\prime }\Omega \mathbf{X}\left( \mathbf{X}^{\prime }\mathbf{X}\right) ^{-1}.[/math]

As [math]\left( \mathbf{X}^{\prime }\mathbf{X}\right)^{-1}[/math] is readily available (and indeed has already been calculated in the process of estimating [math]\widehat{\mathbf{\beta}}[/math]), we concentrate on the middle term [math]\mathbf{X}^{\prime }\Omega \mathbf{X}[/math], which can be rewritten as:

[math]\begin{aligned} \mathbf{X}^{\prime }\Omega \mathbf{X} &=& \sum_{i=1}^{n}\sum_{j=1}^{n}\sigma _{ij}x_{i}x_{j}^{\prime }\\ &=& \sum_{i=1}^{n}\sigma_{i}^{2}x_{i}x_{i}^{\prime}+\sum_{i=1}^{n}\sum_{j=i+1}^{n}\sigma_{ij}\left(x_{i}x_{j}^{\prime}+x_{j}x_{i}^{\prime }\right)\end{aligned}[/math]

where [math]\sigma_{ij}[/math] is the element in the [math]i[/math]th row and [math]j[/math]th column of [math]\Omega[/math] and [math]\sigma_{ii}=\sigma_i^2[/math]. Of course, as [math]\Omega[/math] is unknown, so are all elements [math]\sigma_{ij}[/math]. The task is then to propose an estimator for [math]\mathbf{X}^{\prime }\Omega \mathbf{X}[/math] that can be plugged into the above variance formula. Newey and West propose to essentially use the following estimates:

[math]\begin{aligned} \widehat{\sigma}_{ij} &=& \widehat{u}_i \widehat{u}_j\\ \widehat{\sigma}_{i}^2 &=& \widehat{u}_i^2\end{aligned}[/math]

where [math]\widehat{u}_i[/math] is the [math]i[/math]th element of the vector of estimated regression residuals. Simply replacing [math]\sigma_{ij}[/math] with [math]\widehat{u}_i \widehat{u}_j[/math] everywhere may therefore seem an obvious strategy. Unfortunately, this amounts to setting [math]\widehat{\Omega}=\widehat{\mathbf{u}}\widehat{\mathbf{u}}^{\prime}[/math], and because the OLS residuals are orthogonal to the regressors ([math]\mathbf{X}^{\prime}\widehat{\mathbf{u}}=\mathbf{0}[/math]) this delivers [math]\mathbf{X}^{\prime}\widehat{\Omega}\mathbf{X}=\left(\mathbf{X}^{\prime}\widehat{\mathbf{u}}\right)\left(\mathbf{X}^{\prime}\widehat{\mathbf{u}}\right)^{\prime}=\mathbf{0}[/math] and hence [math]Var\left( \widehat{\mathbf{\beta }}\right)=0[/math]. Newey and West therefore actually propose the following estimator

[math]\left( \mathbf{X}^{\prime }\widehat{\Omega }\mathbf{X}\right)_{NW}=\sum_{i=1}^{n}\widehat{u}_{i}^{2}x_{i}x_{i}^{\prime}+\sum_{i=1}^{n}\sum_{j=i+1}^{n}w_{h}\widehat{u}_{i}\widehat{u}_{j}\left(x_{i}x_{j}^{\prime}+x_{j}x_{i}^{\prime }\right).[/math]

The important addition here is the inclusion of a weight [math]w_h[/math] with [math]h=j-i[/math]. Essentially the weight is included to down-weight the contributions of [math](i,j)[/math] combinations where [math]h=j-i[/math] is large. The particular specification proposed by Newey and West uses the so-called Bartlett weights

[math]w_{h}=\left\{ \begin{array}{cc} 1-\frac{h}{B} & \text{for }0\lt h\lt B \\ 0 & \text{for }h\geq B \end{array}\right.[/math]
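For instance, with [math]B=4[/math] the weights are [math]w_1=3/4[/math], [math]w_2=1/2[/math], [math]w_3=1/4[/math] and [math]w_h=0[/math] for all [math]h\geq 4[/math]: residual cross-products that are further apart receive progressively less weight.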

Terms that include combinations [math]\widehat{u}_{i}\widehat{u}_{j}[/math] where [math]h \geq B[/math] get 0 weight and can therefore be ignored. In fact, White standard errors are the special case in which [math]B=1[/math], so that all cross-terms [math]\widehat{u}_{i}\widehat{u}_{j}[/math] with [math]i \neq j[/math] are ignored.

MATLAB implementation

The formula above contains three summations. One way to implement this in MATLAB is to use one loop for each of the summations. However, it is good practice to avoid loops whenever possible. Let’s look at the first summand in [math]\left( \mathbf{X}^{\prime }\widehat{\Omega }\mathbf{X}\right)_{NW}[/math], [math]\sum_{i=1}^{n}\widehat{u}_{i}^{2}x_{i}x_{i}^{\prime}[/math]. The straightforward implementation using a loop is (assuming that the estimated residuals are in the vector res and x is the [math](n \times k)[/math] matrix of explanatory variables):

xox = zeros(k,k);
for i = 1 : n
  xox = xox + (res(i)^2*x(i,:)'*x(i,:));
end

The same result can be achieved by the following vectorised version:

resx  = repmat(res,1,size(x,2)).*x;
xox = (resx'*resx);

The command repmat(res,1,size(x,2)) in the first line creates an [math](n \times k)[/math] matrix which has the same dimension as x and has the value res(i) in all columns of the [math]i[/math]th row. This is done by taking the vector of residuals, res, and replicating it once vertically and [math]k[/math] times (= size(x,2)) horizontally. You should convince yourself that both calculations deliver identical results (see the toy check below). If we were to calculate White standard errors [math](B=1)[/math], this would already complete the calculation of the central term in [math]Var\left( \widehat{\mathbf{\beta }}\right)[/math], and the estimate of the White variance-covariance matrix for [math]\widehat{\mathbf{\beta }}[/math] could be completed as follows (assuming that [math]\left( \mathbf{X}^{\prime }\mathbf{X}\right) ^{-1}[/math] has previously been calculated - as it is also required for the calculation of [math]\widehat{\mathbf{\beta }}[/math] - and has been stored in xxi):

white_vcm = xxi*xox*xxi;
white_se  = sqrt(diag(white_vcm));      % extract diagonal and calculate square root
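As suggested above, a quick way to convince yourself that the loop and the vectorised calculation of xox coincide is a small toy check along the following lines (the data here are randomly generated purely for illustration):

n = 50; k = 3;                          % toy dimensions
x   = randn(n,k);                       % artificial regressor matrix
res = randn(n,1);                       % artificial "residuals"

xox_loop = zeros(k,k);                  % loop version
for i = 1 : n
  xox_loop = xox_loop + (res(i)^2*x(i,:)'*x(i,:));
end

resx    = repmat(res,1,size(x,2)).*x;   % vectorised version
xox_vec = resx'*resx;

max(max(abs(xox_loop - xox_vec)))       % should be (numerically) zero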

If, however, [math](B\gt 1)[/math] we need to calculate the extra terms [math]\sum_{i=1}^{n}\sum_{j=i+1}^{n}w_{h}\widehat{u}_{i}\widehat{u}_{j}\left(x_{i}x_{j}^{\prime}+x_{j}x_{i}^{\prime }\right)[/math]. Here we have two sums and we could write code that involves two nested loops. However, in the same way in which we avoided the loop in the calculation of the first summand, we can eliminate one loop here. This is perhaps easiest seen by first rewriting this term as follows

[math]\begin{aligned} &&\sum_{i=1}^{n}\sum_{j=i+1}^{n}w_{h}\widehat{u}_{i}\widehat{u}_{j}\left(x_{i}x_{j}^{\prime}+x_{j}x_{i}^{\prime }\right)\\ &=& \sum_{h=1}^{B-1}w_{h}\sum_{i=1}^{n-h}\widehat{u}_{i}\widehat{u}_{i+h}\left(x_{i}x_{i+h}^{\prime}+x_{i+h}x_{i}^{\prime }\right)\\ &=& \sum_{h=1}^{B-1}w_{h}\sum_{i=1}^{n-h}\left(\widehat{u}_{i}x_{i}\widehat{u}_{i+h}x_{i+h}^{\prime}+\widehat{u}_{i+h}x_{i+h}\widehat{u}_{i}x_{i}^{\prime }\right)\\ &=& \sum_{h=1}^{B-1}w_{h}\left(\sum_{i=1}^{n-h}\left(\widehat{u}_{i}x_{i}\widehat{u}_{i+h}x_{i+h}^{\prime}\right)+\sum_{i=1}^{n-h}\left(\widehat{u}_{i+h}x_{i+h}\widehat{u}_{i}x_{i}^{\prime }\right)\right)\end{aligned}[/math]

In the first step we reorder the summations: the outer sum now runs over the lag [math]h=j-i[/math] and the inner sum over [math]i[/math], and the weight [math]w_h[/math] can be pulled outside the inner [math]\sum_{i=1}^{n-h}[/math] sum. The sum over [math]h[/math] only needs to run from [math]h=1[/math] to [math]B-1[/math], as for all [math]h \geq B[/math] the weight factor equals 0.

As it turns out, the two sums in the parenthesis can now be calculated in a vectorised form much like [math]\sum_{i=1}^{n}\widehat{u}_{i}^{2}x_{i}x_{i}^{\prime}=\sum_{i=1}^{n}\widehat{u}_{i}x_{i}\widehat{u}_{i}x_{i}^{\prime}[/math] above. To do this we will use the matrix resx that was calculated earlier. For a given [math]h[/math], the term [math]\sum_{i=1}^{n-h}\left(\widehat{u}_{i+h}x_{i+h}\widehat{u}_{i}x_{i}^{\prime}\right)[/math] can be calculated in vectorised form as resx(1+h:end,:)'*resx(1:end-h,:); the other sum is simply the transpose of this term. Again, if you have any doubt, create a little toy example and confirm that the two give identical results (see the check after the code below). With these sums calculated we are left with writing a loop over [math]\sum_{h=1}^{B-1}[/math]. Lastly, we should recognise that these contributions have to be added to xox as it was calculated earlier.

resx  = repmat(res,1,size(x,2)).*x;
xox = (resx'*resx);

for h = 1:B-1
  w = 1-(h/B);                              % calculate Bartlett weight
  za = resx(1+h:end,:)'*resx(1:end-h,:);    % sum of u(i+h)x(i+h)*u(i)x(i)'
  xox = xox + w*(za+za');                   % add weighted cross-terms
end
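If you want to verify the vectorised cross-term calculation itself, a small toy check along these lines can be used (again purely for illustration; it reuses the toy variables n, k, x, res and resx defined in the earlier check):

h = 2;                                   % pick some lag h < B
za_loop = zeros(k,k);
for i = 1 : n-h
  za_loop = za_loop + res(i+h)*x(i+h,:)'*res(i)*x(i,:);
end
za_vec = resx(1+h:end,:)'*resx(1:end-h,:);
max(max(abs(za_loop - za_vec)))          % should be (numerically) zero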

This concludes the calculation of the central term [math]\left( \mathbf{X}^{\prime}\widehat{\Omega }\mathbf{X}\right)_{NW}[/math] in [math]Var\left( \widehat{\mathbf{\beta }}\right)[/math]; the variance-covariance matrix and the standard errors are then calculated as follows:

nw_vcm = xxi*xox*xxi;
nw_se = sqrt(diag(nw_vcm));     % extract diagonal and calculate standard errors

The last issue that needs to be addressed is what value to use for [math]B[/math]. The basic rule is that the stronger the autocorrelation, the larger the value for [math]B[/math] should be. In practice, however, one often uses the following rule of thumb: [math]B = 4(n/100)^{2/9}[/math].
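In MATLAB this rule of thumb could be implemented along the following lines (since [math]B[/math] is used as a loop limit it needs to be an integer; rounding down is one possible convention):

B = floor(4*(n/100)^(2/9));     % rule-of-thumb bandwidth, rounded down to an integer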

On this Page you can find a revised OLS estimation function that will return, amongst others, White and Newey-West standard errors. As the researcher has a choice to make with respect to [math]B[/math], this function allows for an additional input in which one can set a particular value for [math]B[/math]. If no value is chosen, the above rule-of-thumb value is used as a default.
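To indicate how the pieces discussed above fit together, here is a minimal sketch of how such a function might be organised (the interface and the returned quantities are illustrative; the actual function on the linked page may differ in detail):

function [bhat,ols_se,nw_se,res] = OLShac(y,x,B)
% OLShac - illustrative sketch: OLS estimates with Newey-West standard errors
%   y : (n x 1) vector of the dependent variable
%   x : (n x k) matrix of explanatory variables
%   B : (optional) Newey-West truncation parameter; rule-of-thumb default if omitted

[n,k] = size(x);
if nargin < 3
    B = floor(4*(n/100)^(2/9));       % rule-of-thumb default for B
end

xxi    = inv(x'*x);                   % (X'X)^(-1)
bhat   = xxi*(x'*y);                  % OLS parameter estimates
res    = y - x*bhat;                  % estimated residuals
s2     = (res'*res)/(n-k);            % residual variance
ols_se = sqrt(diag(s2*xxi));          % conventional OLS standard errors

resx = repmat(res,1,k).*x;            % i-th row is u(i)*x(i,:)
xox  = resx'*resx;                    % h = 0 (White) term
for h = 1:B-1
    w   = 1-(h/B);                    % Bartlett weight
    za  = resx(1+h:end,:)'*resx(1:end-h,:);
    xox = xox + w*(za+za');           % add weighted cross-terms
end
nw_vcm = xxi*xox*xxi;                 % Newey-West variance-covariance matrix
nw_se  = sqrt(diag(nw_vcm));          % Newey-West standard errors

end

A call like [bhat,ols_se,nw_se] = OLShac(y,x) then uses the rule-of-thumb value for [math]B[/math], while, say, OLShac(y,x,8) imposes [math]B=8[/math].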