Difference between revisions of "R robust se"

From ECLR

Revision as of 22:19, 5 April 2015

Here we briefly discuss how to estimate robust standard errors for linear regression models.

Which package to use

There are a number of pieces of code available to facilitate this task[1]. Here I recommend the "sandwich" package (http://cran.r-project.org/web/packages/sandwich/index.html), which has the most comprehensive robust standard error options I am aware of.

As described in more detail in R_Packages, you should install the package the first time you use it on a particular computer:

   install.packages("sandwich")

and then load the package with library() at the beginning of your script:

   library(sandwich)

All code snippets below assume that you have done so.
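
If you would rather not keep track of whether the package is already installed on a given computer, the two steps above can be combined. The following is only a sketch of one common pattern, not the only way to do it; it relies on the base R function requireNamespace() to check for the package before installing.

```r
# Install "sandwich" only if it is not yet available on this computer
if (!requireNamespace("sandwich", quietly = TRUE)) {
  install.packages("sandwich")
}

# Load the package for the current session
library(sandwich)
```

With this at the top of a script, install.packages() runs only the first time the script is used on a computer.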

Heteroskedasticity robust standard errors

I assume that you know that heteroskedastic error terms render OLS estimators of linear regression models inefficient (although they remain unbiased). More seriously, heteroskedasticity also implies that the usual standard errors computed for your coefficient estimates (e.g. when you use the summary() command as discussed in R_Regression) are incorrect (sometimes described as biased). Inference based on these standard errors will therefore be incorrectly sized. What we need are coefficient standard errors that remain valid even when the regression error terms are heteroskedastic.

Let's assume that you have calculated a regression (as in R_Regression):

    # Run a regression
    reg_ex1 <- lm(lwage~exper+log(huswage),data=mydata)

The function from the "sandwich" package that you want to use is called vcovHC() and you use it as follows:

   vcv <- vcovHC(reg_ex1, type = "HC1")

This saves the heteroskedasticity-robust variance-covariance matrix of the coefficient estimates in vcv[2]. Now you can calculate robust t-tests by using the estimated coefficients and the new standard errors (the square roots of the diagonal elements of vcv). But note that inference using these standard errors is only valid for sufficiently large sample sizes (the t-statistics are asymptotically normally distributed).
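
The calculation described above can be sketched as follows. Since mydata is not available here, the example simulates a stand-in dataset; the simulated values, the coefficients used to generate them and the names robust_se, robust_t, robust_p and results are assumptions made purely for illustration. The p-values use the normal distribution, in line with the large-sample argument above.

```r
library(sandwich)

# Simulated stand-in for mydata (hypothetical values, for illustration only)
set.seed(1)
n <- 500
mydata <- data.frame(exper = runif(n, 0, 30),
                     huswage = exp(rnorm(n, 2, 0.5)))
# The error standard deviation grows with exper, so errors are heteroskedastic
mydata$lwage <- 1 + 0.02 * mydata$exper + 0.1 * log(mydata$huswage) +
  rnorm(n, sd = 0.1 + 0.03 * mydata$exper)

# Run the regression and obtain the robust variance-covariance matrix
reg_ex1 <- lm(lwage ~ exper + log(huswage), data = mydata)
vcv <- vcovHC(reg_ex1, type = "HC1")

# Robust standard errors: square roots of the diagonal elements of vcv
robust_se <- sqrt(diag(vcv))

# Robust t-statistics and two-sided p-values (asymptotic normal approximation)
robust_t <- coef(reg_ex1) / robust_se
robust_p <- 2 * pnorm(-abs(robust_t))

# Collect everything in one table
results <- cbind(Estimate = coef(reg_ex1),
                 "Robust SE" = robust_se,
                 "t value" = robust_t,
                 "Pr(>|z|)" = robust_p)
print(round(results, 4))
```

If the "lmtest" package is installed, its coeftest() function accepts the same matrix through its vcov. argument and should produce a comparable table, although by default it may base the p-values on a t rather than a normal distribution.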

Autocorrelation and heteroskedasticity robust standard errors

The us

Footnotes

  1. An alternative option is discussed at http://www.r-bloggers.com/video-tutorial-on-robust-standard-errors/ but it is less powerful than the sandwich package.
  2. As the type option in this function suggests, there are several variants (actually "HC0" to "HC4"). Using "HC1" will replicate the robust standard errors you would obtain using Stata.