Difference between revisions of "R reg diag"
Revision as of 08:13, 14 April 2015
When estimating regression models you will usually want to undertake some diagnostic testing. The functions we will use are all contained in the "AER" package (see the relevant CRAN webpage).
Heteroskedasticity
One of the Gauss-Markov assumptions is that the variance of the regression error terms is constant. If it is not, the OLS parameter estimators will not be efficient, and one needs to use heteroskedasticity-robust standard errors to obtain valid inference on the regression coefficients (see R_robust_se).
Tests for heteroskedasticity are usually based on an auxiliary regression of the squared estimated regression residuals on a set of explanatory variables that are suspected to be related to the potentially changing error variance. We continue the example started in R_Regression#A first example, which is replicated here. Note the first line, which we include to gain access to the procedures in the AER package:
 library(AER)   # allow access to AER package
 # This is my first R regression!
 setwd("T:/ECLR/R/FirstSteps")    # This sets the working directory
 mydata <- read.csv("mroz.csv")   # Opens mroz.csv from working directory
 
 # Now convert variables with "." to num with NA
 mydata$wage <- as.numeric(as.character(mydata$wage))
 mydata$lwage <- as.numeric(as.character(mydata$lwage))
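To see what the conversion step does, here is a minimal sketch on a made-up vector (the names raw, f and vals are hypothetical and not part of the mroz data). It assumes the column was read in as a factor or as text, with "." marking missing values:

```r
# Hypothetical raw column: numbers stored as text, "." marks a missing value
raw <- c("3.35", ".", "1.39")
f <- factor(raw)

# Converting a factor directly returns the internal level codes, not the values
codes <- as.numeric(f)

# Going via character recovers the numbers; "." cannot be parsed and becomes NA
# (R issues a coercion warning here, which we suppress for the illustration)
vals <- suppressWarnings(as.numeric(as.character(f)))
print(vals)
```

This is why the conversion goes through as.character() first: it works whether the column arrived as a factor or as a character vector.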
Before we run our initial regression model we shall restrict the dataframe mydata to those observations that do not have missing wage information, using the following subset command:
 mydata <- subset(mydata, !is.na(wage))   # select non NA data
Now we can run our initial regression:
 # Run a regression
 reg_ex1 <- lm(lwage~exper+log(huswage),data=mydata)
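The auxiliary-regression idea described above can be sketched with base R alone. The example below is only an illustration under stated assumptions: it uses simulated data rather than mroz.csv so that it runs on its own, and the names sim, reg_sim, aux, lm_stat and p_value are hypothetical. It computes the n times R-squared form of a Breusch-Pagan style LM test:

```r
# Simulated data (hypothetical; stands in for the variables of the example)
set.seed(123)
n  <- 500
x1 <- runif(n, 1, 10)
x2 <- rnorm(n)
# The error standard deviation grows with x1, so the errors are heteroskedastic
y  <- 1 + 0.5 * x1 + 0.3 * x2 + rnorm(n, sd = 0.5 * x1)
sim <- data.frame(y, x1, x2)

# Step 1: estimate the original model and store the squared residuals
reg_sim  <- lm(y ~ x1 + x2, data = sim)
sim$res2 <- residuals(reg_sim)^2

# Step 2: auxiliary regression of the squared residuals on the
# variables suspected of driving the error variance
aux <- lm(res2 ~ x1 + x2, data = sim)

# Step 3: LM statistic n * R^2; under the null of homoskedasticity it is
# asymptotically chi-squared with df = number of auxiliary regressors (2 here)
lm_stat <- n * summary(aux)$r.squared
p_value <- pchisq(lm_stat, df = 2, lower.tail = FALSE)
print(c(LM = lm_stat, p = p_value))
```

A small p-value is evidence against constant error variance. The same steps could be applied to reg_ex1, using its residuals and the regressors exper and log(huswage) in the auxiliary regression.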