Difference between revisions of "Regression Inference in R"
Revision as of 12:55, 14 April 2015
Here we discuss how to perform standard inference in regression models.
Setup
We continue the example started in R_Regression#A first example, which is replicated here:
# This is my first R regression!
setwd("T:/ECLR/R/FirstSteps")   # This sets the working directory
mydata <- read.csv("mroz.csv")  # Opens mroz.csv from working directory

# Now convert variables with "." to num with NA
mydata$wage <- as.numeric(as.character(mydata$wage))
mydata$lwage <- as.numeric(as.character(mydata$lwage))
Before we run our initial regression model we restrict the dataframe mydata to those observations that do not have missing wage information, using the following subset command:
mydata <- subset(mydata, !is.na(wage)) # keep only rows with non-missing wage
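The conversion and subsetting steps above can also be handled at the moment the file is read. The following is a minimal sketch, assuming that missing values are coded as "." (as the comment in the code above suggests). It reads a small made-up CSV snippet through the text argument of read.csv rather than the real mroz.csv, so that it runs on its own; the data frame name demo and the values in it are purely illustrative.

```r
# Made-up three-row CSV standing in for mroz.csv (illustration only)
csv_text <- "wage,lwage,exper\n3.35,1.21,14\n.,.,5\n4.59,1.52,15"

# na.strings = "." tells read.csv to treat "." as missing, so the
# affected columns are read as numeric straight away
demo <- read.csv(text = csv_text, na.strings = ".")

# Keep only the rows with non-missing wage information
demo <- subset(demo, !is.na(wage))

str(demo)
```

With the real file, the same na.strings argument would be passed to the read.csv("mroz.csv") call, provided "." is indeed the only missing-value code used.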
Now we can run our initial regression:
# Run a regression
reg_ex1 <- lm(lwage~exper+log(huswage),data=mydata)
reg_ex1_sm <- summary(reg_ex1)
We will introduce inference in this model.
t-tests
We use t-tests to test simple coefficient restrictions on regression coefficients.
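As a sketch of what such a test looks like in practice, the code below extracts the coefficient table from a model summary and tests a hypothesis of the form H0: beta_j = b0 by hand. It uses simulated data so that it runs on its own; the data frame simdata, its variables y, x1 and x2, and the helper t_test_coef are hypothetical names introduced only for illustration. With the data from the Setup section, the same calls would apply to reg_ex1.

```r
set.seed(123)

# Simulated data (illustration only, not the mroz data)
n <- 200
simdata <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
simdata$y <- 1 + 0.5 * simdata$x1 + 0 * simdata$x2 + rnorm(n)

reg_sim <- lm(y ~ x1 + x2, data = simdata)
reg_sim_sm <- summary(reg_sim)

# The coefficient table holds estimates, standard errors,
# t-statistics and p-values for H0: coefficient = 0
print(coef(reg_sim_sm))

# Hypothetical helper: two-sided t-test of H0: beta_j = b0
t_test_coef <- function(model, name, b0 = 0) {
  ct <- coef(summary(model))
  t_stat <- (ct[name, "Estimate"] - b0) / ct[name, "Std. Error"]
  p_val <- 2 * pt(abs(t_stat), df = df.residual(model), lower.tail = FALSE)
  c(t = t_stat, p = p_val)
}

# Test H0: the coefficient on x1 equals 0.5
res_x1 <- t_test_coef(reg_sim, "x1", b0 = 0.5)
print(res_x1)
```

For b0 = 0 the helper reproduces the "t value" and "Pr(>|t|)" columns of the summary table; other values of b0 are not reported by summary and have to be computed in this way.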
F-tests
F-tests are used to test multiple restrictions on regression coefficients jointly.
Let's say we are interested in whether two additional variables, age and educ, should be included in the model. As a good econometrics student, or even master, you know that to calculate an F-test you need the residual sums of squares from a restricted model (that is, model reg_ex1) and an unrestricted model. The latter we estimate here:
reg_ex2 <- lm(lwage~exper+log(huswage)+age+educ,data=mydata)
reg_ex2_sm <- summary(reg_ex2)
Calculating the F-test is now very easy. We use the function anova:
print(anova(reg_ex1,reg_ex2))
which delivers the following output:
Analysis of Variance Table

Model 1: lwage ~ exper + log(huswage)
Model 2: lwage ~ exper + log(huswage) + age + educ
  Res.Df    RSS Df Sum of Sq      F    Pr(>F)
1    425 210.11
2    423 188.10  2    22.004 24.741 6.895e-11 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
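The table at the heart of this output delivers the individual residual sums of squares (column RSS), the F-test statistic (column F) and its p-value (column Pr(>F)). With a p-value of 6.895e-11 we reject the null hypothesis that the coefficients on age and educ are both zero at any conventional significance level.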
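To see where these numbers come from, the F-statistic can be computed by hand as F = ((RSS_r - RSS_u)/q) / (RSS_u/df_u), where RSS_r and RSS_u are the residual sums of squares of the restricted and unrestricted models, q is the number of restrictions and df_u is the residual degrees of freedom of the unrestricted model. With the values in the table this gives ((210.11 - 188.10)/2) / (188.10/423), which is approximately 24.75 and agrees with the reported 24.741 up to the rounding of the printed RSS values. The sketch below repeats the calculation in R and compares it with anova. It uses simulated data so that it runs on its own; simdata, its variables and the model names reg_r and reg_u are hypothetical stand-ins for mydata, reg_ex1 and reg_ex2.

```r
set.seed(42)

# Simulated data (illustration only, not the mroz data)
n <- 300
simdata <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
simdata$y <- 1 + 0.5 * simdata$x1 + 0.3 * simdata$x2 + rnorm(n)

reg_r <- lm(y ~ x1, data = simdata)            # restricted model
reg_u <- lm(y ~ x1 + x2 + x3, data = simdata)  # unrestricted model

rss_r <- deviance(reg_r)   # residual sum of squares, restricted
rss_u <- deviance(reg_u)   # residual sum of squares, unrestricted
df_u  <- df.residual(reg_u)
n_restr <- df.residual(reg_r) - df_u  # number of restrictions (here 2)

# F-statistic and its p-value from the F(n_restr, df_u) distribution
f_stat <- ((rss_r - rss_u) / n_restr) / (rss_u / df_u)
p_val  <- pf(f_stat, n_restr, df_u, lower.tail = FALSE)

# Compare with the values reported by anova()
aov_tab <- anova(reg_r, reg_u)
print(c(F_manual = f_stat, F_anova = aov_tab$F[2]))
print(c(p_manual = p_val, p_anova = aov_tab$`Pr(>F)`[2]))
```

For an lm object, deviance returns the residual sum of squares, which is why it can stand in for the RSS column of the anova table.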