Difference between revisions of "R Regression"

From ECLR

Revision as of 16:25, 17 January 2015

Let's assume we want to run a regression with lwage (the logarithm of the woman's wage) as the dependent variable and a constant, exper (the years of experience) and the logarithm of the husband's wage (huswage) as explanatory variables. Note that the logarithm of the woman's wage already exists as the variable lwage, but the logarithm of the husband's wage does not exist as its own variable, so we still have to calculate it.
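
As a minimal sketch of how the missing logarithm could be obtained, the snippet below uses a small made-up data frame (toy, with invented huswage values, standing in for the real data) and stores the natural logarithm in a new column. The column name lhuswage is a hypothetical choice for illustration only.

```r
# Toy stand-in for the real data; these huswage values are invented
toy <- data.frame(huswage = c(4.0, 8.5, 3.6, 12.0))

# log() in R is the natural logarithm; store it as a new column
# (the name lhuswage is a hypothetical choice for this sketch)
toy$lhuswage <- log(toy$huswage)

print(toy)
```

Alternatively, the transformation can be written directly inside a model formula as log(huswage), in which case no new column is needed.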

The lm() function

The R function that does the heavy lifting for regression analysis is lm(), and we will take a closer look at how it works. But first, let's get our first regression under our belt:

    # This is my first R regression!
    setwd("T:/ECLR/R/FirstSteps")              # This sets the working directory
    mydata <- read.csv("mroz.csv")  # Opens mroz.csv from working directory
    # wage and lwage use "." for missing values; convert them to numeric so that "." becomes NA
    mydata$wage <- as.numeric(as.character(mydata$wage))
    mydata$lwage <- as.numeric(as.character(mydata$lwage))
    # Run a regression
    regres <- lm(lwage ~ exper + log(huswage), data = mydata)
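
The regression above only stores the fitted model in an object; nothing is printed yet. The following is a self-contained sketch of the same steps on simulated data, so it runs without mroz.csv. The data frame sim, the object fit and all numbers used to generate the data are assumptions made for illustration; they are not taken from the Mroz data.

```r
# Simulated stand-in for the Mroz data (all values are invented)
set.seed(1)
n <- 50
sim <- data.frame(exper   = sample(0:30, n, replace = TRUE),
                  huswage = runif(n, min = 2, max = 20))
lw <- 0.5 + 0.02 * sim$exper + 0.1 * log(sim$huswage) + rnorm(n, sd = 0.3)

# Mimic the raw file: lwage arrives as text, with "." marking missing values
sim$lwage <- as.character(lw)
sim$lwage[1:5] <- "."

# Same conversion as in the text: "." cannot be parsed and becomes NA
# (R warns about the coercion; the warning is silenced here)
sim$lwage <- suppressWarnings(as.numeric(as.character(sim$lwage)))

# Same model formula as in the text, with log() applied inside the formula
fit <- lm(lwage ~ exper + log(huswage), data = sim)

summary(fit)   # coefficient table, standard errors, R-squared
coef(fit)      # named vector of estimated coefficients
nobs(fit)      # number of observations actually used in the fit
```

With R's default settings, lm() drops rows that contain NA in any model variable, so the number of observations used can be smaller than the number of rows in the data frame.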