Difference between revisions of "R Data"

From ECLR

Revision as of 14:50, 12 January 2015

On most occasions you will want to use data that already exist in some electronic form (lucky you that you did not study in the 70s, when you had to trawl through paper back-copies from some statistical agency, copy the data by hand and then enter them manually into a spreadsheet). The question then is how to import these data into R and use them for your statistical or econometric analysis.

Upload a data file to your working directory

In the first instance I want you to download this mroz.xls Excel file, which contains the dataset we will use for our first steps in R. It is a well-used cross-sectional dataset with 753 observations of female members of the labour force in the US (in 1975). It contains variables such as the number of children, the wage and the hours worked. A bit more detail on the data and the variables can be found in this file. See also [1].

Make sure that you note down in which folder you save this file. Save it in a folder in which you want to save your work. We shall soon call this folder our working directory. At this stage we have not yet made the data available to R. This will come soon!

File Formats

R makes it really easy to import data if they are already in the R data format (see later) or in csv (comma separated values) format. This is a rather short list and, importantly, it does not include Excel files, which is the format in which most data files will land in your inbox.
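To make the two easy formats concrete, here is a minimal sketch that writes a tiny data frame to a csv file and to one of R's native formats, then reads both back. Everything in it is an assumption made for illustration: the data frame `toy`, its column names and the temporary file paths are invented, and `.rds` (via `saveRDS()` and `readRDS()`) is only one of R's own formats, which may not be the one discussed later.

```r
# Toy data frame; the values and column names are made up for illustration
toy <- data.frame(id = 1:3, wage = c(3.35, 1.39, 4.55))

# Temporary files, so nothing in your working directory is touched
csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")

write.csv(toy, csv_path, row.names = FALSE)  # plain text, comma separated values
saveRDS(toy, rds_path)                       # one of R's native formats

from_csv <- read.csv(csv_path)   # reads the csv file back into a data frame
from_rds <- readRDS(rds_path)    # reads the R-format file back

str(from_csv)
str(from_rds)
```

Both calls return an ordinary data frame, so the rest of your analysis does not need to know which file format the data came from.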

csv file import

Now, as R is such a popular package, clever and busy programmers have written an extension (or, better, a package in R speak) that imports Excel files directly into R. Unfortunately this package (gdata) requires that some other software is installed on your computer and ... it just gets too messy. The good thing is that it is really easy to turn your Excel file into a csv file. Open your data file in Excel, choose "Save as ..." and change the file type from Excel to csv.

Once you have done this with the "mroz.xls" file you should have, besides the Excel file, a "mroz.csv" file in your folder. It is now time to let R do some work. Return to your Firststeps.R script, or open a new script that starts with the following lines:

    # This is my first R script!
    setwd("O:/ECLR/R")              # This sets the working directory
    mydata <- read.csv("mroz.csv")  # Opens mroz.csv from working directory
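After an import it is worth checking that R read what you expected. The sketch below is self-contained, so it uses a small made-up csv file in a temporary folder as a stand-in for mroz.csv; the file name `toy.csv`, its three columns and the object name `toydata` are hypothetical. The same checks can be run on `mydata`, where `dim(mydata)` should report 753 rows if the whole file arrived, going by the description of the dataset above.

```r
# Stand-in for mroz.csv: a tiny hypothetical file written to a temp folder
csv_path <- file.path(tempdir(), "toy.csv")
writeLines(c("hours,kids,wage",
             "1610,1,3.35",
             "1656,0,1.39",
             "0,2,NA"), csv_path)

# Fail early if the file is not where we think it is
stopifnot(file.exists(csv_path))

toydata <- read.csv(csv_path)

dim(toydata)     # number of rows (observations) and columns (variables)
names(toydata)   # variable names, taken from the header row
head(toydata)    # the first few rows
str(toydata)     # the type of each variable
```

If `read.csv()` complains that it cannot open the file, `getwd()` shows which folder R is currently looking in and `list.files()` shows what that folder contains.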


Return to the R Start page.