Difference between revisions of "R"

From ECLR

Revision as of 22:52, 3 May 2018

R is open source software that has been adopted by the statistical community as its standard software package. It is command driven, meaning that you have to give the software written commands to tell it what to do. At first sight this is not as convenient as menu driven software, but it has the huge advantage that you can collect a large set of commands in a file (a script file) and then have R execute all these commands in one go. This serves as great documentation of the work you have done and, most importantly, makes it easy to change a small aspect of your work and rerun the entire project at the press of a button rather than laboriously retracing all your steps through menus.
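As a minimal sketch of the script-file idea, the snippet below writes three commands to a file and then asks R to run them all at once with source(). The file name my_analysis.R and the numbers are made up for illustration; in practice you would type the commands into a script file in an editor rather than create it with writeLines().

```r
# Create a small script file in R's temporary folder (file name is hypothetical)
script_file <- file.path(tempdir(), "my_analysis.R")
writeLines(c(
  "x <- c(2, 4, 6, 8)   # made-up numbers, for illustration only",
  "m <- mean(x)         # calculate their average",
  "print(m)             # show the result"
), script_file)

# Run every command in the script in one go
source(script_file)
```

Changing one line of the script and calling source() again reruns the whole piece of work.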

The fixed cost of learning this software is higher than learning a menu driven statistical software package. But if you engage with this process the rewards will be great.

Last but not least, R has a killer advantage. It is free!!!

Installing the Software

Installation Demonstration

To work with R you will have to install the basic software package R, but we also advise you to install RStudio, which is an add-on to R (formally called an Integrated Development Environment - IDE) which makes working with R easier.

As this is open-source software that you get for free it is perhaps understandable that the webpages from which you get the R software aren't as slick as you expect. And the language tends to be somewhat more techy, but don't worry, you'll be fine.

So here are the steps you should take.

  1. Download and install the R software, which is available from the CRAN website. Follow the "Download and Install R" link (and do not be tempted to download the source code!) for your operating system. If you are running Windows, choose only the "base" package on the following screen. Then follow the usual installation instructions. You could now already work with R, but we recommend that you first undertake the next step.
  2. Once we have installed R, we can download and install RStudio. You can download it from the RStudio download page.

The basic R software has some basic functionality, but the power of R comes from the ability to use code that other people have written to perform statistical and econometric techniques. These additional pieces of software are called packages, and the next step will be to learn how to use them.
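To give a flavour of what working with packages looks like, here is a minimal sketch. The helper name ensure_package is made up for this illustration; install.packages(), library() and requireNamespace() are standard R functions. The contributed package named in the last comment is only an example and is not prescribed by this page.

```r
# Hypothetical helper: install a package only if it is not yet available,
# then load it so that its functions can be used
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from CRAN; needs an internet connection
  }
  library(pkg, character.only = TRUE)
  invisible(TRUE)
}

ensure_package("stats")   # ships with R, so nothing is downloaded here
# ensure_package("AER")   # an example of a contributed package (illustrative)
```

A package has to be installed only once per computer, but it has to be loaded with library() in every new R session.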

Data Sets

We use a number of datasets on this page. For convenience they are listed here:

Women's wages
  Description: Observations for 753 females on family and work circumstances, hours worked and wages
  Files: mroz.xls, mroz.csv, Variable Description
  Source: Wooldridge Book Companion Page

Crime Statistics
  Description: Crime Statistics for 90 counties in North Carolina (US) for the years 1981 to 1987 (Panel Data); includes a number of variables to characterise the counties
  Files: crime4.xls, crime4.csv, Variable Description
  Source: Wooldridge Book Companion Page

Baseball Wages
  Description: Salary and other information (such as race, position and performance information) for 353 Baseball Players in 1993
  Files: mlb1.xls, mlb1.csv, Variable Description
  Source: Wooldridge Book Companion Page

Following the links in the above table you will also be able to download R data files for these datasets.

Basic Tasks

To illustrate how to perform basic tasks in R we will use the Women's wages dataset (mroz.csv) for our first steps. This is a comma separated value (csv) file containing a widely used cross-sectional dataset with 753 observations of female members of the labour force in the US (in 1975). It contains variables such as the number of children, the wage, the hours worked etc.
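A first look at this file could be sketched as follows. This assumes that mroz.csv has been downloaded into R's current working directory; the helper name load_mroz and the object name mydata are made up for this illustration, and the import options used on the linked discussion pages may differ.

```r
# Hypothetical helper: read the csv file if it is present, otherwise say so
load_mroz <- function(path = "mroz.csv") {
  if (!file.exists(path)) {
    message("File not found: ", path, " (check the working directory with getwd())")
    return(NULL)
  }
  read.csv(path, stringsAsFactors = FALSE)
}

mydata <- load_mroz()
if (!is.null(mydata)) {
  print(dim(mydata))  # number of rows and columns; the text above mentions 753 observations
  str(mydata)         # variable names and types
}
```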

  • First Steps: Discussion
  • Loading Data and Date Formats: Discussion
  • Using Packages: Discussion
  • Basic Data Analysis: Discussion
  • Data Analysis Tidyverse: Discussion
  • A Regression: Discussion
  • Creating Graphics: Discussion, Treat Yourself
  • Saving Data and Screen Output: Discussion

Bread and Butter Techniques

These are standard econometric tasks that any applied econometrician, and indeed any aspiring economics student, should be familiar with.

  • Dummy variables: Discussion
  • Predicting from a Regression: Discussion
  • Standard inference: Discussion
  • Regression diagnostics: Discussion
  • Robust standard errors: Discussion

Intermediate Techniques

  • Panel Data: Discussion
  • Instrumental Variables Estimation: Discussion
  • Matching: Discussion
  • Univariate Time Series Modelling: Discussion
  • Multivariate Time Series Modelling (VAR): Discussion
  • Time Series Plotting: Discussion (uses the following data files: AggInfl.csv, CoreInfl.csv, EnergInfl.csv, FoodInfl.csv)
  • Univariate and Multivariate GARCH Modelling: Discussion
  • Bayesian Estimation Principle: Discussion

Some Fun Stuff

  • Plotting Maps: Discussion
  • Scraping the internet: Discussion


Econometric Demonstrations

In this section you can find code that can be useful to demonstrate a few econometric issues.

  • Sampling and LLN and CLT: Discussion
  • Demonstrating OLS estimator unbiasedness: Discussion
  • Demonstrating OLS estimator asymptotic behaviour: Discussion

Authors, Maintenance and Contributions

This wiki was created by Ralf Becker and James Lincoln with the financial support of a University of Manchester CHERIL grant. If you have any suggestions please contact us by email. Contributions to this wiki are encouraged. Please contact us if you are interested.

An easy way to create content for this page is to write RMarkdown documents which can then easily be translated, thanks to pandoc, to MediaWiki format (see http://nicercode.github.io/guides/reports/). From the command window call "pandoc -f markdown -t mediawiki FILENAME.md -o FILENAME.mediawiki".
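If you prefer to stay inside R, the same pandoc call could be issued from the R console. The following is only a sketch: the helper name md_to_mediawiki is made up, it assumes that pandoc is installed and can be found on the system PATH, and it uses pandoc's lowercase format name mediawiki.

```r
# Hypothetical helper: convert FILENAME.md to FILENAME.mediawiki using pandoc
md_to_mediawiki <- function(md_file) {
  out_file <- sub("\\.md$", ".mediawiki", md_file)
  args <- c("-f", "markdown", "-t", "mediawiki",
            shQuote(md_file), "-o", shQuote(out_file))
  if (nzchar(Sys.which("pandoc"))) {
    system2("pandoc", args)  # runs the command quoted in the text above
  } else {
    message("pandoc was not found; the command would have been: pandoc ",
            paste(args, collapse = " "))
  }
  invisible(out_file)        # name of the MediaWiki file
}
```

Calling md_to_mediawiki("FILENAME.md") returns the name of the output file, here "FILENAME.mediawiki".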

More references

There is a plethora of resources if you want to learn R (which is one reason why this resource does not go into too much detail). Here are a few places to start.

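  • A dedicated Twitter channel for Econometrics with R (https://twitter.com/Rstats4Econ)
  • Rob Hyndman has great material (http://robjhyndman.com/publications/software/), some of which will be referred to here.
  • My colleague Juanjo Medina has material for criminologists that includes good intros to graphing and some basic statistics (http://jjmedinaariza.github.io/R-for-Criminologists/)
  • A Beginner's Guide to R (http://www.computerworld.com/article/2497143/business-intelligence-beginner-s-guide-to-r-introduction.html?null)
  • Florian Heiss has written an R companion book to Wooldridge's Introductory Econometrics. It is available for free online (http://www.urfie.net/read/mobile/index.html#p=1) but you can also get a hardcopy (http://www.urfie.net/index.html)
  • Some R resources provided by UCLA (http://www.ats.ucla.edu/stat/r/)
  • Quick-R web-site (http://www.statmethods.net) and first chapter of R in Action (http://www.manning.com/kabacoff2/RiA2E_meap_ch1.pdf)
  • Just TryR it! (http://tryr.codeschool.com/levels/1/challenges/1)
  • A practice RData file (https://drive.google.com/file/d/0B-eFeuIjpKsOWmdpOUsxT2Via3M/view?usp=sharing), use this to load required packages (https://drive.google.com/file/d/0B-eFeuIjpKsOc3VYYnh1bEtZcnM/view?usp=sharing)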