Difference between revisions of "R TimeSeries"

From ECLR

Revision as of 21:27, 11 February 2015

In this section we will demonstrate how to do basic univariate time-series modelling with R. We will use a package written by Rob Hyndman, called "forecast". Before you get started, install it from within R:

   install.packages("forecast")

Note that this package requires version 3 of R. Then, at the beginning of your code, you will have to load the library by adding

   library(forecast)

to your code.

Importing Data

Here we will initially use a dataset on UK CPI (UKCPI.xls). Download this and save it as a csv file, as this makes it easier to import into R.

   setwd("YOUR DIRECTORY")              # This sets the working directory
   data <- read.csv("UKCPI.csv")  # Opens UKCPI.csv from working directory

which will produce a dataframe (data) with two variables, one giving dates (DATE) and the other containing the actual CPI data for 1988Q1 to 2013Q4 (UKCPI).
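
As a quick, hedged illustration of what read.csv() returns, the sketch below writes a tiny stand-in file and reads it back, so it runs without UKCPI.csv. The three rows, their values and the date format are invented for illustration and are not taken from the real dataset; only the column names DATE and UKCPI come from the description above.

```r
# Write a tiny stand-in csv file; the values are invented, not real CPI data
tmp <- tempfile(fileext = ".csv")
writeLines(c("DATE,UKCPI", "1988Q1,60.0", "1988Q2,60.5", "1988Q3,61.0"), tmp)

# read.csv() returns a data frame with one column per csv column
demo <- read.csv(tmp)

str(demo)    # structure: column names and types
head(demo)   # first few rows
```

Running str() and head() on the real data frame in the same way is a simple check that the import worked before moving on.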

Basic Time-Series Data Transformations

Forcing Time-Series

The first thing we need is to ensure that R knows that the data we are dealing with are actually time series data. There is a function in R you can use to check whether R realises that data are time series data or not:

   print(is.ts(data$UKCPI))  # check whether it is a time-series object

This will print either TRUE or FALSE. In this case the answer is FALSE, meaning that data$UKCPI is not recognised as a time-series. We can now force R to recognise this variable as a time-series. Use the following command:

   data$UKCPI <- ts(data$UKCPI, start=c(1988, 1), end=c(2013, 4), frequency=4) 

Let's see what is happening here. ts() is the function used and, as usual, you can use ?ts to find out more details. The first input is the data series we want to be recognised as a time-series. The next inputs are the date of the first observation, start=c(1988, 1), the date of the last observation, end=c(2013, 4), and, importantly, the data frequency, frequency=4 (4 representing quarterly data and 12 monthly data).

To confirm that the data are now time-series data you can use the is.ts() function again, which should now return TRUE.
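
The steps above can be sketched on made-up numbers, so that the example runs without UKCPI.csv. The vector x and its values are assumptions for illustration only; ts(), is.ts(), start(), end() and frequency() are base R functions.

```r
# 104 made-up quarterly values standing in for the CPI series (1988Q1 to 2013Q4)
x <- seq(60, 125, length.out = 104)

is.ts(x)   # FALSE: x is a plain numeric vector

# Attach the time index: first date, last date and observations per year
x_ts <- ts(x, start = c(1988, 1), end = c(2013, 4), frequency = 4)

is.ts(x_ts)       # TRUE
start(x_ts)       # 1988 1
end(x_ts)         # 2013 4
frequency(x_ts)   # 4
```

The series has 26 years of 4 quarters each, which is why the stand-in vector has 104 elements.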

Differencing data

Often you will need a differenced time series, [math]\Delta y_t = y_t - y_{t-1}[/math].
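
One possible way to difference a series is base R's diff() function, sketched below on a short made-up quarterly series. The values of y are invented for illustration, and the choice of diff() is an assumption about the approach rather than something stated on this page.

```r
# Made-up quarterly series, for illustration only
y <- ts(c(100, 102, 105, 104, 108, 111, 115, 113),
        start = c(1988, 1), frequency = 4)

dy  <- diff(y)            # first difference: y_t - y_{t-1}
d4y <- diff(y, lag = 4)   # four-quarter difference: y_t - y_{t-4}

dy    # starts in 1988 Q2, one observation shorter than y
d4y   # starts in 1989 Q1, four observations shorter than y
```

Because the input is a time-series object, the results keep their dates, so the lost initial observations are visible in the start date of each differenced series.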

Additional Resources

  • A very quick intro from Quick-R can be found here [1]
  • We are using the package "forecast" authored by Rob Hyndman who has also written an online textbook on the topic of forecasting [2]
  • To access some very useful data-series in a very convenient way we will also use the QUANDL package.