Python/Data

From ECLR
Revision as of 20:35, 21 July 2014 by Rb


Introduction

Here we give a brief introduction to handling data when using Python to solve econometric problems. We will use a library called Pandas, whose data structures are built on NumPy arrays. So first make sure that the [[Numpy|NumPy]] and [[Panda|Pandas]] modules are available.

You can use Pandas to do any of the following:

  • Merge data-sets
  • Filter data-sets
  • Calculate summary statistics
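As a rough illustration of these three operations, here is a minimal sketch. The two small tables, their column names (Date, SP500, IBM) and all numbers are invented for this example and are not taken from the actual data files.

```python
import pandas as pd

# Toy data, invented purely for illustration (not real prices)
sp = pd.DataFrame({
    "Date": ["2014-07-01", "2014-07-02", "2014-07-03"],
    "SP500": [100.0, 102.0, 101.0],
})
ibm = pd.DataFrame({
    "Date": ["2014-07-01", "2014-07-02", "2014-07-03"],
    "IBM": [50.0, 49.0, 51.0],
})

# Merge: join the two data-sets on their common Date column
merged = pd.merge(sp, ibm, on="Date")

# Filter: keep only the rows where the toy index is above 100
filtered = merged[merged["SP500"] > 100]

# Summary statistics: count, mean, std, min, quartiles and max per column
summary = merged[["SP500", "IBM"]].describe()

print(merged)
print(filtered)
print(summary)
```

By default pd.merge performs an inner join, so only dates present in both tables survive the merge. Whether that is the right choice depends on the data at hand.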

We will do this by way of an example. Here are two data files:

  1. S&P500: SP500.csv
  2. IBM: IBM.csv

These are csv files downloaded from [http://www.yahoo.com/finance Yahoo] which contain information about the S&P500 share price index and the IBM share prices. Let’s use Python and Pandas to explore the data.

Data Import

Use the following code to load both files into Pandas DataFrames:

import numpy as np      # import modules for use
import pandas as pd

data_SP = pd.read_csv('SP500.csv')
data_IBM = pd.read_csv('IBM.csv')
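Once a file is loaded it is usually worth checking what was actually read. The sketch below is one possible way to do this. It assumes the files have a Date column followed by price and volume columns, which is the layout Yahoo downloads commonly used; the text above does not confirm this. To keep the example self-contained, an in-memory string with invented values stands in for the real file.

```python
import io
import pandas as pd

# Stand-in for a file such as 'IBM.csv'; header layout is an assumption
# and all values are invented for illustration
csv_text = """Date,Open,High,Low,Close,Volume,Adj Close
2014-07-03,10.0,12.0,9.0,11.0,1000,11.0
2014-07-02,9.0,11.0,8.0,10.0,1500,10.0
2014-07-01,8.0,10.0,7.0,9.0,1200,9.0
"""

# parse_dates converts the Date column to datetimes,
# index_col makes it the row index
data = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# Put the observations into chronological order
data = data.sort_index()

print(data.head())    # first few rows
print(data.dtypes)    # type of each column
print(data.shape)     # (number of rows, number of columns)
```

With a real file, the io.StringIO(csv_text) argument would simply be replaced by the file name, for instance 'IBM.csv'.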
