Difference between revisions of "Bayes"

From ECLR
Jump to: navigation, search
(Created page with "= Problem Setup = This section is yet to be written, but you could check out this [http://youtu.be/8QxiZY-MaGA YouTube clip] I produced to illustrate how a Bayesian Estimatio...")
 
(Problem Setup)
Line 7: Line 7:
 
In fact estimating a proportion is about the simplest problem you can tackle in the Bayesian estimation framework, but it makes the workings very obvious.
 
In fact estimating a proportion is about the simplest problem you can tackle in the Bayesian estimation framework, but it makes the workings very obvious.
  
The data used in the example can be downloaded from here: [[[media: HadCRUT4-gl.xlsx]|[media: HadCRUT4-gl.xlsx]]].
+
The data used in the example can be downloaded from here: [[media:HadCRUT4-gl.xlsx|HadCRUT4-gl.xlsx]].
  
 
At this stage you will have to use the comments in the code for explanations.
 
At this stage you will have to use the comments in the code for explanations.

Revision as of 20:06, 7 December 2014

Problem Setup

This section is yet to be written, but you could check out this YouTube clip I produced to illustrate how a Bayesian Estimation of a proportion would work.

The problem is to estimate the proportion of years in which the average global temperature increases. This is a slightly odd way to check whether there is a detectable global temperature trend, although there surely may be better ways to do this. But it is a nice example to illustrate how Bayesian estimation works.

In fact estimating a proportion is about the simplest problem you can tackle in the Bayesian estimation framework, but it makes the workings very obvious.

The data used in the example can be downloaded from here: HadCRUT4-gl.xlsx.

At this stage you will have to use the comments in the code for explanations.

% Example: using HadCRUT4-gl.xlsx global temperature anomaly data

% Estimate the following parameter: p, probability of an upward trend

% For Bayesian Reasoning we need to start with a prior probability
% distribution: define p as the probability of having an up-tick
% We now need to specify a distribution for p, say f(p) = 1/101
% when we treat p as a discrete distribution that can take values
% of p = 0.00,0.01,0.02,...,1.00

clc;
clear all;

pval = (0:0.01:1)'; % values of p at which to evaluate probabilities


%% load global temperature data
data = xlsread('HadCRUT4-gl.xlsx');
years = data(1:164,1);              % extract the years for which we have data
temp = data(1:164,2);               % extract the temperature data
diff = temp(2:end)-temp(1:end-1);   % calculate the change in temperature
ups = (diff>0);                     % dummy variable 1 if diff > 0, 0 otherwise
% ups = ups(124:end);                 % Select data from 1973 onwards only
n = size(ups,1);                    % number of observations

plot(years,temp);
title('Global Temperature Anamoly');
%% Frequentist approach

pbar = mean(ups);
sd_pbar = sqrt(pbar*(1-pbar)/n);

%% Bayesian Updating

% Prior 1: Normal (m1,sd1^2)
m1 = 0.5;
sd1 = 0.05;
fp_prior1 = normcdf(pval+0.005,m1,sd1);   % continuous
fp_prior1 = [0;fp_prior1(2:end)-fp_prior1(1:end-1)];   % discretised

% Prior 2: Normal (m2,sd2^2)
m2 = 0.6;
sd2 = 0.05;
fp_prior2 = normcdf(pval+0.005,m2,sd2);   % continuous
fp_prior2 = [0;fp_prior2(2:end)-fp_prior2(1:end-1)];   % discretised

% Prior 3: Uniform
fp_prior3 = ones(size(pval)).*(1/size(pval,1));

subplot(2,1,1)
plot(pval,[fp_prior1 fp_prior2 fp_prior3]);
title('Prior Distributions');
axis([0 1 0 0.15]);
subplot(2,1,2)
% Let's say we have a year with temp increase, then the probability at each pval, pval(i),
% is updated as follows
% P(pval(i)|up) = P(pval(i) and up)/P(up) = fp_prior(i)*pval(i)/P(up)
% where p(up) = sum over all j(fp_prior(j)*pval(j))

for sim = 1:n

    % Prior 1
    P_joint1 = fp_prior1.*pval.*ups(sim) + fp_prior1.*(1-pval).*(1-ups(sim));
    P_tick1  = sum(P_joint1);
    fp_post1 = P_joint1./P_tick1;   % this scales all probs such that they sum to 1
    bar(pval,fp_post1,'blue');
    axis([0 1 0 0.15]);
    hold on
    fp_prior1 = fp_post1;

    % Prior 2
    P_joint2 = fp_prior2.*pval.*ups(sim) + fp_prior2.*(1-pval).*(1-ups(sim));
    P_tick2  = sum(P_joint2);
    fp_post2 = P_joint2./P_tick2;   % this scales all probs such that they sum to 1
    bar(pval,fp_post2,'green');
    axis([0 1 0 0.15]);
    hold on
    fp_prior2 = fp_post2;

    % Prior 3
    P_joint3 = fp_prior3.*pval.*ups(sim) + fp_prior3.*(1-pval).*(1-ups(sim));
    P_tick3  = sum(P_joint3);
    fp_post3 = P_joint3./P_tick3;   % this scales all probs such that they sum to 1
    bar(pval,fp_post3,'red');
    axis([0 1 0 0.15]);
    fp_prior3 = fp_post3;

    title('Posterior Distributions');
    hold off;

    if sim < 20;

        pause(0.5);
    else
        pause(0.0003);
    end
end


disp('Bayes 1');
disp('P(p>0.5)');
disp(sum(fp_post1(51:end)));
disp('');
disp('Bayes 2');
disp('P(p>0.5)');
disp(sum(fp_post2(51:end)));
disp('');
disp('Bayes 3');
disp('P(p>0.5)');
disp(sum(fp_post3(51:end)));

2011-025B

A significant part of this paper buids on a previous survey paper:

West, K.W. (2006) Forecast Evaluation, in: Handbook of Economic Forecasting, Volume 1, edited by G. Elliott, C. W.J. Granger, A. G. Timmermann

A review of the uses and abuses of the Diebold-Mariano Test has recently been provided by the man himself, Francis Diebold.