ExampleCodeIV
Below you can find functions that, inter alia, deliver an IV estimate, perform a Hausman test for endogeneity and perform a Sargan test for instrument validity. More details on how to use these functions are provided in IV.
IVest.m
This is the function that delivers IV parameter estimates. It works for both exactly identified and over-identified cases.
function [biv,bse,res,r2] = IVest(y,x,z);
% This function performs an IV estimation
% input: y, vector with dependent variable
% x, matrix with explanatory variables (include a vector of ones if
% you want a constant)
% z, matrix with instrumental variables (at least as many cols as x)
% output: biv, estimated parameters using IV
% bse, standard errors for biv
% res, IV residuals
% r2, Rsquared
[n,kx] = size(x); % sample size - n, number of explan vars (incl constant) - kx
[n,kz] = size(z); % sample size - n, number of instrumental vars - kz
pz = z*inv(z'*z)*z'; % Projection matrix
xpzxi = inv(x'*pz*x); % this is also (Xhat'Xhat)^(-1)
biv = xpzxi*x'*pz*y; % IV estimate
res = y - x*biv; % IV residuals
ssq = res'*res/(n-kx); % Sample variance for IV residuals
s = sqrt(ssq); % Sample Standard deviation for IV res
bse = ssq*xpzxi; % Variance covariance matrix for IV estimates
bse = sqrt(diag(bse)); % Extract diagonal and take square root -> standard errors for IV estimators
ym = y - mean(y);
r2 = 1 - (res'*res)/(ym'*ym);
end
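As a rough usage sketch, the snippet below simulates a small data set with one endogenous regressor and two instruments and computes the IV estimate twice: once with the explicit formula used in IVest, and once with the backslash operator, which avoids forming the n-by-n projection matrix. The data-generating process, the seed and the names biv_ref and xhat are assumptions made for illustration only.

```matlab
% Simulated data (hypothetical DGP, for illustration only)
rng(1);                                     % fix the seed for reproducibility
n  = 500;
z1 = randn(n,1);  z2 = randn(n,1);          % two instruments
u  = randn(n,1);                            % structural error
x2 = 0.8*z1 + 0.5*z2 + 0.6*u + randn(n,1);  % regressor correlated with u
y  = 1 + 2*x2 + u;

x = [ones(n,1) x2];                         % kx = 2 (constant + x2)
z = [ones(n,1) z1 z2];                      % kz = 3, so the model is over-identified

% (a) explicit formula, as in IVest
pz      = z*inv(z'*z)*z';
biv_ref = inv(x'*pz*x)*x'*pz*y;

% (b) same estimator without the n-by-n projection matrix
xhat = z*(z\x);                             % first-stage fitted values, Pz*x
biv  = (xhat'*x)\(xhat'*y);                 % (X'PzX)^(-1) X'Pz y

res = y - x*biv;                            % IV residuals use x, not xhat
ssq = res'*res/(n-size(x,2));               % residual variance
bse = sqrt(diag(ssq*inv(xhat'*x)));         % standard errors, as in IVest

disp([biv_ref biv bse]);                    % columns: formula, backslash, s.e.
```

Both routes give the same estimate up to rounding. The backslash version is usually preferable for large n because pz has n^2 elements.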
Hausman endogeneity test
This function can be used to test whether a set of explanatory variables is endogenous.
function [teststat,pval] = hausmann_iv_exog_test(y,x1,x2,z);
% This function performs a test on variable exogeneity
% see Heij et al. p. 411
% input: y, vector with dependent variable
% x1, matrix with explanatory variables that are assumed to be exogenous
% (include a vector of ones if you want a constant)
% x2, matrix with explanatory variables that are to be tested on
% exogeneity
% z, matrix with instrumental variables (at least as many cols as x)
% output: teststat, calculated test statistic
% pval, p-value
x = [x1 x2];
xxi = inv(x'*x);
b = xxi*x'*y;
res = y - x*b;
zzi = inv(z'*z);
gam = zzi*z'*x2; % This works even if we have more than one element in x2
% we get as many columns of gam as we have elements in x2
vhat = x2 - z*gam;
[b,bse,res,n,rss,r2] = OLSest(res,[x vhat],0);
teststat = size(res,1)*r2;
pval = 1 - chi2cdf(teststat,size(x2,2));
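The function above relies on OLSest, which is not listed on this page. The following is a minimal sketch of the same three steps written inline, under two assumptions: that the r2 returned by OLSest is the usual R-squared of the auxiliary regression, and that the regressors contain a constant (so the OLS residuals have mean zero). The simulated data and the names e, w and eaux are hypothetical. The p-value is computed with gammainc from base MATLAB, which is mathematically equivalent to 1 - chi2cdf.

```matlab
% Simulated data (hypothetical DGP, for illustration only)
rng(2);
n  = 500;
z1 = randn(n,1);  z2 = randn(n,1);
u  = randn(n,1);
x2 = 0.8*z1 + 0.5*z2 + 0.6*u + randn(n,1);  % endogenous by construction
y  = 1 + 2*x2 + u;

x1 = ones(n,1);                   % regressors assumed exogenous (constant only)
z  = [x1 z1 z2];                  % instruments include the exogenous regressors
x  = [x1 x2];

e    = y - x*(x\y);               % step 1: OLS residuals
vhat = x2 - z*(z\x2);             % step 2: first-stage residuals of x2 on z
w    = [x vhat];                  % step 3: auxiliary regression of e on [x vhat]
eaux = e - w*(w\e);
r2   = 1 - (eaux'*eaux)/(e'*e);   % e has mean zero because x contains a constant

teststat = n*r2;                  % test statistic, n times R-squared
df       = size(x2,2);            % one degree of freedom per tested variable
pval     = gammainc(teststat/2, df/2, 'upper');  % equals 1 - chi2cdf(teststat,df)
fprintf('Hausman statistic %.3f, p-value %.4f\n', teststat, pval);
```

A small p-value points towards rejecting the null hypothesis that x2 is exogenous.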
Sargan test on instrument validity
This function is used to check whether a chosen set of instruments is indeed exogenous. The test is only applicable to over-identified cases.
function [teststat,pval] = sargan_iv_validity_test(resiv,x,z);
% This function performs a test on instrument validity
% see Heij et al. p. 412
% input: resiv, residuals from the IV regression
% x, matrix with explanatory variables (include a vector of ones if
% you want a constant)
% z, matrix with instrumental variables (at least as many cols as x)
% output: teststat, calculated test statistic
% pval, p-value
if (size(z,2)-size(x,2))>0
[b,bse,res,n,rss,r2] = OLSest(resiv,z,0);
teststat = size(resiv,1)*r2;
pval = 1 - chi2cdf(teststat,(size(z,2)-size(x,2)));
else
teststat = 'NA';
pval = 'NA';
disp('Sargan test not applicable here as the model is not overidentified');
end
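The Sargan steps can be sketched inline in the same way, again without OLSest. This is an illustration under stated assumptions rather than a replacement for the function above: the data are simulated with instruments that are valid by construction, both x and z contain a constant (so the IV residuals have mean zero), and the names xhat, eaux and df are hypothetical.

```matlab
% Simulated data (hypothetical DGP, for illustration only)
rng(3);
n  = 500;
z1 = randn(n,1);  z2 = randn(n,1);
u  = randn(n,1);
x2 = 0.8*z1 + 0.5*z2 + 0.6*u + randn(n,1);
y  = 1 + 2*x2 + u;                % instruments are unrelated to u

x  = [ones(n,1) x2];
z  = [ones(n,1) z1 z2];
df = size(z,2) - size(x,2);       % number of over-identifying restrictions

xhat  = z*(z\x);                  % first-stage fitted values
biv   = (xhat'*x)\(xhat'*y);      % IV estimate
resiv = y - x*biv;                % IV residuals, as returned by IVest

eaux = resiv - z*(z\resiv);       % auxiliary regression of resiv on z
r2   = 1 - (eaux'*eaux)/(resiv'*resiv);  % resiv has mean zero here

teststat = n*r2;                  % test statistic, n times R-squared
pval     = gammainc(teststat/2, df/2, 'upper');  % equals 1 - chi2cdf(teststat,df)
fprintf('Sargan statistic %.3f, p-value %.4f\n', teststat, pval);
```

With df equal to zero (exact identification) the statistic is not defined, which is why the function above returns 'NA' in that case.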