ConNonlinOptImp

From ECLR
Revision as of 11:21, 4 December 2012 by Admin

Introduction

This section builds on the section in which we demonstrate how to implement an unconstrained optimisation (which is where the "unc" in fminunc comes from). It was called unconstrained because the optimisation routine was free to return any parameter value as the optimum: the optimal parameter value could be any number on the real line.

This is appropriate on many occasions (and is equivalent to what happens when we estimate a linear regression model by OLS) but may be inappropriate in some:

  1. Some parameters only make sense if they are constrained to a subset of the real line. Examples are variances or standard deviations, which have to be positive.
  2. You want to estimate a restricted model where the restriction cannot be implemented by a rearrangement of the explanatory variables (e.g. the dropping of a variable from the model).
  3. You want to estimate a model where some model parameters are related to each other and you need to impose this relationship.

There are basically two ways in which you can implement restrictions. The first is by imposing the restrictions inside the function which is to be optimised (minimised), the second is to use a constrained optimisation routine.

Adjusting the target function

This is the easiest way to deal with constraints, if it is at all possible. As an example we shall consider the case in which we estimate a linearised regression function

[math]y_{t}=\beta _{0}+\beta _{1}x_{1,t}+\beta _{1}^{2}x_{2,t}+u_{t}[/math]

by Maximum Likelihood. In that case we will have to specify the distribution of the error terms [math]u_t[/math], as that will determine the form of the target function. Let’s assume that the [math]u_t[/math] come from a normal distribution with mean zero and variance [math]\sigma^2[/math]. It is well known that the density of this normal distribution is

[math]f(u_{t})=\frac{1}{\sqrt{2\pi \sigma^2}}\exp\left( -\frac{u_t^2}{2\sigma^2}\right)[/math]

In practice we will have to replace [math]u_t[/math] with the estimate

[math]\widehat{u}_{t}=y_{t}-\beta _{0}-\beta _{1}x_{1,t}-\beta _{1}^{2}x_{2,t}[/math]

This means that [math]f(u_{t})[/math] is a function of all three parameters ([math]\beta _{0}[/math], [math]\beta _{1}[/math] and [math]\sigma[/math]). We want to minimise the negative log-likelihood function and the following target function is written to return just that.

function nll = judgeml(theta,data)
% this returns the negative log likelihood function for the Judge function
% input: (i) theta, parameter vector (beta0, beta1, sigma), 3rd value is standard deviation
%        (ii) data, data matrix with columns (y, x1, x2)

    % residuals: res = y - theta(1) - theta(2)*x1 - theta(2)^2*x2
    res = data(:,1) - theta(1) - theta(2)*data(:,2) - theta(2)^2*data(:,3);

    % negative log likelihood contribution of each observation
    nlogl = 0.5*log(2*pi) + 0.5*log(theta(3)^2) + (res.^2)./(2*theta(3)^2);
    nll = sum(nlogl);     % output for function is the summed negative log likelihood

end
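
The expression computed in nlogl can be traced back to the normal density. As a short worked step (the sample size [math]T[/math] is a symbol introduced here for illustration only), taking the negative logarithm of the density evaluated at [math]\widehat{u}_{t}[/math] gives

[math]-\log f(\widehat{u}_{t})=\frac{1}{2}\log (2\pi )+\frac{1}{2}\log (\sigma ^{2})+\frac{\widehat{u}_{t}^{2}}{2\sigma ^{2}}[/math]

and summing over all observations gives the value returned by the function,

[math]nll(\theta )=\sum_{t=1}^{T}\left[ \frac{1}{2}\log (2\pi )+\frac{1}{2}\log (\sigma ^{2})+\frac{\widehat{u}_{t}^{2}}{2\sigma ^{2}}\right][/math]

Here res corresponds to [math]\widehat{u}_{t}[/math] and theta(3)^2 to [math]\sigma ^{2}[/math].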

Why is this function an example of imposing a restriction through the target function? One of the parameters that needs to be estimated is the error variance [math]\sigma^2[/math]. That variance needs to be positive; besides, if it were not, the term [math]log(\sigma^2)[/math] would not be computable. This is the reason why the parameter vector is defined as [math]\theta=(\beta _{0}~\beta _{1}~\sigma)'[/math] and not [math]\theta=(\beta _{0}~\beta _{1}~\sigma^2)'[/math]. Note that the target function never actually uses [math]\sigma[/math] itself, only [math]\sigma^2[/math]. This is fortunate because [math]\sigma^2[/math] (i.e. theta(3)^2) will then always be positive, even if the nonlinear optimisation routine (e.g. fminunc or fminsearch) suggests a negative value for [math]\sigma[/math] (i.e. theta(3)).
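
As a rough illustration of how such a target function might be used, the following sketch simulates data from the model and minimises the negative log-likelihood with fminsearch. The sample size, the "true" parameter values, the seed and the starting values are all assumptions made for this example and do not come from any real data set. The objective is written inline as an anonymous function with the same formula as judgeml, so that the sketch runs on its own.

```matlab
% Hypothetical simulated data; all numbers below are illustrative assumptions
rng(1);                                   % fix the seed for reproducibility
T   = 500;                                % assumed sample size
x1  = randn(T,1);
x2  = randn(T,1);
b0  = 1; b1 = 0.5; sig = 2;               % assumed "true" parameter values
y   = b0 + b1*x1 + b1^2*x2 + sig*randn(T,1);
data = [y x1 x2];                         % column order used by judgeml: y, x1, x2

% Same formula as judgeml, written inline so the sketch is self-contained
nll = @(theta) sum(0.5*log(2*pi) + 0.5*log(theta(3)^2) + ...
    (data(:,1) - theta(1) - theta(2)*data(:,2) - theta(2)^2*data(:,3)).^2 ...
    ./ (2*theta(3)^2));

theta0   = [0; 0; 1];                     % starting values; theta(3) must be nonzero
thetahat = fminsearch(nll, theta0);       % unconstrained minimisation

% Only theta(3)^2 enters the objective, so the sign of theta(3) is arbitrary
sigmahat = abs(thetahat(3));
```

Because only theta(3)^2 enters the target function, the optimiser may well return a negative theta(3). The objective takes the same value at theta(3) and -theta(3), so the estimate of [math]\sigma[/math] can simply be reported as the absolute value.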

Using fmincon