Program Flow and Logicals
Revision as of 20:41, 25 September 2012
Preliminaries
Very often in life you have to repeat the same operation many times (move your right and left legs in turn while walking or running) or behave differently depending on external conditions (whether or not there is a bus at the bus stop). Quite often the two are combined: if there is a bus at the bus stop, you run to catch it; otherwise you walk, or stop and enjoy the usual Manchester weather. The same is true for programming. Quite often you want to repeat the same operation many times, or to change the way you process your data depending on some conditions.
We start with conditional statements. They execute different pieces of code depending on whether condition is true or false. There are several ways to formulate this. The shortest
if condition
statement1;
statement2;
...
end
executes statement1; statement2; ... only if condition is satisfied. condition can be any expression that evaluates to a non-zero value (true) or to 0 (false), say i>0, size(y,1)~=40, or 5-i. The last condition is true for every i except i=5. A slightly longer version
if condition
statement1;
statement2;
...
else
statement1a;
statement2a;
...
end
runs statement1; statement2; ... if condition is true and statement1a; statement2a; ... otherwise. The most general specification is
if condition1
statement1;
statement2;
...
elseif condition2
statement1a;
statement2a;
...
...
...
elseif conditionN
statement1b;
statement2b;
...
else
statement1c;
statement2c;
...
end
MATLAB checks the conditions from top to bottom and runs only the block that belongs to the first true one, so the outcome is easiest to predict when condition1, condition2, …, conditionN are mutually disjoint. As an example, think about different actions depending on your final grade. Condition 1: grade<30; Condition 2: (grade>=30)&&(grade<40); Condition 3: (grade>=40)&&(grade<50); etc.
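The grade example can be sketched as follows. The value of grade, the variable band and its labels are made up for illustration, and everything from 50 upwards is lumped into the final else branch because the text does not list the remaining thresholds.

```matlab
grade = 45;                            % hypothetical final grade
if grade < 30
    band = 'below 30';
elseif (grade >= 30) && (grade < 40)
    band = '[30,40)';
elseif (grade >= 40) && (grade < 50)
    band = '[40,50)';
else
    band = '50 or above';              % assumed catch-all branch
end
disp(band)                             % displays [40,50)
```

Because only the first true branch runs, the lower bounds in the elseif lines are redundant here; keeping them makes each condition readable on its own.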
MATLAB has two statements that create a loop. The first is the unconditional loop:
for CounterVariable=[range of values]
statement1;
statement2;
...
end
It runs at most as many times as there are elements in [range of values]. If the range of values is empty, the loop does not run at all. Say, if you define the range 10:1, MATLAB creates an empty range, so the loop is never executed. If you define the range 1:3:10, MATLAB creates a range of four values [1 4 7 10], and the loop runs four times. During the first iteration CounterVariable=1, during the second CounterVariable=4, etc. After the end of the loop CounterVariable=10. Please note that it is very unwise to modify the counter inside the loop: any modification is overwritten at the start of the next iteration. Please also note that the values in the range could be anything, including filenames or matrices from a cell vector.
There are two commands that can modify the execution of the loop. continue ends the current iteration: once it is executed, the rest of the loop body is skipped and the loop moves on to the next iteration. break stops the execution of the loop altogether, and your program continues with the first statement after the loop. These commands are normally used inside if statements. For example, if CounterVariable == 10, continue; end skips the rest of the loop body for CounterVariable = 10.
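A small sketch of how continue and break interact with the range 1:3:10. The variable names k and visited are illustrative only.

```matlab
visited = [];                  % counter values that reach the end of the loop body
for k = 1:3:10                 % the range is [1 4 7 10]
    if k == 4, continue; end   % skip the rest of this iteration
    if k == 10, break; end     % leave the loop altogether
    visited(end+1) = k;        % growing a vector is fine for a tiny demo
end
disp(visited)                  % displays 1 and 7
disp(isempty(10:1))            % displays 1: a loop over 10:1 would never run
```

After the loop k is still 10, because break leaves the counter at the value it had when the loop stopped.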
The second is the conditional loop:
while condition
statement1;
statement2;
...
end
This version of the loop executes the statements as long as condition is true. If condition is always true, your loop runs forever.
for ... end loop
A standard application of the for ... end loop is the reconstruction of an AR(p) series once the AR(p) coefficients and the vector of error terms are known: [math]y_t=\phi_0+\sum_{i=1}^p \phi_i y_{t-i}+e_t.[/math] For simplicity, we assume that [math]p=1[/math]. Also, to be able to compute [math]y_1[/math], we need to provide [math]y_0[/math]. Since we don’t know [math]y_0[/math], the best guess for [math]y_0[/math] is [math]E(y_0)[/math]. For a stationary AR(1) process, that is, for the case [math]|\phi_1|<1[/math], [math]E(y_0)=\phi_0/(1-\phi_1)[/math]. Thus, knowing [math]y_0[/math] and [math]e_t[/math] for [math]t=1,\ldots,T[/math], we can reconstruct [math]y_t,\ t=1,\ldots,T[/math]:
[math]\begin{aligned} y_1=&\phi_0+\phi_1 y_0+e_1\\ y_2=&\phi_0+\phi_1 y_1+e_2\\ &\ldots\\ y_t=&\phi_0+\phi_1 y_{t-1}+e_t\\ &\ldots\\ y_T=&\phi_0+\phi_1 y_{T-1}+e_T\end{aligned}[/math]
Of course, if you are patient enough and [math]T[/math] is not very large, you can create an m-file with [math]T[/math] lines in it. However, when [math]T[/math] is not known in advance, this approach does not work. Fortunately, there is a better alternative for this type of operation. All these computations can be summarized using the following algorithm:
- Find the length of the vector of error terms e: T=size(e,1)
- Initialize a vector y of the same length as vector e: y=zeros(T,1)
- Compute y(1)=phi0+phi1*(phi0/(1-phi1))+e(1). Please remember, we assume that [math]y_0=E(y_0)=\phi_0/(1-\phi_1)[/math]
- Compute y(i)=phi0+phi1*y(i-1)+e(i) for [math]i=2[/math]
- Repeat the previous step for [math]i=3,\ldots,T[/math]
When vector e is known in advance, the MATLAB code is
T=size(e,1);%number of observations
y=zeros(T,1);%preallocating the series
y0=phi0/(1-phi1);%unconditional mean used as the starting value
y(1)=phi0+phi1*y0+e(1);
for i=2:T
    y(i)=phi0+phi1*y(i-1)+e(i);
end
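To try the recursion on numbers that can be checked by hand, here is a self-contained sketch. The coefficient values and the three error terms are assumptions chosen for easy arithmetic; they do not come from the text.

```matlab
phi0 = 1; phi1 = 0.5;             % illustrative coefficients (assumed)
e = [1; 0; 0];                    % illustrative error terms (assumed)
T = size(e,1);
y = zeros(T,1);
y0 = phi0/(1-phi1);               % E(y_0) = 1/(1-0.5) = 2
y(1) = phi0 + phi1*y0 + e(1);     % 1 + 0.5*2 + 1 = 3
for i = 2:T
    y(i) = phi0 + phi1*y(i-1) + e(i);
end
disp(y')                          % displays 3, 2.5 and 2.25
```

After the single shock in the first period the series decays back towards its mean of 2, as expected for a stationary AR(1) process.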
However, if phi1=1, [math]E(y_t)[/math] is not constant and the formula used in the code breaks down: y0=phi0/(1-phi1) divides by zero, so the code produces a series y of [math]\pm\infty[/math] if phi0~=0, or a series of not-a-number values NaN if phi0=0 [1].
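A quick check of the two failure modes for phi1=1. The variable names and the error term 0.3 are made up for illustration.

```matlab
phi1 = 1;
y0_drift   = 1/(1-phi1);                  % phi0 = 1: 1/0 gives Inf
y0_nodrift = 0/(1-phi1);                  % phi0 = 0: 0/0 gives NaN
y1_drift   = 1 + phi1*y0_drift + 0.3;     % Inf plus finite numbers stays Inf
y1_nodrift = 0 + phi1*y0_nodrift + 0.3;   % NaN propagates
disp([y0_drift y0_nodrift y1_drift y1_nodrift])
```

Since every later value is computed from the previous one, the whole series inherits Inf or NaN from the starting value.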
if else end or if end
To avoid these inconveniences, we have to consider separately two cases:
- AR(1) process is stationary, i.e. [math]|\phi_1|<1[/math]
- AR(1) process is nonstationary, i.e. [math]|\phi_1|\ge 1[/math]
For the latter, we have to acknowledge the fact that [math]E(y_t)=\mu_t[/math], i.e. the unconditional expectation is a function of time. In this case we have to set [math]E(y_0)[/math] to some value. A standard assumption for nonstationary series is [math]E(y_0)=0[/math].
The algorithm in this case would look like:
- Find the length of the vector of error terms e: T=size(e,1)
- Initialize a vector y of the same length as vector e: y=zeros(T,1)
- Check whether abs(phi1)<1. If this statement is true, then y0=phi0/(1-phi1). Else, y0=0. Please remember, we set [math]y_0=E(y_0)[/math].
- Compute y(1)=phi0+phi1*y0+e(1).
- Compute y(i)=phi0+phi1*y(i-1)+e(i) for [math]i=2[/math]
- Repeat the previous step for [math]i=3,\ldots,T[/math]
Assuming vector e is known in advance, the MATLAB code is
T=size(e,1);
y=zeros(T,1);
if abs(phi1)<1
    y0=phi0/(1-phi1);%stationary case: start from the unconditional mean
else
    y0=0;%nonstationary case: start from zero
end
y(1)=phi0+phi1*y0+e(1);
for i=2:T
    y(i)=phi0+phi1*y(i-1)+e(i);
end
If you don’t like the word else, you can skip it:
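T=size(e,1);
y=zeros(T,1);
y0=0;%default starting value, kept if the process is nonstationary
if abs(phi1)<1
    y0=phi0/(1-phi1);%overwritten only in the stationary case
end
y(1)=phi0+phi1*y0+e(1);
for i=2:T
    y(i)=phi0+phi1*y(i-1)+e(i);
end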
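One way to convince yourself that the guarded code handles the nonstationary case: with phi0=0 and phi1=1 the process is a random walk started at zero, so y should equal the running sum of the error terms. The numbers below are assumptions for illustration.

```matlab
phi0 = 0; phi1 = 1;               % random walk without drift (assumed values)
e = [1; 2; 3];                    % illustrative error terms (assumed)
T = size(e,1);
y = zeros(T,1);
y0 = 0;                           % default for the nonstationary case
if abs(phi1) < 1
    y0 = phi0/(1-phi1);           % not executed here because abs(phi1) = 1
end
y(1) = phi0 + phi1*y0 + e(1);
for i = 2:T
    y(i) = phi0 + phi1*y(i-1) + e(i);
end
disp(isequal(y, cumsum(e)))       % displays 1: y is [1; 3; 6]
```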
while end loop
An alternative way of running the same code is to use a conditional loop (purely for demonstration purposes). Usually the conditional loop is used when the number of iterations is not known in advance.
- Find the length of the vector of error terms e: T=size(e,1)
- Initialize a vector y of the same length as vector e: y=zeros(T,1)
- Check whether abs(phi1)<1. If this statement is true, then y0=phi0/(1-phi1). Else, y0=0. Please remember, we set [math]y_0=E(y_0)[/math].
- Compute y(1)=phi0+phi1*y0+e(1).
- Compute y(i)=phi0+phi1*y(i-1)+e(i) for [math]i=2[/math]
- Increase i by 1, i.e. [math]i=i+1[/math] (please note, in programming this is an important statement)
- Repeat the previous two steps while [math]i\le T[/math]
Assuming vector e is known in advance, the MATLAB code is
T=size(e,1);
y=zeros(T,1);
if abs(phi1)<1
    y0=phi0/(1-phi1);
else
    y0=0;
end
y(1)=phi0+phi1*y0+e(1);
i=2;%the counter has to be initialized by hand
while i<=T
    y(i)=phi0+phi1*y(i-1)+e(i);
    i=i+1;%without this line the loop would never stop
end
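The AR(1) example has a known number of iterations, so the while loop is not really needed there. A minimal sketch of a case where the number of iterations is not fixed in advance: halving a number until it drops to a tolerance or below. The names x, tol and n and their values are assumptions for illustration.

```matlab
x = 1;              % starting value (assumed)
tol = 1e-3;         % stopping threshold (assumed)
n = 0;              % number of halvings performed
while x > tol
    x = x/2;
    n = n + 1;
end
disp(n)             % displays 10: 2^-10 is the first power of 1/2 not above 1e-3
```

The loop stops because x eventually violates the condition; if the body did not change x, the condition would stay true and the loop would run forever.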
Imperfect substitutes of the above
MATLAB has two powerful tools that make a programmer’s life much easier and the use of loops and if statements less frequent. In addition, quite often they make the code run faster. In particular,
- Logical expressions work not only on scalars, but also on vectors, matrices and, in general, on n-dimensional arrays.
- Subvectors/submatrices can be extracted using logical 0-1 arrays.
Irrelevant but useful example
Typing a=1:5 in the MATLAB command window creates a [math]1\times5[/math] row vector a with values [math][1\ 2\ 3\ 4\ 5][/math]. The logical expression ind=(a>3.5) creates the so-called logical vector ind with values [math][0\ 0\ 0\ 1\ 1][/math], i.e. it is 1 if the corresponding element is greater than 3.5 and 0 otherwise. Now, typing b=a(ind) generates a [math]1\times2[/math] subvector b with values [math][4\ 5][/math]. You can also change specific values of a vector or matrix: the command a(ind)=a(ind)*2 doubles the last two values of the original vector a. As a result, the vector a becomes [math][1\ 2\ 3\ 8\ 10][/math].
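The same steps as a short script that can be pasted into the command window:

```matlab
a = 1:5;                 % row vector [1 2 3 4 5]
ind = (a > 3.5);         % logical vector [0 0 0 1 1]
b = a(ind);              % elements of a where ind is true: [4 5]
a(ind) = a(ind)*2;       % doubles only the selected elements
disp(b)                  % displays 4 and 5
disp(a)                  % displays 1 2 3 8 10
disp(class(ind))         % displays logical
```

Note that ind has class logical, not double. Indexing with the numeric vector [0 0 0 1 1] instead would be an error, since 0 is not a valid position.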
Slightly less irrelevant example
On some occasions you would like to modify the matrix of interest. Say, in some surveys “no answer” is coded as 999. Once you import the whole dataset into X, you might want to replace these values with, say, NaN. This can be done for the whole matrix at once: X(X==999)=NaN.
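A small sketch of the recoding on a made-up 2-by-2 matrix. The data and the names nMissing and valid are assumptions for illustration.

```matlab
X = [1 999; 999 4];             % hypothetical survey data, 999 = "no answer"
X(X == 999) = NaN;              % recode every 999 in one statement
nMissing = sum(isnan(X(:)));    % count the recoded entries
valid = X(~isnan(X));           % remaining answers, returned as a column vector
disp(nMissing)                  % displays 2
disp(valid')                    % displays 1 and 4
```

To find the recoded entries afterwards, use isnan(X) rather than X==NaN: a comparison with NaN is always false, even for NaN itself.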
Relevant example
To demonstrate these capabilities in a more relevant environment, let’s run a very simple example. Assume that we have a [math]T\times1[/math] vector of returns r and we want to
- Compute the number of positive, negative and zero returns
- Compute the means of positive and negative returns
The algorithm for this is quite straightforward:
- Find the length T of vector r
- Initialize three counter variables, Tplus=0, Tzero=0, Tminus=0, and vectors rplus=zeros(T,1), rminus=zeros(T,1) (since we don’t know in advance how many negative and positive returns we will observe)
- Check whether r(i) is greater than, smaller than or equal to 0 for i=1
- If r(i)>0, add 1 to Tplus and set rplus(Tplus)=r(i);
- Else if r(i)<0, add 1 to Tminus and set rminus(Tminus)=r(i);
- Else add 1 to Tzero
- Repeat the check and the update for [math]i=2,\ldots,T[/math]
- Remove excessive zeros from rplus and rminus: rplus=rplus(1:Tplus); rminus=rminus(1:Tminus);
- Compute the means of rminus and rplus. The numbers of positive, negative and zero returns are stored in Tplus, Tminus, Tzero
MATLAB translation:
T=size(r,1);
Tplus=0;Tminus=0;Tzero=0;
rplus=zeros(T,1);rminus=zeros(T,1);
for i=1:T
    if r(i)>0
        Tplus=Tplus+1;%increasing Tplus by one if return is positive
        rplus(Tplus)=r(i);%storing positive return in the proper subvector
    elseif r(i)<0
        Tminus=Tminus+1;%increasing Tminus by one if return is negative
        rminus(Tminus)=r(i);%storing negative return in the proper subvector
    else
        Tzero=Tzero+1;%increasing Tzero by one if return is neither positive nor negative
    end
end
rplus=rplus(1:Tplus);%removing excessive zeros from a subvector of positive returns
rminus=rminus(1:Tminus);%removing excessive zeros from a subvector of negative returns
meanplus=mean(rplus);%computing mean of positive returns
meanminus=sum(rminus)/Tminus;%computing mean of negative returns
Using MATLAB capabilities mentioned in this section, the algorithm can be reduced to:
- Construct a vector indplus that has 1 for positive returns and 0 otherwise
- Construct a vector indminus that has 1 for negative returns and 0 otherwise
- Assign to Tplus the sum of the elements of indplus. This is the number of positive returns
- Assign to Tminus the sum of the elements of indminus. This is the number of negative returns
- Compute Tzero, which is T-Tplus-Tminus
- Construct a vector of positive returns rplus=r(indplus) and compute its mean
- Construct a vector of negative returns rminus=r(indminus) and compute its mean
MATLAB implementation:
T=size(r,1);
indplus = r>0;%constructing an indicator vector with 1s if r(i)>0, 0 otherwise
indminus = r<0;%constructing an indicator vector with 1s if r(i)<0, 0 otherwise
Tplus=sum(indplus);%computing a number of positive returns
Tminus=sum(indminus);%computing a number of negative returns
Tzero=T-Tplus-Tminus;%computing a number of zero returns
rplus=r(indplus);%constructing a vector of positive returns
rminus=r(indminus);%constructing a vector of negative returns
meanplus=sum(rplus)/Tplus; %computing mean of positive returns
meanminus=mean(rminus); %computing mean of negative returns
Or, a slightly shorter version of the same thing
T=size(r,1);
rplus = r(r>0);%constructing a vector of positive returns
rminus = r(r<0);%constructing a vector of negative returns
Tplus=size(rplus,1);%computing a number of positive returns
Tminus=size(rminus,1);%computing a number of negative returns
Tzero=T-Tplus-Tminus;%computing a number of zero returns
meanplus=sum(rplus)/Tplus; %computing mean of positive returns
meanminus=mean(rminus); %computing mean of negative returns
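To check that the logical-indexing version agrees with the loop, both can be run on a small made-up vector. The returns below and the names TplusLoop and TminusLoop are assumptions for illustration.

```matlab
r = [2; -1; 0; 4; -3];            % hypothetical returns, in percent
T = size(r,1);

% logical-indexing version
rplus  = r(r>0);                  % [2; 4]
rminus = r(r<0);                  % [-1; -3]
Tplus  = size(rplus,1);
Tminus = size(rminus,1);
Tzero  = T - Tplus - Tminus;
meanplus  = mean(rplus);
meanminus = mean(rminus);

% loop version of the counts, for comparison
TplusLoop = 0; TminusLoop = 0;
for i = 1:T
    if r(i) > 0
        TplusLoop = TplusLoop + 1;
    elseif r(i) < 0
        TminusLoop = TminusLoop + 1;
    end
end

disp([Tplus Tminus Tzero meanplus meanminus])          % displays 2 2 1 3 -2
disp(isequal([Tplus Tminus], [TplusLoop TminusLoop]))  % displays 1
```

In both versions a NaN in r is neither positive nor negative, so it ends up counted in Tzero; screen r with isnan first if that matters for your data.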
Shorter code is less exposed to errors and easier to read (at least after some practice).
Footnotes
[1] There are two special numerical values in MATLAB. One is infinity, Inf, and the other is not a number, NaN. A variable becomes Inf if the number is too big in absolute value (beyond [math]\approx \pm 1.8\times10^{308}[/math]). Infinity is also generated by expressions like [math]x/0[/math], where [math]x\ne0[/math]. In subsequent additions and multiplications an infinite value can only change sign or become not a number. Not a number appears when the result is indeterminate, as in [math]0/0[/math], [math]\infty-\infty[/math] and the like. Any arithmetic operation involving NaN results in NaN.