Linear Combinations
In this section, some properties of linear functions of random variables [math]X[/math] and [math]Y[/math] are considered. In a previous Section, a new random variable was defined as a function of two other variables. Here, we define a random variable [math]V[/math] as a function of random variables [math]X[/math] and [math]Y[/math],
[math]V=g\left( X,Y\right) ,[/math]
with no restriction on the nature of the function or transformation [math]g[/math]. In this section, the function [math]g[/math] is restricted to be a linear function of [math]X[/math] and [math]Y[/math]:
[math]V=aX+bY+c,[/math]
where [math]a,b[/math] and [math]c[/math] are constants. [math]V[/math] is also called a linear combination of [math]X[/math] and [math]Y[/math].
The properties developed in this section are specific to linear functions: they do not hold in general for nonlinear functions or transformations.
The Expected Value of a Linear Combination
This result is easy to remember: it amounts to saying that the expected value of a linear combination is the linear combination of the expected values.
Even more simply, the expected value of a sum is a sum of expected values.
If [math]V=aX+bY+c[/math], where [math]a,b,c[/math] are constants, then
[math]E\left[ V\right] =E\left[ aX+bY+c\right] =aE\left[ X\right] +bE\left[ Y\right] +c.[/math]
This result is a natural generalisation of that given in this previous Section.
Proof (discrete random variables case only). Using the result in this previous Section,
[math]\begin{aligned} E\left[ V\right] &=&E\left[ g\left( X,Y\right) \right] \\ &=&E\left[ aX+bY+c\right] \\ &=&\sum_{x}\sum_{y}\left( ax+by+c\right) p\left( x,y\right) . \end{aligned}[/math]
From this point on, the proof just involves manipulation of the summation signs:
[math]\begin{aligned} \sum_{x}\sum_{y}\left( ax+by+c\right) p\left( x,y\right)&=&a\sum_{x}\sum_{y}xp\left( x,y\right)+b\sum_{x}\sum_{y}yp\left(x,y\right) +c\sum_{x}\sum_{y}p\left( x,y\right) \\ &=&a\sum_{x}\left[ x\left( \sum_{y}p\left( x,y\right) \right) \right]+b\sum_{y}\left[ y\left( \sum_{x}p\left( x,y\right) \right) \right] +c \\ &=&a\sum_{x}\left[ xp_{X}\left( x\right) \right] +b\sum_{y}\left[yp_{Y}\left( y\right) \right] +c \\ &=&aE\left[ X\right] +bE\left[ Y\right] +c. \end{aligned}[/math]
Notice the steps used:
- [math]\sum\limits_{y}xp\left( x,y\right) =x\sum\limits_{y}p\left(x,y\right) =xp_{X}\left( x\right) [/math] and [math]\sum\limits_{x}yp\left( x,y\right)=y\sum\limits_{x}p\left( x,y\right) =yp_{Y}\left( y\right) [/math], as [math]x[/math] is constant with respect to the [math]y[/math] summation and [math]y[/math] is constant with respect to the [math]x[/math] summation;
- we used [math]\sum\limits_{x}\sum\limits_{y}p\left( x,y\right) =1[/math];
- we used the definitions of the marginal distributions from a previous Section;
- we used the definitions of the expected value for discrete random variables.
Notice that nothing need be known about the joint probability distribution [math]p\left( x,y\right) [/math] of [math]X[/math] and [math]Y[/math]. The result is also valid for continuous random variables, where again nothing need be known about the joint distribution.
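As an illustration of the result, the following sketch (a minimal check in Python; the joint probabilities below are invented for the example) computes [math]E\left[ aX+bY+c\right] [/math] directly as a double sum over a small discrete joint distribution, and compares it with [math]aE\left[ X\right] +bE\left[ Y\right] +c[/math] computed from the marginals.

<syntaxhighlight lang="python">
import numpy as np

# A small, invented joint pmf p(x, y); rows index x, columns index y.
x_vals = np.array([0.0, 1.0, 2.0])
y_vals = np.array([0.0, 1.0])
p = np.array([[0.10, 0.20],
              [0.30, 0.15],
              [0.15, 0.10]])   # entries sum to 1

a, b, c = 2.0, -3.0, 1.0

# Direct double sum: E[aX + bY + c] = sum_x sum_y (ax + by + c) p(x, y).
direct = sum((a * x + b * y + c) * p[i, j]
             for i, x in enumerate(x_vals)
             for j, y in enumerate(y_vals))

# Via the marginals: a E[X] + b E[Y] + c.
p_x = p.sum(axis=1)            # marginal pmf of X
p_y = p.sum(axis=0)            # marginal pmf of Y
via_marginals = a * (x_vals @ p_x) + b * (y_vals @ p_y) + c

print(direct, via_marginals)   # both give the same value
</syntaxhighlight>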
Examples
In a previous example we defined
[math]T=W+H,[/math]
and had [math]E\left[ H\right] =1.3[/math], [math]E\left[ W\right] =0.9[/math], giving
[math]\begin{aligned} E\left[ T\right] &=&E\left[ W\right] +E\left[ H\right] \\ &=&0.9+1.3 \\ &=&2.2 \end{aligned}[/math]
confirming the result obtained earlier.
Suppose that the random variables [math]X[/math] and [math]Y[/math] have [math]E\left[ X\right]=0.5[/math] and [math]E\left[ Y\right] =3.5[/math], and let
[math]V=5X-Y.[/math]
Then,
[math]\begin{aligned} E\left[ V\right] &=&5E\left[ X\right] -E\left[ Y\right] \\ &=&\left( 5\right) \left( 0.5\right) -3.5 \\ &=&-1. \end{aligned}[/math]
Generalisation
Let [math]X_{1},...,X_{n}[/math] be random variables and [math]a_{1},....,a_{n},c[/math] be constants, and define the random variable [math]W[/math] by
[math]\begin{aligned} W &=&a_{1}X_{1}+...+a_{n}X_{n}+c \\ &=&\sum_{i=1}^{n}a_{i}X_{i}+c.\end{aligned}[/math]
Then,
[math]E\left[ W\right] =\sum_{i=1}^{n}a_{i}E\left[ X_{i}\right] +c.[/math]
The proof uses the linear combination result for two variables repeatedly:
[math]\begin{aligned} E\left[ W\right] &=&a_{1}E\left[ X_{1}\right] +E\left[a_{2}X_{2}+...+a_{n}X_{n}+c\right] \\ &=&a_{1}E\left[ X_{1}\right] +a_{2}E\left[ X_{2}\right] +E\left[a_{3}X_{3}+...+a_{n}X_{n}+c\right] \\ &=&... \\ &=&a_{1}E\left[ X_{1}\right] +a_{2}E\left[ X_{2}\right] +...+a_{n}E\left[X_{n}\right] +c.\end{aligned}[/math]
Examples
Let [math]E\left[ X_{1}\right] =2,E\left[ X_{2}\right] =-1,E\left[ X_{3}\right] =3[/math] and [math]W=2X_{1}+5X_{2}-3X_{3}+4[/math]. Then,
[math]\begin{aligned} E\left[ W\right] &=&E\left[ 2X_{1}+5X_{2}-3X_{3}+4\right] \\ &=&2E\left[ X_{1}\right] +5E\left[ X_{2}\right] -3E\left[ X_{3}\right] +4 \\ &=&\left( 2\right) \left( 2\right) +\left( 5\right) \left( -1\right) -\left(3\right) \left( 3\right) +4 \\ &=&-6.\end{aligned}[/math]
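The same calculation can be written compactly as a dot product; the short sketch below (using NumPy, purely as a check) reproduces the value [math]-6[/math] from this example.

<syntaxhighlight lang="python">
import numpy as np

# Expected values and coefficients taken from the example above.
e_x = np.array([2.0, -1.0, 3.0])   # E[X1], E[X2], E[X3]
a = np.array([2.0, 5.0, -3.0])     # a1, a2, a3
c = 4.0

e_w = a @ e_x + c                  # E[W] = sum_i a_i E[X_i] + c
print(e_w)                         # -6.0
</syntaxhighlight>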
The Variance of a Linear Combination
Two Variable Case
Let [math]V[/math] be the random variable defined above as:
[math]V=aX+bY+c.[/math]
What is [math]var\left[ V\right] ?[/math] To find this, it is helpful to use notation that will simplify the proof. By definition,
[math]var\left[ V\right] =E\left[ \left( V-E\left[ V\right] \right) ^{2}\right] .[/math]
Put
[math]\tilde{V}=V-E\left[ V\right][/math]
so that
[math]var\left[ V\right] =E\left[ \tilde{V}^{2}\right].[/math]
We saw that
[math]E\left[ V\right] =aE\left[ X\right] +bE\left[ Y\right] +c,[/math]
so that
[math]\begin{aligned} \tilde{V} &=&\left( aX+bY+c\right) -\left( aE\left[ X\right] +bE\left[ Y\right] +c\right) \\ &=&a\left( X-E\left[ X\right] \right) +b\left( Y-E\left[ Y\right] \right) \\ &=&a\tilde{X}+b\tilde{Y}\end{aligned}[/math]
and then
[math]var\left[ V\right] =E\left[ \tilde{V}^{2}\right] =E\left[ \left( a\tilde{X}+b\tilde{Y}\right) ^{2}\right].[/math]
Notice that this does not depend on the constant [math]c[/math].
To make further progress, recall that in the current notation,
[math]\begin{aligned} var\left[ X\right] &=&E\left[ \tilde{X}^{2}\right] ,\;\;\;var\left[ Y\right] =E\left[ \tilde{Y}^{2}\right] , \\ \limfunc{cov}\left[ X,Y\right] &=&E\left[ \left( X-E\left[ X\right] \right)\left( Y-E\left[ Y\right] \right) \right] \\ &=&E\left[ \tilde{X}\tilde{Y}\right] .\end{aligned}[/math]
Then,
[math]\begin{aligned} var\left[ V\right] &=&E\left[ \left( a\tilde{X}+b\tilde{Y}\right)^{2}\right] \\ &=&E\left[ a^{2}\tilde{X}^{2}+2ab\tilde{X}\tilde{Y}+b^{2}\tilde{Y}^{2}\right]\\ &=&a^{2}E\left[ \tilde{X}^{2}\right] +2abE\left[ \tilde{X}\tilde{Y}\right]+b^{2}E\left[ \tilde{Y}^{2}\right] \\ &=&a^{2}var\left[ X\right] +2ab\limfunc{cov}\left[ X,Y\right]+b^{2}var\left[ Y\right] ,\end{aligned}[/math]
using the linear combination result for expected values.
Summarising,
if [math]V=aX+bY+c[/math], then
[math]var\left[ V\right] =a^{2}var\left[ X\right] +2ab\limfunc{cov}\left[ X,Y\right] +b^{2}var\left[ Y\right] .[/math]
If [math]X[/math] and [math]Y[/math] are uncorrelated, so that [math]\limfunc{cov}\left[X,Y\right] =0[/math],
[math]var\left[ V\right] =a^{2}var\left[ X\right] +b^{2}var\left[ Y\right] .[/math]
If [math]X[/math] and [math]Y[/math] are independent, the same result holds.
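As a numerical check of the two-variable variance formula, here is a small Monte Carlo sketch (the means, variances and covariance used are arbitrary illustration values, not taken from the text): the sample variance of [math]V=aX+bY+c[/math] should be close to [math]a^{2}var\left[ X\right] +2ab\limfunc{cov}\left[ X,Y\right] +b^{2}var\left[ Y\right] [/math].

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(0)

# Simulate a correlated pair (X, Y); the means, variances and covariance
# below are arbitrary illustration values.
mean = [1.0, 2.0]
cov = [[2.0, 0.6],
       [0.6, 1.5]]
x, y = rng.multivariate_normal(mean, cov, size=200_000).T

a, b, c = 3.0, -2.0, 5.0
v = a * x + b * y + c

# Sample variance of V versus a^2 var[X] + 2ab cov[X,Y] + b^2 var[Y].
formula = a**2 * cov[0][0] + 2 * a * b * cov[0][1] + b**2 * cov[1][1]
print(v.var(), formula)     # the two values should be close
</syntaxhighlight>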
Examples
Suppose that [math]X[/math] and [math]Y[/math] are independent random variables with [math]var\left[ X\right] =0.25[/math], [math]var\left[ Y\right] =2.5[/math]. If
[math]V=X+Y,[/math]
then
[math]\begin{aligned} var\left[ V\right] &=&var\left[ X\right] +var\left[ Y\right] \\ &=&0.25+2.5 \\ &=&2.75. \end{aligned}[/math]
A previous example used the random variables [math]W[/math] and [math]H[/math], and defined [math]T[/math] by
[math]T=W+H.[/math]
There we found that [math]var\left[ W\right] =0.49[/math] and [math]var\left[ H\right] =0.61[/math], and also that [math]\limfunc{cov}\left[ W,H\right] =0.03[/math]. Then,
[math]\begin{aligned} var\left[ T\right] &=&var\left[ W+H\right] \\ &=&var\left[ W\right] +2\limfunc{cov}\left[ W,H\right] +var\left[ H\right] \end{aligned}[/math]
(since this is a case with [math]a=b=1[/math]). So,
[math]var\left[ T\right] =0.49+\left( 2\right) \left( 0.03\right)+0.61=1.16.[/math]
For the same joint distribution, the difference between the income of husbands and wives is
[math]D=H-W.[/math]
This case has [math]a=1[/math] and [math]b=-1[/math], so that
[math]\begin{aligned} var\left[ D\right] &=&\left( 1\right) ^{2}var\left[ H \right] +2\left( 1\right) \left( -1\right) \limfunc{cov}\left[ W,H\right] +\left( -1\right) ^{2}var\left[ W\right] \\ &=&0.61-\left( 2\right) \left( 0.03\right) +0.49 \\ &=&1.04. \end{aligned}[/math]
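Both of these calculations follow directly from the two-variable variance formula; a small helper (a sketch using only the numbers quoted above) makes the pattern explicit.

<syntaxhighlight lang="python">
# Values quoted above from the previous example.
var_w, var_h, cov_wh = 0.49, 0.61, 0.03

def var_linear(a, b, var_x, var_y, cov_xy):
    """Variance of aX + bY + c: a^2 var[X] + 2ab cov[X,Y] + b^2 var[Y]."""
    return a**2 * var_x + 2 * a * b * cov_xy + b**2 * var_y

print(var_linear(1, 1, var_w, var_h, cov_wh))    # T = W + H  -> 1.16
print(var_linear(1, -1, var_h, var_w, cov_wh))   # D = H - W  -> 1.04
</syntaxhighlight>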
Generalisation
Extending the result to the case of a linear combination of [math]n[/math] random variables [math]X_{1},...,X_{n}[/math] is messy because of the large number of covariance terms involved. So, we simplify by supposing that [math]X_{1},...,X_{n}[/math] are uncorrelated random variables, with all covariances equal to zero: [math]\limfunc{cov}\left[ X_{i},X_{j}\right] =0,i\neq j[/math]. Then,
for [math]X_{1},...,X_{n}[/math] uncorrelated,
[math]var\left[ \sum_{i=1}^{n}a_{i}X_{i}\right] =\sum_{i=1}^{n}a_{i}^{2}var\left[ X_{i}\right] .[/math]
This also applies when [math]X_{1},...,X_{n}[/math] are independent random variables.
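A simulation can again be used to check this; the sketch below draws independent [math]X_{1},...,X_{n}[/math] (with arbitrarily chosen variances and coefficients, used purely for illustration) and compares the sample variance of [math]\sum_{i}a_{i}X_{i}[/math] with [math]\sum_{i}a_{i}^{2}var\left[ X_{i}\right] [/math].

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(1)

# Arbitrarily chosen variances and coefficients, purely for illustration.
variances = np.array([3.0, 1.0, 5.0, 2.0])
a = np.array([2.0, 5.0, -3.0, 0.5])

# Draw each X_i independently (normal here, but any distribution would do).
x = rng.normal(loc=0.0, scale=np.sqrt(variances), size=(500_000, 4))
w = x @ a                                  # W = sum_i a_i X_i

print(w.var(), np.sum(a**2 * variances))   # sample variance vs sum_i a_i^2 var[X_i]
</syntaxhighlight>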
Standard Deviations
None of these results apply directly to standard deviations. Consider the simple case where [math]X[/math] and [math]Y[/math] are independent random variables and
[math]W=X+Y.[/math]
Then,
[math]\begin{aligned} var\left[ W\right] &=&\sigma _{W}^{2} \\ &=&var\left[ X\right] +var\left[ Y\right] \\ &=&\sigma _{X}^{2}+\sigma _{Y}^{2}\end{aligned}[/math]
and then
[math]\sigma _{W}=\sqrt{\sigma _{X}^{2}+\sigma _{Y}^{2}}.[/math]
In general it is true that
[math]\sigma _{W}\neq \sigma _{X}+\sigma _{Y}.[/math]
To illustrate, if [math]X_{1},X_{2}[/math] and [math]X_{3}[/math] are independent random variables with [math]var\left[ X_{1}\right] =3,var\left[ X_{2}\right]=1 [/math] and [math]var\left[ X_{3}\right] =5[/math], and if
[math]P=2X_{1}+5X_{2}-3X_{3},[/math]
then
[math]\begin{aligned} var\left[ P\right] &=&2^{2}var\left[ X_{1}\right] +5^{2}var\left[ X_{2}\right] +\left( -3\right) ^{2}var\left[X_{3}\right] \\ &=&\left( 4\right) \left( 3\right) +\left( 25\right) \left( 1\right) +\left(9\right) \left( 5\right) \\ &=&82, \\ \sigma _{P} &=&\sqrt{82}=9.06.\end{aligned}[/math]
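The same numbers can be checked in a couple of lines (a sketch, using only the variances and coefficients given in this illustration):

<syntaxhighlight lang="python">
import math

variances = [3.0, 1.0, 5.0]      # var[X1], var[X2], var[X3]
a = [2.0, 5.0, -3.0]

var_p = sum(ai**2 * vi for ai, vi in zip(a, variances))
sigma_p = math.sqrt(var_p)
print(var_p, round(sigma_p, 2))  # 82.0 and 9.06
</syntaxhighlight>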
Linear Combinations of Normal Random Variables
The results in this section so far relate to characteristics of the probability distribution of a linear combination like
[math]V=aX+bY,[/math]
not to the probability distribution itself. Indeed, part of the attraction of these results is that they can be obtained without having to find the probability distribution of [math]V[/math].
However, if we knew that [math]X[/math] and [math]Y[/math] had normal distributions, then it would follow that [math]V[/math] is also normally distributed. This innocuous-sounding result is EXTREMELY IMPORTANT! It is also rather unusual: there are not many distributions for which this type of result holds.
More specifically,
if [math]X\sim N\left( \mu _{X},\sigma _{X}^{2}\right) [/math] and [math]Y\sim N\left(\mu _{Y},\sigma _{Y}^{2}\right) [/math] and [math]W=aX+bY+c[/math], then
[math]W\sim N\left( \mu _{W},\sigma _{W}^{2}\right)[/math]
with
[math]\begin{aligned} \mu _{W} &=&a\mu _{X}+b\mu _{Y}+c, \\ \sigma _{W}^{2} &=&a^{2}\sigma _{X}^{2}+2ab\sigma _{XY}+b^{2}\sigma _{Y}^{2}, \end{aligned}[/math]
where [math]\sigma _{XY}=\limfunc{cov}\left[ X,Y\right] [/math]. Note that independence of [math]X[/math] and [math]Y[/math] has not been assumed.
If [math]X_{1},...X_{n}[/math] are uncorrelated random variables with [math]X_{i}\sim N\left( \mu _{i},\sigma _{i}^{2}\right) [/math] and [math]W=\sum \limits_{i=1}^{n}a_{i}X_{i}[/math], where [math]a_{1},...,a_{n}[/math] are constants, then
[math]W\sim N\left( \sum\limits_{i=1}^{n}a_{i}\mu_{i},\sum\limits_{i=1}^{n}a_{i}^{2}\sigma _{i}^{2}\right).[/math]
Note that the general rules for the expected value and the variance given above still apply here; in addition, in this case we also know the type of distribution of [math]W[/math], not only the values of [math]E[W][/math] and [math]Var[W][/math].
Of course, standard normal distribution tables can be used in the usual way to compute probabilities of events involving [math]W[/math]. This is illustrated in the following example.
Example
If [math]X\sim N\left( 20,5\right) ,Y\sim N\left( 30,11\right) [/math], [math]X[/math] and [math]Y[/math] independent, and
[math]D=X-Y,[/math]
then
[math]D\sim N\left( -10,16\right)[/math]
and
[math]\begin{aligned} \Pr \left( D\gt 0\right) &=&\Pr \left( Z\gt \dfrac{0-\left( -10\right) }{4}\right)\\ &=&\Pr \left( Z\gt 2.5\right) \\ &=&0.00621,\end{aligned}[/math]
where [math]Z\sim N\left( 0,1\right) [/math].
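This probability can be reproduced numerically; the sketch below (assuming SciPy is available) works either directly from the distribution of [math]D[/math] or from the standardised value [math]2.5[/math].

<syntaxhighlight lang="python">
import math
from scipy.stats import norm

mu_d = 20 - 30             # E[D] = E[X] - E[Y] = -10
var_d = 5 + 11             # var[D] = var[X] + var[Y] by independence, = 16
sd_d = math.sqrt(var_d)    # = 4

# Pr(D > 0) = Pr(Z > (0 - mu_d) / sd_d) = Pr(Z > 2.5)
print(norm.sf(0, loc=mu_d, scale=sd_d))   # about 0.00621
print(norm.sf(2.5))                       # the same value from the standard normal
</syntaxhighlight>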