Linear Combinations

From ECLR


This section considers some properties of linear functions of random variables [math]X[/math] and [math]Y[/math]. In an earlier section, a new random variable was defined as a function of two other random variables. As there, define a random variable [math]V[/math] as a function of the random variables [math]X[/math] and [math]Y[/math],

[math]V=g\left( X,Y\right) ,[/math]

with no restriction on the nature of the function or transformation [math]g[/math]. In this section, the function [math]g[/math] is restricted to be a linear function of [math]X[/math] and [math]Y[/math]:

[math]V=aX+bY+c,[/math]

where [math]a,b[/math] and [math]c[/math] are constants. [math]V[/math] is also called a linear combination of [math]X[/math] and [math]Y[/math].

The properties developed in this section are specific to linear functions: they do not hold in general for nonlinear functions or transformations.

The Expected Value of a Linear Combination

This result is easy to remember: it amounts to saying that the expected value of a linear combination is the linear combination of the expected values.

Even more simply, the expected value of a sum is a sum of expected values.

If [math]V=aX+bY+c[/math], where [math]a,b,c[/math] are constants, then

[math]E\left[ V\right] =E\left[ aX+bY+c\right] =aE\left[ X\right] +bE\left[ Y\right] +c.[/math]

This result is a natural generalisation of the one given in a previous section.

  • Proof (discrete random variables case only). Using the result from a previous section on the expected value of a function of two random variables,

    [math]\begin{aligned} E\left[ V\right] &=&E\left[ g\left( X,Y\right) \right] \\ &=&E\left[ aX+bY+c\right] \\ &=&\sum_{x}\sum_{y}\left( ax+by+c\right) p\left( x,y\right) . \end{aligned}[/math]

    From this point on, the proof just involves manipulation of the summation signs:

    [math]\begin{aligned} \sum_{x}\sum_{y}\left( ax+by+c\right) p\left( x,y\right)&=&a\sum_{x}\sum_{y}xp\left( x,y\right)+b\sum_{x}\sum_{y}yp\left(x,y\right) +c\sum_{x}\sum_{y}p\left( x,y\right) \\ &=&a\sum_{x}\left[ x\left( \sum_{y}p\left( x,y\right) \right) \right]+b\sum_{y}\left[ y\left( \sum_{x}p\left( x,y\right) \right) \right] +c \\ &=&a\sum_{x}\left[ xp_{X}\left( x\right) \right] +b\sum_{y}\left[yp_{Y}\left( y\right) \right] +c \\ &=&aE\left[ X\right] +bE\left[ Y\right] +c. \end{aligned}[/math]

Notice the steps used:

  • [math]\sum\limits_{y}xp\left( x,y\right) =x\sum\limits_{y}p\left(x,y\right) =xp_{X}\left( x\right) [/math] and [math]\sum\limits_{x}yp\left( x,y\right)=y\sum\limits_{x}p\left( x,y\right) =yp_{Y}\left( y\right) [/math], as [math]x[/math] is constant with respect to summation over [math]y[/math] and [math]y[/math] is constant with respect to summation over [math]x[/math];
  • we used [math]\sum\limits_{x}\sum\limits_{y}p\left( x,y\right) =1[/math];
  • we used the definitions of the marginal distributions [math]p_{X}\left( x\right) [/math] and [math]p_{Y}\left( y\right) [/math];
  • we used the definition of expected value for discrete random variables.

Notice that nothing need be known about the joint probability distribution [math]p\left( x,y\right) [/math] of [math]X[/math] and [math]Y[/math]: the result holds whether or not [math]X[/math] and [math]Y[/math] are independent. The result is also valid for continuous random variables, where again nothing need be known about the joint distribution.
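As a quick numerical check of this result, the following Python sketch evaluates both sides of the formula for a small joint distribution. The joint probabilities in joint_pmf and the constants a, b, c are made up purely for illustration; they are not taken from any example in these notes.

```python
# Hypothetical joint pmf p(x, y): keys are (x, y) pairs, values are probabilities.
# These numbers are illustrative assumptions only.
joint_pmf = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.15,
    (2, 0): 0.05, (2, 1): 0.20,
}
a, b, c = 2.0, -3.0, 1.0  # arbitrary constants chosen for illustration


def expectation(g, pmf):
    """E[g(X, Y)] = sum over x and y of g(x, y) * p(x, y)."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())


# Left-hand side: E[aX + bY + c] computed directly from the joint pmf.
lhs = expectation(lambda x, y: a * x + b * y + c, joint_pmf)

# Right-hand side: a E[X] + b E[Y] + c, using only the two expected values.
e_x = expectation(lambda x, y: x, joint_pmf)
e_y = expectation(lambda x, y: y, joint_pmf)
rhs = a * e_x + b * e_y + c

print(lhs, rhs)  # the two values agree up to floating-point rounding
```

Note that X and Y are not independent under this made-up distribution, which is consistent with the remark that the result does not depend on the form of the joint distribution.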

Examples

  1. In a previous example we defined

    [math]T=W+H,[/math]

    and had [math]E\left[ H\right] =1.3[/math], [math]E\left[ W\right] =0.9[/math], giving

    [math]\begin{aligned} E\left[ T\right] &=&E\left[ W\right] +E\left[ H\right] \\ &=&2.2 \end{aligned}[/math]

    confirming the result obtained earlier.

  2. Suppose that the random variables [math]X[/math] and [math]Y[/math] have [math]E\left[ X\right]=0.5[/math] and [math]E\left[ Y\right] =3.5[/math], and let

    [math]V=5X-Y.[/math]

    Then,

    [math]\begin{aligned} E\left[ V\right] &=&5E\left[ X\right] -E\left[ Y\right] \\ &=&\left( 5\right) \left( 0.5\right) -3.5 \\ &=&-1. \end{aligned}[/math]

Generalisation

Let [math]X_{1},...,X_{n}[/math] be random variables and [math]a_{1},...,a_{n},c[/math] be constants, and define the random variable [math]W[/math] by

[math]\begin{aligned} W &=&a_{1}X_{1}+...+a_{n}X_{n}+c \\ &=&\sum_{i=1}^{n}a_{i}X_{i}+c.\end{aligned}[/math]

Then,

[math]E\left[ W\right] =\sum_{i=1}^{n}a_{i}E\left[ X_{i}\right] +c.[/math]

The proof uses the linear combination result for two variables repeatedly:

[math]\begin{aligned} E\left[ W\right] &=&a_{1}E\left[ X_{1}\right] +E\left[a_{2}X_{2}+...+a_{n}X_{n}+c\right] \\ &=&a_{1}E\left[ X_{1}\right] +a_{2}E\left[ X_{2}\right] +E\left[a_{3}X_{3}+...+a_{n}X_{n}+c\right] \\ &=&... \\ &=&a_{1}E\left[ X_{1}\right] +a_{2}E\left[ X_{2}\right] +...+a_{n}E\left[X_{n}\right] +c.\end{aligned}[/math]

Examples

Let [math]E\left[ X_{1}\right] =2,E\left[ X_{2}\right] =-1,E\left[ X_{3}\right] =3[/math] and [math]W=2X_{1}+5X_{2}-3X_{3}+4[/math]. Then

[math]\begin{aligned} E\left[ W\right] &=&E\left[ 2X_{1}+5X_{2}-3X_{3}+4\right] \\ &=&2E\left[ X_{1}\right] +5E\left[ X_{2}\right] -3E\left[ X_{3}\right] +4 \\ &=&\left( 2\right) \left( 2\right) +\left( 5\right) \left( -1\right) -\left(3\right) \left( 3\right) +4 \\ &=&-6.\end{aligned}[/math]
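The general formula is easy to turn into a small helper. The sketch below is only an illustration: the function name expected_linear_combination and its argument names are hypothetical, and it needs nothing more than the coefficients, the expected values and the constant.

```python
def expected_linear_combination(coeffs, means, const=0.0):
    """Return sum_i a_i * E[X_i] + c for coefficients a_i and means E[X_i]."""
    if len(coeffs) != len(means):
        raise ValueError("coeffs and means must have the same length")
    return sum(a_i * m_i for a_i, m_i in zip(coeffs, means)) + const


# W = 2*X1 + 5*X2 - 3*X3 + 4 with E[X1] = 2, E[X2] = -1, E[X3] = 3
e_w = expected_linear_combination([2, 5, -3], [2, -1, 3], const=4)
print(e_w)  # -6

# V = 5X - Y with E[X] = 0.5, E[Y] = 3.5 (the earlier two-variable example)
e_v = expected_linear_combination([5, -1], [0.5, 3.5])
print(e_v)  # -1.0
```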

The Variance of a Linear Combination

Two Variable Case

Let [math]V[/math] be the random variable defined above as:

[math]V=aX+bY+c.[/math]

What is [math]var\left[ V\right][/math]? To find this, it is helpful to use notation that will simplify the proof. By definition,

[math]var\left[ V\right] =E\left[ \left( V-E\left[ V\right] \right) ^{2}\right] .[/math]

Put

[math]\tilde{V}=V-E\left[ V\right][/math]

so that

[math]var\left[ V\right] =E\left[ \tilde{V}^{2}\right].[/math]

We saw that

[math]E\left[ V\right] =aE\left[ X\right] +bE\left[ Y\right] +c,[/math]

so that

[math]\begin{aligned} \tilde{V} &=&\left( aX+bY+c\right) -\left( aE\left[ X\right] +bE\left[ Y\right] +c\right) \\ &=&a\left( X-E\left[ X\right] \right) +b\left( Y-E\left[ Y\right] \right) \\ &=&a\tilde{X}+b\tilde{Y}\end{aligned}[/math]

and then

[math]var\left[ V\right] =E\left[ \tilde{V}^{2}\right] =E\left[ \left( a\tilde{X}+b\tilde{Y}\right) ^{2}\right].[/math]

Notice that this does not depend on the constant [math]c[/math].

To make further progress, recall that in the current notation,

[math]\begin{aligned} var\left[ X\right] &=&E\left[ \tilde{X}^{2}\right] ,\;\;\;var\left[ Y\right] =E\left[ \tilde{Y}^{2}\right] , \\ Cov\left[ X,Y\right] &=&E\left[ \left( X-E\left[ X\right] \right)\left( Y-E\left[ Y\right] \right) \right] \\ &=&E\left[ \tilde{X}\tilde{Y}\right] .\end{aligned}[/math]

Then,

[math]\begin{aligned} var\left[ V\right] &=&E\left[ \left( a\tilde{X}+b\tilde{Y}\right)^{2}\right] \\ &=&E\left[ a^{2}\tilde{X}^{2}+2ab\tilde{X}\tilde{Y}+b^{2}\tilde{Y}^{2}\right]\\ &=&a^{2}E\left[ \tilde{X}^{2}\right] +2abE\left[ \tilde{X}\tilde{Y}\right]+b^{2}E\left[ \tilde{Y}^{2}\right] \\ &=&a^{2}var\left[ X\right] +2abCov\left[ X,Y\right]+b^{2}var\left[ Y\right] ,\end{aligned}[/math]

using the linear combination result for expected values.

Summarising,

  • if [math]V=aX+bY+c[/math], then

    [math]var\left[ V\right] =a^{2}var\left[ X\right] +2abCov\left[ X,Y\right] +b^{2}var\left[ Y\right] .[/math]

  • If [math]X[/math] and [math]Y[/math] are uncorrelated, so that [math]Cov\left[X,Y\right] =0[/math],

    [math]var\left[ V\right] =a^{2}var\left[ X\right] +b^{2}var\left[ Y\right] .[/math]

  • If [math]X[/math] and [math]Y[/math] are independent, the same result holds, because independent random variables are uncorrelated.
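The variance formula can be checked numerically in the same spirit as the derivation above. The following Python sketch computes the variance of V directly from its definition and compares it with the formula. The joint probabilities and the constants a, b, c are invented for illustration only.

```python
# Hypothetical joint pmf p(x, y); the numbers are illustrative assumptions.
joint_pmf = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.15,
    (2, 0): 0.05, (2, 1): 0.20,
}
a, b, c = 2.0, -3.0, 1.0  # arbitrary constants


def expectation(g, pmf):
    """E[g(X, Y)] = sum over x and y of g(x, y) * p(x, y)."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())


e_x = expectation(lambda x, y: x, joint_pmf)
e_y = expectation(lambda x, y: y, joint_pmf)

# var[X], var[Y] and Cov[X, Y] written in terms of deviations from the mean.
var_x = expectation(lambda x, y: (x - e_x) ** 2, joint_pmf)
var_y = expectation(lambda x, y: (y - e_y) ** 2, joint_pmf)
cov_xy = expectation(lambda x, y: (x - e_x) * (y - e_y), joint_pmf)

# Direct computation: var[V] = E[(V - E[V])^2] with V = aX + bY + c.
e_v = a * e_x + b * e_y + c
var_v_direct = expectation(lambda x, y: (a * x + b * y + c - e_v) ** 2, joint_pmf)

# Formula: a^2 var[X] + 2ab Cov[X, Y] + b^2 var[Y]; the constant c drops out.
var_v_formula = a ** 2 * var_x + 2 * a * b * cov_xy + b ** 2 * var_y

print(var_v_direct, var_v_formula)  # equal up to floating-point rounding
```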

Examples

  1. Suppose that [math]X[/math] and [math]Y[/math] are independent random variables with [math]var\left[ X\right] =0.25[/math], [math]var\left[ Y\right] =2.5[/math]. If

    [math]V=X+Y,[/math]

    then

    [math]\begin{aligned} var\left[ V\right] &=&var\left[ X\right] +var\left[ Y\right] \\ &=&0.25+2.5 \\ &=&2.75. \end{aligned}[/math]

  2. A previous example used the following random variables [math]W [/math] and [math]H[/math], and defined [math]T[/math] by

    [math]T=W+H.[/math]

    There we found that [math]var\left[ W\right] =0.49, var\left[ H\right] =0.61[/math], while we also found [math]Cov\left[ W,H\right] =0.03[/math]. Then,

    [math]\begin{aligned} var\left[ T\right] &=&var\left[ W+H\right] \\ &=&var\left[ W\right] +2Cov\left[ W,H\right] +var\left[ H\right] \end{aligned}[/math]

    (since this is a case with [math]a=b=1[/math]). So,

    [math]var\left[ T\right] =0.49+\left( 2\right) \left( 0.03\right)+0.61=1.16.[/math]

  3. For the same joint distribution, the difference between the incomes of husbands and wives is

    [math]D=H-W.[/math]

    This case has [math]a=1[/math] and [math]b=-1[/math], so that

    [math]\begin{aligned} var\left[ D\right] &=&\left( 1\right) ^{2}var\left[ H \right] +2\left( 1\right) \left( -1\right) Cov\left[ W,H\right] +\left( -1\right) ^{2}var\left[ W\right] \\ &=&0.61-\left( 2\right) \left( 0.03\right) +0.49 \\ &=&1.04. \end{aligned}[/math]
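The last two examples can be reproduced with a short Python sketch of the two-variable formula. The function name var_linear_combination is a hypothetical helper introduced here for illustration; the numerical inputs are the variances and covariance quoted in the examples.

```python
def var_linear_combination(a, b, var_x, var_y, cov_xy):
    """var[aX + bY + c] = a^2 var[X] + 2ab Cov[X, Y] + b^2 var[Y]."""
    return a ** 2 * var_x + 2 * a * b * cov_xy + b ** 2 * var_y


var_w, var_h, cov_wh = 0.49, 0.61, 0.03

# T = W + H, so a = b = 1.
var_t = var_linear_combination(1, 1, var_w, var_h, cov_wh)

# D = H - W, so a = 1 (on H) and b = -1 (on W).
var_d = var_linear_combination(1, -1, var_h, var_w, cov_wh)

print(round(var_t, 2), round(var_d, 2))  # 1.16 1.04
```

The only difference between the two cases is the sign of the covariance term: a positive covariance increases the variance of the sum and reduces the variance of the difference.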

Generalisation

Extending the result to a linear combination of [math]n[/math] random variables [math]X_{1},...,X_{n}[/math] is messy because of the large number of covariance terms involved. So, we simplify by supposing that [math]X_{1},...,X_{n}[/math] are uncorrelated random variables, so that all covariances are equal to zero: [math]Cov\left[ X_{i},X_{j}\right] =0[/math] for [math]i\neq j[/math]. Then,

  • for [math]X_{1},...,X_{n}[/math] uncorrelated,

    [math]var\left[ \sum_{i=1}^{n}a_{i}X_{i}\right] =\sum_{i=1}^{n}a_{i}^{2}var\left[ X_{i}\right] .[/math]

  • This also applies when [math]X_{1},...,X_{n}[/math] are independent random variables.

Standard Deviations

None of these results apply directly to standard deviations. Consider the simple case where [math]X[/math] and [math]Y[/math] are independent random variables and

[math]W=X+Y.[/math]

Then,

[math]\begin{aligned} var\left[ W\right] &=&\sigma _{W}^{2} \\ &=&var\left[ X\right] +var\left[ Y\right] \\ &=&\sigma _{X}^{2}+\sigma _{Y}^{2}\end{aligned}[/math]

and then

[math]\sigma _{W}=\sqrt{\sigma _{X}^{2}+\sigma _{Y}^{2}}.[/math]

In general it is true that

[math]\sigma _{W}\neq \sigma _{X}+\sigma _{Y}.[/math]

To illustrate, if [math]X_{1},X_{2}[/math] and [math]X_{3}[/math] are independent random variables with [math]var\left[ X_{1}\right] =3,var\left[ X_{2}\right]=1 [/math] and [math]var\left[ X_{3}\right] =5[/math], and if

[math]P=2X_{1}+5X_{2}-3X_{3},[/math]

then

[math]\begin{aligned} var\left[ P\right] &=&2^{2}var\left[ X_{1}\right] +5^{2}var\left[ X_{2}\right] +\left( -3\right) ^{2}var\left[X_{3}\right] \\ &=&\left( 4\right) \left( 3\right) +\left( 25\right) \left( 1\right) +\left(9\right) \left( 5\right) \\ &=&82, \\ \sigma _{P} &=&\sqrt{82}=9.06.\end{aligned}[/math]
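A minimal Python sketch of this calculation is given below, assuming (as in the illustration) that the variables are uncorrelated so that no covariance terms appear. The helper name var_uncorrelated_combination is hypothetical. The sketch also computes the sum of the individual standard deviations of the three terms, to show that it is not the standard deviation of P.

```python
import math


def var_uncorrelated_combination(coeffs, variances):
    """var[sum_i a_i X_i] = sum_i a_i^2 var[X_i], valid for uncorrelated X_i."""
    return sum(a_i ** 2 * v_i for a_i, v_i in zip(coeffs, variances))


coeffs = [2, 5, -3]     # P = 2*X1 + 5*X2 - 3*X3
variances = [3, 1, 5]   # var[X1], var[X2], var[X3]

var_p = var_uncorrelated_combination(coeffs, variances)
sd_p = math.sqrt(var_p)

# Adding up the standard deviations of the separate terms gives a different
# (larger) number, which is NOT the standard deviation of P.
sum_of_sds = sum(abs(a_i) * math.sqrt(v_i) for a_i, v_i in zip(coeffs, variances))

print(var_p, round(sd_p, 2))  # 82 9.06
print(round(sum_of_sds, 2))
```

The design point is that the rule is applied to variances first, and the square root is taken only at the end.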

Linear Combinations of Normal Random Variables

The results in this section so far relate to characteristics of the probability distribution of a linear combination like

[math]V=aX+bY,[/math]

not to the probability distribution itself. Indeed, part of the attraction of these results is that they can be obtained without having to find the probability distribution of [math]V[/math].

However, if we knew that [math]X[/math] and [math]Y[/math] had normal distributions then it would follow that [math]V[/math] is also normally distributed. This innocuous sounding result is EXTREMELY IMPORTANT! It is also rather unusual: there are not many distributions for which this type of result holds.

More specifically,

  • if [math]X\sim N\left( \mu _{X},\sigma _{X}^{2}\right) [/math] and [math]Y\sim N\left(\mu _{Y},\sigma _{Y}^{2}\right) [/math] and [math]W=aX+bY+c[/math], then

    [math]W\sim N\left( \mu _{W},\sigma _{W}^{2}\right)[/math]

    with

    [math]\begin{aligned} \mu _{W} &=&a\mu _{X}+b\mu _{Y}+c, \\ \sigma _{W}^{2} &=&a^{2}\sigma _{X}^{2}+2ab\sigma _{XY}+b^{2}\sigma _{Y}^{2} \end{aligned}[/math]

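    Here [math]\sigma _{XY}=Cov\left[ X,Y\right] [/math]. Note that independence of [math]X[/math] and [math]Y[/math] has not been assumed. Strictly, however, the result requires [math]X[/math] and [math]Y[/math] to be jointly normally distributed, which holds, for example, when they are independent.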

  • If [math]X_{1},...,X_{n}[/math] are uncorrelated, jointly normal random variables with [math]X_{i}\sim N\left( \mu _{i},\sigma _{i}^{2}\right) [/math] and [math]W=\sum \limits_{i=1}^{n}a_{i}X_{i}[/math], where [math]a_{1},...,a_{n}[/math] are constants, then

    [math]W\sim N\left( \sum\limits_{i=1}^{n}a_{i}\mu_{i},\sum\limits_{i=1}^{n}a_{i}^{2}\sigma _{i}^{2}\right).[/math]

  • Note that the mean and variance here come from the general rules for the expected value and variance given above. What is new in this case is that we also know the type of distribution of [math]W[/math], not only the values of [math]E[W][/math] and [math]var[W][/math].

Of course, standard normal distribution tables can be used in the usual way to compute probabilities of events involving [math]W[/math]. This is illustrated in the following example.

Example

If [math]X\sim N\left( 20,5\right) ,Y\sim N\left( 30,11\right) [/math], [math]X[/math] and [math]Y[/math] independent, and

[math]D=X-Y,[/math]

then

[math]D\sim N\left( -10,16\right)[/math]

and

[math]\begin{aligned} \Pr \left( D>0\right) &=&\Pr \left( Z>\dfrac{0-\left( -10\right) }{4}\right)\\ &=&\Pr \left( Z>2.5\right) \\ &=&0.00621,\end{aligned}[/math]

where [math]Z\sim N\left( 0,1\right) [/math].
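Instead of tables, the same probability can be computed in software. The sketch below uses NormalDist from Python's standard statistics module and assumes, as in the notation of this section, that the second parameter of [math]N(\cdot ,\cdot )[/math] is the variance. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_x, var_x = 20.0, 5.0    # X ~ N(20, 5), second parameter is the variance
mu_y, var_y = 30.0, 11.0   # Y ~ N(30, 11)
a, b = 1.0, -1.0           # D = X - Y

# Mean and variance of D from the linear combination rules; there is no
# covariance term because X and Y are independent.
mu_d = a * mu_x + b * mu_y
var_d = a ** 2 * var_x + b ** 2 * var_y

# NormalDist is parameterised by the standard deviation, not the variance.
d = NormalDist(mu=mu_d, sigma=math.sqrt(var_d))
prob = 1.0 - d.cdf(0.0)    # Pr(D > 0)

print(mu_d, var_d, round(prob, 5))  # -10.0 16.0 0.00621
```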
