Random Vectors


Notation

| Symbol | Type | Description |
|---|---|---|
| $X$ | Random vector | A vector of jointly distributed random variables $X = [X_1, X_2, \dots, X_p]^T$ |
| $E[X]$ | Vector | Expectation of the random vector $X$ |
| $\mu_X$, $\mu$ | Vector | Alternative notations for the expectation $E[X]$ |
| $\operatorname{Var}(X)$ | Matrix | Variance-covariance matrix (or simply covariance matrix) of $X$ |
| $\Sigma_X$, $\Sigma$, $\operatorname{Cov}(X)$ | Matrix | Alternative notations for the variance-covariance matrix |
| $\operatorname{Cov}(X, Y)$ | Matrix | Covariance matrix between two random vectors $X$ and $Y$ |
| $AX$ | Linear transformation | Transformation of the random vector $X$ by a $k \times p$ matrix $A$ |
| $\operatorname{Cov}(X_i, Y_j)$ | Scalar | Covariance between the $i$-th component of $X$ and the $j$-th component of $Y$ |
| $E[X]$ | Column vector | Expectation of the random vector $X$ expressed as a column matrix |
| $E[XY^T]$ | Matrix | Matrix of expected pairwise products between components of $X$ and $Y$ |
| $X_1, X_2, \dots, X_p$ | Random variables | Components of the random vector $X$ |

Distinction: Variance-Covariance Matrix vs. Covariance Matrix:

The terms variance-covariance matrix and covariance matrix are often used interchangeably, but depending on context they can carry a subtle distinction:

  1. Variance-Covariance Matrix: Refers specifically to the covariance matrix of a single random vector.
  2. Covariance Matrix: A more general term that applies to the covariance between two random vectors.

Abbreviations

| Abbreviation | Description |
|---|---|
| r.v. | Random variable |

Notation

  • The expected value of a random vector $X$ is often denoted by $E(X)$, $E[X]$, or $EX$, with $E$ also often stylized as $\mathbb{E}$, or written symbolically as $\mu_X$ or simply $\mu$.
  • The variance of a random vector $X$ is typically designated as $\operatorname{Var}(X)$, or sometimes as $\operatorname{Cov}(X)$. Since the variance is a variance-covariance matrix, it is also denoted $\Sigma_X$ or $\Sigma$; the element in the $i$-th row and $j$-th column is $\Sigma_{ij}$.
  • The covariance of two random vectors $X$ and $Y$ is typically designated as $\operatorname{Cov}(X, Y)$. Since the covariance is itself a matrix, it is also denoted $\Sigma(X, Y)$.

Definition

Definition: A random vector $X$ is a vector $X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_p \end{bmatrix}$ of jointly distributed random variables $X_1, \dots, X_p$. As is customary in linear algebra, we will write vectors as column matrices whenever convenient.

Expectation of a random vector

Definition: The expectation $E[X]$ of a random vector $X = [X_1, X_2, \dots, X_p]^T$ is given by

$$E[X] = \begin{bmatrix} E[X_1] \\ E[X_2] \\ \vdots \\ E[X_p] \end{bmatrix}$$

It is also denoted $\mu_X$ or $\mu$.
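
As a quick numerical sketch (assuming numpy; the distribution and sample size below are arbitrary choices for illustration), the expectation of a random vector can be estimated componentwise by the sample mean:

```python
import numpy as np

rng = np.random.default_rng(0)

# Draw n samples of a 3-dimensional random vector X (rows = samples).
# The components are independent with means [1.0, 2.0, 3.0], purely for illustration.
n, mu_true = 100_000, np.array([1.0, 2.0, 3.0])
X = rng.normal(loc=mu_true, scale=1.0, size=(n, 3))

# E[X] is the vector of componentwise expectations; the sample mean estimates it.
mu_hat = X.mean(axis=0)
print(mu_hat)  # close to [1.0, 2.0, 3.0]
```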

Linearity of expectation

Recall that expectation is a linear operation for random variables; this linearity also holds for random vectors.

The linearity properties of the expectation can be expressed compactly by stating that for any $k \times p$ matrix $A$ and any $1 \times j$ matrix $B$,

$$E[AX] = A\,E[X] \quad \text{and} \quad E[XB] = E[X]\,B$$
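
A minimal check of $E[AX] = A\,E[X]$ on simulated data (again assuming numpy; the $2 \times 3$ matrix $A$ is an arbitrary choice):

```python
import numpy as np

rng = np.random.default_rng(1)

# n samples of a 3-dimensional random vector, stacked as rows.
X = rng.normal(loc=[1.0, 2.0, 3.0], scale=1.0, size=(100_000, 3))
A = np.array([[1.0, 0.0, -1.0],
              [2.0, 1.0, 0.0]])  # a 2x3 matrix

# Transform each sample: (A x_i) stacked as rows is X @ A.T, then average.
lhs = (X @ A.T).mean(axis=0)   # estimate of E[AX]
rhs = A @ X.mean(axis=0)       # A times the estimate of E[X]
print(np.allclose(lhs, rhs))   # True: equal up to floating-point error
```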

Variance of a random vector

The variance of a random vector $X$ is represented as a matrix, known as the variance-covariance matrix (often simply referred to as the covariance matrix in some literature):

$$\operatorname{Var}(X) = \operatorname{Cov}(X, X) = E\left[(X - E[X])(X - E[X])^T\right]$$

It is also denoted $\Sigma_X$, $\Sigma$, or $\operatorname{Cov}(X)$.

Expectation --> Variance

One important property is that

$$\operatorname{Var}(X) \equiv \operatorname{Cov}(X, X) = E\left[(X - E[X])(X - E[X])^T\right] = E[XX^T] - E[X]\,E[X]^T$$

The proof follows directly from the corresponding identity for the covariance $\operatorname{Cov}(X, Y)$, derived below, by setting $Y = X$.
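
A numerical sanity check of $\operatorname{Var}(X) = E[XX^T] - E[X]E[X]^T$ (a sketch assuming numpy; the correlated Gaussian below is an arbitrary test case):

```python
import numpy as np

rng = np.random.default_rng(2)

# Correlated 2-dimensional Gaussian samples, rows = samples.
Sigma_true = np.array([[2.0, 0.6],
                       [0.6, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 1.0], cov=Sigma_true, size=200_000)

mu_hat = X.mean(axis=0)
# E[X X^T] estimated by averaging the outer products x_i x_i^T.
second_moment = (X.T @ X) / len(X)
var_hat = second_moment - np.outer(mu_hat, mu_hat)

print(var_hat)                  # close to Sigma_true
print(np.cov(X, rowvar=False))  # numpy's estimate (uses n-1 normalization)
```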

Covariance between two random vectors

For two jointly distributed real-valued random vectors $X$ and $Y$, the covariance is represented as a matrix, called the covariance matrix:

$$\operatorname{Cov}(X, Y) = E\left[(X - E[X])(Y - E[Y])^T\right]$$

It is also denoted $\Sigma(X, Y)$.

The covariance matrix

For two random vectors $X = [X_1, X_2, \dots, X_p]^T \in \mathbb{R}^p$ and $Y = [Y_1, Y_2, \dots, Y_q]^T \in \mathbb{R}^q$, their covariance matrix is a $p \times q$ matrix defined as:

$$\operatorname{Cov}(X, Y) = \begin{bmatrix} \operatorname{Cov}(X_1, Y_1) & \operatorname{Cov}(X_1, Y_2) & \cdots & \operatorname{Cov}(X_1, Y_q) \\ \operatorname{Cov}(X_2, Y_1) & \operatorname{Cov}(X_2, Y_2) & \cdots & \operatorname{Cov}(X_2, Y_q) \\ \vdots & \vdots & \ddots & \vdots \\ \operatorname{Cov}(X_p, Y_1) & \operatorname{Cov}(X_p, Y_2) & \cdots & \operatorname{Cov}(X_p, Y_q) \end{bmatrix}$$

Here:

  • $\operatorname{Cov}(X_i, Y_j)$ represents the covariance between the random variables $X_i$ (from $X$) and $Y_j$ (from $Y$).
  • If $X = Y$, this matrix reduces to the variance-covariance matrix of $X$, which is symmetric because $\operatorname{Cov}(X_i, X_j) = \operatorname{Cov}(X_j, X_i)$ by the definition of covariance for random variables.

Expectation --> Covariance

$$\begin{aligned} \operatorname{Cov}(X, Y) &= E\left[(X - E[X])(Y - E[Y])^T\right] \\ &= E\left[XY^T - X E[Y]^T - E[X] Y^T + E[X] E[Y]^T\right] \\ &= E[XY^T] - E\left[X E[Y]^T\right] - E\left[E[X] Y^T\right] + E\left[E[X] E[Y]^T\right] \\ &= E[XY^T] - E[X] E[Y]^T - E[X] E[Y]^T + E[X] E[Y]^T \\ &= E[XY^T] - E[X] E[Y]^T \end{aligned}$$
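
To make the $p \times q$ cross-covariance concrete, here is a sketch (assuming numpy; $Y$ is built from $X$ plus noise purely so that the two vectors are correlated):

```python
import numpy as np

rng = np.random.default_rng(3)

n = 200_000
X = rng.normal(size=(n, 3))                    # p = 3
Y = X[:, :2] + 0.5 * rng.normal(size=(n, 2))   # q = 2, correlated with X by construction

# Cov(X, Y) = E[XY^T] - E[X]E[Y]^T, estimated from samples: a 3x2 matrix.
mu_x, mu_y = X.mean(axis=0), Y.mean(axis=0)
cross_cov = (X.T @ Y) / n - np.outer(mu_x, mu_y)
print(cross_cov.shape)  # (3, 2)
print(cross_cov)        # close to [[1, 0], [0, 1], [0, 0]]
```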

Linear combinations of random variables

Consider random variables $X_1, \dots, X_p$. We want to find the expectation and variance of a new random variable $L(X_1, \dots, X_p)$ obtained as a linear combination of $X_1, \dots, X_p$; that is,

$$L(X_1, \dots, X_p) = \sum_{i=1}^{p} a_i X_i$$

Using vector-matrix notation we can write this in a compact way: $L(X) = a^T X$, where $a^T = [a_1, \dots, a_p]$. Then we get:

$$E[L(X)] = E[a^T X] = a^T E[X],$$

and

$$\begin{aligned} \operatorname{Var}[L(X)] &= E\left[L(X) L(X)^T\right] - E[L(X)]\,E[L(X)]^T \\ &= E\left[a^T X X^T a\right] - \left(a^T E[X]\right)\left(a^T E[X]\right)^T \\ &= a^T E[XX^T]\, a - a^T E[X]\,(E[X])^T a \\ &= a^T \left(E[XX^T] - E[X](E[X])^T\right) a \\ &= a^T \operatorname{Cov}(X)\, a \end{aligned}$$

Thus, knowing $E[X]$ and $\operatorname{Cov}(X)$, we can easily find the expectation and variance of any linear combination of $X_1, \dots, X_p$.
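
A quick numerical check of $\operatorname{Var}(a^T X) = a^T \operatorname{Cov}(X)\,a$ (a sketch assuming numpy; $a$ and the covariance below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=Sigma, size=300_000)
a = np.array([1.0, -2.0])

# Variance of the linear combination a^T X, computed two ways.
empirical = (X @ a).var()      # sample variance of a^T x_i
theoretical = a @ Sigma @ a    # a^T Sigma a
print(empirical, theoretical)  # both close to 2 - 2.4 + 4 = 3.6
```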

Corollary: $\Sigma$ is positive semi-definite

Corollary: If $\Sigma$ is the covariance matrix of a random vector $X$, then for any constant vector $a$ we have $a^T \Sigma a \geq 0$.

That is, $\Sigma$ satisfies the property of being a positive semi-definite (PSD) matrix.

Proof: By the previous section, $a^T \Sigma a = \operatorname{Var}(a^T X)$ is the variance of a random variable, and variance is always non-negative.

This suggests the converse question: given a symmetric, positive semi-definite matrix $\Sigma$, is it the covariance matrix of some random vector? The answer is yes: any symmetric PSD matrix admits a factorization $\Sigma = A A^T$ (for instance via the Cholesky decomposition or a matrix square root), and if $Z$ is a random vector with i.i.d. components of unit variance, then $X = AZ$ has $\operatorname{Cov}(X) = A \operatorname{Cov}(Z) A^T = A I A^T = \Sigma$, using the linear-transform property derived in the next section.
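A sketch of this construction in code (assuming numpy; the PSD matrix below is arbitrary, and `np.linalg.cholesky` plays the role of the factor $A$):

```python
import numpy as np

rng = np.random.default_rng(5)

# Any symmetric PSD matrix; here a hand-picked 3x3 example.
Sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

A = np.linalg.cholesky(Sigma)       # Sigma = A A^T
Z = rng.normal(size=(500_000, 3))   # i.i.d. components with unit variance
X = Z @ A.T                         # each row is A z_i

print(np.cov(X, rowvar=False))      # close to Sigma
```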

Linear transform of a random vector

Consider a random vector $X$ with covariance matrix $\Sigma$. Then, for any $k$-dimensional constant vector $c$ and any $p \times k$ matrix $A$, the $k$-dimensional random vector $c + A^T X$ has mean $c + A^T E[X]$ and has covariance matrix

$$\operatorname{Cov}(c + A^T X) = A^T \Sigma A$$

The proof is quite simple:

Let $Y = c + A^T X$. By the linearity of the expectation operator, its expectation is $E[Y] = E[c + A^T X] = c + A^T E[X]$. Thus,

$$Y - E[Y] = (c + A^T X) - (c + A^T E[X]) = A^T (X - E[X]).$$

Therefore,

$$\begin{aligned} \operatorname{Cov}(c + A^T X) = \operatorname{Cov}(Y) &= E\left[(Y - E[Y])(Y - E[Y])^T\right] \\ &= E\left[\left(A^T (X - E[X])\right)\left(A^T (X - E[X])\right)^T\right] \\ &= E\left[A^T (X - E[X])(X - E[X])^T A\right] \\ &= A^T E\left[(X - E[X])(X - E[X])^T\right] A \\ &= A^T \Sigma A \end{aligned}$$

Remember that $\Sigma \equiv \operatorname{Cov}(X)$.
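
Verifying $\operatorname{Cov}(c + A^T X) = A^T \Sigma A$ on simulated data (a sketch assuming numpy; $c$, $A$, and $\Sigma$ are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(6)

Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
X = rng.multivariate_normal(mean=[1.0, -1.0], cov=Sigma, size=300_000)

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, -1.0]])  # p x k with p = 2, k = 3
c = np.array([10.0, 20.0, 30.0])  # shifting by c leaves the covariance unchanged

Y = c + X @ A                     # rows are c + A^T x_i
print(np.cov(Y, rowvar=False))    # close to A^T Sigma A
print(A.T @ Sigma @ A)
```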

What if all elements are independent?

If $X_1, X_2, \dots, X_p$ are i.i.d. (independent and identically distributed) with common variance $\sigma^2$, then $\operatorname{Cov}([X_1, X_2, \dots, X_p]^T)$, i.e., the covariance matrix $\Sigma$, is a diagonal matrix with $\sigma^2$ on the diagonal and zeros elsewhere:

$$\Sigma = \begin{bmatrix} \sigma^2 & 0 & \cdots & 0 \\ 0 & \sigma^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma^2 \end{bmatrix} = \sigma^2 I_p$$

where $I_p$ is the $p \times p$ identity matrix.

Proof:

  • The diagonal elements $\Sigma_{ii}$ represent the variance of each $X_i$:

$$\Sigma_{ii} = \operatorname{Var}(X_i) = \sigma^2 \quad \text{for all } i$$

  • The off-diagonal elements $\Sigma_{ij}$ represent the covariance between different $X_i$ and $X_j$. Since $X_i$ and $X_j$ are independent, we have:

$$\Sigma_{ij} = \operatorname{Cov}(X_i, X_j) = 0 \quad \text{for } i \neq j$$
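
And a final numerical sketch (assuming numpy; $\sigma$ and $p$ below are arbitrary): the sample covariance of i.i.d. components is close to $\sigma^2 I_p$.

```python
import numpy as np

rng = np.random.default_rng(7)

sigma = 1.5
# n samples of a vector with p = 4 i.i.d. components, each with variance sigma^2.
X = rng.normal(loc=0.0, scale=sigma, size=(500_000, 4))

print(np.cov(X, rowvar=False))  # close to sigma^2 * I_4 = 2.25 * I_4
```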