Covariance
Covariance is a measure of how much one random variable varies together with another. It is a generalization of variance: Var(X) = Cov(X,X).
Description
Covariance is calculated as:
Cov(X,Y) = E[(X - E[X])(Y - E[Y])]
Covariance is related to correlation as:
Corr(X,Y) = Cov(X,Y) / (σ_X σ_Y)
Letting X̅ be the mean of X, and letting Y̅ be the mean of Y, the calculation becomes:
Cov(X,Y) = E[(X - X̅)(Y - Y̅)]
E[XY - X̅Y - XY̅ + X̅Y̅]
E[XY] - X̅E[Y] - E[X]Y̅ + X̅Y̅
E[XY] - X̅Y̅ - X̅Y̅ + X̅Y̅
E[XY] - X̅Y̅
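As a quick numerical check of the identity above, the following sketch computes the covariance both ways for a small made-up data set in which every pair is treated as equally likely. The function names and the data are illustrative assumptions, not part of any library. Exact fractions are used so that the two results can be compared without rounding.

```python
from fractions import Fraction

def mean(values):
    """Arithmetic mean, i.e. E[.] when every observation is equally likely."""
    return sum(values) / Fraction(len(values))

def cov_centered(xs, ys):
    """Cov(X,Y) = E[(X - E[X])(Y - E[Y])]."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

def cov_shortcut(xs, ys):
    """Cov(X,Y) = E[XY] - E[X]E[Y]."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Hypothetical paired observations.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 9]

print(cov_centered(xs, ys))   # centered definition
print(cov_shortcut(xs, ys))   # shortcut form
print(cov_centered(xs, xs))   # Var(X) as Cov(X,X)
```

Both functions divide by n (a population covariance), which matches the expectation-based definition used here. A sample estimate would typically divide by n - 1 instead.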
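This gives a short proof that independent variables have zero covariance, and therefore zero correlation. If X and Y are independent, then E[XY] = E[X]E[Y], so E[XY] - X̅Y̅ = 0. The converse does not hold: zero covariance does not imply independence.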
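The independence argument can be illustrated with a small discrete example. In this sketch the joint distribution is a dictionary mapping (x, y) to a probability. The marginals and helper names are assumptions chosen for illustration. The second distribution is an added example showing that the implication runs in one direction only: Y is fully determined by X, yet the covariance is still zero.

```python
from fractions import Fraction
from itertools import product

def expectation(joint, f):
    """E[f(X, Y)] for a joint pmf given as {(x, y): probability}."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

def covariance(joint):
    """Cov(X,Y) = E[XY] - E[X]E[Y]."""
    ex = expectation(joint, lambda x, y: x)
    ey = expectation(joint, lambda x, y: y)
    exy = expectation(joint, lambda x, y: x * y)
    return exy - ex * ey

# Independent case: the joint pmf is the product of two hypothetical marginals.
px = {0: Fraction(1, 4), 1: Fraction(3, 4)}
py = {-1: Fraction(1, 3), 2: Fraction(2, 3)}
independent = {(x, y): px[x] * py[y] for x, y in product(px, py)}

# Dependent case: Y = X^2 with X uniform on {-1, 0, 1}.
dependent = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

print(covariance(independent))
print(covariance(dependent))
```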
Properties
Covariance is symmetric: Cov(X,Y) = Cov(Y,X)
Transformations
Covariance scales linearly when an input is multiplied by a scalar.
Cov(aX,Y) = E[aXY] - E[aX]E[Y]
a E[XY] - a E[X]E[Y]
a (E[XY] - E[X]E[Y])
a Cov(X,Y)
Covariance is additive in each input.
Cov(X+Y,Z) = E[(X+Y)Z] - E[X+Y]E[Z]
E[XZ+YZ] - E[X+Y]E[Z]
(E[XZ] + E[YZ]) - (E[X] + E[Y]) E[Z]
(E[XZ] + E[YZ]) - (E[X]E[Z] + E[Y]E[Z])
(E[XZ] - E[X]E[Z]) + (E[YZ] - E[Y]E[Z])
Cov(X,Z) + Cov(Y,Z)
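The scaling and additivity results derived above can be checked on concrete numbers. This is a minimal sketch with made-up data. The helper names are assumptions, and exact fractions are used so the equalities hold without rounding error.

```python
from fractions import Fraction

def mean(values):
    """Arithmetic mean, i.e. E[.] when every observation is equally likely."""
    return sum(values) / Fraction(len(values))

def cov(xs, ys):
    """Cov(X,Y) = E[XY] - E[X]E[Y]."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Hypothetical observations of three variables.
xs = [1, 2, 3, 4]
ys = [0, 3, 1, 5]
zs = [2, 1, 4, 7]

# Additivity: Cov(X+Y, Z) = Cov(X,Z) + Cov(Y,Z)
sum_lhs = cov([x + y for x, y in zip(xs, ys)], zs)
sum_rhs = cov(xs, zs) + cov(ys, zs)

# Scaling: Cov(aX, Z) = a Cov(X,Z)
a = 5
scale_lhs = cov([a * x for x in xs], zs)
scale_rhs = a * cov(xs, zs)

print(sum_lhs == sum_rhs)
print(scale_lhs == scale_rhs)
```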
This gives a short proof that adding a constant does not change the covariance, because Cov(a,Y) = 0 for any constant a.
Cov(a+X,Y) = Cov(X,Y) + Cov(a,Y) = Cov(X,Y) + 0
Altogether: Cov(a+bX,c+dY) = b d Cov(X,Y)
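The combined rule can be verified the same way. In the sketch below the constants a, b, c, d and the data are arbitrary illustrative choices, and the helpers are assumptions rather than library functions.

```python
from fractions import Fraction

def mean(values):
    """Arithmetic mean, i.e. E[.] when every observation is equally likely."""
    return sum(values) / Fraction(len(values))

def cov(xs, ys):
    """Cov(X,Y) = E[XY] - E[X]E[Y]."""
    return mean([x * y for x, y in zip(xs, ys)]) - mean(xs) * mean(ys)

# Hypothetical paired observations and arbitrary constants.
xs = [1, 2, 3, 4]
ys = [2, 4, 5, 9]
a, b, c, d = 10, 3, -7, -2

# Cov(a + bX, c + dY) compared with b d Cov(X,Y)
shifted_scaled = cov([a + b * x for x in xs], [c + d * y for y in ys])
expected = b * d * cov(xs, ys)

# A constant has zero covariance with anything: Cov(a, Y) = 0
constant_part = cov([a] * len(ys), ys)

print(shifted_scaled == expected)
print(constant_part)
```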
Matrix
A covariance matrix describes multivariate covariances. Consider a column vector x of random variables: the covariance matrix is Cov(x,x). Cell (i,j) is the covariance of the ith term with the jth term. On the diagonal are the variances (i.e., the covariance of each term with itself). The matrix is usually denoted Σ.
The inverse covariance matrix, Σ⁻¹, is also called the precision matrix.
The covariance matrix is calculated as:
Σ = E[(x - E[x])(x - E[x])ᵀ]
Letting x̅ be the mean vector of x, the calculation becomes:
Σ = E[(x - x̅)(x - x̅)ᵀ]
Alternatively, expanding the product as in the scalar case:
Σ = E[xxᵀ] - x̅x̅ᵀ
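The centered form Σ = E[(x - x̅)(x - x̅)ᵀ] and the expanded form Σ = E[xxᵀ] - x̅x̅ᵀ (the matrix analogue of E[XY] - X̅Y̅) can be compared numerically. The sketch below treats each row of a made-up data set as one equally likely observation of x. All function names are illustrative assumptions, and plain lists of fractions stand in for matrices.

```python
from fractions import Fraction

def mean_vector(samples):
    """Component-wise mean of a list of equally likely observations."""
    n = Fraction(len(samples))
    return [sum(col) / n for col in zip(*samples)]

def outer(u, v):
    """Outer product u v^T as a nested list."""
    return [[ui * vj for vj in v] for ui in u]

def mean_matrix(matrices):
    """Entry-wise mean of a list of equally sized matrices."""
    n = Fraction(len(matrices))
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / n for j in range(cols)]
            for i in range(rows)]

def cov_matrix_centered(samples):
    """Sigma = E[(x - mean)(x - mean)^T]."""
    xbar = mean_vector(samples)
    centered = [[xi - mi for xi, mi in zip(x, xbar)] for x in samples]
    return mean_matrix([outer(c, c) for c in centered])

def cov_matrix_expanded(samples):
    """Sigma = E[x x^T] - mean mean^T."""
    xbar = mean_vector(samples)
    second_moment = mean_matrix([outer(x, x) for x in samples])
    correction = outer(xbar, xbar)
    return [[s - c for s, c in zip(srow, crow)]
            for srow, crow in zip(second_moment, correction)]

# Hypothetical observations of a 3-dimensional vector x.
samples = [[1, 2, 0], [2, 4, 1], [3, 5, 0], [4, 9, 3]]

sigma = cov_matrix_centered(samples)
print(sigma == cov_matrix_expanded(samples))
print(sigma[0][0], sigma[0][1])  # Var of term 1, Cov of terms 1 and 2
```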
Properties
A covariance matrix is necessarily square, symmetric, and positive semi-definite.
Σ = Σᵀ
The determinant is bounded below: |Σ| ≥ 0
A matrix square root Σ^(1/2) always exists
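These properties can be checked on a small example. The sketch below uses a hypothetical 2×2 covariance matrix and, as an assumption made only to keep the example short, a closed-form square root that is valid for nonzero 2×2 positive semi-definite matrices (it follows from the Cayley-Hamilton theorem). Larger matrices would normally use an eigendecomposition instead.

```python
import math

# Covariance matrix of a hypothetical 2-dimensional data set.
sigma = [[1.25, 2.75],
         [2.75, 6.5]]

def is_symmetric(m):
    """True when m equals its transpose."""
    return all(m[i][j] == m[j][i] for i in range(len(m)) for j in range(len(m)))

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def quad_form(m, z):
    """z^T m z for a 2-vector z; non-negative for every z when m is PSD."""
    return sum(z[i] * m[i][j] * z[j] for i in range(2) for j in range(2))

def sqrt2x2(m):
    """Square root of a nonzero 2x2 PSD matrix: (m + s I) / t,
    with s = sqrt(det m) and t = sqrt(trace m + 2 s)."""
    s = math.sqrt(det2(m))
    t = math.sqrt(m[0][0] + m[1][1] + 2 * s)
    return [[(m[i][j] + (s if i == j else 0.0)) / t for j in range(2)]
            for i in range(2)]

def matmul2(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

root = sqrt2x2(sigma)
print(is_symmetric(sigma))
print(det2(sigma) >= 0)
print(all(quad_form(sigma, z) >= 0 for z in [(1, -1), (2, -1), (-3, 1)]))
print(matmul2(root, root))  # should reproduce sigma up to rounding
```

Checking a handful of vectors z does not prove positive semi-definiteness; it only illustrates the definition.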
Linear Algebra
Under a linear transformation A of the input, the covariance matrix transforms as AΣAᵀ.
Cov(Ax,Ax) = E[(Ax - Ax̅)(Ax - Ax̅)ᵀ]
E[A(x - x̅)(x - x̅)ᵀAᵀ]
A E[(x - x̅)(x - x̅)ᵀ] Aᵀ
AΣAᵀ
In particular, if the transformation is a scalar multiple of the identity, aI:
(aI)Σ(aI)ᵀ
aΣa
a²Σ
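Both transformation rules can be confirmed numerically. The sketch below, using made-up data and an arbitrary matrix A, computes the covariance matrix of the transformed observations directly and compares it with AΣAᵀ, then repeats the check for a scalar multiple. The helper functions are illustrative assumptions, and exact fractions avoid rounding issues.

```python
from fractions import Fraction

def cov_matrix(samples):
    """Sigma = E[(x - mean)(x - mean)^T] over equally likely observations."""
    n = Fraction(len(samples))
    means = [sum(col) / n for col in zip(*samples)]
    centered = [[v - m for v, m in zip(row, means)] for row in samples]
    dim = len(means)
    return [[sum(c[i] * c[j] for c in centered) / n for j in range(dim)]
            for i in range(dim)]

def matmul(a, b):
    """Matrix product of nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def apply(a, x):
    """Matrix-vector product A x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

# Hypothetical observations of a 3-vector and an arbitrary 2x3 matrix A.
samples = [[1, 2, 0], [2, 4, 1], [3, 5, 0], [4, 9, 3]]
A = [[1, 0, 2],
     [0, -1, 3]]

sigma = cov_matrix(samples)

# Cov(Ax, Ax) compared with A Sigma A^T
lhs = cov_matrix([apply(A, x) for x in samples])
rhs = matmul(matmul(A, sigma), transpose(A))

# Scalar case: Cov(ax, ax) compared with a^2 Sigma
a = 3
scaled = cov_matrix([[a * v for v in x] for x in samples])
a_squared_sigma = [[a * a * s for s in row] for row in sigma]

print(lhs == rhs)
print(scaled == a_squared_sigma)
```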
