Orthonormalization

Gram-Schmidt orthonormalization is a process for transforming a set of vectors (or a matrix) into orthogonal unit vectors.


Description

To orthonormalize a set of vectors is to...

  1. make them orthogonal to each other, then

  2. normalize them to a unit distance of 1.

Note that orthogonality is a property of two vectors, not one. No process is needed to 'orthogonalize' a single vector.


Process

Solution

Consider two vectors, A and B. They will be made orthogonal to each other and become a and b. Then they will be normalized and become â and .

This Gram-Schmidt procedure uses linear combinations and does not change the column space of a system that includes both A and B. Put another way, if A and B are a basis for some subspace, then â and will be a basis for the exact same subspace. But they will be orthonormal, so certain operations will be simpler.

First, recall that orthogonality is a property of two vectors, not one. Therefore A needs no transformation to become a.

B is transformed into b by subtracting all components of a from B. Recall that projections capture how much a vector moves in the column space of another vector. It follows that subtracting the projection of a onto B, from B, results in a b that is independent of a.

This is generally notated as b = B - [(a⋅B)/(a⋅a)]a (i.e., in terms of the dot product) or b = B - [(aTB)/(aTa)]a. But it can also be expressed in terms of the inner product: b = B - [⟨a, B⟩/⟨a, a⟩]a.

If there is a third term C, it must be transformed by subtracting all components of a and of b (not B!) from C. That is, c = C - [(aTC)/(aTa)]a - [(bTC)/(bTb)]b or c = C - [⟨a, C⟩/⟨a, a⟩]a - [⟨b, C⟩/⟨b, b⟩]b. And so on.

The final step is to normalize each vector by their own distance, as in â = a/||a||.

Extension to Square Matrices

The Gram-Schmidt procedure can be directly applied to the columns of a square matrix. To reiterate, it does not change the column space.

Note that the procedure involves linear combinations. First, to orthogonalize a column, a combination of all preceding columns is subtracted. Second, to normalize a column, some portion of itself is taken. Therefore the diagonal will be the normalizing factors, the values above the diagonal will be the components of other columns to subtract, and everything below the diagonal will be zeros. This can be expressed as a matrix, and importantly it will be an upper triangular matrix.

Following the above notation, where a matrix has columns A, B, and C and the transformed matrix will have columns â, , and ĉ:

r.svg

The orthonormalized columns comprise the matrix Q. Because this matrix is orthonormal, it has useful properties like QT = Q-1.

Altogether, this is the QR decomposition and it is expressed as A = QR.


CategoryRicottone