Orthonormalization
Gram-Schmidt orthonormalization is a process for transforming a set of vectors (or a matrix) into orthogonal unit vectors.
Description
To orthonormalize a set of vectors is to...
make them orthogonal to each other, then
normalize them to a unit distance of 1.
Note that orthogonality is a property of two vectors, not one. No process is needed to 'orthogonalize' a single vector.
Process
Solution
Consider two vectors, A and B. They will be made orthogonal to each other and become a and b. Then they will be normalized and become â and b̂.
The Gram-Schmidt procedure does this using linear combinations, so does not change the subspace spanned by A and B. Put another way, if A and B are a basis for some subspace, then â and b̂ will be a basis for the exact same subspace. But they will be orthonormal, so certain operations will be simpler.
First, recall that orthogonality is a property of two vectors, not one. Therefore A needs no transformation to become a.
B is transformed into b by subtracting all components of a from B. Recall that projections capture how much a vector moves in the column space of another vector. It follows that subtracting the projection of a onto B, from B, results in a b that is independent of a.
This is generally notated as b = B - [(a⋅B)/(a⋅a)]a (i.e., in terms of the dot product) or b = B - [(aTB)/(aTa)]a. But it can also be expressed in terms of the inner product: b = B - [⟨a, B⟩/⟨a, a⟩]a.
If there is a third term C, it must be transformed by subtracting all components of a and of b (not B!) from C. That is, c = C - [(aTC)/(aTa)]a - [(bTC)/(bTb)]b or c = C - [⟨a, C⟩/⟨a, a⟩]a - [⟨b, C⟩/⟨b, b⟩]b. And so on.
The final step is to normalize each vector by their own distance, as in â = a/||a||.
QR Decomposition
To reiterate, the Gram-Schmidt procedure uses linear combinations. It can therefore be represented with matrices. First, to orthogonalize a column, a combination of all preceding columns is subtracted. Second, to normalize a column, some portion of itself is taken. Therefore, in the matrix representing this procedure...
- the numbers on the diagonal are normalizing factors; how much of the same column to take.
- the numbers above the diagonal are how much of each preceding column to take away.
- the numbers below the diagonal are all zeros.
Note furthermore that this is an upper triangular matrix.
Following the above notation, where a matrix has columns A, B, and C and the transformed matrix will have columns â, b̂, and ĉ:
The orthonormalized columns comprise the matrix Q. Because this matrix is orthonormal, it has useful properties like QT = Q-1.
Altogether, this is the QR decomposition and it is expressed as A = QR.
