Differences between revisions 2 and 10 (spanning 8 versions)
Revision 2 as of 2024-01-21 03:45:10
Size: 2578
Comment: Reorganize to vectors and matrices
Revision 10 as of 2025-03-28 03:59:08
Size: 3077
Comment: Rewrite
Line 3: Line 3:
When two vectors do not exist in the same column space, the best approximation of one in the other's columns space is called a '''projection'''. When a vector does not exist in a column space, the '''projection''' is the best approximation of it in linear combinations of that column space.
Line 13: Line 13:
Given two vectors ''a'' and ''b'', we can '''project''' ''b'' onto ''a'' to get the best possible estimate of the former as a multiple of the latter. This projection ''p'' has an error term ''e''. Given vectors ''a'' and ''b'', ''b'' can be projected into ''C(a)'', the column space of ''a''. This projection ''p'' has an error term ''e''.
Line 15: Line 15:
Take the multiple as ''x'', so that ''p = ax''. The error term can be characterized as ''b-p'' or ''b-ax''. The factor which converts ''a'' into an estimate is notated as ''x̂'', so that ''p = ax̂''. The '''error term''' can be characterized by ''e = b - p'' or ''e = b - ax̂''. ''a'' is [[LinearAlgebra/Orthogonality|orthogonal]] to ''e'', so ''a^T^(b - ax̂) = 0''. This simplifies to ''x̂ = (a^T^b)/(a^T^a)''. Altogether, the projection is ''p = a(a^T^b)/(a^T^a)''.
Line 17: Line 17:
''a'' is [[LinearAlgebra/Orthogonality|orthogonal]] to ''e''. Therefore, ''a^T^(b-ax) = 0''. This simplifies to ''x = (a^T^b)/(a^T^a)''. Altogether, the projection is characterized as ''p = a(a^T^b)/(a^T^a)''. The '''projection matrix''' '''''P''''' satisfies ''p = '''P'''b''. It is calculated ''(aa^T^)/(a^T^a)''. ''C('''P''')'', the column space of '''''P''''', is equivalent to the column space of ''a''. (It follows that '''''P''''' is also of [[LinearAlgebra/Rank|rank]] 1.)
Line 19: Line 19:
A matrix '''''P''''' can be defined such that ''p = '''P'''b''. The projection matrix is ''(aa^T^)/(a^T^a)''. The column space of '''''P''''' (a.k.a. ''C('''P''')'') is the line through ''a'', and its rank is 1.
Line 21: Line 20:
Incidentally, '''''P''''' is symmetric (i.e. '''''P'''^T^ = '''P''''') and re-projecting does not change the result (i.e. '''''P'''^2^ = '''P''''').
=== Properties ===

The projection matrix '''''P''''' is [[LinearAlgebra/MatrixProperties#Symmetry|symmetric]] (i.e. '''''P'''^T^ = '''P''''') and [[LinearAlgebra/MatrixProperties#Idempotency|idempotent]] (i.e. '''''P'''^2^ = '''P''''').
Line 29: Line 31:
For problems like '''''A'''x = b'' where there is no solution for ''x'', as in b does not exist in the column space of '''''A''''', we can instead solve '''''A'''x = p'' where ''p'' estimates ''b'' with an error term ''e''. Given a system as '''''A'''x = b'', if ''b'' is not in ''C('''A''')'', the column space of '''''A''''', then there is no possible solution for ''x''. The best approximation is expressed as '''''A'''x̂ = p'' where projection ''p'' estimates ''b'' with an error term ''e''.
Line 31: Line 33:
''p'' is a linear combination of '''''A''''': if there are two columns ''a,,1,,'' and ''a,,2,,'', then ''p = x,,1,,a,,1,, + x,,2,,a,,2,,'' and ''b = x,,1,,a,,1,, + x,,2,,a,,2,, + e''. The error term can be characterized by ''e = b - p'' or ''e = b - '''A'''x̂''. ''e'' is orthogonal to ''C('''A''')'', the column space of '''''A'''''; equivalently it is in ''N('''A'''^T^)'', the [[LinearAlgebra/NullSpaces|null space]] of '''''A'''^T^'', so '''''A'''^T^(b - '''A'''x̂) = 0''.
Line 33: Line 35:
''e'' is orthogonal to the column space of '''''A'''^T^'' (a.k.a. ''C('''A'''^T^)''), so '''''A'''^T^(b-'''A'''x) = 0''. Concretely in the same example, ''a,,1,,^T^(b-'''A'''x) = 0'' and ''a,,2,,^T^(b-'''A'''x) = 0''. More generally, this re-emphasizes that ''e'' is orthogonal in the null space of '''''A'''^T^'' (a.k.a. ''N('''A'''^T^)''). The system of '''normal equations''' is '''''A'''^T^'''A'''x̂ = '''A'''^T^b''. This simplifies to ''x̂ = ('''A'''^T^'''A''')^-1^'''A'''^T^b''. Altogether, the projection is characterized by ''p = '''A'''('''A'''^T^'''A''')^-1^'''A'''^T^b''.
Line 35: Line 37:
The solution for this all is ''x = ('''A'''^T^'''A''')^-1^'''A'''^T^b''. That also means that ''p = '''A'''('''A'''^T^'''A''')^-1^'''A'''^T^b''. The projection matrix '''''P''''' satisfies ''p = '''P'''b''. It is calculated as '''''P''' = '''A'''('''A'''^T^'''A''')^-1^'''A'''^T^''.
Line 37: Line 39:
A matrix '''''P''''' can be defined such that ''p = '''P'''b''. The projection matrix is '''A'''('''A'''^T^'''A''')^-1^'''A'''^T^. ''b'' can also be projected onto ''e'', which geometrically means projecting into the null space of '''''A'''^T^''. Algebraically, if one projection matrix has been computed as '''''P''''', then the projection matrix for going the other way is '''''I''' - '''P''''', so that ''e = ('''I''' - '''P''')b''.
Line 39: Line 41:
Note that if '''''A''''' were a square matrix, most of the above equations would [[LinearAlgebra/MatrixInversion|cancel out]]. But we cannot make that assumption. This fundamentally means though that if ''b'' were in the column space of '''''A''''', then '''''P''''' would be the identity matrix.
Line 41: Line 42:
[[Econometrics/OrdinaryLeastSquares|This should look familiar.]]
=== Properties ===

As above, the projection matrix '''''P''''' is symmetric and idempotent.

If '''''A''''' is square and invertible, the above equations simplify rapidly: '''''P''' = '''AA'''^-1^('''A'''^T^)^-1^'''A'''^T^ = '''I'''''.

If ''b'' is already in ''C('''A''')'', then '''''P'''b = b'' and ''e = 0''. Conversely, if ''b'' is orthogonal to ''C('''A''')'', then '''''P'''b = 0'' and ''b = e''.



=== Usage ===

[[Statistics/OrdinaryLeastSquares|This should look familiar.]] A projection is inherently the minimization of the error term.

Projections

When a vector does not lie in a column space, its projection is the best approximation of it among the vectors of that column space, i.e. among the linear combinations of the columns.


Vectors

Given vectors a and b, b can be projected into C(a), the column space of a. This projection p has an error term e.

The scalar factor which converts a into an estimate is notated as x̂, so that p = ax̂. The error term can be characterized by e = b - p or e = b - ax̂. a is orthogonal to e, so aᵀ(b - ax̂) = 0. This simplifies to x̂ = (aᵀb)/(aᵀa). Altogether, the projection is p = a(aᵀb)/(aᵀa).
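
The formulas above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the example vectors a = (1, 2, 2) and b = (1, 1, 1) are illustrative assumptions, not part of the original page.

```python
from fractions import Fraction

def dot(u, v):
    """Inner product uᵀv of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_vector(a, b):
    """Return (x_hat, p, e) for the projection of b onto the line through a."""
    x_hat = Fraction(dot(a, b), dot(a, a))   # x̂ = (aᵀb)/(aᵀa), exact arithmetic
    p = [x_hat * ai for ai in a]             # p = a x̂
    e = [bi - pi for bi, pi in zip(b, p)]    # e = b - p
    return x_hat, p, e

# Hypothetical example vectors with integer entries.
a = [1, 2, 2]
b = [1, 1, 1]
x_hat, p, e = project_onto_vector(a, b)
print(x_hat)      # 5/9
print(dot(a, e))  # 0, i.e. a is orthogonal to e
```

Exact fractions are used here only so that the orthogonality check is not blurred by floating-point rounding.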

The projection matrix P satisfies p = Pb. It is calculated as P = (aaᵀ)/(aᵀa). C(P), the column space of P, is equivalent to the column space of a. (It follows that P is also of rank 1.)
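
As a rough illustration, the projection matrix for a single vector can be built entry by entry, since entry (i, j) of aaᵀ is aᵢaⱼ. The helper names and the example vectors below are assumptions made for this sketch.

```python
from fractions import Fraction

def projection_matrix_vector(a):
    """P = (a aᵀ)/(aᵀa), returned as a list of rows."""
    aTa = sum(ai * ai for ai in a)
    return [[Fraction(ai * aj, aTa) for aj in a] for ai in a]

def matvec(M, v):
    """Matrix-vector product Mv."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Hypothetical example vectors.
a = [1, 2, 2]
b = [1, 1, 1]
P = projection_matrix_vector(a)
p = matvec(P, b)   # p = Pb, the projection of b onto the line through a

# Rank 1: every row of P is a multiple of a (row i equals (a_i / aᵀa) * a).
rows_are_multiples_of_a = all(
    P[i][j] * a[0] == P[i][0] * a[j]
    for i in range(len(a)) for j in range(len(a))
)
print(rows_are_multiples_of_a)  # True
```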

Properties

The projection matrix P is symmetric (i.e. Pᵀ = P) and idempotent (i.e. P² = P).
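
Both properties can be verified directly for a small case. This sketch assumes the example vector a = (1, 2, 2); the helper functions are hypothetical names introduced here.

```python
from fractions import Fraction

def projection_matrix_vector(a):
    """P = (a aᵀ)/(aᵀa), returned as a list of rows."""
    aTa = sum(ai * ai for ai in a)
    return [[Fraction(ai * aj, aTa) for aj in a] for ai in a]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Matrix product XY."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

P = projection_matrix_vector([1, 2, 2])
is_symmetric = transpose(P) == P    # Pᵀ = P
is_idempotent = matmul(P, P) == P   # P² = P: projecting twice changes nothing
print(is_symmetric, is_idempotent)  # True True
```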


Matrices

Given a system Ax = b, if b is not in C(A), the column space of A, then there is no exact solution for x. The best approximation is expressed as Ax̂ = p, where the projection p estimates b with an error term e.

The error term can be characterized by e = b - p or e = b - Ax̂. e is orthogonal to C(A), the column space of A; equivalently it is in N(Aᵀ), the null space of Aᵀ, so Aᵀ(b - Ax̂) = 0.

The system of normal equations is AᵀAx̂ = Aᵀb. When the columns of A are linearly independent, AᵀA is invertible and this simplifies to x̂ = (AᵀA)⁻¹Aᵀb. Altogether, the projection is characterized by p = A(AᵀA)⁻¹Aᵀb.
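
A minimal sketch of solving the normal equations follows. It solves AᵀAx̂ = Aᵀb by Gauss-Jordan elimination rather than forming the inverse explicitly, which is a choice made for this example. The matrix A, the vector b, and all function names are illustrative assumptions.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def solve(M, v):
    """Solve Mx = v by Gauss-Jordan elimination; assumes M is square and invertible."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(vi)] for row, vi in zip(M, v)]
    for col in range(n):
        # Raises StopIteration if no nonzero pivot exists (M is singular).
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

def least_squares(A, b):
    """Return (x_hat, p, e) where AᵀA x_hat = Aᵀb, p = A x_hat, e = b - p."""
    At = transpose(A)
    x_hat = solve(matmul(At, A), matvec(At, b))
    p = matvec(A, x_hat)
    e = [bi - pi for bi, pi in zip(b, p)]
    return x_hat, p, e

# Hypothetical example: b is not in C(A), so Ax = b has no exact solution.
A = [[1, 0], [1, 1], [1, 2]]
b = [6, 0, 0]
x_hat, p, e = least_squares(A, b)
print([int(x) for x in x_hat])   # [5, -3]
print(matvec(transpose(A), e))   # both entries are zero: Aᵀe = 0
```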

The projection matrix P satisfies p = Pb. It is calculated as P = A(AᵀA)⁻¹Aᵀ.
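
The formula can be evaluated literally for a small matrix. The sketch below assumes A has linearly independent columns so that AᵀA is invertible; the helper names and the example A are assumptions of this example.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination on [M | I]."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Raises StopIteration if M is singular.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def projection_matrix(A):
    """P = A (AᵀA)⁻¹ Aᵀ."""
    At = transpose(A)
    return matmul(matmul(A, inverse(matmul(At, A))), At)

# Hypothetical example matrix with independent columns.
A = [[1, 0], [1, 1], [1, 2]]
P = projection_matrix(A)
b = [6, 0, 0]
p = [sum(m * x for m, x in zip(row, b)) for row in P]   # p = Pb
print([int(x) for x in p])   # [5, 2, -1]
```

In numerical practice the inverse is usually not formed explicitly; it is written out here only to mirror the formula.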

b can also be projected onto e, which geometrically means projecting into the null space of Aᵀ. Algebraically, if one projection matrix has been computed as P, then the projection matrix for going the other way is I - P, so that e = (I - P)b.
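
A short sketch of the complementary projection follows. To keep it brief, the matrix P for the example A = [[1, 0], [1, 1], [1, 2]] is taken as given (precomputed from A(AᵀA)⁻¹Aᵀ); that hard-coded P, the example b, and the helper names are assumptions of this example.

```python
from fractions import Fraction as F

A = [[1, 0], [1, 1], [1, 2]]
# Projection matrix onto C(A) for the A above, taken as given in this sketch.
P = [[F(5, 6), F(1, 3), F(-1, 6)],
     [F(1, 3), F(1, 3), F(1, 3)],
     [F(-1, 6), F(1, 3), F(5, 6)]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

n = len(P)
# Q = I - P projects onto N(Aᵀ), the orthogonal complement of C(A).
Q = [[F(int(i == j)) - P[i][j] for j in range(n)] for i in range(n)]

b = [6, 0, 0]
p = matvec(P, b)                  # part of b inside C(A)
e = matvec(Q, b)                  # part of b inside N(Aᵀ)
At_e = matvec(transpose(A), e)    # should vanish, since e is in N(Aᵀ)
print([int(x) for x in e])        # [1, -2, 1]
```

The two pieces add back up to b, since Pb + (I - P)b = b.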

Properties

As above, the projection matrix P is symmetric and idempotent.

If A is square and invertible, the above equations simplify rapidly: P = AA⁻¹(Aᵀ)⁻¹Aᵀ = I.

If b is already in C(A), then Pb = b and e = 0. Conversely, if b is orthogonal to C(A), then Pb = 0 and b = e.
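
These special cases can be checked with a small sketch. The matrices A and S, the test vectors, and the helper names are all assumptions chosen for illustration: S is a square invertible matrix, b_in is a combination of the columns of A, and b_orth satisfies Aᵀb_orth = 0.

```python
from fractions import Fraction

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination on [M | I]."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [x / pivot_value for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def projection_matrix(A):
    """P = A (AᵀA)⁻¹ Aᵀ; assumes the columns of A are independent."""
    At = transpose(A)
    return matmul(matmul(A, inverse(matmul(At, A))), At)

# Square, invertible case: the projection matrix is the identity.
S = [[2, 1], [1, 3]]
P_square = projection_matrix(S)

# Tall case: P is not the identity, but it still fixes vectors already in C(A).
A = [[1, 0], [1, 1], [1, 2]]
P = projection_matrix(A)
b_in = [2, 3, 4]      # 2*(1,1,1) + 1*(0,1,2), so b_in is in C(A)
b_orth = [1, -2, 1]   # orthogonal to both columns of A

print(P_square == [[1, 0], [0, 1]])   # True
print(matvec(P, b_in) == b_in)        # True
print(matvec(P, b_orth) == [0, 0, 0]) # True
```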

Usage

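This should look familiar from ordinary least squares (see Statistics/OrdinaryLeastSquares). A projection inherently minimizes the length of the error term e = b - Ax̂, which is the least squares criterion.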
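
To make the connection concrete, here is a hedged sketch of fitting a line b ≈ c + d·t to three points. The matrix A then has a column of ones and a column of t values, and the 2×2 normal equations are solved by Cramer's rule. The data points and function names are assumptions chosen for illustration.

```python
from fractions import Fraction

def fit_line(ts, bs):
    """Least squares fit b ≈ c + d*t via the 2x2 normal equations."""
    n = len(ts)
    s_t = sum(ts)
    s_tt = sum(t * t for t in ts)
    s_b = sum(bs)
    s_tb = sum(t * b for t, b in zip(ts, bs))
    # AᵀA = [[n, s_t], [s_t, s_tt]] and Aᵀb = [s_b, s_tb].
    det = n * s_tt - s_t * s_t   # zero only if all t values coincide
    c = Fraction(s_tt * s_b - s_t * s_tb, det)
    d = Fraction(n * s_tb - s_t * s_b, det)
    return c, d

def squared_error(c, d, ts, bs):
    """Squared length of e = b - Ax for the coefficients x = (c, d)."""
    return sum((b - (c + d * t)) ** 2 for t, b in zip(ts, bs))

# Hypothetical integer data points (t, b).
ts = [0, 1, 2]
bs = [6, 0, 0]
c, d = fit_line(ts, bs)
best = squared_error(c, d, ts, bs)
print(c, d, best)   # 5 -3 6

# Nudging the coefficients away from x̂ only increases the squared error.
nudges = [(1, 0), (-1, 0), (0, 1), (0, -1), (Fraction(1, 10), Fraction(-1, 10))]
all_worse = all(squared_error(c + dc, d + dd, ts, bs) > best for dc, dd in nudges)
print(all_worse)    # True
```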


CategoryRicottone

LinearAlgebra/Projections (last edited 2025-03-28 15:32:28 by DominicRicottone)