Differences between revisions 5 and 7 (spanning 2 versions)
Revision 5 as of 2024-01-21 19:53:14
Size: 3515
Comment: Reorganized
Revision 7 as of 2024-01-21 21:30:18
Size: 3609
Comment: Updated intro

Projections

When a vector does not lie in a given column space, the best approximation of it within that column space is called a projection.


Vectors

Given two vectors a and b, we can project b onto a to get the best possible estimate of the former as a multiple of the latter. This projection p has an error term e.

The factor that scales a into this estimate is denoted x̂, so that p = ax̂. The error term can be characterized by e = b - p or e = b - ax̂.

a is orthogonal to e. Therefore, a^T(b - ax̂) = 0. This simplifies to x̂ = (a^T b)/(a^T a). Altogether, the projection is characterized as p = a(a^T b)/(a^T a).
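
As a minimal sketch of the formulas above, the following Python snippet projects one vector onto another using exact fractions. The helper names (dot, project_onto_vector) and the sample vectors are illustrative assumptions, not part of the original notes.

```python
from fractions import Fraction

def dot(u, v):
    """Inner product u^T v of two equal-length vectors."""
    return sum(ui * vi for ui, vi in zip(u, v))

def project_onto_vector(b, a):
    """Return (x_hat, p, e) for the projection of b onto the line through a.

    Assumes a is not the zero vector.
    """
    x_hat = Fraction(dot(a, b), dot(a, a))   # x_hat = (a^T b)/(a^T a)
    p = [x_hat * ai for ai in a]              # p = a * x_hat
    e = [bi - pi for bi, pi in zip(b, p)]     # e = b - p
    return x_hat, p, e

# Hypothetical sample vectors, chosen only for illustration.
a = [1, 2, 2]
b = [1, 1, 1]
x_hat, p, e = project_onto_vector(b, a)

print(x_hat)      # the scaling factor
print(p)          # the projection of b onto a
print(dot(a, e))  # inner product of a with the error term
```

With these sample vectors, a^T b = 5 and a^T a = 9, so x̂ = 5/9, and the inner product of a with e comes out to zero, as orthogonality requires.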

A matrix P can be defined such that p = Pb. The projection matrix is (a a^T)/(a^T a). The column space of P (a.k.a. C(P)) is the line through a, and its rank is 1.

Incidentally, P is symmetric (i.e. P^T = P) and idempotent (i.e. P^2 = P).
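
The two properties above can be checked numerically. This is a rough sketch with exact fractions; the helper names (outer_projection, matmul, transpose, matvec) and the sample vectors are assumptions made for illustration.

```python
from fractions import Fraction

def outer_projection(a):
    """P = (a a^T)/(a^T a) as a list of rows. Assumes a is nonzero."""
    aTa = sum(ai * ai for ai in a)
    return [[Fraction(ai * aj, aTa) for aj in a] for ai in a]

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matvec(X, v):
    return [sum(x * vi for x, vi in zip(row, v)) for row in X]

# Hypothetical sample vectors.
a = [1, 2, 2]
b = [1, 1, 1]
P = outer_projection(a)

is_symmetric = transpose(P) == P     # P^T = P
is_idempotent = matmul(P, P) == P    # P^2 = P
p = matvec(P, b)                     # same p as a(a^T b)/(a^T a)

print(is_symmetric, is_idempotent)
print(p)
```

Idempotence has a simple reading: once b has been projected onto the line through a, projecting the result again changes nothing.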


Matrices

For systems of equations like Ax = b where there is no solution for x, because b does not lie in the column space of A, we can instead solve Ax̂ = p, where p is the projection of b onto that column space and estimates b with an error term e.

The error term can be characterized as e = b - p or e = b - Ax̂.

e is orthogonal to the column space of A, because p is the point in that space closest to b and the remaining error cannot be expressed as any linear combination of the columns. The projection is more easily worked with in terms of A^T: e is orthogonal to every column of A, which are the rows of A^T, so e lies in the null space of A^T. Therefore, A^T(b - Ax̂) = 0.

The system of normal equations for this problem is A^T A x̂ = A^T b. Provided the columns of A are linearly independent, A^T A is invertible and this simplifies to x̂ = (A^T A)^-1 A^T b. Altogether, the projection is characterized as p = A(A^T A)^-1 A^T b.
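
To make the normal equations concrete, here is a small sketch that solves A^T A x̂ = A^T b for a system with no exact solution. The matrix, the vector, and the helper names (solve, matmul, matvec, transpose) are assumptions for illustration; solve is a plain Gauss-Jordan elimination that assumes its input is square and invertible.

```python
from fractions import Fraction

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matvec(X, v):
    return [sum(x * vi for x, vi in zip(row, v)) for row in X]

def solve(M, y):
    """Solve M x = y exactly by Gauss-Jordan elimination.

    Assumes M is square and invertible; a singular M raises StopIteration.
    """
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(y[i])]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n] for row in aug]

# Hypothetical overdetermined system: three equations, two unknowns.
A = [[1, 0], [1, 1], [1, 2]]
b = [6, 0, 0]
At = transpose(A)

x_hat = solve(matmul(At, A), matvec(At, b))   # normal equations
p = matvec(A, x_hat)                          # p = A x_hat
e = [bi - pi for bi, pi in zip(b, p)]         # e = b - p

print(x_hat, p, e)
print(matvec(At, e))   # A^T e
```

For this sample system, A^T A = [[3, 3], [3, 5]] and A^T b = [6, 0], which gives x̂ = [5, -3], p = [5, 2, -1], and e = [1, -2, 1]. The final line confirms that A^T e is the zero vector.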

A matrix P can be defined such that p = Pb. The projection matrix is A(A^T A)^-1 A^T.

b can also be projected onto e, which geometrically means projecting into the null space of A^T. Algebraically, if one projection matrix has been computed as P, then the projection matrix for going the other way is I - P, and the projection itself is e = (I - P)b.

As above, P is symmetric (i.e. P^T = P) and idempotent (i.e. P^2 = P).
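
A short sketch of the matrix case follows: it builds P = A(A^T A)^-1 A^T and the complementary matrix I - P. To keep it brief, it assumes A has exactly two linearly independent columns so that a closed-form 2x2 inverse suffices; the helper names and sample data are illustrative assumptions.

```python
from fractions import Fraction

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matvec(X, v):
    return [sum(x * vi for x, vi in zip(row, v)) for row in X]

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix. Assumes a nonzero determinant."""
    (p, q), (r, s) = M
    det = Fraction(p * s - q * r)
    return [[s / det, -q / det], [-r / det, p / det]]

def projection_matrix(A):
    """P = A (A^T A)^-1 A^T, assuming A has two independent columns."""
    At = transpose(A)
    return matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

# Hypothetical sample data.
A = [[1, 0], [1, 1], [1, 2]]
b = [6, 0, 0]

P = projection_matrix(A)
n = len(P)
I_minus_P = [[(1 if i == j else 0) - P[i][j] for j in range(n)]
             for i in range(n)]

is_symmetric = transpose(P) == P
is_idempotent = matmul(P, P) == P
p = matvec(P, b)            # projection onto the column space of A
e = matvec(I_minus_P, b)    # projection onto the null space of A^T

print(is_symmetric, is_idempotent)
print(p, e)
```

Here p + e reproduces b, and I - P is itself symmetric and idempotent, which is what makes it a projection matrix in its own right.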

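These formulas should look familiar from ordinary least squares. A projection is inherently the minimization of the error term: x̂ is the choice of x that makes the length of e = b - Ax as small as possible.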
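
One way to see the minimization claim, sketched here under the assumption that the error is measured by its squared Euclidean length, is to expand that length and set its gradient with respect to x to zero. The result is the same system of normal equations:

```latex
\begin{aligned}
\|b - Ax\|^2 &= (b - Ax)^T (b - Ax) = b^T b - 2\, x^T A^T b + x^T A^T A x \\
\nabla_x \|b - Ax\|^2 &= -2 A^T b + 2 A^T A x = 0 \\
A^T A \hat{x} &= A^T b
\end{aligned}
```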

Some notes:

  1. If A were a square, invertible matrix, most of the above equations would cancel out: P = A A^-1 (A^T)^-1 A^T = I and x̂ = A^-1 b. But we cannot make that assumption.

  2. If b were in the column space of A, then Pb = b and e = 0. P itself need not be the identity matrix; it merely acts as the identity on vectors in that column space.

  3. If b were orthogonal to the column space of A, then necessarily b is in the null space of A^T. In that case, Pb = 0 and b = e. Equivalently, (I - P)b = b, so projecting b onto the null space of A^T leaves it unchanged.
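
The three notes above can be spot-checked with a small sketch. As before, it assumes matrices with exactly two independent columns so that a closed-form 2x2 inverse is enough; all names and sample values are hypothetical.

```python
from fractions import Fraction

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matvec(X, v):
    return [sum(x * vi for x, vi in zip(row, v)) for row in X]

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix. Assumes a nonzero determinant."""
    (p, q), (r, s) = M
    det = Fraction(p * s - q * r)
    return [[s / det, -q / det], [-r / det, p / det]]

def projection_matrix(A):
    """P = A (A^T A)^-1 A^T, assuming A has two independent columns."""
    At = transpose(A)
    return matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

# Note 1: a square, invertible matrix gives P = I.
S = [[2, 1], [1, 1]]
square_gives_identity = projection_matrix(S) == [[1, 0], [0, 1]]

A = [[1, 0], [1, 1], [1, 2]]
P = projection_matrix(A)

# Note 2: b in the column space of A (first column plus twice the second).
b_in = [1, 3, 5]
in_space_unchanged = matvec(P, b_in) == b_in

# Note 3: b orthogonal to both columns of A.
b_perp = [1, -2, 1]
perp_sent_to_zero = matvec(P, b_perp) == [0, 0, 0]

print(square_gives_identity, in_space_unchanged, perp_sent_to_zero)
```

Note that in the second case P is still a 3x3 matrix of rank 2, not the identity; it simply leaves b_in untouched.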


CategoryRicottone

LinearAlgebra/Projections (last edited 2025-03-28 15:32:28 by DominicRicottone)