Revision 14 as of 2025-11-03 01:35:01
OLS Single Regression Derivation
The model is specified as:
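The equation is not shown at this point. A standard single-regressor form, consistent with the α and β used below, would be the following (the observation index i and the error term ε_i are assumed notation):

```latex
Y_i = \alpha + \beta X_i + \varepsilon_i
```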
The model is fit by a minimization problem:
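As a sketch of the usual least-squares objective (n, the number of observations, is an assumed symbol), the coefficients are chosen to minimize the sum of squared residuals:

```latex
\min_{\alpha,\,\beta} \; \sum_{i=1}^{n} \left( Y_i - \alpha - \beta X_i \right)^2
```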
This is estimated as:
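Presumably the estimated equation drops the error term and replaces the parameters with their estimates. The hats marking estimated quantities are a notational assumption, not taken from the page:

```latex
\hat{Y} = \hat{\alpha} + \hat{\beta} X
```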
The fitted line must pass through the point of sample means, and its slope must equal the marginal change in Y for a unit change in X. In other words, the line must pass through two points:
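Judging from the quantities defined next, the two points are most likely the point of means and the point one standard deviation of X to its right (a reconstruction, not a quotation of the original figure):

```latex
\left( \bar{X},\; \bar{Y} \right)
\qquad \text{and} \qquad
\left( \bar{X} + s_X,\; \bar{Y} + r_{XY}\, s_Y \right)
```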
where:
* X̄ is the sample mean of X (estimating μ_X)
* Ȳ is the sample mean of Y (estimating μ_Y)
* s_X is the sample standard deviation of X (estimating σ_X)
* s_Y is the sample standard deviation of Y (estimating σ_Y)
* and r_XY is the sample correlation coefficient between X and Y (estimating ρ_XY)
Substitute the first point into the estimated equation. This is quickly solved for α.
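Assuming the first point is the point of means (X̄, Ȳ), the step can be sketched as:

```latex
\bar{Y} = \hat{\alpha} + \hat{\beta}\, \bar{X}
\quad \Longrightarrow \quad
\hat{\alpha} = \bar{Y} - \hat{\beta}\, \bar{X}
```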
Substitute the second point and the solution for α into the estimated equation.
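Taking the second point to be (X̄ + s_X, Ȳ + r_XY s_Y) and the intercept to be Ȳ − β̂X̄ (both assumed from the surrounding text), the substitution would read:

```latex
\bar{Y} + r_{XY}\, s_Y
  = \left( \bar{Y} - \hat{\beta}\, \bar{X} \right) + \hat{\beta} \left( \bar{X} + s_X \right)
\quad \Longrightarrow \quad
r_{XY}\, s_Y = \hat{\beta}\, s_X
```

The Ȳ and β̂X̄ terms cancel on both sides, which is what leaves the reduced form.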
This reduced form can be solved directly for β.
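If the reduced form is r_XY s_Y = β̂ s_X, as the two-point argument suggests, dividing by s_X gives:

```latex
\hat{\beta} = r_{XY}\, \frac{s_Y}{s_X}
```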
Because the correlation coefficient can be expressed in terms of covariance and standard deviations...
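The identity referred to is the definition of the sample correlation coefficient. Here s_XY denotes the sample covariance of X and Y, a symbol introduced for this sketch:

```latex
r_{XY} = \frac{s_{XY}}{s_X\, s_Y}
```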
...the solution for β can be further reduced.
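Under the assumption that the slope was previously obtained as r_XY s_Y / s_X, substituting the covariance form of the correlation (with s_XY as the sample covariance, an assumed symbol) cancels s_Y:

```latex
\hat{\beta}
  = \frac{s_{XY}}{s_X\, s_Y} \cdot \frac{s_Y}{s_X}
  = \frac{s_{XY}}{s_X^{2}}
```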
Therefore, the regression line is estimated to be:
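One way to write the resulting line, assuming the intercept Ȳ − β̂X̄ and the slope s_XY / s_X² from the steps described above (s_XY being the sample covariance, an assumed symbol):

```latex
\hat{Y}
  = \left( \bar{Y} - \frac{s_{XY}}{s_X^{2}}\, \bar{X} \right) + \frac{s_{XY}}{s_X^{2}}\, X
  = \bar{Y} + \frac{s_{XY}}{s_X^{2}} \left( X - \bar{X} \right)
```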
