|
Size: 1289
Comment: Rewrite 1
|
Size: 1584
Comment: Apparently mutlivariate regression ~= multiple regression
|
| Line 1: | Line 1: |
| = Ordinary Least Squares Univariate Proof = | = OLS Single Regression Derivation = |
| Line 7: | Line 7: |
| The model is fit by a minimization problem: {{attachment:min.svg}} |
|
| Line 11: | Line 15: |
| As a starting point, the regression line must pass through these two points: | This line must pass through the mean and the slope of the line must be the marginal change in ''Y'' given a unit change in ''X''. In other words, the line must pass through two points: |
| Line 13: | Line 17: |
| {{attachment:regression1.svg}} | {{attachment:model3.svg}} |
| Line 15: | Line 19: |
| and | where: |
| Line 17: | Line 21: |
| {{attachment:regression2.svg}} | * ''X‾'' is the sample mean of ''X'' (estimating ''μ,,X,,'') * ''Y‾'' is the sample mean of ''Y'' (estimating ''μ,,Y,,'') * ''s,,X,,'' is the sample standard deviation of ''X'' (estimating ''σ,,X,,'') * ''s,,Y,,'' is the sample standard deviation of ''Y'' (estimating ''σ,,Y,,'') * and ''r,,XY,,'' is the sample correlation coefficient between ''X'' and ''Y'' (estimating ''ρ,,XY,,'') |
| Line 19: | Line 27: |
| Take the generic equation form of a line: | Insert the first point into the estimation. This is quickly solved for ''α''. |
| Line 21: | Line 29: |
| {{attachment:b01.svg}} | {{attachment:alpha1.svg}} |
| Line 23: | Line 31: |
| Insert the first point into this form. | {{attachment:alpha2.svg}} |
| Line 25: | Line 33: |
| {{attachment:b02.svg}} | Insert the second point and the solution for ''α'' into the estimation. |
| Line 27: | Line 35: |
| This can be trivially rewritten to solve for ''a'' in terms of ''b'': | {{attachment:beta1.svg}} |
| Line 29: | Line 37: |
| {{attachment:b03.svg}} | {{attachment:beta2.svg}} |
| Line 31: | Line 39: |
| Insert the second point into the original form. | {{attachment:beta3.svg}} |
| Line 33: | Line 41: |
| {{attachment:b04.svg}} | This reduced form can be quickly solved for ''β''. |
| Line 35: | Line 43: |
| Now additionally insert the solution for ''a'' in terms of ''b''. | {{attachment:beta4.svg}} |
| Line 37: | Line 45: |
| {{attachment:b05.svg}} | Because the correlation coefficient can be expressed in terms of covariance and standard deviations... |
| Line 39: | Line 47: |
| Expand all terms to produce: | {{attachment:correlation.svg}} |
| Line 41: | Line 49: |
| {{attachment:b06.svg}} | ...the solution for ''β'' can be further reduced. |
| Line 43: | Line 51: |
| This can now be eliminated into: | {{attachment:beta5.svg}} |
| Line 45: | Line 53: |
| {{attachment:b07.svg}} | Therefore, the regression line is estimated to be: |
| Line 47: | Line 55: |
| Giving a solution for ''b'': {{attachment:b08.svg}} This solution is trivially rewritten as: {{attachment:b09.svg}} Expand the formula for correlation as: {{attachment:b10.svg}} This can now be eliminated into: {{attachment:b11.svg}} Finally, ''b'' can be eloquently written as: {{attachment:b12.svg}} Giving a generic formula for the regression line: {{attachment:b13.svg}} |
{{attachment:regression.svg}} |
OLS Single Regression Derivation
The model is specified as:
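A plausible form for this specification, assuming a single regressor, an intercept ''α'', a slope ''β'', and an error term (the symbol ε and the index i are assumptions, not taken from the text):

```latex
Y_i = \alpha + \beta X_i + \varepsilon_i
```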
The model is fit by a minimization problem:
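Assuming the standard least-squares criterion over n observations (X_i, Y_i), the minimization could be written as follows. The index i and the sample size n are notation introduced here for illustration.

```latex
\min_{\alpha,\,\beta} \; \sum_{i=1}^{n} \left( Y_i - \alpha - \beta X_i \right)^2
```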
This is estimated as:
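A sketch of the estimated line, assuming the usual hat notation for fitted values and estimated coefficients (the hats are an assumption; the steps further down write α and β for these same estimates):

```latex
\hat{Y} = \hat{\alpha} + \hat{\beta} X
```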
The estimated line must pass through the point of means, and its slope must equal the marginal change in Y for a unit change in X. In other words, the line must pass through two points:
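One pair of points consistent with the definitions that follow is the point of means and the point one sample standard deviation of X to its right. This pairing is an assumption, chosen because it reproduces the textbook least-squares slope:

```latex
\left( \bar{X},\; \bar{Y} \right)
\qquad \text{and} \qquad
\left( \bar{X} + s_X,\; \bar{Y} + r_{XY}\, s_Y \right)
```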
where:
X‾ is the sample mean of X (estimating μ_X)
Y‾ is the sample mean of Y (estimating μ_Y)
s_X is the sample standard deviation of X (estimating σ_X)
s_Y is the sample standard deviation of Y (estimating σ_Y)
and r_XY is the sample correlation coefficient between X and Y (estimating ρ_XY)
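The sample statistics listed above can be computed directly. The following is a minimal sketch, assuming NumPy and a small made-up data set (the numbers are illustrative only). It also assumes that the two points are (X‾, Y‾) and (X‾ + s_X, Y‾ + r_XY·s_Y), and compares the line through them with the least-squares fit returned by `np.polyfit`.

```python
import numpy as np

# Made-up illustrative data, not taken from the source.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Sample statistics (ddof=1 gives the sample standard deviation).
x_bar, y_bar = x.mean(), y.mean()
s_x = x.std(ddof=1)
s_y = y.std(ddof=1)
r_xy = np.corrcoef(x, y)[0, 1]

# Assumed pair of points that the regression line passes through.
p1 = (x_bar, y_bar)
p2 = (x_bar + s_x, y_bar + r_xy * s_y)

# Slope and intercept of the line through the two points.
beta = (p2[1] - p1[1]) / (p2[0] - p1[0])
alpha = p1[1] - beta * p1[0]

# Reference: degree-1 least-squares fit, returned as [slope, intercept].
slope_ls, intercept_ls = np.polyfit(x, y, 1)

print("two-point line:", alpha, beta)
print("least squares: ", intercept_ls, slope_ls)
```

Under these assumptions the two printed pairs should agree up to floating-point error, which is a quick way to sanity-check the derivation on any data set.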
Substitute the first point into the estimated line. This is quickly solved for α.
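Assuming the estimated line has the form Ŷ = α + βX and the first point is the point of means (X‾, Y‾), this step might read:

```latex
\bar{Y} = \alpha + \beta \bar{X}
\quad \Longrightarrow \quad
\alpha = \bar{Y} - \beta \bar{X}
```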
Substitute the second point and the solution for α into the estimated line.
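A sketch of this substitution, assuming the second point is (X‾ + s_X, Y‾ + r_XY·s_Y) and that α = Y‾ − βX‾ from the first point:

```latex
\begin{aligned}
\bar{Y} + r_{XY}\, s_Y &= \alpha + \beta \left( \bar{X} + s_X \right) \\
\bar{Y} + r_{XY}\, s_Y &= \left( \bar{Y} - \beta \bar{X} \right) + \beta \bar{X} + \beta s_X \\
r_{XY}\, s_Y &= \beta\, s_X
\end{aligned}
```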
This reduced form can be quickly solved for β.
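Assuming the reduced form is r_XY·s_Y = β·s_X (which is what the assumed two points give), dividing both sides by s_X yields:

```latex
\beta = r_{XY}\, \frac{s_Y}{s_X}
```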
Because the correlation coefficient can be expressed in terms of covariance and standard deviations...
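This is the standard definition of the sample correlation coefficient. The symbol s_XY for the sample covariance of X and Y is notation introduced here, not taken from the text:

```latex
r_{XY} = \frac{s_{XY}}{s_X\, s_Y}
```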
...the solution for β can be further reduced.
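Assuming the slope was obtained as β = r_XY·s_Y / s_X, and writing s_XY for the sample covariance (an introduced symbol), the reduction might look like:

```latex
\beta
= \frac{s_{XY}}{s_X\, s_Y} \cdot \frac{s_Y}{s_X}
= \frac{s_{XY}}{s_X^{2}}
```

That is, the slope is the sample covariance of X and Y divided by the sample variance of X.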
Therefore, the regression line is estimated to be:
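Combining α = Y‾ − βX‾ with β = s_XY / s_X² (both taken as assumptions about the preceding steps, with s_XY denoting the sample covariance), the estimated line could be written as:

```latex
\hat{Y}
= \left( \bar{Y} - \frac{s_{XY}}{s_X^{2}}\, \bar{X} \right) + \frac{s_{XY}}{s_X^{2}}\, X
= \bar{Y} + \frac{s_{XY}}{s_X^{2}} \left( X - \bar{X} \right)
```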
