|
Size: 1293
Comment:
|
Size: 1866
Comment: Rewrite
|
| Deletions are marked like this. | Additions are marked like this. |
| Line 13: | Line 13: |
| The regression line passes through two points: | Given one independent variable and one dependent (outcome) variable, the OLS model is specified as: |
| Line 15: | Line 15: |
| {{attachment:regression1.svg}} | {{attachment:model.svg}} |
| Line 17: | Line 17: |
| and | It is estimated as: |
| Line 19: | Line 19: |
| {{attachment:regression2.svg}} | {{attachment:estimate.svg}} |
| Line 21: | Line 21: |
| Take the generic equation form of a line: | This model describes (1) the mean observation and (2) the marginal changes to the outcome per unit changes in the independent variable. |
| Line 23: | Line 23: |
| {{attachment:b01.svg}} | The proof can be seen [[Econometrics/OrdinaryLeastSquares/UnivariateProof|here]]. |
| Line 25: | Line 25: |
| Insert the first point into this form. | ---- |
| Line 27: | Line 27: |
| {{attachment:b02.svg}} | |
| Line 29: | Line 28: |
| This can be trivially rewritten to solve for ''a'' in terms of ''b'': | |
| Line 31: | Line 29: |
| {{attachment:b03.svg}} | == Multivariate == |
| Line 33: | Line 31: |
| Insert the second point into the original form. | ---- |
| Line 35: | Line 33: |
| {{attachment:b04.svg}} | |
| Line 37: | Line 34: |
| Now additionally insert the solution for ''a'' in terms of ''b''. | |
| Line 39: | Line 35: |
| {{attachment:b05.svg}} | == Linear Model == |
| Line 41: | Line 37: |
| Expand all terms to produce: | The linear model can be expressed as: |
| Line 43: | Line 39: |
| {{attachment:b06.svg}} | {{attachment:model1.svg}} |
| Line 45: | Line 41: |
| This can now be eliminated into: | If these assumptions can be made: |
| Line 47: | Line 43: |
| {{attachment:b07.svg}} | 1. Linearity 2. [[Econometrics/Exogeneity|Exogeneity]] 3. Random sampling 4. No perfect multicolinearity 5. [[Econometrics/Homoskedasticity|Homoskedasticity]] |
| Line 49: | Line 49: |
| Giving a solution for ''b'': | Then OLS is the best linear unbiased estimator ('''BLUE''') for these coefficients. |
| Line 51: | Line 51: |
| {{attachment:b08.svg}} | Using the computation above, the coefficients are estimated to produce: |
| Line 53: | Line 53: |
| This solution is trivially rewritten as: | {{attachment:model2.svg}} |
| Line 55: | Line 55: |
| {{attachment:b09.svg}} | The variances for each coefficient are: |
| Line 57: | Line 57: |
| Expand the formula for correlation as: | {{attachment:homo1.svg}} |
| Line 59: | Line 59: |
| {{attachment:b10.svg}} | Note that the standard deviation of the population's parameter is unknown, so it's estimated like: |
| Line 61: | Line 61: |
| This can now be eliminated into: | {{attachment:homo2.svg}} |
| Line 63: | Line 63: |
| {{attachment:b11.svg}} | If the homoskedasticity assumption does not hold, then the estimators for each coefficient are actually: |
| Line 65: | Line 65: |
| Finally, ''b'' can be eloquently written as: | {{attachment:hetero1.svg}} |
| Line 67: | Line 67: |
| {{attachment:b12.svg}} | Wherein, for example, ''r,,1j,,'' is the residual from regressing ''x,,1,,'' onto ''x,,2,,'', ... ''x,,k,,''. |
| Line 69: | Line 69: |
| Giving a generic formula for the regression line: | The variances for each coefficient can be estimated with the Eicker-White formula: |
| Line 71: | Line 71: |
| {{attachment:b13.svg}} | {{attachment:hetero2.svg}} See [[https://www.youtube.com/@kuminoff|Nicolai Kuminoff's]] video lectures for the derivation of the robust estimators. |
Ordinary Least Squares
Ordinary Least Squares (OLS) is a linear regression method. It minimizes the sum of squared residuals.
Univariate
Given one independent variable and one dependent (outcome) variable, the OLS model is specified as:
It is estimated as:
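The two equations referred to above are not reproduced in this text. As a sketch in conventional notation (the symbols β0, β1, ε and the observation index j are assumptions, not taken from the original figures), the specification and its fitted counterpart are usually written as:

```latex
\begin{align}
  y_j       &= \beta_0 + \beta_1 x_j + \varepsilon_j \\
  \hat{y}_j &= \hat{\beta}_0 + \hat{\beta}_1 x_j
\end{align}
```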
This model describes (1) the mean observation and (2) the marginal changes to the outcome per unit changes in the independent variable.
The proof can be seen at Econometrics/OrdinaryLeastSquares/UnivariateProof.
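A minimal sketch of the univariate estimate, assuming NumPy is available. The function name `fit_univariate_ols` and the sample data are hypothetical; the slope is computed with the textbook closed form (sum of cross-deviations over sum of squared deviations of x), and the intercept follows from the fitted line passing through the point of means.

```python
import numpy as np

def fit_univariate_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dev = x - x.mean()
    y_dev = y - y.mean()
    # Slope: marginal change in the outcome per unit change in x.
    slope = (x_dev @ y_dev) / (x_dev @ x_dev)
    # Intercept: forces the fitted line through (mean of x, mean of y).
    intercept = y.mean() - slope * x.mean()
    return intercept, slope

# Hypothetical data lying exactly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_univariate_ols(x, y)
```

With data that lie exactly on a line, the function should recover that line's intercept and slope up to floating-point error.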
Multivariate
Linear Model
The linear model can be expressed as:
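The equation itself is not reproduced in this text. A conventional way to write a linear model with k regressors is sketched below; the coefficient symbols and the error term ε are assumptions, while the indexing (variable first, observation j second) follows the r1j notation used later on this page.

```latex
\begin{equation}
  y_j = \beta_0 + \beta_1 x_{1j} + \beta_2 x_{2j} + \dots + \beta_k x_{kj} + \varepsilon_j
\end{equation}
```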
If these assumptions can be made:
- Linearity
- Exogeneity
- Random sampling
- No perfect multicollinearity
- Homoskedasticity
Then OLS is the best linear unbiased estimator (BLUE) for these coefficients.
Using the computation above, the coefficients are estimated to produce:
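The fitted equation is not reproduced in this text. As a rough illustration of how the coefficients might be computed numerically, the sketch below assumes NumPy and uses its least-squares solver. The function name `fit_ols` and the simulated data are hypothetical.

```python
import numpy as np

def fit_ols(X, y):
    """Return [b0, b1, ..., bk], with b0 the intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

# Hypothetical noise-free data: y = 1 + 2*x1 - 3*x2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
coef = fit_ols(X, y)
```

Because the simulated outcome has no noise, the estimated coefficients should match the generating values up to floating-point error.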
The variances for each coefficient are:
Note that the variance of the error term is an unknown population parameter, so it is estimated from the residuals as:
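A minimal sketch of the classical (homoskedastic) variance computation, assuming NumPy. It uses the standard matrix form, in which the estimated error variance is the residual sum of squares divided by the degrees of freedom and the coefficient variances are the diagonal of that estimate times the inverse of X'X. The function name `ols_classical` and the simulated data are hypothetical, and the original figures may express the same quantities differently.

```python
import numpy as np

def ols_classical(X, y):
    """OLS with an intercept; returns (coef, sigma2_hat, std_errors)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    p = design.shape[1]                      # k regressors plus the intercept
    xtx_inv = np.linalg.inv(design.T @ design)
    coef = xtx_inv @ design.T @ y
    resid = y - design @ coef
    # Estimated error variance with a degrees-of-freedom correction.
    sigma2_hat = (resid @ resid) / (n - p)
    std_errors = np.sqrt(np.diag(sigma2_hat * xtx_inv))
    return coef, sigma2_hat, std_errors

# Hypothetical data with constant error variance.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 1.0 + 2.0 * x + rng.normal(size=100)
coef, sigma2_hat, std_errors = ols_classical(x, y)
```

In the single-regressor case, the slope's standard error from this function coincides with the square root of the estimated error variance divided by the total sum of squares of x.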
If the homoskedasticity assumption does not hold, then the estimators for each coefficient are actually:
Wherein, for example, r_1j is the residual from regressing x_1 on x_2, ..., x_k.
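The role of these residuals can be illustrated with the partialling-out (Frisch-Waugh-Lovell) result: the coefficient on the first regressor equals the sum of residual-times-outcome products divided by the sum of squared residuals. The sketch below assumes NumPy; the helper `residualize` and the simulated data are hypothetical, and the auxiliary regression is assumed to include an intercept.

```python
import numpy as np

def residualize(target, others):
    """Residuals from regressing `target` on an intercept and `others`."""
    design = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

# Hypothetical data with correlated regressors.
rng = np.random.default_rng(2)
n = 300
x2 = rng.normal(size=n)
x1 = 0.5 * x2 + rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

# Partialled-out estimate of the coefficient on x1.
r1 = residualize(x1, x2)
beta1_partial = (r1 @ y) / (r1 @ r1)

# Full regression for comparison.
full_design = np.column_stack([np.ones(n), x1, x2])
beta_full, *_ = np.linalg.lstsq(full_design, y, rcond=None)
```

The partialled-out estimate and the corresponding coefficient from the full regression agree up to floating-point error.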
The variances for each coefficient can be estimated with the Eicker-White formula:
See Nicolai Kuminoff's video lectures (https://www.youtube.com/@kuminoff) for the derivation of the robust estimators.
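A minimal sketch of heteroskedasticity-robust standard errors, assuming NumPy. It implements the basic Eicker-White sandwich form (often labelled HC0), with no small-sample correction, which may differ in presentation from the formula in the original figure. The function name `ols_white` and the simulated data are hypothetical.

```python
import numpy as np

def ols_white(X, y):
    """OLS with an intercept; returns (coef, resid, robust_std_errors)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    design = np.column_stack([np.ones(len(y)), X])
    xtx_inv = np.linalg.inv(design.T @ design)
    coef = xtx_inv @ design.T @ y
    resid = y - design @ coef
    # "Meat" of the sandwich: each observation weighted by its squared residual.
    meat = design.T @ (design * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv
    return coef, resid, np.sqrt(np.diag(cov))

# Hypothetical data whose error spread grows with |x1|.
rng = np.random.default_rng(3)
n = 500
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + np.abs(x1) * rng.normal(size=n)
coef, resid, robust_se = ols_white(np.column_stack([x1, x2]), y)
```

For a single coefficient, the same variance can be written with the partialled-out residuals: the sum of squared residual products divided by the squared sum of squared r_1j values.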
