Ordinary Least Squares
Ordinary Least Squares (OLS) is a linear regression method. It estimates coefficients by minimizing the sum of squared residuals.
Univariate
Given one independent variable and one dependent (outcome) variable, the OLS model is specified as:

y = β₀ + β₁x + ε
It is estimated as:

ŷ = β̂₀ + β̂₁x
This model describes (1) the mean observation and (2) the marginal change in the outcome per unit change in the independent variable.
The derivation can be seen at Statistics/OrdinaryLeastSquares/Univariate.
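As an illustrative sketch (not part of the original page), the univariate estimates can be computed directly from the closed-form formulas β̂₁ = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² and β̂₀ = ȳ − β̂₁x̄, here with NumPy on data constructed so the true coefficients are known:

```python
import numpy as np

# Data generated from y = 2 + 3x exactly, so the estimates
# should recover the true intercept (2) and slope (3).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2.0 + 3.0 * x

# Closed-form univariate OLS estimates.
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

print(b0, b1)  # -> 2.0 3.0
```

With noiseless data the fit is exact; with a noise term ε added to y, the estimates would scatter around the true values.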
Multivariate
Given k independent variables, the OLS model is specified as:

y = β₀ + β₁x₁ + β₂x₂ + … + βₖxₖ + ε
It is estimated as:

ŷ = β̂₀ + β̂₁x₁ + β̂₂x₂ + … + β̂ₖxₖ
More conventionally, this is estimated with linear algebra as:

β̂ = (XᵀX)⁻¹Xᵀy
The derivation can be seen at Statistics/OrdinaryLeastSquares/Multivariate.
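A minimal sketch of the matrix estimation with NumPy (the data and coefficients below are invented for illustration). Note that solving the normal equations (XᵀX)β = Xᵀy is numerically safer than forming the inverse explicitly:

```python
import numpy as np

# Design matrix with an intercept column and two predictors;
# y is constructed from y = 1 + 2*x1 - 0.5*x2 exactly.
x1 = np.array([0.0, 1.0, 2.0, 3.0])
x2 = np.array([1.0, 0.0, 1.0, 4.0])
X = np.column_stack([np.ones_like(x1), x1, x2])
y = 1.0 + 2.0 * x1 - 0.5 * x2

# Normal equations: solve (X'X) beta = X'y instead of
# computing (X'X)^-1 directly.
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)  # -> [ 1.   2.  -0.5]
```

In practice `np.linalg.lstsq(X, y, rcond=None)` is the preferred entry point, since it handles ill-conditioned design matrices more gracefully.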
Estimated Coefficients
The Gauss-Markov theorem demonstrates that, under the assumptions below, the OLS estimators are the best linear unbiased estimators (BLUE) of the regression coefficients. The assumptions are:
1. Linearity
2. Exogeneity, i.e. the predictors are uncorrelated with the error term
3. Random sampling
4. No perfect multicollinearity
5. Homoskedasticity, i.e. the variance of the error term is constant across observations
Assumption 5 mostly comes into play in the estimation of standard errors, and there are alternative estimators that are robust to heteroskedasticity.
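As a hedged sketch of one such robust estimator: the HC0 (White) covariance estimator replaces the homoskedastic formula σ²(XᵀX)⁻¹ with a "sandwich" (XᵀX)⁻¹ Xᵀ diag(eᵢ²) X (XᵀX)⁻¹ built from the squared residuals. The simulated data below are invented for illustration, with noise whose scale grows with |x|:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
# Heteroskedastic errors: the noise scale grows with |x|.
y = 1.0 + 2.0 * x + rng.normal(size=n) * (0.5 + np.abs(x))

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# HC0 sandwich: (X'X)^-1 X' diag(e^2) X (X'X)^-1
meat = X.T @ (resid[:, None] ** 2 * X)
robust_cov = XtX_inv @ meat @ XtX_inv
robust_se = np.sqrt(np.diag(robust_cov))
print(robust_se)
```

Refinements such as HC1 through HC3 apply small-sample corrections to the same sandwich structure.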
