= Ordinary Least Squares =

'''Ordinary Least Squares''' ('''OLS''') is a linear regression method. It estimates the coefficients that minimize the sum of squared residuals.
== Univariate ==

Given one independent variable and one dependent (outcome) variable, the OLS model is specified as:

{{attachment:model.svg}}

It is estimated as:

{{attachment:estimate.svg}}

This model describes (1) the mean observation and (2) the marginal change in the outcome per unit change in the independent variable. The derivation can be seen [[Econometrics/OrdinaryLeastSquares/Univariate|here]].
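The univariate estimates can be sketched numerically. This assumes the standard closed-form solution (the content of the attached figures): the slope is the sample covariance of ''x'' and ''y'' over the sample variance of ''x'', and the intercept makes the line pass through the sample means. The function name is illustrative, not part of the page.

```python
# Univariate OLS via the closed-form estimators:
#   b1_hat = Cov(x, y) / Var(x)
#   b0_hat = mean(y) - b1_hat * mean(x)
def ols_univariate(x, y):
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    # Shared factors of 1/n cancel in the ratio, so raw sums suffice.
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = my - b1 * mx
    return b0, b1

# Exactly linear data recovers the coefficients exactly: y = 1 + 2x.
b0, b1 = ols_univariate([1, 2, 3, 4], [3, 5, 7, 9])  # b0 = 1.0, b1 = 2.0
```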
== Multivariate ==

Given ''k'' independent variables, the OLS model is specified as:

{{attachment:mmodel.svg}}

It is estimated as:

{{attachment:mestimate.svg}}

More conventionally, this is estimated with [[LinearAlgebra|linear algebra]] as:

{{attachment:matrix.svg}}

The derivation can be seen [[Econometrics/OrdinaryLeastSquares/Multivariate|here]].
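A sketch of the matrix form, assuming the conventional estimator shown in the attached figure, β̂ = (X′X)⁻¹X′y. Rather than inverting X′X explicitly, this solves the normal equations directly; the helper name is illustrative.

```python
import numpy as np

# Multivariate OLS: beta_hat solves (X'X) beta = X'y,
# with a leading column of ones for the intercept.
def ols_matrix(X, y):
    X = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
    y = np.asarray(y, dtype=float)
    return np.linalg.solve(X.T @ X, X.T @ y)  # normal equations

# Two regressors; the data satisfy y = 1 + 2*x1 - 1*x2 exactly,
# so OLS recovers [1, 2, -1].
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 3]]
y = [1, 3, 0, 2, 2]
beta = ols_matrix(X, y)  # array([ 1.,  2., -1.])
```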
----

== Estimated Coefficients ==

The '''Gauss-Markov theorem''' demonstrates that (under certain assumptions) the OLS estimators are the '''best linear unbiased estimators''' ('''BLUE''') of the regression coefficients. The assumptions are:

 1. Linearity
 2. [[Econometrics/Exogeneity|Exogeneity]]
 3. Random sampling
 4. No perfect [[LinearAlgebra/Basis|multicollinearity]]
 5. [[Econometrics/Homoskedasticity|Homoskedasticity]]
The variances of the coefficient estimators are:

{{attachment:homo1.svg}}

Note that the variance of the population error term is unknown, so it is estimated from the residuals as:

{{attachment:homo2.svg}}
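As a concrete check of the homoskedastic variance formula, this sketch assumes the univariate case, where the slope variance is s² / Σ(x,,i,, − x̄)² and the error variance is estimated as s² = Σ û,,i,,² / (n − 2). The function name is illustrative.

```python
# Estimated variance of the univariate OLS slope under homoskedasticity:
#   s^2 = sum(u_i^2) / (n - 2)          (estimated error variance)
#   Var(b1_hat) = s^2 / sum((x_i - xbar)^2)
def slope_variance(x, y):
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    s2 = sum(u ** 2 for u in resid) / (n - 2)  # n - 2 parameters estimated
    return s2 / sxx
```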
If the homoskedasticity assumption does not hold, then the variances of the coefficient estimators are actually:

{{attachment:hetero1.svg}}

Wherein, for example, ''r,,i1,,'' is the residual for observation ''i'' from regressing ''x,,1,,'' onto ''x,,2,,'', ..., ''x,,k,,''.

The variances can then be estimated with the Eicker-White formula:

{{attachment:hetero2.svg}}

See [[https://www.youtube.com/@kuminoff|Nicolai Kuminoff's]] video lectures for the derivation of the robust estimators.
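A sketch of the Eicker-White robust variance, assuming the univariate case: the slope variance is estimated as Σ(x,,i,, − x̄)²û,,i,,² / (Σ(x,,i,, − x̄)²)², since with only a constant as the "other" regressor, the partialling-out residual reduces to x,,i,, − x̄. The function name is illustrative.

```python
# Eicker-White (heteroskedasticity-robust) variance for the univariate slope:
#   Var(b1_hat) = sum((x_i - xbar)^2 * u_i^2) / (sum((x_i - xbar)^2))^2
# where u_i are the OLS residuals, and x_i - xbar plays the role of the
# residual from regressing x on the remaining regressors (here, a constant).
def robust_slope_variance(x, y):
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    b0 = my - b1 * mx
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return sum((xi - mx) ** 2 * u ** 2 for xi, u in zip(x, resid)) / sxx ** 2
```

Unlike the homoskedastic formula, each squared residual is weighted by its own leverage term, so observations with unusual ''x'' and large errors contribute more to the estimated variance.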