Differences between revisions 5 and 18 (spanning 13 versions)
Revision 5 as of 2023-10-28 07:02:07
Size: 2051
Comment: Added model
Revision 18 as of 2024-06-05 21:29:24
Size: 1866
Comment: Rewrite
Line 13: Line 13:
The regression line passes through two points: Given one independent variable and one dependent (outcome) variable, the OLS model is specified as:
Line 15: Line 15:
{{attachment:regression1.svg}} {{attachment:model.svg}}
Line 17: Line 17:
and It is estimated as:
Line 19: Line 19:
{{attachment:regression2.svg}} {{attachment:estimate.svg}}
Line 21: Line 21:
Take the generic equation form of a line: This model describes (1) the mean observation and (2) the marginal changes to the outcome per unit changes in the independent variable.
Line 23: Line 23:
{{attachment:b01.svg}} The proof can be seen [[Econometrics/OrdinaryLeastSquares/UnivariateProof|here]].
Line 25: Line 25:
Insert the first point into this form. ----
Line 27: Line 27:
{{attachment:b02.svg}}
Line 29: Line 28:
This can be trivially rewritten to solve for ''a'' in terms of ''b'':
Line 31: Line 29:
{{attachment:b03.svg}}

Insert the second point into the original form.

{{attachment:b04.svg}}

Now additionally insert the solution for ''a'' in terms of ''b''.

{{attachment:b05.svg}}

Expand all terms to produce:

{{attachment:b06.svg}}

This can now be eliminated into:

{{attachment:b07.svg}}

Giving a solution for ''b'':

{{attachment:b08.svg}}

This solution is trivially rewritten as:

{{attachment:b09.svg}}

Expand the formula for correlation as:

{{attachment:b10.svg}}

This can now be eliminated into:

{{attachment:b11.svg}}

Finally, ''b'' can be eloquently written as:

{{attachment:b12.svg}}

Giving a generic formula for the regression line:

{{attachment:b13.svg}}
== Multivariate ==
Line 86: Line 44:
 2. Exogeneity

{{attachment:model2.svg}}

 3.#3 Random sampling
 2. [[Econometrics/Exogeneity|Exogeneity]]
 3. Random sampling
Line 92: Line 47:
 5. Heteroskedasticity  5. [[Econometrics/Homoskedasticity|Homoskedasticity]]
Line 98: Line 53:
{{attachment:model3.svg}} {{attachment:model2.svg}}
Line 100: Line 55:
The variance for each coefficient is estimated as: The variances for each coefficient are:
Line 102: Line 57:
{{attachment:model4.svg}} {{attachment:homo1.svg}}
Line 104: Line 59:
Where R^^2^^ is calculated as: Note that the standard deviation of the population's parameter is unknown, so it's estimated like:
Line 106: Line 61:
{{attachment:model5.svg}} {{attachment:homo2.svg}}
Line 108: Line 63:
Note also that the standard deviation of the population's parameter is unknown, so it's estimated like: If the homoskedasticity assumption does not hold, then the estimators for each coefficient are actually:
Line 110: Line 65:
{{attachment:model6.svg}} {{attachment:hetero1.svg}}

Wherein, for example, ''r,,1j,,'' is the residual from regressing ''x,,1,,'' onto ''x,,2,,'', ... ''x,,k,,''.

The variances for each coefficient can be estimated with the Eicker-White formula:

{{attachment:hetero2.svg}}

See [[https://www.youtube.com/@kuminoff|Nicolai Kuminoff's]] video lectures for the derivation of the robust estimators.

Ordinary Least Squares

Ordinary Least Squares (OLS) is a linear regression method. It chooses the coefficients that minimize the sum of squared residuals, which is equivalent to minimizing the root mean square error.


Univariate

Given one independent variable and one dependent (outcome) variable, the OLS model is specified as:

model.svg

It is estimated as:

estimate.svg

This model describes (1) the mean observation and (2) the marginal changes to the outcome per unit changes in the independent variable.

The proof can be seen here.
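
As a minimal sketch of the univariate estimate, assuming the usual closed form (slope = sample covariance of x and y divided by sample variance of x, intercept = mean of y minus slope times mean of x), the fit can be computed in plain Python. The function name `ols_univariate` and the sample data are illustrative only and do not come from this page.

```python
def ols_univariate(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must take at least two distinct values")
    slope = sxy / sxx
    # The fitted line passes through the point of means (x_bar, y_bar).
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Illustrative data lying exactly on y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0]
intercept, slope = ols_univariate(x, y)
print(intercept, slope)
```

The slope is the marginal change in the outcome per unit change in x, and the intercept is set so that the line passes through the mean observation.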


Multivariate


Linear Model

The linear model can be expressed as:

model1.svg

If these assumptions can be made:

  1. Linearity
  2. Exogeneity
  3. Random sampling
  4. No perfect multicollinearity
  5. Homoskedasticity

Then OLS is the best linear unbiased estimator (BLUE) for these coefficients.

Using the computation above, the coefficients are estimated to produce:

model2.svg
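
A hedged sketch of the multivariate estimate, assuming NumPy is available and that the model includes an intercept. It solves the least-squares problem with `numpy.linalg.lstsq` rather than inverting a matrix by hand. The helper name `ols_fit` and the simulated data are assumptions made for illustration.

```python
import numpy as np


def ols_fit(X, y):
    """Return OLS coefficients [intercept, b1, ..., bk]."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


# Illustrative noiseless data: y = 1 + 2*x1 - 3*x2.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = 1.0 + 2.0 * X[:, 0] - 3.0 * X[:, 1]
beta = ols_fit(X, y)
print(beta)
```

Because the example data contain no noise, the recovered coefficients match the generating values up to floating-point error.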

The variances for each coefficient are:

homo1.svg

Note that the standard deviation of the population error term is unknown, so it is estimated as:

homo2.svg
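
The attached formulas are not reproduced in this text, so the following is only a sketch under the textbook assumptions: the error variance is estimated as the residual sum of squares divided by the residual degrees of freedom, and the coefficient variances are the diagonal of that estimate times the inverse of X'X. The function name `ols_classical_variance` and the data are illustrative, and NumPy is assumed.

```python
import numpy as np


def ols_classical_variance(X, y):
    """Return (beta, sigma2_hat, coefficient variances) under homoskedasticity."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    p = design.shape[1]  # number of estimated coefficients, intercept included
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    # Unbiased estimate of the error variance: SSR / (n - p).
    sigma2_hat = (resid @ resid) / (n - p)
    variances = np.diag(sigma2_hat * xtx_inv)
    return beta, sigma2_hat, variances


# Illustrative data with constant-variance noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(size=100)
beta, sigma2_hat, variances = ols_classical_variance(X, y)
print(np.sqrt(variances))  # standard errors
```

With a single regressor, the slope variance from this sketch reduces to the estimated error variance divided by the sum of squared deviations of x.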

If the homoskedasticity assumption does not hold, then the estimators for each coefficient can be written as:

hetero1.svg

Wherein, for example, r_1j is the residual for observation j from regressing x_1 on x_2, ..., x_k.
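
The residual described above can be illustrated with a short sketch. It assumes the attached expression is the usual partialling-out form, in which the coefficient on x_1 equals the sum of r_1j times y_j divided by the sum of squared r_1j. The helper name `partial_out_coefficient` and the simulated data are hypothetical, and NumPy is assumed.

```python
import numpy as np


def partial_out_coefficient(X, y, j=0):
    """Coefficient on column j of X, computed from partialled-out residuals."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Regress x_j on an intercept and all other regressors.
    others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
    gamma, *_ = np.linalg.lstsq(others, X[:, j], rcond=None)
    r = X[:, j] - others @ gamma  # the part of x_j not explained by the others
    return (r @ y) / (r @ r)


# Illustrative data with correlated regressors.
rng = np.random.default_rng(1)
x2 = rng.normal(size=100)
x1 = 0.5 * x2 + rng.normal(size=100)
X = np.column_stack([x1, x2])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=100)

# Coefficients from the full regression, for comparison.
full = np.linalg.lstsq(np.column_stack([np.ones(100), X]), y, rcond=None)[0]
b1 = partial_out_coefficient(X, y, j=0)
print(b1, full[1])
```

The two printed values agree up to floating-point error, since the partialled-out coefficient is the same number as the corresponding coefficient in the full regression.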

The variances for each coefficient can be estimated with the Eicker-White formula:

hetero2.svg
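
As a rough sketch of a heteroskedasticity-robust variance, the code below uses the simplest form of the Eicker-White estimator (often labelled HC0), with no finite-sample correction. Whether the attached formula applies such a correction is not known from this text, so treat the choice as an assumption. The function name `white_variance` and the data are illustrative, and NumPy is assumed.

```python
import numpy as np


def white_variance(X, y):
    """Return (beta, HC0 robust variances of the coefficients)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    xtx_inv = np.linalg.inv(design.T @ design)
    beta = xtx_inv @ design.T @ y
    resid = y - design @ beta
    # "Meat" of the sandwich: X' diag(resid^2) X, without forming the diagonal matrix.
    meat = design.T @ (design * resid[:, None] ** 2)
    cov = xtx_inv @ meat @ xtx_inv
    return beta, np.diag(cov)


# Illustrative data whose error spread grows with |x1|.
rng = np.random.default_rng(2)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x1, x2])
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n) * (1.0 + np.abs(x1))
beta, robust_var = white_variance(X, y)
print(np.sqrt(robust_var))  # robust standard errors
```

For a single coefficient, this sandwich form is algebraically the same as summing the squared partialled-out residuals times the squared regression residuals and dividing by the square of the sum of squared partialled-out residuals.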

See Nicolai Kuminoff's video lectures for the derivation of the robust estimators.


CategoryRicottone

Statistics/OrdinaryLeastSquares (last edited 2025-09-03 02:08:40 by DominicRicottone)