Differences between revisions 18 and 26 (spanning 8 versions)

Ordinary Least Squares

Ordinary Least Squares (OLS) is a linear regression method. It chooses the coefficients that minimize the sum of squared residuals (equivalently, the root mean square error).

Univariate

Given one independent variable and one dependent (outcome) variable, the OLS model is specified as:

It is estimated as:

This model describes (1) the expected outcome when the independent variable is zero (the intercept) and (2) the marginal change in the outcome per unit change in the independent variable (the slope).
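As a minimal sketch, assume the conventional notation $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$ for the model (these symbols are an assumption, not taken from this page). Under that notation the closed-form estimates are $\hat{\beta}_1 = \sum (x_i - \bar{x})(y_i - \bar{y}) / \sum (x_i - \bar{x})^2$ and $\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$. The function name `fit_univariate_ols` and the data are hypothetical, chosen only for illustration.

```python
from statistics import fmean


def fit_univariate_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    x_bar, y_bar = fmean(x), fmean(y)
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    if s_xx == 0:
        raise ValueError("x must vary across observations")
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Illustrative data lying exactly on y = 1 + 2x.
x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.0, 5.0, 7.0]
intercept, slope = fit_univariate_ols(x, y)
```

Because the illustrative points lie exactly on a line, the fit recovers that line's intercept and slope; with noisy data the same function returns the least-squares line.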

The derivation can be seen at Statistics/OrdinaryLeastSquares/Univariate.

Multivariate

Given k independent variables, the OLS model is specified as:

It is estimated as:

More conventionally, this is estimated with linear algebra as:

The derivation can be seen at Statistics/OrdinaryLeastSquares/Multivariate.
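As a hedged illustration of the linear-algebra route, assume the conventional normal-equations form $\hat{\beta} = (X^\top X)^{-1} X^\top y$, where $X$ is the design matrix with a leading column of ones (this notation is an assumption, not taken from this page). The sketch below solves $(X^\top X)\hat{\beta} = X^\top y$ in plain Python. The names `solve_linear_system` and `fit_ols` and the data are hypothetical, chosen only for illustration.

```python
def solve_linear_system(a, b):
    """Solve a @ z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified.
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: perfect multicollinearity?")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - tail) / m[i][i]
    return z


def fit_ols(rows, y):
    """Return [intercept, b1, ..., bk] from the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Illustrative data generated exactly from y = 1 + 2*x1 - 3*x2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1)]
y = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]
coefficients = fit_ols(rows, y)
```

Forming $X^\top X$ explicitly keeps the sketch close to the formula. In practice a numerical library's least-squares routine is usually preferable, since it is better conditioned.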

Estimated Coefficients

The Gauss-Markov theorem states that, under the assumptions below, the OLS estimators are the best linear unbiased estimators (BLUE) of the regression coefficients. The assumptions are:

1. Linearity
2. Exogeneity, i.e. the error term has zero mean conditional on the predictors, so the predictors are uncorrelated with the error term
3. Random sampling
4. No perfect multicollinearity
5. Homoskedasticity, i.e. the variance of the error term is constant across observations

Assumption 5 mostly affects the estimation of standard errors, and there are alternative estimators that are robust to heteroskedasticity.
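The contrast between classical and heteroskedasticity-robust standard errors can be sketched for the univariate slope. The sketch assumes the usual textbook formulas, which are not reproduced from this page: the classical variance $\hat{\sigma}^2 / \sum (x_i - \bar{x})^2$ with $\hat{\sigma}^2 = \sum \hat{e}_i^2 / (n - 2)$, and the Eicker-White (HC0) variance $\sum (x_i - \bar{x})^2 \hat{e}_i^2 / \left(\sum (x_i - \bar{x})^2\right)^2$. The function name `slope_standard_errors` and the data are hypothetical, chosen only for illustration.

```python
from math import sqrt
from statistics import fmean


def slope_standard_errors(x, y):
    """Return (slope, classical_se, robust_se) for a univariate OLS fit."""
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    dev = [xi - x_bar for xi in x]
    s_xx = sum(d * d for d in dev)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dev, y)) / s_xx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Classical: one common error variance, estimated with n - 2 degrees of freedom.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    classical_se = sqrt(sigma2 / s_xx)
    # Eicker-White (HC0): each squared residual stands in for that
    # observation's own error variance.
    robust_se = sqrt(sum(d * d * e * e for d, e in zip(dev, resid))) / s_xx
    return slope, classical_se, robust_se


# Illustrative data only.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.7, 5.6, 5.4]
slope, classical_se, robust_se = slope_standard_errors(x, y)
```

Both standard errors share the same slope estimate. Only the variance formula changes, which is why a failure of homoskedasticity affects inference rather than the coefficient itself.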

CategoryRicottone

Statistics/OrdinaryLeastSquares (last edited 2025-05-17 03:48:23 by DominicRicottone)

- Revision 18 as of 2024-06-05 21:29:24
  Size: 1866
  Editor: DominicRicottone
  Comment: Rewrite
+ Revision 26 as of 2025-05-17 03:48:23
  Size: 1689
  Editor: DominicRicottone
  Comment: Clarifications
-Deletions are marked like this.
+Additions are marked like this.
 Line 23:
-The proof can be seen [[Econometrics/OrdinaryLeastSquares/UnivariateProof|here]].
+The derivation can be seen [[Statistics/OrdinaryLeastSquares/Univariate|here]].
 Line 31:
+Given ''k'' independent variables, the OLS model is specified as:

{{attachment:mmodel.svg}}

It is estimated as:

{{attachment:mestimate.svg}}

More conventionally, this is estimated with [[LinearAlgebra|linear algebra]] as:

{{attachment:matrix.svg}}

The derivation can be seen [[Statistics/OrdinaryLeastSquares/Multivariate|here]].
-Line 35:
+Line 49:
-== Linear Model ==
+== Estimated Coefficients ==
-Line 37:
+Line 51:
-The linear model can be expressed as:

{{attachment:model1.svg}}

If these assumptions can be made:
+The '''Gauss-Markov theorem''' demonstrates that (with some assumptions) the OLS estimations are the '''best linear unbiased estimators''' ('''BLUE''') for the regression coefficients. The assumptions are:
-Line 44:
+Line 54:
-. [[Econometrics/Exogeneity|Exogeneity]]
+. Exogeneity, i.e. the error term has zero mean conditional on the predictors, so the predictors are uncorrelated with the error term
-Line 46:
+Line 56:
-. No perfect multicolinearity
 5. [[Econometrics/Homoskedasticity|Homoskedasticity]]
+. No perfect [[LinearAlgebra/Basis|multicollinearity]]
 5. Homoskedasticity, i.e. the variance of the error term is constant across observations
-Line 49:
+Line 59:
-Then OLS is the best linear unbiased estimator ('''BLUE''') for these coefficients.

Using the computation above, the coefficients are estimated to produce:

{{attachment:model2.svg}}

The variances for each coefficient are:

{{attachment:homo1.svg}}

Note that the standard deviation of the population's parameter is unknown, so it's estimated like:

{{attachment:homo2.svg}}

If the homoskedasticity assumption does not hold, then the estimators for each coefficient are actually:

{{attachment:hetero1.svg}}

Wherein, for example, ''r,,1j,,'' is the residual from regressing ''x,,1,,'' onto ''x,,2,,'', ... ''x,,k,,''.

The variances for each coefficient can be estimated with the Eicker-White formula:

{{attachment:hetero2.svg}}

See [[https://www.youtube.com/@kuminoff|Nicolai Kuminoff's]] video lectures for the derivation of the robust estimators.
+Assumption 5 mostly affects the estimation of [[Statistics/StandardErrors|standard errors]], and there are alternative estimators that are robust to heteroskedasticity.

Diff for "Statistics/OrdinaryLeastSquares"
