Ordinary Least Squares
Ordinary Least Squares (OLS) is a linear regression method. It estimates coefficients by minimizing the sum of squared residuals.
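Stated explicitly, the objective is (standard notation, with k independent variables):

```latex
\min_{\beta_0,\ldots,\beta_k} \; \sum_{i=1}^{n} \left( y_i - \beta_0 - \beta_1 x_{i1} - \cdots - \beta_k x_{ik} \right)^2
```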
Univariate
Given one independent variable and one dependent (outcome) variable, the OLS model is specified as:
It is estimated as:
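The equation images here did not survive extraction; in standard textbook notation, the specification and its estimate are presumably:

```latex
y_i = \beta_0 + \beta_1 x_i + u_i
\qquad \text{estimated as} \qquad
\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i
```

with the estimated slope and intercept given by:

```latex
\hat{\beta}_1 = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sum_{i}(x_i - \bar{x})^2},
\qquad
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
```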
This model describes (1) the mean observation and (2) the marginal changes to the outcome per unit changes in the independent variable.
The derivation can be seen on the Econometrics/OrdinaryLeastSquares/Univariate page.
Multivariate
Given k independent variables, the OLS model is specified as:
It is estimated as:
More conventionally, this is estimated with linear algebra as:
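The equation images here did not survive extraction; the standard multivariate specification and its matrix-form estimate are presumably:

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_k x_{ik} + u_i,
\qquad
\hat{\boldsymbol{\beta}} = (X^{\top} X)^{-1} X^{\top} \mathbf{y}
```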
The derivation can be seen here.
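As a sketch of the matrix computation, here is a minimal NumPy example; the data and coefficient values are purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2

# Design matrix: an intercept column plus k independent variables.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

# OLS estimate: solve (X'X) b = X'y, i.e. b = (X'X)^{-1} X'y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

Solving the normal equations directly (rather than forming the inverse) is the usual numerically preferred route; `np.linalg.lstsq` is another common choice.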
Estimated Coefficients
If these assumptions can be made:
- Linearity
- Exogeneity
- Random sampling
- No perfect multicollinearity
- Homoskedasticity
Then OLS is the best linear unbiased estimator (BLUE) for regression coefficients.
The variances for each coefficient are:
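The variance image here did not survive extraction; the textbook form it presumably showed is, for coefficient j:

```latex
\operatorname{Var}(\hat{\beta}_j) = \frac{\sigma^2}{\mathrm{SST}_j \,(1 - R_j^2)}
```

where SST_j is the total variation in x_j and R_j^2 comes from regressing x_j on the other independent variables.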
Note that the standard deviation of the population error term is unknown, so it is estimated from the residuals:
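The estimator image here did not survive extraction; the standard unbiased estimator it presumably showed is:

```latex
\hat{\sigma}^2 = \frac{\sum_{i=1}^{n} \hat{u}_i^2}{n - k - 1}
```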
If the homoskedasticity assumption does not hold, then each coefficient estimator decomposes into the true parameter plus a weighted sum of the errors.
Here, for example, r_i1 is the residual for observation i from regressing x_1 onto x_2, ..., x_k.
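The equation image here did not survive extraction; in Wooldridge's notation the decomposition it presumably showed is, for the first coefficient (analogous for each j):

```latex
\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n} \hat{r}_{i1}\, u_i}{\sum_{i=1}^{n} \hat{r}_{i1}^2}
```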
The variances for each coefficient can be estimated with the Eicker-White formula:
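The formula image here did not survive extraction; the standard heteroskedasticity-robust estimator it presumably showed is:

```latex
\widehat{\operatorname{Var}}(\hat{\beta}_j) = \frac{\sum_{i=1}^{n} \hat{r}_{ij}^2 \, \hat{u}_i^2}{\left( \sum_{i=1}^{n} \hat{r}_{ij}^2 \right)^2}
```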
See Nicolai Kuminoff's video lectures (youtube.com/@kuminoff) for the derivation of the robust estimators.
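The robust variance can also be computed in matrix form as the sandwich (X'X)^{-1} X' diag(u_hat^2) X (X'X)^{-1}, which is equivalent to the per-coefficient formula above. A minimal NumPy sketch with illustrative, deliberately heteroskedastic data:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# Error variance grows with x^2, violating homoskedasticity.
u = rng.normal(size=n) * (1.0 + x**2)
y = 1.0 + 2.0 * x + u

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Eicker-White (HC0) sandwich: (X'X)^{-1} X' diag(resid^2) X (X'X)^{-1}
meat = X.T @ (resid[:, None]**2 * X)
cov_robust = XtX_inv @ meat @ XtX_inv
robust_se = np.sqrt(np.diag(cov_robust))
```

In practice a library implementation is preferable, e.g. statsmodels' `OLS(...).fit(cov_type="HC0")`.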
