
Standard Errors

Standard errors are the standard deviations of estimated coefficients.


Description

In the classical OLS model, estimated coefficients are:

  • univariate case: \hat{\beta} = \frac{\sum_{i=1}^n(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^n(X_i-\bar{X})^2}

  • multivariate case: \mathbf{b} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y}

Standard errors are the standard deviations of these coefficients.
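
These formulas can be sanity-checked numerically. Below is a minimal sketch in Python with numpy; the simulated data, seed, and variable names are illustrative assumptions, not part of this page:

  import numpy as np

  rng = np.random.default_rng(0)
  n = 500
  x = rng.normal(size=n)
  y = 2.0 + 3.0 * x + rng.normal(size=n)  # true intercept 2, true slope 3

  # univariate case: slope from centered sums
  beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

  # multivariate case: b = (X^T X)^{-1} X^T Y, with an explicit intercept column
  X = np.column_stack([np.ones(n), x])
  b = np.linalg.solve(X.T @ X, X.T @ y)

  print(beta_hat, b[1])  # the two slope estimates agree

With one regressor plus an intercept, the matrix formula reproduces the univariate slope exactly.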


Classical

Univariate

In the univariate case, standard errors are classically specified as:

Var(\hat{\beta}|X_i) = \frac{\sum_{i=1}^n Var\bigl((X_i-\bar{X})\hat{\epsilon}_i\bigr)}{\bigl(\sum_{i=1}^n(X_i-\bar{X})^2\bigr)^2}

Supposing the errors are homoskedastic with known population variance Var(ε), this can be simplified.

Var(\hat{\beta}|X_i) = \frac{Var(\epsilon)\sum_{i=1}^n(X_i-\bar{X})^2}{\bigl(\sum_{i=1}^n(X_i-\bar{X})^2\bigr)^2} = \frac{Var(\epsilon)}{\sum_{i=1}^n(X_i-\bar{X})^2}

Lastly, rewrite the denominator in terms of Var(X).

Var(\hat{\beta}|X_i) = \frac{Var(\epsilon)}{n\bigl(\frac{1}{n}\sum_{i=1}^n(X_i-\bar{X})^2\bigr)} = \frac{Var(\epsilon)}{n Var(X)}

Var(ε) is unknown, so it is estimated from the residuals, with a degrees-of-freedom correction explained below:

\hat{\epsilon}_i = Y_i - \hat{Y}_i

Var(\hat{\epsilon}) = \frac{1}{n-2}\sum_{i=1}^n\hat{\epsilon}_i^2

1 degree of freedom is lost in assuming the errors have zero mean, which constrains the residuals to sum to zero, i.e.:

\sum_{i=1}^n\hat{\epsilon}_i = 0

k degrees of freedom are lost in assuming the errors are independent of the k independent variables, which constrains the residuals to be orthogonal to each of them; k is necessarily 1 in the univariate case, i.e.:

\sum_{i=1}^nX_i\hat{\epsilon}_i = 0
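
Both constraints hold mechanically for any fitted OLS model and can be verified numerically; a quick sketch (again with illustrative simulated data):

  import numpy as np

  rng = np.random.default_rng(1)
  n = 200
  x = rng.normal(size=n)
  y = 1.0 + 2.0 * x + rng.normal(size=n)

  X = np.column_stack([np.ones(n), x])
  b = np.linalg.solve(X.T @ X, X.T @ y)
  resid = y - X @ b

  print(np.sum(resid))      # ~0: residuals sum to zero
  print(np.sum(x * resid))  # ~0: residuals are orthogonal to X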

Combining these corrections arrives at the estimate:

\hat{Var}(\hat{\beta}|X_i) = \frac{Var(\hat{\epsilon})}{n Var(X)} = \frac{\frac{1}{n-2}\sum_{i=1}^n\hat{\epsilon}_i^2}{n Var(X)}
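
As a sketch under the same illustrative setup, this can be computed directly; note that n Var(X) below is just the centered sum of squares:

  import numpy as np

  rng = np.random.default_rng(2)
  n = 1000
  x = rng.normal(size=n)
  y = 0.5 + 1.5 * x + rng.normal(size=n)

  beta_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
  alpha_hat = y.mean() - beta_hat * x.mean()
  resid = y - (alpha_hat + beta_hat * x)

  # Var-hat(beta-hat | X) = (1/(n-2)) sum(resid^2) / sum((X - Xbar)^2)
  var_beta = np.sum(resid ** 2) / (n - 2) / np.sum((x - x.mean()) ** 2)
  print(np.sqrt(var_beta))  # the classical standard error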

Multivariate

The classical multivariate specification is expressed in terms of (b-β), as:

Var(\mathbf{b} | \mathbf{X}) = E\Bigl[(\mathbf{b}-\mathbf{\beta})(\mathbf{b}-\mathbf{\beta})^T \Big| \mathbf{X}\Bigr]

That term, b − β, is rewritten as (X^T X)^{-1} X^T u.

Var(\mathbf{b} | \mathbf{X}) = E\Bigl[\bigl((\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{u}\bigr)\bigl((\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{u}\bigr)^T \Big| \mathbf{X}\Bigr] = E\Bigl[(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{u}\mathbf{u}^T\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1} \Big| \mathbf{X}\Bigr]

Var(\mathbf{b} | \mathbf{X}) = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T E\bigl[\mathbf{u}\mathbf{u}^T\big|\mathbf{X}\bigr]\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}

E[uu^T|X] is not a practical matrix to work with. But if homoskedasticity and independence are assumed, i.e.:

E\bigl[\mathbf{u}\mathbf{u}^T\big|\mathbf{X}\bigr] = Var(\epsilon)\mathbf{I}_n

...then this simplifies to:

Var(\mathbf{b} | \mathbf{X}) = Var(\epsilon)(\mathbf{X}^T\mathbf{X})^{-1}

Var(ε) is unknown, so the estimate is:

\hat{Var}(\mathbf{b} | \mathbf{X}) = \frac{1}{n-k}\hat{\mathbf{u}}^T\hat{\mathbf{u}}\,(\mathbf{X}^T\mathbf{X})^{-1}
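
In matrix form, a minimal sketch (here k counts every column of X, including the intercept; the data are illustrative):

  import numpy as np

  rng = np.random.default_rng(3)
  n = 500
  X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 regressors
  y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=n)

  b = np.linalg.solve(X.T @ X, X.T @ y)
  resid = y - X @ b
  n, k = X.shape

  # Var-hat(b | X) = (u-hat^T u-hat / (n - k)) (X^T X)^{-1}
  cov_b = (resid @ resid) / (n - k) * np.linalg.inv(X.T @ X)
  print(np.sqrt(np.diag(cov_b)))  # classical standard errors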


Robust

In the presence of heteroskedastic errors, the above simplifications do not apply. In the univariate case, fall back to the original estimator.

This is mostly interesting in the multivariate case, where E[uu^T|X] is still not a practical matrix to work with. When incorrect, the assumptions of homoskedasticity and independence lead to...

  • OLS estimators are not BLUE
    • they are unbiased, but no longer most efficient in terms of MSE
  • nonlinear GLMs, such as logit, can be biased
  • even if the model's estimates are unbiased, statistics derived from those estimates (e.g., conditional probability distributions) can be biased

Eicker-Huber-White heteroskedasticity-consistent errors (HCE) assume that errors are still independent but allowed to vary, i.e. Σ = diag(ε_1^2, ..., ε_n^2), estimated in practice from the squared residuals. Importantly, this is not a function of X, so the standard errors can be estimated as:

\hat{Var}(\mathbf{b} | \mathbf{X}) = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{\Sigma}\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}
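
A sketch of this sandwich with Σ built from the squared residuals (the HC0 variant); the heteroskedastic data-generating process here is an illustrative assumption:

  import numpy as np

  rng = np.random.default_rng(4)
  n = 500
  x = rng.normal(size=n)
  X = np.column_stack([np.ones(n), x])
  y = 1.0 + 2.0 * x + rng.normal(size=n) * (0.5 + np.abs(x))  # error variance grows with |x|

  b = np.linalg.solve(X.T @ X, X.T @ y)
  resid = y - X @ b

  # (X^T X)^{-1} X^T Sigma X (X^T X)^{-1}, with Sigma = diag(resid_i^2)
  bread = np.linalg.inv(X.T @ X)
  meat = X.T @ (X * (resid ** 2)[:, None])
  cov_hc0 = bread @ meat @ bread
  print(np.sqrt(np.diag(cov_hc0)))  # robust (HC0) standard errors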

Note however that heteroskedasticity-consistent errors are not always appropriate. To reiterate: for OLS, classical estimators are not biased even given heteroskedasticity, so if the model changes with the introduction of robust standard errors, there must be a specification error. Furthermore, heteroskedasticity-consistent errors are only asymptotically unbiased; they can be biased for small n.
