Standard Errors
Standard errors are the standard deviations of estimated coefficients.
Description
In the classical OLS model, estimated coefficients are:
univariate case: \hat{\beta} = \frac{Cov(X,Y)}{Var(X)}
multivariate case: \mathbf{b} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{y}
Standard errors are the standard deviations of these coefficients.
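As a quick illustration, here is a minimal numpy sketch that computes the slope both ways on simulated data; the data-generating values, seed, and variable names are illustrative, not part of the model above.

import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y = 2 + 3x + noise (values are illustrative)
n = 500
x = rng.normal(size=n)
y = 2 + 3 * x + rng.normal(size=n)

# Univariate case: beta-hat = Cov(X, Y) / Var(X)
beta_uni = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Multivariate case: b = (X'X)^{-1} X'y, with an intercept column
X = np.column_stack([np.ones(n), x])
b = np.linalg.solve(X.T @ X, X.T @ y)

print(beta_uni, b[1])  # the two slope estimates agree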
Classical
Univariate
In the univariate case, standard errors are classically specified as:
Var(\hat{\beta}|X_i) = \frac{\sum_{i=1}^n Var((X_i-\bar{X})\hat{\epsilon}_i)}{(\sum_{i=1}^n(X_i-\bar{X})^2)^2}
Supposing the population Var(ε) is known and errors are homoskedastic, i.e. they are constant across all cases, this can be simplified.
Var(\hat{\beta}|X_i) = \frac{Var(\epsilon)\sum_{i=1}^n(X_i-\bar{X})^2}{(\sum_{i=1}^n(X_i-\bar{X})^2)^2} = \frac{Var(\epsilon)}{\sum_{i=1}^n(X_i-\bar{X})^2}
Lastly, rewrite the denominator in terms of Var(X).
Var(\hat{\beta}|X_i) = \frac{Var(\epsilon)}{n(\frac{1}{n}\sum_{i=1}^n(X_i-\bar{X})^2)} = \frac{Var(\epsilon)}{n Var(X)}
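A hedged Monte Carlo check of this simplification: with a fixed design and a known error variance, the empirical variance of β̂ across simulated samples should approach Var(ε)/(n Var(X)). All values below are illustrative.

import numpy as np

rng = np.random.default_rng(0)
n, reps = 200, 5000
x = rng.normal(size=n)          # fixed design across replications
var_eps = 4.0                   # known population Var(eps)

# Empirical variance of beta-hat across simulated samples
betas = np.empty(reps)
for r in range(reps):
    y = 2 + 3 * x + rng.normal(scale=np.sqrt(var_eps), size=n)
    betas[r] = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

theory = var_eps / (n * np.mean((x - x.mean()) ** 2))
print(betas.var(), theory)      # the two should be close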
Var(ε) is unknown, so this term is estimated as:
\hat{\epsilon}_i = Y_i - \hat{Y}_i
Var(\hat{\epsilon}) = \frac{1}{n-2}\sum_{i=1}^n\hat{\epsilon}_i^2
1 degree of freedom is lost in assuming homoskedasticity of errors, i.e. \sum_{i=1}^n\hat{\epsilon}_i = 0; and k degrees of freedom are lost in assuming independence of errors and the k independent variables, which is necessarily 1 in the univariate case, i.e. \sum_{i=1}^nX_i\hat{\epsilon}_i = 0. Together these give the n-2 denominator.
This arrives at the estimate:
\hat{Var}(\hat{\beta}|X_i) = \frac{Var(\hat{\epsilon})}{n Var(X)}
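Putting the pieces together, a minimal numpy sketch of the classical univariate standard error; the simulated data are illustrative.

import numpy as np

rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
y = 2 + 3 * x + rng.normal(size=n)

beta = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
alpha = y.mean() - beta * x.mean()
resid = y - (alpha + beta * x)

# Var(eps-hat) with n-2 degrees of freedom (1 for homoskedasticity, k=1 regressor)
var_eps = (resid @ resid) / (n - 2)

# Var-hat(beta-hat | X) = Var(eps-hat) / (n * Var(X)), where Var(X) here is
# the population form (1/n) * sum (X_i - X-bar)^2
var_x = np.mean((x - x.mean()) ** 2)
se_beta = np.sqrt(var_eps / (n * var_x))
print(se_beta)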
Multivariate
The classical multivariate specification is expressed in terms of (b-β), as:
Var(\mathbf{b}|\mathbf{X}) = E\Bigl[(\mathbf{b}-\mathbf{\beta})(\mathbf{b}-\mathbf{\beta})^T \Big| \mathbf{X}\Bigr]
That term is rewritten as (X^T X)^{-1} X^T ε, giving:
Var(\mathbf{b}|\mathbf{X}) = E\Bigl[\bigl((\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{\epsilon}\bigr)\bigl((\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{\epsilon}\bigr)^{T} \Big| \mathbf{X}\Bigr] = E\Bigl[(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{\epsilon}\mathbf{\epsilon}^T\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1} \Big| \mathbf{X}\Bigr]
Var(\mathbf{b}|\mathbf{X}) = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T E\bigl[\mathbf{\epsilon}\mathbf{\epsilon}^T\big|\mathbf{X}\bigr]\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}
E[εε^T|X] is not a practical matrix to work with, even if known. But if homoskedasticity and independence are assumed, i.e.:
E\bigl[\mathbf{\epsilon}\mathbf{\epsilon}^T\big|\mathbf{X}\bigr] = s^2\mathbf{I}_n
then this simplifies to:
Var(\mathbf{b}|\mathbf{X}) = s^2(\mathbf{X}^T\mathbf{X})^{-1}
s^2 is unknown, so this term is estimated as:
\hat{s}^2 = \frac{1}{n-k}\hat{\mathbf{\epsilon}}^T\hat{\mathbf{\epsilon}}
This arrives at the estimate:
\hat{Var}(\mathbf{b}|\mathbf{X}) = \hat{s}^2(\mathbf{X}^T\mathbf{X})^{-1}
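The same estimate in matrix form, as a numpy sketch; the design matrix and coefficient values below are invented for illustration.

import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 3            # k counts the intercept and two regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 2.0, -0.5])
y = X @ beta + rng.normal(size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
resid = y - X @ b

# s^2 estimated with n - k degrees of freedom
s2 = (resid @ resid) / (n - k)

# Var-hat(b | X) = s^2 (X'X)^{-1}; standard errors are the root diagonal
cov_b = s2 * XtX_inv
se_b = np.sqrt(np.diag(cov_b))
print(se_b)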
Robust
In the presence of heteroskedastic errors, the above simplifications do not apply. In the univariate case, use the original estimator.
This is mostly interesting in the multivariate case, where E[εε^T|X] is still not practical. The assumptions made above, when incorrect, lead to...
- OLS estimators are not BLUE
- they are unbiased, but no longer most efficient in terms of MSE
- nonlinear GLMs, such as logit, can be biased
- even if the model's estimates are unbiased, statistics derived from those estimates (e.g., conditional probability distributions) can be biased
Eicker-Huber-White heteroskedasticity consistent errors (HCE) assume that errors are still independent but allowed to vary, i.e. Σ = diag(ε_1^2, ..., ε_n^2). Importantly, this is not a function of X, so the standard errors can be estimated as:
\hat{Var}(\mathbf{b}|\mathbf{X}) = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\hat{\mathbf{\Sigma}}\mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}, \quad \hat{\mathbf{\Sigma}} = diag(\hat{\epsilon}_1^2, \ldots, \hat{\epsilon}_n^2)
Note however that heteroskedasticity consistent errors are not always appropriate. To reiterate, for OLS, classical estimators are not biased even given heteroskedasticity; if the model's conclusions change with the introduction of robust standard errors, there must be a specification error. (For example, an omitted variable leads to heteroskedastic-appearing errors. The variance that could have been explained devolves to the error term, and usually is not constant.) Furthermore, heteroskedasticity consistent errors are asymptotically unbiased; they can be biased for small n.
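A sketch of the sandwich estimator above under the stated assumptions, compared against the classical estimate on deliberately heteroskedastic simulated data; all names and values are illustrative. Libraries such as statsmodels expose this estimator as well, but the computation below is self-contained.

import numpy as np

rng = np.random.default_rng(0)
n, k = 500, 2
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
# Heteroskedastic errors: variance grows with |x| (illustrative)
y = 1 + 2 * x + rng.normal(size=n) * (0.5 + np.abs(x))

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y
resid = y - X @ b

# Classical estimate assumes a constant error variance
s2 = (resid @ resid) / (n - k)
se_classical = np.sqrt(np.diag(s2 * XtX_inv))

# Sandwich: plug diag(eps-hat_i^2) in for E[ee'|X]
meat = X.T @ (X * resid[:, None] ** 2)
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print(se_classical, se_robust)  # robust SEs differ under heteroskedasticity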