= Standard Errors =

'''Standard errors''' are the standard deviations of estimated coefficients.

----

== Description ==

The standard error of an estimate is the square root of the variance of its sampling distribution. For a sample mean, this is the standard deviation divided by the square root of the sample size.

One common use of standard errors is to estimate [[Statistics/MarginsOfError|margins of error]]. For a [[Statistics/BernoulliDistribution|Bernoulli-distributed]] variable, the variance is ''p(1-p)'' and is maximized at ''p=0.5''. Therefore a conservative standard error, taking ''p=0.5'', is a function of only the sample size.

Standard errors are also used in interpreting the estimated coefficients of a regression model. As a reminder, by classical [[Statistics/OrdinaryLeastSquares|OLS]], estimated coefficients are:

 * univariate case: {{attachment:coef1.svg}}
 * multivariate case: {{attachment:coef2.svg}}

But specific regression methods require assumptions about variance. Standard errors in this context are much more complicated.

----

== Classical ==

=== Univariate ===

In the univariate case, standard errors are classically specified as:

{{attachment:unispec1.svg}}

Supposing the population ''Var(ε)'' is known and errors are homoskedastic, i.e. they are constant across all cases, this can be simplified.

{{attachment:unispec2.svg}}

Lastly, rewrite the denominator in terms of ''Var(X)''.

{{attachment:unispec3.svg}}

''Var(ε)'' is unknown in practice, so this term is estimated as:

{{attachment:uniest1.svg}}, {{attachment:uniest2.svg}}

One degree of freedom is lost in assuming homoskedasticity of errors, i.e. {{attachment:homosked.svg}}; and ''k'' degrees of freedom are lost in assuming independence of errors and the ''k'' independent variables, which is necessarily 1 in the univariate case, i.e.:

{{attachment:ind.svg}}

This arrives at the estimator:

{{attachment:uniest3.svg}}

=== Multivariate ===

The classical multivariate specification is expressed in terms of ''('''b'''-β)'', as:

{{attachment:multspec1.svg}}

That term is rewritten as ''('''X'''^T^'''X''')^-1^'''X'''^T^'''ε'''''.

{{attachment:multspec2.svg}}

{{attachment:multspec3.svg}}

''E['''εε'''^T^|'''X''']'' is not a practical matrix to work with, even if known. But if homoskedasticity and independence are assumed, i.e.:

{{attachment:homosked_ind.svg}}

then this simplifies to:

{{attachment:multspec4.svg}}

''s^2^'' is unknown, so this term is estimated as:

{{attachment:multspec5.svg}}

This arrives at the estimator:

{{attachment:multspec6.svg}}

----

== Robust ==

In the presence of heteroskedasticity of errors, the above simplifications cannot be applied. In the univariate case, use the original estimator. This is mostly interesting in the multivariate case, where ''E['''εε'''^T^|'''X''']'' is still not practical.

When the homoskedasticity and independence assumptions are incorrect:

 * OLS estimators are not BLUE
 * they remain unbiased, but are no longer the most efficient in terms of MSE
 * nonlinear GLMs, such as logit, can be biased
 * even if the model's estimates are unbiased, statistics derived from those estimates (e.g., conditional probability distributions) can be biased

'''Eicker-Huber-White heteroskedasticity consistent errors''' ('''HCE''') assume that errors are still independent but allowed to vary, i.e. '''''Σ''' = diag(ε,,1,,^2^, ..., ε,,n,,^2^)''. Importantly, this is not a function of '''''X''''', so the standard errors can be estimated as:

{{attachment:robust.svg}}

Robust errors are only appropriate with large sample sizes.
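As a minimal sketch (not part of the original page), the following Python/NumPy snippet computes the classical and HC0-style robust standard errors directly from the formulas above. The simulated data, sample sizes, and variable names are illustrative assumptions only.

{{{#!python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data with an intercept and two predictors (assumption for demonstration)
n, k = 500, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
beta = np.array([1.0, 2.0, -0.5])
# Heteroskedastic errors: the error variance grows with the first predictor
y = X @ beta + rng.normal(scale=1 + np.abs(X[:, 1]), size=n)

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y              # OLS coefficients
e = y - X @ b                      # residuals

# Classical: s^2 (X'X)^-1, with n - k degrees of freedom
s2 = e @ e / (n - k)
se_classical = np.sqrt(np.diag(s2 * XtX_inv))

# Eicker-Huber-White (HC0): (X'X)^-1 X' diag(e^2) X (X'X)^-1
meat = X.T @ (X * e[:, None] ** 2)
se_robust = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print("classical:", se_classical)
print("robust:   ", se_robust)
}}}

With heteroskedastic errors such as these, the robust standard errors on the affected coefficient will generally differ noticeably from the classical ones, while the coefficient estimates themselves are unchanged.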
[[SamplingWeightsAndRegressionAnalysis|When fitting a model using data with survey weights, if those weights are a function of predictors including the dependent variable, then heteroskedasticity consistent errors should be used.]]

[[HowRobustStandardErrorsExposeMethodologicalProblemsTheyDoNotFix|If a model significantly diverges after introducing robust errors, there is likely a specification error.]]

----

== Clustered ==

'''Liang-Zeger clustered robust standard errors''' assume that errors covary within clusters.

{{attachment:cluster1.svg}}

where '''''x''',,g,,'' is an ''n,,g,,'' by ''k'' matrix constructed by stacking '''''x''',,i,,'' for all ''i'' belonging to cluster ''g''; and '''''ε''',,g,,'' is an ''n,,g,,''-long vector holding the errors for cluster ''g''. The estimator becomes:

{{attachment:cluster2.svg}}

Clustered standard errors should only be used if the sample design or experimental design calls for it.

 * A complex survey sample design leads to differential sampling errors across strata.
 * A two-stage sample design leads to differential sampling errors for the SSUs within each PSU.
 * Assignment of an experimental treatment at a grouped level often leads to differential errors across those groups.
 * For time series evaluation of an experimental treatment that is assigned at the individual level, it is generally recommended to cluster at the individual level.

There are parallels between [[Statistics/FixedEffectsModel|fixed effects]] and clusters, but use of one neither mandates nor conflicts with the other.

----

== Finite Population Correction ==

Most formulations of standard errors assume the population is unknown and/or infinite. If the population is finite and the sampling rate is high (above 5%), the standard error is too conservative. The '''finite population correction''' ('''FPC''') is an adjustment to correct this:

{{attachment:fpc.svg}}

Intuitively, the FPC is 0 when ''n = N'' because there is no sampling error in a census. The FPC approaches 1 as ''n'' approaches 0, demonstrating that the correction is negligible at low sampling rates.

----

CategoryRicottone