= Survey Weights =

'''Survey weights''' account for the [[Statistics/SurveySampling|design of a survey sample]] and [[Statistics/SurveyInference#Non-sampling_Error|non-sampling error]].

<>

----

== Description ==

The design weight, or base weight, reflects unequal [[Statistics/SurveySampling|probabilities of selection]]. Generally this is simply the inverse of the sampling probability: ''N,,k,,/n,,k,,'' for each stratum ''k'', where ''n,,k,,'' cases are sampled from a stratum of ''N,,k,,'' population units.

=== Non-Response Adjustments ===

All real surveys feature [[Statistics/SurveyInference#Non-sampling_Error|non-sampling error]], especially non-response. If non-response is uncorrelated with the key metrics, it introduces no bias. But there is almost always some observable [[Statistics/NonResponseBias|non-response bias]], i.e. an attribute that is known for the entire population and is correlated with both a key metric and responsivity. This bias can be corrected with a '''non-response adjustment''' to the survey weights. It is also reasonable to expect that there is ''unobserved'' bias, i.e. an attribute that is not known.

A non-response adjustment factor generally moves weight from non-respondents to comparable respondents. If there are no significant attributes that can be used to establish comparability, then the adjustment is a flat multiplier: the total weight of all cases over the total weight of respondents. (Non-respondents have their weight set to 0.)

If there are significant attributes, responsivity can be modeled. There are generally two approaches:

 * '''weighting class adjustment''': The population (or stratum subpopulation) is partitioned into N-tiles according to the predicted responsivity. Each N-tile then receives a separate flat multiplier as described above.
 * '''propensity score adjustment''': Every respondent's weight is multiplied by the inverse of the predicted responsivity, while non-respondents have their weight set to 0. General practice is then to re-normalize the weights such that they sum to the same total as before applying the adjustment.

Modeling on insignificant or uncorrelated attributes does not introduce bias, but it does inflate [[Statistics/Variance|variance]].

=== Post-Stratification ===

Post-stratification is employed in survey weighting for several reasons:

 * There may be measurable [[Statistics/SurveyInference#Sampling_Error|sampling errors]], such as undercoverage, which can be corrected.
 * Incorporating auxiliary information, i.e. the known distribution of the population, into survey estimates should increase accuracy.
 * Post-stratified estimates are consistent across surveys. Estimates will match on e.g. the proportion of women in the population if all surveys are post-stratified to the same targets.

There are two approaches to this post-stratification: [[TheCalibrationApproachInSurveyTheoryAndPractice|GREG estimation and calibration estimation]]. Calibration is known under a variety of other names: '''raking''', '''iterative proportional fitting''', and '''RIM weighting'''.

----

== Usage ==

=== Weighted Estimators ===

Survey weights ''w'' are designed such that a population mean or proportion ''μ'' can be estimated with the weighted estimator ''Σ(wx) / Σw''.

If all cases have equal weight, the estimator reduces to the sample mean, and [[Statistics/Moments#Description|it is straightforward to show]] that its variance is ''σ^2^/n''. In any other case, treating the weights as fixed, the variance is ''Σ(w^2^σ^2^) / (Σw)^2^''. Because the estimator is a ratio of two random totals, it must then be linearized or simulated to arrive at an approximate variance. [[Calculus/TaylorSeries|Taylor expansion]] is a common strategy for linearization.
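
The weighting steps described above can be sketched in Python. This is a minimal illustration rather than a production implementation: it assumes a hypothetical pandas DataFrame `sample` with columns `stratum`, `responded`, and `class` (a weighting class such as a responsivity N-tile), a dict of known stratum population sizes, a series of predicted responsivities, and known population margins for raking. All of these names are illustrative, not taken from any real dataset or library API.

{{{#!python
import pandas as pd

def design_weights(sample, stratum_sizes):
    """Base weight: inverse of the sampling probability, N_k / n_k per stratum."""
    n_k = sample.groupby('stratum')['stratum'].transform('size')
    N_k = sample['stratum'].map(stratum_sizes)
    return N_k / n_k

def weighting_class_adjustment(sample, weights):
    """Move weight from non-respondents to respondents within each weighting class."""
    adjusted = weights.copy()
    for _, idx in sample.groupby('class').groups.items():
        total = weights.loc[idx].sum()
        respondent_total = weights.loc[idx][sample.loc[idx, 'responded']].sum()
        adjusted.loc[idx] = weights.loc[idx] * (total / respondent_total)
    adjusted[~sample['responded']] = 0.0   # non-respondents get weight 0
    return adjusted

def propensity_adjustment(sample, weights, propensity):
    """Divide respondent weights by predicted responsivity, then re-normalize."""
    adjusted = (weights / propensity).where(sample['responded'], 0.0)
    return adjusted * weights.sum() / adjusted.sum()

def rake(sample, weights, margins, iterations=25):
    """Calibrate to known population margins by iterative proportional fitting."""
    w = weights.copy()
    for _ in range(iterations):
        for var, targets in margins.items():
            current = w.groupby(sample[var]).sum()
            w = w * sample[var].map(targets) / sample[var].map(current)
    return w
}}}

Under these assumptions, a full set of weights would be built by chaining the steps, e.g. `rake(sample, weighting_class_adjustment(sample, design_weights(sample, stratum_sizes)), margins)`.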
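
Given final weights, the weighted estimator and a Taylor-linearized approximation of its variance might look like the following sketch. It assumes independent observations, treats the weights as fixed, and uses a with-replacement approximation (no finite population correction); the data are made up.

{{{#!python
import numpy as np

def weighted_mean(y, w):
    """Ratio estimator: sum(w*y) / sum(w)."""
    return np.sum(w * y) / np.sum(w)

def linearized_variance(y, w):
    """Approximate variance of the ratio estimator via Taylor linearization.

    The linearized variable for case i is z_i = w_i * (y_i - ybar_w) / sum(w);
    its with-replacement variance estimate is n/(n-1) * sum(z_i^2).
    """
    n = len(y)
    z = w * (y - weighted_mean(y, w)) / np.sum(w)
    return n / (n - 1) * np.sum(z ** 2)

# Illustrative data: a binary metric and unequal weights
y = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
w = np.array([2.0, 3.0, 1.5, 2.5, 3.0])
print(weighted_mean(y, w))        # weighted proportion
print(linearized_variance(y, w))  # approximate sampling variance
}}}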

----

CategoryRicottone