Differences between revisions 10 and 22 (spanning 12 versions)

Survey Weights

Survey weights account for the design of a survey sample and non-sampling error.

Contents

Survey Weights
1. Description
  1. Non-Response Adjustments
  2. Post-Stratification
2. Usage
  1. Weighted Estimators

Description

The design weight, or base weight, reflects unequal probabilities of selection. Generally this is simply the inverse of the sampling probability: N_k/n_k for each stratum k, where n_k cases are sampled from the N_k population members of that stratum.
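
As a minimal sketch of the base weight calculation, assuming the sampling probability in stratum k is n_k/N_k so that the weight is N_k/n_k. The function name, stratum labels, and counts below are hypothetical and chosen only for illustration.

```python
def base_weights(population_sizes, sample_sizes):
    """Return the design weight N_k / n_k for each stratum k."""
    return {k: population_sizes[k] / sample_sizes[k] for k in sample_sizes}


# Hypothetical two-stratum design (illustrative numbers only).
population_sizes = {"urban": 12_000, "rural": 8_000}
sample_sizes = {"urban": 400, "rural": 400}

weights = base_weights(population_sizes, sample_sizes)

# Summing the base weights over all sampled cases recovers the population size.
total = sum(weights[k] * sample_sizes[k] for k in sample_sizes)
```

Under this design each sampled urban case stands in for 30 population members and each rural case for 20.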

Non-Response Adjustments

All real surveys feature non-sampling error, especially non-response. If non-response is uncorrelated with key metrics, the bias it introduces is negligible. There almost always is some observable non-response bias, i.e. bias tied to an attribute that is known for the entire population and is correlated with both a key metric and responsivity. This bias can be corrected with a non-response adjustment to the survey weights.

It is also reasonable to expect that there is unobserved bias, i.e. bias tied to an attribute that is not known for the entire population and so cannot be corrected this way.

A non-response adjustment factor generally moves weight from non-respondents to comparable respondents. If there are no significant attributes that can be used to establish comparability, then the adjustment is a flat multiplier: the total of cases over the count of respondents. (Non-respondents have their weight set to 0.)
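
The flat multiplier can be sketched as follows. This assumes equal base weights within the group being adjusted, so that the count-based ratio described above preserves the total weight. The function name and the data are hypothetical.

```python
def flat_adjustment(weights, responded):
    """Move weight from non-respondents to respondents with one multiplier."""
    n_cases = len(weights)
    n_respondents = sum(responded)
    if n_respondents == 0:
        raise ValueError("no respondents to carry the weight")
    factor = n_cases / n_respondents
    # Respondents are scaled up; non-respondents have their weight set to 0.
    return [w * factor if r else 0.0 for w, r in zip(weights, responded)]


# Hypothetical sample: 10 cases with a base weight of 20, 8 of whom responded.
weights = [20.0] * 10
responded = [True] * 8 + [False] * 2
adjusted = flat_adjustment(weights, responded)
```

Here the multiplier is 10/8 = 1.25, and the adjusted weights still sum to the original total of 200.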

If there are significant attributes, responsivity can be modeled. There are generally two approaches:

weighting class adjustment: The population (or stratum subpopulation) is partitioned into N-tiles according to the predicted responsivity. Each N-tile then receives a separate flat multiplier as described above.
propensity score adjustment: Every respondent's weight is multiplied by the inverse of the predicted responsivity, while non-respondents have their weight set to 0. General practice is then to re-normalize the weights such that they sum to the same total as before applying the adjustment.
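
Both approaches can be sketched in a few lines. This is an illustration under stated assumptions, not a reference implementation: the predicted responsivities are taken as given (in practice they would come from a response model), every value lies in (0, 1], classes are cut at equal counts, and all function names and data are hypothetical.

```python
def weighting_class_adjust(weights, responded, propensity, n_classes):
    """Partition cases into N-tiles of predicted responsivity, then apply
    a separate flat multiplier (cases / respondents) within each tile."""
    order = sorted(range(len(weights)), key=lambda i: propensity[i])
    adjusted = [0.0] * len(weights)
    for c in range(n_classes):
        start = c * len(order) // n_classes
        stop = (c + 1) * len(order) // n_classes
        members = order[start:stop]
        n_respondents = sum(responded[i] for i in members)
        if n_respondents == 0:
            raise ValueError("a weighting class has no respondents")
        factor = len(members) / n_respondents
        for i in members:
            adjusted[i] = weights[i] * factor if responded[i] else 0.0
    return adjusted


def propensity_adjust(weights, responded, propensity):
    """Multiply respondents' weights by 1 / predicted responsivity, zero out
    non-respondents, then re-normalize to the pre-adjustment total."""
    raw = [w / p if r else 0.0
           for w, r, p in zip(weights, responded, propensity)]
    scale = sum(weights) / sum(raw)
    return [x * scale for x in raw]


# Hypothetical data: 8 cases, two levels of predicted responsivity.
weights = [10.0] * 8
responded = [True, False, True, False, True, True, True, False]
propensity = [0.5] * 4 + [0.75] * 4

by_class = weighting_class_adjust(weights, responded, propensity, n_classes=2)
by_score = propensity_adjust(weights, responded, propensity)
```

In this toy example the observed response rate in each class equals the predicted responsivity, so the two approaches give the same weights. With real model output they generally differ.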

Modeling on insignificant or uncorrelated attributes does not introduce bias, but it does inflate variance.

Post-Stratification

Post-stratification is employed in survey weighting for several reasons:

There may be measurable sampling errors, such as undercoverage, which can be corrected.
Incorporating auxiliary information, i.e. the known distribution of the population, into survey estimates should increase accuracy.
Post-stratified estimates are consistent. Estimates across surveys will match on e.g. the proportion of women in the population if they are all post-stratified according to the same targets.

There are two approaches to this post-stratification: GREG estimation and calibration estimation. Calibration is known under a variety of other names: raking, iterative proportional fitting, and RIM weighting.
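
Raking can be sketched as repeated post-stratification over each dimension in turn. The sketch below assumes the targets are consistent population totals, every category contains at least one case, and convergence is declared when no adjustment factor differs from 1 by more than a tolerance. That stopping rule, the function name, and the data are assumptions made for illustration.

```python
def rake(weights, categories, targets, tol=1e-6, max_iter=100):
    """Iterative proportional fitting.

    categories: dimension name -> list with one category label per case
    targets:    dimension name -> {category label: target weight total}
    """
    w = list(weights)
    for _ in range(max_iter):
        max_change = 0.0
        for dim, labels in categories.items():
            # Current weighted total of each category in this dimension.
            totals = {}
            for wi, label in zip(w, labels):
                totals[label] = totals.get(label, 0.0) + wi
            # Scale each case so its category hits the target total.
            for i, label in enumerate(labels):
                factor = targets[dim][label] / totals[label]
                max_change = max(max_change, abs(factor - 1.0))
                w[i] *= factor
        if max_change < tol:
            break
    return w


# Hypothetical sample of four cases raked on two dimensions.
categories = {
    "sex": ["F", "F", "M", "M"],
    "age": ["young", "old", "young", "old"],
}
targets = {
    "sex": {"F": 50.0, "M": 50.0},
    "age": {"young": 60.0, "old": 40.0},
}
raked = rake([1.0, 1.0, 1.0, 1.0], categories, targets)
```

After raking, the weighted totals match both sets of targets at once, which no single pass of post-stratification on one dimension would guarantee.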

Usage

Weighted Estimators

Survey weights w are designed such that a population proportion μ can be calculated using the weighted estimator Σ(wx) / Σw.
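
A minimal sketch of the weighted estimator, with x coded as 1 or 0 for a proportion. The function name and the data are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted estimator: sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical indicator values and final weights for four cases.
values = [1, 0, 1, 1]
weights = [30.0, 30.0, 20.0, 20.0]
estimate = weighted_mean(values, weights)
```

The unweighted proportion here would be 0.75, while the weighted estimate is 0.7 because the single 0 response carries one of the larger weights.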

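In the case that all cases have equal weight, the estimator reduces to the unweighted sample mean. Since the variance of each term wx is w²σ², it is straightforward to show that the variance of that estimator is σ²/n for n independent cases.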

In any other case, the variance is given by Σ(w²σ²) / (Σw)². This ratio must then be linearized or simulated to arrive at an approximate variance. Taylor expansion is a common strategy for linearization.
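
One common form of Taylor linearization for this ratio can be sketched as below. It assumes a single-stage design treated as sampled with replacement, with no strata, clusters, or finite population correction. These simplifications and the function name are assumptions for illustration, and real designs usually call for survey software.

```python
def linearized_variance(values, weights):
    """Approximate variance of sum(w * x) / sum(w) by Taylor linearization."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two cases")
    total_weight = sum(weights)
    estimate = sum(w * x for w, x in zip(weights, values)) / total_weight
    # Linearized variable: each case's first-order contribution to the ratio.
    z = [w * (x - estimate) / total_weight for w, x in zip(weights, values)]
    # The z values sum to zero, so their sum of squares (with the usual
    # n / (n - 1) correction) estimates the variance of the ratio.
    return n / (n - 1) * sum(zi ** 2 for zi in z)


# With equal weights this reduces to the familiar s^2 / n.
equal_weight_variance = linearized_variance([1, 2, 3, 4], [5.0, 5.0, 5.0, 5.0])
```

For the equal-weight check, the sample variance of 1, 2, 3, 4 is 5/3, so the result is (5/3)/4 = 5/12.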


-  ⇤ ← Revision 10 as of 2020-11-05 18:33:02 → 
  Size: 3721
  Editor: DominicRicottone
  Comment:
+   ← Revision 22 as of 2025-08-10 00:55:28 → ⇥
  Size: 3671
  Editor: DominicRicottone
  Comment: Content
 Line 3:
-Survey weights account for the design of a survey sample and other biases/errors introduced by a survey instrument.
+'''Survey weights''' account for the [[Statistics/SurveySampling|design of a survey sample]] and [[Statistics/SurveyInference#Non-sampling_Error|non-sampling error]].
 Line 11:
-== The Basic Process ==
+== Description ==
 Line 13:
- 1. Set survey dispositions
 2. Calculate base weights
 3. Apply non-response adjustments to base weights
 4. Calibrate the weights
+The design weight, or base weight, reflects unequal [[Statistics/SurveySampling|probabilities of selection]]. Generally this is simply the inverse of the sampling probability: ''N,,k,,/n,,k,,'' for each stratum ''k'', where ''n,,k,,'' cases are sampled from the ''N,,k,,'' population members of that stratum.
-Line 18:
+Line 15:
-See [[SurveyDisposition|here]] for details about survey dispositions.
+=== Non-Response Adjustments ===

All real surveys feature [[Statistics/SurveyInference#Non-sampling_Error|non-sampling error]], especially non-response. If non-response is uncorrelated with key metrics, the bias it introduces is negligible. There almost always is some observable [[Statistics/NonResponseBias|non-response bias]], i.e. bias tied to an attribute that is known for the entire population and is correlated with both a key metric and responsivity. This bias can be corrected with a '''non-response adjustment''' to the survey weights.

It is also reasonable to expect that there is ''unobserved'' bias, i.e. bias tied to an attribute that is not known for the entire population and so cannot be corrected this way.

A non-response adjustment factor generally moves weight from non-respondents to comparable respondents. If there are no significant attributes that can be used to establish comparability, then the adjustment is a flat multiplier: the total of cases over the count of respondents. (Non-respondents have their weight set to 0.)

If there are significant attributes, responsivity can be modeled. There are generally two approaches:
 * '''weighting class adjustment''': The population (or stratum subpopulation) is partitioned into N-tiles according to the predicted responsivity. Each N-tile then receives a separate flat multiplier as described above.
 * '''propensity score adjustment''': Every respondent's weight is multiplied by the inverse of the predicted responsivity, while non-respondents have their weight set to 0. General practice is then to re-normalize the weights such that they sum to the same total as before applying the adjustment.

Modeling on insignificant or uncorrelated attributes does not introduce bias, but it does inflate variance.



=== Post-Stratification ===

Post-stratification is employed in survey weighting for several reasons:
 * There may be measurable [[Statistics/SurveyInference#Sampling_Error|sampling errors]], such as undercoverage, which can be corrected.
 * Incorporating auxiliary information, i.e. the known distribution of the population, into survey estimates should increase accuracy.
 * Post-stratified estimates are consistent. Estimates across surveys will match on e.g. the proportion of women in the population if they are all post-stratified according to the same targets.

There are two approaches to this post-stratification: [[TheCalibrationApproachInSurveyTheoryAndPractice|GREG estimation and calibration estimation]]. Calibration is known under a variety of other names: '''raking''', '''iterative proportional fitting''', and '''RIM weighting'''.
-Line 24:
+Line 46:
-== Calculating Weights ==

The base weight is the inverse of the probability of being sampled. ''Think '''desired over actual'''.''

 * For a '''census''', all respondents have a weight of 1.
 * For an '''SRS design''', this is calculated as a simple rate. Given a population of 20,000 and a sample size of 667, the probability of being sampled is 667/20,000, so the base weight is 20,000/667 = '''29.99'''.
 * For an '''STSRS design''', the same process is applied per stratum.

Note that, in each, the sum of base weights should equal the population size.

----
+== Usage ==
-Line 38:
+Line 50:
-== Non-Response Adjustments ==
+=== Weighted Estimators ===
-Line 40:
+Line 52:
-For a number of reasons, it is typically necessary to take non-response into account while weighting data.
+Survey weights ''w'' are designed such that a population proportion ''μ'' can be calculated using the weighted estimator ''Σ(wx) / Σw''.
-Line 42:
+Line 54:
-There are two main methods for adjusting weights:
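+In the case that all cases have equal weight, the estimator reduces to the unweighted sample mean. Since the variance of each term ''wx'' is ''w^2^σ^2^'', [[Statistics/Moments#Description|it is straightforward to show]] that the variance of that estimator is ''σ^2^/n'' for ''n'' independent cases.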
-Line 44:
+Line 56:
- 1. '''Weighting class adjustments''' involve dividing the sample into discrete classes and applying an adjustment factor by class.
 2. '''Propensity score adjustments''' involve calculating the inverse of the estimated probability to respond and applying that as a secondary weight.



=== Reapportioning Weight ===

The collected measures should reflect the sample (and therefore the population), but incomplete data complicates this. It is common to break the sample into weighting classes based on responsivity, and then reapportion the weight of non-respondents to respondents.

Consider a simple design without eligibility.

||'''Class'''   ||'''Count'''||
||Respondent    ||800        ||
||Non-respondent||200        ||

To re-apportion the weight of non-respondents, the respondents' weight factors would be adjusted by a factor of (800+200)/800 or 1.25. The non-respondents would then be dropped, or assigned weight factors of 0. ''This is, again, a calculation of '''desired over actual'''.''



=== Non-response Bias ===

Responsivity is commonly related to the key measures of a survey, and therefore introduces non-response bias. Weighting can account for this error. The core concept is to use auxiliary frame data (i.e. descriptives known for both respondents ''and'' non-respondents).

Adjustments are applied in phases. Cases with unknown eligibility often cannot be adjusted through these methods, and need to be removed. Ineligible cases often are undesirable in analysis datasets, so weights are further adjusted to account for their removal.

----



== Calibration Adjustments ==

Calibration is a specific type of adjustment, where the intention is to force the measurements to reflect ''known'' descriptives of the population. If the population is known to be 50% female, then the final estimates should reflect that proportion.

Calibration follows from the same basic ideas as above, but involves distinct methods. Weights are often calibrated by many dimensions, requiring a programmed calculation. Methods include:

 * post-stratification (i.e. ''desired over actual'')
 * raking
 * linear calibration (GREG)



=== Raking ===

'''Raking''', or '''RIM weighting''', involves applying post-stratification by each dimension iteratively, until the weights converge. Convergence is defined as the root mean square (RMS) falling below a threshold, typically 0.000005.

Raked weights generally should not be applied if their efficiency falls below 70%.
+In any other case, the variance is given by ''Σ(w^2^σ^2^) / (Σw)^2^''. This ratio must then be linearized or simulated to arrive at an approximate variance. [[Calculus/TaylorSeries|Taylor expansion]] is a common strategy for linearization.

Diff for "Statistics/SurveyWeights"