Survey Weights

Survey weights account for the design of a survey sample and non-sampling error.


Description

Survey weights begin with the inverse of the sampling probability. This is known as the base weight.
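As a sketch in Python, assuming the selection probability of each sampled unit is known (the probabilities below are made up):

    import numpy as np

    # Selection probabilities for five sampled units (hypothetical values)
    selection_prob = np.array([0.01, 0.01, 0.05, 0.02, 0.02])

    # The base weight is the inverse of the sampling probability; each unit
    # "represents" this many population units
    base_weight = 1 / selection_prob

    print(base_weight)  # [100. 100.  20.  50.  50.]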

The weight of non-respondents, or more generally of anyone who cannot be used for analysis, is reallocated to respondents. This is usually done in a manner that accounts for non-sampling error, especially measurable non-response bias. In the simplest case, though, if there are no meaningful predictors of response propensity, the weights of non-respondents can be set to 0 and the weights of respondents can be scaled up by a corresponding flat adjustment factor.
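A sketch of that simplest case, where the total weight of the sample is preserved by scaling respondents up with a flat factor (the weights and response indicator are illustrative):

    import numpy as np

    base_weight = np.array([100., 100., 20., 50., 50.])
    responded = np.array([True, False, True, True, False])

    # Flat adjustment factor: total base weight over respondents' base weight
    adjustment = base_weight.sum() / base_weight[responded].sum()

    # Non-respondents' weights are set to 0; respondents absorb the
    # reallocated weight
    weight = np.where(responded, base_weight * adjustment, 0.0)

    print(base_weight.sum(), weight.sum())  # totals match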

The final step is post-stratification. This can address sampling errors such as undercoverage. Typically, post-stratification is done across a large enough set of discrete dimensions that the true population counts are not known for every cell. An algorithm called raking or calibration is used to approximate the adjustment.


Non-Response Adjustments

Non-response bias exists when non-response is correlated with a metric of interest, introducing bias into the population estimate.

If non-response is measurable, i.e. response propensity can be predicted using auxiliary information known about the entire sample, then it can also be corrected for.

A weighting class adjustment is calculated by using predicted propensity to segment the sample, leading to a response rate per class. Within each class, the inverse of the response rate is the non-response adjustment. Non-respondents have their weight set to 0, as it has been reallocated to respondents that are predicted to be similar in terms of response patterns.
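A sketch of a weighting class adjustment, assuming classes have already been formed (for example, by binning predicted propensities) and using a weighted response rate; the data and class labels are made up:

    import pandas as pd

    df = pd.DataFrame({
        "base_weight": [100., 100., 20., 50., 50., 80.],
        "wclass":      ["a",  "a",  "b", "b", "b", "a"],
        "responded":   [True, False, True, True, False, True],
    })

    # Weighted response rate per class: respondents' base weight over all base weight
    resp_weight = df["base_weight"].where(df["responded"], 0.0)
    rate = (resp_weight.groupby(df["wclass"]).sum()
            / df["base_weight"].groupby(df["wclass"]).sum())

    # Within each class, the adjustment is the inverse of the response rate;
    # non-respondents' weights are set to 0
    df["weight"] = df["responded"] * df["base_weight"] / df["wclass"].map(rate)

    print(df)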

A propensity score adjustment is calculated as the inverse of predicted propensity.
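A sketch using a logistic regression to predict response propensity from auxiliary variables known for the entire sample; the variables, model, and simulated data are purely illustrative:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 500

    # Auxiliary information known for the entire sample (illustrative)
    age = rng.integers(18, 80, size=n)
    urban = rng.integers(0, 2, size=n)
    X = np.column_stack([age, urban])

    # Simulated response indicator (older, urban units respond more often here)
    p_true = 1 / (1 + np.exp(-(-2.0 + 0.03 * age + 0.5 * urban)))
    responded = rng.random(n) < p_true

    base_weight = np.full(n, 50.0)

    # Predict response propensity for every sampled unit
    propensity = LogisticRegression().fit(X, responded).predict_proba(X)[:, 1]

    # The adjustment is the inverse of the predicted propensity;
    # non-respondents' weights are set to 0
    weight = np.where(responded, base_weight / propensity, 0.0)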

Inclusion of insignificant or uncorrelated predictors does not introduce bias into such an adjustment, but it does decrease precision by increasing the variance of the estimates. As such, when utilizing a linear model for predictions, it is common to use stepwise removal of covariates.


Post-Stratification

Post-stratification is applied because some characteristics of the true population are known, and furthermore are expected to correlate with the metric of interest. By forcing the survey weights to match the known distribution, they are more likely to correct for biases introduced by sampling errors. The population estimates are also more applicable to the true population.
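A sketch of post-stratification on a single dimension with known population counts (the data and counts are made up):

    import pandas as pd

    df = pd.DataFrame({
        "weight": [120., 80., 100., 90., 110., 100.],
        "region": ["north", "north", "north", "south", "south", "south"],
    })

    # Known population counts per region (hypothetical)
    population = pd.Series({"north": 4000., "south": 2000.})

    # Scale each stratum so that its weighted total matches the known count
    factor = population / df.groupby("region")["weight"].sum()
    df["ps_weight"] = df["weight"] * df["region"].map(factor)

    print(df.groupby("region")["ps_weight"].sum())  # matches the population counts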

As a result, there are circumstances where post-stratified weights are not applicable. For example, when modeling non-response, the population of interest is in fact the sample, not the true population.

Post-stratification is often done according to many complex dimensions, for example the interactions of sex by age bins (male and 18-24; male and 25-34; and so on). True population counts for the margins of these dimensions are usually available, but not necessarily for the cells/intersections. Furthermore, some intersections are likely to have so few respondents that the weights would be inappropriately large.

Iterative proportional fitting, more generally known as raking, is an algorithm for post-stratification in such a circumstance. It involves looping over the dimensions, post-stratifying the weights toward those marginal counts one at a time. This small loop is then repeated in a larger loop until a convergence criterion is achieved, or for a pre-determined number of iterations. RIM (random iterative method) weighting is essentially the same thing.
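A minimal sketch of raking over two dimensions, adjusting the weights toward the marginal counts of one dimension at a time and repeating for a fixed number of iterations (the data and targets are made up; note that the marginal targets of each dimension must sum to the same population total):

    import pandas as pd

    df = pd.DataFrame({
        "weight": [100., 100., 100., 100., 100., 100.],
        "sex":    ["m", "m", "m", "f", "f", "f"],
        "age":    ["18-34", "35-54", "55+", "18-34", "35-54", "55+"],
    })

    # Known marginal population counts for each dimension (hypothetical)
    targets = {
        "sex": pd.Series({"m": 280., "f": 320.}),
        "age": pd.Series({"18-34": 150., "35-54": 250., "55+": 200.}),
    }

    for _ in range(50):                      # larger loop: repeat until stable
        for dim, target in targets.items():  # small loop: one dimension at a time
            achieved = df.groupby(dim)["weight"].sum()
            df["weight"] *= df[dim].map(target / achieved)

    # After convergence, the weighted margins match the targets on every dimension
    print(df.groupby("sex")["weight"].sum())
    print(df.groupby("age")["weight"].sum())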

Calibration, or GREG (generalized regression) estimation, is a more general algorithm. It utilizes a linear regression model to re-weight towards known control totals, of which marginal counts are a special case.
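A sketch of one common formulation, linear calibration, where each base weight is multiplied by a factor that is linear in the auxiliary variables and the multiplier is solved so that the weighted totals hit the control totals; the auxiliary variables and totals below are made up:

    import numpy as np

    # Base (design) weights and auxiliary variables for each respondent
    d = np.array([100., 100., 20., 50., 50.])
    X = np.array([
        [1., 0., 25.],   # intercept, urban indicator, age (illustrative)
        [1., 1., 40.],
        [1., 1., 35.],
        [1., 0., 60.],
        [1., 1., 50.],
    ])

    # Known population totals of those auxiliary variables (hypothetical)
    T = np.array([330., 190., 14000.])

    # Linear calibration: w = d * (1 + X @ lam), with lam chosen so that X.T @ w = T
    lam = np.linalg.solve(X.T @ (d[:, None] * X), T - X.T @ d)
    w = d * (1 + X @ lam)

    print(X.T @ w)  # reproduces the control totals T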

In terms of automated convergence criteria, a common choice is to stop when the root mean square (RMS) of the change in the weights between iterations falls below a threshold like 0.000005. Another is to stop when the largest absolute change to any single weight falls below a threshold like 0.0001.
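A sketch of how these two stopping rules might be computed inside the larger raking loop; the function and thresholds are illustrative:

    import numpy as np

    def weight_change(old_weights, new_weights):
        # Summaries of the change in weights between successive iterations
        change = np.asarray(new_weights) - np.asarray(old_weights)
        rms = np.sqrt(np.mean(change ** 2))   # root mean square of the changes
        largest = np.abs(change).max()        # largest absolute change to any weight
        return rms, largest

    # Inside the larger raking loop, stop once either summary is small enough:
    #     rms, largest = weight_change(previous_weights, df["weight"])
    #     if rms < 0.000005 or largest < 0.0001:
    #         break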


CategoryRicottone