Differences between revisions 6 and 17 (spanning 11 versions)

Survey Weights

Survey weights account for the design of a survey sample and other biases/errors introduced by a survey instrument.

The Basic Process

 1. Set survey dispositions
 2. Calculate base weights
 3. Apply non-response adjustments to base weights
 4. Calibrate the weights

Base Weights

Base weights incorporate effects from the sampling design. They are the inverse of the probability of being sampled. Think desired over actual.

With a probability sample, base survey weights account for...

 1. probability of selection
 2. probability of responding and providing enough information to confirm eligibility, given selection
 3. probability of being eligible, given selection and response

With a non-probability sample, these probabilities are all unknown. Often this step is skipped altogether.

Examples

Some practical examples for different probability samples:

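 * For a census, all respondents have a weight of 1.
 * For a simple random sample (SRS) design, the base weight is a simple rate: the population size over the sample size. Given a population of 20,000 and a sample size of 667, the probability of being sampled is 667/20,000 = 0.03335, so the base weight is 20,000/667 = 29.99.
 * For a stratified simple random sample (STSRS) design, the same process is applied per stratum.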

Note that, in each, the sum of base weights should equal the population size.
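
The arithmetic above can be checked with a minimal Python sketch. The function name `base_weight` and the two strata (with their sizes) are hypothetical, chosen only so that the strata add up to the 20,000 and 667 used in the SRS example.

```python
def base_weight(population_size, sample_size):
    """Base weight: the inverse of the selection probability n/N."""
    selection_probability = sample_size / population_size
    return 1.0 / selection_probability

# SRS example from the text: N = 20,000 and n = 667.
srs_weight = base_weight(20_000, 667)
print(round(srs_weight, 2))  # 29.99

# STSRS: the same calculation per stratum.
# Hypothetical strata, given as (population size, sample size).
strata = {"urban": (12_000, 400), "rural": (8_000, 267)}
stratum_weights = {name: base_weight(N, n) for name, (N, n) in strata.items()}

# Summing the base weights over all sampled cases recovers the population size.
total = sum(stratum_weights[name] * n for name, (N, n) in strata.items())
print(round(total))  # 20000
```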

Non-Response Adjustments

Collected measures should reflect the sample (and therefore the population), and incomplete data creates gaps. Therefore it is necessary to take non-response into account while weighting data.

There are two main methods for adjusting weights based on non-response:

 1. Weighting class adjustments involve dividing the sample into discrete classes and applying an adjustment factor by class.
 2. Propensity score adjustments involve calculating the inverse of the estimated probability of responding and applying that as a secondary weight.

Examples

A simple demonstration breaks the sample into weighting classes based on responsivity, and then reapportions the weight of non-respondents to respondents.

Consider a simple design in which eligibility is not a concern.

Class	Count
Respondent	800
Non-respondent	200

To re-apportion the weight of non-respondents, the respondents' weight factors would be adjusted by a factor of (800+200)/800 or 1.25. The non-respondents would then be dropped, or assigned weight factors of 0. This is, again, a calculation of desired over actual.
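
A minimal Python sketch of this re-apportionment follows. The first call reproduces the 1.25 factor from the table. The two age classes and their counts are hypothetical, as is the assumption that every case starts with a base weight of 1.

```python
def nonresponse_factor(respondents, nonrespondents):
    """Desired over actual: all sampled cases over responding cases."""
    return (respondents + nonrespondents) / respondents

# Single class from the table above: 800 respondents, 200 non-respondents.
factor = nonresponse_factor(800, 200)
print(factor)  # 1.25

# Hypothetical weighting classes: (respondents, non-respondents) per class.
classes = {"age_18_44": (300, 150), "age_45_plus": (500, 50)}
factors = {name: nonresponse_factor(r, nr) for name, (r, nr) in classes.items()}

# Assuming a base weight of 1 per case, the adjusted respondent weights
# sum back to the full sample size within each class and overall.
adjusted_total = sum(r * factors[name] for name, (r, nr) in classes.items())
print(round(adjusted_total))  # 1000
```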

The abstracted process for a propensity score adjustment in Stata looks like:

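* Fit a logistic model of the response indicator on the predictor(s)
logistic respondent sampled
* Predict the response propensity, for respondents only
predict prop if respondent
* The adjustment factor is the inverse of the estimated propensity
generate double wt = 1/prop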
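
The same idea can be sketched outside Stata. The Python below is an illustration only, not a translation of the commands above: it fits a logistic model with one binary auxiliary variable by Newton-Raphson on grouped counts. The function `fit_logistic`, the auxiliary flag `x`, and the cell counts are all hypothetical.

```python
import math

def fit_logistic(cells, iterations=25):
    """Fit P(respond) = sigmoid(b0 + b1 * x) by Newton-Raphson.

    cells: list of (x, cases sampled, cases responding).
    """
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, n, r in cells:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = r - n * p        # observed minus expected respondents
            w = n * p * (1.0 - p)    # binomial variance term
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 Newton system for the parameter update.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical frame data: 800 of 1,000 sampled cases respond overall.
cells = [(0, 500, 450), (1, 500, 350)]
b0, b1 = fit_logistic(cells)

def propensity(x):
    """Estimated probability of responding for auxiliary value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

# Secondary weight for respondents: inverse of the estimated propensity.
weights = {x: 1.0 / propensity(x) for x, _, _ in cells}
weighted_total = sum(r * weights[x] for x, _, r in cells)
```

With a single binary predictor the fitted propensities equal the cell response rates (0.9 and 0.7 here), so the respondents' weights sum back to the 1,000 sampled cases, matching what a weighting class adjustment on the same cells would give.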

Non-response Bias

Responsivity is commonly related to the key measures of a survey, and therefore introduces non-response bias. Weighting can account for this error. The core concept is to use auxiliary frame data (i.e. descriptives known for both respondents and non-respondents).

Adjustments are applied in phases. Cases with unknown eligibility often cannot be adjusted through these methods, and need to be removed. Ineligible cases often are undesirable in analysis datasets, so weights are further adjusted to account for their removal.

Calibration

Calibration forces the measurements to reflect known descriptives of the population. If the population is known to be 50% female, then the final estimates of the population should not contradict that fact.

Calibration follows from the same basic ideas as above, but involves distinct methods. Weights are often calibrated by many dimensions, requiring a programmed calculation. Methods include:

 * post-stratification (i.e. desired over actual)
 * raking
 * linear calibration (GREG)
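
As an illustration of the simplest of these methods, here is a minimal Python sketch of post-stratification on one dimension, using the 50% female target mentioned above. The function `post_stratify` and the ten-case sample are hypothetical.

```python
def post_stratify(weights, groups, targets):
    """Scale weights so each group's weighted share matches its target share.

    The factor per group is desired over actual: the target total
    divided by the current weighted total of that group.
    """
    total = sum(weights)
    actual = {}
    for w, g in zip(weights, groups):
        actual[g] = actual.get(g, 0.0) + w
    factors = {g: targets[g] * total / actual[g] for g in actual}
    return [w * factors[g] for w, g in zip(weights, groups)]

# Hypothetical sample: 60% female, while the population is known to be 50%.
groups = ["F"] * 6 + ["M"] * 4
weights = [1.0] * 10
adjusted = post_stratify(weights, groups, {"F": 0.5, "M": 0.5})

female_share = sum(w for w, g in zip(adjusted, groups) if g == "F") / sum(adjusted)
print(round(female_share, 3))  # 0.5
```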

Selection of Calibration Dimensions

Quota variables should be selected for calibration, especially when the quotas involved oversampling of some groups.

If key descriptives of the sample appear imbalanced when compared to a 'gold standard' data source (e.g. the census), then those should also be selected.

Lastly, any descriptives that predict key measures should be selected.

Raking

Raking, or RIM weighting, involves applying post-stratification by each dimension iteratively, until the weights converge. Convergence is defined as the root mean square (RMS) falling below a threshold, typically 0.000005.
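
A minimal Python sketch of the iteration follows. The text does not say which quantity the RMS is taken over; this sketch assumes it is the change in each weight between successive full passes. The function `rake`, the sample, and the target margins are hypothetical.

```python
import math

def rake(weights, rows, margins, threshold=0.000005, max_iter=1000):
    """Post-stratify on each dimension in turn until the weights settle.

    rows:    one dict per case, mapping dimension -> category.
    margins: dimension -> {category: target weighted total}.
    """
    w = list(weights)
    for _ in range(max_iter):
        previous = list(w)
        for dim, targets in margins.items():
            totals = {}
            for wi, row in zip(w, rows):
                totals[row[dim]] = totals.get(row[dim], 0.0) + wi
            w = [wi * targets[row[dim]] / totals[row[dim]]
                 for wi, row in zip(w, rows)]
        # Assumed convergence measure: RMS change in weights over one pass.
        rms = math.sqrt(sum((a - b) ** 2 for a, b in zip(w, previous)) / len(w))
        if rms < threshold:
            break
    return w

# Hypothetical sample of 10 cases and population margins totalling 100.
rows = ([{"sex": "F", "age": "young"}] * 3 + [{"sex": "F", "age": "old"}] * 3
        + [{"sex": "M", "age": "young"}] * 1 + [{"sex": "M", "age": "old"}] * 3)
margins = {"sex": {"F": 50, "M": 50}, "age": {"young": 40, "old": 60}}
raked = rake([10.0] * 10, rows, margins)

female_total = sum(w for w, r in zip(raked, rows) if r["sex"] == "F")
young_total = sum(w for w, r in zip(raked, rows) if r["age"] == "young")
```

After convergence both margins are reproduced at once, which a single pass of post-stratification on either dimension would not guarantee.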

Raked weights generally should not be applied if their efficiency falls below 70%.
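
The text does not define efficiency. A common definition, assumed here, is Kish's weighting efficiency: the squared sum of the weights divided by the sample size times the sum of squared weights. The Python sketch below uses that assumption; the function name and example weights are hypothetical.

```python
def weighting_efficiency(weights):
    """Kish's weighting efficiency (assumed definition), between 0 and 1."""
    n = len(weights)
    return sum(weights) ** 2 / (n * sum(w * w for w in weights))

# Equal weights are fully efficient.
equal = weighting_efficiency([1.0] * 10)

# More variable weights lose efficiency.
moderate = weighting_efficiency([0.5] * 5 + [1.5] * 5)
extreme = weighting_efficiency([0.25] * 5 + [1.75] * 5)

for label, value in [("equal", equal), ("moderate", moderate), ("extreme", extreme)]:
    verdict = "ok" if value >= 0.70 else "below the 70% guideline"
    print(f"{label}: {value:.0%} ({verdict})")
```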


- Revision 6 as of 2020-10-22 20:00:50
  Size: 1984
  Editor: DominicRicottone
  Comment:
+ Revision 17 as of 2025-01-10 16:10:44
  Size: 4740
  Editor: DominicRicottone
  Comment: Killing SurveyStatistics page
-Deletions are marked like this.
+Additions are marked like this.
 Line 13:
- 1. Set survey dispositions
+ 1. Set [[Statistics/SurveyDisposition|survey dispositions]]
 Line 17:
-See [[SurveyDisposition|here]] for details about survey dispositions.
-Line 24:
+Line 22:
-== Calculating Weights ==
+== Base Weights ==
-Line 26:
+Line 24:
-The base weight is the inverse of the probability of being sampled. Think ''desired over actual''. As such, the sum of base weights should equal the population size.
+'''Base weights''' incorporate effects from the [[SurveySamples#Sample_Type|sampling design]]. They are the inverse of the probability of being sampled. ''Think '''desired over actual'''.''
-Line 28:
+Line 26:
-For a SRS design, this is calculated as a simple rate. Given a population of 20,000 and a sample size of 667, the propbability of being sampled is 20,000/667 = '''29.99'''.
+With a ''probability sample'', base survey weights account for...
-Line 30:
+Line 28:
-For a STSRS design, the same process is applied per stratum.
+ 1. probability of selection
 2. probability of responding and providing enough information to confirm eligibility, given selection
 3. probability of being eligible, given selection and response

With a ''non-probability sample'', these probabilities are all unknown. Often this step is skipped altogether.



=== Examples ===

Some practical examples for different probability samples:

 * For a ''census'', all respondents have a weight of '''''1'''''.
 * For a ''SRS design'', the base weight is calculated as a simple rate. Given a population of ''20,000'' and a sample size of ''667'', the probability of being sampled is 667/20,000 = 0.03335, so the base weight is 20,000/667 = '''''29.99'''''.
 * For a ''STSRS design'', the same process is applied per stratum.

Note that, in each, the sum of base weights should equal the population size.
-Line 38:
+Line 52:
-Survey weights can adjust for non-response bias. The core concept is to use auxiliary frame data (i.e. descriptives known for ''both'' respondents and non-respondents) that is correlated with key measures or responsivity.
+Collected measures should reflect the sample (and therefore the population), and incomplete data creates gaps. Therefore it is necessary to take non-response into account while weighting data.
-Line 40:
+Line 54:
-'''Weighting class adjustments''' divides the sample into weighting classes and applies a class-specific adjustment factor to every case.
+There are two main methods for adjusting weights based on non-response:
-Line 42:
+Line 56:
-'''Propensity score adjustments''' calculates the inverse of the estimated probability to respond and applies that as a secondary weight.
+ 1. '''Weighting class adjustments''' involve dividing the sample into discrete classes and applying an adjustment factor by class.
 2. '''Propensity score adjustments''' involve calculating the inverse of the estimated probability to respond and applying that as a secondary weight.



=== Examples ===

A simple demonstration breaks the sample into weighting classes based on responsivity, and then reapportions the weight of non-respondents to respondents.

Consider a simple design without eligibility.

||'''Class'''   ||'''Count'''||
||Respondent    ||800        ||
||Non-respondent||200        ||

To re-apportion the weight of non-respondents, the respondents' weight factors would be adjusted by a factor of (800+200)/800 or 1.25. The non-respondents would then be dropped, or assigned weight factors of 0. ''This is, again, a calculation of '''desired over actual'''.''

The abstracted process in [[Stata]] looks like:

{{{
logistic respondent sampled
predict prop if respondent
generate double wt = 1/prop
}}}



=== Non-response Bias ===

Responsivity is commonly related to the key measures of a survey, and therefore introduces '''non-response bias'''. Weighting can account for this error. The core concept is to use auxiliary frame data (i.e. descriptives known for both respondents ''and'' non-respondents).
-Line 50:
+Line 93:
-== Calibration Adjustments ==
+== Calibration ==
-Line 52:
+Line 95:
-Survey weights can be adjusted to ensure that known population descriptives are reflected in the estimates.
+'''Calibration''' forces the measurements to reflect ''known'' descriptives of the population. If the population is ''known'' to be 50% female, then the final estimates of the population should not contradict that fact.
-Line 54:
+Line 97:
-Methods include:
+Calibration follows from the same basic ideas as above, but involves distinct methods. Weights are often calibrated by many dimensions, requiring a programmed calculation. Methods include:
-Line 62:
+Line 105:
+=== Selection of Calibration Dimensions ===

Quota variables should be selected for calibration, ''especially'' when the quotas involved oversampling of some groups.

If key descriptives of the sample appear imbalanced when compared to a 'gold standard' data source (e.g. the census), then those should also be selected.

Lastly, any descriptives that predict key measures should be selected.



=== Raking ===

'''Raking''', or '''RIM weighting''', involves applying post-stratification by each dimension iteratively, until the weights converge. Convergence is defined as the root mean square (RMS) falling below a threshold, typically 0.000005.

Raked weights generally should not be applied if their efficiency falls below 70%.

Diff for "SurveyWeights"
