Differences between revisions 3 and 31 (spanning 28 versions)
Revision 3 as of 2020-10-22 19:58:03
Size: 1938
Comment:
Revision 31 as of 2026-02-11 19:20:24
Size: 5205
Comment: Links
Line 3: Line 3:
Survey weights account for the design of a survey sample and other biases/errors introduced by a survey instrument. '''Survey weights''' account for the [[Statistics/SurveySampling|survey design]], [[Statistics/SurveyInference#Sampling_Error|sampling error]], and [[Statistics/SurveyInference#Non-sampling_Error|non-sampling error]].
Line 11: Line 11:
== The Basic Process == == Description ==
Line 13: Line 13:
 1. Set survey dispositions
 2. Set base weights
 3. Apply non-response adjustments to base weights
 4. Calibrate the weights
Survey data is collected through a mechanism which can be specified statistically. If it is not specified, bias can be introduced and [[Analysis/Estimation|estimates]] can be over-confident.
Line 18: Line 15:
See [[SurveyDisposition|here]] for details about survey dispositions. [[Statistics/InverseVarianceWeights|Inverse variance weights]] are related, but not the same.

Survey weights begin with [[Statistics/DesignWeights|design weights]] reflecting [[Statistics/SurveySampling|probability of selection]]. Generally this is simply the inverse of the sampling probability: ''N,,k,,/n,,k,,'' for each stratum ''k'', where ''N,,k,,'' is the stratum population size and ''n,,k,,'' is the stratum sample size.

All real surveys feature [[Statistics/SurveyInference#Non-sampling_Error|non-sampling error]], especially [[Statistics/SurveyNonresponse|nonresponse]]. If nonresponse is uncorrelated with key metrics, it is negligible. Otherwise there is potential for [[Statistics/NonresponseBias|nonresponse bias]]. This bias can be corrected through survey weights in a few ways:
 * [[Statistics/InverseProbabilityWeights|inverse propensity adjustments]]
 * [[Statistics/WeightingClassAdjustment|weighting class adjustments]]

Modeling on insignificant or uncorrelated attributes does not introduce bias, but it does inflate [[Statistics/Variance|variance]].

[[Statistics/Calibration|Calibration]] can be used to:
 * make estimates consistent with known population proportions
 * correct [[Statistics/SurveyInference#Sampling_Error|sampling error]] like undercoverage or overcoverage
 * further correct for non-sampling error like nonresponse bias

The methods here include:
 * raking
 * iterative proportional fitting
 * RIM weighting
 * [[Statistics/GeneralizedRegressionEstimator|GREG estimators]]
Line 24: Line 40:
== Calculating Weights == == Weighted Estimators ==
Line 26: Line 42:
The base weight is the inverse of the probability of being sampled. Think ''desired over actual''. As such, the sum of base weights should equal the population size. Survey weights ''w'' are designed such that a population proportion ''μ'' can be calculated using the weighted estimator ''Σ(wx) / Σw''.
Line 28: Line 44:
For an SRS design, this is calculated as a simple rate. Given a population of 20,000 and a sample size of 667, the base weight is 20,000/667 ≈ '''29.99'''. In the case that all cases have equal weight, the estimator reduces to the unweighted mean of the ''n'' sampled cases, and [[Statistics/Moments#Description|it is straightforward to show]] that its [[Statistics/Variance|variance]] is ''σ^2^/n''.
Line 30: Line 46:
For a STSRS design, the same process is applied per stratum. In any other case, the variance is given by ''Σ(w^2^σ^2^) / (Σw)^2^''. This ratio must then be linearized or simulated to arrive at an approximate variance. [[Calculus/TaylorSeries|Taylor expansion]] is a common strategy for linearization.
Line 36: Line 52:
== Non-Response Adjustments == == Reading Notes ==
Line 38: Line 54:
Survey weights can adjust for non-response bias. The core concept is to use auxiliary frame data (i.e. descriptives known for ''both'' respondents and non-respondents) that is correlated with key measures or responsivity.

'''Weighting class adjustments''' divides the sample into weighting classes and applies a class-specific adjustment factor to every case.

'''Propensity score adjustments''' calculates the inverse of the estimated probability to respond and applies that as a secondary weight.

Adjustments are applied in phases. Cases with unknown eligibility often cannot be adjusted through these methods, and need to be removed. Ineligible cases often are undesirable in analysis datasets, so weights are further adjusted to account for their removal.

----



== Calibration ==

Survey weights are adjusted again to ensure that known population descriptives are reflected in the estimates.

Methods include:

 * post-stratification
 * raking
 * linear calibration (GREG)
 * [[CalibrationEstimatorsInSurveySampling|Calibration estimators in survey sampling]], Jean-Claude Deville and Carl-Erik Särndal, 1992
 * [[TheEffectOfWeightTrimmingOnNonlinearSurveyEstimates|The Effect of Weight Trimming on Nonlinear Survey Estimates]], Frank J. Potter, 1993
 * [[SamplingWeightsAndRegressionAnalysis|Sampling Weights and Regression Analysis]], Christopher Winship and Larry Radbill, 1994
 * [[ImprovingOnProbabilityWeightingForHouseholdSize|Improving on Probability Weighting for Household Size]], Andrew Gelman and Thomas C. Little, 1998
 * [[RandomEffectsModelsForSmoothingPoststratificationWeights|Random-effects Models for Smoothing Poststratification Weights]], Laura C. Lazzeroni and Roderick J.A. Little, 1998
 * [[TheGeneralizedExponentialModelForSamplingWeightCalibrationForExtremeValuesNonresponseAndPostStratification|The generalized exponential model for sampling weight calibration for extreme values, nonresponse, and poststratification]], R.E. Folsom and A.C. Singh, 2000
 * [[UsingCalibrationWeightingToAdjustForNonresponseAndCoverageErrors|Using Calibration Weighting to Adjust for Nonresponse and Coverage Errors]], Phillip S. Kott, 2006
 * [[StrugglesWithSurveyWeightingAndRegressionModeling|Struggles with Survey Weighting and Regression Modeling]], Andrew Gelman, 2007
 * [[TheCalibrationApproachInSurveyTheoryAndPractice|The calibration approach in survey theory and practice]], Carl-Erik Särndal, 2007
 * [[ASingleFrameMultiplicityEstimatorForMultipleFrameSurveys|A single frame multiplicity estimator for multiple frame surveys]], Fulvia Mecatti, 2007
 * [[PracticalConsiderationsInRakingSurveyData|Practical Considerations in Raking Survey Data]]; Michael P Battaglia, David C Hoaglin, and Martin R Frankel (and sometimes David Izrael); 2009
 * [[StatisticalParadisesAndParadoxesInBigData|Statistical Paradises and Paradoxes in Big Data]], Xiao-Li Meng, 2018
 * [[ANewParadigmForPolling|A New Paradigm for Polling]], Michael A. Bailey, 2023
 * [[TheLawOfLargePopulationsDoesNotHeraldAParadigmShiftInSurveySampling|The “Law of Large Populations” Does Not Herald a Paradigm Shift in Survey Sampling]], Roderick J. Little, 2023
 * [[SurveysOfConsumersTechnicalReport|Surveys of Consumers Technical Report: Technical Documentation for the 2024 Methodological Transition to Web Surveys]], 2024
 * [[TheEffectOfOnlineInterviewsOnTheUniversityOfMichiganSurveyOfConsumerSentiment|The effect of online interviews on the University of Michigan Survey of Consumer Sentiment]], Ryan Cummings and Ernie Tedeschi, 2024

Survey Weights

Survey weights account for the survey design, sampling error, and non-sampling error.


Description

Survey data is collected through a mechanism which can be specified statistically. If it is not specified, bias can be introduced and estimates can be over-confident.

Inverse variance weights are related, but not the same.

Survey weights begin with design weights reflecting probability of selection. Generally this is simply the inverse of the sampling probability: N_k/n_k for each stratum k, where N_k is the stratum population size and n_k is the stratum sample size.
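As a minimal sketch of the design weight calculation, the snippet below takes the weight in each stratum to be the stratum population count divided by the stratum sample count, i.e. the inverse of the selection probability under stratified simple random sampling. The function name, the stratum labels, and the counts are hypothetical and chosen only for illustration.

```python
def design_weights(population_sizes, sample_sizes):
    """Return the design weight (population count / sample count) per stratum."""
    weights = {}
    for stratum, n_k in sample_sizes.items():
        N_k = population_sizes[stratum]
        if n_k <= 0 or n_k > N_k:
            raise ValueError(f"invalid sample size for stratum {stratum!r}")
        weights[stratum] = N_k / n_k
    return weights


# Hypothetical frame counts and sample sizes for two strata.
population_sizes = {"urban": 12000, "rural": 8000}
sample_sizes = {"urban": 400, "rural": 200}

weights = design_weights(population_sizes, sample_sizes)

# Summing the weights over all sampled cases should recover the population size.
weighted_total = sum(weights[k] * sample_sizes[k] for k in sample_sizes)
```

Checking that the weights sum to the frame total is a cheap sanity test before any further adjustment is applied.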

All real surveys feature non-sampling error, especially nonresponse. If nonresponse is uncorrelated with key metrics, it is negligible. Otherwise there is potential for nonresponse bias. This bias can be corrected through survey weights in a few ways:

  • inverse propensity adjustments
  • weighting class adjustments
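A weighting class adjustment can be sketched roughly as follows. This is an illustration under simplifying assumptions, not a definitive implementation: every sampled case is assumed to be eligible, each class is assumed to have at least one respondent, and the field names (cls, weight, responded) are hypothetical. Within each class, respondent weights are scaled up so that they carry the weight of the nonrespondents in the same class.

```python
from collections import defaultdict


def weighting_class_adjust(cases):
    """Return respondents only, with weights inflated by a class-specific factor."""
    total_weight = defaultdict(float)       # weight of all sampled cases per class
    respondent_weight = defaultdict(float)  # weight of respondents per class
    for case in cases:
        total_weight[case["cls"]] += case["weight"]
        if case["responded"]:
            respondent_weight[case["cls"]] += case["weight"]

    adjusted = []
    for case in cases:
        if case["responded"]:
            factor = total_weight[case["cls"]] / respondent_weight[case["cls"]]
            adjusted.append({**case, "weight": case["weight"] * factor})
    return adjusted


# Hypothetical sample: class A has 2 of 4 respondents, class B has 1 of 4.
cases = (
    [{"cls": "A", "weight": 10.0, "responded": r} for r in (True, True, False, False)]
    + [{"cls": "B", "weight": 5.0, "responded": r} for r in (True, False, False, False)]
)
adjusted = weighting_class_adjust(cases)
adjusted_total = sum(case["weight"] for case in adjusted)
```

The adjusted respondent weights sum to the same total as the original sample weights, which is the property the adjustment is meant to preserve.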

Modeling on insignificant or uncorrelated attributes does not introduce bias, but it does inflate variance.

Calibration can be used to:

  • make estimates consistent with known population proportions
  • correct sampling error like undercoverage or overcoverage
  • further correct for non-sampling error like nonresponse bias

The methods here include:

  • raking
  • iterative proportional fitting
  • RIM weighting
  • GREG estimators
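Raking, which is the same procedure as iterative proportional fitting, can be sketched as below. The code repeatedly rescales weights so that the weighted totals match known population totals on one margin at a time, stopping once the scaling factors are all close to 1. The function signature, field names, and target totals are hypothetical, and the sketch assumes every category in the targets appears in the sample.

```python
def rake(cases, targets, max_iter=100, tol=1e-8):
    """Adjust weights so weighted category totals match the targets on each margin.

    cases:   list of dicts with a 'weight' key and one key per margin
    targets: {margin: {category: known population total}}
    """
    weights = [case["weight"] for case in cases]
    for _ in range(max_iter):
        max_change = 0.0
        for margin, target in targets.items():
            # Current weighted total per category on this margin.
            totals = {}
            for case, w in zip(cases, weights):
                totals[case[margin]] = totals.get(case[margin], 0.0) + w
            # Scale each case so its category total hits the target.
            for i, case in enumerate(cases):
                factor = target[case[margin]] / totals[case[margin]]
                weights[i] *= factor
                max_change = max(max_change, abs(factor - 1.0))
        if max_change < tol:
            break
    return weights


# Hypothetical sample with one case per sex-by-age cell and unit starting weights.
cases = [
    {"sex": "m", "age": "young", "weight": 1.0},
    {"sex": "m", "age": "old", "weight": 1.0},
    {"sex": "f", "age": "young", "weight": 1.0},
    {"sex": "f", "age": "old", "weight": 1.0},
]
targets = {
    "sex": {"m": 60.0, "f": 40.0},
    "age": {"young": 30.0, "old": 70.0},
}
raked = rake(cases, targets)
```

Only the marginal totals are matched; the joint distribution across margins is not controlled, which is the usual trade-off relative to full post-stratification.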


Weighted Estimators

Survey weights w are designed such that a population proportion μ can be calculated using the weighted estimator Σ(wx) / Σw.
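The weighted estimator Σ(wx) / Σw can be computed directly. The snippet below is a minimal sketch; the function name and the example responses and weights are hypothetical.

```python
def weighted_mean(values, weights):
    """Return the weighted estimator sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * x for w, x in zip(weights, values)) / total_weight


# Hypothetical binary responses (1 = yes) and final survey weights.
values = [1, 0, 1, 1]
weights = [30.0, 30.0, 40.0, 40.0]

estimate = weighted_mean(values, weights)           # (30 + 40 + 40) / 140
unweighted = weighted_mean(values, [1.0] * len(values))
```

With equal weights the estimator is just the ordinary sample mean, which is why the weights only matter when they vary across cases.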

In the case that all cases have equal weight, the estimator reduces to the unweighted mean of the n sampled cases, and it is straightforward to show that its variance is σ²/n.

In any other case, the variance is given by Σ(w²σ²) / (Σw)². This ratio must then be linearized or simulated to arrive at an approximate variance. Taylor expansion is a common strategy for linearization.
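One common form of the Taylor linearization can be sketched as follows. This is an illustration under strong assumptions, not a general variance estimator: it treats the sample as a single stratum of independent draws (with-replacement sampling, no clustering, no finite population correction). Each case contributes a linearized residual z = w(x − μ̂) / Σw, and the variance is approximated from the squared residuals. The function name and data are hypothetical.

```python
def linearized_variance(values, weights):
    """Approximate the variance of sum(w * x) / sum(w) by Taylor linearization.

    Assumes a single stratum of independently sampled cases.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two cases are needed")
    total_weight = sum(weights)
    estimate = sum(w * x for w, x in zip(weights, values)) / total_weight

    # Linearized residuals; they sum to zero by construction.
    residuals = [w * (x - estimate) / total_weight for w, x in zip(weights, values)]
    return n / (n - 1) * sum(z * z for z in residuals)


# Hypothetical binary responses with unequal and equal weights.
values = [1, 0, 1, 1]
variance_unequal = linearized_variance(values, [30.0, 30.0, 40.0, 40.0])

# Sanity check: with equal weights the result equals the sample variance over n.
variance_equal = linearized_variance(values, [1.0, 1.0, 1.0, 1.0])
```

Stratified or clustered designs need the residuals aggregated within strata and primary sampling units first, which is what dedicated survey software handles.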


Reading Notes


CategoryRicottone

Statistics/SurveyWeights (last edited 2026-02-11 19:20:24 by DominicRicottone)