Survey Weights
Survey weights account for the survey design, sampling error, and non-sampling error.
Description
Survey data is collected through a mechanism which can be specified statistically. If it is not specified, bias can be introduced and estimates can be over-confident.
Inverse variance weights are related, but not the same.
Survey weights begin with a design weight reflecting probability of selection. Generally this is simply the inverse of the sampling probability: if n_k of the N_k units in stratum k are sampled, the design weight in that stratum is N_k / n_k.
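As a minimal sketch (the strata and sizes below are hypothetical), the design weight per stratum is the stratum population size divided by the stratum sample size:

```python
# Hypothetical stratified sample: N_k population units and n_k sampled units per stratum k.
population = {"urban": 60000, "rural": 40000}  # N_k
sample = {"urban": 300, "rural": 400}          # n_k

# Sampling probability in stratum k is n_k / N_k, so the design (base)
# weight is its inverse, N_k / n_k.
design_weights = {k: population[k] / sample[k] for k in population}
print(design_weights)  # {'urban': 200.0, 'rural': 100.0}
```

A quick check on such weights: each stratum's weight times its sample size sums back to the population total (300 × 200 + 400 × 100 = 100,000).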
All real surveys feature non-sampling error, especially nonresponse. If nonresponse is uncorrelated with key metrics, it is negligible. Otherwise there is potential for nonresponse bias. This bias can be corrected through survey weights in a few ways:
- inverse propensity adjustments
- weighting class adjustments
Modeling on insignificant or uncorrelated attributes does not introduce bias, but it does inflate variance.
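A weighting class adjustment can be sketched as follows (the classes, weights, and response flags are hypothetical). Within each class, respondents' weights are divided by the class response rate and nonrespondents' weights are set to zero, so each class's weight total is preserved:

```python
from collections import defaultdict

# Hypothetical cases grouped into weighting classes, with design weights w.
cases = [
    {"cls": "A", "responded": True,  "w": 10.0},
    {"cls": "A", "responded": True,  "w": 10.0},
    {"cls": "A", "responded": False, "w": 10.0},
    {"cls": "B", "responded": True,  "w": 20.0},
    {"cls": "B", "responded": False, "w": 20.0},
]

# Weighted response rate per class: respondent weight / total weight.
total, resp = defaultdict(float), defaultdict(float)
for c in cases:
    total[c["cls"]] += c["w"]
    if c["responded"]:
        resp[c["cls"]] += c["w"]

# Adjust: respondents scaled up by the inverse response rate, nonrespondents zeroed.
for c in cases:
    rate = resp[c["cls"]] / total[c["cls"]]
    c["w"] = c["w"] / rate if c["responded"] else 0.0
```

Class A has a 2/3 response rate, so its two respondents go from 10 to 15 each; class B has a 1/2 rate, so its respondent goes from 20 to 40. The total weight in each class (30 and 40) is unchanged.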
Calibration can be used to:
- make estimates be consistent with known true population proportions
- correct sampling error like undercoverage or overcoverage
- further correct for non-sampling error like nonresponse bias
The methods here include:
- raking
- iterative proportional fitting
- RIM weighting
- GREG estimators
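A raking sketch under hypothetical margins: a 2×2 sex-by-age table of respondent counts is alternately scaled so that each dimension's weighted totals match known population margins (iterative proportional fitting):

```python
import numpy as np

# Hypothetical respondent counts in sex x age cells
# (rows: male/female; columns: 18-34/35+).
counts = np.array([[30.0, 10.0],
                   [20.0, 40.0]])
weights = np.ones_like(counts)  # flat starting weights

row_targets = np.array([50.0, 50.0])  # known sex margins
col_targets = np.array([40.0, 60.0])  # known age margins

for _ in range(100):
    # Scale rows so the weighted row totals hit the row targets...
    weighted = weights * counts
    weights *= (row_targets / weighted.sum(axis=1))[:, None]
    # ...then scale columns toward the column targets.
    weighted = weights * counts
    weights *= (col_targets / weighted.sum(axis=0))[None, :]

weighted = weights * counts  # both margins now match to numerical precision
```

In practice a convergence criterion on the change in the weights replaces the fixed iteration count, and production implementations also handle empty cells and weight trimming.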
Weighted Estimators
Survey weights w are designed such that a population mean μ (a proportion, when x is binary) can be calculated using the weighted estimator Σ(wx) / Σw.
In the case that all cases have equal weight, it is straightforward to show that the variance of that estimator reduces to σ²/n.
In any other case, the variance is given by Σ(w²σ²) / (Σw)². Because the weighted estimator is itself a ratio, in practice it is linearized or simulated to arrive at an approximate variance. Taylor expansion is a common strategy for linearization.
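These formulas can be checked numerically with hypothetical data (the responses, weights, and σ² below are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.5, 1.0, size=1000)   # hypothetical responses, sigma^2 = 1
w = rng.uniform(1.0, 3.0, size=1000)  # hypothetical survey weights

# Weighted estimator of the mean: sum(w*x) / sum(w).
mu_hat = np.sum(w * x) / np.sum(w)

# Treating the weights as fixed: Var = sum(w^2 sigma^2) / (sum(w))^2.
sigma2 = 1.0
var_hat = np.sum(w**2 * sigma2) / np.sum(w) ** 2

# With equal weights the same formula reduces to sigma^2 / n.
n = len(x)
equal = np.full(n, 2.0)
var_equal = np.sum(equal**2 * sigma2) / np.sum(equal) ** 2  # = sigma2 / n
```

Unequal weights can only increase this variance relative to σ²/n; the ratio of the two is the design effect due to weighting.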
Reading Notes
- The Effect of Weight Trimming on Nonlinear Survey Estimates, Frank J. Potter, 1993
- Sampling Weights and Regression Analysis, Christopher Winship and Larry Radbill, 1994
- Improving on Probability Weighting for Household Size, Andrew Gelman and Thomas C. Little, 1998
- Using Calibration Weighting to Adjust for Nonresponse and Coverage Errors, Phillip S. Kott, 2006
- Struggles with Survey Weighting and Regression Modeling, Andrew Gelman, 2007
- The calibration approach in survey theory and practice, Carl-Erik Särndal, 2007
- A single frame multiplicity estimator for multiple frame surveys, Fulvia Mecatti, 2007
- Practical Considerations in Raking Survey Data; Michael P. Battaglia, David C. Hoaglin, and Martin R. Frankel (and sometimes David Izrael); 2009
- Statistical Paradises and Paradoxes in Big Data, Xiao-Li Meng, 2018
- A New Paradigm for Polling, Michael A. Bailey, 2023
- The “Law of Large Populations” Does Not Herald a Paradigm Shift in Survey Sampling, Roderick J. Little, 2023
- Surveys of Consumers Technical Report: Technical Documentation for the 2024 Methodological Transition to Web Surveys, 2024
- The effect of online interviews on the University of Michigan Survey of Consumer Sentiment, Ryan Cummings and Ernie Tedeschi, 2024
