Struggles with Survey Weighting and Regression Modeling

Struggles with Survey Weighting and Regression Modeling (DOI: http://dx.doi.org/10.1214/088342306000000691) was written by Andrew Gelman in 2007. It was published in Statistical Science (vol. 22, no. 2).

The author demonstrates that poststratification and survey weighting give the same estimate when every respondent in a cell receives a weight proportional to that cell's population count divided by its sample count.

ps.svg

wt.svg
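The equivalence can be checked numerically. What follows is a minimal sketch, not code from the paper: the cell names, population counts, and responses are all made-up toy values, and the weights are assumed to be the cell population count divided by the cell sample count.

```python
# Hypothetical toy data: two poststratification cells.
# Population counts N_j are assumed known (e.g. from a census).
population_counts = {"young": 600, "old": 400}

# Each respondent is a (cell, outcome) pair; the values are invented.
sample = [
    ("young", 1.0), ("young", 3.0),
    ("old", 4.0), ("old", 6.0), ("old", 8.0), ("old", 6.0),
]


def cell_sample_sizes(sample):
    """Count respondents n_j in each cell."""
    counts = {}
    for cell, _ in sample:
        counts[cell] = counts.get(cell, 0) + 1
    return counts


def poststratified_mean(sample, population_counts):
    """Average the cell means, weighting each by its population count N_j."""
    counts = cell_sample_sizes(sample)
    totals = {}
    for cell, y in sample:
        totals[cell] = totals.get(cell, 0.0) + y
    N = sum(population_counts.values())
    return sum(
        population_counts[c] * totals[c] / counts[c] for c in population_counts
    ) / N


def weighted_mean(sample, population_counts):
    """Average the respondents, weighting each by w_i = N_j / n_j."""
    counts = cell_sample_sizes(sample)
    weights = [population_counts[cell] / counts[cell] for cell, _ in sample]
    weighted_total = sum(w * y for w, (_, y) in zip(weights, sample))
    return weighted_total / sum(weights)


ps_estimate = poststratified_mean(sample, population_counts)
wt_estimate = weighted_mean(sample, population_counts)
print(ps_estimate, wt_estimate)
```

Both functions fail with a division by zero or a missing key as soon as a cell has no respondents, which is the practical limit of this form of weighting.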

In contrast, the literature on weighted analysis is confused. In some circumstances, it is recommended not to use survey weights when modeling survey data. In addition, weighted standard errors are not trivial to estimate.

Survey weighting also runs into problems as the number of poststratification cells grows, potentially to the point where some cells have no respondents.

The author starts from poststratification and tries to reverse-engineer weighting methods.

First, a simple OLS fit of the outcome on the poststratifying variables:

regress1.svg

regress2.svg

regress3.svg
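The reverse-engineering step can be sketched with NumPy. This is an illustration under assumptions, not the paper's code: the cells, population counts, sample sizes, and simulated outcomes are invented, and the model is a main-effects regression on two binary poststratifying variables. The point is that the regression-then-poststratify estimate is linear in y, so it can be rewritten as a weighted average with implied unit weights.

```python
import numpy as np

# Hypothetical setup: 2 age groups x 2 education groups = 4 cells.
# Columns of X_pop: intercept, age indicator, education indicator.
X_pop = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 1],
], dtype=float)
N_pop = np.array([300.0, 200.0, 250.0, 250.0])  # assumed population counts
n_cell = np.array([3, 2, 4, 1])                 # assumed respondents per cell

# Build the respondent-level design matrix and simulate outcomes.
cell_of = np.repeat(np.arange(4), n_cell)
X = X_pop[cell_of]
n = len(cell_of)
rng = np.random.default_rng(0)
y = X @ np.array([1.0, 0.5, -0.3]) + rng.normal(size=n)

# Step 1: OLS fit, then poststratify the fitted cell values.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
theta_reg = N_pop @ (X_pop @ beta_hat) / N_pop.sum()

# Step 2: the same estimate as a weighted average of y.
# w = (n / N) * N_pop' X_pop (X'X)^{-1} X'
w = (n / N_pop.sum()) * (N_pop @ X_pop) @ np.linalg.solve(X.T @ X, X.T)
theta_wt = w @ y / n

print(theta_reg, theta_wt, w.sum())
```

Because the design includes an intercept, the implied weights sum to the sample size, so dividing by n or by the sum of the weights gives the same estimate.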

Next, expand this to a hierarchical model. The author specifically uses batches of indicators for age (binned into 4 levels), education (4 levels), and their interactions.

hier1.svg

hier2.svg
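The effect of partial pooling on the implied weights can be sketched the same way. This is a simplified illustration, not the paper's model: it uses a single batch of exchangeable cell effects with a flat prior on the intercept, treats the standard deviations sigma_y and sigma_beta as known, and reuses invented population counts and sample sizes.

```python
import numpy as np

N_pop = np.array([300.0, 200.0, 250.0, 250.0])  # assumed population counts
n_cell = np.array([3, 2, 4, 1])                 # assumed respondents per cell
J = len(N_pop)
cell_of = np.repeat(np.arange(J), n_cell)
n = len(cell_of)

# Design: an intercept plus one indicator per cell.
X_pop = np.hstack([np.ones((J, 1)), np.eye(J)])
X = X_pop[cell_of]


def unit_weights(sigma_y, sigma_beta):
    """Implied unit weights, conditional on the two standard deviations.

    Prior precision is 0 for the intercept (flat prior) and
    1 / sigma_beta**2 for each cell effect.
    """
    prior_precision = np.diag([0.0] + [1.0 / sigma_beta**2] * J)
    A = X.T @ X / sigma_y**2 + prior_precision
    return (n / N_pop.sum()) * (N_pop @ X_pop) @ np.linalg.solve(
        A, X.T / sigma_y**2
    )


w_no_pooling = unit_weights(sigma_y=1.0, sigma_beta=1e3)
w_partial = unit_weights(sigma_y=1.0, sigma_beta=1.0)
w_full_pooling = unit_weights(sigma_y=1.0, sigma_beta=1e-3)

# Classical poststratification weights, scaled to sum to n.
w_classical = (n / N_pop.sum()) * (N_pop / n_cell)[cell_of]

print(np.round(w_no_pooling, 3))
print(np.round(w_partial, 3))
print(np.round(w_full_pooling, 3))
```

With a very large sigma_beta the weights approach the classical poststratification weights, and with a very small sigma_beta they approach 1 for every respondent, which is the unweighted sample mean. Intermediate values shrink the weights toward 1.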

The author further considers special cases of the hierarchical model.

The posterior variance of y can be calculated as any of:

hier3.svg

Reading notes

Note also that the author repeats their argument from here. This is, as before, applicable because of the constraint on weight computation.


CategoryRicottone

StrugglesWithSurveyWeightingAndRegressionModeling (last edited 2025-04-29 19:36:39 by DominicRicottone)