Differences between revisions 5 and 6

Struggles with Survey Weighting and Regression Modeling

Struggles with Survey Weighting and Regression Modeling (DOI: http://dx.doi.org/10.1214/088342306000000691) was written by Andrew Gelman in 2007. It was published in Statistical Science (vol. 22, no. 2).

The author demonstrates that, under certain conditions, poststratification and survey weighting are the same procedure.

Assume a large population, so that finite-population quantities of interest are the same as superpopulation quantities.
Assume the population size of every stratum is known: J cells with population sizes N_j such that Σ N_j = N.
Restrict weight computation to:
- poststratification, because raking methods are not necessary when all cell sizes are known, as stated in the above assumption
- flat nonresponse adjustments within strata
  - This is really a constraint on the available predictors for response propensity. If only the stratifying variables are used, then each stratum has a single predicted response propensity, and therefore a constant adjustment factor.
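
The point about flat adjustments can be illustrated with a minimal sketch. It assumes a hypothetical sampled frame with a stratum label and a response indicator per unit (both invented for illustration). When stratum is the only predictor, the fitted response propensity is simply the stratum response rate, so the adjustment 1/propensity is constant within each stratum.

```python
import numpy as np

# Hypothetical sampled units: stratum label and whether each unit responded.
stratum = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2])
responded = np.array([1, 0, 1, 1, 1, 0, 0, 1, 0, 1, 1])

def flat_adjustments(stratum, responded):
    """Return one nonresponse adjustment factor per stratum (1 / response rate)."""
    n_strata = stratum.max() + 1
    sampled = np.bincount(stratum, minlength=n_strata)
    respondents = np.bincount(stratum, weights=responded, minlength=n_strata)
    return sampled / respondents  # not defined if a stratum has no respondents

factors = flat_adjustments(stratum, responded)

# Every unit in a stratum receives the same factor.
unit_factor = factors[stratum]
```
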
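The poststratified estimate is: θ^PS = Σ N_jθ_j / Σ N_j, where θ_j is the estimate for cell j (for a population mean, the cell sample mean ȳ_j).

The weighted estimate can be put in terms of unit weights (w_i) or cell weights (W_j = n_jw_i): θ^w = Σ w_iy_i / Σ w_i = Σ W_jȳ_j / Σ W_j. The two estimates coincide when W_j is proportional to N_j.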
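
The equivalence can be checked numerically. This is a minimal sketch, assuming simulated outcomes, three cells with made-up population sizes, and unit weights w_i = N_j/n_j; none of these values come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: J = 3 cells with known population sizes N_j.
N_j = np.array([5000.0, 3000.0, 2000.0])
cell = np.array([0, 0, 0, 0, 1, 1, 2, 2, 2, 2])    # cell of each respondent
y = rng.normal(loc=cell.astype(float), scale=1.0)  # simulated outcomes

n_j = np.bincount(cell, minlength=len(N_j))
ybar_j = np.bincount(cell, weights=y, minlength=len(N_j)) / n_j

# Poststratified estimate: population-weighted mean of the cell means.
theta_ps = np.sum(N_j * ybar_j) / np.sum(N_j)

# Weighted estimate with unit weights w_i = N_j / n_j for unit i in cell j.
w = (N_j / n_j)[cell]
theta_w = np.sum(w * y) / np.sum(w)

# Equivalent cell weights W_j = n_j * w_i, which here equal N_j.
W_j = n_j * (N_j / n_j)
theta_cell = np.sum(W_j * ybar_j) / np.sum(W_j)
```

All three quantities agree up to floating-point error, which is the sense in which the two procedures are the same under the stated restrictions.
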

In contrast, the literature on weighted analysis is confused. In some circumstances, it is recommended not to use survey weights when modeling survey data. In addition, weighted standard errors are not trivial to estimate.

Survey weighting also runs into problems as the number of stratified cells grows, potentially to the point where a cell has no respondents.
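
A rough illustration of how this happens, assuming hypothetical stratifying variables that are fully crossed (the level counts and sample size are invented): the number of cells J is the product of the level counts, so it quickly outgrows a fixed sample, and a weight of the form N_j/n_j is undefined for any cell with n_j = 0.

```python
import numpy as np

def count_empty_cells(cell, n_cells):
    """Number of poststratification cells with no respondents."""
    n_j = np.bincount(cell, minlength=n_cells)
    return int(np.sum(n_j == 0))

rng = np.random.default_rng(1)
n = 200  # fixed number of respondents

# Hypothetical level counts for each added stratifying variable.
levels = [4, 4, 5, 51]
J = 1
for k in levels:
    J *= k                             # cells multiply as variables are crossed
    cell = rng.integers(0, J, size=n)  # respondents spread uniformly over cells
    print(J, count_empty_cells(cell, J))
```
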

The author starts from poststratification and tries to reverse-engineer weighting methods.

First, a simple OLS fitting of the outcome on the stratifying variables:

X is the n by k matrix of predictors
X^pop is the J by k matrix of predictor values for the J poststratification cells
N^pop is the J-long vector of the N_j population sizes
the coefficients (true values β) are estimated as b = (X^TX)^-1X^Ty
the poststratified cell estimates are given as X^popb, or X^pop(X^TX)^-1X^Ty
the poststratified estimate is: θ^PS = (1/N)(N^pop)^TX^popb = (1/N)(N^pop)^TX^pop(X^TX)^-1X^Ty

to fit this into the rough formula of survey weighted estimates:

then the J-long vector of weights, w, must be:
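
A minimal numerical sketch of this reverse-engineering, assuming the weighted form is θ^PS = (1/n) Σ w_iy_i, so that matching terms gives w^T = (n/N)(N^pop)^TX^pop(X^TX)^-1X^T. The data, cell layout, and variable names are invented for illustration. Computed this way there is one weight per respondent, but respondents in the same cell share identical rows of X and therefore identical weights, so the vector collapses to one value per cell.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 2 x 2 poststratification (J = 4 cells), main-effects-only model.
X_pop = np.array([[1.0, 0.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0],
                  [1.0, 1.0, 1.0]])                  # J by k
N_pop = np.array([4000.0, 3000.0, 2000.0, 1000.0])  # J-long
N = N_pop.sum()

cell = np.array([0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3, 3])  # cell of each respondent
X = X_pop[cell]                                        # n by k
n = len(cell)
y = rng.normal(size=n)                                 # simulated outcomes

# OLS coefficients and the poststratified estimate.
b = np.linalg.solve(X.T @ X, X.T @ y)
theta_ps = N_pop @ (X_pop @ b) / N

# Implied unit weights: w^T = (n / N) (N^pop)^T X^pop (X^T X)^-1 X^T.
w = (n / N) * (N_pop @ X_pop @ np.linalg.solve(X.T @ X, X.T))
theta_w = np.sum(w * y) / n

# Collapse to one weight per cell (weights are constant within a cell).
w_cell = np.array([w[cell == j][0] for j in range(len(N_pop))])
```

Because the model includes an intercept, these weights average to 1 over the sample, and the weighted mean reproduces the poststratified estimate.
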

Next, expand this to a hierarchical model. The author specifically uses batches of indicators for age (binned into 4 levels), education (also 4 levels), and their interactions.

a generally noninformative prior is assumed:
- coefficients are distributed β ~ N(0,Σ_β)
- most coefficients are assumed to be independent, such that the precision matrix Σ_β^-1 is fully specified as a diagonal matrix with the estimated σ^-2 for the batched predictors, and 0s for all others
the coefficients are estimated as b = (X^TΣ_y^-1X + Σ_β^-1)^-1X^TΣ_y^-1y
the poststratified estimate is:

the J-long vector of weights w is:
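
The same matching of terms carries over, replacing the OLS projection with the posterior-mean formula: w^T = (n/N)(N^pop)^TX^pop(X^TΣ_y^-1X + Σ_β^-1)^-1X^TΣ_y^-1. The following is a minimal sketch, assuming Σ_y = σ_y^2 I, variance parameters treated as known (in practice they are estimated), and invented data; the function name `implied_weights` is hypothetical.

```python
import numpy as np

def implied_weights(X, X_pop, N_pop, prior_precision, sigma_y=1.0):
    """Unit weights implied by poststratifying a normal hierarchical regression.

    prior_precision is the diagonal of Sigma_beta^-1: 0 for unmodeled
    coefficients, sigma^-2 for coefficients in a batch.
    """
    n = X.shape[0]
    N = N_pop.sum()
    Sy_inv = np.eye(n) / sigma_y**2
    A = X.T @ Sy_inv @ X + np.diag(prior_precision)
    return (n / N) * (N_pop @ X_pop @ np.linalg.solve(A, X.T @ Sy_inv))

# Hypothetical design: intercept plus indicators for J = 4 age cells.
J = 4
X_pop = np.hstack([np.ones((J, 1)), np.eye(J)])  # J by k, with k = 5
N_pop = np.array([4000.0, 3000.0, 2000.0, 1000.0])
cell = np.array([0, 0, 0, 0, 0, 1, 1, 1, 2, 2, 3])
X = X_pop[cell]
y = np.random.default_rng(3).normal(size=len(cell))

# Precision 0 for the intercept, sigma^-2 for the batch of age coefficients.
sigma_age = 0.5
precision = np.array([0.0] + [sigma_age**-2] * J)

w = implied_weights(X, X_pop, N_pop, precision)

# Posterior-mean coefficients (sigma_y = 1) and the poststratified estimate.
b = np.linalg.solve(X.T @ X + np.diag(precision), X.T @ y)
theta_ps = N_pop @ (X_pop @ b) / N_pop.sum()
theta_w = np.sum(w * y) / len(y)
```

In this sketch, a very large prior precision on the age batch pulls every weight toward 1 (complete pooling), while a small one moves the weights toward the full poststratification weights.
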

The author further considers special cases of the hierarchical model.

The posterior variance of y can be calculated as any of:

Reading notes

Note also that the author repeats their argument from here. This is, as before, applicable because of the constraint on weight computation.

CategoryRicottone

StrugglesWithSurveyWeightingAndRegressionModeling (last edited 2025-04-29 19:36:39 by DominicRicottone)

-  ⇤ ← Revision 5 as of 2025-04-28 20:56:13 → 
  Size: 3860
  Editor: DominicRicottone
  Comment: Completed?
+   ← Revision 6 as of 2025-04-29 19:36:39 → ⇥
  Size: 4090
  Editor: DominicRicottone
  Comment: Some final clarifications
 Line 44:
-Next, expand this to a hierarchical model. The author specifically uses batches of age indicators (i.e., [[Statistics/Binning|binned]] into 4 levels), education (i.e. 4 levels), and their interactions.
+Next, expand this to a [[Statistics/BayesianHierarchicalModel|hierarchical model]]. The author specifically uses batches of age indicators (i.e., [[Statistics/Binning|binned]] into 4 levels), education (i.e. 4 levels), and their interactions.
 Line 46:
- * prior precision matrix '''''Σ''',,β,,^-1^'' is a diagonal matrix with 0s for the constant ''β,,0,,'' and any nonhierarchical predictors, and the estimated ''σ^-2^'' for any hierarchical predictors.
+ * a generally noninformative prior is assumed:
   * coefficients are distributed ''β ~ N(0,'''Σ''',,β,,)''
   * most coefficients are assumed to be independent, such that the [[Statistics/CovarianceMatrices#Precision_Matrices|precision matrix]] '''''Σ''',,β,,^-1^'' is fully specified as a diagonal matrix with the estimated ''σ^-2^'' for the batched predictors, and 0s for all others
