A Comparison of Methods of Weighting Adjustment for Nonresponse

A Comparison of Methods of Weighting Adjustment for Nonresponse was written by Graham Kalton and Dalisay S. Maligalig in 1991. It was published in the proceedings of the 1991 Annual Research Conference of the Census Bureau. The scan can be found online at https://books.google.com/books?lr=&id=Cy22AAAAIAAJ&pg=PA409.

The estimator for population proportion Y̅ is where π_i is the probability of element i being sampled.

Given element nonresponse, the estimator becomes where r is the number of respondents. The bias is approximated by where ϕ_i is the probability of element i responding if sampled. This bias relates to the covariance of Y_i and ϕ_i; if covariance is 0, then bias is 0.

If ϕ_i are known, use the corrected estimator . But realistically we can only estimate those probabilities, and then use . Three methods follow:
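For reference, the estimators described above can be written in one common form. This is a sketch that assumes inverse-probability-weighted ratio means and the usual first-order bias approximation. The symbols s (the sample), N (the population size), ϕ̄ (the mean response probability) and ϕ̂_i (an estimated response probability) are introduced here for illustration and may differ from the paper's own notation.

```latex
\begin{align*}
\bar{y} &= \frac{\sum_{i \in s} y_i / \pi_i}{\sum_{i \in s} 1 / \pi_i}
  && \text{(full response)} \\
\bar{y}_r &= \frac{\sum_{i=1}^{r} y_i / \pi_i}{\sum_{i=1}^{r} 1 / \pi_i}
  && \text{(respondents only)} \\
\operatorname{Bias}(\bar{y}_r) &\approx \frac{1}{N \bar{\phi}} \sum_{i=1}^{N} (Y_i - \bar{Y})(\phi_i - \bar{\phi})
  && \text{(zero when } Y_i \text{ and } \phi_i \text{ are uncorrelated)} \\
\bar{y}_{\phi} &= \frac{\sum_{i=1}^{r} y_i / (\pi_i \phi_i)}{\sum_{i=1}^{r} 1 / (\pi_i \phi_i)}
  && \text{(known } \phi_i \text{; use } \hat{\phi}_i \text{ when estimated)}
\end{align*}
```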

The simplest model is to assume a constant probability of responding if sampled, i.e. ϕ_i = ϕ ∀ i.
The recommendation is to model ϕ_i using logistic regression, i.e. log(ϕ_i/(1-ϕ_i)) = x_iβ given some auxiliary information x_i.
The remainder of the paper addresses three alternative methods.
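The effect of weighting by the inverse response probability can be checked numerically. The sketch below is illustrative only: the population values, the covariate and the logistic coefficients B0 and B1 are all hypothetical, selection probabilities are assumed equal, and the respondent means are evaluated at their large-sample limit rather than simulated.

```python
import math

# Hypothetical population: one auxiliary covariate x_i and an outcome y_i.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [10.0, 12.0, 15.0, 19.0, 24.0]

# Assumed logistic response model: log(phi / (1 - phi)) = B0 + B1 * x.
B0, B1 = -1.0, 0.5  # illustrative coefficients, not estimates from the paper


def response_probability(xi):
    """Response probability implied by the assumed logistic model."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * xi)))


phi = [response_probability(xi) for xi in x]
n = len(y)
y_bar = sum(y) / n
phi_bar = sum(phi) / n

# Large-sample limit of the unadjusted respondent mean:
# element i contributes in proportion to its response probability.
unadjusted = sum(p * yi for p, yi in zip(phi, y)) / sum(phi)

# Covariance form of the bias: sum (y_i - y_bar)(phi_i - phi_bar) / (n * phi_bar).
cov_bias = sum((yi - y_bar) * (p - phi_bar) for yi, p in zip(y, phi)) / (n * phi_bar)

# Weighting each respondent by 1 / phi_i cancels the response probabilities.
adjusted = sum(p * (yi / p) for p, yi in zip(phi, y)) / sum(p * (1.0 / p) for p in phi)

print(f"population mean:        {y_bar:.4f}")
print(f"unadjusted limit:       {unadjusted:.4f}")
print(f"bias (covariance form): {cov_bias:.4f}")
print(f"adjusted limit:         {adjusted:.4f}")
```

In practice ϕ_i is unknown and would be replaced by a fitted value from the logistic regression, so the cancellation holds only to the extent that the model is right.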

Population-based adjustment cell weighting

The first method the authors introduce is a population-based adjustment cell weighting, partitioning the population into cells indexed by h. The estimator is where W_h = N_h/N, r_h is the number of respondents in cell h, and the cell mean is given by . Alternatively, where w_hi is an element's weight. The bias of this estimator is .

Similar to before, this estimator's bias relates to the covariance of Y_hi and ϕ_hi. Importantly, though, the covariance that matters is the one within each cell. Therefore, if the probability of responding is constant within a cell, i.e. ϕ_hi = ϕ_h, there is no bias: E[y̅_p] = Y̅ and MSE(y̅_p) = Var(y̅_p) = ΣW²_hS²_h/r_h, where S²_h is the element variance within cell h.
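A minimal sketch of population-based adjustment cell weighting follows. It assumes the estimator takes the form Σ W_h × (respondent mean in cell h), with equal selection probabilities inside each cell so that the cell mean is a simple average. The cell labels and data are hypothetical, and the variance function plugs the sample variance in for S²_h.

```python
from statistics import mean, variance


def cell_weighted_mean(cell_shares, respondents_by_cell):
    """Sum over cells of W_h times the respondent mean in cell h."""
    total = 0.0
    for cell, share in cell_shares.items():
        values = respondents_by_cell[cell]
        if not values:
            raise ValueError(f"cell {cell!r} has no respondents; collapse it first")
        total += share * mean(values)
    return total


def cell_weighted_variance(cell_shares, respondents_by_cell):
    """Plug-in version of sum W_h^2 * S_h^2 / r_h (needs r_h >= 2 in every cell)."""
    return sum(
        share ** 2 * variance(respondents_by_cell[cell]) / len(respondents_by_cell[cell])
        for cell, share in cell_shares.items()
    )


# Hypothetical known population shares W_h = N_h / N and respondent values.
cell_shares = {"young": 0.4, "old": 0.6}
respondents = {
    "young": [1, 0, 1, 1],
    "old": [0, 0, 1, 0, 0, 0, 1, 0],
}

all_values = [v for values in respondents.values() for v in values]
unweighted = mean(all_values)
weighted = cell_weighted_mean(cell_shares, respondents)
var_hat = cell_weighted_variance(cell_shares, respondents)

print(f"unweighted respondent mean: {unweighted:.4f}")
print(f"cell-weighted estimate:     {weighted:.4f}")
print(f"plug-in variance:           {var_hat:.6f}")
```

The "old" cell is under-represented among respondents relative to its population share here, so the cell-weighted estimate moves toward that cell's mean.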

Consider two schemes:

scheme 1 uses H cells
scheme 2 collapses cells 1 and 2 together, leaving H-1 cells

The collapse leads to the second scheme having lower variance. At the same time, the bias becomes . Making assumptions about element variance, MSE(y̅_p1) > MSE(y̅_p2) if .
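Under the constant-within-cell assumption, the bias introduced by collapsing can be worked out directly. The following is a sketch, not necessarily the paper's own expression: it assumes equal selection probabilities, evaluates the respondent mean of the merged cell at its large-sample limit, and uses the subscript 12 (introduced here) for the merged cell.

```latex
\begin{align*}
\operatorname{E}[\bar{y}_{r,12}] &\approx
  \frac{N_1 \phi_1 \bar{Y}_1 + N_2 \phi_2 \bar{Y}_2}{N_1 \phi_1 + N_2 \phi_2},
\qquad
\bar{Y}_{12} = \frac{N_1 \bar{Y}_1 + N_2 \bar{Y}_2}{N_1 + N_2} \\
\operatorname{Bias}(\bar{y}_{p2}) &\approx
  (W_1 + W_2)\left(\operatorname{E}[\bar{y}_{r,12}] - \bar{Y}_{12}\right)
  = \frac{W_1 W_2 (\phi_1 - \phi_2)(\bar{Y}_1 - \bar{Y}_2)}{W_1 \phi_1 + W_2 \phi_2}
\end{align*}
```

On this reading, collapsing adds no bias when the two cells share either a response probability or a mean, which is the trade-off against the variance reduction.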

Sample-based adjustment cell weighting

The authors also introduce a sample-based method. Importantly, making parallel assumptions, they arrive at the same condition under which collapsing yields a lower MSE.

Raking ratio weighting

Finally, the authors introduce a raking method with two dimensions, one indexed by h and the other indexed by k. The estimator is where w̃_hk estimates W_hk in the joint distribution through iterative fitting. More formally, E[w̃_hk] = W_hk. More concretely, at convergence, the weights reflect the marginal distributions expressed as W_k = Σ_hW_hk = Σ_hw̃_hk and W_h = Σ_kW_hk = Σ_kw̃_hk.
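The iterative fitting step can be sketched as plain iterative proportional fitting on a two-way table. This is a generic illustration rather than the paper's procedure: the function name rake, the tolerance, the starting table of respondent proportions, the target margins and the cell means are all hypothetical, and every row and column is assumed to contain at least one respondent.

```python
def rake(table, row_targets, col_targets, tol=1e-10, max_iter=1000):
    """Scale a two-way table until its row and column sums match the targets."""
    fitted = [row[:] for row in table]
    for _ in range(max_iter):
        # Scale each row h so that it sums to W_h.
        for h, target in enumerate(row_targets):
            row_sum = sum(fitted[h])
            fitted[h] = [value * target / row_sum for value in fitted[h]]
        # Scale each column k so that it sums to W_k.
        for k, target in enumerate(col_targets):
            col_sum = sum(row[k] for row in fitted)
            for row in fitted:
                row[k] *= target / col_sum
        # Columns now match exactly; stop once the rows also match.
        worst = max(abs(sum(fitted[h]) - t) for h, t in enumerate(row_targets))
        if worst < tol:
            return fitted
    raise RuntimeError("raking did not converge")


# Hypothetical respondent proportions by cell, and known population margins.
respondent_table = [[0.30, 0.20], [0.10, 0.40]]
row_targets = [0.4, 0.6]  # W_h
col_targets = [0.5, 0.5]  # W_k

raked = rake(respondent_table, row_targets, col_targets)

# Raked estimate: sum over cells of raked weight times respondent cell mean.
cell_means = [[0.2, 0.4], [0.5, 0.7]]  # hypothetical respondent means
raked_mean = sum(
    w * m
    for w_row, m_row in zip(raked, cell_means)
    for w, m in zip(w_row, m_row)
)

for row in raked:
    print([round(value, 4) for value in row])
print(f"raked estimate: {raked_mean:.4f}")
```

Only the margins are forced to agree with the population. The individual raked cell weights generally differ from the true W_hk, which is why the bias discussion that follows matters.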

The authors make a parallel assumption to the above: that the probability of responding is constant within a cell, i.e. ϕ_hki = ϕ_hk. At the same time, they loosen the assumption that w̃_hk converges to W_hk.

If E[w̃_hk] = W̃_hk, then the bias of this estimator is given by Bias(y̅_p) = ΣΣ(W̃_hk - W_hk)(Y̅_hk - Y̅_h - Y̅_k + Y̅). Therefore, even when w̃_hk is a biased estimator for W_hk, this can be an unbiased estimator for Y̅ if there is no interaction in Y_hk for the two-way classification.
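The no-interaction result can be checked with a small table. Because the raked weights share their margins with the true W_hk, any additive part of the cell means cancels, so the bias expression reduces to ΣΣ(W̃_hk - W_hk)Y̅_hk. The sketch below uses that reduced form; all numbers are hypothetical and chosen so that the two weight tables have identical row and column sums.

```python
def raking_bias(w_tilde, w_true, cell_means):
    """Sum over cells of (W~_hk - W_hk) * Ybar_hk, valid when the margins agree."""
    return sum(
        (wt - w) * m
        for wt_row, w_row, m_row in zip(w_tilde, w_true, cell_means)
        for wt, w, m in zip(wt_row, w_row, m_row)
    )


# Hypothetical true and raked cell shares with matching margins
# (rows sum to 0.5 and 0.5, columns to 0.4 and 0.6 in both tables).
w_true = [[0.1, 0.4], [0.3, 0.2]]
w_tilde = [[0.2, 0.3], [0.2, 0.3]]

# Additive cell means: Ybar_hk = a_h + b_k with a = (1, 3) and b = (0, 2).
additive_means = [[1.0, 3.0], [3.0, 5.0]]

# Same table with an interaction added to the last cell.
interaction_means = [[1.0, 3.0], [3.0, 9.0]]

bias_additive = raking_bias(w_tilde, w_true, additive_means)
bias_interaction = raking_bias(w_tilde, w_true, interaction_means)

print(f"bias with additive cell means:    {bias_additive:.4f}")
print(f"bias with interacting cell means: {bias_interaction:.4f}")
```

With additive cell means the bias vanishes even though every raked cell weight is off by 0.1; adding an interaction to one cell makes it nonzero.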

If W_hk are known, the authors demonstrate that variance under adjustment cell weighting is lower than or equal to variance under raking ratio weighting. "An argument advanced for the use of raking is that it deals with the problem of small cells. To the extent that it does so, it operates in an indirect manner. When the W_hk distribution is known, it is not clear why raking should be preferred to adjustment cell weighting. With the latter procedure, weights can be trimmed and cells collapsed in a way that is tailor-made for the survey variables under study and for the particular sample configuration encountered. Further research is needed in this area."

Reading notes

The authors do actually discuss how the selection of estimators occurs after observing the response patterns, so they take r̃ as given, i.e. E[y̅_p|r̃] = Y̅. I omit this from my notes for brevity and because r̃ is inconvenient to type.

AComparisonOfMethodsOfWeightingAdjustmentForNonresponse (last edited 2025-09-25 20:17:27 by DominicRicottone)