Multiple Frame Surveys
Multiple Frame Surveys was written by H.O. Hartley in 1962. It was part of the proceedings of the American Statistical Association Social Statistics Section.
The paper shows how to compensate for overlap between sampling frames by adjusting weights to reflect each case's probability of selection.
Given two sampling frames (G and H), the population splits into three disjoint domains: a (cases only in G), b (cases only in H), and ab (cases in both, i.e., the overlap). (Note: the author uses A and B to refer to sampling frames, which leads to awkward expressions like NA/na. I am substituting G and H here.) The population total Y of yi is then equal to Ya + Yab + Yb. The author attaches an attribute ui to every case, defined in frame G by:
ui = yi if i is in domain a, and ui = p·yi if i is in domain ab
and in frame H by:
ui = yi if i is in domain b, and ui = q·yi if i is in domain ab
where p + q = 1.
The important consequence is that Y = Ya + pYab + qYab + Yb.
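The identity can be checked numerically. The following is a minimal sketch on a made-up toy population; the data, the value of p, and the helper name u_in_frame are illustrative assumptions, not taken from the paper.

```python
# Toy population (hypothetical values): each unit is (y, domain).
# "a" = only in frame G, "b" = only in frame H, "ab" = in both frames.
population = [
    (10.0, "a"), (12.0, "a"),
    (20.0, "ab"), (22.0, "ab"), (24.0, "ab"),
    (30.0, "b"),
]

def u_in_frame(y, domain, own_domain, weight):
    """Return u_i for a unit as seen from one frame.

    own_domain is the frame's exclusive domain ("a" for G, "b" for H);
    weight is p for frame G and q for frame H.
    """
    if domain == own_domain:
        return y
    if domain == "ab":
        return weight * y
    raise ValueError("unit does not belong to this frame")

p = 0.25          # arbitrary illustrative choice
q = 1.0 - p       # the split must satisfy p + q = 1

frame_g = [(y, d) for y, d in population if d in ("a", "ab")]
frame_h = [(y, d) for y, d in population if d in ("b", "ab")]

total_u_g = sum(u_in_frame(y, d, "a", p) for y, d in frame_g)  # Ya + p*Yab
total_u_h = sum(u_in_frame(y, d, "b", q) for y, d in frame_h)  # Yb + q*Yab
total_y = sum(y for y, _ in population)                        # Y

print(total_u_g + total_u_h, total_y)
```

Because each overlap case contributes p·yi through frame G and q·yi through frame H, the two frame totals of ui add up to Y for any p between 0 and 1.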
'Case 2': known domain sizes
The author first considers the case where domain sizes (Na, Nb, and Nab) are known. The appropriate estimator for Y is:
Ŷ = Nay̅a + Nab(py̅abG + qy̅abH) + Nby̅b
where y̅abG is y̅ab computed from sampling frame G, and so on.
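As a rough illustration, the estimator for known domain sizes could be computed as follows. This is a sketch only: the function name, the (y, domain) sample layout, and the numbers are assumptions made for the example, and each sample is assumed to contain at least one case from each of its domains.

```python
from statistics import mean

def estimate_total_known_sizes(sample_g, sample_h, N_a, N_ab, N_b, p):
    """Estimate Y when the domain sizes N_a, N_ab, N_b are known.

    sample_g, sample_h: lists of (y, domain) pairs drawn from frames
    G and H, with domain in {"a", "ab"} for G and {"b", "ab"} for H.
    """
    q = 1.0 - p
    ybar_a = mean([y for y, d in sample_g if d == "a"])
    ybar_ab_g = mean([y for y, d in sample_g if d == "ab"])  # overlap mean from G
    ybar_b = mean([y for y, d in sample_h if d == "b"])
    ybar_ab_h = mean([y for y, d in sample_h if d == "ab"])  # overlap mean from H
    return N_a * ybar_a + N_ab * (p * ybar_ab_g + q * ybar_ab_h) + N_b * ybar_b

# Hypothetical samples and domain sizes.
sample_g = [(10.0, "a"), (14.0, "a"), (20.0, "ab"), (24.0, "ab")]
sample_h = [(30.0, "b"), (34.0, "b"), (26.0, "ab"), (22.0, "ab")]

estimate = estimate_total_known_sizes(
    sample_g, sample_h, N_a=100, N_ab=50, N_b=200, p=0.5
)
print(estimate)
```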
The variance of Ŷ is expressed in terms of the population variances (σa², σb², and σab²), the proportions of overlap in each sampling frame (α = Nab/NG and β = Nab/NH), and the frame and sample sizes,
where NG is the size of sampling frame G, nG is the number of cases selected from sampling frame G, and so on.
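For orientation, an approximate variance of this form can be derived from the definitions above. The expression below is a reconstruction, not a quotation from the paper. It assumes simple random sampling within each frame, ignores finite population corrections, and treats the realized domain sample sizes as proportional to the domain sizes, with α = Nab/NG and β = Nab/NH.

```latex
\operatorname{Var}(\hat{Y}) \approx
  \frac{N_G^2}{n_G}\left[\sigma_a^2\,(1-\alpha) + p^2\,\sigma_{ab}^2\,\alpha\right]
  + \frac{N_H^2}{n_H}\left[\sigma_b^2\,(1-\beta) + q^2\,\sigma_{ab}^2\,\beta\right]
```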
The author then discusses the sampling fractions that minimize the variance of this estimator subject to a cost constraint.
'Case 3': unknown domain sizes
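The author next explores the case where domain sizes are unknown. The appropriate estimator for Y expands each frame's sample total of ui by the inverse of its sampling fraction:
Ŷ = (NG/nG)(ya + p·yabG) + (NH/nH)(yb + q·yabH)
where ya is the sample total of yi for i in domain a, yabG is the sample total for overlap cases drawn from frame G, and so on.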
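A minimal sketch of such an estimator follows. It assumes that each frame's sample total of ui is scaled by frame size over sample size, which is what the definition of ui implies. The function name, the data layout, and the numbers are hypothetical.

```python
def estimate_total_unknown_sizes(sample_g, sample_h, N_G, N_H, p):
    """Estimate Y when only the frame sizes N_G and N_H are known.

    Assumed form (expansion estimator of the u-attribute totals):
        (N_G / n_G) * (y_a + p * y_abG) + (N_H / n_H) * (y_b + q * y_abH)
    sample_g, sample_h: lists of (y, domain) pairs from frames G and H.
    """
    q = 1.0 - p
    n_G, n_H = len(sample_g), len(sample_h)
    y_a = sum(y for y, d in sample_g if d == "a")      # sample total, domain a
    y_ab_g = sum(y for y, d in sample_g if d == "ab")  # overlap total from G
    y_b = sum(y for y, d in sample_h if d == "b")      # sample total, domain b
    y_ab_h = sum(y for y, d in sample_h if d == "ab")  # overlap total from H
    return (N_G / n_G) * (y_a + p * y_ab_g) + (N_H / n_H) * (y_b + q * y_ab_h)

# Hypothetical samples and frame sizes.
sample_g = [(10.0, "a"), (14.0, "a"), (20.0, "ab"), (24.0, "ab")]
sample_h = [(30.0, "b"), (34.0, "b"), (26.0, "ab"), (22.0, "ab")]

estimate = estimate_total_unknown_sizes(sample_g, sample_h, N_G=150, N_H=250, p=0.5)
print(estimate)
```

Unlike the known-sizes case, no domain size appears in the call: only the frame sizes and the domain membership of each sampled case are needed.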
The variance of Ŷ is in terms of population variances, proportions of overlap in either sampling frame, and the difference between mean responses (Y̅) across domains.
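One way to see where the difference in means comes from is to write out the variance of ui within each frame as the variance of a two-component mixture. The expression below is a reconstruction under the same assumptions as before (simple random sampling within each frame, finite population corrections ignored, α = Nab/NG and β = Nab/NH); it is not quoted from the paper.

```latex
\operatorname{Var}(\hat{Y}) \approx
  \frac{N_G^2}{n_G}\left[\sigma_a^2\,(1-\alpha) + p^2\,\sigma_{ab}^2\,\alpha
    + \alpha(1-\alpha)\left(\bar{Y}_a - p\,\bar{Y}_{ab}\right)^2\right]
  + \frac{N_H^2}{n_H}\left[\sigma_b^2\,(1-\beta) + q^2\,\sigma_{ab}^2\,\beta
    + \beta(1-\beta)\left(\bar{Y}_b - q\,\bar{Y}_{ab}\right)^2\right]
```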
The author then discusses the sampling fractions that minimize the variance of this estimator subject to a cost constraint.
