Multiple Frame Surveys

Multiple Frame Surveys was written by H.O. Hartley in 1962. It was part of the proceedings of the American Statistical Association Social Statistics Section.

The paper addresses how to compensate for overlap between sampling frames by adjusting weights to reflect each case's probability of selection.

Given two sampling frames (G and H), consider the discrete domains a, b, and ab (i.e., the overlap). (Note: the author uses A and B to refer to sampling frames, which leads to expressions like N_A/n_a. Not great. I am substituting G and H here.) Clearly the population total Y of y_i is equal to Y_a + Y_ab + Y_b. The author introduces the attribute u_i for all cases, defined in frame G by:

u_i = y_i if i is in domain a; u_i = p y_i if i is in domain ab

and in frame H by:

u_i = y_i if i is in domain b; u_i = q y_i if i is in domain ab

where p + q = 1.

The important consequence is that Y = Y_a + pY_ab + qY_ab + Y_b.
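As a quick check of that identity, here is a minimal sketch on a made-up population. The values, the domain labels, and the choice p = 0.3 are all hypothetical; the point is only that summing u_i over both frames recovers Y even though the overlap cases are counted twice.

```python
# Hypothetical toy population: (domain, y) pairs. Not data from the paper.
population = [
    ("a", 4.0), ("a", 6.0),
    ("ab", 10.0), ("ab", 2.0), ("ab", 8.0),
    ("b", 5.0),
]

p = 0.3          # arbitrary illustrative split of the overlap
q = 1.0 - p      # p + q = 1


def u_in_frame_G(domain, y):
    """u_i as defined in frame G: y for domain a, p*y for the overlap."""
    return y if domain == "a" else p * y


def u_in_frame_H(domain, y):
    """u_i as defined in frame H: y for domain b, q*y for the overlap."""
    return y if domain == "b" else q * y


# Frame G covers domains a and ab; frame H covers domains b and ab.
frame_G = [(d, y) for d, y in population if d in ("a", "ab")]
frame_H = [(d, y) for d, y in population if d in ("b", "ab")]

total_Y = sum(y for _, y in population)
total_u = (sum(u_in_frame_G(d, y) for d, y in frame_G)
           + sum(u_in_frame_H(d, y) for d, y in frame_H))
```

Any p between 0 and 1 gives the same total; the choice of p only matters later, for the variance.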

'Case 2': known domain sizes

The author first considers the case where domain sizes (N_a, N_b, and N_ab) are known. The appropriate estimator for Y is:

Ŷ = N_a y̅_a + N_ab (p y̅_ab^G + q y̅_ab^H) + N_b y̅_b

where y̅_ab^G is y̅_ab computed from sampling frame G, and so on.
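The estimator can be sketched as a small function. The function name, argument names, and example numbers below are my own illustrative assumptions; each sample is assumed to be a simple list of observed y values already sorted into its domain.

```python
from statistics import mean


def case2_estimate(N_a, N_ab, N_b, y_a, y_ab_G, y_ab_H, y_b, p):
    """Estimate the total Y when the domain sizes N_a, N_ab, N_b are known.

    y_a, y_b   : sampled y values falling in domains a and b
    y_ab_G/_H  : sampled y values in the overlap, drawn via frame G / frame H
    p          : weight on the frame-G overlap mean (q = 1 - p on frame H)
    """
    q = 1.0 - p
    overlap_mean = p * mean(y_ab_G) + q * mean(y_ab_H)
    return N_a * mean(y_a) + N_ab * overlap_mean + N_b * mean(y_b)


# Hypothetical inputs, chosen so the arithmetic is easy to follow.
estimate = case2_estimate(
    N_a=100, N_ab=50, N_b=200,
    y_a=[4, 6], y_ab_G=[10, 2], y_ab_H=[8, 4], y_b=[5, 7],
    p=0.5,
)
```

Here the domain means are 5, 6 (from either frame), and 6, so the estimate is 100(5) + 50(6) + 200(6).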

The variance of Ŷ is expressed in terms of population variances (σ_a², σ_b², and σ_ab²) as well as the proportion of each sampling frame that falls in the overlap (α = N_ab/N_G and β = N_ab/N_H).

where N_G is the size of sampling frame G, n_G is the number of cases selected from sampling frame G, and so on.
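To make the ingredients concrete, the sketch below computes an approximate variance of this form. It is my own derivation under stated assumptions (simple random sampling within each frame, post-stratification to the known domain sizes, expected domain sample sizes n_G(1 - α) and n_G α, and no finite population correction), so it may differ in detail from the expression in the paper. The function and argument names are hypothetical.

```python
def case2_variance(N_G, N_H, n_G, n_H, N_ab, var_a, var_ab, var_b, p):
    """Approximate Var(Y-hat) with known domain sizes.

    Assumes SRS in each frame, post-stratified domain means, and
    ignores finite population corrections (assumptions, not the paper's text).
    """
    q = 1.0 - p
    alpha = N_ab / N_G   # share of frame G in the overlap
    beta = N_ab / N_H    # share of frame H in the overlap
    term_G = (N_G ** 2 / n_G) * (var_a * (1 - alpha) + p ** 2 * var_ab * alpha)
    term_H = (N_H ** 2 / n_H) * (var_b * (1 - beta) + q ** 2 * var_ab * beta)
    return term_G + term_H


# Hypothetical inputs.
variance = case2_variance(
    N_G=150, N_H=250, n_G=30, n_H=50, N_ab=50,
    var_a=4.0, var_ab=9.0, var_b=1.0, p=0.5,
)
```

Each term comes from Var(N_d y̅_d) ≈ N_d² σ_d² / n_d with N_d and n_d rewritten through α or β, which is why only one factor of (1 - α) or α survives.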

The author then discusses the sampling fractions that minimize the variance of this estimator for a given cost.

'Case 3': unknown domain sizes

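The author next explores the case where domain sizes are unknown. The appropriate estimator for Y is:

Ŷ = (N_G/n_G)(y_a + p y_ab^G) + (N_H/n_H)(y_b + q y_ab^H)

where y_a is the sample total of y_i for i in domain a, y_ab^G is the sample total for domain ab computed from sampling frame G, and so on.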
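A minimal sketch of this case, assuming the estimator takes the usual expansion form: each frame's sample total of u_i is scaled up by that frame's N/n. The function name, the (domain, y) sample layout, and the numbers are hypothetical.

```python
def case3_estimate(N_G, N_H, sample_G, sample_H, p):
    """Estimate the total Y when domain sizes are unknown.

    sample_G / sample_H are lists of (domain, y) pairs drawn from each frame;
    the domain of a case is only learned after it is sampled.
    """
    q = 1.0 - p
    n_G, n_H = len(sample_G), len(sample_H)
    # Sample totals by domain within each frame.
    y_a = sum(y for d, y in sample_G if d == "a")
    y_ab_G = sum(y for d, y in sample_G if d == "ab")
    y_b = sum(y for d, y in sample_H if d == "b")
    y_ab_H = sum(y for d, y in sample_H if d == "ab")
    return ((N_G / n_G) * (y_a + p * y_ab_G)
            + (N_H / n_H) * (y_b + q * y_ab_H))


# Hypothetical samples.
estimate_3 = case3_estimate(
    N_G=150, N_H=250,
    sample_G=[("a", 4), ("a", 6), ("ab", 10)],
    sample_H=[("b", 5), ("ab", 8), ("b", 7), ("b", 3), ("ab", 2)],
    p=0.5,
)
```

Unlike the known-sizes case, nothing here requires N_a, N_ab, or N_b; only the frame sizes N_G and N_H enter.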

The variance of Ŷ is in terms of population variances, proportions of overlap in either sampling frame, and the difference between mean responses (Y̅) across domains.
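The extra term involving mean differences can be seen from a mixture-variance decomposition of u_i within a frame. The sketch below is my own illustration, not the paper's expression: it assumes simple random sampling, no finite population correction, and population variances with divisor N. All names and values are hypothetical.

```python
from statistics import mean, pvariance


def u_variance(values_own, values_overlap, weight):
    """Population variance of u_i within one frame.

    values_own     : y values in the frame-only domain (a or b)
    values_overlap : y values in the overlap domain ab
    weight         : p for frame G, q for frame H
    """
    n_own, n_ov = len(values_own), len(values_overlap)
    share = n_ov / (n_own + n_ov)  # alpha for frame G, beta for frame H
    within = ((1 - share) * pvariance(values_own)
              + share * weight ** 2 * pvariance(values_overlap))
    between = (share * (1 - share)
               * (mean(values_own) - weight * mean(values_overlap)) ** 2)
    return within + between


def case3_variance(values_a, values_ab, values_b, n_G, n_H, p):
    """Approximate Var(Y-hat) for unknown domain sizes (SRS, no fpc)."""
    q = 1.0 - p
    N_G = len(values_a) + len(values_ab)
    N_H = len(values_b) + len(values_ab)
    return ((N_G ** 2 / n_G) * u_variance(values_a, values_ab, p)
            + (N_H ** 2 / n_H) * u_variance(values_b, values_ab, q))


# Check the decomposition against a direct computation on a toy frame G.
toy_a, toy_ab, toy_p = [4, 6], [10, 2, 8, 4], 0.5
direct = pvariance(toy_a + [toy_p * y for y in toy_ab])
decomposed = u_variance(toy_a, toy_ab, toy_p)
```

The between-domain piece vanishes only when the frame-only mean equals the weighted overlap mean, which is why this case carries a penalty that the known-sizes case does not.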

The author then discusses the sampling fractions that minimize the variance of this estimator for a given cost.
