Survey Sampling

Survey sampling is the process of selecting prospective respondents for a survey.


Probability Sampling

Probability sampling begins with identification of the frame, the list of cases from which the sample is drawn. A properly specified frame covers the complete true population. An improperly specified frame introduces coverage error, which is a separate problem from sampling error.

A census is a survey of the complete frame. The probability of selection is 1, so the base survey weight is also 1.
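
As a minimal sketch of the relationship between selection probability and base weight, assuming the usual convention that the base weight is the inverse of the probability of selection (the function name and the example numbers are illustrative, not from any particular package):

```python
def base_weight(prob_selection: float) -> float:
    """Return the base weight as the inverse of the selection probability."""
    if not 0 < prob_selection <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / prob_selection


# Census: every case in the frame is selected with certainty.
census_weight = base_weight(1.0)

# Hypothetical sample: 50 cases drawn with equal probability from a frame of 1000.
sample_weight = base_weight(50 / 1000)
```

Under this convention each sampled case in the hypothetical example stands in for about 20 cases in the frame, while each census case stands in only for itself.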

Methods

The baseline for survey sampling is simple random sampling (SRS), in which every case in the frame has the same probability of selection.
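
A simple random sample without replacement can be sketched with the Python standard library. The frame of 100 numbered cases, the sample size, and the seed are all assumptions made for illustration.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct cases from the frame, each with equal probability."""
    rng = random.Random(seed)  # seeded only to make the illustration repeatable
    return rng.sample(frame, n)


frame = list(range(1, 101))  # hypothetical frame of 100 case IDs
sample = simple_random_sample(frame, n=10, seed=42)

# Under SRS without replacement, every case has selection probability n / N.
prob_selection = len(sample) / len(frame)
```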

Probability proportionate to size (PPS) sampling makes the probability of selection increase with the magnitude of some measure of size. For example, in a study of utility customers, the largest consumers of that utility should almost always be contacted.
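
One common way to set PPS selection probabilities is to make them proportional to the size measure and to treat any case whose probability would reach 1 as a certainty selection. The sketch below assumes that approach; the function name and the usage figures are hypothetical, and it computes probabilities only, not the draw itself.

```python
def pps_inclusion_probabilities(sizes, n):
    """Selection probabilities proportional to size, capped at 1.

    Cases whose proportional probability would reach 1 are taken with
    certainty; the rest of the sample is spread over the remaining cases.
    """
    if not 0 < n <= len(sizes) or any(s <= 0 for s in sizes):
        raise ValueError("need 0 < n <= len(sizes) and positive sizes")
    probs = [0.0] * len(sizes)
    remaining = set(range(len(sizes)))
    n_left = n
    while True:
        total = sum(sizes[i] for i in remaining)
        certain = [i for i in remaining if n_left * sizes[i] / total >= 1.0]
        if not certain:
            break
        for i in certain:
            probs[i] = 1.0
            remaining.discard(i)
        n_left -= len(certain)
    for i in remaining:
        probs[i] = n_left * sizes[i] / total
    return probs


# Hypothetical annual usage for six utility customers; sample size of 3.
usage = [5000, 300, 200, 200, 150, 150]
probs = pps_inclusion_probabilities(usage, n=3)
```

In this made-up example the largest customer is selected with certainty, and the remaining two selections are spread over the other five customers in proportion to their usage.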

Systematic sampling selects every kth case from a list, typically starting from a randomly chosen case among the first k.
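
A minimal sketch of systematic selection follows. It assumes the frame size is a multiple of the sample size so that the interval (called k here) is a whole number, and it uses a random start within the first interval; the frame and seed are illustrative.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every k-th case after a random start, where k = len(frame) // n."""
    if not 0 < n <= len(frame):
        raise ValueError("need 0 < n <= len(frame)")
    k = len(frame) // n                       # sampling interval
    start = random.Random(seed).randrange(k)  # random start in the first interval
    return frame[start::k][:n]


frame = list(range(1, 101))  # hypothetical ordered list of 100 cases
sample = systematic_sample(frame, n=10, seed=7)
```

When the frame size is not a multiple of the sample size, this simplified version truncates the selection and never reaches some cases at the end of the list, so a production design would need a different interval rule.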

Stratification

Stratification is the partition of a frame into discrete classes using information that is known for the entire frame. Each stratum is sampled separately, often with differing probabilities of selection according to some allocation method.

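Allocation methods include proportional allocation, in which each stratum's share of the sample matches its share of the frame, and variance-optimizing allocation (e.g., Neyman allocation).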

When a stratum is purposely allocated more sample than would be prescribed by proportional allocation, it is said to be oversampled. One reason to do this (apart from variance optimization) is to ensure that enough responses are collected from a minority group to support t tests.
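
The effect of oversampling on base weights can be sketched as follows. The stratum names, frame counts, and sample sizes are invented for illustration, and the weights assume equal-probability selection within each stratum.

```python
def proportional_allocation(stratum_sizes, n):
    """Split a sample of n across strata in proportion to frame counts."""
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}


def stratum_base_weights(stratum_sizes, allocation):
    """Base weight per stratum: frame count divided by allocated sample."""
    return {name: stratum_sizes[name] / allocation[name] for name in stratum_sizes}


frame_counts = {"majority": 9000, "minority": 1000}  # hypothetical frame
proportional = proportional_allocation(frame_counts, n=500)

# Oversample the minority stratum: 200 cases instead of the proportional share.
oversampled = {"majority": 300, "minority": 200}
weights = stratum_base_weights(frame_counts, oversampled)
```

Oversampled cases carry smaller base weights, which is what keeps weighted estimates for the whole frame from being dominated by the oversampled stratum.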

Stratification qualifies as a complex survey design because standard errors must be estimated with attention to strata. For example, if a stratum is excluded from an estimate, its contribution to the true variance is missing from a conventional estimator. This is especially common with sub-population estimates, which is why Stata supports subpop and over options for many estimation commands; these options can otherwise seem redundant given if expressions. As another example, a stratum may have only one observation (i.e., a singleton stratum). A conventional estimator fails in this case because within-stratum variance cannot be computed from a single observation.
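
To make the role of strata in variance estimation concrete, here is a sketch of the textbook variance estimator for a stratified mean. It assumes simple random sampling without replacement within each stratum; the function name and data are hypothetical, and this is not a description of how Stata implements its estimators.

```python
import statistics


def stratified_mean_variance(strata):
    """Estimate the variance of a stratified sample mean.

    strata maps a stratum name to (N_h, ys): the frame count for the
    stratum and the list of sampled values.
    """
    N = sum(N_h for N_h, _ in strata.values())
    variance = 0.0
    for name, (N_h, ys) in strata.items():
        n_h = len(ys)
        if n_h < 2:
            # Singleton stratum: the sample variance has no degrees of freedom.
            raise ValueError(f"stratum {name!r} has fewer than two observations")
        s2 = statistics.variance(ys)  # within-stratum sample variance
        fpc = 1 - n_h / N_h           # finite population correction
        variance += (N_h / N) ** 2 * fpc * s2 / n_h
    return variance


# Hypothetical data: two strata with four sampled values each.
example = {
    "a": (100, [1.0, 2.0, 3.0, 4.0]),
    "b": (300, [10.0, 12.0, 14.0, 16.0]),
}
v = stratified_mean_variance(example)
```

Every stratum contributes its own term to the sum, so dropping a stratum's observations before estimation silently drops its term, and a stratum with one observation has no computable term at all.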

Ideally, stratification uses information that is known to be true, so that there is no reason to 're-classify' cases after sampling. Moving cases between strata in this way distorts variance estimates.

Multi-stage

Similar to stratification, multi-stage sampling partitions a frame into discrete classes using information that is known for the entire frame. Often the classes are geographic. In the first stage, primary sampling units (PSUs) are selected from these classes. The second stage selects secondary sampling units (SSUs) from only the selected PSUs.

This method is useful for in-person surveying, as it is logistically necessary to constrain the geography of survey administration.

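Multi-stage sampling also qualifies as a complex survey design because selection in the second stage is conditional on the first. Every PSU has some probability of being selected in the first stage, but SSUs within PSUs that were not selected have zero probability of being selected in the second.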
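
A two-stage selection can be sketched as below, assuming simple random sampling at both stages so that a case's overall probability of selection is the product of the two stage probabilities. The county and household names, stage sizes, and seed are invented for illustration.

```python
import random


def two_stage_sample(clusters, n_psu, n_ssu, seed=None):
    """Select PSUs, then SSUs within each selected PSU.

    clusters maps a PSU name to its list of SSUs. Returns a list of
    (psu, ssu, overall selection probability) tuples.
    """
    rng = random.Random(seed)
    selected_psus = rng.sample(sorted(clusters), n_psu)
    p_psu = n_psu / len(clusters)      # first-stage probability
    sample = []
    for psu in selected_psus:
        units = clusters[psu]
        m = min(n_ssu, len(units))
        p_ssu = m / len(units)         # second-stage probability, given the PSU
        for unit in rng.sample(units, m):
            sample.append((psu, unit, p_psu * p_ssu))
    return sample


# Hypothetical frame: 5 counties (PSUs) with 20 households (SSUs) each.
clusters = {f"county_{c}": [f"hh_{c}_{i}" for i in range(20)] for c in "ABCDE"}
sample = two_stage_sample(clusters, n_psu=2, n_ssu=5, seed=1)
```

Households in the three counties that were not selected never enter the second stage, which is the conditional structure that a variance estimator has to account for.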


Non-probability Sampling

Non-probability sampling involves soliciting responses from a stream of people that differs from the true population. There is no known probability of selection. There are some people with zero probability of responding, and generally there are also some people who respond with certainty (i.e., 'professional' survey takers).



Statistics/SurveySampling (last edited 2025-11-03 02:07:45 by DominicRicottone)