= Sequential Sample Selection Methods =

'''Sequential Sample Selection Methods''' was written by James R. Chromy in 1979. It was part of the proceedings of the American Statistical Association ''Section on Survey Research Methods''.

The author describes a sequential probability proportional to size (PPS) sampling algorithm that can be efficiently programmed.

There are ''N'' sampling units. Each unit, indexed by ''i'', has a size measure ''S(i)''.

Let ''n(i)'' be the number of 'sample hits' for unit ''i''. Naturally, ''Σ,,i,,n(i)'' is equal to the sample size, ''n''. In probability non-replacement (PNR) sampling, ''n(i)'' is equal to 1 for ''n'' units and 0 for all others. In probability replacement (PR) sampling, ''n(i)'' can take on higher values (in theory up to ''n'').

It can be shown that ''E[n(i)] = nS(i) / Σ,,i,,S(i)'' and that ''Σ,,i,,E[n(i)] = n''. Henceforth, let ''Σ,,i,,S(i)'' be denoted as ''S(+)''.
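As a quick check of the expectation formula, here is a minimal sketch that computes ''E[n(i)]'' for every unit. The function name `expected_hits` and the example size measures are made up for illustration; they are not from the paper.

```python
def expected_hits(sizes, n):
    """Return E[n(i)] = n * S(i) / S(+) for each unit."""
    total = sum(sizes)  # S(+), the sum of all size measures
    return [n * s / total for s in sizes]


# Hypothetical size measures for N = 4 units and a sample of n = 2.
expectations = expected_hits([10, 20, 30, 40], n=2)
print(expectations)  # [0.2, 0.4, 0.6, 0.8]
```

The expectations sum to ''n'' (up to floating-point rounding), as stated above.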

It follows that a computer algorithm can determine values of ''n(i)'' with these expectations by visiting units sequentially, rather than operating on the entire set at once. The author introduces this as probability minimum replacement (PMR). First, draw a uniform random value for each unit. Then let ''I(i)'' and ''F(i)'' be the integer and fractional parts, respectively, of

{{attachment:eq1.svg}}

This represents the expected number of sample hits for the subset of units up to and including unit ''i''. It follows that 

{{attachment:eq2.svg}}
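The integer and fractional parts of the cumulative expected hits can be computed in a single pass. This is only a sketch: `cumulative_parts` is a hypothetical helper, and it assumes integer size measures so that `fractions.Fraction` keeps the arithmetic exact (floating-point sums could leave a stray fractional part at the last unit).

```python
from fractions import Fraction


def cumulative_parts(sizes, n):
    """Return (I(i), F(i)) for each unit, assuming integer size measures."""
    total = sum(sizes)  # S(+)
    running = 0         # sum of S(j) for units up to and including i
    parts = []
    for s in sizes:
        running += s
        expected = Fraction(n * running, total)  # cumulative expected hits
        integer = expected.numerator // expected.denominator
        parts.append((integer, expected - integer))
    return parts


# Hypothetical example: cumulative expectations are 0.2, 0.6, 1.2 and 2.0.
for integer, fraction in cumulative_parts([10, 20, 30, 40], n=2):
    print(integer, fraction)
```

At the last unit the cumulative expectation is exactly ''n'', so ''I(N) = n'' and ''F(N) = 0''.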

If the uniform random value is less than the conditional probability given by

{{attachment:eq3.svg}}

then ''n(i)'' is characterized by

{{attachment:eq4.svg}}

otherwise by

{{attachment:eq5.svg}}

Clearly then

{{attachment:eq6.svg}}
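To make the sequential pass concrete, here is a rough Python sketch. The transition rules are written from the commonly stated form of Chromy's procedure and are an assumption on my part; they have not been checked term by term against the attached equations. The function name `chromy_pmr` is made up, sizes are assumed to be integers (or otherwise exact rationals), and the random start and circular ordering that the paper uses to support variance estimation are left out.

```python
import random
from fractions import Fraction


def chromy_pmr(sizes, n, rng=None):
    """Return hit counts n(i), one per unit, that sum to n."""
    rng = rng or random.Random()
    total = sum(Fraction(s) for s in sizes)  # S(+)
    cum = Fraction(0)                        # running sum of S(j)
    prev_int, prev_frac = 0, Fraction(0)     # I(i-1), F(i-1)
    prev_hits = 0                            # hits assigned to units before i
    hits = []
    for s in sizes:
        cum += Fraction(s)
        expected = n * cum / total           # I(i) + F(i)
        cur_int = expected.numerator // expected.denominator
        cur_frac = expected - cur_int
        # Probability that the running hit total ends at I(i) + 1 rather
        # than I(i), given where it stood after the previous unit.
        if prev_hits == prev_int:
            if cur_frac > prev_frac:
                p_up = (cur_frac - prev_frac) / (1 - prev_frac)
            else:
                p_up = Fraction(0)
        else:  # running total was I(i-1) + 1, which requires F(i-1) > 0
            if cur_frac > prev_frac:
                p_up = Fraction(1)
            else:
                p_up = cur_frac / prev_frac
        u = rng.random()                     # uniform draw for this unit
        cur_hits = cur_int + (1 if u < p_up else 0)
        hits.append(cur_hits - prev_hits)
        prev_int, prev_frac, prev_hits = cur_int, cur_frac, cur_hits
    return hits


# Hypothetical example; the result varies from draw to draw.
print(chromy_pmr([10, 20, 30, 40], n=2))
```

Under these rules the running hit total after unit ''i'' is always ''I(i)'' or ''I(i) + 1'', and it is the larger value with probability ''F(i)''. Each ''n(i)'' is therefore the expected value rounded either down or up, which is the sense in which replacement is kept to a minimum.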

The paper then develops more math for variance estimation, which goes over my head.



----
CategoryRicottone CategoryReadingNotes CategoryTodoRead