Differences between revisions 7 and 12 (spanning 5 versions)

American Community Survey

The American Community Survey (ACS) is a continual survey operated by the Census Bureau.

Technically, the Puerto Rican component of the survey is the Puerto Rico Community Survey (PRCS).

Usage

See here for notes on the public use microdata.

Design

Sampling

There are two samples used in this survey. The first is the housing unit (HU) sample. This is drawn from the Census Bureau's Master Address File (MAF). The annual target sample size is about 3.5 million.

No HU is sampled more than once in 5 years. To accommodate this, the list is partitioned into 5 subframes that are assigned to discrete sample years. New addresses are randomly selected into a subframe.
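
As a rough illustration of the partitioning described above, the following Python sketch deals a shuffled address list into five subframes. The function names, the round-robin dealing, and the year-modulo rotation are assumptions made for illustration; the Census Bureau's actual assignment procedure is not described here.

```python
import random


def assign_subframes(addresses, n_subframes=5, seed=0):
    """Randomly partition addresses into n_subframes subframes (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = list(addresses)
    rng.shuffle(shuffled)
    # Deal the shuffled addresses round-robin so subframe sizes differ by at most one.
    return {addr: i % n_subframes for i, addr in enumerate(shuffled)}


def subframe_for_year(year, n_subframes=5):
    """Index of the subframe eligible in a sample year (assumed simple rotation)."""
    return year % n_subframes
```

Under this assumed rotation, an address in a given subframe only becomes eligible again five sample years later, which is what keeps any HU from being sampled more than once in 5 years.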

Each annual survey administration is divided into two 6-month periods. Period 1 covers the months of January through June and is selected in September, while Period 2 covers July through December and is selected in March. For a Period 1 sample, the appropriate sample year subframe is randomly assigned to either of the periods. For a Period 2 sample, after incorporating the assignments from the Period 1 sample, any new addresses are randomly assigned to either of the periods. This process constitutes the first-stage sampling.

For the second stage, the appropriate periodic subframe is used. The target sample size is roughly half the annual target sample size. The HU sample is selected from every county independently. Here 'county' is used to mean actual counties, county equivalents, municipalities in PR, and D.C. itself.

A follow-up sample is selected from the set of nonresponding addresses. In some blocks, this sampling rate is 100%.

The second sample is the group quarters (GQ) sample. This is drawn from a list of known multi-resident addresses that is labeled by type. Immediately it is partitioned between small GQ facilities (i.e., 15 or fewer residents) and large GQ facilities (i.e., more than 15). Addresses with an unknown count are treated as small.

The small GQ sample is very similar to the HU sample. The first stage of sampling is partitioning the list into five sample year subframes. The second stage selects addresses from every state (plus DC and PR) independently. The number of residents is not taken into account, given that there is little variation among small facilities.

At the time of the interview, the roster of actual residents is identified. If there are more than 15 actual residents, a random sample of 10 is selected.

The list of large GQ facilities is not partitioned into sample years. Residents in such facilities are treated as groups of 10. Addresses then have a calculated GQ measure of size (GQMOS), which is the number of residents divided by 10. Groups are selected from every state (plus DC and PR) independently. If a state's sampling rate is 2.5%, a facility with a GQMOS of 40 (i.e., roughly 400 residents) will have at least one group selected with certainty. This process constitutes the first-stage sampling.
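
The arithmetic behind the 'selected with certainty' example can be sketched in Python. The systematic selection over cumulative group counts used here is an assumption for illustration (the notes above do not say how groups are drawn), and all function names are hypothetical.

```python
def gqmos(residents, group_size=10):
    """GQ measure of size: the number of residents divided by 10."""
    return residents / group_size


def expected_groups(residents, sampling_rate, group_size=10):
    """Expected number of selected groups for a facility at a given state rate."""
    return gqmos(residents, group_size) * sampling_rate


def systematic_group_hits(gqmos_list, interval, start):
    """Count selected groups per facility under an ASSUMED systematic sample.

    Every `interval`-th group is selected, beginning at `start`, walking
    through the facilities' groups in list order. A 2.5% rate is a
    1-in-40 interval.
    """
    hits = []
    cumulative = 0
    next_point = start
    for size in gqmos_list:
        upper = cumulative + size
        count = 0
        # Each selection point that falls inside this facility's range is one group.
        while next_point < upper:
            count += 1
            next_point += interval
        hits.append(count)
        cumulative = upper
    return hits
```

With a 2.5% rate, a facility of 400 residents has a GQMOS of 40 and an expected count of 40 × 0.025 = 1 group. Under the assumed systematic draw, any run of 40 consecutive groups must contain a selection point, so such a facility is hit at least once regardless of the random start.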

At the time of the interview, the roster of actual residents is identified and a random sample of 10 is selected. If there are fewer than 10 actual residents, all are selected with certainty. This process constitutes the second-stage sampling.

If a large GQ facility has multiple groups selected, they are spread across sample months. In some cases it is still necessary to administer the survey to multiple groups in the same sample month, in which case the second-stage sample is larger than 10.
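
A minimal Python sketch of this second-stage draw, assuming simple random sampling without replacement from the roster; the function and parameter names are hypothetical.

```python
import random


def select_residents(roster, groups=1, group_size=10, rng=None):
    """Draw the second-stage sample of residents from an interview-time roster.

    `groups` is the number of groups administered in the same sample month,
    so the target is 10 residents per group.
    """
    rng = rng or random.Random()
    roster = list(roster)
    target = groups * group_size
    if len(roster) <= target:
        # Too few residents to sample from: everyone is selected with certainty.
        return roster
    return rng.sample(roster, target)
```

For example, a roster of 7 yields all 7 residents, a roster of 50 yields 10, and a roster of 50 with two groups in the same sample month yields 20.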

Note also that this second-stage sample is drawn independently of the samples for any other sample months. Residents of a large GQ facility that is included in multiple sample months can therefore be selected repeatedly.

Also note that the process for small GQ facilities that are identified at the time of interview as having more than 15 actual residents mirrors the large GQ sample process where one group was selected.

Lastly, Remote Alaska is handled separately. HU sample months are selected with respect to season and geography, to balance workload. The GQ sample is assigned to either January or July, and both are administered over 6 months.

Mode

Survey invitations are sent by mail to selected HU addresses, and primary residents are encouraged to respond either by mailing back a paper survey or by following a link to a web survey. The survey, like the long-form decennial census that it replaced, is mandatory.

The follow-up effort uses computer-assisted personal interviewing (CAPI).

Interviewers go to selected GQ addresses in-person to identify the roster of actual residents and administer the second-stage sampling.

Lastly, Remote Alaska is handled separately. All addresses are surveyed by CAPI.

Frequency

Responses are collected into annual vintages; '1-year estimates' are usually published in the fall.

Data is further aggregated into a rolling 5-year window for '5-year estimates'.
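
The rolling window can be enumerated with a short Python sketch; the function name is hypothetical and the window is assumed to advance one vintage at a time.

```python
def five_year_windows(first_year, last_year, width=5):
    """List (start, end) year pairs for each rolling window of annual vintages."""
    return [
        (end - width + 1, end)
        for end in range(first_year + width - 1, last_year + 1)
    ]
```

For vintages 2005 through 2010 this gives the windows 2005-2009 and 2006-2010, so consecutive 5-year estimates share four of their five vintages.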

Weighting

HU sample members carry a base weight equal to the inverse of the selection probability. Those selected for the follow-up sample have their base weight adjusted to account for that second selection probability as well. There is also a correction adjustment for those who were selected for the follow-up sample but actually did respond (late) to the original survey invitation. This correction is applied to every sample member selected for the follow-up sample, and simply undoes the prior adjustment's transfer of weight.
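
A rough Python sketch of these adjustments follows. It reflects one reading of the notes above: the follow-up factor is taken to be the inverse of the follow-up sampling rate, and a late responder is taken to revert to the plain base weight. Both are assumptions, as are the function and parameter names.

```python
def base_weight(selection_prob):
    """Base weight: the inverse of the selection probability."""
    return 1.0 / selection_prob


def hu_weight(selection_prob, followup_rate=None, late_responder=False):
    """Weight after the follow-up adjustment (an interpretation, not the official method).

    followup_rate is None for addresses that were not subsampled for follow-up.
    """
    weight = base_weight(selection_prob)
    if followup_rate is not None and not late_responder:
        # Selected nonrespondents also stand in for nonrespondents not followed up.
        weight /= followup_rate
    # For late responders the follow-up factor is skipped, i.e. undone.
    return weight
```

For example, with a selection probability of 0.025 the base weight is 40; if that address falls into a follow-up sample drawn at a rate of 0.5, the weight becomes 80, unless a late response arrives, in which case it stays at 40.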

Nonresponse adjustments are calculated with respect to eligible households. That is, vacant or demolished structures that were sampled are not used for this adjustment.

Finally, the HU sample is post-stratified to population controls by race/ethnicity, sex, and age group (13 levels). These controls are modeled at the sub-county level. The resulting weights are the HU weights.
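
Post-stratification in its simplest form scales the weights within each demographic cell so that they sum to the cell's population control. The Python sketch below shows only that generic ratio adjustment; the record layout and names are hypothetical, and the production ACS procedure is more involved than this.

```python
from collections import defaultdict


def post_stratify(records, controls):
    """Scale weights so each cell's weighted total matches its control.

    records:  list of dicts with a 'cell' key (e.g. a race/ethnicity, sex,
              and age group combination) and a 'weight' key.
    controls: mapping of cell -> population control total.
    """
    totals = defaultdict(float)
    for rec in records:
        totals[rec["cell"]] += rec["weight"]
    # Every record in a cell is multiplied by the same control/total ratio.
    return [
        {**rec, "weight": rec["weight"] * controls[rec["cell"]] / totals[rec["cell"]]}
        for rec in records
    ]
```

If a cell's weights sum to 40 and its control is 100, every weight in that cell is multiplied by 2.5.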

Person-level HU weights are also produced that take into account the actual demographic characteristics of HU residents. The HU weights are used as the base.

The GQ data set is specially prepared to account for geographies with zero selected facilities. All large out-of-sample GQ facilities receive imputed whole-person records. A random sample of small out-of-sample GQ facilities is selected to similarly receive imputed whole-person records.

GQ sample members carry a base weight equal to the inverse of the selection probability. These are then adjusted with tract-, county-, and state-level constraints, such that the sum of weights matches population counts at those levels. The reason that the weights do not already sum to those counts is that small GQs are not selected with respect to the number of residents.

Finally, the GQ sample is post-stratified to population controls. These are the GQ person weights.

Geographies

1-year estimates are published for geographic areas with populations of 65,000 or more. This threshold is set so that estimates can be made available for all states, territories, congressional districts, PUMAs, CBSAs, cities, and Native American areas.

5-year estimates are published for geographic areas with much smaller population levels. This includes ZCTAs, census tracts, and census block groups.

History

The ACS was launched in 2005 as a replacement for the long-form U.S. census. It provides more timely data because data collection is continuous and results are published in periodic aggregations. It is used to allocate federal and state funding.

CategoryRicottone

UnitedStates/CensusBureau/AmericanCommunitySurvey (last edited 2025-09-08 22:46:41 by DominicRicottone)

-  ⇤ ← Revision 7 as of 2024-03-27 15:54:00 → 
  Size: 1498
  Editor: DominicRicottone
  Comment: Publication timeline
+   ← Revision 12 as of 2025-09-08 22:46:41 → ⇥
  Size: 8076
  Editor: DominicRicottone
  Comment: Notes
-Deletions are marked like this.
+Additions are marked like this.
 Line 3:
-The '''American Community Survey''' ('''ACS''') is an annual survey operated by the Census Bureau to establish demographic information about the nation's population.
+The '''American Community Survey''' ('''ACS''') is a continual survey operated by the Census Bureau.

Technically, the [[UnitedStates/PuertoRico|Puerto Rican]] component of the survey is the '''Puerto Rico Community Survey''' ('''PRCS''').
-Line 6:
+Line 8:
+----



== Usage ==

See [[UnitedStates/CensusBureau/AmericanCommunitySurvey/PublicUseMicrodata|here]] for notes on the public use microdata.
-Line 15:
+Line 25:
+=== Sampling ===

There are two samples used in this survey. The first is the '''housing unit''' ('''HU''') sample. This is drawn from the Census Bureau's Master Address File (MAF). The annual target sample size is about 3.5 million.

No HU is sampled more than once in 5 years. To accommodate this, the list is partitioned into 5 subframes that are assigned to discrete sample years. New addresses are randomly selected into a subframe.

Each annual survey administration is divided into 6-month periods. Period 1 covers the months of January through June and is selected in September, while Period 2 covers July though December and is selected in March. For a Period 1 sample, the appropriate sample year subframe is randomly assigned to either of the periods. For a Period 2 sample, after incorporating the assignments from the Period 1 sample, any new addresses are randomly assigned to either of the periods. This process constitutes the first-stage sampling.

For the second-stage, the appropriate periodic subframe is used. The target sample size is roughly half the annual target sample size. The HU sample is selected from every county independently. Here 'county' is used to mean actual counties, county equivalents, municipalities in [[UnitedStates/PuertoRico|PR]], and [[UnitedStates/WashingtonDC|D.C.]] itself.

A follow-up sample is selected from the set of nonresponding addresses. In some blocks, this sampling rate is 100%.

The second sample is the '''group quarters''' ('''GQ''') sample. This is drawn from a list of known multi-resident addresses that is labeled by type. Immediately it is partitioned between small GQ facilities (i.e., 15 or fewer residents) and large GQ facilities (i.e., more than 15). Addresses with an unknown count are treated as small.

The small GQ sample is very similar to the HU sample. The first-stage of sampling is partitioning the list into five sample year subframes. The second-stage selects addresses from every state (plus DC and PR) independently. The number of residents is not taken into account, given there is little variation.

At the time of the interview, the roster of actual residents is identified. If there are more than 15 actual residents, a random sample of 10 is selected.

The list of large GQ facilities is ''not'' partitioned into sample years. Residents in such facilities are treated as groups of 10. Addresses then have a calculated '''GQ measure of size''' ('''GQMOS'''), which is the number of residents divided by 10. Groups are selected from every state (plus DC and PR) independently. If a state's sampling rate is 2.5%, a facility with a GQMOS of 40 (i.e., roughly 400 residents) will have at least one group selected with certainty. This process constitutes the first-stage sampling.

At the time of the interview, the selection of 10 actual residents constitutes the second-stage sampling. The roster of actual residents is identified and a random sample of 10 is selected. If there are fewer than 10 actual residents, all are selected with certainty. This process constitutes the second-stage sampling.

If a large GQ facility has multiple groups selected, they are spread across sample months. In some cases it is still necessary to administer the survey to multiple groups in the same sample month, in which case the second-stage sample is larger than 10.

Note also that this second-stage sample is drawn independent of any other sample months. Residents of a large GQ facility that is included in multiple sample months can be selected repeatedly.

Also note that the process for small GQ facilities that are identified at the time of interview as having more than 15 actual residents mirrors the large GQ sample process where one group was selected.

Lastly, Remote Alaska is handled separately. HU sample months are selected with respect to season and geography, to balance workload. The GQ sample is assigned to either January or July, and both are administered over 6 months.



=== Mode ===

Survey invitations are sent by mail to selected HU addresses, and primary residents are encouraged to respond by either mail-back paper survey or to follow a link for a web survey. The survey, like the long-form decennial census that it replaced, is mandatory.

The follow-up effort uses CAPI.

Interviewers go to selected GQ addresses in-person to identify the roster of actual residents and administer the second-stage sampling.

Lastly, Remote Alaska is handled separately. All addresses are surveyed by CAPI.
-Line 17:
+Line 72:
-The ACS is administered annually, with continuous data collection. No household is sampled more than once in 5 years.
+Responses are collected into annual vintages; '1-year estimates' are usually published in the Fall.
-Line 19:
+Line 74:
-1-year estimates are usually published in the fall, and 5-year estimates are usually published in the winter.
+Data is further aggregated into a rolling 5-year window for '5-year estimates'.



=== Weighting ===

HU sample members carry a base weight equal to the inverse selection probability. Those selected for the follow-up sample have their base weight adjusted to account for this selection probability as well. There is also a correction adjustment for those selected for the follow-up sample but actually did respond (late) to the original survey invite. This correction is applied to every sample member selected for the follow-up sample, and simply un-does the prior adjustment's transfer of weight.

Nonresponse adjustments are calculated with respect to eligible households. That is, vacant or demolished structures that were sampled are not used for this adjustment.

Finally, the HU sample is post-stratified to population controls by race/ethnicity, sex, and age group (13 levels). These controls are modeled at the sub-county-level. These are the HU weights.

Person-level HU weights are also produced that take into account the actual demographic characteristics of HU residents. The HU weights are used as the base.

The GQ data set is prepared specially to account for geographies with zero selected facilities. All large out-of-sample GQ facilities receive imputed whole person records. A random sample of small out-of-sample GQ facilities are selected to similarly receive imputed whole person records.

GQ sample members carry a base weight equal to the inverse selection probability. These are then adjusted with tract-, and county-, and state-constraints, such that the sum of weights match population counts at those levels. The reason that weights did not already sum to those counts is that small GQs are not selected with respect to the number of residents.

Finally, the GQ sample is post-stratified to population controls. These are the GQ person weights.
-Line 25:
+Line 98:
-1-year estimates are published for geographic areas with populations of 65,000 or more. This threshold is set so that estimates can be made available for all states, territories, congressional districts, [[UnitedStates/CensusBureau/PublicUseMicrodataAreas|PUMAs]], [[UnitedStates/CensusBureau/CoreBasedStatisticalAreas|CBSAs]], cities, and Native American areas.
+1-year estimates are published for geographic areas with populations of 65,000 or more. This threshold is set so that estimates can be made available for all states, territories, congressional districts, [[UnitedStates/CensusBureau#Public_Use_Microdata_Areas|PUMAs]], [[UnitedStates/CensusBureau#Core_Based_Statistical_Areas|CBSAs]], cities, and Native American areas.
-Line 27:
+Line 100:
-5-year estimates are published for geographic areas with much smaller population levels. This includes [[UnitedStates/CensusBureau/ZipCodeTabulationAreas|ZCTAs]], [[UnitedStates/CensusBureau/Census#Geography|census tracts, and census block groups]].
+5-year estimates are published for geographic areas with much smaller population levels. This includes [[UnitedStates/CensusBureau#ZIP_Code_Tabulation_Areas|ZCTAs]], [[UnitedStates/CensusBureau#Tracts|census tracts]], and [[UnitedStates/CensusBureau#Blocks|census block groups]].

Diff for "UnitedStates/CensusBureau/AmericanCommunitySurvey"