CPS Public Use Microdata

The Census Bureau and BLS publish public use microdata files from the Current Population Survey (CPS). The files can be downloaded directly or queried online through portals such as MDAT.


Variable Names

Variable names follow a scheme:

The first letter indicates an analysis level.

||Letter||Meaning||
||P||Individual||
||H||Household||
||G||Geographical unit||

The second letter indicates a source.

||Letter||Meaning||
||U||Original, unedited||
||E||Edited||
||R||Recoded||
||T||Topcoded (i.e., any value over a threshold is recoded to the threshold)||
||W||Weighting (see Weights section)||
||X||Allocation (see Allocation Flags section)||
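The two-letter prefix scheme can be sketched as a pair of lookups. This is a minimal illustration, not part of any Census Bureau tooling; the function name `describe_variable` is hypothetical.

```python
# Lookup tables transcribed from the naming scheme above.
LEVELS = {"P": "Individual", "H": "Household", "G": "Geographical unit"}
SOURCES = {
    "U": "Original, unedited",
    "E": "Edited",
    "R": "Recoded",
    "T": "Topcoded",
    "W": "Weighting",
    "X": "Allocation",
}


def describe_variable(name: str) -> tuple[str, str]:
    """Return (analysis level, source) for a CPS variable name.

    Hypothetical helper: raises ValueError when either of the first
    two letters is not covered by the tables above.
    """
    name = name.strip().upper()
    if len(name) < 2 or name[0] not in LEVELS or name[1] not in SOURCES:
        raise ValueError(f"{name!r} does not follow the naming scheme")
    return LEVELS[name[0]], SOURCES[name[1]]


print(describe_variable("PESEX"))     # ('Individual', 'Edited')
print(describe_variable("HWHHWGT"))   # ('Household', 'Weighting')
print(describe_variable("GESTFIPS"))  # ('Geographical unit', 'Edited')
```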


Missing Values

The following values indicate a type of missing value:

||Value||Meaning||
||-1||Blank (often out of universe)||
||-2||Don't know||
||-3||Refused||
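One way to keep these codes out of arithmetic is to map them to `None` before computing anything. The sketch below assumes the variable being cleaned has no legitimate negative values; the helper names are hypothetical.

```python
from typing import Optional

# Missing-value codes transcribed from the table above.
MISSING_CODES = {
    -1: "Blank (often out of universe)",
    -2: "Don't know",
    -3: "Refused",
}


def missing_reason(value: int) -> Optional[str]:
    """Return the missing-value label for a code, or None for a substantive value."""
    return MISSING_CODES.get(value)


def to_valid(value: int) -> Optional[int]:
    """Replace a missing code with None so it cannot be averaged or summed by accident.

    Assumption: the variable has no legitimate values of -1, -2, or -3.
    """
    return None if value in MISSING_CODES else value


print(missing_reason(-2))  # Don't know
print(to_valid(-1))        # None
print(to_valid(40))        # 40
```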


Weights

Weights are calculated for and assigned to each record to produce more accurate estimates. The Census Bureau provides several weights; match each use case to the most appropriate one.

||Weight||Variable Name||Usage||
||Family weight||PWFMWGT||Used for estimates of families||
||Longitudinal weight||PWLGWGT||Used for records that are matched month-to-month||
||Outgoing rotation weight||PWORWGT||Used for estimates making use of only outgoing rotation groups (i.e., month-in-sample 4 or 8)||
||Second stage weight||PWSSWGT||Used for calibration||
||Veterans weight||PWVETWGT||Used for estimates of veterans and nonveterans||
||Composited weight||PWCMPWGT||Used for estimates of individuals||
||Household weight||HWHHWGT||Used for estimates of households||

Weight variables are stored with 4 implied decimal places; they should be divided by 10,000 before using.

For most estimations, the final composited weights are recommended.

When pulling CPS targets for use in weighting, the second stage weights should be used.
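As a rough sketch of the guidance above, the weight variable can be chosen by use case and scaled by its 4 implied decimal places before summing. The use-case labels, function names, and record values below are all made up for illustration; records are assumed to be dictionaries keyed by variable name.

```python
# Weight variable for each use case, transcribed from the table above.
# The dictionary keys are hypothetical labels chosen for this sketch.
WEIGHT_BY_USE = {
    "families": "PWFMWGT",
    "longitudinal": "PWLGWGT",
    "outgoing_rotation": "PWORWGT",
    "calibration": "PWSSWGT",
    "veterans": "PWVETWGT",
    "individuals": "PWCMPWGT",
    "households": "HWHHWGT",
}


def scaled_weight(record: dict, use: str) -> float:
    """Return the record's weight for a use case with the 4 implied decimals applied."""
    return record[WEIGHT_BY_USE[use]] / 10_000


def weighted_total(records: list, use: str) -> float:
    """Sum the scaled weights, i.e. the estimated population the records represent."""
    return sum(scaled_weight(r, use) for r in records)


# Made-up records for illustration only.
people = [
    {"PWCMPWGT": 15432187, "PWSSWGT": 15398822},
    {"PWCMPWGT": 20118734, "PWSSWGT": 20240115},
]

print(weighted_total(people, "individuals"))  # estimate of individuals
print(weighted_total(people, "calibration"))  # target for weighting
```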


Allocation Flags

For some interviews, it is necessary to impute values. These edits are evident in changes from the unedited variable ("U") to the edited variable ("E"). An allocation flag variable ("X") is provided to make the imputation more evident.

Most individual- ("P") and household-level ("H") variables have a corresponding allocation flag. Some recoded ("R") and topcoded ("T") variables do as well.

All allocation variables follow the same scheme:

The first digit indicates how the value was changed.

||Digit||Meaning||
||0||No change||
||1||Changed to some value||
||2||Changed to an unedited value from a prior interview||
||3||Changed to an edited value from a prior interview||
||4||Changed to an allocated value||
||5||Changed to be blank||

The second digit indicates why the value was changed, and largely corresponds to the missing values detailed above.

||Digit||Meaning||
||0||Unedited variable was set to a value||
||1||Unedited variable was blank||
||2||Unedited variable indicated "don't know"||
||3||Unedited variable indicated refusal||

As an example, if PXSEX=21, then the unedited variable (PUSEX) was blank and PESEX was set to the unedited value from a prior interview.
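The two-digit scheme can be decoded by splitting a flag into its tens and ones digits. The sketch below assumes flags are non-negative and that a single-digit flag is read with a leading zero (so 1 is treated as 01); `decode_allocation_flag` is a hypothetical name.

```python
# Digit meanings transcribed from the two tables above.
HOW = {
    0: "No change",
    1: "Changed to some value",
    2: "Changed to an unedited value from a prior interview",
    3: "Changed to an edited value from a prior interview",
    4: "Changed to an allocated value",
    5: "Changed to be blank",
}
WHY = {
    0: "Unedited variable was set to a value",
    1: "Unedited variable was blank",
    2: 'Unedited variable indicated "don\'t know"',
    3: "Unedited variable indicated refusal",
}


def decode_allocation_flag(flag: int) -> tuple[str, str]:
    """Split an allocation flag into its (how, why) meanings.

    Assumption: single-digit flags carry an implied leading zero.
    Raises ValueError for digits outside the tables (including -1).
    """
    how_digit, why_digit = divmod(flag, 10)
    if flag < 0 or how_digit not in HOW or why_digit not in WHY:
        raise ValueError(f"unrecognized allocation flag: {flag}")
    return HOW[how_digit], WHY[why_digit]


how, why = decode_allocation_flag(21)  # the PXSEX=21 example
print(how)  # Changed to an unedited value from a prior interview
print(why)  # Unedited variable was blank
```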


Edited Universe

Some questions are intended for a subpopulation of respondents. Missingness (i.e., values of -1) is then enforced through allocation recodes. For example, PEEDUCA has an edited universe of PRPERTYP = 2 or 3; it will be set to -1 for all cases where PRPERTYP = 1.

The edited universe for any variable is noted in the data dictionary.
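To illustrate the PEEDUCA example, the sketch below applies an edited universe by hand: a value is kept when the record is in universe and replaced with -1 otherwise. The public files already arrive with this applied; the function names and the sample value 39 are hypothetical.

```python
OUT_OF_UNIVERSE = -1  # the "blank" missing value


def peeduca_in_universe(prpertyp: int) -> bool:
    """Edited universe for PEEDUCA as described above: PRPERTYP = 2 or 3."""
    return prpertyp in (2, 3)


def apply_edited_universe(value: int, in_universe: bool) -> int:
    """Keep the value for in-universe records; otherwise force it to -1."""
    return value if in_universe else OUT_OF_UNIVERSE


# Made-up PEEDUCA value (39) paired with each PRPERTYP code.
for prpertyp in (1, 2, 3):
    print(prpertyp, apply_edited_universe(39, peeduca_in_universe(prpertyp)))
```

A practical consequence is that filtering on the universe condition and filtering on "value is not -1" should select the same records for a given variable.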


Data

The most commonly used variables, and recommended uses of them, are:

||Variable||Usage||
||PRPERTYP||Identifies record as child, adult, or member of U.S. Armed Forces||
||PRTAGE||Age; topcoded at 85||
||PESEX||Sex||
||PTDTRACE||Race||
||PEHSPNON||Hispanic ethnicity||
||PEEDUCA||Educational attainment; edited universe is PRPERTYP=2 or 3||
||PEAFEVER||U.S. Armed Forces service history; edited universe is PRTAGE>=17||
||PEMARITL||Marital status; edited universe is PRTAGE>=15||
||PENATVTY||Foreign-born status||
||PEMLR||Employment status; edited universe is PRPERTYP=2||
||PRWRKSTAT||Work status; edited universe is PEMLR=1 thru 7||
||PEHRUSL1||Usual hours worked per week at first job; edited universe is PEMLR=1 or 2||
||PEHRUSL2||Usual hours worked per week at second job (if applicable)||
||PEHRUSLT||PEHRUSL1 + PEHRUSL2||
||PEHRWANT||Seeking full-time work; edited universe is PEMLR=1 and PEHRUSLT=0 thru 34||
||HEFAMINC||Household annual income (16 categories); note the high allocation rate||
||GESTFIPS||State||
||GTCBSA||Metropolitan statistical area||
||GTCO||County||

These measures are not collected from the entire sample.

||Variable||Usage||
||PEERNHRO||Usual hours worked per week if paid an hourly wage||
||PTERNHLY||Hourly wage rate; 2 implied decimal places; topcoded based on the product of this and PEERNHRO||
||PTERNWA||PEERNHRO * PTERNHLY; 2 implied decimal places||
||PRDTOCC1||Occupation (23 categories) for first job; recode from PTIO1OCD||
||PRDTOCC2||Occupation for second job; recode from PTIO2OCD||
||PRDTIND1||Industry (52 categories) for first job; recode from PEIO1ICD||
||PRDTIND2||Industry for second job; recode from PEIO2ICD||
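The implied decimal places on the earnings variables can be handled with a small helper, sketched here with `Decimal` to avoid floating-point rounding on currency. The helper name and the raw values are made up; missing codes (negative values) are assumed to be passed through as `None`.

```python
from decimal import Decimal
from typing import Optional


def implied_decimal(raw: int, places: int = 2) -> Optional[Decimal]:
    """Apply implied decimal places to a raw stored value.

    Assumption: negative values are missing codes and become None.
    """
    if raw < 0:
        return None
    return Decimal(raw) / (10 ** places)


# Made-up raw values: PTERNHLY stored as 2550, PEERNHRO as 40.
hourly = implied_decimal(2550)  # 25.5 dollars per hour
hours = 40
weekly = hours * hourly         # product described for PTERNWA

print(hourly)
print(weekly)
print(implied_decimal(-1))      # None
```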

These categories are either standard definitions or the prevailing usage of the term:

||Category||Meaning||
||Civilian noninstitutional population||PRPERTYP=2 and PRTAGE>=16||
||Unemployed||PEMLR=3 or 4||
||Employed||PEMLR=1 or 2||
||Labor force||PEMLR=1 thru 4||
||Working part time for economic reasons||PRWRKSTAT=3 or 6||
||White, Non-Hispanic||PTDTRACE=1 and PEHSPNON=2||
||Black, Non-Hispanic||PTDTRACE=2 and PEHSPNON=2||
||AAPI, Non-Hispanic||PTDTRACE=4 or 5 or 15; and PEHSPNON=2||
||Other, Non-Hispanic||PTDTRACE=3 or 6 thru 14 or 16 thru 26; and PEHSPNON=2||
||Hispanic||PEHSPNON=1||
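The category definitions above translate fairly directly into code. The following is a minimal sketch that assumes records are dictionaries keyed by variable name, treats any other PEMLR value within the civilian noninstitutional population as "not in labor force" (an assumption, not stated above), and uses the composited weight. All function names and record values are made up.

```python
from typing import Optional


def labor_force_status(prpertyp: int, prtage: int, pemlr: int) -> Optional[str]:
    """Classify a person; None means outside the civilian noninstitutional population."""
    if prpertyp != 2 or prtage < 16:
        return None
    if pemlr in (1, 2):
        return "employed"
    if pemlr in (3, 4):
        return "unemployed"
    return "not in labor force"  # assumption: every remaining PEMLR value


def race_ethnicity(ptdtrace: int, pehspnon: int) -> Optional[str]:
    """Map PTDTRACE and PEHSPNON to the combined categories above."""
    if pehspnon == 1:
        return "Hispanic"
    if pehspnon != 2:
        return None
    if ptdtrace == 1:
        return "White, Non-Hispanic"
    if ptdtrace == 2:
        return "Black, Non-Hispanic"
    if ptdtrace in (4, 5, 15):
        return "AAPI, Non-Hispanic"
    if ptdtrace == 3 or 6 <= ptdtrace <= 14 or 16 <= ptdtrace <= 26:
        return "Other, Non-Hispanic"
    return None


def unemployment_rate(records: list) -> float:
    """Weighted unemployed divided by weighted labor force."""
    unemployed = labor_force = 0.0
    for r in records:
        status = labor_force_status(r["PRPERTYP"], r["PRTAGE"], r["PEMLR"])
        weight = r["PWCMPWGT"] / 10_000  # 4 implied decimal places
        if status in ("employed", "unemployed"):
            labor_force += weight
        if status == "unemployed":
            unemployed += weight
    if labor_force == 0:
        raise ValueError("no labor force records")
    return unemployed / labor_force


# Made-up records for illustration only.
records = [
    {"PRPERTYP": 2, "PRTAGE": 34, "PEMLR": 1, "PWCMPWGT": 20000000},
    {"PRPERTYP": 2, "PRTAGE": 51, "PEMLR": 4, "PWCMPWGT": 10000000},
    {"PRPERTYP": 2, "PRTAGE": 70, "PEMLR": 5, "PWCMPWGT": 15000000},
    {"PRPERTYP": 1, "PRTAGE": 9, "PEMLR": -1, "PWCMPWGT": 0},
]
print(round(unemployment_rate(records), 4))
print(race_ethnicity(15, 2))
```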


See also

Census Bureau's project homepage

BLS's project homepage



UnitedStates/CensusBureau/CurrentPopulationSurvey/PublicUseMicrodata (last edited 2024-08-29 23:58:33 by DominicRicottone)