SPSS Match Files

The MATCH FILES command joins datasets.


Usage

match files
  /file=left
  /file=right
  /by KEYVARS.

The final dataset contains all rows and columns from all datasets. Variables are taken in order, from datasets in order. If a variable is present in more than one dataset, its values are taken from the first dataset in which it appears, and its metadata (variable label, value labels, or missing values) is taken from the first dataset that has any metadata set for it.
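
As a rough illustration of the precedence rule, the sketch below builds two small inline datasets that share a variable. The variable names (id, score, grade) and all data values are invented for this example; left and right are the dataset names from the usage template above.

data list free / id (f2) score (f3).
begin data
1 10
2 20
end data.
dataset name left.

data list free / id (f2) score (f3) grade (f1).
begin data
2 99 3
3 30 4
end data.
dataset name right.

match files
  /file=left
  /file=right
  /by id.
list.

Both inline datasets are already in ascending order of id, so no sorting is needed here. Running LIST on the result shows which dataset supplied score for each id.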

The key variables must be defined with the same format in each dataset. Cases must be uniquely identified by the key variables in each dataset. The exception is a join that uses the /TABLE subcommand, in which case uniqueness is required only of the dataset named on /TABLE.

The same-format requirement extends to the length of string variables, except in PSPP version 2.0 or later.
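
If two string keys differ only in width, one option in SPSS Statistics is to widen the narrower one with ALTER TYPE before joining. A minimal sketch, assuming a string key statecode that is stored as A2 in a dataset named states and as A5 in the other dataset (the names and widths are assumptions made for illustration):

* Widen the key in states so that it matches the A5 key in the other dataset.
dataset activate states.
alter type statecode (a5).

Widening does not alter the stored values, so an existing sort order is unaffected. Availability of ALTER TYPE outside SPSS Statistics is not assumed here.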

Each dataset must be presorted by the key variables.
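
A typical way to satisfy this is to sort each dataset immediately before the join. The sketch below assumes two open datasets named left and right and a key variable named id; all three names are placeholders.

dataset activate left.
sort cases by id.
dataset activate right.
sort cases by id.
match files
  /file=left
  /file=right
  /by id.
execute.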

File

Each /FILE subcommand takes one of: a quoted path to a data file, the name of an open dataset, or a star (*) for the active dataset.

If the active dataset is included in a join and referenced by a star (*), that dataset will be modified in-place by the join.

If the active dataset is included in a join and referenced by name, the final dataset will retain the name.

If the active dataset is not included, the final dataset is unnamed and becomes the active dataset.
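
For instance, the active dataset can be joined in place with a data file on disk. In this sketch the file name wave2.sav and the key variable id are hypothetical, and both inputs are assumed to be sorted by id already.

match files
  /file=*
  /file='wave2.sav'
  /by id.
execute.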

Table

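The /TABLE subcommand extends the MATCH FILES command to joins against lookup tables. Cases in a /TABLE dataset are matched to the cases of the /FILE datasets by the key variables, but contribute no cases of their own to the final dataset.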

match files
  /file=foo
  /table=states
  /by statecode.

Each /TABLE subcommand takes one of: a quoted path to a data file, the name of an open dataset, or a star (*) for the active dataset.

In

The /IN subcommand must immediately follow a /FILE subcommand. It creates a flag variable for that dataset: 1 for any case that is present in it, 0 otherwise.

The flag variable will be non-missing for all cases in the final dataset and will be appended to the end of the variables.
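
The flags are a convenient way to audit a join. A minimal sketch, assuming datasets named left and right, a key named id, and flag names inleft and inright chosen only for this example:

match files
  /file=left /in=inleft
  /file=right /in=inright
  /by id.
execute.
crosstabs /tables=inleft by inright.

The crosstab counts the cases found in both datasets, only in left, and only in right.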

Rename

The /RENAME subcommand applies renames to the /FILE subcommand preceding it.

These renames take place before the datasets are joined. The key variables can be renamed to their final names.
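
For example, if one dataset stores the key under a different name, it can be renamed as part of the join. The names below are assumptions: right is taken to hold the key as respid, while left holds it as id, and right is taken to be sorted by respid.

match files
  /file=left
  /file=right /rename=(respid = id)
  /by id.
execute.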

First and Last

The /FIRST and /LAST subcommands append flag variables that mark the first and last matches by the key variables. This is generally only useful with /TABLE joins.

match files
  /file=population /first=headofhousehold
  /table=households
  /by id.

The MATCH FILES command can be used with a single dataset, in which case these subcommands can be used to mark the sequence of cases within a group.

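match files
  /file=*
  /by id
  /first=PrimaryFirst
  /last=PrimaryLast.
* The first case of a group gets 0 if it is also the last, otherwise 1.
* Every later case of the group adds 1 to the value of the previous case.
do if PrimaryFirst=1.
  compute MatchSequence = 1 - PrimaryLast.
else.
  compute MatchSequence = MatchSequence + 1.
end if.
* LEAVE retains the value of MatchSequence from one case to the next.
leave MatchSequence.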

Keep and Drop

The /KEEP and /DROP subcommands specify a list of variables to keep or drop from the final dataset.

Variables created by an /IN, /FIRST, or /LAST subcommand cannot be dropped.
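
A minimal sketch of dropping variables from the result, assuming datasets named left and right, a key named id, and two unwanted variables named notes and tempflag (all hypothetical):

match files
  /file=left
  /file=right
  /by id
  /drop=notes tempflag.
execute.

A /KEEP subcommand takes a variable list in the same position and retains only the listed variables instead.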


Data Model

The MATCH FILES command reads all datasets and data files named on /FILE and /TABLE subcommands.

The MATCH FILES command recognizes FILTER status and preserves it, but does not apply it: filtered cases are still included in the final dataset.


See also

PSPP manual for MATCH FILES



SPSS/MatchFiles (last edited 2024-01-02 17:09:03 by DominicRicottone)