SPSS Match Files

The MATCH FILES command joins datasets.


Usage

match files
  /file=left
  /file=right
  /by KEYVARS.

The final dataset contains all rows and columns from all datasets. Datasets are processed in the order they are listed, and variables are added in the order they appear within each dataset. If a variable is present in more than one dataset, its values are taken from the first dataset it appears in, and its metadata (i.e. variable label, value labels, or missing values) is taken from the first dataset that has any metadata set for it.

The key variables must be defined with the same format (including length for string variables) in each dataset. Cases must be uniquely identified by the key variables in each dataset, unless the /TABLE subcommand is used, in which case uniqueness is required only of the dataset specified on the /TABLE subcommand itself.

Each dataset must be presorted by the key variables.
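
As a minimal sketch of the presorting requirement, each dataset can be sorted with SORT CASES before the join. The dataset names (left, right) and the key variable (id) are hypothetical, and both datasets are assumed to be open and named already.

```spss
* Hypothetical dataset names (left, right) and key variable (id).
* Sort each input by the key before joining.
dataset activate left.
sort cases by id.

dataset activate right.
sort cases by id.

* Join the two presorted datasets on the key.
match files
  /file=left
  /file=right
  /by id.
```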

File

Each /FILE subcommand takes one of:

  • a star (*) indicating the active dataset
  • the name of a dataset
  • a filename or file handle

If the active dataset is included in a join and referenced by a star (*), that dataset will be modified in place by the join.

If the active dataset is included in a join and referenced by name, the final dataset will retain the name.

If the active dataset is not included, the final dataset is unnamed and becomes the active dataset.
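
The forms a /FILE subcommand can take might be sketched as follows. The dataset name (survey), the file path ('regions.sav'), the file handle (reghandle), and the key variable (id) are all hypothetical, and every input is assumed to be presorted by id. The three commands are alternatives, not a sequence to run together.

```spss
* Hypothetical names throughout, and inputs assumed presorted by id.

* Active dataset by star, joined with a named dataset.
match files
  /file=*
  /file=survey
  /by id.

* Active dataset by star, joined with a data file on disk.
match files
  /file=*
  /file='regions.sav'
  /by id.

* Active dataset by star, joined with a file handle.
file handle reghandle /name='regions.sav'.
match files
  /file=*
  /file=reghandle
  /by id.
```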

Table

The /TABLE subcommand extends MATCH FILES to join lookup tables. Key values must be unique within the table, but may repeat in the datasets specified on /FILE.

match files
  /file=foo
  /table=states
  /by statecode.

Each /TABLE subcommand takes one of:

  • the name of a dataset
  • a filename or file handle
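
A fuller sketch of the lookup join, using small inline datasets, could look like the following. The variables (id, statename) and all data values are made up for illustration; only the dataset names foo and states and the key statecode come from the example above.

```spss
* Hypothetical cases, several of which share a state code.
data list list /id (F4.0) statecode (A2).
begin data
1 NV
2 CA
3 NV
4 TX
end data.
sort cases by statecode.
dataset name foo.

* Hypothetical lookup table with one row per state code.
data list list /statecode (A2) statename (A12).
begin data
CA California
NV Nevada
TX Texas
end data.
sort cases by statecode.
dataset name states.

* Join the lookup table onto the cases, then show the result.
dataset activate foo.
match files
  /file=foo
  /table=states
  /by statecode.
list.
```

Under these assumptions, each case in foo should gain the statename that matches its statecode, even though statecode does not uniquely identify cases in foo.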


See also

PSPP manual for MATCH FILES: https://www.gnu.org/software/pspp/manual/html_node/MATCH-FILES.html



SPSS/MatchFiles (last edited 2024-01-02 17:09:03 by DominicRicottone)