SPSS Reading Data

SPSS has a wide variety of commands for importing, parsing, or otherwise reading in data.

SPSS Formats

SPSS has an internal format syntax which is used throughout data import steps. Further details are available here.

For the purposes of this page, all that needs to be understood is:

A20 is a 20-wide string variable
F8 is an 8-wide numeric variable
F8.2 is an 8-wide numeric variable with 2 decimal places, e.g. 12345.78
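
As a minimal sketch of these three formats in use, the following reads one inline case with the DATA LIST command. The variable names Name, Count, and Price are hypothetical and chosen only for illustration.

```spss
* Name, Count, and Price are hypothetical variables for illustration.
data list list / Name (A20) Count (F8) Price (F8.2).
begin data
"Widget", 12, 12345.78
end data.
list.
```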

Columnar Definition

The columnar definition of a fixed-width variable consists of:

the name
for a 1-wide variable: the column index
for a 2+-wide variable: the start and end column indices, separated by a dash (-)
the variable format within parentheses

If a variable format is not specified, the basic numeric format (F) is assumed.

Note that GET DATA does not fully comply with this standard:

GET DATA counts columns starting at 0, not 1
variables must have start and end column indices on a GET DATA command, so 1-wide variables will be specified like 1-1
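
To illustrate the difference, here is a sketch of one hypothetical layout (a 1-wide Flag, a 20-wide Name, and a 3-wide Score) defined both ways. The variable names and the file path are placeholders, not taken from a real dataset.

```spss
* DATA LIST counts columns from 1 and allows a single index.
data list file="path/to/file" fixed / Flag 1 Name 2-21 (A) Score 22-24.

* GET DATA counts columns from 0 and always needs a start-end range.
get data
  /type=txt
  /file="path/to/file"
  /arrangement=fixed
  /firstcase=1
  /variables=
  Flag 0-0 F1
  Name 1-20 A20
  Score 21-23 F3.
```

Each GET DATA index is the corresponding DATA LIST index minus one.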

Decimals

A numeric variable's columnar definition can indicate the number of implied decimal places alongside the variable format. A survey weight could be defined as final_wt 1-8 (F, 5) and would be imported as F8.5.

Furthermore, because the numeric format is the default, this can be shortened to (5). This isn't necessarily recommended though.
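
A small sketch of the survey weight definition, assuming the raw data holds the digits without a decimal point. The two data values are made up for illustration.

```spss
* final_wt is read with 5 implied decimal places.
data list fixed / final_wt 1-8 (F, 5).
begin data
00123456
00250000
end data.
list.
```

Under this definition, the digits 00123456 should be read as 1.23456, since the last five digits are taken as the decimal part.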

Strings

A string variable's columnar definition would look like Name 1-24 (A). However, a major caveat is that columnar indices are byte-wise. In other words, Unicode data will be treated as discrete bytes rather than characters.

FORTRAN Definition

Data List

The DATA LIST command is used to read in arbitrary data.

If the data is stored in an external file, reference it on a /file subcommand.

If the data is entered in the syntax, it must be bounded by BEGIN DATA and END DATA statements.
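
For the external file case, a minimal sketch might look like the following. The path is a placeholder, and the variable definitions are borrowed from the examples further down this page.

```spss
* The path below is a placeholder; no BEGIN DATA block is needed.
data list file="path/to/file" list / CaseNum (F2) Score (F3).
list.
```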

Free

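The FREE subcommand causes data to be read into rows and columns irrespective of record delimiters. Values are assigned to the variables in order, and a new case begins once every variable has received a value.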

data list free / CaseNum (F2) Score (F3).
begin data
1, 10, 2, 40, 3, 15, 4, 10, 5, 15, 6,, 7, 25, 8, 10
end data.

This command results in the following dataset:

CaseNum	Score
1	10
2	40
3	15
4	10
5	15
6
7	25
8	10

If formats were not specified, these variables would be read in using the default (F8.2).
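
As a sketch of that default behavior, the same kind of data can be read without formats, and display formats can be assigned afterward with the FORMATS command. The shortened data here is illustrative only.

```spss
* No formats given, so both variables default to F8.2.
data list free / CaseNum Score.
begin data
1, 10, 2, 40, 3, 15
end data.

* Assign narrower display formats after the fact.
formats CaseNum (F2) Score (F3).
list.
```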

List

The LIST subcommand operates much the same as the FREE subcommand except that record delimiters matter.

This syntax is equivalent to the example above.

data list list / CaseNum (F2) Score (F3).
begin data
1, 10
2, 40
3, 15
4, 10
5, 15
6
7, 25
8, 10
end data.

Fixed

The FIXED subcommand causes data to be read according to fixed-width columnar formats.

This syntax is equivalent to the example above.

data list fixed / CaseNum 1-2 Score 4-6.
begin data
 1  10
 2  40
 3  15
 4  10
 5  15
 6
 7  25
 8  10
end data.

Note that columns are numbered starting at 1 for DATA LIST, whereas GET DATA starts at 0.

Multi-record data

When cases are spread across multiple records, it is possible to read them in as a single row.

data list fixed records=2 /1 CaseNum 1-2 Score 4-6 /2 Time 1-4.
begin data
 1  10
1200
 2  40
0800
 3  15
1600
end data.

This command results in the following dataset:

CaseNum	Score	Time
1	10	1200
2	40	800
3	15	1600

Get Data

The GET DATA command is used to read in well-structured data. Examples for CSV, tab-delimited, and fixed-width are available.

Note that columns are numbered starting at 0 for GET DATA, whereas DATA LIST starts at 1.
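
As a rough sketch of the delimited case, a CSV file with a header row and the CaseNum and Score variables used earlier on this page might be read like this. The path is a placeholder, and the qualifier setting assumes that string values, if any, are wrapped in double quotes.

```spss
* The path is a placeholder; firstcase=2 skips the header row.
get data
  /type=txt
  /file="path/to/file.csv"
  /arrangement=delimited
  /delimiters=","
  /qualifier='"'
  /firstcase=2
  /importcase=all
  /variables=
  CaseNum F2
  Score F3.
```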

Multi-record data

When cases are spread across multiple records, it may be possible to read them in as a single row. For delimited data on the GET DATA command, this is done with the /DELCASE subcommand, and all cases must have the same number of variables. (For fixed-width data, GET DATA provides a separate /FIXCASE subcommand that sets the number of records per case.)

The /DELCASE subcommand defaults to LINE, meaning each line is one case. Alternatively, VARIABLES n starts a new case after every n values:

get data
  /type=txt
  /file="path/to/file"
  /delimiters="\t"
  /delcase=variables 2
  /firstcase=2
  /importcase=all
  /variables=
  VAR1 A1
  VAR2 F2.
