Differences between revisions 1 and 16 (spanning 15 versions)
Revision 1 as of 2019-12-08 06:36:29
Size: 1234
Comment: Initial commit
Revision 16 as of 2023-01-13 20:44:20
Size: 4246
Comment:
Line 1: Line 1:
= Datetime =
= SPSS Data Types =
Line 3: Line 3:
== A warning about data input ==
SPSS exposes numeric and string data types.
Line 5: Line 5:
SPSS tries to be clever about reading data.
Certain specialized forms of data are handled by formats that impact visualization and export, not storage. The primary example of this is date and time data.
Line 7: Line 7:
Under the format `TIME`, all of these are read in as `"01:02"`, even though the format is a minimum of 5-wide:
 * `"1:2"`
 * `"01 2"`
 * `"01:02"`
<<TableOfContents>>
Line 12: Line 9:
Under the format `DATE`, all of these are read in as `"28-OCT-90"`:
 * `"28-OCT-90"`
 * `"28/10/1990"`
 * `"28.OCT.90"`
 * `"28 October, 1990"`
----


== Numeric Data ==

'''Numeric''' data is stored as double-precision floating point.
Line 20: Line 18:
== Datetime formats ==
=== Numeric Formats ===
Line 22: Line 20:
||'''Format''' ||'''Appears as...''' ||'''Note''' ||
||DATETIME20 ||`dd-MMM-yyyy hh:mm:ss` ||`MMM` is as JAN, not 001||
||DATETIME18 ||`dd-MMM-yyyy hh:mm` ||`MMM` is as JAN, not 001||
||DATE11 ||`dd-MMM-yyyy` ||`MMM` is as JAN, not 001||
||DATE7 ||`dd-MMM-yy` ||`MMM` is as JAN, not 001||
||TIME8 ||`hh:mm:ss` || ||
||TIME5 ||`hh:mm` || ||
||ADATE10 ||`mm/dd/yyyy` || ||
||ADATE8 ||`mm/dd/yy` || ||
||EDATE10 ||`dd.mm.yyyy` || ||
||EDATE8 ||`dd.mm.yy` || ||
Data formats change only how a data point is displayed or exported, typically by dropping digits. Changing a format does not alter the stored value, so no precision is lost.

Given the literal value 123.45...

||'''General Format''' ||'''Format''' ||'''Representation''' ||
||Fw ||F8 ||123 ||
||Fw.d ||F8.1 ||123.4 ||
||Nw ||N8 ||00000123 ||
||Nw.d ||N8.1 ||000123.4 ||

See [[SPSS/NumericFunctions|here]] for the built-in library of numeric functions.
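
As a minimal sketch of the point above, the following syntax narrows a display format and then restores it. The one-row dataset and the variable name `x` are hypothetical, chosen only for illustration.

{{{
* Hypothetical one-row dataset; the variable name x is illustrative.
data list list / x (F8.2).
begin data
123.45
end data.

* Narrow the display format, then widen it again.
formats x (F8.0).
list.
formats x (F8.2).
list.
}}}

If the first `formats` command had altered the stored value, the second listing could not show `123.45` again.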



=== Date and Time Formats ===

'''Date''' data is stored as the number of seconds from midnight, October 14, 1582 to midnight on the specified date.

'''Datetime''' data is stored as the number of seconds from midnight, October 14, 1582 to the specified time on the specified date.

'''Time''' data is stored as a number of seconds. This ''can'' be imagined as the number of seconds from midnight to the specified time, but that is not a ''necessary'' construct: the same value can equally represent a duration.

Keep in mind that 1 day = 60 (seconds) * 60 (minutes) * 24 (hours) = 86400 seconds.
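
The storage model described above can be sketched as follows. This is an illustration only: the dataset and the variable names `d`, `raw_seconds`, `next_day`, and `elapsed` are hypothetical.

{{{
* Hypothetical example; all variable names are illustrative.
data list list / d (DATE11).
begin data
28-OCT-1990
end data.

* Copy the stored value and display it as a plain number of seconds.
compute raw_seconds = d.
formats raw_seconds (F12.0).

* Adding 86400 seconds moves the date forward by one day.
compute next_day = d + 86400.
formats next_day (DATE11).

* A time value is a count of seconds and may exceed 24 hours.
compute elapsed = time.hms(30, 0, 0).
formats elapsed (TIME8).
list.
}}}

Under these assumptions, `next_day` should display as `29-OCT-1990` and `elapsed` as `30:00:00`, while `raw_seconds` shows the underlying count that both date columns share as their starting point.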

||'''Format''' ||'''Representation''' ||
||`DATETIME20` ||`dd-MMM-yyyy hh:mm:ss` ||
||`DATETIME17` ||`dd-MMM-yyyy hh:mm` ||
||`DATE11` ||`dd-MMM-yyyy` ||
||`DATE9` ||`dd-MMM-yy` ||
||`TIME8` ||`hh:mm:ss` ||
||`TIME5` ||`hh:mm` ||
||`ADATE10` ||`mm/dd/yyyy` ||
||`EDATE10` ||`dd.mm.yyyy` ||
||`SDATE10` ||`yyyy/mm/dd` ||

Note that `MMM` is the three-letter month abbreviation (such as `JAN`), not a month number (such as `001`).

Note that `DATE11` and `DATE9` differ in the representation of years ''(4 and 2 digits respectively)''. The American, European, and Sortable date formats behave the same way: `[AES]DATE10` shows a 4-digit year and `[AES]DATE8` shows a 2-digit year.
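
A short sketch of switching between these formats on a single stored value follows. The dataset and the variable name `d` are hypothetical.

{{{
* Hypothetical example; the variable name d is illustrative.
data list list / d (DATE11).
begin data
28-OCT-1990
end data.

* Each FORMATS command changes only the display, not the stored value.
formats d (DATE9).
list.
formats d (ADATE10).
list.
formats d (SDATE10).
list.
}}}

Following the table above, the three listings should show the same date as `28-OCT-90`, `10/28/1990`, and `1990/10/28` respectively.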

See [[SPSS/DateTimeFunctions|here]] for the built-in library of date and time functions.

----



== String Data ==

'''String''' data is stored at a fixed length. This length is defined and adjusted through the format `Aw`, where `w` is the width.



=== String Formats ===

To explicitly re-size a string, try:

{{{
alter type VAR (a100).
}}}

To re-size a string to the smallest possible length (without losing data), try:

{{{
alter type VAR (a=amin).
}}}

A string variable does ''not'' automatically grow to accept longer values, and string expressions do not automatically strip trailing whitespace. Consider the example below:

{{{
data list list / numeric_zip_code (F5).
begin data
12345
1234
123
end data.

string padding (A2).
if numeric_zip_code lt 10000 padding="0".
if numeric_zip_code lt 1000 padding="00".
string zip_code (A5).
compute zip_code=concat(padding, string(numeric_zip_code,F5)).
execute.
}}}

This will produce:

||'''numeric_zip_code'''||'''padding'''||'''zip_code'''||
||`12345` ||`"  "` ||`"  123"` ||
||`1234` ||`"0 "` ||`"0 123"` ||
||`123` ||`"00"` ||`"00123"` ||

The `concat` function returns a 7-character string (2 characters from `padding` plus 5 from the `string` call), and SPSS silently truncates it to 5 characters before storing it in `zip_code`.

There are better ways to zero-pad a string, but the fundamental mistake here is not wrapping the string expressions in `ltrim` and `rtrim` calls.
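
Two possible repairs are sketched below, using the same data as the example above. The variable names `zip_n` and `zip_trim` are hypothetical. The first relies on the `N` format, which pads with leading zeros; the second keeps the original padding logic but trims each piece before concatenating.

{{{
data list list / numeric_zip_code (F5).
begin data
12345
1234
123
end data.

* Option 1: the N format pads with leading zeros directly.
string zip_n (A5).
compute zip_n = string(numeric_zip_code, N5).

* Option 2: same padding logic as before, with each piece trimmed.
string padding (A2).
if numeric_zip_code lt 10000 padding = "0".
if numeric_zip_code lt 1000 padding = "00".
string zip_trim (A5).
compute zip_trim = concat(rtrim(padding), ltrim(string(numeric_zip_code, F5))).
list.
}}}

Both variables are intended to hold `12345`, `01234`, and `00123`. The first option is shorter and avoids the intermediate `padding` variable altogether.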

See [[SPSS/StringFunctions|here]] for the built-in library of string functions.



== String Literals ==

String literals are declared by wrapping a string value in quotes, either single (') or double (").

Quote marks within strings can be handled in one of two ways: either use the opposite quote mark to delimit the string, or escape the quote mark by doubling it.

{{{
variable label var1 "Say ""Hello!""".
variable label var2 'Don''t say "Goodbye!"'.
}}}

CategoryRicottone

SPSS/DataTypes (last edited 2023-06-13 05:00:25 by DominicRicottone)