Differences between revisions 16 and 18 (spanning 2 versions)
Revision 16 as of 2023-01-13 20:44:20
Size: 4246
Comment:
Revision 18 as of 2023-06-13 05:00:25
Size: 2443
Comment:
Line 12: Line 12:
Line 14: Line 15:
'''Numeric''' data is stored as double-precision floating point. Numeric data is stored as double-precision floating point.

Generally, numeric data uses either a numeric format (`Fw.d` where `w` is the field width and `d` is the number of visible decimal places) or a restricted numeric format (`Nw`) for displaying data. See [[SPSS/DataFormats#Print_Formats|here]] for further information. The default format, which will be applied to any implicitly declared variables, is `F8.2`.
Line 18: Line 21:
=== Numeric Formats ===
Line 20: Line 22:
Data formats adjust the visualized/exported representation of a data point, often only truncating the value. Adjusting these data formats does not destroy precision. === Date and Time Data ===
Line 22: Line 24:
Given the literal value 123.45... Dates are stored as the number of seconds from midnight, October 14, 1582 to midnight on the specified date. Generally the `DATE10` format is used for displaying data.
Line 24: Line 26:
||'''General Format''' ||'''Format''' ||'''Representation''' ||
||Fw ||F8 ||123 ||
||Fw.d ||F8.1 ||123.4 ||
||Nw ||N8 ||00000123 ||
||Nw.d ||N8.1 ||000123.4 ||
Datetimes are stored as the number of seconds to the specified time on the specified date. Generally the `DATETIME20` format is used for displaying data.
Line 30: Line 28:
See [[SPSS/NumericFunctions|here]] for the built-in library of numeric functions.



=== Date and Time Formats ===

'''Date''' data is stored as the number of seconds from midnight, October 14, 1582 to midnight on the specified date.

'''Datetime''' data is stored as the number of seconds to the specified time on the specified date.

'''Time''' data is stored as the number of seconds. This ''can'' be imagined as the number of seconds from midnight to the specified time, but that is not a ''necessary'' construct.
Times are stored as the number of seconds. This ''can'' be imagined as the number of seconds from midnight to the specified time, but that is not a ''necessary'' construct. Generally the `TIME8` format is used for displaying data.
Line 44: Line 32:
||'''Format''' ||'''Representation''' ||
||`DATETIME20` ||`dd-MMM-yyyy hh:mm:ss` ||
||`DATETIME17` ||`dd-MMM-yyyy hh:mm` ||
||`DATE11` ||`dd-MMM-yyyy` ||
||`DATE9` ||`dd-MMM-yy` ||
||`TIME8` ||`hh:mm:ss` ||
||`TIME5` ||`hh:mm` ||
||`ADATE10` ||`mm/dd/yyyy` ||
||`EDATE10` ||`dd.mm.yyyy` ||
||`SDATE10` ||`yyyy/mm/dd` ||
For more information about date and time formats, see [[SPSS/DataFormats#Print_Formats|here]].
Line 55: Line 34:
Note that `MMM` appears as `JAN`, not `001`.

Note that `DATE11` and `DATE9` differ in the representation of years ''(4 and 2 digits respectively)''. The American, European, and Sortable date formats have a similar feature by swapping `[AES]DATE10` with `[AES]DATE8`.

See [[SPSS/DateTimeFunctions|here]] for the built-in library of date and time functions.
See [[SPSS/DatetimeFunctions|here]] for the built-in library of date and time functions.
Line 67: Line 42:
'''String''' data is stored at a fixed length. This length is defined and adjusted through the format; `Aw` where `w` is the width. String data is stored at a fixed length.
Line 69: Line 44:
A string variable can only be explicitly declared, as by the [[SPSS/String|STRING]] command.
Line 70: Line 46:

=== String Formats ===

To explicitly re-size a string, try:

{{{
alter type VAR (a100).
}}}

To re-size a string to the smallest possible length (without losing data), try:

{{{
alter type VAR (a=amin).
}}}

The size does ''not'' automatically grow to accept longer values, and string expressions do not automatically strip trailing whitespace. Consider the below example:

{{{
data list list / numeric_zip_code (F5).
begin data
12345
1234
123
end data.

string padding (A2).
if numeric_zip_code lt 10000 padding="0".
if numeric_zip_code lt 1000 padding="00".
string zip_code (A5).
compute zip_code=concat(padding, string(numeric_zip_code,F5)).
execute.
}}}

This will produce:

||'''numeric_zip_code'''||'''padding'''||'''zip_code'''||
||`12345` ||`"  "` ||`"  123"` ||
||`1234` ||`"0 "` ||`"0  12"` ||
||`123` ||`"00"` ||`"00  1"` ||

The `concat` function is returning a 7-long string value, and SPSS is silently truncating it to 5-long before storing in `zip_code`.

There are better ways to zero-pad a string, but fundamentally the mistake is not wrapping string expressions in `ltrim` and `rtrim` calls.
Generally, string data uses the string format (`Aw` where `w` is the field width) for displaying data. See [[SPSS/DataFormats#Print_Formats|here]] for further information.
Line 118: Line 52:
== String Literals == === String Literals ===
Line 120: Line 54:
String literals are declared by wrapping a string value in quotes, either single (') or double ("). String literals are declared by wrapping a string value in quotes, either single (`'`) or double (`"`).
Line 122: Line 56:
Quote marks within strings can be handled in one of two ways: either use the opposite quote mark to define the string, or escape the quote mark. Quote marks within strings can be handled in one of two ways: either use the opposite quote mark to wrap the string, or escape the quote mark by doubling it.

SPSS Data Types

SPSS exposes numeric and string data types.

Certain specialized forms of data are handled by formats that impact visualization and export, not storage. The primary example of this is date and time data.


Numeric Data

Numeric data is stored as double-precision floating point.

Generally, numeric data uses either a numeric format (Fw.d where w is the field width and d is the number of visible decimal places) or a restricted numeric format (Nw) for displaying data. See here for further information. The default format, which will be applied to any implicitly declared variables, is F8.2.
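
The split between the stored value and its displayed form can be sketched in Python. This is only an analogy: Python's fixed-point format specifier stands in for Fw.d, and the helper name display_f is hypothetical, not something SPSS provides.

```python
def display_f(value: float, width: int, decimals: int = 0) -> str:
    """Render a float in a fixed-width field, loosely mimicking Fw.d (an assumption)."""
    return format(value, f"{width}.{decimals}f")


stored = 123.45                            # the stored double is never modified
shown_default = display_f(stored, 8, 2)    # analogous to F8.2
shown_whole = display_f(stored, 8)         # analogous to F8

print(repr(shown_default))
print(repr(shown_whole))
print(stored)
```

Changing the width or the number of decimals changes only the rendered text; the variable named stored still holds the full value.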

Date and Time Data

Dates are stored as the number of seconds from midnight, October 14, 1582 to midnight on the specified date. Generally the DATE10 format is used for displaying data.

Datetimes are stored as the number of seconds from midnight, October 14, 1582 to the specified time on the specified date. Generally the DATETIME20 format is used for displaying data.

Times are stored as the number of seconds. This can be imagined as the number of seconds from midnight to the specified time, but that is not a necessary construct. Generally the TIME8 format is used for displaying data.

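Keep in mind that 1 day = 60 (seconds per minute) * 60 (minutes per hour) * 24 (hours per day) = 86400 seconds, so shifting a date by whole days means adding or subtracting multiples of 86400.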
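
The storage rule for dates and datetimes can be sketched in Python as plain arithmetic. The names SPSS_EPOCH and to_spss_seconds are hypothetical, and the sketch assumes Python's proleptic Gregorian calendar agrees with SPSS for moments on or after the epoch.

```python
from datetime import datetime

SPSS_EPOCH = datetime(1582, 10, 14)    # midnight, October 14, 1582
SECONDS_PER_DAY = 60 * 60 * 24         # 86400


def to_spss_seconds(moment: datetime) -> int:
    """Seconds elapsed from the epoch to the given moment."""
    return int((moment - SPSS_EPOCH).total_seconds())


one_day_later = to_spss_seconds(datetime(1582, 10, 15))    # a date: midnight
noon_on_epoch = to_spss_seconds(datetime(1582, 10, 14, 12))  # a datetime
unix_epoch = to_spss_seconds(datetime(1970, 1, 1))

# Dividing a date value by 86400 recovers whole days since the epoch.
days_to_unix_epoch = unix_epoch // SECONDS_PER_DAY

print(one_day_later, noon_on_epoch, unix_epoch, days_to_unix_epoch)
```

A date value is simply a datetime value whose time part is midnight, which is why the same function covers both.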

For more information about date and time formats, see here.

See here for the built-in library of date and time functions.


String Data

String data is stored at a fixed length.

A string variable must be declared explicitly, for example with the STRING command.

Generally, string data uses the string format (Aw where w is the field width) for displaying data. See here for further information.
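
What fixed-length storage implies can be sketched in Python. The helper store_fixed_width is hypothetical, and the sketch assumes that short values are right-padded with spaces while long values are silently truncated to the declared width.

```python
def store_fixed_width(value: str, width: int) -> str:
    """Mimic storing a value in a string variable of format Aw (assumed behavior)."""
    return value[:width].ljust(width)


padding = store_fixed_width("0", 2)                  # an A2 variable holding "0"
# Concatenating the padded value keeps its trailing space, so the
# result overflows an A5 variable and loses its last characters.
zip_code = store_fixed_width(padding + " 1234", 5)

print(repr(padding))
print(repr(zip_code))
```

Under these assumptions, trimming a string expression before storing it is what prevents padding from pushing meaningful characters past the end of the field.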

See here for the built-in library of string functions.

String Literals

String literals are declared by wrapping a string value in quotes, either single (') or double (").

Quote marks within strings can be handled in one of two ways: either use the opposite quote mark to wrap the string, or escape the quote mark by doubling it.

variable label var1 "Say ""Hello!""".
variable label var2 'Don''t say "Goodbye!"'.
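
The doubling rule is mechanical enough to sketch in Python, for instance when generating syntax from another program. The function spss_quote is a hypothetical helper, not part of SPSS; it reproduces the two literals shown above.

```python
def spss_quote(text: str, quote: str = '"') -> str:
    """Wrap text in the chosen quote mark, doubling any embedded occurrence of it."""
    if quote not in ("'", '"'):
        raise ValueError("quote must be a single or double quote mark")
    return quote + text.replace(quote, quote * 2) + quote


label_double = spss_quote('Say "Hello!"')
label_single = spss_quote("Don't say \"Goodbye!\"", quote="'")

print(label_double)
print(label_single)
```

Only the quote mark used as the wrapper needs doubling; the opposite mark passes through unchanged.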



SPSS/DataTypes (last edited 2023-06-13 05:00:25 by DominicRicottone)