
Stata Data Formats


Display Formats

Default Formats

The default display format for each numeric data type is as follows:

Type      Format
double    %10.0g
float     %9.0g
long      %12.0g
int       %8.0g
byte      %8.0g
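
One way to check these defaults is to create a variable of each storage type and inspect it. This is a minimal sketch; the variable names (x_byte and so on) are placeholders chosen for illustration.

```stata
* Build a one-observation dataset with one variable per numeric type.
clear
set obs 1
generate byte   x_byte   = 1
generate int    x_int    = 1
generate long   x_long   = 1
generate float  x_float  = 1
generate double x_double = 1

* describe reports the storage type and display format of each variable.
describe

* The display format can also be read into a local macro.
local fmt : format x_double
display "`fmt'"
```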

Numeric Formats

The numeric formats are e, f, and g. The general format (g) lets Stata shift the number of decimal places shown so that the value stays readable within the given width. The fixed format (f) always shows a fixed number of decimal places. The exponential format (e) shows the value in scientific notation.

Value      With format %9.4g    With format %9.2f    With format %9.2e
3.14159    3.142                3.14                 3.14e+00
314.159    314.2                314.16               3.14e+02

A c can be appended to an f or g format (for example, %9.2fc) to indicate that commas should be shown.
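
The display command accepts a format directly, which makes it a convenient way to try these out. The following is a sketch only; the literal values are arbitrary examples.

```stata
* General format (g) in a 9-column field.
display %9.4g 3.14159
display %9.4g 314.159

* Fixed format (f) with 2 decimal places.
display %9.2f 3.14159
display %9.2f 314.159

* Exponential format (e) with 2 decimal places.
display %9.2e 314.159

* Fixed format with commas between thousands.
display %12.2fc 1234567.891
```

To apply a format to a variable rather than a single value, use the format command, as in format price %9.2fc (where price stands in for any numeric variable).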

String Formats

Alignment is controlled by the presence or absence of a negative sign (-) ahead of the width. A string variable formatted as %-18s is left-justified; with a format of %18s it is right-justified.
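
A short sketch of the two alignments, using display so the padding is visible. It assumes the auto example dataset that ships with Stata; the quoted string and the trailing bar are only there to mark where the 18-column field ends.

```stata
* Right-justified in an 18-column field.
display %18s "Buick Skylark" "|"

* Left-justified in the same width.
display %-18s "Buick Skylark" "|"

* Apply a string format to a variable and confirm it with describe.
sysuse auto, clear
format make %-18s
describe make
```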


List

The list command examines data to (re-)allocate text width. If the longest value for a string variable with format %18s is 12 characters long, then list will only allocate 12 columns for that variable. This behavior can be disabled using the nocompress option.

Because list must examine the data before printing anything, the default behavior slows output, especially for large datasets. For this reason there is a fast option, which is simply a synonym for nocompress.

To truncate string values specifically, use the string option.

list comment, string(10)
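
The options above can be compared side by side. This is a sketch that assumes the auto example dataset shipped with Stata, whose make variable is a string.

```stata
sysuse auto, clear

* Default: list examines the data and compresses column widths.
list make in 1/5

* Use the variable's display format as the column width instead.
list make in 1/5, nocompress

* Same request, spelled as fast.
list make in 1/5, fast

* Truncate string values to 10 characters.
list make in 1/5, string(10)
```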

String Value Alignment

The list command automatically shifts between two output modes based on the width of the listed variables and the width of the screen.

In table format, the list command right-justifies all string values.

In display format (the second output mode), string values are aligned according to each variable's own display format. A string value is left-justified if the variable has a format of %-18s.
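
Rather than relying on the automatic choice, either mode can be requested explicitly with the table and display options of list. A minimal sketch, again assuming the auto example dataset:

```stata
sysuse auto, clear
format make %-18s

* Force table format.
list make price in 1/3, table

* Force display format.
list make price in 1/3, display
```

Running both makes it easy to compare how the same variable is aligned in each mode.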

Variable Names

The list command also abbreviates variable names (defaulting to 8 characters). To increase that character limit, use the abbreviate option.

list very_long_variable_name, abbreviate(50)

Value Labels

The list command also uses labels (as opposed to values) when available. To override this behavior, use the nolabel option.

Value labels are aligned in the same way as string values: based on the output mode and the display format. Just as a string value would be left-justified if the variable had a format of %-18s, a label would be left-justified if the variable had a format of %-8g.
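
The label behavior can be sketched with the auto example dataset, assuming its foreign variable still carries the value label that ships with it.

```stata
sysuse auto, clear

* Labels are listed in place of the underlying values by default.
list make foreign in 1/3

* Show the numeric values instead.
list make foreign in 1/3, nolabel

* A leading minus sign in the numeric format requests left-justification.
format foreign %-8.0g
list make foreign in 1/3, display
```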



Stata/DataFormats (last edited 2025-03-05 02:10:25 by DominicRicottone)