Stata System Variables

Stata system variables expose details about Stata's current operating context through 'underscore variables'. Their values depend entirely on that context, so take care when programming around them.


System Variables

_N stores the number of cases in the current data set. Under a by prefix, it is instead the number of cases in the current by-group. An if or in qualifier does not change it.

_n is the number of the current case, counting from 1 in the data set's current order. Under a by prefix, it similarly restarts at 1 within each by-group.
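A minimal sketch of how both variables behave with and without a by prefix, using a small made-up data set. The variable names group, value, row, size and seq are illustrative only.

```stata
* Build a three-case toy data set: two cases in group "a", one in "b".
clear
input str1 group value
"a" 10
"a" 20
"b" 30
end

* Without by, _N is the total case count and _n the case number.
display _N
generate row = _n

* The stable option keeps tied cases in their existing order.
sort group, stable

* Under by, both are evaluated within each by-group.
by group: generate size = _N
by group: generate seq = _n

* row is 1 2 3, size is 2 2 1, seq is 1 2 1.
list
```

The lines starting with * are comments, so the sketch can be run as a do-file.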

To create an 8-digit unique identifier number, try:

generate double UniqueID = 10000000 + _n
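A hedged sketch that builds the identifier on five empty cases and then checks it. The isid check, the %8.0f display format and the range assertion are additions made here for illustration, not part of the recipe above.

```stata
* Five empty cases stand in for a real data set.
clear
set obs 5

generate double UniqueID = 10000000 + _n

* Display the ID as a plain 8-digit integer.
format UniqueID %8.0f

* isid exits with an error unless UniqueID uniquely identifies the cases.
isid UniqueID

* The ID stays at 8 digits only while there are fewer than 90,000,000 cases.
assert inrange(UniqueID, 10000001, 99999999)
```

Because _n follows the current order of the data, the same case only receives the same identifier if the data are in the same order each time.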

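_N and _n can be combined in a simple deduplication implementation, which sets dup to 0 for a case with a unique KEYVAR and otherwise numbers the cases sharing a KEYVAR from 1: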

sort KEYVAR
by KEYVAR: generate dup=cond(_N==1,0,_n)
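As an illustration, the same two lines can be run on a small made-up data set in which "a" is unique, "b" appears twice and "c" three times. The final drop step is one possible way to act on dup and is an assumption here, not part of the original recipe.

```stata
* Toy data with one unique key and two duplicated keys.
clear
input str1 KEYVAR
"a"
"b"
"b"
"c"
"c"
"c"
end

sort KEYVAR
by KEYVAR: generate dup = cond(_N==1, 0, _n)

* dup is 0 for "a", 1 2 for "b", and 1 2 3 for "c".
list

* Keep unique cases (dup==0) and the first case of each duplicated key (dup==1).
drop if dup > 1
assert _N == 3
```

Which of several tied cases comes first after sort is not guaranteed, so this is only safe when the duplicated cases are interchangeable.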

_rc stores the return code of the most recent command or program run under capture. It is commonly checked like this:

capture noisily assert dup==0
if (_rc!=0) {
  display "There are duplicates!"
}

Statistical programmers should note that Stata follows the same convention as POSIX exit statuses: 0 indicates success, and any other integer value indicates an error.
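Since each capture overwrites _rc, it can help to copy the value into a local macro before branching on it. A minimal sketch, assuming a made-up variable x and a local macro named rc:

```stata
clear
set obs 3
generate x = _n

* x < 3 is false for the third case, so assert fails. capture suppresses
* the error and leaves the return code in _rc.
capture assert x < 3

* Copy _rc right away: the next capture would overwrite it.
local rc = _rc

if (`rc' == 0) {
    display "assertion holds"
}
else if (`rc' == 9) {
    * 9 is the return code assert gives for a false assertion.
    display "assertion is false"
}
else {
    * Anything else is unexpected, so pass the error on.
    display as error "unexpected return code `rc'"
    exit `rc'
}
```

Branching on the specific code separates "the assertion is false" from other failures, such as a misspelled variable name.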


Model Variables

Several system variables are stored for the most recently estimated model, among them the coefficients, _b[VAR].
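A short sketch of reading these values after a single-equation model. It assumes the auto example data set that ships with Stata. Besides _b[], it uses _se[] for standard errors and _cons for the intercept, neither of which is covered in the text above.

```stata
* auto.dta is an example data set shipped with Stata.
sysuse auto, clear
regress price weight

* Coefficient and standard error of weight, and the intercept.
display _b[weight]
display _se[weight]
display _b[_cons]

* A t statistic computed from the stored values.
display _b[weight] / _se[weight]
```

These values are replaced as soon as another model is estimated, so copy anything needed later into a scalar or local macro first.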

In multiple-equation models, additional syntax is needed to indicate the equation number. It can be given either in brackets preceding the system variable ([#2]_b[VAR]) or inside the brackets, preceding the variable name (_b[#2:VAR]). If no equation number is given, #1 is implied. A single-equation model has only equation #1, so the equation number is generally omitted.

This syntax enables several aliases. All of the following are equivalent:

_b[VAR]
_coef[VAR]
[#1]_b[VAR]
[#1]_coef[VAR]
[#1][VAR]
_b[#1:VAR]
_coef[#1:VAR]
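The equation syntax can be tried on a two-equation model. This sketch assumes the auto example data set that ships with Stata and uses sureg purely as a convenient multiple-equation estimator. It exercises only the _b and _coef forms.

```stata
sysuse auto, clear

* Two equations: price on weight (#1) and mpg on weight (#2).
sureg (price weight) (mpg weight)

* Both notations address the same coefficient in equation #2.
assert [#2]_b[weight] == _b[#2:weight]
assert [#2]_coef[weight] == _coef[#2:weight]

* With no equation given, #1 is implied.
assert _b[weight] == _b[#1:weight]

display [#1]_b[weight]
display [#2]_b[weight]
```

If any pair of forms referred to different coefficients, the corresponding assert would stop the do-file with an error.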

