Stata System Variables
Stata exposes internal data through system variables. These are sometimes also referred to as underscore variables.
System Variables
_N stores the number of cases (observations) in the current data set. Under a by structure (the by or bysort prefix), _N is evaluated separately for each by-group and gives the number of cases in that group instead.
This enables the direct computation of rates. For example:
count if error==1
assert (r(N)/_N) < .1
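The difference between the overall and the by-group value of _N can be sketched as follows. This is a minimal illustration that assumes the auto example data set shipped with Stata; the variable names group_size and group_share are hypothetical.

```stata
* Load the example data set shipped with Stata (an assumption for this sketch)
sysuse auto, clear

* Outside of by, _N is the total number of cases
display _N

* Under by, _N is the number of cases in the current by-group
bysort foreign: generate group_size = _N

* Back outside of by, _N is the overall total again,
* so this is each group's share of the data set
generate double group_share = group_size / _N
```

Because the second generate is not gated by by, its _N is the full data set, which is what makes the ratio a share.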
_n refers to a case's index in the current data set, counting from 1. Under a by structure, _n similarly restarts at 1 within each by-group.
To create an 8-digit unique identifier number, try:
generate double UniqueID = 10000000 + _n
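Under a by structure the same expression numbers cases within each group. The sketch below assumes the auto example data set shipped with Stata; rank_in_group and cheapest are hypothetical variable names.

```stata
* Load the example data set shipped with Stata (an assumption for this sketch)
sysuse auto, clear

* Sort by foreign, then by price within foreign;
* _n then counts 1, 2, 3, ... within each value of foreign
bysort foreign (price): generate rank_in_group = _n

* Flag the first (lowest-priced) case in each group
bysort foreign (price): generate byte cheapest = (_n == 1)
```

The variable in parentheses only controls the sort order within groups; it does not define additional groups.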
_rc stores the return code of the last command or program. This would commonly be accessed like:
capture assert duplicate==0
if (_rc!=0) {
    display "There are duplicates!"
}
Note that Stata follows the POSIX convention for return codes: 0 indicates success, and any other integer value indicates an error.
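Since every captured command overwrites _rc, it can help to copy the value into a local macro before running anything else. The following is a minimal sketch, assuming the auto example data set; no_such_var is a hypothetical variable name chosen because it does not exist in that data set.

```stata
* Load the example data set shipped with Stata (an assumption for this sketch)
sysuse auto, clear

* capture suppresses the error and leaves the return code in _rc
capture confirm variable no_such_var

* Save the code before another command replaces it
local rc = _rc

if (`rc' != 0) {
    display "confirm failed with return code `rc'"
}

* A captured command that succeeds sets _rc to 0
capture confirm variable price
display _rc
```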
Model Variables
After a model is estimated, several system variables refer to the most recent results:
_b[VAR] is the coefficient for VAR
_coef[VAR] is a synonym for _b[VAR]
_se[VAR] is the standard error for VAR
_cons is 1 whenever accessed directly, but when accessed indirectly (as through _b[_cons]) it refers to the model's constant term, whose value varies from model to model
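As an illustration, these variables can be combined to reproduce quantities that Stata otherwise reports itself. The sketch below assumes the auto example data set and an arbitrary linear regression; xb_manual and xb_predict are hypothetical variable names.

```stata
* Load the example data set shipped with Stata (an assumption for this sketch)
sysuse auto, clear

* Fit a single-equation model
regress price mpg weight

* Coefficient, standard error, and their ratio (the t statistic) for mpg
display _b[mpg]
display _se[mpg]
display _b[mpg] / _se[mpg]

* The constant term is reached through _b[_cons]
display _b[_cons]

* Rebuild the linear prediction by hand and compare with predict
generate double xb_manual = _b[_cons] + _b[mpg]*mpg + _b[weight]*weight
predict double xb_predict, xb
```

The hand-built prediction should agree with predict up to floating-point rounding.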
In the context of multiple-equation models, an additional bit of syntax is necessary to indicate the equation number. This can either be specified in brackets preceding the system variable ([#2]_b[VAR]), or inside the brackets preceding the variable specification (_b[#2:VAR]). If an equation number is not specified, #1 is implied. In the context of a single-equation model, #1 is the only valid reference and generally is not specified.
There are several aliases enabled by this syntax. All of the following are equivalent.
_b[VAR]
_coef[VAR]
[#1]_b[VAR]
[#1]_coef[VAR]
[#1][VAR]
_b[#1:VAR]
_coef[#1:VAR]
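A short sketch of the equation syntax in a multiple-equation setting follows. It assumes the auto example data set and uses sureg only as a convenient two-equation model. The last form, which refers to the equation by name (sureg names each equation after its dependent variable), is an addition not covered in the text above.

```stata
* Load the example data set shipped with Stata (an assumption for this sketch)
sysuse auto, clear

* Two equations: #1 has price as its outcome, #2 has mpg
sureg (price foreign weight length) (mpg foreign weight)

* With no equation given, #1 is implied
display _b[foreign]
display [#1]_b[foreign]

* Two equivalent ways to reach the second equation by number
display [#2]_b[foreign]
display _b[#2:foreign]

* Standard errors accept the same equation prefix
display [#2]_se[foreign]

* Referring to the second equation by name rather than by number
display [mpg]_b[foreign]
```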