Stata System Variables
Stata exposes internal data through system variables. These are sometimes also referred to as underscore variables.
System Variables
_N stores the number of cases in the current data set. Under a by structure, it instead holds the number of cases in the current by-group.
This enables the direct computation of rates. For example:
count if error==1
assert (r(N)/_N) < .1
_n refers to a case's index in the current data set, which depends on the current sort order. Under a by structure, it restarts at 1 within each by-group.
To create an 8-digit unique identifier number, try:
generate double UniqueID = 10000000 + _n
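The by-group behavior of _N and _n can be sketched as follows. This is a minimal illustration, assuming the auto data set that ships with Stata; the variable names group_size, price_rank, and duplicate are made up for the example, and rep78 merely stands in for a key variable.

```stata
* Load the example data set that ships with Stata
sysuse auto, clear

* Within each by-group, _N is the group size and _n restarts at 1
bysort foreign (price): generate group_size = _N
bysort foreign (price): generate price_rank = _n

* Flag repeated key values: 0 if the key is unique, otherwise a running count
bysort rep78: generate duplicate = cond(_N==1, 0, _n)

* Show the cheapest car in each group along with the group size
list foreign price group_size price_rank if price_rank==1
```

Because _n depends on sort order, the sort variables are spelled out in the bysort prefix rather than left implicit.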
_rc stores the return code of the last command or program. This would commonly be accessed like:
capture assert duplicate==0
if (_rc!=0) {
    display "There are duplicates!"
}
Statistical programmers should be advised that Stata follows the POSIX model for return values: 0 indicates success, any other integer value indicates an error.
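Another common use of _rc is to test whether something exists before creating it. The following is only a sketch, assuming the auto data set and reusing the UniqueID variable from the earlier example:

```stata
* Load the example data set that ships with Stata
sysuse auto, clear

* confirm variable fails if UniqueID does not exist; capture stores the result in _rc
capture confirm variable UniqueID
if (_rc!=0) {
    * Nonzero return code: the variable is missing, so create it
    generate double UniqueID = 10000000 + _n
}
```

Note that _rc is overwritten by the next captured command, so it should be checked (or copied into a local macro) immediately.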
Model Variables
For the most recent model, several system variables are stored:
_b[VAR] is the coefficient for VAR
_coef[VAR] is a synonym for _b[VAR]
_se[VAR] is the standard error for VAR
_cons is 1 whenever accessed directly, but refers to the model's constant term when accessed indirectly (as through _b[_cons])
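As a minimal sketch of how these are used after estimation, assuming the auto data set that ships with Stata and an arbitrary regression of price on weight:

```stata
* Load the example data set and fit a simple linear model
sysuse auto, clear
regress price weight

* Coefficient and standard error for weight
display _b[weight]
display _se[weight]

* The estimated constant term
display _b[_cons]

* A derived quantity: the t statistic for weight
display _b[weight] / _se[weight]
```

These values are replaced each time a new model is estimated, so anything needed later should be saved first.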
In the context of multiple-equation models, an additional bit of syntax is necessary to indicate the equation number. This can either be specified in brackets preceding the system variable ([#2]_b[VAR]), or inside the brackets preceding the variable specification (_b[#2:VAR]). If an equation number is not specified, #1 is implied. In the context of a single-equation model, #1 is the only valid reference and generally is not specified.
There are several aliases enabled by this syntax. All of the following are equivalent.
_b[VAR]
_coef[VAR]
[#1]_b[VAR]
[#1]_coef[VAR]
[#1][VAR]
_b[#1:VAR]
_coef[#1:VAR]
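The equation-number syntax can be tried with any multiple-equation estimator. The sketch below assumes the auto data set and uses sureg with an arbitrary pair of equations chosen purely for illustration:

```stata
* Load the example data set and fit a two-equation model
sysuse auto, clear
sureg (price weight length) (mpg weight displacement)

* Two ways to write the coefficient on weight in the second equation
display [#2]_b[weight]
display _b[#2:weight]

* With no equation number, #1 is implied
display _b[weight]
display [#1]_b[weight]
```

Each pair of display commands should print the same value.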