Stata Merge
The merge command joins an external ("using") dataset to the active ("master") dataset by matching cases on one or more key variables.
Usage
One-to-One
To merge two datasets in which each ID value appears at most once, try:
merge 1:1 ID using "external_dataset.dta"
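A minimal sketch of a one-to-one merge, assuming it is run as a do-file (tempfile names are local macros and do not survive line-by-line interactive use). The variables score and age and the values entered are made up for illustration; only ID comes from the command above.

```stata
* Build a small external dataset and save it to a temporary file
clear
input ID score
1 90
2 85
4 70
end
tempfile external
save `external'

* Build the active dataset
clear
input ID age
1 30
2 41
3 25
end

* One-to-one merge on ID; ID must be unique in both datasets
merge 1:1 ID using `external'

* Inspect how each case was sourced
tabulate _merge
list ID age score _merge
```

In this sketch IDs 1 and 2 appear in both datasets, ID 3 only in the active dataset, and ID 4 only in the external dataset, so the merged data holds four cases.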
Many-to-One
To merge a lookup table with the active dataset, try:
merge m:1 GROUPVAR using "lookup_table.dta"
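The many-to-one case can be sketched as follows, again assuming a do-file. The lookup table contents, the groupname variable, and the ID values are hypothetical; the point is that GROUPVAR may repeat in the active dataset but must be unique in the lookup table.

```stata
* Lookup table: one row per GROUPVAR value
clear
input GROUPVAR str10 groupname
1 "control"
2 "treated"
end
tempfile lookup
save `lookup'

* Active dataset: several cases share the same GROUPVAR value
clear
input ID GROUPVAR
1 1
2 1
3 2
4 2
end

* Each case picks up the groupname of its group
merge m:1 GROUPVAR using `lookup'
list ID GROUPVAR groupname _merge
```

If GROUPVAR were duplicated in the lookup table, merge m:1 would stop with an error rather than guess which row to use.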
Result Variable
The merge command creates a _merge variable specifying the result of the merge on a case-by-case basis. This variable can be renamed by specifying the generate(varname) option.
The result groups are:
1. Case is sourced only from the active dataset
2. Case is sourced only from the external dataset
3. Case matched between datasets, and the external values were set for any variables found only in the external dataset
4. As with 3, except that missing values in the active dataset were updated with non-missing values from the external dataset
5. As with 3, except that at least one shared variable held conflicting non-missing values in the two datasets. The active dataset's values are kept unless the replace option is specified, in which case the external values overwrite them (missing values never overwrite non-missing values)
4 and 5 are only possible if the update option is specified. The replace option can only be used together with update.
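The effect of update and replace on a shared variable can be sketched as below. This assumes a do-file; the variable x and its values are invented so that one case agrees, one has a missing value in the active dataset, and one conflicts.

```stata
* External dataset
clear
input ID x
1 10
2 20
3 30
end
tempfile external
save `external'

* Active dataset: x is missing for ID 2 and conflicts for ID 3
clear
input ID x
1 10
2 .
3 99
end

* update alone: the missing x is filled in, the conflicting x is left alone
preserve
merge 1:1 ID using `external', update
list ID x _merge
restore

* update replace: the conflicting x is also overwritten by the external value
merge 1:1 ID using `external', update replace
list ID x _merge
```

In both runs ID 1 should land in group 3, ID 2 in group 4, and ID 3 in group 5; the runs differ only in whether x for ID 3 ends up as the active or the external value.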
Assert and Keep
The assert(groups) option raises an error if the merge command results in any cases outside of the specified groups. For example, the following command would raise an error if any cases in the external dataset did not match the active dataset.
merge 1:1 ID using "external_dataset.dta", assert(1 3)
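A sketch of a failing assertion, assuming a do-file and made-up data in which ID 4 exists only in the external dataset. capture is used here only so the do-file keeps running and the return code can be inspected.

```stata
* External dataset contains ID 4, which the active dataset lacks
clear
input ID score
1 90
2 85
4 70
end
tempfile external
save `external'

* Active dataset
clear
input ID age
1 30
2 41
3 25
end

* ID 4 would fall in group 2, which assert(1 3) does not allow
capture noisily merge 1:1 ID using `external', assert(1 3)
local rc = _rc
display "merge return code: `rc'"
```

Without capture, the do-file would stop at the merge command, which is usually the desired behavior when the assertion encodes an expectation about the data.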
The keep(groups) option selects the cases that will be kept after the merge command. merge ..., keep(3) is equivalent to running the merge and then keep if _merge==3.
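The equivalence can be checked with a small sketch, assuming a do-file and hypothetical data. Both variants should leave the same number of cases in memory.

```stata
* External dataset
clear
input ID score
1 90
2 85
4 70
end
tempfile external
save `external'

* Active dataset
clear
input ID age
1 30
2 41
3 25
end

* Variant A: let merge drop the unmatched cases
preserve
merge 1:1 ID using `external', keep(3)
count
local n_option = r(N)
restore

* Variant B: merge everything, then drop the unmatched cases by hand
merge 1:1 ID using `external'
keep if _merge == 3
count
local n_manual = r(N)

display "keep(3): `n_option' cases, keep if: `n_manual' cases"
```

A common variation is keep(1 3), which retains every case from the active dataset whether or not it matched. The groups can also be written as words, for example keep(master match).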