Stata Merge

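The merge command joins an external dataset (the "using" dataset, in Stata's terms) to the active dataset (the "master" dataset), matching cases on one or more key variables.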


Usage

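One-to-One

To merge two datasets in which the key variable uniquely identifies cases on both sides, try:
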
merge 1:1 ID using "external_dataset.dta"
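
A minimal, self-contained sketch of a one-to-one merge follows. The variable names score and age, the data values, and the temporary file are made up for illustration; only the merge line itself mirrors the command above. Because tempfile names are local macros, this should be run as a do-file in one go rather than line by line.

```stata
* Build a small external dataset and save it to a temporary file
clear
input ID score
1 90
2 85
4 70
end
tempfile external
save `external'

* Build the active dataset
clear
input ID age
1 30
2 41
3 25
end

* ID identifies at most one case in each dataset, so 1:1 applies
merge 1:1 ID using `external'
tabulate _merge
```

With this toy data, IDs 1 and 2 appear in both datasets, ID 3 only in the active dataset, and ID 4 only in the external dataset, so the merged dataset should hold four cases.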

Many-to-One

To merge a lookup table with the active dataset, try:

merge m:1 GROUPVAR using "lookup_table.dta"
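
As an illustration of the many-to-one case, the sketch below builds a hypothetical lookup table keyed on GROUPVAR and attaches it to an active dataset in which GROUPVAR repeats. The variable region and all values are assumptions made for the example. Run it as a do-file so the tempfile macro stays defined.

```stata
* Lookup table: one row per GROUPVAR value
clear
input GROUPVAR str10 region
1 "North"
2 "South"
end
tempfile lookup
save `lookup'

* Active dataset: GROUPVAR may repeat across cases
clear
input ID GROUPVAR
1 1
2 1
3 2
4 3
end

* Many cases in the active dataset share one row of the lookup table
merge m:1 GROUPVAR using `lookup'
list
```

Here GROUPVAR value 3 has no row in the lookup table, so the case with ID 4 should be left with an empty region and flagged as sourced only from the active dataset.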


Result Variable

The merge command creates a _merge variable specifying the result of the merge on a case-by-case basis. This variable can be renamed by specifying the generate(varname) option.
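
One situation where generate(varname) is useful is a sequence of merges: a second merge fails if a variable named _merge already exists. The sketch below is one way to handle that, using made-up datasets and the hypothetical names merge_scores and merge_heights. Run it as a do-file so the tempfile macros stay defined.

```stata
* First external dataset
clear
input ID score
1 90
2 85
end
tempfile scores
save `scores'

* Second external dataset
clear
input ID height
2 170
3 160
end
tempfile heights
save `heights'

* Active dataset
clear
input ID age
1 30
2 41
3 25
end

* Give each merge its own result variable so the two do not collide
merge 1:1 ID using `scores', generate(merge_scores)
merge 1:1 ID using `heights', generate(merge_heights)
tabulate merge_scores merge_heights
```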

The result groups are:

  1. Case is sourced only from the active dataset
  2. Case is sourced only from the external dataset
  3. Case matched between datasets, and the external values were set for any variables found only in the external dataset
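  4. As with 3, except that missing values in the active dataset were updated with non-missing values from the external dataset
  5. As with 3, except that at least one variable held conflicting non-missing values in the two datasets. The active dataset's values are retained unless the replace option is specified, in which case non-missing external values overwrite them (missing values never overwrite non-missing values)

4 and 5 are only possible if the update option is specified. The replace option requires update and changes how conflicts in group 5 are resolved, not whether group 5 can occur.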
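
To make groups 4 and 5 concrete, the sketch below merges two toy datasets that share a variable x, first with update alone and then with update replace. The variable x and all values are invented for the example. Run it as a do-file so the tempfile macros stay defined.

```stata
* External dataset
clear
input ID x
1 10
2 6
3 7
end
tempfile external
save `external'

* Active dataset: x is missing for ID 1 and conflicts for ID 2
clear
input ID x
1 .
2 5
3 7
end
tempfile active
save `active'

* update alone: fills the missing value, keeps the conflicting active value
merge 1:1 ID using `external', update
list ID x _merge

* update replace: the conflicting active value is overwritten as well
use `active', clear
merge 1:1 ID using `external', update replace
list ID x _merge
```

In both runs ID 1 should land in group 4 and ID 2 in group 5. The difference is the value of x for ID 2, which should stay 5 after the first merge and become 6 after the second.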

Assert and Keep

The assert(groups) option raises an error if the merge command results in any cases outside of the specified groups. For example, the following command would raise an error if any cases in the external dataset did not match the active dataset.

merge 1:1 ID using "external_dataset.dta", assert(1 3)

The keep(groups) option selects the cases that will be kept after the merge command. merge ..., keep(3) is equivalent to running the merge and then keep if _merge==3.
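
The sketch below exercises both options on toy data. The variable names score and age and all values are assumptions for illustration. capture noisily is used so that the expected assert failure does not stop the do-file, and the active dataset is reloaded afterwards because a failed assert may leave the merged result in memory.

```stata
* External dataset, including an ID that is absent from the active dataset
clear
input ID score
1 90
2 85
4 70
end
tempfile external
save `external'

* Active dataset
clear
input ID age
1 30
2 41
3 25
end
tempfile active
save `active'

* ID 4 exists only in the external dataset, so assert(1 3) should fail
capture noisily merge 1:1 ID using `external', assert(1 3)
display "return code: " _rc

* Reload the active dataset, then keep only the matched cases
use `active', clear
merge 1:1 ID using `external', keep(3)
count
```

After the final merge, only IDs 1 and 2 should remain.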


See also

Stata manual for merge



Stata/Merge (last edited 2023-06-08 01:13:08 by DominicRicottone)