Differences between revisions 2 and 4 (spanning 2 versions)
Revision 2 as of 2023-06-07 20:19:41
Size: 1782
Comment:
Revision 4 as of 2025-10-24 17:28:17
Size: 1850
Comment: Rewrite
Stata Merge

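-merge- joins an external dataset (the "using" dataset in Stata's terms) to the active dataset (the "master" dataset) by matching cases on one or more key variables.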


Usage

To join two datasets 1:1, try:

merge 1:1 ID using "external_dataset.dta"
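
A 1:1 merge requires that ID uniquely identify cases in both datasets. As a minimal sketch, the active dataset can be checked first; active_dataset.dta is a hypothetical file name, and isid exits with an error if ID does not uniquely identify the cases:

use "active_dataset.dta", clear
isid ID
merge 1:1 ID using "external_dataset.dta"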

To join a lookup table with the active dataset, try:

merge m:1 GROUPVAR using "lookup_table.dta"
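
If only some columns of the lookup table are needed, the keepusing(varlist) option limits which external variables are brought in. A sketch, assuming the lookup table holds a hypothetical variable named GROUPNAME:

merge m:1 GROUPVAR using "lookup_table.dta", keepusing(GROUPNAME)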


Result Variable

-merge- creates a _merge variable specifying the result of the merge on a case-by-case basis. This variable can be renamed by specifying the generate(varname) option.
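
For example, to store the result in a variable named merge_result (a name chosen here purely for illustration) and then count the cases in each group:

merge 1:1 ID using "external_dataset.dta", generate(merge_result)
tabulate merge_result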

The result groups are:

  1. Case is sourced only from the active dataset
  2. Case is sourced only from the external dataset
  3. Case matched between datasets, and the external values were set for any variables found only in the external dataset
  4. As with 3, except missing values in the active dataset were also updated with non-missing values from the external dataset
  5. As with 3, except at least one shared variable held conflicting non-missing values in the two datasets. The active dataset's values are kept unless the replace option is specified, in which case the external values overwrite them wherever possible (i.e. missing values cannot overwrite non-missing values)

4 and 5 are only possible if the update option is specified. The replace option requires update and only changes how the conflicts in group 5 are resolved.
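
As a brief sketch of these options, reusing the file name from the usage examples: the first command below fills in missing values of shared variables from the external dataset, and the second also lets non-missing external values overwrite the active dataset. Stata only accepts replace together with update. The two commands are alternatives, not a sequence, since the second would fail while the _merge variable from the first still exists.

merge 1:1 ID using "external_dataset.dta", update

merge 1:1 ID using "external_dataset.dta", update replace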

Assert and Keep

The assert(groups) option raises an error if the merge results in any cases outside of the specified groups. For example, the following command would raise an error if any cases in the external dataset did not match the active dataset.

merge 1:1 ID using "external_dataset.dta", assert(1 3)
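
Groups can also be given by name rather than by number; master, using and match correspond to groups 1, 2 and 3. The same check written with names:

merge 1:1 ID using "external_dataset.dta", assert(master match)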

The keep(groups) option selects the cases that will be kept after the merge. For example, merge ..., keep(3) is equivalent to running the merge and then keep if _merge==3.
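
A common pattern with lookup tables is to keep every case from the active dataset whether or not it matched, while dropping lookup rows that matched nothing. A sketch of this, with nogenerate added to skip creating _merge since the filtering is already done:

merge m:1 GROUPVAR using "lookup_table.dta", keep(1 3) nogenerate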


See also

Stata manual for -merge-: https://www.stata.com/manuals/dmerge.pdf


CategoryRicottone

Stata/Merge (last edited 2025-10-24 17:28:17 by DominicRicottone)