Differences between revisions 1 and 4 (spanning 3 versions)
Revision 1 as of 2023-01-13 22:12:34
Size: 1701
Comment:
Revision 4 as of 2025-10-24 16:34:25
Size: 915
Comment: Rewrite
Deletions are marked like this. Additions are marked like this.
Line 3: Line 3:
Firstly the Stata commands are listed with an explanation of syntax. Secondly the equivalents of standard SQL joins are listed. Stata offers several commands for joining datasets.
Line 11: Line 11:
== Merge == == Full Join ==
Line 13: Line 13:
To join two data files using a unique key variable, try: If the left and right datasets may overlap, use [[Stata/Merge|-merge-]] to ensure that duplicates are not created.
Line 17: Line 17:
merge 1:1 keyvarlist using "right.dta" merge 1:1 KEYVARS using "right.dta"
Line 20: Line 20:
`keyvarlist` can be one or more variables, but all cases must be uniquely identified by the combination of values.

The `merge` command creates a `_merge` variable taking the values:

||'''Value'''||'''Meaning''' ||
||1 ||observation appeared in 'master' only ||
||2 ||observation appeared in 'using' only ||
||3 ||observation appeared in both ||
||4 ||observation appeared in both, the `update` option was specified, and missing values were updated ||
||5 ||observation appeared in both, the `update` option was specified, and non-missing values were updated||


=== Many-to-one ===

To merge a file to a lookup table, try:
If they are known to not overlap, consider using [[Stata/Append|-append-]] instead.
Line 37: Line 23:
use "left.dta"
merge m:1 key using "right.dta"
use "cohort1.dta"
append using "cohort2.dta"
Line 41: Line 27:
Note: the inverse operation can be accomplished with 1:m. See also [[Stata/FrLink|-frlink-]].
Line 47: Line 33:
== Append == == Left, Right, or Inner Joins ==
Line 49: Line 35:
To add cases from a file into the active dataset, try: Use [[Stata/Merge|-merge-]] as above and select cases based on the automatically created `_merge` variable.
Line 52: Line 38:
append using "second.dta" use "left.dta"
merge 1:1 KEYVARS using "right.dta"
keep if _merge==1 | _merge==3 // left join
keep if _merge==2 | _merge==3 // right join
keep if _merge==3 // inner join
Line 54: Line 44:

----



== Joins ==

=== Full Join ===

=== Left Join ===

=== Right Join ===

=== Inner Join ===

Joining Data with Stata

Stata offers several commands for joining datasets.


Full Join

If the left and right datasets may overlap, use -merge- to ensure that duplicates are not created.

use "left.dta"
merge 1:1 KEYVARS using "right.dta"

If they are known to not overlap, consider using -append- instead.

use "cohort1.dta"
append using "cohort2.dta"

See also -frlink-.


Left, Right, or Inner Joins

Use -merge- as above and select cases based on the automatically created _merge variable.

use "left.dta"
merge 1:1 KEYVARS using "right.dta"
keep if _merge==1 | _merge==3       // left join
keep if _merge==2 | _merge==3       // right join
keep if _merge==3                   // inner join


CategoryRicottone

Stata/JoiningData (last edited 2025-10-24 16:34:25 by DominicRicottone)