⇤ ← Revision 1 as of 2023-01-13 22:12:34
Size: 1701
Comment:
|
Size: 1038
Comment:
|
Deletions are marked like this. | Additions are marked like this. |
Line 3: | Line 3: |
Firstly the Stata commands are listed with an explanation of syntax. Secondly the equivalents of standard SQL joins are listed. | Stata offers several commands for joining datasets. |
Line 11: | Line 11: |
== Merge == | == Full Join == |
Line 13: | Line 13: |
To join two data files using a unique key variable, try: | If the left and right datasets may overlap, use the [[Stata/Merge|merge]] command to ensure that duplicates are not created. |
Line 17: | Line 17: |
merge 1:1 keyvarlist using "right.dta" | merge 1:1 KEYVARS using "right.dta" |
Line 20: | Line 20: |
`keyvarlist` can be one or more variables, but all cases must be uniquely identified by the combination of values. | If they are known to not overlap, the [[Stata/Append|append]] command can be used instead. |
Line 22: | Line 22: |
The `merge` command creates a `_merge` variable taking the values: ||'''Value'''||'''Meaning''' || ||1 ||observation appeared in 'master' only || ||2 ||observation appeared in 'using' only || ||3 ||observation appeared in both || ||4 ||observation appeared in both, the `update` option was specified, and missing values were updated || ||5 ||observation appeared in both, the `update` option was specified, and non-missing values were updated|| |
{{{ use "cohort1.dta" append using "cohort2.dta" }}} |
Line 32: | Line 28: |
=== Many-to-one === | |
Line 34: | Line 29: |
To merge a file to a lookup table, try: | == Left Join == Use the `merge` command as above and select cases based on the created `_merge` variable. |
Line 38: | Line 35: |
merge m:1 key using "right.dta" | merge 1:1 KEYVARS using "right.dta" keep if _merge==1 | _merge==3 |
Line 41: | Line 39: |
Note: the inverse operation can be accomplished with 1:m. | Alternatively, try using the '''`keep(groups)`''' option. |
Line 43: | Line 41: |
---- | {{{ use "left.dta" merge 1:1 KEYVARS using "right.dta", keep(1 3) }}} |
Line 47: | Line 48: |
== Append == | == Right Join == |
Line 49: | Line 50: |
To add cases from a file into the active dataset, try: {{{ append using "second.dta" }}} ---- |
As with the left join, but the groups of interest are 2 and 3. |
Line 59: | Line 54: |
== Joins == | == Inner Join == |
Line 61: | Line 56: |
=== Full Join === === Left Join === === Right Join === === Inner Join === |
As with the left join, but only group 3 is of interest. |
Joining Data with Stata
Stata offers several commands for joining datasets.
Full Join
If the left and right datasets may overlap, use the merge command to ensure that duplicates are not created.
use "left.dta" merge 1:1 KEYVARS using "right.dta"
If they are known to not overlap, the append command can be used instead.
use "cohort1.dta" append using "cohort2.dta"
Left Join
Use the merge command as above and select cases based on the created _merge variable.
use "left.dta" merge 1:1 KEYVARS using "right.dta" keep if _merge==1 | _merge==3
Alternatively, try using the keep(groups) option.
use "left.dta" merge 1:1 KEYVARS using "right.dta", keep(1 3)
Right Join
As with the left join, but the groups of interest are 2 and 3.
Inner Join
As with the left join, but only group 3 is of interest.