Size: 1701
Comment:
|
← Revision 3 as of 2023-06-09 15:44:42 ⇥
Size: 1160
Comment:
|
Deletions are marked like this. | Additions are marked like this. |
Line 3: | Line 3: |
Firstly the Stata commands are listed with an explanation of syntax. Secondly the equivalents of standard SQL joins are listed. | Stata offers several commands for joining datasets. |
Line 11: | Line 11: |
== Merge == | == Full Join == |
Line 13: | Line 13: |
To join two data files using a unique key variable, try: | If the left and right datasets may overlap, use the [[Stata/Merge|merge]] command to ensure that duplicates are not created. |
Line 17: | Line 17: |
merge 1:1 keyvarlist using "right.dta" | merge 1:1 KEYVARS using "right.dta" |
Line 20: | Line 20: |
`keyvarlist` can be one or more variables, but all cases must be uniquely identified by the combination of values. The `merge` command creates a `_merge` variable taking the values: ||'''Value'''||'''Meaning''' || ||1 ||observation appeared in 'master' only || ||2 ||observation appeared in 'using' only || ||3 ||observation appeared in both || ||4 ||observation appeared in both, the `update` option was specified, and missing values were updated || ||5 ||observation appeared in both, the `update` option was specified, and non-missing values were updated|| === Many-to-one === To merge a file to a lookup table, try: |
If they are known to not overlap, the [[Stata/Append|append]] command can be used instead. |
Line 37: | Line 23: |
use "left.dta" merge m:1 key using "right.dta" |
use "cohort1.dta" append using "cohort2.dta" |
Line 41: | Line 27: |
Note: the inverse operation can be accomplished with 1:m. | See also the [[Stata/FrLink|frlink]] command. |
Line 47: | Line 33: |
== Append == | == Left Join == |
Line 49: | Line 35: |
To add cases from a file into the active dataset, try: | Use the `merge` command as above and select cases based on the created `_merge` variable. |
Line 52: | Line 38: |
append using "second.dta" | use "left.dta" merge 1:1 KEYVARS using "right.dta" keep if _merge==1 | _merge==3 |
Line 54: | Line 42: |
Alternatively, try using the '''`keep(groups)`''' option. {{{ use "left.dta" merge 1:1 KEYVARS using "right.dta", keep(1 3) }}} See also the [[Stata/FrLink|frlink]] command. |
|
Line 59: | Line 56: |
== Joins == | == Right Join == |
Line 61: | Line 58: |
=== Full Join === | As with the left join, but the groups of interest are 2 and 3. |
Line 63: | Line 60: |
=== Left Join === | ---- |
Line 65: | Line 62: |
=== Right Join === | |
Line 67: | Line 63: |
=== Inner Join === | == Inner Join == As with the left join, but only group 3 is of interest. |
Joining Data with Stata
Stata offers several commands for joining datasets.
Full Join
If the left and right datasets may overlap, use the merge command to ensure that duplicates are not created.
use "left.dta" merge 1:1 KEYVARS using "right.dta"
If they are known to not overlap, the append command can be used instead.
use "cohort1.dta" append using "cohort2.dta"
See also the frlink command.
Left Join
Use the merge command as above and select cases based on the created _merge variable.
use "left.dta" merge 1:1 KEYVARS using "right.dta" keep if _merge==1 | _merge==3
Alternatively, try using the keep(groups) option.
use "left.dta" merge 1:1 KEYVARS using "right.dta", keep(1 3)
See also the frlink command.
Right Join
As with the left join, but the groups of interest are 2 and 3.
Inner Join
As with the left join, but only group 3 is of interest.