SAS Sort
PROC SORT is the SAS procedure for sorting and de-duplicating tables.
Usage
To sort a table and write the sorted data to a new table, name the new table on the OUT option:
proc sort data=LIBREF.TABLE1 out=LIBREF.TABLE2; by VAR; run;
If the OUT option is not specified, the dataset is sorted in place.
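As a minimal sketch of the in-place form, the same call without OUT replaces LIBREF.TABLE1 with its sorted version. The names are the same placeholders used above.

```sas
/* No OUT= option: LIBREF.TABLE1 itself is overwritten with the sorted data */
proc sort data=LIBREF.TABLE1;
    by VAR;
run;
```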
NoDupKey
To remove duplicate cases by VAR, use the NODUPKEY option. Only the first case for each value of VAR is kept.
proc sort data=LIBREF.TABLE1 out=LIBREF.TABLE2 nodupkey; by VAR; run;
DupOut
To keep the removed duplicates, name a second output dataset on the DUPOUT option. Cases dropped by NODUPKEY are written to this dataset instead of being discarded.
proc sort data=LIBREF.TABLE1 out=LIBREF.TABLE2 nodupkey dupout=LIBREF.TABLE3; by VAR; run;
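The following sketch shows where cases end up when NODUPKEY and DUPOUT are combined. The dataset names (work.visits, work.unique, work.dropped) and the sample values are hypothetical, chosen only for illustration.

```sas
/* Hypothetical sample data: ID 2 appears twice */
data work.visits;
    input id score;
    datalines;
3 70
1 90
2 85
2 60
;
run;

/* Keep one case per ID in work.unique; send the rest to work.dropped */
proc sort data=work.visits out=work.unique nodupkey dupout=work.dropped;
    by id;
run;

/* Inspect both outputs */
proc print data=work.unique; run;
proc print data=work.dropped; run;
```

Assuming the default EQUALS behavior, which preserves the input order of cases with equal BY values, work.unique should hold one case each for IDs 1, 2, and 3, and work.dropped should hold the second ID 2 case (score 60).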
Deduplication
Basic
proc sort data=LIBREF.OLDTABLE out=LIBREF.NEWTABLE nodupkey dupout=LIBREF.REMOVEDCASES; by ID; run;
Perfect Duplicates
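To remove perfect duplicates (cases that match on every variable), either:
use _all_ on the BY statement, so that NODUPKEY compares every variable
substitute the NODUPKEY option with NODUPRECS, which compares each case only with the previous case written, so perfect duplicates that are not adjacent after sorting can survive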
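A minimal sketch of the first approach, reusing the placeholder names from the Basic example, might look like this:

```sas
/* BY _ALL_ sorts on every variable, so NODUPKEY only drops cases
   that are identical across all variables */
proc sort data=LIBREF.OLDTABLE out=LIBREF.NEWTABLE nodupkey dupout=LIBREF.REMOVEDCASES;
    by _all_;
run;
```

Because every variable takes part in the sort, the output is ordered by the variables in their dataset order rather than by ID alone.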
Advanced
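To control which case is kept, run a preliminary SORT procedure that places the preferred case first within each ID, then run the one specifying NODUPKEY, which keeps the first case for each ID. In the example below, DESCENDING applies only to QUALITY, so within each ID the highest QUALITY comes first and ties are broken by the earliest DATE.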
proc sort data=LIBREF.OLDTABLE out=LIBREF.TABLE1; by ID descending QUALITY DATE; run;
proc sort data=LIBREF.TABLE1 out=LIBREF.NEWTABLE nodupkey dupout=LIBREF.TABLE2; by ID; run;
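To make the two-step pattern concrete, here is a small self-contained sketch. The dataset names (work.raw, work.ranked, work.best, work.rest) and the sample values are hypothetical, and the sketch assumes the default EQUALS behavior, under which the second sort preserves the order established by the first.

```sas
/* Hypothetical sample data: three candidate cases for ID 1, one for ID 2 */
data work.raw;
    input id quality date :yymmdd10.;
    format date yymmdd10.;
    datalines;
1 2 2020-03-01
1 5 2020-02-01
1 5 2020-01-15
2 3 2020-01-01
;
run;

/* Step 1: within each ID, highest QUALITY first, then earliest DATE */
proc sort data=work.raw out=work.ranked;
    by id descending quality date;
run;

/* Step 2: keep the first case per ID; the others go to work.rest */
proc sort data=work.ranked out=work.best nodupkey dupout=work.rest;
    by id;
run;
```

Under these assumptions, work.best should keep the ID 1 case with QUALITY 5 dated 2020-01-15 and the single ID 2 case, while the other two ID 1 cases land in work.rest.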