SAS Sort

The SORT procedure (PROC SORT) sorts and de-duplicates tables.


Usage

To sort a table and write the sorted data to a new table:

proc sort data=LIBREF.TABLE1 out=LIBREF.TABLE2;
    by VAR;
run;

If the OUT option is not specified, the dataset is sorted in place: the input dataset is replaced by its sorted version.
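As a minimal sketch of the in-place form, the same step can be written without OUT. LIBREF.TABLE1 and VAR are the same placeholder names used above.

```sas
/* No OUT= option: LIBREF.TABLE1 itself is replaced by the sorted result. */
proc sort data=LIBREF.TABLE1;
    by VAR;
run;
```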

NoDupKey

To remove duplicate cases by VAR, use the NODUPKEY option. Only the first case encountered for each value of VAR is kept.

proc sort data=LIBREF.TABLE1 out=LIBREF.TABLE2 nodupkey;
    by VAR;
run;

DupOut

To keep the removed cases, name a second output dataset on the DUPOUT option. Cases removed by NODUPKEY are written to that dataset instead of being discarded.

proc sort data=LIBREF.TABLE1 out=LIBREF.TABLE2 nodupkey dupout=LIBREF.TABLE3;
    by VAR;
run;


Deduplication

Basic

proc sort data=LIBREF.OLDTABLE out=LIBREF.NEWTABLE nodupkey dupout=LIBREF.REMOVEDCASES;
    by ID;
run;

Perfect Duplicates

To remove perfect duplicates (cases that match on every variable):

  1. Use _ALL_ on the BY statement, so that every variable is part of the sort key.

  2. Replace the NODUPKEY option with NODUPRECS.
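The two steps above might be combined as in the following sketch. The table names are the same placeholders used in the Basic example, and DUPOUT is left out here to keep the example minimal.

```sas
/* BY _ALL_ sorts on every variable, so identical cases end up adjacent. */
/* NODUPRECS then drops each case that is identical to the one before it. */
proc sort data=LIBREF.OLDTABLE out=LIBREF.NEWTABLE noduprecs;
    by _all_;
run;
```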

Advanced

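If a particular case should be kept from each group of duplicates, sort the table first so that the preferred case comes first within each ID, then run a second SORT procedure specifying NODUPKEY. In the example below, DESCENDING applies only to QUALITY, so within each ID the case with the highest QUALITY comes first, and ties are broken by the earliest DATE. The second step keeps that first case, because PROC SORT by default preserves the existing order of cases within a BY group.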

proc sort data=LIBREF.OLDTABLE out=LIBREF.TABLE1;
    by ID descending QUALITY DATE;
run;

proc sort data=LIBREF.TABLE1 out=LIBREF.NEWTABLE nodupkey dupout=LIBREF.TABLE2;
    by ID;
run;



SAS/Sort (last edited 2023-03-30 20:12:08 by DominicRicottone)