= SAS Sort = A procedure for sorting and de-duplicating tables. <> ---- == Usage == To sort a table and write data into a new table, try: {{{ proc sort data=LIBREF.TABLE1 out=LIBREF.TABLE2; by VAR; run; }}} If the '''`OUT`''' option is not specified, the dataset is sorted in place. === NoDupKey === To remove duplicate cases by `VAR`, use the '''`NODUPKEY`''' option. {{{ proc sort data=LIBREF.TABLE1 out=LIBREF.TABLE2 nodupkey; by VAR; run; }}} === DupOut === To preserve duplicate cases, specify an alternate output on the '''`DUPOUT`''' option. Removed cases will instead be moved into this dataset. {{{ proc sort data=LIBREF.TABLE1 out=LIBREF.TABLE2 nodupkey dupout=LIBREF.TABLE3; by VAR; run; }}} ---- == Deduplication == === Basic === {{{ proc sort data=LIBREF.OLDTABLE out=LIBREF.NEWTABLE nodupkey dupout=LIBREF.REMOVEDCASES by ID; run; }}} === Perfect Duplicates === To remove perfect duplicates... 1. use `_all_` on the `BY` statement 2. substitute the `NODUPKEY` option with `NODUPRECS` === Advanced === If there should be a preference between which case is kept, use another `SORT` procedure ''before'' the one specifying `NODUPKEY`. {{{ proc sort data=LIBREF.OLDTABLE out=LIBREF.TABLE1; by ID descending QUALITY DATE; run; proc sort data=LIBREF.TABLE1 out=LIBREF.NEWTABLE nodupkey dupout=LIBREF.TABLE2 by ID; run; }}} ---- CategoryRicottone