Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to sort data using Data step in SAS

Tags:

sas

datastep

I want to sort data in SAS data step. What exactly I mean is: the work of proc sort should be done in data step. Is there any solution?

like image 789
Saran Avatar asked Dec 18 '17 07:12

Saran


People also ask

Can you sort in a data step SAS?

No, you can't sort your data in a Data Step. With a Data Step, you can only create new datasets and new variables, filter observations, and select columns. You need the PROC SORT procedure to sort your data.

How do you sort data in PROC SQL?

You can specify a sort order for each column in the ORDER BY clause. libname sql 'SAS-library'; proc sql outobs=12; title 'World Topographical Features'; select Name, Type from sql. features order by Type desc, Name; Note: The ASC keyword is optional because the PROC SQL default sort order is ascending.

How does proc sort work in SAS?

What Does the SORT Procedure Do? The SORT procedure orders SAS data set observations by the values of one or more character or numeric variables. The SORT procedure either replaces the original data set or creates a new data set. PROC SORT produces only an output data set.

How do I arrange variables in SAS dataset?

You can control the order in which variables are displayed in SAS output by using the ATTRIB statement. Use the ATTRIB statement prior to the SET, MERGE, or UPDATE statement in order for you to reorder the variables. Variables not listed in the ATTRIB statement retain their original position.


2 Answers

If you're looking for a data-step only solution, you can do the work of PROC SORT with a hash table. The caveat is that you need enough memory to do it.

If you want to do a simple sort, you would load the hash table with the ordered:'yes' option and output it to a new table. By default, ordered:yes will sort the data in ascending order. You can specify descending as well.

Simple Sort

data _null_;

    /* Sets up PDV without loading the table */
    if(0) then set sashelp.class;

    /* Load sashelp.class into memory ordered by Height. Do not remove duplicates. */
    dcl hash sortit(dataset:'sashelp.class', ordered:'yes', multidata:'yes');

        sortit.defineKey('Height');     * Order by height;
        sortit.defineData(all:'yes');   * Keep all variables in the output dataset;

    sortit.defineDone();

    /* Output to a dataset called class_sorted */
    sortit.Output(dataset:'class_sorted');
run;

De-duping

To remove duplicates, do the exact same operation, except remove the multidata option. In the below table, observations (8, 9) and (15, 16) are duplicates of each other. Observations 9 and 16 will be eliminated.

data _null_;

    /* Sets up PDV without loading the table */
    if(0) then set sashelp.class;

    /* Load sashelp.class into memory ordered by Height. Do not keep duplicates. */
    dcl hash sortit(dataset:'sashelp.class', ordered:'yes');

        sortit.defineKey('Height');     * Order by height;
        sortit.defineData(all:'yes');   * Keep all variables in the output dataset;
    sortit.defineDone();

    /* Output to a dataset called class_sorted */
    sortit.Output(dataset:'class_sorted');
run;
like image 119
Stu Sztukowski Avatar answered Nov 16 '22 02:11

Stu Sztukowski


Stu beat me to it, but provided that your dataset contains a unique key, and you can fit the whole thing in memory, you can use a hash sort, e.g.:

data _null_;
  if 0 then set sashelp.class;
  declare hash h(dataset:"sashelp.class",ordered:"a");
  rc = h.definekey("age","sex","name");
  rc = h.definedata(ALL:'yes');
  rc = h.definedone();
  rc = h.output(dataset:"class_sorted");
  stop;
run;

If you are really determined to avoid using any built-in sort methods, a particularly silly approach is to load the whole dataset into a series of temporary arrays, sort the arrays using a hand-coded algorithm, and export again:

https://codereview.stackexchange.com/questions/79952/quicksort-in-sas-for-sorting-datasets

like image 25
user667489 Avatar answered Nov 16 '22 03:11

user667489