
Is it possible to loop over SAS datasets?

Tags: sas, sas-macro

I have 60 SAS datasets that contain data on consumers' individual characteristics such as id, gender, age, amountSpent, and so on. Each dataset covers only one time period (data1 is Jan, data2 is Feb, ...). I cannot merge them because of their size and some other issues.

How can I write a loop that goes through each of the datasets, does some manipulations, and saves the estimated values to a temporary file?

SAS does not have a for loop. How can I use do?

Buras asked Aug 01 '13 02:08




2 Answers

SAS does have macro loops, though: %do %while, as well as the iterative %do ... %to. So basically you need 3 things:
1. A listing of your datasets, in some form.
2. A macro that loops over this listing.
3. The stuff you want to do.

E.g., let us presume you have a dataset WORK.DATASET_LIST that contains a variable library (libname) and a variable member (dataset name) for every dataset you want to loop across.
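One way to build such a listing is to query DICTIONARY.TABLES. This is only a sketch: the libref MYLIB is a hypothetical name standing in for wherever the 60 datasets live, and the column is renamed to member to match the variable names assumed above.

```sas
/* Assumption: the monthly datasets are in a library assigned to the
   libref MYLIB (hypothetical name; replace with your own). */
proc sql;
    create table WORK.DATASET_LIST as
    select libname,
           memname as member
    from dictionary.tables
    where libname = 'MYLIB'     /* librefs are stored in upper case */
      and memtype = 'DATA';     /* datasets only, no views */
quit;
```

If the library also holds unrelated datasets, an extra condition on memname (for example a LIKE pattern) would be needed to narrow the list.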

Then you could do:

%macro loopOverDatasets();
    /*imho good practice to declare macro variables of a macro locally*/
    %local datasetCount iter inLibref inMember;

    /*get number of datasets*/
    proc sql noprint;
        select count(*)
         into :datasetCount
        from WORK.DATASET_LIST;
    quit;

    /*initiate loop*/
    %let iter=1;
    %do %while (&iter.<= &datasetCount.);
        /*get libref and dataset name for dataset you will work on during this iteration*/
        data _NULL_;
            /*only read 1 record*/
            set WORK.DATASET_LIST (firstobs=&iter. obs=&iter.);
            /*write the libname and dataset name to the macro variables*/
            call symputx("inLibref",libname);
            call symputx("inMember",member);
            /*NOTE: call symputx strips leading and trailing blanks, so they cannot torpedo the code; block comments are also safer than statement-style comments inside a macro*/
        run;

        /*now you can apply your logic to the dataset*/
        data &inLibref..&inMember.; *assuming you want to apply the changes to the dataset itself;
            set &inLibref..&inMember.;
            /*** INSERT YOUR LOGIC HERE ***/
        run;

        /*** ANY OTHER PROCS/DATA STEPS ***/
        /*just remember to use &inLibref..&inMember. to refer to the current dataset*/

        /*increment the iterator of the loop*/
        %let iter=%eval(&iter.+1);
    %end;
%mend;

/*call the macro*/
%loopOverDatasets()

That is the idea. You may want to gather the list of your datasets in a different way, e.g., in a macro variable containing them all. In that case you'll have to use the %scan function in the loop to pick a dataset. Or maybe there is logic in the naming, e.g., dataset1, dataset2, dataset3, ..., in which case you could simply make use of the &iter. macro variable.
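Both variants can be sketched roughly as follows. The macro names, the WORK.results and WORK._stats tables, and the choice of a mean of amountSpent as the "manipulation" are all illustrative assumptions, as is the premise that the datasets are called WORK.data1 through WORK.data60.

```sas
/* Variant 1: numbered names, using the loop index directly.
   Assumes WORK.data1 ... WORK.data60 exist (names taken from the question). */
%macro loopByNumber(first=1, last=60);
    %local i;
    %do i = &first. %to &last.;
        /* example manipulation: mean of amountSpent for this period */
        proc means data=WORK.data&i. noprint;
            var amountSpent;
            output out=WORK._stats mean=meanSpent;
        run;

        /* tag the result with its period number */
        data WORK._stats;
            set WORK._stats (keep=meanSpent);
            period = &i.;
        run;

        /* collect results; PROC APPEND creates the base table if it is missing */
        proc append base=WORK.results data=WORK._stats;
        run;
    %end;
%mend loopByNumber;

/* Variant 2: a space-separated list of dataset names, picked apart with %scan.
   The blank delimiter is given explicitly so that the dot in WORK.data1
   is not treated as a separator. */
%macro loopByList(dsList);
    %local i ds;
    %do i = 1 %to %sysfunc(countw(&dsList., %str( )));
        %let ds = %scan(&dsList., &i., %str( ));
        %put NOTE: processing &ds.;
        /*** INSERT YOUR LOGIC HERE, referring to &ds. ***/
    %end;
%mend loopByList;

%loopByNumber(first=1, last=60)
%loopByList(WORK.data1 WORK.data2 WORK.data3)
```

Note that WORK.results keeps growing if the first macro is run twice in the same session, so it may need to be deleted before a rerun.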

mvherweg answered Oct 09 '22 16:10


Another option is to create a view that combines all the datasets. You won't get any data size problems with this approach (although I don't know if the other issues you refer to would be a problem here). You'll need a list of the relevant datasets, which may be obtained from DICTIONARY.TABLES in PROC SQL.

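proc sql noprint;
/* prefix each member with its libref; a bare memname would be looked up in WORK */
select catx('.', libname, memname) into :ds_list separated by ' '
from dictionary.tables
where libname='XXXXX' and memtype='DATA'; /* libref in upper case; datasets only */
quit;

data combined / view=combined;
set &ds_list;
run;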

Then just run your summary against the view, so there's no need to loop through each dataset. I assume your datasets have a date variable, otherwise you'll need to add in some extra functionality (this applies to any solution). It would be interesting to see how this performs compared to the other solutions here.
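If the datasets carry no date variable, one possible form of that extra functionality is the INDSNAME= option of the SET statement, which records the name of the dataset each row came from. The sketch below reuses the &ds_list macro variable from the PROC SQL step; the variable names source and _src and the PROC MEANS summary of amountSpent are assumptions for illustration.

```sas
data combined / view=combined;
    length source $41;               /* libref (8) + dot + member name (32) */
    set &ds_list. indsname=_src;     /* _src is temporary and is not written out */
    source = _src;                   /* keep a permanent copy, e.g. WORK.DATA1 */
run;

/* one summary per source dataset, without an explicit loop */
proc means data=combined mean;
    class source;
    var amountSpent;
run;
```

Because combined is a view, the underlying datasets are only read when the summary runs.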

Longfish answered Oct 09 '22 16:10