Does using the KEEP option on SAS datasets improve read performance?

Tags: sas

Suppose I am trying to sum up one variable (call it var_1) in a very large dataset (nearly a terabyte). The dataset is both long and wide. My code would look like this:

PROC MEANS DATA=my_big_dataset SUM;
    VAR var_1;
RUN;

Would I get any performance gain at all by using the KEEP option on the dataset being read? That is:

PROC MEANS DATA=my_big_dataset (KEEP=var_1) SUM;
    VAR var_1;
RUN;

In terms of disk I/O, I imagine that each record must be read in its entirety no matter what. But perhaps less memory needs to be allocated to read the records. Any advice is appreciated.

sparc_spread asked May 03 '12 12:05


1 Answer

Yes, it does make a difference. Most of the time the difference is not large, but with very wide or very long datasets you will start to see some benefit.

Search for keep= in the technote linked below:

http://support.sas.com/techsup/technote/ts298.html

If you're having performance issues, this may shave seconds or fractions of a second off what you are doing, but it's not going to cut your processing time in half. Look for other optimization techniques if you need that kind of improvement.
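
If you want to see how much KEEP= buys you on your own data, one rough way is to turn on the FULLSTIMER system option and compare the log statistics for the two runs. This is only a sketch: it reuses the dataset and variable names from the question, and the second run may look faster simply because the operating system has already cached part of the file, so repeating the runs in both orders is probably wise.

```sas
/* Write detailed resource usage (real time, CPU time, memory) to the log */
OPTIONS FULLSTIMER;

/* Run 1: read the dataset without KEEP= */
PROC MEANS DATA=my_big_dataset SUM;
    VAR var_1;
RUN;

/* Run 2: same step, but only var_1 is kept from the input dataset */
PROC MEANS DATA=my_big_dataset (KEEP=var_1) SUM;
    VAR var_1;
RUN;
```

Compare the real time, CPU time, and memory figures that the log reports for each PROC MEANS step.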

Robert Penridge answered Nov 16 '22 04:11