Does using the KEEP option on SAS datasets improve read performance?

Question

Suppose I am trying to sum up one variable (call it var_1) in a very large dataset (nearly a terabyte). The dataset is both long and wide. My code would look like this:

PROC MEANS DATA=my_big_dataset SUM;
    VAR var_1;
RUN;

Would I get any performance gain at all by using the KEEP option on the dataset being read? That is:

PROC MEANS DATA=my_big_dataset (KEEP=var_1) SUM;
    VAR var_1;
RUN;

In terms of disk I/O, I imagine that each record must be read in its entirety no matter what. But perhaps less memory needs to be allocated to read the records. Any advice is appreciated.

Robert Penridge · Accepted Answer

Yes, it does make a difference. Most of the time the difference is small, but with very wide or very long datasets you will start to see some benefit.

Search for keep= on the page linked below:

http://support.sas.com/techsup/technote/ts298.html

If you're having performance issues, this may shave fractions of a second, or a few seconds, off what you are doing, but it's not going to cut your processing time in half. If you need that kind of improvement, look at other optimization techniques.
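One way to check how much KEEP= helps on a particular dataset is to run both versions with the FULLSTIMER system option switched on and compare the statistics SAS writes to the log (real time, CPU time, memory). This is only a sketch of that comparison, reusing my_big_dataset and var_1 from the question; it assumes nothing else about the data.

```sas
/* Write detailed resource statistics for each step to the log */
OPTIONS FULLSTIMER;

/* Baseline: no KEEP= on the input dataset */
PROC MEANS DATA=my_big_dataset SUM;
    VAR var_1;
RUN;

/* Same step, with KEEP= limiting the variables brought into the step */
PROC MEANS DATA=my_big_dataset (KEEP=var_1) SUM;
    VAR var_1;
RUN;

/* Restore the default, shorter log statistics */
OPTIONS NOFULLSTIMER;
```

Operating system file caching can favour whichever step runs second, so it is worth repeating the test with the order swapped before drawing conclusions.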

Tags: sas

Asked by sparc_spread
