I have a very large table that contains an ID field and a datetime field. The table is ordered by the ID field, and INDEXED on the datetime field.
I want to quickly find the maximum datetime value but I can't find any good way to do this.
Sample data:
data x;
do id=1 to 10000000;
created_datetime = datetime() + (ranuni(1)*100000);
output;
end;
format created_datetime datetime22.;
run;
proc sql noprint;
create index created_datetime on x;
quit;
Attempt #1: PROC SQL
and the max()
function
For some reason I thought that this would instantly return the result but I found what actually happens was counter-intuitive (to me at least). Using the max()
function doesn't use an index - it can't! Where clauses etc can make use of indexes but the max() function can't. Even if you force the use of an index it's still going to process all rows in the table, just in the order that using the index returns them.
option msglevel=i;
proc sql noprint;
select max(created_datetime) from x(idxname=x);
quit;
Attempt #2: By-group processing
The below easily returns the first row using an index:
data min;
set x;
by created_datetime;
output;
stop;
run;
But I can't use the descending keyword to work backwards through the list to get the last row:
data min;
set x;
by descending created_datetime;
output;
stop;
run;
SAS also doesn't seem to support descending indexes, so I can't use that approach either.
Attempt #3: Use metadata about the index and a WHERE
statement
I looked in SASHELP.VINDEX
hoping that maybe the max values might be stored in the metadata somewhere that I could then use in a where statement. No luck there.
EDIT :
Attempt #4: PROC SQL
with inobs
or outobs
@DomPazz's below answer inspired me to revisit some other SQL based solutions. I thought perhaps the order by
statement in PROC SQL
may interact with the inobs or outobs options to achieve my goal. It didn't work though. The ordering looks like it is applied to the output of the query and in no way affects the order in which the rows are actually read in.
/* Uncomment options as necessary */
proc sql noprint /*inobs=1 outobs=1*/;
create table temp as
select created_datetime
from x
order by created_datetime desc;
quit;
Help!
This gives you the centiles - the last one should be the 100% mark. This requires the index to have been recreated using the UPDATECENTILES
option after any additions/deletions to the data.
proc contents data=have centiles;
run;
You can grab that with ODS OUTPUT if you want it as a dataset (the output table name is "INDEXES"):
ods output indexes=temp;
proc contents data=have centiles ;
run;
See Michael Raithel's papers, in particular The Basics Of Using SAS Indexes, for more details.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With