
Set in the first row, why do I have two output rows?

Tags:

sas

I tried this code in SAS, but the output isn't what I expected.

data temp;
   input sumy;
   datalines;
36
;
run;

data aaa;   
   if _n_ = 1 then 
      set temp;
run;

proc print data = aaa;
run;

May I ask why there are two observations? Did SAS execute the "set" twice? How do "set" and the PDV work here across iterations? Thank you in advance.

Darienzhang asked Jul 13 '15 03:07


2 Answers

Because the implied OUTPUT at the end of the data step executed twice. On the first iteration, with _N_=1, the SET statement read one observation from the input dataset. On the second iteration, with _N_=2, no new observation was read, and the values from the previous observation were retained, so the same values were written again. After that second iteration SAS stops because it has detected that your data step is looping.

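If you want only one observation, then either add an explicit OUTPUT statement followed by a STOP statement before the RUN statement (STOP on its own ends the step without writing the current observation), or recode the data step to use the OBS=1 dataset option on the input dataset instead of the IF statement.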
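The two fixes described above could be sketched as follows. This is a minimal sketch that reuses the `temp` data set from the question; the output names `one_a` and `one_b` are made up for illustration. Because a STOP statement ends the step without performing the implied OUTPUT, the first variant writes the observation explicitly before stopping.

```sas
/* Variant 1: write the row explicitly, then end the step. */
data one_a;
   if _n_ = 1 then
      set temp;
   output;   /* STOP skips the implied OUTPUT, so write the row first */
   stop;     /* end the step after the first iteration */
run;

/* Variant 2: let the OBS=1 data set option limit the input. */
data one_b;
   set temp(obs=1);   /* the second iteration hits end of file and stops */
run;

proc print data = one_a;
run;

proc print data = one_b;
run;
```

Both variants should leave a single observation with sumy=36. The second is usually the shorter choice when the only goal is to take the first row.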

Note that if the input dataset were empty, you would get zero observations, because the data step would stop when the SET statement tried to read past the end of the input dataset.
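The empty-input case can be checked with a short sketch. The data set names `empty` and `aaa_empty` are hypothetical, and `temp` is the data set from the question; `OBS=0` is used here only to create a copy that has the variable `sumy` but no rows.

```sas
/* Create an empty data set with the same variable as temp. */
data empty;
   set temp(obs=0);
run;

/* The SET at _N_=1 reaches end of file immediately, so the step
   ends before any implied OUTPUT runs. */
data aaa_empty;
   if _n_ = 1 then
      set empty;
run;

/* Expected: aaa_empty has no observations, so nothing is printed. */
proc print data = aaa_empty;
run;
```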

Tom answered Oct 03 '22 08:10


There are two observations because no read operation occurred during the second DATA step iteration, yet the implied OUTPUT at the end of the step still ran.

The SET statement has two roles.

  1. Compile time role (unconditional) - the data set header is read by the compiler and its variables are added to the step's PDV.
  2. Run time role (conditional) - a row from the data set is read and the values are placed in the PDV each time the code reaches the SET statement.

Additionally, every variable coming from (or corresponding to) a SET statement has its value automatically retained. That is why the second observation created by your sample code has sumy=36.
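One way to watch both roles at work is to write the PDV values to the log at the top and bottom of each iteration. The sketch below assumes the `temp` data set from the question; the data set name `trace` is hypothetical, and the comments describe what the log is expected to show.

```sas
data trace;
   /* sumy is already in the PDV here because the compiler read the
      header of temp, even though no row has been read yet. */
   put 'top:    ' _n_= sumy=;

   if _n_ = 1 then
      set temp;          /* executes only on the first iteration */

   put 'bottom: ' _n_= sumy=;
run;

/* Expected log lines:
   top:    _N_=1 sumy=.     (in the PDV, but still missing)
   bottom: _N_=1 sumy=36    (row read by SET)
   top:    _N_=2 sumy=36    (value retained, not reset to missing)
   bottom: _N_=2 sumy=36    (no read; implied OUTPUT writes 36 again)
   followed by the "stopped due to looping" note. */
```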

Additional detail from the SAS support site:

Usage Note 8914: DATA step stopped due to looping message

If a DATA step is written such that no data reading statements (e.g. SET, INPUT) are executed, the step is terminated after one iteration and the following message is written to the SAS log:

NOTE: DATA STEP stopped due to looping.

Richard answered Oct 03 '22 07:10