Given two simple datasets, A and B, as follows:
DATA A;
INPUT X @@;
CARDS;
1 2 3 4
;
RUN;
DATA B;
INPUT Y @@;
CARDS;
1 2
;
RUN;
I am trying to create two datasets named C and D, one using repeated SET and OUTPUT statements and the other using a DO loop.
DATA C;
SET B;
K=1; DO; SET A; OUTPUT; END;
K=K+1; DO; SET A; OUTPUT; END;
K=K+1;
RUN;
DATA D;
SET B;
DO K = 1 TO 2;
SET A; OUTPUT;
END;
RUN;
I thought C and D would be identical, since the DO loop should simply repeat the statements written out in the DATA step for C, but they turn out to be different.
Dataset C:
Obs Y K X
1 1 1 1
2 1 2 1
3 2 1 2
4 2 2 2
Dataset D:
Obs Y K X
1 1 1 1
2 1 2 2
3 2 1 3
4 2 2 4
Could someone please explain this?
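The two SET A statements in the first data step are independent: each one keeps its own read position in A. So on each iteration of the data step they both read the same observation (the first observation on iteration 1, the second on iteration 2). It is as if you had run this step instead: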
data c;
set b;
set a;
do k=1 to 2; output; end;
run;
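To check that this rewritten step really matches C, one option is to build it under a different name and compare the two. The sketch below is an illustration, not part of the original answer: the dataset name C2 is hypothetical, and it assumes C has already been created by the question's code.

```sas
/* Build the rewritten step as C2 (hypothetical name) so C is not overwritten */
data c2;
  set b;
  set a;
  do k = 1 to 2;
    output;
  end;
run;

/* PROC COMPARE matches variables by name and observations by position */
proc compare base=c compare=c2;
run;
```

The variables come out in a different order (Y, X, K instead of Y, K, X), but since the comparison is by name, the report should show no unequal values.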
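The single SET A statement in the second data step executes twice on each iteration of the data step, once per pass through the DO loop. Because it is one statement with one read position, each execution advances to the next observation, so the step reads two observations from A for each iteration of the data step.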
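One way to watch this happen is to log every read. The sketch below is a diagnostic only and makes some assumptions: it uses the same logic as the DATA step for D, but swaps OUTPUT for PUT in a DATA _NULL_ step so no dataset is written, and it prints the automatic variable _N_ (the data step iteration counter) next to Y, K and X.

```sas
/* Diagnostic sketch: same logic as DATA D, logging each read instead of writing rows */
data _null_;
  set b;                  /* one read from B per data step iteration */
  do k = 1 to 2;
    set a;                /* single SET statement: advances on every execution */
    put _n_= y= k= x=;    /* write the current values to the log */
  end;
run;
```

If the explanation above holds, the log should show X advancing through 1, 2, 3, 4 while _N_ only takes the values 1 and 2, which matches dataset D.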
If you really wanted a cross join, you would need to use the POINT= option so that one of the datasets can be re-read from the start for every observation of the other:
data want ;
set b ;
do p=1 to nobs ;
set a point=p nobs=nobs ;
output;
end;
run;
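As an alternative to the POINT= approach, the same Cartesian product can be sketched in PROC SQL. This is a swapped-in technique rather than what the answer above proposes, and the table name WANT_SQL is hypothetical.

```sas
/* Alternative sketch: Cartesian product of B and A via PROC SQL */
proc sql;
  create table want_sql as   /* WANT_SQL is a hypothetical name */
  select b.y, a.x
  from b, a;                 /* no WHERE clause: every row of B pairs with every row of A */
quit;
```

SAS writes a note to the log that the query involves a Cartesian product, and the row order is not guaranteed unless an ORDER BY clause is added.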
Your table B has two observations, so the DATA step for C only runs two iterations:
Retain keyword. Debugging:
Iteration 1 current view:
Obs Table X
1 A 1
Obs Table Y k
1 B 1 1
Output:
K=1; DO; SET A; OUTPUT; END;
Obs Y K X
1 1 1 1
K=K+1; DO; SET A; OUTPUT; END;
Obs Y K X
2 1 2 1
Iteration 2 current view:
Obs Table X
2 A 2
Obs Table Y k
2 B 2 1
Output:
K=1; DO; SET A; OUTPUT; END;
Obs Y K X
3 2 1 2
K=K+1; DO; SET A; OUTPUT; END;
Obs Y K X
4 2 2 2