Here is my Input:
ID Color
1 green
1 red
1 orange
1 green
1 red
2 red
2 red
2 blue
3 green
3 red
Here is what I want in my output - a count of records by ID for each color:
ID green red orange blue
1 2 2 1 0
2 0 2 0 1
3 1 1 0 0
I know I can get the information using proc freq, but I want to output a dataset exactly like the one I have written above. I can't seem to figure out how to make the colors the columns in this output dataset.
Cross Tabulation. A crosstabulation or a contingency table shows the relationship between two or more variables by recording the frequency of observations that have multiple characteristics. Crosstabulation tables show us a wealth of information on the relationship between the included variables.
Syntax for Cross Tabulation in SAS:
PROC FREQ DATA = dataset;
TABLES variable1*variable2;
The requests in the SAS TABLES statement can be one variable name or a list of variable names separated by asterisks.
A crosstab query calculates a sum, average, or other aggregate function, and then groups the results by two sets of values: one set down the side of the datasheet and the other set across the top.
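As a minimal sketch of the TABLES syntax described above: the dataset name colors is a hypothetical one chosen for illustration, and only three sample rows are used. The NOROW, NOCOL, and NOPERCENT options suppress the percentages so that only the frequency counts are printed.

```sas
/* Hypothetical dataset name; the variables mirror the question's input. */
data colors;
  input id color :$8.;
  datalines;
1 green
1 red
2 blue
;
run;

/* Two-way request: id forms the rows, color forms the columns. */
proc freq data=colors;
  tables id*color / norow nocol nopercent;
run;
```

This prints a crosstab to the listing only; it does not by itself create a dataset with one column per color.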
First, generate the data.
data data;
format ID 8. Color $8.;
input id color;
datalines;
1 green
1 red
1 orange
1 green
1 red
2 red
2 red
2 blue
3 green
3 red
run;
Next, summarize the color counts by id.
proc freq data=data noprint;
table id*color / out=freq;
run;
Then transpose the counts so that each color becomes its own column.
proc transpose data=freq out=freq_trans(drop=_:);
id color;
by id;
var count;
run;
Optionally, fill in the missing cells with 0.
data freq_trans_filled;
set freq_trans;
array c(*) green red orange blue;
do i = 1 to dim(c);
if c[i]=. then c[i]=0;
end;
drop i;
run;
You can fill the missing cells with zeros using the SPARSE option on PROC FREQ's TABLE statement. This way, you don't need another DATA step. The order of the colors can also be controlled with the ORDER= option on PROC FREQ.
data one;
input id color :$8.;
datalines;
1 green
1 red
1 orange
1 green
1 red
2 red
2 red
2 blue
3 green
3 red
run;
proc freq data=one noprint order=data;
table id*color /out=freq sparse;
run;
proc transpose data=freq out=two(drop=_:);
id color;
by id;
var count;
run;
proc print data=two noobs;
run;
/* on lst
id green red orange blue
1 2 2 1 0
2 0 2 0 1
3 1 1 0 0
*/