Equivalent of Access Crosstab Query in SAS?

Tags:

sas

Here is my Input:

ID  Color
1   green
1   red
1   orange
1   green
1   red
2   red
2   red
2   blue
3   green
3   red

Here is what I want in my output - a count of records by ID for each color:

ID  green  red  orange blue
1   2      2    1      0
2   0      2    0      1
3   1      1    0      0

I know I can get the information using proc freq, but I want to output a dataset exactly like the one I have written above. I can't seem to figure out how to make the colors the columns in this output dataset.

oob asked Aug 09 '10 19:08


2 Answers

First, generate the data.

data data;
    /* FORMAT comes before INPUT, so Color is created as a character variable of length 8 */
    format ID 8. Color $8.;
    input id color;
datalines;
1   green
1   red
1   orange
1   green
1   red
2   red
2   red
2   blue
3   green
3   red
run;

Next, summarize the color counts by ID.

proc freq data=data noprint;
    table id*color / out=freq;
run;

Then transpose the counts so that each color becomes a column.

proc transpose data=freq out=freq_trans(drop=_:);
    id color;
    by id;
    var count;
run;

Optionally, fill in the missing cells with 0.

data freq_trans_filled;
    set freq_trans;
    array c(*) green red orange blue;
    do i = 1 to dim(c);
        if c[i]=. then c[i]=0;
    end;
    drop i;
run;
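
If the color names are not known in advance, the hard-coded array in the step above can be avoided. This is a minimal sketch, not part of the original answer. It assumes the transposed dataset freq_trans from the previous step and that every numeric column other than the ID holds a count. The output name freq_trans_filled2 is hypothetical.

```sas
/* Sketch: zero-fill every count column without naming the colors. */
data freq_trans_filled2;   /* hypothetical output dataset name */
    set freq_trans;
    /* _numeric_ covers all numeric variables defined so far: */
    /* the ID plus one column per color                        */
    array c(*) _numeric_;
    do i = 1 to dim(c);
        /* leave the ID untouched; fill only the count columns */
        if upcase(vname(c[i])) ne 'ID' and missing(c[i]) then c[i] = 0;
    end;
    drop i;
run;
```

Because i is first referenced after the ARRAY statement, it is not part of the _numeric_ list, so the loop counter itself is never overwritten.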
rkoopmann answered Sep 20 '22 20:09


You can fill the missing cells with zeros by using the SPARSE option on PROC FREQ's TABLE statement, so you don't need another DATA step. The ORDER= option on the PROC FREQ statement controls the order of the colors; ORDER=DATA, used below, keeps them in the order in which they first appear in the input.

data one;
  input id color :$8.;
datalines;
1   green
1   red
1   orange
1   green
1   red
2   red
2   red
2   blue
3   green
3   red
run;
proc freq data=one noprint order=data;
  table id*color /out=freq sparse;
run;
proc transpose data=freq out=two(drop=_:);
    id color;
    by id;
    var count;
run;
proc print data=two noobs;
run;
/* on lst
id    green    red    orange    blue
 1      2       2        1        0
 2      0       2        0        1
 3      1       1        0        0
*/
Chang Chung answered Sep 20 '22 20:09