I have three columns in a dataset: spend, age_bucket, and multiplier. The data looks something like...
spend age_bucket multiplier
10 18-24 2x
120 18-24 2x
1 35-54 3x
I'd like a dataset with the age buckets as the columns, the multipliers as the rows, and the sum (or other aggregate function) of the spend column as the entries. Is there a proc to do this? Can I accomplish it easily using proc SQL?
You can use SAS data sets or OLAP cubes to create a PivotTable report. You can customize the PivotTable report by using the PivotTable toolbar in Excel. For more information about PivotTable reports, see the Microsoft Excel Help.
In simple terms, creating a pivot table is similar to creating a cross-tab report in SAS using PROC TABULATE or PROC REPORT: tabular data are summarized, with one (or more) variables in the table becoming rows in the report and one (or more) variables becoming columns in the report.
There are a few ways to do this.
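/* Sample data: one row per spend record */
data have;
input spend age_bucket $ multiplier $;
datalines;
10 18-24 2x
120 18-24 2x
1 35-54 3x
10 35-54 2x
;
run;

/* Sum spend for every combination of the CLASS variables */
proc summary data=have;
var spend;
class age_bucket multiplier;
output out=temp sum=;
run;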
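First you can use PROC SUMMARY to calculate the aggregation, sum in this case, for the variable in question. The CLASS statement lists the variables to sum by. Without further options, the output data set contains the sums for every combination of the class variables (the overall total, each variable on its own, and the N-way combination), distinguished by the _TYPE_ variable. Run the code and look at data set temp.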
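As a rough sketch of a variation on this step: with two CLASS variables listed as age_bucket multiplier, _TYPE_=0 is the overall total, _TYPE_=1 is by multiplier only, _TYPE_=2 is by age_bucket only, and _TYPE_=3 is the full combination. If only the full combination is needed, the NWAY option asks PROC SUMMARY to output just those rows. The data set name temp_nway is an assumption made here so that temp is left untouched.

```sas
/* NWAY keeps only the rows in which every CLASS variable is used,
   i.e. the highest _TYPE_ value (3 for two CLASS variables). */
proc summary data=have nway;
  class age_bucket multiplier;
  var spend;
  /* temp_nway is a hypothetical name; _type_ and _freq_ are dropped
     because they carry no extra information once NWAY is used. */
  output out=temp_nway(drop=_type_ _freq_) sum=;
run;

/* Inspect the result: one row per age_bucket/multiplier pair */
proc print data=temp_nway;
run;
```

With this variation, the later filter on _TYPE_ would not be needed for temp_nway, though a sort by multiplier still would be.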
Next you can use PROC TRANSPOSE to pivot the table. It needs a BY statement, so a PROC SORT is necessary first. The sort step also filters to the aggregation you care about (_TYPE_=3, the sums by age_bucket and multiplier together).
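/* Keep only the age_bucket-by-multiplier sums and sort for BY processing */
proc sort data=temp(where=(_type_=3));
by multiplier;
run;

/* One row per multiplier, one column per age_bucket */
proc transpose data=temp out=want(drop=_name_);
by multiplier;
var spend;
id age_bucket;
idlabel age_bucket;
run;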
Under the traditional naming rules (VALIDVARNAME=V7), 35-54 is not a valid SAS variable name, so SAS converts the new column names to valid ones. The label on each variable retains the original value. Just be aware that if you need to reference a variable later, you must use the converted name.
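On the PROC SQL part of the question: PROC SQL has no dedicated pivot clause, but one common workaround is conditional aggregation, with one CASE expression per output column. The sketch below assumes the age buckets are known in advance and hard-codes the two values from the sample data; the table name want_sql and the column names _18_24 and _35_54 are illustrative choices, not something PROC SQL generates.

```sas
proc sql;
  create table want_sql as
  select multiplier,
         /* Each CASE passes spend through only for its own bucket;
            other rows contribute a missing value, which SUM ignores. */
         sum(case when age_bucket = '18-24' then spend else . end) as _18_24,
         sum(case when age_bucket = '35-54' then spend else . end) as _35_54
  from have
  group by multiplier;
quit;
```

For the sample data this should give 130 and 10 for the 2x row, and a missing value and 1 for the 3x row. The trade-off is that every new age bucket needs another CASE expression, whereas the PROC SUMMARY and PROC TRANSPOSE approach picks up new buckets from the data.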