Some time ago I asked a question about creating market basket data. Now I would like to create a similar data.frame, but based on a third variable. Unfortunately, I ran into problems when trying. Previous question: Effecient way to create market basket matrix in R
@shadow and @SimonO101 gave me good answers, but I was not able to adapt their answers correctly. I have the following data:
Customer <- as.factor(c(1000001,1000001,1000001,1000001,1000001,1000001,1000002,1000002,1000002,1000003,1000003,1000003))
Product <- as.factor(c(100001,100001,100001,100004,100004,100002,100003,100003,100003,100002,100003,100008))
input <- data.frame(Customer,Product)
I can create a contingency table now the following way:
input_df <- as.data.frame.matrix(table(input))
However, I have a third (numeric) variable which I want as the cell values in the table.
Number <- c(3,1,-4,1,1,1,1,1,1,1,1,1)
input <- data.frame(Customer,Product,Number)
With three variables, the code above no longer works. The result I am looking for has the unique Customer values as row names, the unique Product values as column names, and the summed Number as the cell value (or 0 if the combination is not present). The sums could be calculated by:
input_agg <- aggregate( Number ~ Customer + Product, data = input, sum)
Hope my question is clear, please comment if something is not clear.
You can use xtabs for that:
R> xtabs(Number~Customer+Product, data=input)
         Product
Customer  100001 100002 100003 100004 100008
  1000001      0      1      0      2      0
  1000002      0      0      3      0      0
  1000003      0      1      1      0      1
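Since the question asks for a data.frame rather than a table, here is a minimal sketch of one way to convert the xtabs result, reusing the as.data.frame.matrix call from the question. The names tab and input_df are only illustrative, and the example rebuilds the question's data so it can be run on its own:

```r
# Rebuild the example data from the question
Customer <- as.factor(c(1000001,1000001,1000001,1000001,1000001,1000001,
                        1000002,1000002,1000002,1000003,1000003,1000003))
Product  <- as.factor(c(100001,100001,100001,100004,100004,100002,
                        100003,100003,100003,100002,100003,100008))
Number   <- c(3,1,-4,1,1,1,1,1,1,1,1,1)
input    <- data.frame(Customer, Product, Number)

# Sum Number for every Customer/Product pair; pairs that never occur become 0
tab <- xtabs(Number ~ Customer + Product, data = input)

# Convert the two-dimensional table to a wide data.frame:
# customers end up as row names, products as column names
input_df <- as.data.frame.matrix(tab)

# Look up a single cell by customer and product
input_df["1000001", "100004"]
```

Note that a 0 in the result can mean either that the pair never occurs or that its values sum to zero, as for customer 1000001 and product 100001 (3 + 1 - 4).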