I have a dataset with SKU IDs and their counts. I need to feed this data into a machine learning algorithm in such a way that the SKU IDs become columns and the COUNTs sit at the intersection of transaction ID and SKU ID. Can anyone suggest how to achieve this transformation?
CURRENT DATA
TransID SKUID COUNT
1 31 1
1 32 2
1 33 1
2 31 2
2 34 -1
DESIRED DATA
TransID 31 32 33 34
1 1 2 1 0
2 2 0 0 -1
In R, we can use either xtabs
xtabs(COUNT~., df1)
# SKUID
#TransID 31 32 33 34
# 1 1 2 1 0
# 2 2 0 0 -1
Or dcast
library(reshape2)
dcast(df1, TransID~SKUID, value.var="COUNT", fill=0)
# TransID 31 32 33 34
#1 1 1 2 1 0
#2 2 2 0 0 -1
Or spread (superseded by pivot_wider since tidyr 1.0.0, but still available)
library(tidyr)
spread(df1, SKUID, COUNT, fill=0)
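The snippets above assume the data already sits in a data frame called df1. For an end-to-end run, here is a minimal base R sketch that rebuilds the example data from the question, applies the xtabs approach, and converts the result into a plain data frame and a numeric matrix. The names tab, wide and X are only illustrative, and the sketch assumes the algorithm accepts a data frame or numeric matrix as input.

```r
# Rebuild the example data from the question; df1 is the name the snippets above assume.
df1 <- data.frame(
  TransID = c(1, 1, 1, 2, 2),
  SKUID   = c(31, 32, 33, 31, 34),
  COUNT   = c(1, 2, 1, 2, -1)
)

# Cross-tabulate: one row per transaction, one column per SKU,
# cells hold the summed COUNT (0 where a pair does not occur).
tab <- xtabs(COUNT ~ TransID + SKUID, data = df1)

# xtabs returns a table object; convert it to an ordinary data frame
# whose row names are the transaction IDs.
wide <- as.data.frame.matrix(tab)

# Numeric matrix form, in case the modelling function expects a matrix.
X <- as.matrix(wide)

print(wide)
```

Note that xtabs sums COUNT when the same TransID/SKUID pair appears more than once, which may or may not be what you want for repeated rows.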