I have a table that looks like this:
date item_id store_id sale_num
1/1/15 33 1 10
1/1/15 33 2 12
1/1/15 33 3 15
1/1/15 44 1 54
1/1/15 44 3 66
1/2/15 33 1 14
....
I want to cast the table so that each store_id becomes its own column, with sale_num as the value. The result should look like this:
date item_id store1 store2 store3
1/1/15 33 10 12 15
1/1/15 44 54 NA 66
1/2/15 33 14 NA NA
......
When I do this with the cast function on a small sample (1,000 rows of the original table), there is no problem.
However, the original table has 38,000,000 rows and consumes 1.5 GB of memory in R. When I run cast on it, the function uses around 34 GB of memory and never finishes.
What is causing this? Is there an alternative approach?
We can use dcast from data.table, which should be more efficient than cast from reshape. Convert the 'data.frame' to a 'data.table' with setDT(df1), which converts by reference without copying the data, and then call dcast.
library(data.table)
dcast(setDT(df1), date + item_id ~ paste0("store", store_id),
      value.var = "sale_num")
# date item_id store1 store2 store3
#1: 1/1/15 33 10 12 15
#2: 1/1/15 44 54 NA 66
#3: 1/2/15 33 14 NA NA
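For anyone who wants to try this before running it on the full 38,000,000 rows, here is a minimal self-contained sketch. It rebuilds df1 from the six sample rows in the question (the numeric column types are an assumption) and first checks whether any (date, item_id, store_id) combination occurs more than once. That check is not part of the answer above; it is there because dcast falls back to counting rows (length) when a cell is not unique and no aggregate function is given. The names dups and wide are only illustrative.

```r
library(data.table)

# Sample rows copied from the question; column types are assumed.
df1 <- data.frame(
  date     = c("1/1/15", "1/1/15", "1/1/15", "1/1/15", "1/1/15", "1/2/15"),
  item_id  = c(33, 33, 33, 44, 44, 33),
  store_id = c(1, 2, 3, 1, 3, 1),
  sale_num = c(10, 12, 15, 54, 66, 14)
)

# Convert to a data.table by reference (no copy of the data is made).
setDT(df1)

# Count rows per (date, item_id, store_id); a count above 1 means the
# wide-format cell would not be unique. Empty for the sample data.
dups <- df1[, .N, by = .(date, item_id, store_id)][N > 1]

# Same call as in the answer: one column per store, sale_num as the value.
wide <- dcast(df1, date + item_id ~ paste0("store", store_id),
              value.var = "sale_num")
print(wide)
```

If dups is not empty on the real data, you would need to decide how to combine the repeated rows (for example by passing fun.aggregate = sum to dcast) before relying on the result.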