
String grouping (aggregation) with data.table (R 3.1.1)

Input: I have this data:

library(data.table)
ids <- c(10, 10, 10, 11, 12, 12)
items <- c('soup', 'rice', 'lemon', 'chicken', 'lamb', 'noodles')
orders <- as.data.table(list(id=ids, item=items))

> orders
   id    item
1: 10    soup
2: 10    rice
3: 10   lemon
4: 11 chicken
5: 12    lamb
6: 12 noodles

Goal: I need to arrive at this (all items grouped by their id):

   id           items
1: 10 soup,rice,lemon
2: 11         chicken
3: 12    lamb,noodles

What I did: I am using data.table on R 3.1.1 (the latest release) and tried the following, which I expected to work:

orders[,list(items=list(item)), by=id]

But I get the following (incorrect) output:

   id              items
1: 10 lamb,noodles,lemon
2: 11 lamb,noodles,lemon
3: 12 lamb,noodles,lemon

What am I doing wrong, and what is the right way to group strings correctly with data.table?

Gopalakrishna Palem asked Jan 11 '23 05:01


1 Answer

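The syntax for what you seem to be looking for is a little awkward, but it makes sense once you think about how you would normally use list: the outer list() builds the columns of the result, and the inner list() wraps each group's values as a single element of a list column.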

Try the following:

orders[, list(item = list(item)), by = "id"]
#    id            item
# 1: 10 soup,rice,lemon
# 2: 11         chicken
# 3: 12    lamb,noodles
str(.Last.value)
# Classes ‘data.table’ and 'data.frame':  3 obs. of  2 variables:
#  $ id  : num  10 11 12
#  $ item:List of 3
#   ..$ : chr  "soup" "rice" "lemon"
#   ..$ : chr "chicken"
#   ..$ : chr  "lamb" "noodles"
#  - attr(*, ".internal.selfref")=<externalptr> 
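
Note that the result above is a list column, not a column of comma-separated strings; it only prints that way. If an actual string per id is what you need, collapsing with paste may be a better fit. The following is a minimal sketch, not a definitive fix: the names as_string and as_list are made up for illustration, and wrapping the values in c(item) is only an assumption about what might help if your data.table version returns the same items for every group, since it is intended to give each group its own copy of the values.

```r
library(data.table)

# Same data as in the question, rebuilt so the example is self-contained.
orders <- data.table(
  id   = c(10, 10, 10, 11, 12, 12),
  item = c("soup", "rice", "lemon", "chicken", "lamb", "noodles")
)

# Variant 1: one comma-separated string per id (a plain character column).
as_string <- orders[, list(items = paste(item, collapse = ",")), by = id]

# Variant 2: a list column, as in the answer above.
# c(item) is an assumed defensive copy of each group's values (see note above).
as_list <- orders[, list(items = list(c(item))), by = id]

print(as_string)
str(as_list$items)
```

Variant 1 is usually easier to write out to a file or compare as text, while variant 2 keeps the individual items available for further processing, e.g. as_list$items[[1]].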
A5C1D2H2I1M1N2O1R2T1 answered Jan 20 '23 04:01