Given a data.table object, I would like to collapse the values of some grouped columns into a single object and insert the resulting objects into a new column.
library(data.table)
dt <- data.table(
c('A|A', 'B|A', 'A|A', 'B|A', 'A|B'),
c(0, 0, 1, 1, 0),
c(22.7, 1.2, 0.3, 0.4, 0.0)
)
setnames(dt, names(dt), c('GROUPING', 'NAME', 'VALUE'))
dt
# GROUPING NAME VALUE
# 1: A|A 0 22.7
# 2: B|A 0 1.2
# 3: A|A 1 0.3
# 4: B|A 1 0.4
# 5: A|B 0 0.0
I think the first step is to specify the column to group by, so I should start with something like dt[, OBJECTS := <expr>, by = GROUPING]. Unfortunately, I don't know what expression <expr> to use so that the result is as follows:
# GROUPING OBJECTS
# 1: A|A <vector>
# 2: B|A <vector>
# 3: A|B <vector>
Each <vector> must contain the values of the other columns. E.g. the first <vector> has to be a named vector equivalent to:
eg <- c(22.7, 0.3)
names(eg) <- c('0', '1')
# 0 1
# 22.7 0.3
Working inside of j: if you want each cell of a column to hold a whole vector, you need to wrap the output in list(.). j itself requires a call to list, so your final expression will resemble a nested list, e.g.:
dt[, list(allNames=list(NAME), allValues=list(VALUE)), by=GROUPING]
# GROUPING allNames allValues
# 1: A|A 0,1 22.7,0.3
# 2: B|A 0,1 1.2,0.4
# 3: A|B 0 0
As @Mnel points out, equivalently (apart from the resulting column names, which stay NAME and VALUE):
dt[, lapply(.SD, list), by=GROUPING]
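To make the nested-list idea concrete, here is a minimal sketch of how the resulting list columns might be inspected. The names `wide` and `nValues` are introduced only for this illustration, and `dt` is rebuilt with named columns so the block runs on its own.

```r
library(data.table)

# Same data as in the question, with the columns named up front
dt <- data.table(
  GROUPING = c('A|A', 'B|A', 'A|A', 'B|A', 'A|B'),
  NAME     = c(0, 0, 1, 1, 0),
  VALUE    = c(22.7, 1.2, 0.3, 0.4, 0.0)
)

# Outer list(): one element per output column; inner list(): one cell per group
wide <- dt[, list(allNames = list(NAME), allValues = list(VALUE)), by = GROUPING]

# Both new columns are list columns
sapply(wide, class)

# Pull the whole vector stored in the cell for one group
wide[GROUPING == 'A|A', allValues[[1]]]

# Count how many values each cell holds (hypothetical helper column)
wide[, nValues := lengths(allValues)]
```

Because every cell holds a vector, `[[` is needed to reach the underlying values; `$allValues` alone returns the list of all groups.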
If you want it in long form, then the structure of your <expr> will be list( c( list(), list(), ..., list() ) ), e.g.:
dt[, list(c(list(NAME), list(VALUE))), by=GROUPING]
# GROUPING V1
# 1: A|A 0,1
# 2: A|A 22.7,0.3
# 3: B|A 0,1
# 4: B|A 1.2,0.4
# 5: A|B 0
# 6: A|B 0
Or equivalently:
dt[, list(lapply(.SD, c)), by=GROUPING]
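In the long form it is not obvious which row came from which source column. One possible variation, sketched here under the assumption that a label column is acceptable, adds a `variable` column next to the list column. The names `long`, `variable` and `values` are made up for this example.

```r
library(data.table)

# Same data as in the question, with the columns named up front
dt <- data.table(
  GROUPING = c('A|A', 'B|A', 'A|A', 'B|A', 'A|B'),
  NAME     = c(0, 0, 1, 1, 0),
  VALUE    = c(22.7, 1.2, 0.3, 0.4, 0.0)
)

# For each group, j returns two rows: one cell holding NAME, one holding VALUE.
# 'variable' labels which source column each row was built from.
long <- dt[, list(variable = c('NAME', 'VALUE'),
                  values   = list(NAME, VALUE)),
           by = GROUPING]

long
```

Each group contributes exactly two rows here, so the result has twice as many rows as there are groups.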
I think that this is what you are looking for:
dt1 <- dt[, list(list(setNames(VALUE, NAME))), by = GROUPING]
dt1
# GROUPING V1
# 1: A|A 22.7,0.3
# 2: B|A 1.2,0.4
# 3: A|B 0
str(dt1)
# Classes ‘data.table’ and 'data.frame': 3 obs. of 2 variables:
# $ GROUPING: chr "A|A" "B|A" "A|B"
# $ V1 :List of 3
# ..$ : Named num 22.7 0.3
# .. ..- attr(*, "names")= chr "0" "1"
# ..$ : Named num 1.2 0.4
# .. ..- attr(*, "names")= chr "0" "1"
# ..$ : Named num 0
# .. ..- attr(*, "names")= chr "0"
# - attr(*, ".internal.selfref")=<externalptr>
dt1$V1
# [[1]]
# 0 1
# 22.7 0.3
#
# [[2]]
# 0 1
# 1.2 0.4
#
# [[3]]
# 0
# 0
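As a small follow-up sketch, the result column can be given the name OBJECTS from the question by naming the outer list element, and the stored named vectors can then be indexed by name. The variable names `res`, `first` and `value_at_1` are only illustrative.

```r
library(data.table)

# Same data as in the question, with the columns named up front
dt <- data.table(
  GROUPING = c('A|A', 'B|A', 'A|A', 'B|A', 'A|B'),
  NAME     = c(0, 0, 1, 1, 0),
  VALUE    = c(22.7, 1.2, 0.3, 0.4, 0.0)
)

# Name the list column OBJECTS instead of the default V1
res <- dt[, list(OBJECTS = list(setNames(VALUE, NAME))), by = GROUPING]

# The first cell is the named vector for group 'A|A'
first <- res$OBJECTS[[1]]
first[['1']]   # 0.3

# Look up the element named "1" in every group; groups without it give NA
value_at_1 <- sapply(res$OBJECTS, function(v) unname(v['1']))
value_at_1
```

Note that the numeric NAME values are coerced to character when used as names, so lookups use '0' and '1' as strings.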
As @Arun points out in the comments, the "data.table" alternative to setNames in this case is setattr(VALUE, 'names', NAME), making another solution:
dt1 <- dt[, list(list(setattr(VALUE, 'names', NAME))), by = GROUPING]