Create nested data.tables by collapsing rows into new data.tables

Example

library(data.table)
set.seed(7908)
dt <- data.table(x=1:5)[,list(y=letters[1:x],z=sample(1:100,x)),by=x]

dt
##     x y  z
##  1: 1 a 13
##  2: 2 a 27
##  3: 2 b 87
##  4: 3 a 85
##  5: 3 b 98
##  6: 3 c  1
##  7: 4 a 53
##  8: 4 b 81
##  9: 4 c 64
## 10: 4 d 45
## 11: 5 a 28
## 12: 5 b 26
## 13: 5 c 52
## 14: 5 d 55
## 15: 5 e 12

Desired output

For each unique value of x in dt, collapse the rows into a data.table with columns y and z, and store that data.table in a single list column. The result should look like this:

##    x        dt.yz
## 1: 1 <data.table>
## 2: 2 <data.table>
## 3: 3 <data.table>
## 4: 4 <data.table>
## 5: 5 <data.table>

In my actual example I've got several data tables with differing columns that I want to store in a single meta data table.

asked Aug 21 '14 16:08

dnlbrky

1 Answer

Create the data.table using y and z as the columns, and then wrap that in a list so it can be "stuffed" in a single row. Wrap that in yet another list, where you assign the resulting column name. Use by=x to do this for each unique value of x.

dt2 <- dt[, list(dt.yz=list(data.table(y, z))), by=x]
dt2
##    x        dt.yz
## 1: 1 <data.table>
## 2: 2 <data.table>
## 3: 3 <data.table>
## 4: 4 <data.table>
## 5: 5 <data.table>
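
As a quick sanity check on the nesting described above, the sketch below rebuilds the example data from the question and inspects the list column. It assumes only the names dt, dt2 and dt.yz already used in this answer.

```r
library(data.table)

# Rebuild the example data from the question
set.seed(7908)
dt <- data.table(x = 1:5)[, list(y = letters[1:x], z = sample(1:100, x)), by = x]
dt2 <- dt[, list(dt.yz = list(data.table(y, z))), by = x]

# Each element of the list column is its own data.table
class(dt2$dt.yz[[1]])
## [1] "data.table" "data.frame"

# Row counts of the nested tables, one per value of x
sapply(dt2$dt.yz, nrow)
## [1] 1 2 3 4 5
```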

As Arun points out, using .SD is shorter and faster, and may be more convenient:

dt2 <- dt[, list(dt.yz=list(.SD)), by=x]
## dt.yz will include all columns not in the `by=`;
## Use `.SDcols=` to select specific columns
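
A minimal sketch of the `.SDcols=` variant mentioned in the comments above, assuming the same dt as in the question. The names dt_z and dt.z are hypothetical and chosen only for illustration.

```r
library(data.table)

# Rebuild the example data from the question
set.seed(7908)
dt <- data.table(x = 1:5)[, list(y = letters[1:x], z = sample(1:100, x)), by = x]

# Nest only column z for each value of x; y is left out via .SDcols
dt_z <- dt[, list(dt.z = list(.SD)), by = x, .SDcols = "z"]

# The nested tables now contain a single column
names(dt_z$dt.z[[1]])
## [1] "z"
```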

To retrieve a nested data.table later, subset the meta data.table (dt2) on the desired value of x, then take the first element of the dt.yz list column, which is the nested data.table.

dt2[x==5,dt.yz[[1]]]
##    y  z
## 1: a 28
## 2: b 26
## 3: c 52
## 4: d 55
## 5: e 12
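
The same extraction can be applied to every group at once to flatten the meta data.table back into long form. This is a sketch under the assumption that all nested tables share the same columns, as they do in this example; the name flat is hypothetical.

```r
library(data.table)

# Rebuild the example data and the nested table
set.seed(7908)
dt <- data.table(x = 1:5)[, list(y = letters[1:x], z = sample(1:100, x)), by = x]
dt2 <- dt[, list(dt.yz = list(data.table(y, z))), by = x]

# For each x, return the nested data.table; data.table stacks the results
flat <- dt2[, dt.yz[[1]], by = x]

names(flat)
## [1] "x" "y" "z"
nrow(flat)
## [1] 15
```

If the nested tables had differing columns, as in the scenario described in the question, this stacking step would likely need something like rbindlist with fill=TRUE instead.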

answered Nov 11 '22 01:11

dnlbrky
