
Create nested data.tables by collapsing rows into new data.tables

How can you create a data.table that contains nested data.tables?

Example

library(data.table)
set.seed(7908)
dt <- data.table(x=1:5)[,list(y=letters[1:x],z=sample(1:100,x)),by=x]

dt
##     x y  z
##  1: 1 a 13
##  2: 2 a 27
##  3: 2 b 87
##  4: 3 a 85
##  5: 3 b 98
##  6: 3 c  1
##  7: 4 a 53
##  8: 4 b 81
##  9: 4 c 64
## 10: 4 d 45
## 11: 5 a 28
## 12: 5 b 26
## 13: 5 c 52
## 14: 5 d 55
## 15: 5 e 12

Desired output

For each unique value of x in dt, collapse the rows into a data.table with columns y and z, and store that data.table in a single list column of the result. The result should look like this:

##    x        dt.yz
## 1: 1 <data.table>
## 2: 2 <data.table>
## 3: 3 <data.table>
## 4: 4 <data.table>
## 5: 5 <data.table>

In my actual use case, I have several data.tables with differing columns that I want to store in a single meta data.table.

asked Aug 21 '14 16:08 by dnlbrky



1 Answer

Create the data.table using y and z as the columns, and then wrap that in a list so it can be "stuffed" in a single row. Wrap that in yet another list, where you assign the resulting column name. Use by=x to do this for each unique value of x.

dt2 <- dt[, list(dt.yz=list(data.table(y, z))), by=x]
dt2
##    x        dt.yz
## 1: 1 <data.table>
## 2: 2 <data.table>
## 3: 3 <data.table>
## 4: 4 <data.table>
## 5: 5 <data.table>
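
To confirm that each cell of dt.yz really holds its own data.table, a quick check along these lines may help. This is only a sketch: it rebuilds dt as in the question, and the name n_rows is introduced here purely for illustration.

```r
library(data.table)

# Rebuild the question's table (the z values depend on the seed and R version)
set.seed(7908)
dt <- data.table(x = 1:5)[, list(y = letters[1:x], z = sample(1:100, x)), by = x]

# Nest y and z into one data.table per value of x
dt2 <- dt[, list(dt.yz = list(data.table(y, z))), by = x]

# dt.yz is a list column; every element should be a data.table
is.list(dt2$dt.yz)
all(sapply(dt2$dt.yz, is.data.table))

# Row count of each nested table; group x contributes x rows
n_rows <- sapply(dt2$dt.yz, nrow)
n_rows
```

Because the column is an ordinary list, the usual list tools (sapply, lapply, `[[`) work on it directly.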

As Arun points out, using .SD is shorter and faster, and may be more convenient:

dt2 <- dt[, list(dt.yz=list(.SD)), by=x]
## dt.yz will include all columns not in the `by=`;
## Use `.SDcols=` to select specific columns
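
As a rough illustration of the `.SDcols=` note above, the sketch below nests only column z. The names dt_z and dt.z are made up for this example and do not appear in the original answer.

```r
library(data.table)

# Rebuild the question's table (the z values depend on the seed and R version)
set.seed(7908)
dt <- data.table(x = 1:5)[, list(y = letters[1:x], z = sample(1:100, x)), by = x]

# .SDcols restricts .SD to the named columns, so each nested table has only z
dt_z <- dt[, list(dt.z = list(.SD)), by = x, .SDcols = "z"]

# Inspect the nested table for x == 3
names(dt_z$dt.z[[3]])
nrow(dt_z$dt.z[[3]])
```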

To retrieve a nested data.table later, subset the meta data.table (dt2) on the desired value of x, and then take the first element of the dt.yz list column for that row, which is the nested data.table.

dt2[x==5,dt.yz[[1]]]
##    y  z
## 1: a 28
## 2: b 26
## 3: c 52
## 4: d 55
## 5: e 12
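
The same `[[1]]` extraction can also be applied per group to flatten the meta table back into long form. This is a sketch that assumes every nested table has the same columns; the name flat is hypothetical.

```r
library(data.table)

# Rebuild the question's table and the nested version from the answer
set.seed(7908)
dt <- data.table(x = 1:5)[, list(y = letters[1:x], z = sample(1:100, x)), by = x]
dt2 <- dt[, list(dt.yz = list(data.table(y, z))), by = x]

# Each group of dt2 has one row, so dt.yz[[1]] is that group's nested table;
# by = x stacks the nested tables and adds x back as a column
flat <- dt2[, dt.yz[[1]], by = x]

names(flat)
nrow(flat)
```

If the nested tables have differing columns, this per-group stacking is unlikely to work as written, and pulling out one table at a time as shown above is the safer route.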
answered Nov 11 '22 01:11 by dnlbrky