
Applying a function to each row of a data.table


I am looking for a way to efficiently apply a function to each row of a data.table. Let's consider the following data table:

library(data.table)
library(stringr)

x <- data.table(a = c(1:3, 1), b = c('12 13', '14 15', '16 17', '18 19'))
> x
   a     b
1: 1 12 13
2: 2 14 15
3: 3 16 17
4: 1 18 19

Let's say I want to split each element of column b by space (thus yielding two rows for each row in the original data) and join the resulting data tables. For the example above, I need the following result:

   a V1
1: 1 12
2: 1 13
3: 2 14
4: 2 15
5: 3 16
6: 3 17
7: 1 18
8: 1 19

The following would work if column a has only unique values:

x[, list(str_split(b, ' ')[[1]]), by = a] 
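To see why uniqueness matters, here is a minimal sketch on the example table. It uses base `strsplit` instead of `str_split` (an assumption, only to avoid the stringr dependency) and names the result column `V1` explicitly. Rows 1 and 4 share `a = 1`, so they land in the same group, and `[[1]]` keeps only the first split of that group:

```r
library(data.table)

x <- data.table(a = c(1:3, 1), b = c("12 13", "14 15", "16 17", "18 19"))

# Grouping by a puts rows 1 and 4 in the same group; [[1]] keeps only the
# split of the first row in each group, so the tokens from row 4 are lost.
res <- x[, list(V1 = strsplit(b, " ", fixed = TRUE)[[1]]), by = a]
print(res)
```

The result has six rows rather than eight, and the values 18 and 19 do not appear.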

The following almost works (unless there are some identical rows in the original data table), but is ugly when x has many columns and copies column b to the result, which I would like to avoid.

> x[, list(str_split(b, ' ')[[1]]), by = list(a,b)]
   a     b V1
1: 1 12 13 12
2: 1 12 13 13
3: 2 14 15 14
4: 2 14 15 15
5: 3 16 17 16
6: 3 16 17 17
7: 1 18 19 18
8: 1 18 19 19

What would be the most efficient and idiomatic way to solve this problem?

Victor K. asked Mar 28 '13 03:03




2 Answers

How about:

x
   a     b
1: 1 12 13
2: 2 14 15
3: 3 16 17
4: 1 18 19

x[,list(a=rep(a,each=2), V1=unlist(strsplit(b," ")))]
   a V1
1: 1 12
2: 1 13
3: 2 14
4: 2 15
5: 3 16
6: 3 17
7: 1 18
8: 1 19

Generalized solution, given the comment (rows may split into different numbers of pieces):

x[,{s=strsplit(b," ");list(a=rep(a,sapply(s,length)), V1=unlist(s))}] 
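As a minimal sketch of how the generalized form behaves, the example below uses a hypothetical table `y` (not from the question) whose rows split into different numbers of tokens. It uses `lengths(s)`, which should be equivalent to `sapply(s, length)` here:

```r
library(data.table)

# Hypothetical input: rows of b hold different numbers of space-separated tokens
y <- data.table(a = c(1, 2, 1), b = c("12 13 14", "15", "16 17"))

res <- y[, {
  s <- strsplit(b, " ", fixed = TRUE)  # list with one character vector per row
  list(a  = rep(a, lengths(s)),        # repeat each a once per token in its row
       V1 = unlist(s))                 # flatten all tokens into a single column
}]
print(res)
```

Because no `by` is involved, the split runs once over the whole column, and duplicate values in `a` are not a problem.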
Matt Dowle answered Oct 20 '22 00:10



x[, .(a, unlist(strsplit(b,' '))), by=1:nrow(x)]

by=1:nrow(x) is a simple way to force one row per by-group.
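A minimal sketch of per-row grouping on the question's table, assuming the goal is the flat two-column result shown in the question. The split is unlisted so that each token becomes its own row, and the column name `V1` is set explicitly:

```r
library(data.table)

x <- data.table(a = c(1:3, 1), b = c("12 13", "14 15", "16 17", "18 19"))

# Group by row index so every group is exactly one row, even when a repeats.
# Within a group, a has length 1 and is recycled to match the split tokens.
res <- x[, .(a, V1 = unlist(strsplit(b, " ", fixed = TRUE))), by = 1:nrow(x)]

# Keep only the two columns of interest, dropping the row-index grouping column.
res <- res[, .(a, V1)]
print(res)
```

Grouping by row calls `j` once per row, so on large tables it will likely be slower than a single vectorized `strsplit` over the whole column.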

Aaron McDaid answered Oct 20 '22 01:10
