
How to split a list of data.frame and apply a function to one column?

I have a small question about apply functions. For example I have:

l <- list(a = data.frame(A1=rep(10,5),B1=c(1,1,1,2,2),C1=c(5,10,20,7,30)),
          b = data.frame(A1=rep(20,5),B1=c(3,3,4,4,4),C1=c(3,5,10,20,30)))

I want to find the minimum C1 for each B1. The result should be:

$a
  A1 B1 C1
  10  1  5
  10  2  7

$b
  A1 B1 C1
  20  3  3
  20  4  10

I know how to do it with 'for', but there has to be an easier way with 'lapply'. I couldn't make it work.

Please help

Tali asked Feb 12 '13 09:02



2 Answers

What about combining lapply and tapply:

lapply(l, function(i) tapply(i$C1, i$B1, min))
$a
1 2 
5 7 

$b
3  4 
3 10 

The trick to thinking about multiple operations is to split the task into bits. So:

  1. Minimum C1 for each B1. How do we do this for a single data frame?

    i = l[[1]]
    tapply(i$C1, i$B1, min)
    
  2. Each element of a list? Just use lapply:

    lapply(l, function(i) tapply(i$C1, i$B1, min))
    

If you can't do step 1, you won't be able to manage step 2.
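Note that tapply returns a named vector of minima per data frame, while the expected output in the question shows whole rows (A1, B1, C1). A minimal base R sketch of the same two-step idea that keeps the full rows could look like this; the helper name min_rows is made up for illustration, and ties are resolved by taking the first minimal row:

```r
l <- list(a = data.frame(A1 = rep(10, 5), B1 = c(1, 1, 1, 2, 2), C1 = c(5, 10, 20, 7, 30)),
          b = data.frame(A1 = rep(20, 5), B1 = c(3, 3, 4, 4, 4), C1 = c(3, 5, 10, 20, 30)))

# Step 1: for a single data frame, keep the row with the smallest C1 per B1.
# min_rows is a hypothetical helper name, not part of any package.
min_rows <- function(d) {
  # split() cuts the data frame into one piece per B1 value;
  # which.min() picks the first row with the minimal C1 in each piece
  pieces <- lapply(split(d, d$B1), function(g) g[which.min(g$C1), , drop = FALSE])
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL  # drop the row names that rbind builds from the split labels
  out
}

# Step 2: apply it to each element of the list
res <- lapply(l, min_rows)
```

Here res$a and res$b are ordinary data frames with the original columns, one row per B1 value.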

csgillespie answered Sep 21 '22 17:09



Having recently succumbed to the siren song of the data.table package and its combination of versatility and speed for doing operations like this, I submit yet another solution:

library(data.table)
lapply(l, function(dat) {
    data.table(dat, key="B1,C1")[list(unique(B1)), mult="first"]
})

If retaining the original column order is important for some reason, the whole data.table(...)[...] expression could be wrapped in setcolorder(..., names(dat)).
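As a rough sketch of that column-order suggestion, assuming the data.table package is installed: the intermediate names dt and first_rows are illustrative only, and unique(dat$B1) is used in place of unique(B1) so the lookup values do not depend on how the join argument is evaluated.

```r
library(data.table)

l <- list(a = data.frame(A1 = rep(10, 5), B1 = c(1, 1, 1, 2, 2), C1 = c(5, 10, 20, 7, 30)),
          b = data.frame(A1 = rep(20, 5), B1 = c(3, 3, 4, 4, 4), C1 = c(3, 5, 10, 20, 30)))

res <- lapply(l, function(dat) {
  # Keying on B1 then C1 sorts the rows, so the first row of each
  # B1 group is the one with the smallest C1
  dt <- data.table(dat, key = c("B1", "C1"))
  # Look up each distinct B1 value and keep only the first match
  first_rows <- dt[list(unique(dat$B1)), mult = "first"]
  # Put the columns back in the order of the input data frame
  # (setcolorder modifies first_rows by reference)
  setcolorder(first_rows, names(dat))
  first_rows
})
```

Each element of res is a data.table with columns in the order A1, B1, C1 and one row per B1 value.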

regetz answered Sep 20 '22 17:09
