How to flatten R data frame that contains lists?

Tags: r
I want to find the best "R way" to flatten a dataframe that looks like this:

  CAT    COUNT     TREAT
   A     1,2,3     Treat-a, Treat-b
   B     4,5       Treat-c,Treat-d,Treat-e

So it will be structured like this:

   CAT   COUNT1  COUNT2 COUNT3  TREAT1   TREAT2   TREAT3
    A    1       2      3       Treat-a  Treat-b  NA 
    B    4       5      NA      Treat-c  Treat-d  Treat-e 

Example code to generate the source dataframe:

df <- data.frame(CAT = c("A", "B"))
df$COUNT <- list(1:3, 4:5)
df$TREAT <- list(paste("Treat-", letters[1:2], sep = ""), paste("Treat-", letters[3:5], sep = ""))

I believe I need a combination of rbind and unlist? Any help would be greatly appreciated. - Tim

Tim asked Dec 10 '15 15:12



1 Answer

Here is a base R solution. It accepts vectors of any length inside your list columns, and you do not need to specify which columns of the dataframe to expand. Part of the solution was adapted from this answer.

df2 <- do.call(cbind, lapply(df, function(x) {
  # check if it is a list, otherwise just return as is
  if (is.list(x)) {
    # index every element up to the longest length, so shorter ones are padded with NA
    return(data.frame(t(sapply(x, '[', seq(max(sapply(x, length)))))))
  } else {
    return(x)
  }
}))
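
Note that the result does not carry the COUNT1, COUNT2, ... names from the question: the expanded columns come out as COUNT.X1, COUNT.X2, and so on. Below is a minimal sketch of one way to rename them, assuming the original column names do not themselves contain ".X". The name flat is only illustrative.

```r
# Source data frame from the question: two list columns of uneven length
df <- data.frame(CAT = c("A", "B"))
df$COUNT <- list(1:3, 4:5)
df$TREAT <- list(paste0("Treat-", letters[1:2]), paste0("Treat-", letters[3:5]))

# Flatten the list columns with the approach shown above
flat <- do.call(cbind, lapply(df, function(x) {
  if (is.list(x)) {
    data.frame(t(sapply(x, "[", seq(max(sapply(x, length))))))
  } else {
    x
  }
}))

# Drop the ".X" infix so COUNT.X1 becomes COUNT1, TREAT.X2 becomes TREAT2, etc.
# Assumption: ".X" appears only where it was added during flattening.
names(flat) <- sub(".X", "", names(flat), fixed = TRUE)
```

Using fixed = TRUE keeps the dot from being treated as a regular-expression wildcard.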

As of R 3.2.0, lengths(x) can replace sapply(x, length):

df3 <- do.call(cbind.data.frame, lapply(df, function(x) {
  # check if it is a list, otherwise just return as is
  if (is.list(x)) {
    data.frame(t(sapply(x, '[', seq(max(lengths(x))))))
  } else {
    x
  }
}))
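
One edge case to be aware of: if every element of a list column has length 1, sapply simplifies to a plain vector, and t() then gives a single row instead of one row per observation. The following is a sketch of a variant that avoids that simplification by padding each element and binding the rows explicitly. flatten_list_cols is a hypothetical helper name, and the sketch assumes each list column holds atomic vectors, with no NULL entries and at least one non-empty element.

```r
# Hypothetical helper: expand every list column of `d` into one column per position.
flatten_list_cols <- function(d) {
  parts <- lapply(names(d), function(nm) {
    col <- d[[nm]]
    if (!is.list(col)) {
      # Ordinary column: keep it as a one-column data frame under its own name
      return(setNames(data.frame(col, stringsAsFactors = FALSE), nm))
    }
    width <- max(lengths(col))
    # Indexing past the end of a vector yields NA, which pads the shorter elements
    padded <- lapply(col, function(v) v[seq_len(width)])
    # rbind always returns a matrix with one row per element, even when width is 1
    out <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
    setNames(out, paste0(nm, seq_len(width)))
  })
  do.call(cbind, parts)
}

# Usage with the data frame from the question
df <- data.frame(CAT = c("A", "B"))
df$COUNT <- list(1:3, 4:5)
df$TREAT <- list(paste0("Treat-", letters[1:2]), paste0("Treat-", letters[3:5]))
flat2 <- flatten_list_cols(df)
```

This version also names the new columns COUNT1, COUNT2, ... directly, so no renaming step is needed.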

Data used:

df <- structure(list(CAT = structure(1:2, .Label = c("A", "B"), class = "factor"), 
    COUNT = list(1:3, 4:5), TREAT = list(c("Treat-a", "Treat-b"
    ), c("Treat-c", "Treat-d", "Treat-e"))), .Names = c("CAT", 
"COUNT", "TREAT"), row.names = c(NA, -2L), class = "data.frame")
Heroka answered Sep 20 '22 13:09