
Converting a data.frame to a list of lists

Tags: dataframe, r

How can I convert a data.frame

df <- data.frame(id=c("af1", "af2"), start=c(100, 115), end=c(114,121))

to a list of lists like the following?

LoL <- list(list(id="af1", start=100, end=114), list(id="af2", start=115, end=121))

I've tried things like

not.LoL <- as.list(as.data.frame(t(df)))

and I'm really not sure what I end up with after this, but it isn't quite right. My requirement is that I can access, say, the first start by the command

> LoL[[1]]$start
[1] 100

The not.LoL that I currently have gives me the following error:

> not.LoL[[1]]$start
Error in not.LoL[[1]]$start : $ operator is invalid for atomic vectors

Explanations and/or solutions would be greatly appreciated.

Edit: I should have made it clear that "id" here is actually non-unique - there can be multiple elements under a single ID. So I could do with a solution that doesn't depend on unique IDs to split on.

MattLBeck asked Feb 06 '13 13:02


4 Answers

You can use apply to turn your data frame into a list of lists like this:

LoL <- apply(df,1,as.list)

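However, this will change all your data to text: apply first converts the data frame to a matrix, and because the id column is character, every value in that matrix becomes character. The function then receives each row as a single atomic character vector.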
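A minimal sketch of that coercion, using the data frame from the question. The `stringsAsFactors = FALSE` argument is my addition so that the example behaves the same on R versions before 4.0.

```r
df <- data.frame(id = c("af1", "af2"), start = c(100, 115), end = c(114, 121),
                 stringsAsFactors = FALSE)

# apply() converts df to a matrix first; because id is character,
# every cell of that matrix becomes character as well.
LoL <- apply(df, 1, as.list)

# The list-of-lists shape is right, but start is now a string.
class(LoL[[1]]$start)

# Converting back by hand recovers the number.
as.numeric(LoL[[1]]$start)
```

If the numeric columns need to stay numeric, a row-wise approach that never goes through a matrix is probably the safer choice.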

MvG answered Nov 11 '22 13:11


In base R, mapply is considerably faster than split or lapply. However, you have to invoke it via do.call so that each column of the data frame is passed as a separate argument. Note that f and f3 below return one-row data frames, whereas f2 returns plain lists.

df <- sleep

f <- function(df) {
  lapply(seq_len(nrow(df)), function(row) {
    df[row, , drop = FALSE]
  })
}

f2 <- function(df) {
  do.call("mapply", c(list, df, SIMPLIFY = FALSE, USE.NAMES=FALSE))
}

f3 <- function(df) {
  split(df, seq(nrow(df)))
}

microbenchmark::microbenchmark(f(df), f2(df), f3(df))
#> Unit: microseconds
#>    expr     min       lq     mean   median       uq       max neval
#>   f(df) 573.799 607.8375 759.1721 626.0095 752.9465  2861.961   100
#>  f2(df) 114.819 123.5190 155.5185 129.9210 141.4340  1375.573   100
#>  f3(df) 598.774 625.6025 813.6837 634.5855 684.3825 11230.678   100

Created on 2019-10-09 by the reprex package (v0.3.0)
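As a rough check, here is the same mapply idea applied to the data frame from the question rather than to sleep. Passing `FUN = list` by name and setting `stringsAsFactors = FALSE` are my own choices for clarity, not part of the benchmark above.

```r
df <- data.frame(id = c("af1", "af2"), start = c(100, 115), end = c(114, 121),
                 stringsAsFactors = FALSE)

# Each column of df becomes one named argument to mapply(), so list() is
# called once per row with id, start and end taken from that row.
LoL <- do.call(mapply, c(list(FUN = list), df,
                         SIMPLIFY = FALSE, USE.NAMES = FALSE))

LoL[[1]]$start         # numeric, not text
class(LoL[[1]]$start)
LoL[[2]]$id
```

Because the columns are never combined into a single matrix, each value keeps its original type.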

Neal Fultz answered Nov 11 '22 11:11


LMAo <- lapply(split(df,df$id), function(x) as.list(x)) # is one way

# more succinctly
# LMAo <- lapply(split(df,df$id), as.list)

An edited solution following your comment, which splits by row instead of by id:

lapply( split(df,seq_along(df[,1])), as.list)
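
To illustrate how the two splits differ when id is not unique, here is a small sketch. The three-row data frame is made up for illustration, and `seq_len(nrow(df))` is used as an equivalent of `seq_along(df[,1])`.

```r
# Hypothetical data with a repeated id
df <- data.frame(id = c("af1", "af1", "af2"),
                 start = c(100, 115, 130),
                 end = c(114, 121, 140),
                 stringsAsFactors = FALSE)

# Splitting by id: one element per distinct id, so rows sharing an id are merged
by_id <- lapply(split(df, df$id), as.list)
length(by_id)
by_id$af1$start      # both start values for af1

# Splitting by row number: one element per row, regardless of id
by_row <- lapply(split(df, seq_len(nrow(df))), as.list)
length(by_row)
by_row[[2]]$start
```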

user1317221_G answered Nov 11 '22 12:11


Using plyr, you can do this:

library(plyr)
dlply(df, .(id), c)

If id is not unique and you want to avoid grouping by it, try the following (you may need to change the column name; id is unique in my example):

dlply(df, 1, c)

agstudy answered Nov 11 '22 13:11