
Convert data frame to list

I am trying to go from a data frame to a list structure in R (and I know that, technically, a data frame is a list). I have a data frame containing reference chemicals and their mechanisms at different targets. For example, estrogen is an estrogen receptor agonist. I would like to transform the data frame into a list, because I am tired of typing out something like:

refchem$chemical_id[refchem$target=="AR" & refchem$mechanism=="Agonist"]

every time I need to access the list of specific reference chemicals. I would much rather access the chemicals by:

refchem$AR$Agonist

I am looking for a general answer, even though I have given a simplified example, because not all targets have all mechanisms.

This is really easy to accomplish with a loop:

example <- data.frame(target=rep(c("t1","t2","t3"),each=20),
                      mechan=rep(c("m1","m2"),each=10,times=3),
                      chems=paste0("chem",1:60))
oneoption <- list()
for(target in unique(example$target)){
  oneoption[[target]] <- list()
  for(mech in unique(example$mechan)){
    oneoption[[target]][[mech]] <- as.character(example$chems[ example$target==target & example$mechan==mech ])
  }
}

I am just wondering if there is a more clever way to do it. I tried playing around with lapply and did not make any progress.

dayne asked Dec 11 '22 11:12



2 Answers

Using split:

split(refchem, list(refchem$target, refchem$mechanism))

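That should do the trick. Note that split() returns a new list of data frames, one per target/mechanism combination, so the result has to be assigned (for example, back to refchem).

The new way to access a subset would then be refchem$AR.Agonist, and the chemical IDs themselves would be refchem$AR.Agonist$chemical_id.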
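As a minimal sketch using the example data frame from the question (the names flat and nested are only illustrative), split() can also be applied to the chems column directly, and a second split() inside lapply() gives the nested target$mechanism access the question asks for:

```r
# Example data from the question; stringsAsFactors = FALSE keeps chems as character
example <- data.frame(target = rep(c("t1", "t2", "t3"), each = 20),
                      mechan = rep(c("m1", "m2"), each = 10, times = 3),
                      chems  = paste0("chem", 1:60),
                      stringsAsFactors = FALSE)

# Flat list named "target.mechan"; drop = TRUE omits combinations that never occur
flat <- split(example$chems, list(example$target, example$mechan), drop = TRUE)
flat$t1.m1

# Nested list: split by target first, then split each piece by mechanism
nested <- lapply(split(example, example$target),
                 function(d) split(d$chems, d$mechan))
nested$t2$m2
```

Because each target is split separately in the nested version, a target only gets entries for the mechanisms it actually has, which matters when not all targets have all mechanisms.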

Señor O answered Jan 04 '23 19:01



If you make a keyed data.table instead, ...

  • you'll still have all the data in one data.frame (instead of a possibly-nested list of many);
  • you may find iterating over these subsets nicer; and
  • the syntax is pretty clean:

To access a subset:

DT[.('AR','Agonist')] 

To do something for each group, with the per-group results rbind-ed together in the output:

DT[,{do stuff},by=key(DT)]

Similar to aggregate(), any list of vectors of the correct length can go into the by, not just the key.

Finally, DT came from...

require(data.table)
DT <- data.table(refchem,key=c('target','mechanism'))
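For a concrete version of the above, here is a rough sketch on the example data frame from the question, assuming the data.table package is installed (the column name n_chems and the object counts are made up for illustration):

```r
library(data.table)

# Example data from the question
example <- data.frame(target = rep(c("t1", "t2", "t3"), each = 20),
                      mechan = rep(c("m1", "m2"), each = 10, times = 3),
                      chems  = paste0("chem", 1:60),
                      stringsAsFactors = FALSE)

# Keyed data.table: rows are sorted by target, then mechan
DT <- data.table(example, key = c("target", "mechan"))

# Keyed subset: all rows for target t2 with mechanism m1
DT[.("t2", "m1")]

# Only the chemical names for that subset, as a character vector
t2_m1 <- DT[.("t2", "m1"), chems]

# One result row per (target, mechan) group, bound together automatically
counts <- DT[, .(n_chems = .N), by = key(DT)]
```

Here .N simply stands in for the "do stuff" step; any expression that returns a list or vector per group could go in its place.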
Frank answered Jan 04 '23 20:01
