Iterating over lists stored in data.frame in R

I think this is a beginner question, but I don't appear to have the right vocabulary for an effective Google search.

I have a data.frame, final, which contains a list of clusters, each of which is a list of strings.

I would like to iterate over the list of strings in each cluster: a for loop within a for loop.

for (j in final$clusters){
    for (i in final$clusters$`j`){
        print final$clusters$`j`[i]
    }
}

j corresponds to the lists in clusters, and i corresponds to the items in clusters[j]

I was trying to do this by using the length of each cluster, which I thought would be something like length(final$clusters[1]), but that gives 1, not the length of list.

Also, final$clusters[1] gives $'1', and on the next line, all the strings in cluster 1.

Thanks.

EDIT: output of dput(str(final)), as requested:

List of 2
 $ clusters     :List of 1629
  ..$ 1   :
  ..$ 2   : 
  ..$ 3   : 
  ..$ 4   : 
  ..$ 5   : 
  ..$ 6   : 
  ..$ 7   : 
  ..$ 8   : 
  ..$ 9   : 
  ..$ 10  : 
  .. [list output truncated]
 $ cluster_stats: num [1:1629, 1:6] 0.7 0.7 0.7 0.7 0.7 0.7 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:1629] "1" "2" "3" "4" ...
  .. ..$ : chr [1:6] "min" "qu1" "median" "mean" ...
NULL
blep asked Feb 09 '13 03:02



2 Answers

I think you are confusing a list with a data.frame. I guess that your final object is a list.

To iterate over the list, you can use rapply, which is a recursive version of lapply.

For example:

## I create some reproducible example

cluster1 <- list(a='a',b='b')
cluster2 <- list(c='aaa',d='bbb')
clusters <- list(cluster1,cluster2)
final <- list(clusters)

So using rapply

rapply(final,f=print)
[1] "a"
[1] "b"
[1] "aaa"
[1] "bbb"
    a     b     c     d 
  "a"   "b" "aaa" "bbb" 

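On the length question from the post: single brackets return a sub-list, while double brackets return the element itself, which is probably why length(final$clusters[1]) gave 1. A minimal sketch, using a made-up clusters list rather than the real data:

```r
# Hypothetical stand-in for final$clusters: a named list of lists of strings
clusters <- list(`1` = list("a", "b", "c"), `2` = list("d", "e"))

sub_list <- clusters[1]     # `[` keeps the outer list wrapper
element  <- clusters[[1]]   # `[[` extracts the inner list itself

length(sub_list)   # 1: a list holding one cluster
length(element)    # 3: the strings in cluster "1"
lengths(clusters)  # length of every cluster at once
```

So length(final$clusters[[1]]) should give the number of strings in the first cluster.
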
Update after edit by OP

Using lapply, I loop over the names of the list. For each name, I extract the element with [[ (you can use [ instead if you want to keep the names and get a header in the files), then I write it to a file using write.table. Here I use the name of each list element to build the file name; in your case the file names will be numbers (1.txt, ...).

lapply(names(final$clusters),
       function(x)
           write.table(x = final$clusters[[x]],
                       file = paste(x, '.txt', sep = '')))
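
If each cluster is simply a list of strings, a variant of the same idea is to flatten it with unlist and write one string per line with writeLines. This is only a sketch: the sample clusters list is made up, and the files go to a temporary directory (out_dir, a name chosen here) rather than the working directory.

```r
# Hypothetical stand-in for final$clusters
clusters <- list(`1` = list("a", "b"), `2` = list("aaa", "bbb"))

# Write into a temporary directory so nothing in the working directory is touched
out_dir <- tempdir()

paths <- vapply(names(clusters), function(x) {
    path <- file.path(out_dir, paste0(x, ".txt"))
    # unlist() turns the list of strings into a character vector
    writeLines(unlist(clusters[[x]]), path)
    path
}, character(1))

readLines(paths[["1"]])  # read the first file back to check its contents
```

vapply is used here instead of lapply only so that the file paths come back as a named character vector.
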
agstudy answered Oct 02 '22 07:10


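I think the primary problem is the way you iterate: in for (j in final$clusters), j is already the cluster itself, not its index or name, so final$clusters$`j` looks up an element literally named "j".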

I think that something like this would work better:

for (j in final$clusters){    # j is each cluster (the inner list) itself
    for (i in j){             # i is each string in that cluster
        print(i)
    }
}

Here is the documentation for loops: http://manuals.bioinformatics.ucr.edu/home/programming-in-r#TOC-For-Loop and for subsetting: http://www.statmethods.net/management/subset.html
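
If the cluster name or the item position is needed inside the loop, an index-based version with seq_along and [[ is one option. A sketch with a made-up final list standing in for the real one (the seen vector is only there so the result can be inspected):

```r
# Hypothetical stand-in for the asker's object
final <- list(clusters = list(`1` = list("a", "b"), `2` = list("aaa", "bbb")))

seen <- character(0)  # collected only so the result can be inspected afterwards
for (j in seq_along(final$clusters)) {
    cluster <- final$clusters[[j]]       # `[[` gives the inner list itself
    for (i in seq_along(cluster)) {
        item <- cluster[[i]]             # the i-th string of cluster j
        cat("cluster", names(final$clusters)[j], "item", i, ":", item, "\n")
        seen <- c(seen, item)
    }
}
```

seq_along is used rather than 1:length(...) so that an empty cluster is skipped instead of being indexed with 1 and 0.
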

good luck

dval answered Oct 02 '22 08:10