How to subset data.frames stored in a list?

I created a list and stored one data frame in each component. Now I would like to filter those data frames, keeping only the rows that have NA in a specific column. The result should be another list whose data frames contain only the rows with NA in that column.

Here is some code to clarify what I mean. Assume d1 and d2 are my data frames:

set.seed(1)

d1<-data.frame(a=rnorm(5), b=c(rep(2006, times=4),NA))
d2<-data.frame(a=1:5, b=c(2007, 2007, NA, NA, 2007))  

print(d1)
           a    b
1  1.3011543 2006
2  0.3780023 2006
3 -0.3101449 2006
4 -1.3927445 2006
5 -1.0726218   NA

print(d2)
  a    b
1 1 2007
2 2 2007
3 3   NA
4 4   NA
5 5 2007

which I place in a list with a for loop:

ls<-list()

for (i in 1:2){ 

  str<-paste("d", i, sep="")
  dat<-get(str)
  ls[[str]]<-dat

}

Now I would like to filter each list component so as to keep only the rows where column b is NA. I tried the following command, knowing from the start that it would fail. My problem is that I don't know whether subset() is the right function to use and, if it is, how to refer to each data frame (that is, the first argument of subset()).

lsNA<-lapply(ls, subset(ls, is.na(b)))

Can you please help me get past my severe limitations?

Riccardo asked Nov 13 '13 11:11



2 Answers

lapply's second argument is a function (here, subset), and any extra arguments for subset are passed through lapply's ... argument. Hence:

my.ls <- list(d1 = d1, d2 = d2)
my.lsNA <- lapply(my.ls, subset, is.na(b))

(I am also showing you how to easily create the list of data.frames without using get, and recommend you don't use ls as a variable name since it is also the name of a rather common function.)
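
For readers who would rather not rely on subset()'s non-standard evaluation, here is a minimal sketch of an equivalent filter using an anonymous-style helper and plain indexing. It assumes the d1 and d2 from the question (the values in column a are arbitrary here), and the name keep_na_b is hypothetical, introduced only for illustration.

```r
# Rebuild the example data from the question (column a values are arbitrary).
d1 <- data.frame(a = rnorm(5), b = c(rep(2006, times = 4), NA))
d2 <- data.frame(a = 1:5, b = c(2007, 2007, NA, NA, 2007))
my.ls <- list(d1 = d1, d2 = d2)

# Hypothetical helper: keep only the rows where column b is NA.
# drop = FALSE makes sure the result stays a data frame.
keep_na_b <- function(df) df[is.na(df$b), , drop = FALSE]

# Apply the helper to every data frame; the result is again a named list.
my.lsNA <- lapply(my.ls, keep_na_b)

# Number of rows kept in each data frame.
sapply(my.lsNA, nrow)
# d1 d2
#  1  2
```

The list names (d1, d2) carry over to the result, so my.lsNA$d2 should hold only rows 3 and 4 of d2.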

flodel answered Oct 10 '22 01:10


Regarding the question in @Riccardo's last comment, try:

lapply(my.ls, "[", 1)

or alternately:

lapply(my.ls, "[[", 1)

depending on whether you want the output to be a list of data frames (the first form, with "[") or a list of vectors (the second form, with "[[").
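
A small sketch of the difference, assuming the same my.ls as in the first answer (rebuilt here with arbitrary values in column a so the snippet runs on its own). The names as_frames and as_vectors are illustrative only.

```r
# Rebuild the example list (column a values are arbitrary).
d1 <- data.frame(a = rnorm(5), b = c(rep(2006, times = 4), NA))
d2 <- data.frame(a = 1:5, b = c(2007, 2007, NA, NA, 2007))
my.ls <- list(d1 = d1, d2 = d2)

# "[" with a single index selects columns and keeps the data frame structure.
as_frames <- lapply(my.ls, "[", 1)

# "[[" extracts the first column itself, so each element is a plain vector.
as_vectors <- lapply(my.ls, "[[", 1)

class(as_frames$d2)   # "data.frame"
class(as_vectors$d2)  # "integer"
```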

Scott Kaiser answered Oct 10 '22 01:10