Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

remove columns with NAs from all dataframes in list

Tags:

dataframe

r

na

I have a list made up of several data frames. I would like to remove all of the columns with NAs in each data frame. Note the columns to be removed are not the same in each data frame. Sample data provided below. Any suggestions much appreciated.

WW1_Data <- structure(list(Alnön = structure(list(Site_Name = structure(1L, .Label =
c("Alnön","Ammarnäs", "Anjan", "Bäcksand", "Fittjebodarna", "Flatruet",
"Glen", "Idre", "Klångstavallen", "Kramfors", "Ljungdalen", "Ljungris",
"Mårdsund", "Mörtsjön", "Nordmaling", "Öster_Galåbodarna", "Ramundberget",
"Rätan", "Särvfjället", "Smedstorp", "Söderhamn", "Stensoffa",
"Storulvån", "Sveg", "Tanna", "Tänndalen", "Vålådalen", "Vemdalsskalet"
), class = "factor"), X1996 = 0.307692307692308, X1997 = NA_real_,
X2000 = 0.260869565217391, X2001 = NA_real_, X2002 = NA_real_,
X2003 = NA_real_, X2008 = NA_real_, X2009 = NA_real_, X2010 = 0.0833333333333333,
X2011 = NA_real_), .Names = c("Site_Name", "X1996", "X1997",
"X2000", "X2001", "X2002", "X2003", "X2008", "X2009", "X2010",
"X2011"), row.names = 1L, class = "data.frame"), Ammarnäs = structure(list(
Site_Name = structure(2L, .Label = c("Alnön", "Ammarnäs",
"Anjan", "Bäcksand", "Fittjebodarna", "Flatruet", "Glen",
"Idre", "Klångstavallen", "Kramfors", "Ljungdalen", "Ljungris",
"Mårdsund", "Mörtsjön", "Nordmaling", "Öster_Galåbodarna",
"Ramundberget", "Rätan", "Särvfjället", "Smedstorp", "Söderhamn",
"Stensoffa", "Storulvån", "Sveg", "Tanna", "Tänndalen", "Vålådalen",
"Vemdalsskalet"), class = "factor"), X1996 = 0.75, X1997 = NA_real_,
X2000 = NA_real_, X2001 = NA_real_, X2002 = NA_real_, X2003 = NA_real_,
X2008 = NA_real_, X2009 = NA_real_, X2010 = NA_real_, X2011 = 0.8), .Names =
c("Site_Name", "X1996", "X1997", "X2000", "X2001", "X2002", "X2003", "X2008",
"X2009", "X2010", "X2011"), row.names = 2L, class = "data.frame"),
Anjan = structure(list(Site_Name = structure(3L, .Label = c("Alnön",
"Ammarnäs", "Anjan", "Bäcksand", "Fittjebodarna", "Flatruet",
"Glen", "Idre", "Klångstavallen", "Kramfors", "Ljungdalen",
"Ljungris", "Mårdsund", "Mörtsjön", "Nordmaling", "Öster_Galåbodarna",
"Ramundberget", "Rätan", "Särvfjället", "Smedstorp", "Söderhamn",
"Stensoffa", "Storulvån", "Sveg", "Tanna", "Tänndalen", "Vålådalen",
"Vemdalsskalet"), class = "factor"), X1996 = NA_real_, X1997 = NA_real_,
X2000 = NA_real_, X2001 = NA_real_, X2002 = NA_real_,
X2003 = NA_real_, X2008 = NA_real_, X2009 = 0.52, X2010 = 0.5,
X2011 = NA_real_), .Names = c("Site_Name", "X1996", "X1997",
"X2000", "X2001", "X2002", "X2003", "X2008", "X2009", "X2010",
"X2011"), row.names = 3L, class = "data.frame"), Bäcksand = structure(list(
Site_Name = structure(4L, .Label = c("Alnön", "Ammarnäs",
"Anjan", "Bäcksand", "Fittjebodarna", "Flatruet", "Glen",
"Idre", "Klångstavallen", "Kramfors", "Ljungdalen", "Ljungris",
"Mårdsund", "Mörtsjön", "Nordmaling", "Öster_Galåbodarna",
"Ramundberget", "Rätan", "Särvfjället", "Smedstorp",
"Söderhamn", "Stensoffa", "Storulvån", "Sveg", "Tanna",
"Tänndalen", "Vålådalen", "Vemdalsskalet"), class = "factor"),
X1996 = NA_real_, X1997 = NA_real_, X2000 = 0.0833333333333333,
X2001 = NA_real_, X2002 = NA_real_, X2003 = NA_real_,
X2008 = NA_real_, X2009 = NA_real_, X2010 = 0.375, X2011 = NA_real_), .Names =   
c("Site_Name", "X1996", "X1997", "X2000", "X2001", "X2002", "X2003", "X2008",
"X2009", "X2010", "X2011"), row.names = 4L, class = "data.frame"),
Fittjebodarna = structure(list(Site_Name = structure(5L, .Label = c("Alnön",
"Ammarnäs", "Anjan", "Bäcksand", "Fittjebodarna", "Flatruet",
"Glen", "Idre", "Klångstavallen", "Kramfors", "Ljungdalen",
"Ljungris", "Mårdsund", "Mörtsjön", "Nordmaling", "Öster_Galåbodarna",
"Ramundberget", "Rätan", "Särvfjället", "Smedstorp", "Söderhamn",
"Stensoffa", "Storulvån", "Sveg", "Tanna", "Tänndalen", "Vålådalen",
"Vemdalsskalet"), class = "factor"), X1996 = NA_real_, X1997 = NA_real_,
X2000 = NA_real_, X2001 = NA_real_, X2002 = NA_real_,
X2003 = NA_real_, X2008 = 0.4, X2009 = 0.423076923076923,
X2010 = NA_real_, X2011 = NA_real_), .Names = c("Site_Name",
"X1996", "X1997", "X2000", "X2001", "X2002", "X2003", "X2008",
"X2009", "X2010", "X2011"), row.names = 5L, class = "data.frame")), .Names = c("Alnön",
"Ammarnäs", "Anjan", "Bäcksand", "Fittjebodarna"))
like image 227
Keith W. Larson Avatar asked Aug 03 '12 14:08

Keith W. Larson


2 Answers

Try this

lapply(WW1_Data, function(x) x[, !is.na(x)])
$Alnön
  Site_Name     X1996     X2000      X2010
1     Alnön 0.3076923 0.2608696 0.08333333

$Ammarnäs
  Site_Name X1996 X2011
2  Ammarnäs  0.75   0.8
...
like image 88
Julius Vainora Avatar answered Sep 19 '22 23:09

Julius Vainora


Julius' answer is great, but since I already started writing...

Your question is a little ambiguous, but if you want to remove any row with an NA from each data.frame in your list:

lapply(WW1_Data, na.omit)

Or you can use your own function, assuming each data.frame in your list only has one row like these do:

myfun <- function(x) {
  x[, !is.na(x)]
}

lapply(WW1_Data, myfun)

Or switch to a single data.frame, melt it and remove rows:

out <- do.call(rbind, WW1_Data)
out.m <- melt(out, id.vars='Site_Name')
na.omit(out.m)
like image 40
Justin Avatar answered Sep 18 '22 23:09

Justin