Say I have the following dataframe df
name <- c("Bill", "Rob", "Joe", "Joe")
address <- c("123 Main St", "234 Broad St", NA, "456 North Ave")
favteam <- c("Dodgers", "Mets", "Pirates", NA)
df <- data.frame(name = name,
address = address,
favteam = favteam)
df
Which looks like:
name address favteam
1 Bill 123 Main St Dodgers
2 Rob 234 Broad St Mets
3 Joe <NA> Pirates
4 Joe 456 North Ave <NA>
What I want to do is collapse (coalesce) rows by name (or in general, any number of grouping variables) and have any other value than NA
replace the NA
value in the final data, like so:
df_collapse <- foo(df)
name address favteam
1 Bill 123 Main St Dodgers
2 Rob 234 Broad St Mets
3 Joe 456 North Ave Pirates
Here's an option with dplyr:
library(dplyr)
df %>%
group_by(name) %>%
summarise_each(funs(first(.[!is.na(.)]))) # or summarise_each(funs(first(na.omit(.))))
#Source: local data frame [3 x 3]
#
# name address favteam
#1 Bill 123 Main St Dodgers
#2 Joe 456 North Ave Pirates
#3 Rob 234 Broad St Mets
And with data.table:
library(data.table)
setDT(df)[, lapply(.SD, function(x) x[!is.na(x)][1L]), by = name]
# name address favteam
#1: Bill 123 Main St Dodgers
#2: Rob 234 Broad St Mets
#3: Joe 456 North Ave Pirates
Or
setDT(df)[, lapply(.SD, function(x) head(na.omit(x), 1L)), by = name]
Edit:
You say in your actual data you have varying numbers of non-NA responses per name. In that case, the following approach may be helpful.
Consider this modified sample data (look at last row):
name <- c("Bill", "Rob", "Joe", "Joe", "Joe")
address <- c("123 Main St", "234 Broad St", NA, "456 North Ave", "123 Boulevard")
favteam <- c("Dodgers", "Mets", "Pirates", NA, NA)
df <- data.frame(name = name,
address = address,
favteam = favteam)
df
# name address favteam
#1 Bill 123 Main St Dodgers
#2 Rob 234 Broad St Mets
#3 Joe <NA> Pirates
#4 Joe 456 North Ave <NA>
#5 Joe 123 Boulevard <NA>
Then, you can use this data.table approach to get the non-NA responses that can be varying in number by name:
setDT(df)[, lapply(.SD, function(x) unique(na.omit(x))), by = name]
# name address favteam
#1: Bill 123 Main St Dodgers
#2: Rob 234 Broad St Mets
#3: Joe 456 North Ave Pirates
#4: Joe 123 Boulevard Pirates
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With