Problem: To do some survey analysis on prescription drug use in R, I need to collapse multiple rows for the same person (ID) into one row, with each column set to TRUE if any of that person's rows has TRUE in it.
Here's the data:
df <- data.frame(ID = c("a","a","a","a","a","a"),
cardiovasc = c(T,T,T,T,T,T),
beta_blockers = c(F,F,F,F,F,F),
antibiotics = c(T,F,F,F,F,F),
stringsAsFactors=FALSE)
Here's what I'd like it to look like:
goal <- data.frame(ID = c("a"),
cardiovasc = c(T),
beta_blockers = c(F),
antibiotics = c(T),
stringsAsFactors=FALSE)
As you can tell, even though df$antibiotics has only one TRUE in the dataset, I'd like that to count as TRUE when the ID has been collapsed into one row.
What I've tried:
Mainly, I've been trying to work off this post, and while I feel I'm close, I nevertheless get an error. Here's my attempt:
df <- df[, lapply(.SD, paste0, collapse=""), by=ID]
Which yields unused argument (by = ID). I've tried another approach from the same post, but that's even messier and requires me to make the data a data.table. I need to keep things as a data.frame.
Any ideas?
We can use any instead of paste, as any will check for any TRUE elements in each column, grouped by 'ID':
library(data.table)
setDT(df)[, lapply(.SD, any), ID]
-output
# ID cardiovasc beta_blockers antibiotics
#1: a TRUE FALSE TRUE
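Note that setDT() converts df to a data.table by reference, while the question asks to keep a data.frame. One option is to wrap the result in as.data.frame(). Another is to stay in base R entirely; the following is only a minimal sketch of that idea using aggregate(), with the example data from the question (the name result is just illustrative):

```r
# Example data from the question.
df <- data.frame(ID = c("a", "a", "a", "a", "a", "a"),
                 cardiovasc = c(TRUE, TRUE, TRUE, TRUE, TRUE, TRUE),
                 beta_blockers = c(FALSE, FALSE, FALSE, FALSE, FALSE, FALSE),
                 antibiotics = c(TRUE, FALSE, FALSE, FALSE, FALSE, FALSE),
                 stringsAsFactors = FALSE)

# Every column except ID is treated as a logical flag to collapse.
value_cols <- setdiff(names(df), "ID")

# Group the flag columns by ID and reduce each group with any().
# The result is a plain data.frame, so no data.table conversion happens.
result <- aggregate(df[value_cols], by = list(ID = df$ID), FUN = any)
print(result)
```

With this approach df itself is left untouched and result has one row per ID.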
Or you can use this tidyverse solution:
library(dplyr)
df %>%
group_by(ID) %>%
summarise(across(cardiovasc:antibiotics, ~ any(.x)))
# A tibble: 1 x 4
ID cardiovasc beta_blockers antibiotics
<chr> <lgl> <lgl> <lgl>
1 a TRUE FALSE TRUE
Updated
Thank you dear @Ray for bringing up a very likely scenario: in case the column values were 1 and 0 instead of TRUE and FALSE, and also taking into account the presence of NA values among them, we could use the following solution:
df %>%
group_by(ID) %>%
summarise(across(cardiovasc:antibiotics, ~ any(.x[!is.na(.x)] == 1)))
# A tibble: 1 x 4
ID cardiovasc beta_blockers antibiotics
<chr> <lgl> <lgl> <lgl>
1 a TRUE FALSE TRUE
Data
df <- data.frame(ID = c("a","a","a","a","a","a"),
cardiovasc = c(1,1,1,1,1,1),
beta_blockers = c(0,0,0,0,0,0),
antibiotics = c(1,0,0,0,0,0),
stringsAsFactors=FALSE)
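The example data above contains no NA values, so it does not exercise the NA handling. As a rough illustration, the sketch below uses made-up data with some NA entries and base R only. The names df_na and any_one are hypothetical, and any(x == 1, na.rm = TRUE) is used as an assumed equivalent of any(.x[!is.na(.x)] == 1):

```r
# Hypothetical 0/1 data with missing values, for illustration only.
df_na <- data.frame(ID = c("a", "a", "a", "b", "b"),
                    cardiovasc = c(1, NA, 0, 0, NA),
                    antibiotics = c(0, 0, NA, NA, NA),
                    stringsAsFactors = FALSE)

# TRUE if at least one non-missing value equals 1; NA entries are ignored.
# A group that contains only NA values therefore collapses to FALSE.
any_one <- function(x) any(x == 1, na.rm = TRUE)

value_cols <- setdiff(names(df_na), "ID")
collapsed <- aggregate(df_na[value_cols], by = list(ID = df_na$ID), FUN = any_one)
print(collapsed)
```

Whether an all-NA group should become FALSE or stay NA depends on the analysis; here it becomes FALSE, which matches what the filtering approach above does.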