Suppose this is my dataset:
Id Weight Category
1 10.2 Pre
1 12.1 Post
2 11.3 Post
3 12.9 Pre
4 10.3 Post
4 12.3 Pre
5 11.8 Pre
How do I get rid of the rows with Category = Pre for Ids that are duplicated? My expected final dataset would be:
Id Weight Category
1 12.1 Post
2 11.3 Post
3 12.9 Pre
4 10.3 Post
5 11.8 Pre
You can arrange the data by Id and Category and then use distinct to keep the first row for each Id (df here is the data frame from the question).
library(dplyr)
df %>% arrange(Id, Category) %>% distinct(Id, .keep_all = TRUE)
# Id Weight Category
#1 1 12.1 Post
#2 2 11.3 Post
#3 3 12.9 Pre
#4 4 10.3 Post
#5 5 11.8 Pre
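This works because 'Pre' > 'Post' alphabetically: arrange puts the 'Post' row first within each Id, and distinct keeps only the first row it finds for each Id.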
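If you would rather not depend on the alphabetical order of the category labels, the same sort-then-keep-first idea can be written with an explicit sort key. This is only a sketch in base R, assuming the data frame is called dat (as in the Data section of the other answer); the names sorted and result are made up for illustration.

```r
# Example data copied from the question
dat <- read.table(header = TRUE, text = "
Id Weight Category
1 10.2 Pre
1 12.1 Post
2 11.3 Post
3 12.9 Pre
4 10.3 Post
4 12.3 Pre
5 11.8 Pre
")

# Sort by Id, putting 'Post' rows first within each Id.
# (Category != "Post") is FALSE for 'Post' rows, and FALSE sorts before TRUE.
sorted <- dat[order(dat$Id, dat$Category != "Post"), ]

# Keep only the first row of each Id
result <- sorted[!duplicated(sorted$Id), ]
rownames(result) <- NULL
result
```

Because the sort key is stated explicitly, this should keep working if the labels are renamed to something that no longer sorts the right way alphabetically.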
Using by, split dat by Id, keep only the Post row in any group that has two rows, then rbind the results.
do.call(rbind, by(dat, dat$Id, function(x)
if (nrow(x) == 2) x[x$Category == 'Post', ] else x))
# Id Weight Category
# 1 1 12.1 Post
# 2 2 11.3 Post
# 3 3 12.9 Pre
# 4 4 10.3 Post
# 5 5 11.8 Pre
Data:
dat <- read.table(header = TRUE, text = '
Id Weight Category
1 10.2 Pre
1 12.1 Post
2 11.3 Post
3 12.9 Pre
4 10.3 Post
4 12.3 Pre
5 11.8 Pre
')
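Note that the nrow(x) == 2 check assumes every duplicated Id has exactly two rows. A possible variant that avoids that assumption is to count the rows per Id with ave and drop a row only when its Id is duplicated and its Category is Pre. This is a sketch rather than a drop-in replacement; n_per_id and res are illustrative names.

```r
# Example data copied from the question
dat <- read.table(header = TRUE, text = "
Id Weight Category
1 10.2 Pre
1 12.1 Post
2 11.3 Post
3 12.9 Pre
4 10.3 Post
4 12.3 Pre
5 11.8 Pre
")

# Number of rows sharing each Id, repeated for every row
n_per_id <- ave(seq_along(dat$Id), dat$Id, FUN = length)

# Drop a row only if its Id occurs more than once AND it is a 'Pre' row
res <- dat[!(n_per_id > 1 & dat$Category == "Pre"), ]
rownames(res) <- NULL
res
```

One difference in behaviour to be aware of: if an Id had several Pre rows and no Post row, this variant would remove all of them, so check that this matches what you want for your real data.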