I have a data frame as below:
id = c("a2887", "a2887", "a5511","a5511","a2806", "a1491", "a1491", "a4309", "a4309")
plan = c("6V", "6V", "25HS", "50HS", "25HS", "250Mbps", "250Mbps", "15Mbps", "15Mbps")
df = data.frame(id, plan)
It looks like:
id plan
a2887 6V
a2887 6V
a5511 25HS
a5511 50HS
a2806 25HS
a1491 250Mbps
a1491 250Mbps
a4309 15Mbps
a4309 15Mbps
I'd like to remove the rows that have the same id but different values in the plan column, keep only the rows with a unique id/plan match, and create a new data frame that looks like:
id plan
a2887 6V
a2806 25HS
a1491 250Mbps
a4309 15Mbps
Is there an elegant way to achieve this? Thanks!
We can use dplyr from the tidyverse. After grouping by 'id', filter to keep only the groups that have a single unique value for 'plan', then call distinct() to drop the duplicate rows:
library(dplyr)
df %>%
  group_by(id) %>%
  filter(n_distinct(plan) == 1) %>%
  distinct()
# A tibble: 4 x 2
# Groups: id [4]
# id plan
# <fctr> <fctr>
#1 a2887 6V
#2 a2806 25HS
#3 a1491 250Mbps
#4 a4309 15Mbps
data.table solution:
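library(data.table)
setDT(df)                   # convert df to a data.table by reference
df <- unique(df)            # drop duplicate id/plan rows
df[, idx := .N, by = id]    # count the distinct plans left for each id
df <- df[idx == 1]          # keep only ids with exactly one plan
df[, idx := NULL]           # remove the helper column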
id plan
1: a2887 6V
2: a2806 25HS
3: a1491 250Mbps
4: a4309 15Mbps
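For comparison, the same two steps (drop duplicate rows, then keep the ids that occur only once) can probably be sketched in base R without extra packages. This is an illustrative sketch rather than part of either answer above: the names `u`, `n_per_id` and `result` are made up for the example, and the data frame is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the example data from the question so the snippet is self-contained
id <- c("a2887", "a2887", "a5511", "a5511", "a2806", "a1491", "a1491", "a4309", "a4309")
plan <- c("6V", "6V", "25HS", "50HS", "25HS", "250Mbps", "250Mbps", "15Mbps", "15Mbps")
df <- data.frame(id, plan, stringsAsFactors = FALSE)

# Step 1: drop exact duplicate id/plan rows
u <- unique(df)

# Step 2: count how many rows (i.e. distinct plans) remain for each id
# (n_per_id is a hypothetical helper name)
n_per_id <- ave(seq_along(u$id), u$id, FUN = length)

# Step 3: keep only the ids that are left with a single plan
result <- u[n_per_id == 1, ]
rownames(result) <- NULL  # reset row names for a tidy printout
result
```

Counting after `unique()` is what makes this work: an id that repeats with the same plan collapses to one row, while an id with conflicting plans still has more than one row and is dropped.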