
Remove rows with the same id but different values in another column in R

Tags: r

I have a data frame as below:

id = c("a2887", "a2887", "a5511","a5511","a2806", "a1491", "a1491", "a4309", "a4309") 
plan = c("6V", "6V", "25HS", "50HS", "25HS", "250Mbps", "250Mbps", "15Mbps", "15Mbps") 
df = data.frame(id, plan)

It looks like:

   id    plan 
a2887      6V
a2887      6V
a5511    25HS
a5511    50HS
a2806    25HS
a1491 250Mbps
a1491 250Mbps
a4309  15Mbps
a4309  15Mbps

I'd like to remove the rows that share an id but have different values in the plan column, keep a single row for each id whose plan is consistent, and create a new data frame that looks like:

   id    plan
a2887      6V
a2806    25HS
a1491 250Mbps
a4309  15Mbps

Is there an elegant way to achieve this? Thanks!

D Jay asked Jan 02 '23 17:01


2 Answers

We can use the tidyverse. After grouping by 'id', filter to keep only the groups that have a single unique value of 'plan', then take the distinct rows:

library(dplyr)
df %>%
   group_by(id) %>%                   # one group per id
   filter(n_distinct(plan) == 1) %>%  # keep ids with a single plan value
   distinct()                         # collapse duplicate id/plan rows
# A tibble: 4 x 2
# Groups: id [4]
#  id     plan   
#  <fctr> <fctr> 
#1 a2887  6V     
#2 a2806  25HS   
#3 a1491  250Mbps
#4 a4309  15Mbps 
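
For comparison, the same group-then-filter logic can be sketched in base R without extra packages. This is only an illustrative sketch, not part of the original answer; the names `n_plans`, `keep_ids` and `result` are hypothetical and chosen here for readability.

```r
# Rebuild the example data from the question
id <- c("a2887", "a2887", "a5511", "a5511", "a2806", "a1491", "a1491", "a4309", "a4309")
plan <- c("6V", "6V", "25HS", "50HS", "25HS", "250Mbps", "250Mbps", "15Mbps", "15Mbps")
df <- data.frame(id, plan)

# Count the distinct plan values for each id
n_plans <- tapply(as.character(df$plan), as.character(df$id),
                  function(x) length(unique(x)))

# Keep the ids that map to exactly one plan
keep_ids <- names(n_plans)[n_plans == 1]

# Subset to those ids, then drop the duplicated id/plan rows
result <- unique(df[df$id %in% keep_ids, ])
rownames(result) <- NULL
result
```

`unique()` on a data frame keeps the first occurrence of each row, so the ids should come out in the order they first appear in `df`.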
akrun answered Jan 05 '23 07:01


data.table solution:

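library(data.table)
setDT(df)                 # convert df to a data.table by reference
df <- unique(df)          # drop duplicate id/plan rows
df[, idx := .N, by = id]  # count the rows left for each id
df <- df[!(idx > 1), ]    # more than one row left means conflicting plans
df[, idx := NULL]         # remove the helper column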

     id    plan
1: a2887      6V
2: a2806    25HS
3: a1491 250Mbps
4: a4309  15Mbps
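
If the helper column feels like extra bookkeeping, the same idea can probably be written as a single chained expression. The sketch below is an alternative phrasing rather than the answerer's code; `dt` and `result` are hypothetical names, and it assumes the data.table package is installed.

```r
library(data.table)

# Rebuild the example data from the question as a data.table
dt <- data.table(
  id   = c("a2887", "a2887", "a5511", "a5511", "a2806", "a1491", "a1491", "a4309", "a4309"),
  plan = c("6V", "6V", "25HS", "50HS", "25HS", "250Mbps", "250Mbps", "15Mbps", "15Mbps")
)

# unique() drops duplicate id/plan rows; within each id, return the remaining
# row only when exactly one is left (a NULL result drops the whole group)
result <- unique(dt)[, if (.N == 1L) .SD, by = id]
result
```

This version leaves `dt` unmodified, whereas the step-by-step version above overwrites `df`.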
sm925 answered Jan 05 '23 07:01