
How to delete duplicates within a string for each row

Tags:

r

Here is my sample data:

V1
"a b c c c d"
"a a b b c d"
"a b c d e f"

I want this output:

V1
"a b c d"
"a b c d"
"a b c d e f"

I tried:

paste(unique(unlist(strsplit(x, split=" "))))

but this removes duplicates across the entire data frame, whereas I need them removed row by row.

asked Jan 18 '26 12:01 by auto


1 Answer

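strsplit returns a list with one character vector per row, and unlist flattens that list into a single vector, which is why the rows get merged. Use sapply to process each list element separately instead of calling unlist: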

df$V2 <- sapply(strsplit(df$V1, " "), function(x) paste0(unique(x), collapse = " "))

df
#           V1          V2
#1 a b c c c d     a b c d
#2 a a b b c d     a b c d
#3 a b c d e f a b c d e f
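
To make the difference between the two approaches concrete, here is a minimal sketch using the same three strings as the sample data. The variable names (x, parts, flat, per_row) are illustrative only and do not come from the question.

```r
# Same three strings as the sample data
x <- c("a b c c c d", "a a b b c d", "a b c d e f")

# strsplit() returns a list with one character vector per input string
parts <- strsplit(x, " ")

# unlist() merges all rows into one vector, so unique() then
# de-duplicates across rows rather than within each row
flat <- unique(unlist(parts))

# Calling unique() on each list element keeps the rows separate
per_row <- lapply(parts, unique)
```

Here flat is a single vector covering every row, while per_row still has one element per row, which is what the sapply call above pastes back together.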

data

df <- structure(list(V1 = c("a b c c c d", "a a b b c d", "a b c d e f"
)), row.names = c(NA, -3L), class = "data.frame")
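
If this is needed in more than one place, the same idea could be wrapped in a small function. The sketch below is one possible variant, not part of the answer above: dedupe_words is a hypothetical name, and it swaps sapply for vapply so that the result is always a character vector with one element per input string.

```r
# dedupe_words is a hypothetical helper name, not a base R function
dedupe_words <- function(x, sep = " ") {
  vapply(
    strsplit(x, sep, fixed = TRUE),                        # one vector per string
    function(words) paste(unique(words), collapse = sep),  # drop repeats, rejoin
    character(1)                                           # each result is one string
  )
}

# Rebuild the sample data and add the de-duplicated column
df <- data.frame(V1 = c("a b c c c d", "a a b b c d", "a b c d e f"),
                 stringsAsFactors = FALSE)
df$V2 <- dedupe_words(df$V1)
```

unique keeps the first occurrence of each word, so the original word order within each row is preserved.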
answered Jan 21 '26 02:01 by Ronak Shah


