
Subtracting Two Strings in R

Tags: string, r

I have this data in R:

string_1 = c("newyork 123", "california 123", "washington 123")
string_2 = c("123 red", "123 blue", "123 green")
my_data = data.frame(string_1, string_2)

I want to "subtract" string_2 from string_1. The result would look something like this:

"newyork", "california", "washington"

I tried to do this:

library(tidyverse)

# did not work as planned
> str_remove(string_1, "string_2")

[1] "newyork 123"    "california 123" "washington 123"

But this is not performing a "full" subtraction.

  • Does anyone know how to do this?
  • Should I try to do this with an ANTI JOIN in SQL?

Thank you!

stats_noob asked Aug 30 '25 16:08


2 Answers

You could split both strings and find the set difference of them.

mapply(setdiff, strsplit(string_1, "\\s+"), strsplit(string_2, "\\s+"))

# [1] "newyork"    "california" "washington"
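For context, the attempt in the question returned its input unchanged because the quoted "string_2" is a literal pattern (the text string_2), not the variable. Passing the variable unquoted would not help either, since "123 red" never occurs inside "newyork 123". The split-and-setdiff idea can be wrapped in a small helper, sketched below in base R. The name subtract_tokens is hypothetical, and the sketch assumes tokens are separated by whitespace and that the leftover tokens should be rejoined with a single space, so each input always yields exactly one string.

```r
# Data from the question
string_1 <- c("newyork 123", "california 123", "washington 123")
string_2 <- c("123 red", "123 blue", "123 green")

# Hypothetical helper: for each pair of strings, drop the whitespace-separated
# tokens of x that also appear in the matching element of y.
subtract_tokens <- function(x, y) {
  tokens_x <- strsplit(x, "\\s+")
  tokens_y <- strsplit(y, "\\s+")
  mapply(
    # setdiff() keeps the tokens of a that are absent from b;
    # paste() rejoins whatever is left into a single string
    function(a, b) paste(setdiff(a, b), collapse = " "),
    tokens_x, tokens_y,
    USE.NAMES = FALSE
  )
}

result <- subtract_tokens(string_1, string_2)
print(result)
# [1] "newyork"    "california" "washington"
```

Note that setdiff() treats its inputs as sets, so a token repeated within one string is returned only once.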
Darren Tsai answered Sep 02 '25 10:09


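library(purrr)
library(stringr)  # for str_split()

# Split each string into its space-separated tokens
list1 <- str_split(string_1, pattern = " ")
list2 <- str_split(string_2, pattern = " ")

# For each pair, keep the tokens of string_1 that are absent from string_2
a <- unlist(map2(list1, list2, setdiff))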
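Since the question stores both vectors in my_data, the result may be more convenient as a new column. The following is a minimal sketch in base R, to avoid extra dependencies; the column name result is an assumption, and stringsAsFactors = FALSE is set explicitly so the columns stay character vectors on older R versions too.

```r
# Data frame from the question
string_1 <- c("newyork 123", "california 123", "washington 123")
string_2 <- c("123 red", "123 blue", "123 green")
my_data <- data.frame(string_1, string_2, stringsAsFactors = FALSE)

# `result` is a hypothetical column name: row by row, keep the tokens of
# string_1 that do not appear in string_2 and rejoin them with a space.
my_data$result <- mapply(
  function(a, b) paste(setdiff(a, b), collapse = " "),
  strsplit(my_data$string_1, "\\s+"),
  strsplit(my_data$string_2, "\\s+"),
  USE.NAMES = FALSE
)

print(my_data$result)
# [1] "newyork"    "california" "washington"
```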
Chemist learns to code answered Sep 02 '25 08:09