
Remove first occurrence of elements in a vector from another vector

Tags: r, duplicates

I have a character vector, including some elements that are duplicates e.g.

v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")

And another vector that includes single counts of those characters e.g.

x <- c("d10", "d11", "d13")

I want to remove only the first occurrence of each element of x from the first vector, v. In this example, d13 occurs once in x and twice in v, so only its first match is removed from v and the duplicate is kept. Thus, I want to end up with:

"d09", "d01", "d02", "d13"

I've been trying various things e.g. z <- v[!(v %in% x)] but it keeps removing all instances of the characters in x, not just the first, so I end up with this instead:

"d09", "d01", "d02"

What can I do to only remove one instance of a duplicated element?

rw2 asked May 08 '15 17:05


1 Answer

You can use match and negative indexing.

v[-match(x, v)]

produces

[1] "d09" "d01" "d02" "d13"

match only returns the location of the first match of a value, which we use to our advantage here.
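One caveat: the one-liner assumes every element of x occurs in v and that x is not empty. If an element of x is missing from v, match returns NA for it and the negative index no longer behaves as intended; if x is empty, v[-integer(0)] returns an empty vector rather than v. Below is a minimal sketch of a guarded version. The name remove_first is a hypothetical helper, not part of base R, and it is not the answer's original code.

```r
# remove_first is a hypothetical helper name, not a base R function.
# It drops the first occurrence in v of each distinct value found in x.
remove_first <- function(v, x) {
  idx <- match(x, v)        # position of the first match, NA if absent
  idx <- idx[!is.na(idx)]   # ignore values of x that do not occur in v
  if (length(idx) == 0L) {
    return(v)               # nothing to remove; avoids v[-integer(0)]
  }
  v[-idx]
}

v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")
x <- c("d10", "d11", "d13")
remove_first(v, x)          # "d09" "d01" "d02" "d13"
```

If x itself contains repeated values, match returns the same position for each repeat, so this sketch still removes only one occurrence per distinct value.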

Note that %in% and is.element are degenerate versions of match. Compare:

match(x, v)            # [1] 6 2 3
match(x, v) > 0        # [1] TRUE TRUE TRUE
x %in% v               # [1] TRUE TRUE TRUE
is.element(x, v)       # [1] TRUE TRUE TRUE

The last three give the same result here and are essentially the first one coerced to logical (see the source of %in% and is.element). That coercion discards the key information, which is the position in v of the first match for each value of x, and leaves you knowing only that the x values occur somewhere in v.

The converse, v %in% x, asks a different question: "which values in v are in x?" That does not meet your requirement, because every duplicate in v satisfies the condition and is removed as well.
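To make the difference concrete, here is a small sketch using the vectors from the question. The variable names all_hits and first_hits are chosen only for illustration.

```r
v <- c("d09", "d11", "d13", "d01", "d02", "d10", "d13")
x <- c("d10", "d11", "d13")

# Every position in v whose value appears in x, including both "d13" entries
all_hits <- which(v %in% x)      # 2 3 6 7

# Only the first position in v for each value of x
first_hits <- match(x, v)        # 6 2 3

v[-all_hits]                     # "d09" "d01" "d02"
v[-first_hits]                   # "d09" "d01" "d02" "d13"
```

Position 7, the second "d13", appears in all_hits but not in first_hits, which is why only the match-based index keeps the duplicate.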

BrodieG answered Oct 11 '22 01:10