Extract multiple instances of a pattern from a string in R

Question

I have a character vector t as follows.

t <- c("GID456 SPK711", "GID456 GID667 VINK", "GID45345 DNP990 GID2345", 
    "GID895 GID895 K350")

I would like to extract all the strings starting with GID and followed by a sequence of digits.

This works, but it returns only one match per string (the last one) rather than every instance.

gsub(".*(GID\\d+).*", "\\1", t)
[1] "GID456"  "GID667"  "GID2345" "GID895"

How can I extract all the matching strings in this case? The desired output is as follows:

out <- c("GID456", "GID456", "GID667", "GID45345", "GID2345", 
        "GID895", "GID895")

Paul · Accepted Answer

I'm late to the party, but this tidyverse one-liner might be useful for someone.

With stringr + dplyr:

library(stringr)  # str_extract_all(), regex()
library(dplyr)    # %>%
t <- c("GID456 SPK711", "GID456 GID667 VINK", "GID45345 DNP990 GID2345", "GID895 GID895 K350")
str_extract_all(t, regex("GID\\d+")) %>% unlist()

gives:

[1] "GID456"   "GID456"   "GID667"   "GID45345" "GID2345"  "GID895"   "GID895"
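
If it helps to see what happens before the unlist() step, here is a minimal sketch, assuming stringr is installed. The variable names (ids, matches, n_matches, flat) are only illustrative and not part of the original answer. str_extract_all() returns a list with one character vector per input string, so the per-string grouping is still available until you flatten it.

```r
library(stringr)

# Same data as in the question, under an illustrative name
ids <- c("GID456 SPK711", "GID456 GID667 VINK",
         "GID45345 DNP990 GID2345", "GID895 GID895 K350")

# One list element per input string, each holding all matches in that string
matches <- str_extract_all(ids, "GID\\d+")

# Number of matches found in each input string
n_matches <- lengths(matches)

# Flatten to a single character vector
flat <- unlist(matches)
```

Keeping the list around can be handy if you later need to know which input string a given ID came from.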

Ronak Shah · Answer

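I used the str_split function from the stringr package:

library(stringr)
word.list <- str_split(t, '\\s+')        # split each string on whitespace
new_list <- unlist(word.list)            # flatten to one vector of tokens
new_list[grep("GID", new_list)]          # keep tokens containing "GID"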

I hope this helps.
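
A possible refinement, sketched in base R only: filtering on the bare substring "GID" keeps any token that merely contains those letters, so an anchored pattern is stricter if you want exactly "GID followed by digits". The names ids, tokens, gid_tokens and the extra token "GIDEON" are hypothetical, added just to show the difference.

```r
# Same data as in the question, under an illustrative name
ids <- c("GID456 SPK711", "GID456 GID667 VINK",
         "GID45345 DNP990 GID2345", "GID895 GID895 K350")

# Split on runs of whitespace and flatten to a single vector of tokens
tokens <- unlist(strsplit(ids, "\\s+"))

# Keep only tokens that consist of "GID" followed by one or more digits
gid_tokens <- tokens[grepl("^GID\\d+$", tokens)]

# Hypothetical extra token to show how the two filters differ
with_extra <- c(tokens, "GIDEON")
loose  <- with_extra[grepl("GID", with_extra)]         # keeps "GIDEON"
strict <- with_extra[grepl("^GID\\d+$", with_extra)]   # drops "GIDEON"
```

Note that the split-and-filter approach assumes the IDs are separated from other text by whitespace; an ID embedded in a longer token would be missed by the strict filter.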

Tyler Rinker · Answer

Here's an approach using qdapRegex, a package I maintain (I prefer this or stringi/stringr to base for consistency and ease of use). I also show a base approach. In any event, I'd look at this more as an "extraction" problem than a subbing problem.

y <- c("GID456 SPK711", "GID456 GID667 VINK", "GID45345 DNP990 GID2345", 
    "GID895 GID895 K350")

library(qdapRegex)
unlist(ex_default(y, pattern = "GID\\d+"))

## [1] "GID456"   "GID456"   "GID667"   "GID45345" "GID2345"  "GID895"   "GID895"

In base R:

unlist(regmatches(y, gregexpr("GID\\d+", y)))
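
To make the base R one-liner easier to follow, here is the same idea broken into steps. This is only a sketch; the intermediate names (pos, per_string, no_hit, all_ids) are illustrative. gregexpr() locates every match in each string, and regmatches() pulls out the substrings at those positions.

```r
y <- c("GID456 SPK711", "GID456 GID667 VINK", "GID45345 DNP990 GID2345",
       "GID895 GID895 K350")

# For each string: start positions (and match lengths) of all matches
pos <- gregexpr("GID\\d+", y)

# Extract the matched substrings; the result is a list, one element per string
per_string <- regmatches(y, pos)

# A string without any match yields character(0), which unlist() simply drops
no_hit <- regmatches("SPK711", gregexpr("GID\\d+", "SPK711"))

# Flatten everything into one character vector
all_ids <- unlist(per_string)
```

This is why extraction fits better than substitution here: a sub()/gsub() call with a capture group returns exactly one string per input, while gregexpr() can report any number of matches, including none.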
