Filter a vector of strings based on string matching

Tags: r

I have the following vector:

X <- c("mama.log", "papa.log", "mimo.png", "mentor.log")

How do I retrieve another vector that only contains elements starting with "m" and ending with ".log"?

Andrey Adamovich asked Aug 25 '11 08:08


4 Answers

You can use grepl with a regular expression:

X[grepl("^m.*\\.log$", X)]
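
Whether the pattern ends with `$` matters as soon as a name contains ".log" somewhere other than at the end. A minimal sketch, assuming a hypothetical extra element "mama.log.bak" that is not part of the vector in the question:

```r
# Y extends the question's vector with a hypothetical name, "mama.log.bak"
Y <- c("mama.log", "papa.log", "mimo.png", "mentor.log", "mama.log.bak")

# Without "$" the pattern only requires ".log" somewhere after the leading "m"
unanchored <- Y[grepl("^m.*\\.log", Y)]

# With "$" the string must also end in ".log"
anchored <- Y[grepl("^m.*\\.log$", Y)]

print(unanchored)
print(anchored)
```

For the vector in the question both patterns select the same elements; they only differ on names like the hypothetical one above.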
kohske answered Sep 25 '22 19:09


Try this:

grep("^m.*[.]log$", X, value = TRUE)
## [1] "mama.log"   "mentor.log"

A variation of this is to use a glob rather than a regular expression:

grep(glob2rx("m*.log"), X, value = TRUE)
## [1] "mama.log"   "mentor.log"
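
To see what the glob is translated into, one can inspect the value returned by glob2rx() (from the utils package, which is attached by default in an R session). A small sketch that compares it with the hand-written regular expression; the variable names are only illustrative:

```r
X <- c("mama.log", "papa.log", "mimo.png", "mentor.log")

# glob2rx() converts a wildcard (glob) pattern into an anchored regular expression
rx <- glob2rx("m*.log")
print(rx)

# The translated pattern selects the same elements as the hand-written regex
by_glob  <- grep(rx, X, value = TRUE)
by_regex <- grep("^m.*[.]log$", X, value = TRUE)
print(identical(by_glob, by_regex))
```

The glob form adds the `^` and `$` anchors automatically, which is easy to forget when writing the regular expression by hand.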
G. Grothendieck answered Sep 26 '22 19:09


The documentation on the stringr package says:

str_subset() is a wrapper around x[str_detect(x, pattern)], and is equivalent to grep(pattern, x, value = TRUE). str_which() is a wrapper around which(str_detect(x, pattern)), and is equivalent to grep(pattern, x).

So, in your case, a more elegant way to do this with the tidyverse (specifically stringr) instead of base R is as follows.

library(tidyverse)

c("mama.log", "papa.log", "mimo.png", "mentor.log") %>% 
   str_subset(pattern = "^m.*\\.log$")

which produces the output:

[1] "mama.log"   "mentor.log"
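
The equivalences quoted from the documentation can be checked directly. A minimal sketch, assuming the stringr package is installed; the names `pattern`, `same_values`, and `same_positions` are only illustrative:

```r
library(stringr)

X <- c("mama.log", "papa.log", "mimo.png", "mentor.log")
pattern <- "^m.*\\.log$"

# str_subset() returns the matching elements, like grep(..., value = TRUE)
same_values <- identical(str_subset(X, pattern), grep(pattern, X, value = TRUE))

# str_which() returns the matching positions, like grep()
same_positions <- identical(str_which(X, pattern), grep(pattern, X))

print(same_values)
print(same_positions)
```

Note that stringr and base R use different regular expression engines, so the two can disagree on more exotic patterns even though they agree on this one.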
Alexander answered Sep 24 '22 19:09


Using pipes...

library(tidyverse)

c("mama.log", "papa.log", "mimo.png", "mentor.log") %>%
 .[grepl("^m.*\\.log$", .)]
## [1] "mama.log"   "mentor.log"
user3357059 answered Sep 26 '22 19:09