Pattern matching using a wildcard

How do I identify a string using a wildcard?

I've found glob2rx, but I don't quite understand how to use it. I tried using the following code to pick the rows of the data frame that begin with the word blue:

# make data frame
a <- data.frame( x =  c('red','blue1','blue2', 'red2'))

# 1
result <- subset(a, x == glob2rx("blue*") )

# 2
test = ls(pattern = glob2rx("blue*"))
result2 <- subset(a, x == test )

# 3
result3 <- subset(a, x == pattern("blue*") )

However, none of these worked. I'm not sure whether I should be using a different function for this.

djq asked Apr 28 '11 18:04



1 Answer

If you want to examine elements inside a data frame, you should not be using ls(), which only looks at the names of objects in the current workspace (or, if used inside a function, in that function's environment). Row names or elements inside such objects are not visible to ls() (unless, of course, you add an environment argument to the ls() call). Try using grep(), which is the workhorse function for pattern matching of character vectors:

result <- a[ grep("blue", a$x) , ]  # Note need to use `a$` to get at the `x` 

If you want to use subset(), consider the closely related function grepl(), which returns a logical vector that can be used in the subset argument:

subset(a, grepl("blue", a$x))
      x
2 blue1
3 blue2
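Note that the plain pattern "blue" matches anywhere in a string, not only at the start. A minimal sketch of the difference, assuming the goal is to keep only values that begin with "blue" (the extra 'skyblue' value is a hypothetical addition for illustration and is not in the original data):

```r
# Hypothetical data: the question's vector plus 'skyblue' to expose the difference
a <- data.frame(x = c('red', 'blue1', 'blue2', 'red2', 'skyblue'))

# Unanchored pattern: matches "blue" anywhere in the string
anywhere <- subset(a, grepl("blue", x))

# Pattern anchored with ^: matches only values that begin with "blue"
at_start <- subset(a, grepl("^blue", x))
```

Here `anywhere` keeps the 'skyblue' row as well, while `at_start` keeps only 'blue1' and 'blue2'.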

Edit: Adding one "proper" use of glob2rx within subset():

result <- subset(a,  grepl(glob2rx("blue*") , x) )
result
      x
2 blue1
3 blue2
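It may help to see what glob2rx() actually returns. The sketch below, using the question's data, suggests why comparing with == finds nothing: glob2rx() only translates the wildcard into a regular expression string, and that string still has to be passed to a pattern-matching function such as grepl(). The variable names rx, exact, and matched are illustrative.

```r
a <- data.frame(x = c('red', 'blue1', 'blue2', 'red2'))

# glob2rx() converts the wildcard pattern into a regular expression string
rx <- glob2rx("blue*")
print(rx)  # "^blue"

# == tests for equality with that literal string, so no row is selected
exact <- subset(a, x == rx)

# grepl() treats the string as a pattern and finds the rows starting with "blue"
matched <- subset(a, grepl(rx, x))
```

In this sketch `exact` has zero rows, while `matched` contains the 'blue1' and 'blue2' rows.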

I don't think I actually understood glob2rx until I came back to this question. (I did understand the scoping issues that were at the root of the questioner's difficulties. Anybody reading this should now scroll down to Gavin's answer and upvote it.)

IRTFM answered Oct 02 '22 11:10