 

How can I use a regex to search inside a sentence, case-insensitively?

I'm a newbie to regular expressions in Python.
I have a list of sentences, and I want to find the ones that contain an employee name.

The employee name can be:

  • at the beginning, followed by a space
  • followed by ®
  • or followed by a space
  • or at the end, with a space before it
  • matched without regard to case

ListSentence = ["Steve®", "steveHotel", "Rob spring", "Car Daniel", "CarDaniel","Done daniel"]
ListEmployee = ["Steve", "Rob", "daniel"]

The expected output from ListSentence is:

["Steve®", "Rob spring", "Car Daniel", "Done daniel"]
mongotop asked Jun 17 '13 04:06


1 Answer

First, take all your employee names, join them with a | character, and wrap the resulting string so that the regex looks like this:

(?:^|\s)((?:Steve|Rob|Daniel)(?:®)?)(?=\s|$)

By joining all the names into a single pattern first, you avoid the performance overhead of nested for loops that test every name against every sentence.

I don't know Python well enough to offer a Python example; however, in PowerShell I'd write it like this:

[array]$names = @("Steve", "Rob", "daniel")
[array]$ListSentence = @("Steve®", "steveHotel", "Rob spring", "Car Daniel", "CarDaniel","Done daniel")

# build the regex, and insert the names as a "|" delimited string
$Regex = "(?:^|\s)((?:" + $($names -join "|") + ")(?:®)?)(?=\s|$)" 

# use case insensitive match to find any matching array values
$ListSentence -imatch $Regex

This yields:

Steve®
Rob spring
Car Daniel
Done daniel
Ro Yo Mi answered Nov 02 '22 23:11