
What is the most efficient case-insensitive grep usage?

My objective is to match email addresses that belong to the Yahoo! family of domains. On *nix systems (I will be using Ubuntu), what are the benefits and drawbacks of each of these methods for matching the pattern?

And if there is another, more elegant solution that I haven't thought of, please share.

Here they are:

  • Use grep with option -i:

grep -Ei "@(yahoo|(y|rocket)mail|geocities)\.com" /path/to/file.txt

  • Translate characters to all upper case or lower case then grep:

tr '[:upper:]' '[:lower:]' < /path/to/file.txt | grep -E "@(yahoo|(y|rocket)mail|geocities)\.com"

  • Include a character class for each character in the pattern (the version below checks only some characters, so it would not match something like "@rOcketmail.com", but it shows what the pattern would become if every character were checked for case):

grep -E "@([yY]ahoo|([yY]|[rR]ocket)[mM]ail|[gG]eo[cC]ities)\.[cC][oO][mM]" /path/to/file.txt
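
To make the behavioral differences between the three methods concrete, here is a small sketch. The sample addresses and the helper name sample_addresses are made up for illustration; only the patterns come from the commands above.

```shell
# Hypothetical sample addresses, one per line.
sample_addresses() {
  printf '%s\n' 'Alice@Yahoo.com' 'bob@rOcketmail.com' 'carol@example.org'
}

pattern='@(yahoo|(y|rocket)mail|geocities)\.com'
pattern3='@([yY]ahoo|([yY]|[rR]ocket)[mM]ail|[gG]eo[cC]ities)\.[cC][oO][mM]'

# Method 1: matches regardless of case and prints the lines unchanged.
sample_addresses | grep -Ei "$pattern"

# Method 2: matches regardless of case, but prints the lowercased copy.
sample_addresses | tr '[:upper:]' '[:lower:]' | grep -E "$pattern"

# Method 3: only the bracketed positions ignore case,
# so "bob@rOcketmail.com" is not matched here.
sample_addresses | grep -E "$pattern3"
```

Methods 1 and 2 select the same lines, but method 2 loses the original capitalization in its output, which matters if the addresses are reused verbatim afterwards.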

sblack89 asked Apr 07 '14 22:04


People also ask

How do you grep a case insensitive?

By default, grep is case sensitive. This means that uppercase and lowercase characters are treated as distinct. To ignore case when searching, invoke grep with the -i option (or --ignore-case).

Does grep match case?

grep is case-sensitive by default, so it matches a pattern only when the uppercase and lowercase letters in the text are exactly the same as in the pattern.


1 Answer

In my tests, grep -i turned out to be significantly slower than translating to lowercase before grepping, so I ended up using a variation of method 2 (tr, then grep).

Thanks @mike-w for reminding me that a simple test goes a long way.
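
A minimal sketch of such a test, assuming bash (for the time keyword) and a generated sample file rather than real data. The function names, the sample contents, and the file size are all made up for illustration, and the timings will vary with the grep version and locale. The last run uses LC_ALL=C only as a guess that locale handling may affect the speed of -i; that is an assumption to check, not a measured result.

```shell
# Hypothetical sample file: 300,000 lines, two of every three matching.
sample=$(mktemp)
trap 'rm -f "$sample"' EXIT
awk 'BEGIN {
  for (i = 1; i <= 100000; i++)
    printf "user%d@Yahoo.com\nuser%d@example.org\nuser%d@RocketMail.COM\n", i, i, i
}' > "$sample"

pattern='@(yahoo|(y|rocket)mail|geocities)\.com'

# Method 1: let grep ignore case itself; -c prints the number of matching lines.
count_with_i() { grep -Eic "$pattern" "$1"; }

# Method 2: lowercase the input first, then run a case-sensitive grep.
count_with_tr() { tr '[:upper:]' '[:lower:]' < "$1" | grep -Ec "$pattern"; }

# Method 1 again, forcing the C locale (assumption: locale may matter).
count_with_i_c_locale() { LC_ALL=C grep -Eic "$pattern" "$1"; }

time count_with_i "$sample"
time count_with_tr "$sample"
time count_with_i_c_locale "$sample"
```

Each run should print the same count, so any difference in the reported times comes from how the case folding is done rather than from what is matched.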

sblack89 answered Oct 03 '22 00:10