
grep regex whitespace behavior

Tags: regex, grep, gnu

I have a text file containing something like:

12,34 EUR   5,67 EUR  ... 

There is exactly one whitespace character before each 'EUR', and I want to ignore amounts of the form 0,XX EUR.

I tried:

grep '[1-9][0-9]*,[0-9]\{2\}\sEUR' => didn't match!

grep '[1-9][0-9]*,[0-9]\{2\} EUR' => worked!

grep '[1-9][0-9]*,[0-9]\{2\}\s*EUR' => worked!

grep '[1-9][0-9]*,[0-9]\{2\}\s[E]UR' => worked!

Can somebody please explain why the pattern with a plain \s doesn't match, while \s* and \s[E] do?

OS: Ubuntu 10.04, grep v2.5

Milde asked Nov 20 '10 14:11



1 Answer

This looks like a behavior difference in the handling of \s between grep 2.5 and newer versions (a bug in old grep?). I confirm your result with grep 2.5.4, but all four of your greps do work when using grep 2.6.3 (Ubuntu 10.10).

Note:

GNU grep 2.5.4:

    echo "foo bar" | grep "\s"
    (doesn't match)

whereas

GNU grep 2.6.3:

    echo "foo bar" | grep "\s"
    foo bar

Probably less trouble (as \s is not documented):

Both GNU greps:

    echo "foo bar" | grep "[[:space:]]"
    foo bar
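
To check which behavior an installed grep has, a quick probe along these lines may help. It is only a sketch: the variable names ws_support and posix_count are made up for illustration, and the result of the \s test depends on the grep version in use.

```shell
# Probe whether this grep treats \s as "whitespace" (hypothetical variable names).
if printf 'foo bar\n' | grep -q '\s'; then
  ws_support=yes
else
  ws_support=no
fi
echo "\\s matches whitespace here: $ws_support"

# The POSIX class is the documented way to match whitespace;
# -c prints the number of matching lines.
posix_count=$(printf 'foo bar\n' | grep -c '[[:space:]]')
echo "lines matched by [[:space:]]: $posix_count"
```

If ws_support comes back as no while the POSIX class still matches, the grep is probably one of the older builds described above.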

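My advice is to avoid \s and use [[:space:]] instead (or [[:blank:]] if only spaces and tabs should match). Be careful with [ \t]: inside a grep bracket expression the backslash is literal, so it matches a space, a backslash, or the letter t rather than a tab.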
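
Applied to the pattern from the question, the portable form might look like the sketch below. The sample line is invented for illustration (it adds a 0,99 EUR entry to show that such amounts are skipped), and -o is used only to print each match on its own line.

```shell
# Sample line modelled on the question's data; the values are illustrative.
sample='12,34 EUR   0,99 EUR  5,67 EUR'

# Same basic regular expression as in the question, with \s replaced
# by the POSIX class [[:space:]]. -o prints only the matched parts.
matches=$(printf '%s\n' "$sample" | grep -o '[1-9][0-9]*,[0-9]\{2\}[[:space:]]EUR')
printf '%s\n' "$matches"
# prints:
# 12,34 EUR
# 5,67 EUR
```

Since [[:space:]] is part of POSIX, this form does not rely on the undocumented \s and so does not depend on the grep version.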

Kamal answered Sep 24 '22 22:09