grep regex whitespace behavior

Tags: regex, grep, gnu

I have a text file containing something like:

12,34 EUR   5,67 EUR  ...

There is exactly one whitespace character before 'EUR', and I want to ignore amounts of the form 0,XX EUR.

I tried:

grep '[1-9][0-9]*,[0-9]\{2\}\sEUR' => didn't match!

grep '[1-9][0-9]*,[0-9]\{2\} EUR' => worked!

grep '[1-9][0-9]*,[0-9]\{2\}\s*EUR' => worked!

grep '[1-9][0-9]*,[0-9]\{2\}\s[E]UR' => worked!

Can somebody please explain why \s on its own doesn't match, while \s* and \s[E] do?

OS: Ubuntu 10.04, grep v2.5


asked Nov 20 '10 14:11

Milde

1 Answer

This looks like a behavior difference in the handling of \s between grep 2.5 and newer versions (possibly a bug in the old grep). I can confirm your result with grep 2.5.4, but all four of your grep commands work with grep 2.6.3 (Ubuntu 10.10).

Note:

GNU grep 2.5.4
echo "foo bar" | grep "\s"
(doesn't match)

whereas

GNU grep 2.6.3
echo "foo bar" | grep "\s"
foo bar

Probably less trouble (as \s is not documented):

Both GNU greps
echo "foo bar" | grep "[[:space:]]"
foo bar

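My advice is to avoid \s and use a POSIX character class such as [[:space:]] (any whitespace) or [[:blank:]] (space and tab) instead. Be careful with [ \t]: inside a grep bracket expression the backslash is literal, so it matches a space, a backslash, or the letter t rather than a tab.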
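As a minimal sketch of that advice, here is the pattern from the question with [[:space:]] in place of \s. The function name nonzero_eur and the sample amounts are made up for illustration, and it assumes a POSIX shell and a grep that supports POSIX character classes.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example): reads lines on stdin
# and prints those containing a non-zero amount, exactly one whitespace
# character, and then "EUR".
nonzero_eur() {
    grep '[1-9][0-9]*,[0-9]\{2\}[[:space:]]EUR'
}

# Sample input, one amount per line; the 0,99 line should be filtered out.
printf '%s\n' '12,34 EUR' '5,67 EUR' '0,99 EUR' | nonzero_eur
```

With this input the function should print the 12,34 and 5,67 lines only. Because [[:space:]] also covers tabs, an amount separated from EUR by a tab would match too; use a literal space in the pattern if that is not wanted.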


answered Sep 24 '22 22:09

Kamal
