
egrep search for whitespace


I'm trying to use egrep with a regex pattern to match whitespace.

I've used RegEx with Perl and C# before and they both support the pattern \s to search for whitespace. egrep (or at least the version I'm using) does not seem to support this pattern.

In a few articles online I've come across a shorthand [[:space:]], but this does not seem to work. Any help is appreciated.

Using: SunOS 5.10

user32474 asked Jan 15 '09 23:01


People also ask

How do you grep a word with spaces?

To match one specific whitespace character, use it literally in the pattern. To allow ANY whitespace character (tab, space, newline, etc.), a POSIX-conforming “grep” accepts '[[:space:]]' in both basic and extended (the '-E' option) regular expressions.

What is white space in regex?

\s stands for “whitespace character”. Again, which characters this actually includes, depends on the regex flavor. In all flavors discussed in this tutorial, it includes [ \t\r\n\f]. That is: \s matches a space, a tab, a carriage return, a line feed, or a form feed.

How do you grep in regex?

In its simplest form, when no regular expression type is given, grep interprets search patterns as basic regular expressions. To interpret the pattern as an extended regular expression, use the -E (or --extended-regexp) option.

How do you grep at the end of a line?

The $ regular expression anchor matches the end of a line, so grep can use it to select the lines that end with a given string or pattern.


2 Answers

I see the same issue on SunOS 5.10. /usr/bin/egrep there does not appear to support POSIX character classes such as [[:space:]].

Try using /usr/xpg4/bin/egrep:

$ echo 'this line has whitespace
thislinedoesnthave' | /usr/xpg4/bin/egrep '[[:space:]]'
this line has whitespace

Another option might be to just use perl:

$ echo 'this line has whitespace
thislinedoesnthave' | perl -ne 'chomp;print "$_\n" if /[[:space:]]/'
this line has whitespace
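If the same script has to run on Solaris and on other systems, one option is to pick the grep binary at run time. This is only a sketch: pick_egrep is a hypothetical helper name, and it assumes that any system without /usr/xpg4/bin/egrep has a grep -E that understands POSIX character classes.

```shell
#!/bin/sh
# pick_egrep is a hypothetical helper, not part of any standard tool.
# It prints the command to use for extended regexes with POSIX classes.
pick_egrep() {
    if [ -x /usr/xpg4/bin/egrep ]; then
        # Solaris: the XPG4 egrep understands [[:space:]]
        echo /usr/xpg4/bin/egrep
    else
        # Assumption: elsewhere, grep -E handles POSIX character classes
        echo "grep -E"
    fi
}

EGREP=$(pick_egrep)

# Left unquoted on purpose so "grep -E" splits into command and option
printf '%s\n' 'this line has whitespace' 'thislinedoesnthave' | $EGREP '[[:space:]]'
```

The unquoted $EGREP relies on ordinary word splitting in sh-style shells; in a shell that does not split unquoted variables, the fallback value would need to be handled differently.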
Jon Ericson answered Oct 15 '22 05:10


If you're using 'degraded' versions of grep (I quote the term because most UNIXes I work on still use the original REs, not those fancy ones with "\s" or "[[:space:]]" :-), you can just fall back to the lowest form of RE.

For example, if :space: is defined as spaces and tabs, just use:

egrep '[ ^I]' file 

That ^I is an actual tab character, not the two characters ^ and I.

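This assumes spaces and tabs are the only whitespace you care about. The POSIX [:space:] class also covers newline, carriage return, form feed and vertical tab, so add any of those you need inside the [] characters.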
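Typing a literal tab inside quotes can be awkward at an interactive prompt, and the tab is easily lost when a script is copied around. One way around that, sketched here, is to let printf produce the tab and splice it into the bracket expression. The variable name tab and the sample input are illustrative only, and plain grep is used because a bracket expression does not need extended syntax.

```shell
#!/bin/sh
# Generate a real tab character; command substitution keeps it
# because only trailing newlines are stripped.
tab=$(printf '\t')

# Bracket expression containing a space and a tab,
# with no reliance on \s or [[:space:]].
printf '%s\n' 'has space' "has${tab}tab" 'nowhitespace' | grep "[ ${tab}]"
```

This should print the first two sample lines and skip the third. The same "[ ${tab}]" pattern can be handed to egrep unchanged.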

The advantage of using degraded REs is that they should work on all platforms (at least for ASCII; Unicode or non-English languages may have different rules but I rarely find a need).

paxdiablo answered Oct 15 '22 07:10