I'm trying to use egrep with a regex pattern to match whitespace.
I've used regular expressions with Perl and C# before, and both support the pattern \s to search for whitespace. egrep (or at least the version I'm using) does not seem to support this pattern.
In a few articles online I've come across a shorthand [[:space:]], but this does not seem to work. Any help is appreciated.
Using: SunOS 5.10
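To match one specific whitespace character, just use it literally in the pattern. To allow for any whitespace character (space, tab, and so on), use the POSIX character class '[[:space:]]', provided your grep supports POSIX bracket expressions. The class works in basic patterns as well as in extended ones (the '-E' option). Because grep matches one line at a time by default, the newline that ends each line is never part of a match.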
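As a rough illustration of the class described above, the sketch below feeds three made-up sample lines to grep and counts how many contain whitespace. It assumes a grep that understands POSIX bracket expressions and the -E and -c options; the sample text is invented for the example.

```shell
# Three sample lines: one with a space, one with a tab, one with neither.
# printf expands \t to a real tab and \n to a newline.
matched=$(printf 'has space\nhas\ttab\nnowhitespace\n' | grep -E -c '[[:space:]]')

# -c prints the number of matching lines instead of the lines themselves.
echo "$matched"
```

The first two lines match, so the count is 2.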
\s stands for "whitespace character". Which characters it actually includes depends on the regex flavor. In all flavors discussed in this tutorial, it includes [ \t\r\n\f]. That is: \s matches a space, a tab, a carriage return, a line feed, or a form feed.
Grep regular expressions: in its simplest form, when no regular expression type is given, grep interprets search patterns as basic regular expressions. To interpret the pattern as an extended regular expression, use the -E (or --extended-regexp) option.
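One way to see the difference between the two modes is the | character, which is an ordinary character in a basic pattern and means alternation in an extended one. The sketch below assumes a grep that accepts -E; the sample lines are made up.

```shell
# Sample input: two plain words and one line containing a literal "|".
lines='cat
dog
cat|dog'

# Basic mode: "|" is an ordinary character, so only the third line matches.
bre=$(printf '%s\n' "$lines" | grep 'cat|dog')

# Extended mode: "|" separates alternatives, so all three lines match.
ere_count=$(printf '%s\n' "$lines" | grep -E -c 'cat|dog')

printf '%s\n' "$bre"
echo "$ere_count"
```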
Matching lines that end with a string: the $ regular expression pattern specifies the end of a line, so it can be used in grep to match lines that end with a given string or pattern.
The -f file option takes patterns from file, one per line.
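A small sketch of both features follows. The file names and contents are hypothetical, created in a scratch directory with mktemp -d, which is assumed to be available.

```shell
# Hypothetical scratch files; the names and contents are illustrative only.
tmpdir=$(mktemp -d)
printf 'error.log\nlogfile\naccess.log\n' > "$tmpdir/names.txt"

# '$' anchors the match at the end of the line; '\.' is a literal dot.
ends_with_log=$(grep -c '\.log$' "$tmpdir/names.txt")

# -f reads one pattern per line; a line matching any of them is printed.
printf '^error\nfile$\n' > "$tmpdir/patterns.txt"
from_file=$(grep -f "$tmpdir/patterns.txt" "$tmpdir/names.txt")

echo "$ends_with_log"
printf '%s\n' "$from_file"
rm -rf "$tmpdir"
```

Here two names end in ".log", and the pattern file selects error.log and logfile.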
I see the same issue on SunOS 5.10. /usr/bin/egrep there does not seem to support the full POSIX extended regular expression syntax, including [[:space:]]. Try using /usr/xpg4/bin/egrep:
$ echo 'this line has whitespace
thislinedoesnthave' | /usr/xpg4/bin/egrep '[[:space:]]'
this line has whitespace
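If a script has to run on both Solaris and other systems, one possible approach is a small wrapper that prefers the XPG4 binary when it exists. This is only a sketch: posix_egrep is a made-up function name, and the fallback assumes that the grep on the PATH accepts -E and POSIX character classes.

```shell
# posix_egrep is a hypothetical helper, not a standard command.
posix_egrep() {
    if [ -x /usr/xpg4/bin/egrep ]; then
        # Solaris: use the XPG4 version suggested above.
        /usr/xpg4/bin/egrep "$@"
    else
        # Elsewhere: assume the default grep supports -E.
        grep -E "$@"
    fi
}

result=$(printf 'this line has whitespace\nthislinedoesnthave\n' | posix_egrep '[[:space:]]')
printf '%s\n' "$result"
```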
Another option might be to just use perl:
$ echo 'this line has whitespace
thislinedoesnthave' | perl -ne 'chomp;print "$_\n" if /[[:space:]]/'
this line has whitespace
If you're using 'degraded' versions of grep (I quote the term because most UNIXes I work on still use the original REs, not those fancy ones with "\s" or "[[:space:]]" :-), you can just revert to the lowest form of RE.
For example, if :space: is defined as spaces and tabs, just use:
egrep '[ ^I]' file
That ^I is an actual tab character, not the two characters ^ and I.
This is assuming :space: is defined as tabs and spaces; otherwise adjust the choices within the [] characters.
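Typing a literal tab at an interactive prompt can be awkward, since many shells use the key for completion. One way around it, sketched here under the assumption of a POSIX shell with printf, is to store the tab in a variable first. The variable name tab and the sample lines are made up, and plain grep is used because a simple bracket list needs no extended syntax.

```shell
# Capture a real tab character; command substitution only strips
# trailing newlines, so the tab survives.
tab=$(printf '\t')

# The bracket list holds exactly two characters: a space and a tab.
count=$(printf 'one two\nthree\tfour\nfivesix\n' | grep -c "[ $tab]")
echo "$count"
```

The first line contains a space and the second a tab, so two lines match.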
The advantage of using degraded REs is that they should work on all platforms (at least for ASCII; Unicode or non-English languages may have different rules but I rarely find a need).