I've been trying various ways to do some basic things with sed on OS X. Here are the results of some simple tests.
echo "foo bar 2011-03-17 17:31:47 foo bar" | sed 's/foo/FOUND/g'
returns (as expected)
FOUND bar 2011-03-17 17:31:47 FOUND bar
but
echo "foo bar 2011-03-17 17:31:47 foo bar" | sed -E 's/\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/FOUND/g'
returns
foo bar 2011-03-17 17:31:47 foo bar
and (even more irritatingly)
echo "food bar 2011-03-17 17:31:47 food bar" | sed -E 's/\d/FOUND/g'
returns
fooFOUND bar 2011-03-17 17:31:47 fooFOUND bar
Now, the man sed pages say that
The following options are available: -E Interpret regular expressions as extended (modern) regular expressions rather than basic regular expressions (BRE's). The re_format(7) manual page fully describes both formats.
and man re_format says
\d Matches a digit character. This is equivalent to `[[:digit:]]'.
And indeed:
echo "foo bar 2011-03-17 17:31:47 foo bar" | sed -E 's/[[:digit:]]{4}/FOUND/g'
gives me
foo bar FOUND-03-17 17:31:47 foo bar
...but this is annoying. Either because I'm being dense, or because the man pages are lying to me (to be honest, I'd prefer the former).
A quick literature review here on SO suggests that I am not alone in this, and that many recommend installing GNU coreutils (or indeed using something else - say perl -pe) -- however, I'd like to be certain:
Do EREs work with sed as it is bundled with OS X -- as implied by the man pages -- or not?
(I'm on 10.8 and 10.6.8)
A regular expression is a string that can be used to describe several sequences of characters. Regular expressions are used by several different Unix commands, including ed, sed, awk, grep, and to a more limited extent, vi.
As Avinash Raj has pointed out, sed uses basic regular expression (BRE) syntax by default, which requires (, ), {, and } to be preceded by \ to activate their special meaning. The -r option (GNU sed; BSD/macOS sed spells it -E, which recent GNU sed also accepts) switches to extended regular expression (ERE) syntax, which treats (, ), {, and } as special without a preceding \.
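A minimal sketch of the BRE/ERE difference described above. It uses -E rather than -r because that is the spelling macOS sed accepts; the variable names bre and ere are just illustrative.

```shell
# BRE (default): the interval braces must be backslash-escaped to be special.
bre=$(echo "2011-03-17" | sed 's/[0-9]\{4\}/YEAR/')

# ERE (-E): the same braces are special without backslashes.
ere=$(echo "2011-03-17" | sed -E 's/[0-9]{4}/YEAR/')

echo "$bre"   # YEAR-03-17
echo "$ere"   # YEAR-03-17
```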
sed does not support a "non-greedy" operator. You have to use a bracket expression ("[]", e.g. "[^/]") to exclude "/" from the match. P.S. There is no need to backslash "/".
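As a rough illustration of that workaround, the sketch below contrasts a greedy .* with a negated bracket expression. The sample path and the variable names are made up for the example, and | is used as the s delimiter so that "/" needs no escaping anywhere.

```shell
# Negated bracket expression: stops at the FIRST slash, emulating a non-greedy match.
first=$(echo "usr/local/bin" | sed 's|[^/]*/||')

# Greedy .* runs to the LAST slash instead.
greedy=$(echo "usr/local/bin" | sed 's|.*/||')

echo "$first"    # local/bin
echo "$greedy"   # bin
```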
On most versions of sed (but not all), the 'r' (read) and 'w' (write) commands must be followed by exactly one space, then the filename, and then terminated by a newline. Any additional characters before or after the filename are interpreted as part of the filename.
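A small sketch of the w command under that constraint, assuming a throwaway directory created with mktemp; the file names in.txt and out.txt are hypothetical. The filename is the last thing in the script, with exactly one space before it and nothing after it.

```shell
tmpdir=$(mktemp -d)
printf 'foo 1\nbar 2\nfoo 3\n' > "$tmpdir/in.txt"

# Write every line matching /foo/ to out.txt; the script ends right after the filename.
sed -n "/foo/w $tmpdir/out.txt" "$tmpdir/in.txt"

written=$(cat "$tmpdir/out.txt")
echo "$written"
rm -rf "$tmpdir"
```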
On macOS, \d is part of a regex feature set called enhanced features - note the distinction in name: enhanced, which is NOT the same as extended.
Instead, enhanced features are a separate dimension from basic vs. extended, which can be activated for both basic and extended regexes. In other words: you can have enhanced basic regexes as well as enhanced extended regexes.
However, it appears that whether enhanced features are available in a given utility is precompiled into it; in other words: a given utility either supports enhanced features or it doesn't - no option can change that. (Options only allow you to choose between basic and extended, such as -E for sed and grep.)
For a description of all enhanced features, see section ENHANCED FEATURES in man re_format.
It should also be noted that if POSIX compatibility is important, enhanced features should be avoided with sed.
There are POSIX utilities, such as awk, that do support EREs (extended regular expressions), but (a) the POSIX spec has to state so explicitly, and (b) the syntax is limited to POSIX EREs, which are less powerful than the EREs offered by specific platforms.
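For illustration, a minimal awk sketch that sticks to POSIX ERE syntax; it assumes [0-9]+ is an acceptable stand-in for \d+, and the variable name masked is just for the example.

```shell
# awk patterns are EREs, so + works unescaped; \d is not part of POSIX ERE syntax.
masked=$(echo "foo bar 2011-03-17 17:31:47 foo bar" |
  awk '{ gsub(/[0-9]+/, "N"); print }')

echo "$masked"   # foo bar N-N-N N:N:N foo bar
```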
In practice:
Sadly, the man pages for the various utilities do NOT state whether a given utility supports enhanced regex features, so it comes down to trial and error.
As of macOS 10.15:
macOS sed does NOT support enhanced features, which explains the OP's experience.
sed -E 's/\d//g' <<<'a10' has no effect, because \d isn't recognized as representing a digit (only [[:digit:]] is).
I have found only one utility that supports enhanced features: grep:
grep -o '\d\+' <<<'a10' # -> '10' - enhanced basic regex
grep -E -o '\d+' <<<'a10' # -> '10' - enhanced extended regex
If you know of others that do, please let us know.
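Putting this together for the timestamp from the question, one possible portable rewrite replaces each \d with [0-9] (or [[:digit:]]) and keeps -E for the unescaped intervals. This is only a sketch; the variable names ts_re and fixed are illustrative.

```shell
# ERE for "YYYY-MM-DD HH:MM:SS" using only POSIX syntax (no \d).
ts_re='[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'

fixed=$(echo "foo bar 2011-03-17 17:31:47 foo bar" | sed -E "s/$ts_re/FOUND/g")

echo "$fixed"   # foo bar FOUND foo bar
```

Because it avoids enhanced features entirely, the same command should behave identically with the sed bundled with macOS and with GNU sed.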