How do I make awk recognise character classes?
For example, this:
echo "a\n1\nb\n2\nc" | awk '/1/'
outputs 1 as expected, but this:
echo "a\n1\nb\n2\nc" | awk '/\d/'
outputs nothing, where I expected both 1 and 2 to survive the filter.
I thought this might have something to do with shell escaping (zsh), but awk '/\\d/' also doesn't work.
You could try using spelled-out character classes:
[ghoti@pc ~]$ printf "a\n1\nb\n2\nc\n" | awk '/[[:digit:]]/'
1
2
[ghoti@pc ~]$
As far as I'm aware, notation like \d is a Perl-style shorthand and isn't actually part of ERE, which is the regex dialect understood by most awk variants (as well as The One True Awk).
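A plain bracket range is another option, since [0-9] is ordinary ERE syntax and does not rely on named classes. A minimal sketch, where digits_only is a hypothetical wrapper name used only for illustration:

```shell
# Hypothetical helper: pass through only the lines that contain a digit.
# [0-9] is a plain bracket expression, so it needs no POSIX class support.
digits_only() {
    awk '/[0-9]/'
}

# Same input as the question; printf is used because echo's handling
# of \n differs between shells.
printf 'a\n1\nb\n2\nc\n' | digits_only
```

With the input above this should print 1 and 2, each on its own line.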
UPDATE:
As was pointed out in comments, some Linux distributions may have mawk installed, masquerading as awk. mawk is NOT the same as awk. It is a minimal-featured awk clone, designed for execution speed rather than functionality. And despite claims in its man page that it supports Extended Regular Expressions, mawk (at least in the older releases that many distributions shipped) fails to implement POSIX "classes" like [:digit:], [:upper:], [:lower:], etc.
If you run systems that provide non-standard tools like mawk in place of standard ones, then you should expect to live in interesting times. A developer of Awk scripts expects any binary at /usr/bin/awk to behave like awk. If it does not, the system is broken.
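One way to find out which behaviour the local awk has is to probe it directly rather than guess from the binary's name. A minimal sketch, where awk_has_posix_classes is a hypothetical helper name and not part of any awk distribution:

```shell
# Hypothetical helper: exits 0 only if the awk on PATH treats
# [[:digit:]] as "any digit". It feeds awk a single line containing "1"
# and reports through the exit status whether the pattern matched.
awk_has_posix_classes() {
    printf '1\n' | awk '/[[:digit:]]/ { found = 1 } END { exit !found }'
}

if awk_has_posix_classes; then
    echo "awk supports POSIX character classes"
else
    echo "awk lacks POSIX character classes; fall back to [0-9]"
fi
```

A script that must run on unknown systems could use a check like this to choose between [[:digit:]] and a plain [0-9] range.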