Make awk use character classes

Question

How do I make awk recognise character classes?

For example, this:

echo "a
1
b
2
c" | awk '/1/'

outputs 1 as expected, but this:

echo "a
1
b
2
c" | awk '/\d/'

outputs nothing where I expected both 1 and 2 to survive the filter.

I thought this might be something to do with shell escaping (zsh), but awk '/\\d/' also doesn't work.

ghoti · Accepted Answer

You could try using spelled-out character classes:

[ghoti@pc ~]$ printf "a\n1\nb\n2\nc\n" | awk '/[[:digit:]]/'
1
2
[ghoti@pc ~]$

As far as I'm aware, notation like \d isn't actually part of ERE (POSIX Extended Regular Expressions), which is the regex dialect understood by most awk variants, including The One True Awk. It is a shorthand that comes from Perl-style regex dialects.
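As a minimal sketch of the alternatives to \d, assuming a POSIX-conforming awk: both an explicit range and the POSIX class should select the digit lines. The sample input and the variable names (input, by_range, by_class) are illustrative only.

```shell
# Sample input, one item per line (illustrative data).
input=$(printf 'a\n1\nb\n2\nc\n')

# Explicit range: works even in awks that lack POSIX classes.
by_range=$(printf '%s\n' "$input" | awk '/[0-9]/')

# POSIX character class: note the double brackets, because [:digit:]
# is only meaningful inside a bracket expression.
by_class=$(printf '%s\n' "$input" | awk '/[[:digit:]]/')

printf '%s\n' "$by_range"
printf '%s\n' "$by_class"
```

If the two results differ, the awk in use probably does not implement POSIX classes.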

UPDATE:

As was pointed out in comments, some distributions of Linux may have mawk installed, masquerading as awk. mawk is NOT the same as awk. It is a minimal-featured awk clone, designed for execution speed rather than functionality. And despite claims in its man page that it supports Extended Regular Expressions, mawk (at least in older releases such as 1.3.3) fails to implement POSIX "classes" like [:digit:], [:upper:], [:lower:], etc.

If you run systems that provide non-standard tools like mawk in place of standard ones, then you should expect to live in interesting times. A developer of Awk scripts expects any binary at /usr/bin/awk to behave like awk. If it does not, the system is broken.
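If a script has to run on systems where you can't be sure which awk is installed, one option is to probe for class support and fall back to an explicit range. This is only a sketch: AWK_BIN, supports_posix_classes, digit_re and filtered are hypothetical names chosen for illustration, not part of any awk distribution.

```shell
# AWK_BIN is a hypothetical variable; it defaults to whatever "awk" resolves to.
AWK_BIN=${AWK_BIN:-awk}

supports_posix_classes() {
    # Feed a single digit; exit status is 0 only if [[:digit:]] matched it.
    printf '1\n' | "$1" '/[[:digit:]]/ { found = 1 } END { exit !found }'
}

if supports_posix_classes "$AWK_BIN"; then
    digit_re='[[:digit:]]'
else
    digit_re='[0-9]'   # fallback that avoids POSIX classes
fi

# Pass the pattern as a dynamic regex so the same script works either way.
filtered=$(printf 'a\n1\nb\n2\nc\n' | "$AWK_BIN" -v re="$digit_re" '$0 ~ re')
printf '%s\n' "$filtered"
```

Either branch should print the lines 1 and 2. The range fallback is less expressive for classes like [:alpha:] in non-ASCII locales, so it's a workaround rather than a full substitute.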

Tags: regex, awk, character-class

Asked by spraff · 1 Answer (ghoti)