What is an "illegal primary" in awk?

Tags: regex, awk

Awk gives me the following error:

awk: illegal primary in regular expression (?<=\>)(.*?)(?=\<) at <=\>)(.*?)(?=\<)
source line number 10 source file transpile.awk
context is
    match($0, >>>  /(?<=\>)(.*?)(?=\<)/) <<< 

But what is an "illegal primary"?

asked May 01 '18 20:05 by 303


1 Answer

A "primary", in awk parlance, is the basic unit of a regex.

A regex consists of an alternation of one or more branches, separated by |. Each branch is a concatenation of zero or more primaries.

A primary is either a normal character (e.g. a), or an escaped special character (e.g. \*), or a character class ([...]), or a dot (.), or an anchor (^ or $), or a parenthesized subexpression ((...)). Most of these can have a quantifier (?, +, *), too.

The problem with your regex is that awk reads (?<=\>) as an opening ( first, which starts a subgroup. The next item then needs to be a primary. ? is a quantifier, not a primary, and there is nothing before it to quantify, hence the error.

Awk does not support look-ahead or look-behind, and it has no non-greedy quantifiers such as *? either.
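A common workaround is to match the delimiters as well and then trim them off with substr(). The sketch below assumes the goal was to grab the text between a > and the next <; the function name extract_inner is made up for illustration. A negated character class, [^<]*, stands in for the non-greedy .*?. The angle brackets are left unescaped on purpose: they are ordinary characters in awk regexes, and in gawk \< and \> mean word boundaries instead.

```sh
# Hypothetical helper: print the text between the first ">" and the next "<".
extract_inner() {
    printf '%s\n' "$1" | awk '
        # Match ">" then any run of non-"<" characters, then "<".
        # [^<]* replaces the non-greedy .*? of the original pattern.
        match($0, />[^<]*</) {
            # RSTART and RLENGTH include both delimiters, so skip one
            # character at each end to emulate the look-arounds.
            print substr($0, RSTART + 1, RLENGTH - 2)
        }'
}
```

For example, extract_inner '<b>hello</b>' prints hello. If the line has no such pair of delimiters, match() returns 0 and nothing is printed.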

answered Oct 02 '22 02:10 by melpomene