When should regex backtracking controls, like (*PRUNE), be used?

Question

Some regex engines support backtracking-related verbs: (*PRUNE), (*SKIP), (?{doSomeCode();})*, etc. I already know what these verbs do, from Reference - What does this regex mean?.

I'm inclined to think that these verbs are somewhat esoteric, or at least an unnecessary step to a more low-level type of programming. Instead of needing a (*PRUNE), wouldn't it be better (reducing the complexity for both regex writer/programmer and reader) to have more optimizations behind the scenes (like in the compiler or engine)?

So, in what situation is it useful to include a backtracking-related verb in a regex, in practice? Is there ever a benefit to do so?

* While not technically a backtracking control verb, many examples that execute arbitrary code in regexes do it in a way that affects or is affected by backtracking.

Background

These features were originally experimental, though they are no longer labeled as such in the Perl regex tutorial. It's no wonder that I am unable to find much about these constructs on the Internet (especially when the search is clogged with irrelevant results that have skip or prune outside the code). I'll bet that there are numerous people advanced enough in regex to use these verbs that simply don't know about them.

So there are a number of practical barriers preventing widespread use:

Features are experimental
Features are obscure
Features are advanced

Trying to find an answer that looks past this and finds a good use-case, or the reasoning from the developers who created these features.

I'm also aware that a similar closed (too opinion-based) question exists, but it didn’t answer my question as the only answer to say "yes" to that question gave two links, one of which was an esoteric use (additionally, I cannot understand it...). The other one, while it gave a situation of when to use (*FAIL), did not did not address any of the other constructs I mentioned, nor did it use (*FAIL) as a Backtracking mechanism. From what I understand, (*FAIL) can be emulated by any regex that always fails.

Let me respecify what I am looking for in an answer:

Relates specifically to backtracking
Non-Esoteric
Practical
More than an example of usage
Has an explanation for any examples given
May include background on reason for adding features
May include updates, with sources, relevant to the future of the features (in Perl or other regex flavors)

polettix · Accepted Answer

One good piece of documentation you can look at is the section about directives in Parse::RecDescent. The <commit> directive, in particular, seems somehow related to (*PRUNE) (although there is a (*COMMIT) too), and contains an instructive example.

My personal impression is that most of the times they provide you tools to make your regexes better (e.g. more performing, or clearer) but not necessarily more effective. As an example, you can probably live without (*PRUNE), but you would suffer from heavier backtracking and how this affects you depends on what you're trying to match. Re (*FAIL), it might probably be emulated with a non-matching sub-regex, but it's much more clear what the intent is that it at least enhances readability.

When should regex backtracking controls, like (*PRUNE), be used?

Tags:

regex

perl

backtracking

Laurel

1 Answers

polettix

Recent Activity

Donate For Us

When should regex backtracking controls, like (*PRUNE), be used?

Tags:

regex

perl

backtracking

Laurel

1 Answers

polettix

Related questions

Recent Activity

Donate For Us