
Amount of repetitions of symbols in Lua pattern setup

I'm looking for a way to specify the number of repetitions of a symbol in a Lua pattern, so that I can check how many characters of a given kind a string contains. The manual says that even with character classes this is still very limiting, because we can only match strings with a fixed length.

To solve this, patterns support these four repetition operators:

  • '*' Match the previous character (or class) zero or more times, as many times as possible.
  • '+' Match the previous character (or class) one or more times, as many times as possible.
  • '-' Match the previous character (or class) zero or more times, as few times as possible.
  • '?' Make the previous character (or class) optional.

So there is no mention of braces {}; quantifiers such as

{1,10}; {1,}; {10};

do not work.

local np = '1'
local a =  np:match('^[a-zA-Z0-9_]{1}$' )

gives a = nil.

local np = '1{1}'
local a =  np:match('^[a-zA-Z0-9_]{1}$' )

gives a = '1{1}' :)

This page confirms that braces are not among the magic characters:

Some characters, called magic characters, have special meanings when used in a pattern. The magic characters are

( ) . % + - * ? [ ^ $

So curly brackets work only as literal text and nothing more. Am I right? What is the best way to work around this 'bug'?

The usual usage of braces in regular expressions is described, for instance, here.

Vyacheslav asked Oct 01 '15 09:10



1 Answer

Admittedly, the quantifiers in Lua patterns are very limited in functionality.

  1. There are just the 4 you mentioned (+, -, * and ?).
  2. There is no support for limiting quantifiers (the ones you need).
  3. Unlike some other systems, in Lua a quantifier can only be applied to a single character or character class; there is no way to group patterns under a quantifier (see source). Constructs such as '(foo)+' or '(foo|bar)' are not supported: only single characters can be repeated or chosen between, not sub-patterns or strings.
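
The third point can be checked directly. The snippet below is a small sketch with made-up sample strings: after a closing parenthesis, '+' and '|' are not operators, so they match themselves literally.

```lua
-- '+' directly after a capture is a literal plus sign, not a quantifier,
-- so the group is never repeated.
print(('foofoo'):match('^(foo)+$'))     -- nil
print(('foo+'):match('^(foo)+$'))       -- foo

-- '|' is not an alternation operator either; it matches a literal bar.
print(('foo'):match('^(foo|bar)$'))     -- nil
print(('foo|bar'):match('^(foo|bar)$')) -- foo|bar

-- A quantifier after a single character (or class) works as expected.
print(('foooo'):match('^fo+$'))         -- foooo
```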

As a workaround, to use limiting quantifiers and all the other PCRE regex perks, you can use the rex_pcre library (part of Lrexlib).

Or, as @moteus suggests, you can partially emulate limiting quantifiers that have only a lower bound: repeat the character class the required number of times and apply one of the available Lua quantifiers to the last copy. E.g., to match 3 or more occurrences of a character class:

local np = 'abc_123'
local a = np:match('^[a-zA-Z0-9_][a-zA-Z0-9_][a-zA-Z0-9_]+$' )

See IDEONE demo
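
The same repetition trick can be stretched to cover an upper bound as well, by appending optional copies of the class. The following is only a sketch: the names bounded, valid and word are hypothetical helpers introduced here, not part of Lua or of any library, and the approach only works for a single character or character class.

```lua
-- Hypothetical helper: expand class{min,max} into a plain Lua pattern.
-- `class` must be a single pattern item such as '%d' or '[a-z_]'.
local function bounded(class, min, max)
  if max == nil then
    -- {min,}: min mandatory copies, then any number of extra ones
    return class:rep(min) .. class .. '*'
  end
  -- {min,max}: min mandatory copies, then (max - min) optional ones
  return class:rep(min) .. (class .. '?'):rep(max - min)
end

local word = '[a-zA-Z0-9_]'
local pat = '^' .. bounded(word, 1, 10) .. '$'

print(('abc_123'):match(pat))       -- abc_123
print(('abcdefghijk'):match(pat))   -- nil (11 characters)

-- For a whole-string check against one class, testing the length
-- separately is simpler than building a long pattern.
local function valid(s)
  return #s >= 1 and #s <= 10 and s:match('^' .. word .. '+$') ~= nil
end

print(valid('abc_123'), valid(''))  -- true    false
```

The generated pattern grows with the upper bound, so for large bounds the length check (or one of the libraries mentioned here) is probably the more practical choice.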

Another library to consider instead of PCRE is LPeg.

Wiktor Stribiżew answered Oct 01 '22 06:10
