
Regular expression and repeating character classes in Perl

Tags:

regex

perl

I am trying to write a regular expression that can extract (possibly multiple) strings of four hexadecimal digits.

I could do something like this: /^[A-Fa-f0-9][A-Fa-f0-9][A-Fa-f0-9][A-Fa-f0-9]/

but is there a better way?

It seems like the repeat operator:

a{n} Matches 'a' repeated exactly n times.

a{n,} Matches 'a' repeated n or more times.

a{n,m} Matches 'a' repeated between n and m times inclusive.

would work, but the following regular expression does not seem to work:

/^[A-Fa-f0-9]{4}+/

I'm trying to match strings like:

AA00

AA00FFAA

0011FFAA0022

and so on. Each string will be on its own line.

Thanks!

user210099 asked Jun 30 '11 21:06



2 Answers

Try this:

/^(?:[A-Fa-f0-9]{4})+$/
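
A minimal sketch of how this pattern might be used. The sample list and the variable names are assumptions made for illustration; only the first three strings come from the question. The non-capturing group (?:...) lets + repeat the whole four-character unit, and $ rejects lines with leftover characters. A second match with /g then splits each valid line into its groups.

```perl
use strict;
use warnings;

# Hypothetical sample input; the first three strings are from the question.
my @lines = ('AA00', 'AA00FFAA', '0011FFAA0022', 'AA00F', 'GG00');

# One or more complete groups of four hex digits, and nothing else on the line.
my $valid = qr/^(?:[A-Fa-f0-9]{4})+$/;

my %groups_for;
for my $line (@lines) {
    if ($line =~ $valid) {
        # In list context, /g returns every four-digit group in order.
        my @groups = $line =~ /[A-Fa-f0-9]{4}/g;
        $groups_for{$line} = \@groups;
        print "$line: @groups\n";
    }
    else {
        print "$line: no match\n";
    }
}
```

With this input, the three strings from the question are accepted and split into groups, while AA00F (an incomplete second group) and GG00 (non-hex characters) are rejected.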
agent-j answered Sep 21 '22 13:09


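You have nested quantifiers in your regex; i.e., {4} means to match exactly 4 times and + means to match one or more times, so the two quantifiers conflict. (Perl 5.10 and later parse {4}+ as a possessive quantifier rather than reporting an error, but it still matches only four characters.) If you simply remove the +, the pattern will match the first four hex digits of each line: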

/^[A-Fa-f0-9]{4}/
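
To show what this pattern does and does not check, here is a small sketch. The helper name first_four and every test string except AA00 and AA00FFAA are made up for the example. Because the pattern is anchored only at the start, it tests the first four characters and ignores whatever follows.

```perl
use strict;
use warnings;

# Returns the leading four hex digits of a line, or undef if there are none.
# (first_four is a hypothetical helper name used only for this example.)
sub first_four {
    my ($line) = @_;
    return $line =~ /^([A-Fa-f0-9]{4})/ ? $1 : undef;
}

for my $line ('AA00', 'AA00FFAA', 'AA00ZZ', 'ZZ00') {
    my $prefix = first_four($line);
    print "$line: ", (defined $prefix ? "matched '$prefix'" : 'no match'), "\n";
}
```

Note that AA00ZZ still matches, since only its first four characters are examined; validating the whole line needs a group, a repeat, and a trailing $ anchor.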
Corey Henderson answered Sep 18 '22 13:09