 

Multiple nested matches in JavaScript Regular Expression

I'm trying to write a regular expression that matches GS1-128 barcodes ( https://en.wikipedia.org/wiki/GS1-128 ) containing 2 or more elements, where each element is an identifier followed by a certain number of characters of data.

I need something that matches this barcode because it contains 2 of the identifier and data patterns:

human readable with the identifiers in parens: (01)12345678901234(17)501200

actual data: 011234567890123417501200

but it should not match this barcode, which contains only one pattern:

human readable: (01)12345678901234

actual data: 0112345678901234

It seems like the following should work:

var regex = /(?:01(\d{14})|10([^\x1D]{6,20})|11(\d{6})|17(\d{6})){2,}/g;
var str = "011234567890123417501200";

console.log(str.replace(regex, "$4"));
// prints 501200
console.log(str.replace(regex, "$1"));
// prints an empty string instead of 12345678901234. Why?

For some strange reason as soon as I remove the {2,} it works, but I need the {2,} so that it only returns matches if there is more than one match.

// Remove {2,} and it will return the first match
var regex = /(?:01(\d{14})|10([^\x1D]{6,20})|11(\d{6})|17(\d{6}))/g;
var str = "011234567890123417501200";

console.log(str.replace(regex, "$4"));
// matches 501200
console.log(str.replace(regex, "$1"));
// matches 12345678901234
// but then the problem is it would also match single identifiers such as
var str2 = "0112345678901234";
console.log(str2.replace(regex, "$1"));
 

How do I make this work so it will only match and pull the data if there is more than 1 set of match groups?

Thanks!

Uniphonic asked Feb 17 '17 00:02


1 Answer

Your RegEx is logically and syntactically correct for Perl-Compatible Regular Expressions (PCRE). The issue I believe you are facing is that JavaScript handles repeated capture groups differently: each time the quantified group starts a new iteration, the capture groups inside it are reset. With the {2,} in place, only the groups that took part in the last iteration (here $4) keep a value, and $1 comes back empty. This is why the RegEx works fine once you take out the {2,}.
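
The reset can be seen directly with exec(). This is a minimal sketch that reuses the regex and sample string from the question; the variable names are only illustrative.

```javascript
// Quantified version of the regex from the question (no g flag needed for a single exec call).
var quantified = /(?:01(\d{14})|10([^\x1D]{6,20})|11(\d{6})|17(\d{6})){2,}/;
var sample = "011234567890123417501200";

var result = quantified.exec(sample);

// The overall match spans both elements.
console.log(result[0]); // 011234567890123417501200
// Group 1 took part in the first iteration only, so it was cleared
// when the second iteration started.
console.log(result[1]); // undefined
// Group 4 took part in the last iteration, so it still holds its value.
console.log(result[4]); // 501200
```

In replace(), a group that did not participate is substituted as an empty string, which is what the "$1" call in the question prints.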

What I would recommend is removing the {2,} quantifier and then programmatically checking for matches. I know it's not ideal for those who are big fans of RegEx, but c'est la vie.

Please see the snippet below:

var regex = /(?:01(\d{14})|10([^\x1D]{6,20})|11(\d{6})|17(\d{6}))/g;
var str = "011234567890123417501200";

// Check to see if we have at least 2 matches.
// match() returns null when nothing matches, so fall back to an empty array.
var m = str.match(regex) || [];
console.log("Matches list: " + JSON.stringify(m));
if (m.length < 2) {
    console.log("We only received " + m.length + " matches.");
} else {
    console.log("We received " + m.length + " matches.");
    console.log("We have achieved the minimum!");
}

// If we exec the regex, what would we get?
console.log("** Method 1 **");
var n;
while (n = regex.exec(str)) {
    console.log(JSON.stringify(n));
}

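// exec() returns one result per element, but the data sits in a different
// capture group each time.  Let's try using a second regex instead.
console.log("** Method 2 **");
// Split each match into its 2-digit identifier and its data.
// [^\x1D] mirrors the first regex, so non-numeric data after identifier 10 also splits.
var regex2 = /^(\d{2})([^\x1D]{6,})$/;
var arr = [];
var obj = {};
for (var i = 0, len = m.length; i < len; i++) {
    arr = m[i].match(regex2);
    obj[arr[1]] = arr[2];
}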

console.log(JSON.stringify(obj));

// EOF
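
As a variation, the exec() results can be used directly, because only one of the four capture groups takes part in each match. The following is a rough sketch under that assumption, not a definitive implementation; parseElements and its return shape are hypothetical names introduced for illustration.

```javascript
// Hypothetical helper: collects identifier/data pairs, or returns null
// when fewer than two elements are found.
function parseElements(data) {
    // Same alternatives as the question's regex, without the {2,} quantifier.
    var element = /01(\d{14})|10([^\x1D]{6,20})|11(\d{6})|17(\d{6})/g;
    var found = {};
    var count = 0;
    var match;
    while ((match = element.exec(data)) !== null) {
        // The identifier is the first two characters of each element.
        var identifier = match[0].slice(0, 2);
        // Only one group participates per match; the others are undefined.
        found[identifier] = match[1] || match[2] || match[3] || match[4];
        count++;
    }
    return count >= 2 ? found : null;
}

var parsed = parseElements("011234567890123417501200");
console.log(parsed["01"]); // 12345678901234
console.log(parsed["17"]); // 501200
console.log(parseElements("0112345678901234")); // null
```

Like the snippet above, this counts matches found anywhere in the string, so it does not require the elements to be adjacent the way the {2,} version did.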

I hope this helps.

Damian T. answered Sep 28 '22 18:09