I wrote a simple regex to capture a certain group in a string:
/[a-z]+([0-9]+)[a-z]+/gi (n letters, m digits, k letters).
Code:
var myString='aaa111bbb222ccc333ddd';
var myRegexp=/[a-z]+([0-9]+)[a-z]+/gi;
var match=myRegexp.exec(myString);
console.log(match)
while (match != null)
{
match = myRegexp.exec(myString);
console.log(match)
}
The results were:
["aaa111bbb", "111"]
["ccc333ddd", "333"]
null
But wait a minute, why didn't it try the bbb222ccc part?
I mean, it saw aaa111bbb, but then it should have tried bbb222ccc ... (that's greedy!)
What am I missing?
Also, looking at this loop:
while (match != null)
{
match = myRegexp.exec(myString);
console.log(match)
}
How did it progress to the second result? At first there was:
var match=myRegexp.exec(myString);
and later (in the while loop):
match=myRegexp.exec(myString);
match=myRegexp.exec(myString);
It is the same line... where does it remember that the first result was already shown?
The standard quantifiers in regular expressions are greedy, meaning they match as much as they can, only giving back as necessary to match the remainder of the regex. By using a lazy quantifier, the expression tries the minimal match first.
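As a minimal sketch of the difference between a greedy and a lazy quantifier (the sample string and variable names are illustrative, not taken from the question):

```javascript
// Sample input, chosen only for illustration.
var sample = 'aaa111bbb';

// Greedy: [a-z]+ takes as many letters as it can.
var greedy = /[a-z]+/.exec(sample)[0];

// Lazy: [a-z]+? stops as soon as the overall match can succeed.
var lazy = /[a-z]+?/.exec(sample)[0];

console.log(greedy); // "aaa"
console.log(lazy);   // "a"
```

In the question's pattern the trailing [a-z]+ takes all of bbb as part of the first match, so those characters are already consumed when the global search resumes.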
.exec is stateful when you use the g flag. The state is kept in the regex object's .lastIndex property.
var myString = 'aaa111bbb222ccc333ddd';
var myRegexp = /[a-z]+([0-9]+)[a-z]+/gi;
var match = myRegexp.exec(myString);
console.log(myRegexp.lastIndex); // 9, so the next `.exec` starts searching at index 9
while (match != null) {
match = myRegexp.exec(myString);
console.log(myRegexp.lastIndex);
}
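The state can be reset by setting .lastIndex to 0, or by a failed match: when .exec finds no match, it sets .lastIndex back to 0. re.exec("") for instance will reset the state, because nothing can match in an empty string. Note that the regex does not remember which string it was run against; .lastIndex simply carries over to whatever string you pass next, so the index left behind by 'aaa111bbb222ccc333ddd' would also apply to a different string.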
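A small sketch of resetting the state, first through a failed match and then by assigning .lastIndex directly (the names re and text are illustrative):

```javascript
var re = /[a-z]+([0-9]+)[a-z]+/gi;
var text = 'aaa111bbb222ccc333ddd';

re.exec(text);
var afterFirstMatch = re.lastIndex;   // 9, the end of "aaa111bbb"

re.exec('');                          // nothing can match, so this returns null
var afterFailedMatch = re.lastIndex;  // 0, a failed match resets lastIndex

re.exec(text);                        // matches "aaa111bbb" again, lastIndex is 9
re.lastIndex = 0;                     // explicit reset
var firstAgain = re.exec(text)[0];    // "aaa111bbb" once more

console.log(afterFirstMatch, afterFailedMatch, firstAgain);
```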
The same applies to the .test method as well, so never use the g flag with a regex that is used for .test if you prefer no surprises. See https://developer.mozilla.org/en-US/docs/JavaScript/Reference/Global_Objects/RegExp/exec
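To illustrate the kind of surprise meant here, a short sketch with a hypothetical hasDigits regex and input string:

```javascript
// Hypothetical regex and input, used only to show the effect of the g flag.
var hasDigits = /[0-9]+/g;
var input = 'abc123';

var firstCall = hasDigits.test(input);   // true, lastIndex is now 6
var secondCall = hasDigits.test(input);  // false, the search started at index 6
var thirdCall = hasDigits.test(input);   // true, the failed call reset lastIndex to 0

console.log(firstCall, secondCall, thirdCall);
```

Without the g flag, every call starts at index 0 and returns true for this input.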
You can also update the lastIndex property manually:
var myString='aaa111bbb222ccc333ddd';
var myRegexp=/[a-z]+([0-9]+)[a-z]+/gi;
var match=myRegexp.exec(myString);
console.log(match);
while (match != null)
{
myRegexp.lastIndex -= match[0].length - 1; // Set the cursor to the position just after the beginning of the previous match
match = myRegexp.exec(myString);
console.log(match)
}
See the MDN documentation for exec.
EDIT:
By the way, your regex should be: /[a-z]{3}([0-9]{3})[a-z]{3}/gi
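As a sketch of why the fixed-width pattern matters, here it is combined with the lastIndex rewind from the loop above (the names fixedWidth and captured are illustrative):

```javascript
var myString = 'aaa111bbb222ccc333ddd';
var fixedWidth = /[a-z]{3}([0-9]{3})[a-z]{3}/gi;
var captured = [];
var match;

while ((match = fixedWidth.exec(myString)) !== null) {
  captured.push(match[1]);
  // Rewind to one character past the start of this match,
  // so the next search may overlap the current match.
  fixedWidth.lastIndex -= match[0].length - 1;
}

console.log(captured); // ["111", "222", "333"]
```

With the original + quantifiers, the same rewind also reports shortened overlaps such as aa111bbb and a111bbb, because one or two leading letters are still enough to satisfy [a-z]+.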