I am using the following JavaScript to read strings out of a text file and process them with a regular expression:
while (!textFile.AtEndOfStream)
{
    currLine = textFile.ReadLine();
    match = re.exec(currLine);
    // do stuff with match
}
The problem I have is that every other time re.exec is called it fails and returns null; so the first row is processed correctly, but the second row results in null, then the third row works, and the fourth row results in null.
I can use the following code to get the result I want:
while (!textFile.AtEndOfStream)
{
    currLine = textFile.ReadLine();
    match = re.exec(currLine);
    if (match == null) match = re.exec(currLine);
}
but that seems a bit of a nasty kludge. Can anyone tell me why this happens and what I can do to fix it properly?
Your re is defined with the 'global' modifier, e.g. something like /foo/g.
When a RegExp is global, it retains hidden state in the RegExp instance itself to remember the last place it matched. The next time you search, it'll search forward from the index of the end of the last match, and find the next match from there. If you're passing a different string to the one you passed last time, this will give highly unpredictable results!
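The alternating failure can be reproduced in a few lines. This is a minimal sketch in modern JavaScript rather than the asker's scripting-host code: the pattern and the `lines` array are hypothetical stand-ins for `re` and the text file.

```javascript
// Hypothetical pattern with the g flag, standing in for the asker's `re`.
const re = /(\w+)=(\d+)/g;

// Hypothetical input, standing in for the lines read from the text file.
const lines = ["a=1", "b=2", "c=3", "d=4"];

const results = lines.map((line) => {
  const match = re.exec(line);
  // Record whether the call matched and where the next search would start.
  return { line, matched: match !== null, lastIndexAfter: re.lastIndex };
});

console.log(results.map((r) => r.matched));
// prints [ true, false, true, false ]
console.log(results.map((r) => r.lastIndexAfter));
// prints [ 3, 0, 3, 0 ]
```

After a successful match `re.lastIndex` is 3, so the next call starts searching at index 3 of a fresh three-character string, finds nothing, returns null, and resets `lastIndex` to 0. That is why every second line fails.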
When you use global regexps, you should exhaust them by calling them repeatedly until you get null. Then the next time you use it you'll be matching from the start of the string again. Alternatively you can explicitly set re.lastIndex to 0 before using one. If you only want to test for the existence of one match, as in this example, simplest is just not to use g.
The JS RegExp interface is one of the most confusing, poorly-designed parts of the language. (And this is JavaScript, so that's saying a lot.)
JavaScript regular expressions keep some state between executions and you are probably falling into that trap.
I always use the String.match function and have never been bitten:
while (!textFile.AtEndOfStream)
{
    match = textFile.ReadLine().match(re);
    // do stuff with match
}
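One caveat worth checking before switching to String.match: with a global regexp it returns only the full matched substrings, not the capture groups that exec returns. A small sketch in modern JavaScript, using a hypothetical pattern and input:

```javascript
// Hypothetical pattern in global and non-global form.
const globalRe = /(\w+)=(\d+)/g;
const plainRe = /(\w+)=(\d+)/;

// Hypothetical input, standing in for the lines read from the text file.
const lines = ["a=1", "b=2", "c=3"];

// With g: match() starts from index 0 on every call, so no line is skipped,
// but each result holds only the full matches.
const globalMatches = lines.map((line) => line.match(globalRe));
console.log(JSON.stringify(globalMatches));
// prints [["a=1"],["b=2"],["c=3"]]

// Without g: match() behaves like exec() and includes the capture groups.
const plainMatches = lines.map((line) => line.match(plainRe));
console.log(JSON.stringify(plainMatches));
// prints [["a=1","a","1"],["b=2","b","2"],["c=3","c","3"]]
```

So String.match avoids the alternating null, but if "do stuff with match" relies on capture groups, the pattern still needs to be non-global.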