Infinite loop using a pair of Perl regex matches

Question

I wrote a small Perl script with regular expressions to get HTML components of a website.

I know it's not a good way of doing this kind of job, but I was trying to test out my regex skills.

With only one of the two regex patterns in the while loop, the script runs perfectly and displays the correct output. But when I check both patterns in the while loop, the second pattern matches every time and the loop runs forever.

My script:

#!/usr/bin/perl -w
use strict;

while (<STDIN>) {

    while ( (m/<span class=\"itempp\">([^<]+)+?<\/span>/g) ||
            (m/<font size=\"-1\">([^<]+)+?<\/font>/g) ) {
        print "$1\n";
    }
}

I am testing the above script with a sample input:

<a href="http://linkTest">Link title</a>
<span class="itempp">$150</span>
<font size="-1"> (Location)</font>

Desired output:

$150
(Location)

Thank you! Any help would be highly appreciated!

Borodin · Accepted Answer

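Whenever a global (/g) match in scalar context fails, it resets the string's search position (the value reported by pos), so the next global match on that string starts searching from the beginning again. So when the first of your two patterns fails, it forces the second to look from the start of the string, where it finds the same match it found on the previous iteration. That is why the loop never terminates on the font line.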

This behaviour can be disabled by adding the /c modifier alongside /g, which leaves the position unchanged if a regex fails to match.
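To see the position reset in isolation, here is a minimal sketch. The helper name pos_after_failed_match and the test string are made up for illustration and are not part of the original script. It runs one successful /g match followed by one failing match, then reports pos for the string.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): returns pos() for a string after
# one successful /g match followed by one failing match.
sub pos_after_failed_match {
    my ($use_c) = @_;
    my $str = 'aaa bbb';

    # Scalar-context /g match: succeeds and leaves pos($str) at 3.
    my $first = ( $str =~ /aaa/g );

    # This pattern cannot match anywhere in the string.
    my $second = $use_c
        ? ( $str =~ /zzz/gc )    # failure keeps pos($str) where it was
        : ( $str =~ /zzz/g );    # failure resets pos($str) to undef

    return pos($str);
}

my $without_c = pos_after_failed_match(0);
my $with_c    = pos_after_failed_match(1);

print 'without /c: ', ( defined $without_c ? $without_c : 'undef' ), "\n";
print 'with /c:    ', ( defined $with_c    ? $with_c    : 'undef' ), "\n";
```

Without /c the position comes back as undef, so a following /g match would start again at offset 0. With /c it stays at 3, just past the earlier match.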

In addition, you can improve your patterns by removing the escape characters (" doesn't need escaping and / needn't be escaped if you choose a different delimiter) and the superfluous +? after the captures.

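Also, use warnings is much better than -w on the command line: it is lexically scoped, whereas -w turns warnings on globally, including inside any modules you load.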

Here is a working version of your code.

use strict;
use warnings;

while (<STDIN>) {

    while( m|<span class="itempp">([^<]+)</span>|gc
            or m|<font size="-1">([^<]+)</font>|gc ) {
        print "$1\n";
    }
}
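
Another way to avoid the problem, offered only as a sketch and not as part of the fix above, is to fold both patterns into a single regex so that only one /g match ever touches the position. This assumes Perl 5.10 or later for the branch reset group (?|...), which numbers the capture in either alternative as $1. The helper name extract_items and the @sample array are hypothetical; the sample lines are the ones from the question.

```perl
use strict;
use warnings;

# Hypothetical helper (illustration only): returns every piece of text
# captured from either tag in a single line of input.
sub extract_items {
    my ($line) = @_;
    my @items;

    # One /g match drives the loop, so a failed match simply ends it.
    # (?|...) is a branch reset group: both alternatives capture into $1.
    while ( $line =~ m{(?|<span class="itempp">([^<]+)</span>|<font size="-1">([^<]+)</font>)}g ) {
        push @items, $1;
    }
    return @items;
}

# Sample input from the question, single-quoted so $150 is not interpolated.
my @sample = (
    '<a href="http://linkTest">Link title</a>',
    '<span class="itempp">$150</span>',
    '<font size="-1"> (Location)</font>',
);

for my $line (@sample) {
    print "$_\n" for extract_items($line);
}
```

Note that the captured text for the font line keeps its leading space, because the pattern does not trim whitespace.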

cdtits · Answer

while (<DATA>) {
    if (m{<(?:span class="itempp"|font size="-1")>\s*([^<]+)}i) {
        print "$1\n";
    }
}

__DATA__
<a href="http://linkTest">Link title</a>
<span class="itempp">$150</span>
<font size="-1"> (Location)</font>

Tags: html, string, regex, pattern-matching, perl

javaCity
