
Regular expression to get link text

I'm stumped! I've googled and read and read and read and I'm sure there is something really dumb that I'm doing wrong. This is from a Greasemonkey script that I can't for the life of me get to initiate AND perform correctly. I'm trying to match this:

<a href="/browse/post/SOMETHING/">**SOMETHING** (1111)</a>

Here's what I'm using:

var titleRegex = new RegExp("<a href=\"/browse/post/\d*/\">(.*) \(");

I'm sure I'm missing some kind of escape characters? But I just can't figure it out so that Firefox doesn't error out.

I generated the regexp with http://regexpal.com/. In the Firefox error console I get "unterminated parenthetical".

spazzed asked Feb 01 '26 23:02


2 Answers

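When building a regex from a string instead of a regex literal, you need to double the backslashes. The string literal consumes a single backslash before the RegExp constructor ever sees it, so \d arrives as d and \( arrives as a bare (, which opens a group that is never closed. That is where the "unterminated parenthetical" error comes from.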
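To make the escaping issue concrete, here is a small sketch that prints what the string literal actually contains before it reaches the constructor. The variable names and the sample link are made up for illustration; the patterns are the one from the question and its doubled-backslash form.

```javascript
// A single backslash is consumed by the string literal itself:
// "\d" becomes "d" and "\(" becomes "(" before RegExp sees the pattern.
const singleBackslashes = "<a href=\"/browse/post/\d*/\">(.*) \(";
console.log(singleBackslashes); // <a href="/browse/post/d*/">(.*) (

let constructorError = null;
try {
  new RegExp(singleBackslashes); // the trailing "(" opens a group that never closes
} catch (err) {
  constructorError = err;
}
console.log(constructorError instanceof SyntaxError); // true

// Doubling the backslashes keeps them in the pattern text.
const doubledBackslashes = "<a href=\"/browse/post/\\d*/\">(.*) \\(";
console.log(doubledBackslashes); // <a href="/browse/post/\d*/">(.*) \(
const fromString = new RegExp(doubledBackslashes);

// A regex literal needs no doubling, but "/" has to be escaped instead.
const fromLiteral = /<a href="\/browse\/post\/\d*\/">(.*) \(/;

// Hypothetical sample link with a numeric id.
const escapeSample = '<a href="/browse/post/1234/">Some title (1111)</a>';
console.log(fromString.test(escapeSample), fromLiteral.test(escapeSample)); // true true
```

If the pattern is fixed and not assembled at runtime, the literal form is usually the easier one to read, since only the forward slashes need escaping.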

Then, \d* only matches digits. I'm assuming that SOMETHING is just a placeholder, but if it contained anything other than digits, the match would fail.

Also, you should be using (.*?) (lazy) instead of (.*) (greedy), or you might be matching too much. Perhaps ([^(]*) would be even better.
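A quick way to see the difference between the three capture groups is to run them against input with more than one link on a line. The input below is invented for the comparison; the real page may look different.

```javascript
// Hypothetical input: two links on the same line.
const twoLinks =
  '<a href="/browse/post/12/">First (1)</a> ' +
  '<a href="/browse/post/34/">Second (2)</a>';

const greedy = /<a href="\/browse\/post\/\d*\/">(.*) \(/;
const lazy = /<a href="\/browse\/post\/\d*\/">(.*?) \(/;
const negated = /<a href="\/browse\/post\/\d*\/">([^(]*) \(/;

// Greedy runs to the LAST " (" on the line and swallows the second link.
console.log(greedy.exec(twoLinks)[1]);
// First (1)</a> <a href="/browse/post/34/">Second

// Lazy stops at the first " (" it can reach.
console.log(lazy.exec(twoLinks)[1]); // First

// The negated class can never cross a "(" in the first place.
console.log(negated.exec(twoLinks)[1]); // First
```

The lazy and negated versions give the same result here. The negated class just states the intent ("everything up to the first parenthesis") more directly.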

Hard to say, though, without knowing more about the actual text you're trying to match.

All in all:

var titleRegex = new RegExp("<a href=\"/browse/post/\\d*/\">([^(]*) \\(");
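As a rough check, the regex above can be run against a couple of sample links. Both sample strings and the `anyIdRegex` variant are assumptions for illustration, since the actual markup isn't shown in the question.

```javascript
const titleRegex = new RegExp("<a href=\"/browse/post/\\d*/\">([^(]*) \\(");

// Hypothetical markup with a numeric post id.
const numericLink = '<a href="/browse/post/1234/">Some title (1111)</a>';
const titleMatch = titleRegex.exec(numericLink);
console.log(titleMatch ? titleMatch[1] : null); // Some title

// \d* accepts digits only, so a non-numeric id does not match.
const slugLink = '<a href="/browse/post/some-slug/">Some title (1111)</a>';
console.log(titleRegex.exec(slugLink)); // null

// Assumed variant: accept any characters up to the next slash.
const anyIdRegex = new RegExp("<a href=\"/browse/post/[^/]*/\">([^(]*) \\(");
console.log(anyIdRegex.exec(slugLink)[1]); // Some title
```

Checking the result of `exec` for `null` before indexing into it avoids a TypeError on links that don't fit the pattern.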
Tim Pietzcker answered Feb 04 '26 11:02


Here's a simple fix:

/href=\".*?\">(.*?)\(/
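For comparison, here is how that literal behaves on a sample link (the sample string is made up). Two things to keep in mind: it matches any href, not only /browse/post/ links, and because there is no space before \( in the pattern, the captured text keeps its trailing space.

```javascript
const looseRegex = /href=\".*?\">(.*?)\(/;

// Hypothetical sample link.
const sampleLink = '<a href="/browse/post/1234/">Some title (1111)</a>';
const looseMatch = looseRegex.exec(sampleLink);

// JSON.stringify makes the trailing space visible.
console.log(JSON.stringify(looseMatch[1])); // "Some title "
console.log(looseMatch[1].trim()); // Some title
```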
imsky answered Feb 04 '26 12:02