
Using (?<! regex assert without fixed width

Tags: html, regex

I have this regex that works nearly as expected...

(?<!color: )(?<!color:)(?<!pid=[0-9][0-9][0-9][0-9][0-9])\#(\w+)

Let's say this is my html code:

<span style='color: #FFAABB'><a href='?pid=55155#pid55155'>hey #hello</a></span>

The regex matches only #hello, which is correct. The problem is that I don't know how many digits will follow "pid", and I can't use the "?", "*" or "{n,m}" quantifiers inside "(?<!)" (I don't know why).

My question is: Is there any way to make it dynamic?

Please don't suggest:

(?<!color: )(?<!color:)(?<!pid=[0-9])(?<!pid=[0-9][0-9])(?<!pid=[0-9][0-9][0-9])(?<!pid=[0-9][0-9][0-9][0-9])\#(\w+)

Because it's awful.

Here is a working example:

https://www.regex101.com/r/rC2mH4/1

Thanks in advance :)

Daii asked Oct 20 '22 18:10


2 Answers

If your regex engine supports (*SKIP)(*F) (PCRE does, for example), you can simply use the pattern below.

(?:color:\s*|pid=\d*)#(*SKIP)(*F)|#(\w+)

Note that \s in the pattern above also matches newline characters. Use \h instead to match only horizontal whitespace.

Explanation:

  • (?:color:\s*|pid=\d*)# matches each # you want to exclude, together with what precedes it: either color: followed by zero or more whitespace characters, or pid= followed by zero or more digits.

  • (*SKIP)(*F) makes that match fail and stops the engine from retrying inside the text it just consumed, so the search resumes after the skipped #.

  • The pattern after | is #(\w+), which then matches only the remaining hashtags, the ones you want.
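For engines without (*SKIP)(*F), the same idea can be approximated by letting the unwanted branch match without a capture group and then discarding those matches in code. Below is a minimal sketch in Python, whose built-in re module does not support these verbs. The function name find_hashtags is hypothetical, and [ \t]* stands in for \h*, which Python's re does not support either.

```python
import re

# Assumption: emulate (*SKIP)(*F) by hand. The first branch consumes the
# unwanted "color: #" / "pid=123#" prefixes without filling group 1;
# the second branch captures the hashtags we actually want.
PATTERN = re.compile(r"(?:color:[ \t]*|pid=\d*)#|#(\w+)")


def find_hashtags(text):
    """Return hashtag names not preceded by `color:` or `pid=<digits>`."""
    return [m.group(1) for m in PATTERN.finditer(text) if m.group(1) is not None]


html = "<span style='color: #FFAABB'><a href='?pid=55155#pid55155'>hey #hello</a></span>"
print(find_hashtags(html))  # ['hello']
```

Because the unwanted branch consumes the # itself, the scan resumes after it and the hashtag branch never gets a chance to match that position.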

Avinash Raj answered Oct 23 '22 01:10


color:\s*#\w+|pid=\d+#\w+|(#\w+)

You can try this. Just grab capture group 1: the first two alternatives match the # tokens you don't want, and the last one captures the ones you do. See the demo:

https://www.regex101.com/r/rC2mH4/3

$re = "/color:\\s*#\\w+|pid=\\d+#\\w+|(#\\w+)/m";
$str = "<span style=\"font-weight: bold;\">test1<span style=\"color: #FFA500;\">test2</span>test3</span>#hello#how#are#you\n<span style=\"font-weight: bold;\">test1<span style=\"color: #FFA500;\">test2</span>test3</span>#lalala #hello\n<div class=\"post_body\" id=\"pid_58705\">\n<blockquote><cite><span> (Hoy 02:42)</span>Moroha escribió: <a class=\"quick_jump\" href=\"http://test.com/Thread-hello?pid=58672#pid58672\" rel=\"nofollow\">&nbsp;</a></cite>testing</blockquote></div>\npid=97589735935795358672#foobar\n<span style='color: #FFAABB'><a href='?pid=55155#pid55155'>hey #hello</a></span>";

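preg_match_all($re, $str, $matches);
// $matches[1] holds group 1 for every match; it is an empty string
// wherever one of the unwanted alternatives matched, so filter those out.
print_r(array_filter($matches[1]));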
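
The same pattern carries over to engines that lack both variable-width lookbehind and (*SKIP)(*F). Here is a rough Python port, assuming the standard re module and a shortened sample string taken from the test data above; the variable names are only illustrative.

```python
import re

# Same alternation as the PHP pattern: the first two branches consume the
# unwanted "#..." tokens, and only the last branch fills group 1.
PATTERN = re.compile(r"color:\s*#\w+|pid=\d+#\w+|(#\w+)")

sample = (
    '<span style="color: #FFA500;">test2</span>#hello#how#are#you\n'
    "pid=97589735935795358672#foobar\n"
    "<span style='color: #FFAABB'><a href='?pid=55155#pid55155'>hey #hello</a></span>"
)

# re.findall returns group 1 for every match, using '' when the group
# did not participate, so empty entries are dropped.
hashtags = [tag for tag in PATTERN.findall(sample) if tag]
print(hashtags)  # ['#hello', '#how', '#are', '#you', '#hello']
```

Note that this variant keeps the leading # in each result, because the capture group wraps #\w+ rather than \w+ alone.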
vks answered Oct 23 '22 01:10