I'm trying to check whether this embedded Vimeo iframe:
<iframe src="https://player.vimeo.com/video/800711372?h=589188fdd4&title=0&byline=0&portrait=0" width="640" height="360" frameborder="0" allow="autoplay; fullscreen; picture-in-picture" allowfullscreen></iframe>
appears exactly once in a string. That means:
<iframe.....></iframe><iframe.....></iframe> (doesn't match)
<iframe.....></iframe> (match)
I used this pattern:
^(<iframe[^(src)]*?src\=\"https?\:\/\/player\.vimeo\.com\/video\/[^>]*?>)<\/iframe>$
and it works fine, but I'm not sure it's a good approach. Is there a better way to achieve this? From a little research, people suggest using lookahead or negative lookahead.
Edit: Oh, the reason my regex works is in my code: I remove all newlines before applying the regex. So if we have:
<iframe.....></iframe>
<iframe.....></iframe>
<iframe.....></iframe> (multiple, keep the line breaks)
my regex will match all.
I don't believe this is an HTML question.
Assume you want to check a string for a single occurrence of a sub-string.
How is that done? There are two good ways.
The first way: actively check before and after the occurrence of the sub-string.
Regex engines don't give up trying to match. If you search for the sub-string and then
follow up with a negative assertion that it doesn't exist downstream, the engine
will just match the last occurrence, which satisfies the assertion.
Therefore you need a character-by-character check, both before and after the match, that no other copy of the sub-string exists.
This is fairly slow.
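As a rough illustration of the active check, here is a minimal Python sketch. It assumes a simplified sub-string pattern (`SUB`, not the pattern from this answer), and the names `LAST_ONLY`, `EXACTLY_ONE` and `has_exactly_one` are made up for the example. It also shows why a single trailing negative assertion is not enough on its own.

```python
import re

# Simplified stand-in for "the sub-string" (an assumption for this sketch):
# an <iframe ...> tag immediately followed by its closing tag.
SUB = r"<iframe\b[^>]*>\s*</iframe>"

# Naive attempt: the sub-string followed by "no further sub-string downstream".
# search() just slides forward until it lands on the last occurrence.
LAST_ONLY = re.compile(SUB + r"(?![\S\s]*" + SUB + r")")

# Active check: every character before and after the sub-string is guarded
# by a negative lookahead, so a second occurrence cannot start anywhere.
NOT_SUB_CHAR = r"(?:(?!" + SUB + r")[\S\s])"
EXACTLY_ONE = re.compile(NOT_SUB_CHAR + r"*" + SUB + NOT_SUB_CHAR + r"*")


def has_exactly_one(text: str) -> bool:
    """True if `text` contains SUB once and only once."""
    return EXACTLY_ONE.fullmatch(text) is not None


one = '<iframe src="https://player.vimeo.com/video/1"></iframe>'
two = one + "\n" + '<iframe src="https://player.vimeo.com/video/2"></iframe>'

print(LAST_ONLY.search(two) is not None)           # True: naive pattern still matches
print(has_exactly_one(one), has_exactly_one(two))  # True False
```

The lookahead runs at every character position, which is where the cost mentioned above comes from.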
The second way: passively match 2 occurrences of the sub-string. Passively means un-greedy (.*?):
match the first sub-string, then un-greedily and OPTIONALLY match a second occurrence.
The engine will try hard to match both occurrences. The second occurrence
is within a capture group, which acts as a flag to be examined on a successful match.
If that group is not NULL, the regex found 2 or more sub-strings.
If that group is NULL, there is a 100% assurance there is only a single sub-string.
Note that if the regex matched at all, it found at least one sub-string.
Example:
(<iframe\s+(?:"[\S\s]*?"|'[\S\s]*?'|[^>]*?)+>\s*</iframe>)(?:(?:[\S\s]*?(<iframe\s+(?:"[\S\s]*?"|'[\S\s]*?'|[^>]*?)+>\s*</iframe>))|)
Failed, Group 2 is not NULL https://regex101.com/r/DmThDT/1
Passed, Group 2 is NULL https://regex101.com/r/393BPn/1
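For reference, a minimal Python sketch of how the group 2 flag could be examined. The pattern is the one from the example above; the helper name `classify` and the sample strings are assumptions made for illustration, and in Python an unmatched group is reported as `None` rather than NULL.

```python
import re

# The iframe sub-pattern, copied from the example above.
IFRAME = r"""<iframe\s+(?:"[\S\s]*?"|'[\S\s]*?'|[^>]*?)+>\s*</iframe>"""

# Group 1: the first occurrence. Group 2: an optional later occurrence (the flag).
TWO_OCCURRENCES = re.compile(
    r"(" + IFRAME + r")(?:(?:[\S\s]*?(" + IFRAME + r"))|)"
)


def classify(text: str) -> str:
    """Return 'none', 'single' or 'multiple' depending on what was found."""
    m = TWO_OCCURRENCES.search(text)
    if m is None:
        return "none"        # not even one sub-string
    if m.group(2) is None:
        return "single"      # flag group did not participate
    return "multiple"        # flag group matched a second sub-string


one = '<iframe src="https://player.vimeo.com/video/1" width="640"></iframe>'
two = one + "\n" + one

print(classify(one))  # single
print(classify(two))  # multiple
```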
HTML should be parsed with some kind of HTML parser; however, I believe this question is not about that.
My attempt at HTML tags is thrown in, but this could be anything.
Overview
(                             # (1 start)
     <iframe \s+
     (?: " [\S\s]*? " | ' [\S\s]*? ' | [^>]*? )+
     > \s* </iframe>
)                             # (1 end)
(?:
     (?:
          [\S\s]*?
          (                             # (2 start)
               <iframe \s+
               (?: " [\S\s]*? " | ' [\S\s]*? ' | [^>]*? )+
               > \s* </iframe>
          )                             # (2 end)
     )
  |
)