regex matching links without <a> tag

Tags:

anchor

(http([s]?):\/\/?)(([a-zA-Z0-9]+(\.?))+)([a-zA-Z0-9]+((\.[a-zA-Z]{2,5}){1,2})((\/[a-zA-Z0-9\?&=_\-\~:/?#[\]@!\$&'()\*\+,;]*)*)((\.[a-zA-Z]{2,5}){0,2}))

This is my regex, which works well for matching the links in a string. But I don't want it to select every link: if a link has "> before it, or </a> after it, that link shouldn't be matched. How can this be done?

These should be matched:

adasdas http://www.stackoverflow.com asdasas
adasdasahttp://www.stackoverflow.com/something asdas

These should NOT be matched:

adasdas<a href="somelink">           http://www.stackoverflow.com     </a>asdasas
adasdasa<a href="somelink">http://www.stackoverflow.com/something</a> asdas

Why do I need this?: I want every link to be clickable even if it isn't between anchor tags.

477

asked Jul 09 '14 10:07

Wellenbrecher

2 Answers

With all the usual disclaimers about using regex to parse HTML, if you want to use regex for this task, this will work:

$regex = "~<a.*?</a>(*SKIP)(*F)|https?://\S+~";

See the demo.

This problem is a classic case of the technique explained in this question to "regex-match a pattern, excluding..."

The left side of the alternation | matches a complete <a ...>...</a> element and then deliberately fails: (*F) forces the failure, and (*SKIP) tells the engine not to retry inside the text it has just consumed but to resume right after it. The right side matches the URLs, and we know they are the right ones because they were not consumed by the expression on the left.

The URL regex I put on the right can be refined; just use whatever suits your needs.
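
Since the stated goal is to make every bare link clickable, the skip-and-fail pattern can be dropped into a replacement call. The following is only a sketch under assumptions: `linkify` is a hypothetical helper name, and the `\b`, the `https?`, the `[^\s<]+` URL pattern and the `is` flags are illustrative refinements rather than part of the pattern given above. It also assumes the input is already HTML, so the URL is inserted without further escaping.

```php
<?php
// Hypothetical helper: wraps bare URLs in <a> tags and leaves existing anchors alone.
function linkify(string $text): string
{
    // Left branch: consume a whole <a ...>...</a> element, then (*SKIP)(*F)
    // discards it and resumes scanning after the closing tag.
    // Right branch: a deliberately simple URL pattern (assumption: a URL ends
    // at whitespace or at the next '<').
    // Flags: i = case-insensitive, s = let '.' cross line breaks inside an anchor.
    $regex = '~<a\b.*?</a>(*SKIP)(*F)|https?://[^\s<]+~is';

    // $0 is the whole match, i.e. the bare URL found by the right branch.
    return preg_replace($regex, '<a href="$0">$0</a>', $text);
}

echo linkify('adasdas http://www.stackoverflow.com asdasas'), PHP_EOL;
echo linkify('adasdasa<a href="somelink">http://www.stackoverflow.com/something</a> asdas'), PHP_EOL;
```

The first line should come back with the URL wrapped in an anchor, while the second should be returned unchanged because the URL already sits inside one.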

Reference

How to match (or replace) a pattern except in situations s1, s2, s3...
Article about matching a pattern unless...

118

answered Sep 22 '22 04:09

zx81

You need to add lookarounds to your regex; see:

Regular expression negative lookahead
Lookahead and Lookbehind Zero-Length Assertions
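
As a rough illustration of the lookaround idea, the sketch below rejects a URL that is directly preceded by `">` or followed (after optional whitespace) by `</a>`. The URL pattern `https?://[^\s<]++` is a simplified stand-in for the one in the question, and the variable names are made up for this example. The possessive `++` matters: without it the engine could shorten the URL by one character to get past the negative lookahead.

```php
<?php
// Sketch only: (?<!">) looks at the two characters right before the URL,
// (?!\s*</a>) looks at what follows it. The possessive ++ prevents the
// engine from backtracking into the URL to dodge the lookahead.
$regex = '~(?<!">)https?://[^\s<]++(?!\s*</a>)~i';

// The four sample lines from the question.
$lines = [
    'adasdas http://www.stackoverflow.com asdasas',
    'adasdasahttp://www.stackoverflow.com/something asdas',
    'adasdas<a href="somelink">           http://www.stackoverflow.com     </a>asdasas',
    'adasdasa<a href="somelink">http://www.stackoverflow.com/something</a> asdas',
];

// Collect the matched URL for each line, or null when nothing matches.
$matches = [];
foreach ($lines as $line) {
    $matches[] = preg_match($regex, $line, $m) ? $m[0] : null;
}

var_export($matches);
```

This handles the four sample lines, but it is more fragile than it looks: the lookbehind only inspects the characters immediately before the URL, so a link that sits inside an anchor with other text around it would still be matched.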

answered Sep 19 '22 04:09

Valerij

Tags:

regex

php

hyperlink