php: Remove URL from string

Tags: string, regex, php

I have many strings (Twitter tweets) from which I would like to remove the links when I echo them.

I have no control over the strings. All the links start with http, but they may or may not end with a "/" or a ";", and may or may not be followed by a space. Also, sometimes there is no space between the link and the word just before it.

One example of such a string:

The Third Culture: The Frontline of Global Thinkinghttp://is.gd/qFioda;via @edge

I have tried playing around with preg_replace, but couldn't come up with a solution that fits all the exceptions:

<?php echo preg_replace("/\http[^)]+\;/","",$feed->itemTitle); ?>

Any idea how I should proceed?

Edit: I have tried

<?php echo preg_replace('@(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@', ' ', $feed->itemTitle); ?>

but still no success.

Edit 2: I found this one:

<?php echo preg_replace('^(ht|f)tp(s?)\:\/\/[0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*(:(0-9)*)*(\/?)([a-zA-Z0-9\-\.\?\,\'\/\\\+&%\$#_]*)?$^',' ', $feed->itemTitle); ?>

which removes the link as expected, but it also deletes the entire string when there is no space between the link and the word that precedes it.

asked Jul 05 '14 16:07 by MagTun



2 Answers

I would do something like this:

$input = "The Third Culture: The Frontline of Global Thinkinghttp://is.gd/qFioda;via @edge";
$replace = '"(https?://.*)(?=;)"';

$output = preg_replace($replace, '', $input);
print_r($output);

It works for multiple occurrences too:

$output = preg_replace($replace, '', $input."\n".$input);
print_r($output);
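
One caveat with the pattern above: `.*` is greedy, so if a line has more than one ";" after the link, the match runs up to the last one and takes the text in between with it. Below is a minimal sketch of a variant that stops at the first whitespace character or ";" instead. It assumes the links themselves never contain whitespace or ";", which the question does not guarantee, and the function name `strip_links` is hypothetical.

```php
<?php
// Hypothetical helper: removes http/https links from a tweet.
// Assumption: a link ends at the first whitespace character or ';'.
function strip_links(string $text): string
{
    // [^\s;]+ consumes the URL body and stops before whitespace or ';'.
    // preg_replace returns null on a regex error, so fall back to the input.
    return preg_replace('~https?://[^\s;]+~i', '', $text) ?? $text;
}

$input = "The Third Culture: The Frontline of Global Thinkinghttp://is.gd/qFioda;via @edge";
echo strip_links($input), "\n";
```

Because the character class does the stopping, no lookahead is needed, and links that end the string or are followed by a space are removed as well.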
answered Sep 23 '22 22:09 by jamb


If you want to remove everything from the link onward, including the text after it (such as the "via" part in your example), the following may help:

$string = "The Third Culture: The Frontline of Global Thinkinghttp://is.gd/qFioda;via @edge";
$regex = "@(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?).*$)@";
echo preg_replace($regex, ' ', $string);

If you want to keep the text after the link:

$string = "The Third Culture: The Frontline of Global Thinkinghttp://is.gd/qFioda;via @edge";
$regex = "@(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@";
echo preg_replace($regex, ' ', $string);
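
In the second pattern, the trailing `[^\.\s]` also consumes the ";" right after the link, and each link is replaced by a space, so a link that already had spaces around it leaves a run of spaces behind. A minimal sketch of a wrapper that tidies this up follows. The function name `remove_links_tidy` is hypothetical, and collapsing all whitespace runs into single spaces is an assumption about the desired output.

```php
<?php
// Hypothetical helper built around the "keep the text" pattern above.
function remove_links_tidy(string $text): string
{
    $regex = '@(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)@';
    // Replace each link with one space so the adjacent words stay separated.
    $text = preg_replace($regex, ' ', $text) ?? $text;
    // Collapse the whitespace runs left behind, then trim both ends.
    return trim(preg_replace('/\s+/', ' ', $text) ?? $text);
}

echo remove_links_tidy("The Third Culture: The Frontline of Global Thinkinghttp://is.gd/qFioda;via @edge"), "\n";
```

Note that the whitespace pass also flattens newlines, so it suits single-line titles better than multi-line text.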
answered Sep 22 '22 22:09 by Burak