
Remove double space and space after line break from String

Tags:

regex

php

First, I have this input:

$string = "Lorem ipsum 
dolor sit amet, consectetur adipiscing 
elit https://www.youtube.com/watch?v=example sed do eiusmod tempor incididunt https://www.youtube.com/watch?v=example2 https://www.youtube.com/watch?v=example3";

Then I want to remove the URLs from $string using a regex:

$string = preg_replace('/[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)/', '', $string);

After removing all of the URLs from the string, the output is:

Lorem ipsum 
dolor sit amet, consectetur adipiscing 
 elit  sed do eiusmod tempor incididunt  

The problem is that double spaces are left behind, and I want the text to look neater.

I've tried this, which replaces every run of spaces with a single space:

$string = preg_replace('/\x20+/', ' ', $string);

That leads to another problem: there is still a space after the line break.

Lorem ipsum 
dolor sit amet, consectetur adipiscing 
 elit sed do eiusmod tempor incididunt

That still bothers me.

I need a solution that gets rid of the URLs and also keeps the text neat. The final result I want looks like this:

Lorem ipsum 
dolor sit amet, consectetur adipiscing
elit sed do eiusmod tempor incididunt

Sorry if this looks weird, thanks.

Tunku Salim asked Oct 15 '22 00:10



2 Answers

Use preg_replace() to remove all the URLs.

Use trim() to remove any leftover spaces at the start and end of the string.

Use preg_replace() again to collapse any double spaces into a single space.

Finally, to remove any space left at the beginning of a line, replace it with an empty string.

<?php

    $r = '/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i';
    $string = "Lorem ipsum
    dolor sit amet, consectetur adipiscing
    elit https://www.youtube.com/watch?v=example sed do eiusmod tempor incididunt https://www.youtube.com/watch?v=example2 https://www.youtube.com/watch?v=example3";

    // Remove URLs (each one is replaced with a single space)
    $clean = preg_replace($r, ' ', $string);

    // Trim whitespace from the start and end of the string
    $clean = trim($clean);

    // Replace runs of horizontal whitespace with a single space
    $clean = preg_replace( '/\h+/', ' ', $clean);

    // Remove the single space left at the start of any line
    $clean = preg_replace('/^ /m', '', $clean);

    // Show result
    echo $clean;

Output:

Lorem ipsum 
dolor sit amet, consectetur adipiscing 
elit sed do eiusmod tempor incididunt



Note: This could be simplified a lot by combining some of the calls; I chose not to, so that the steps are clearer.
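As a rough sketch of what combining the calls might look like: preg_replace() accepts arrays of patterns and replacements and applies them in order, so the steps can be folded into one call. The function name removeUrlsAndTidy is hypothetical, the URL pattern is the one from this answer, and the sketch assumes UTF-8 input with "\n" line breaks (the u modifier keeps \h from matching stray bytes inside multibyte characters).

```php
<?php

// Hypothetical helper (not part of the original answer): removes URLs and
// tidies the leftover whitespace in a single preg_replace() call.
function removeUrlsAndTidy(string $text): string
{
    $patterns = [
        // 1. URLs, using the pattern from the answer above
        '/\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|$!:,.;]*[A-Z0-9+&@#\/%=~_|$]/i',
        // 2. Runs of horizontal whitespace
        '/\h+/u',
        // 3. Whitespace at the start or end of each line (assumes "\n" line breaks)
        '/^\h+|\h+$/mu',
    ];
    $replacements = [' ', ' ', ''];

    // Patterns are applied in array order, each to the result of the previous one.
    // preg_replace() returns null on error (e.g. invalid UTF-8); fall back to the input.
    return preg_replace($patterns, $replacements, $text) ?? $text;
}

$string = "Lorem ipsum \ndolor sit amet, consectetur adipiscing \nelit https://www.youtube.com/watch?v=example sed do eiusmod tempor incididunt https://www.youtube.com/watch?v=example2 https://www.youtube.com/watch?v=example3";

echo removeUrlsAndTidy($string);
```

Unlike the step-by-step version, this variant also strips trailing spaces on every line, so trim() is no longer needed.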

0stone0 answered Oct 18 '22 13:10



I would use these regexes:

$string = "Lorem ipsum 
dolor sit amet, consectetur adipiscing 
elit https://www.youtube.com/watch?v=example sed do eiusmod tempor incididunt https://www.youtube.com/watch?v=example2 https://www.youtube.com/watch?v=example3";

$string = preg_replace('/[(http(s)?):\/\/(www\.)?a-zA-Z0-9@:%._\+~#=]{2,256}\.[a-z]{2,6}\b([-a-zA-Z0-9@:%_\+.~#?&\/\/=]*)([ ]*)?/', '', $string);
$string = preg_replace('/(([ ]*)?(\r\n|\n)([ ]*)?)/', "\r\n", $string); # Remove any spaces before and after each line break (note: every line break is rewritten as "\r\n")

echo $string;

Output:

Lorem ipsum
dolor sit amet, consectetur adipiscing
elit sed do eiusmod tempor incididunt 

Note: I just added ([ ]*)? to the regex that matches URLs, so that it also matches any spaces after each URL.
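One thing to be aware of: the second pattern above rewrites every line break as "\r\n", and a space can still remain at the very end of the string. If that matters, a possible variant (only a sketch, with the hypothetical name trimAroundLineBreaks and assuming UTF-8 input) captures the line break with \R and puts the same sequence back:

```php
<?php

// Hypothetical helper (not part of the original answer): strips horizontal
// whitespace around each line break while keeping whichever line-break
// sequence the input used.
function trimAroundLineBreaks(string $text): string
{
    // \h matches horizontal whitespace; \R matches any line break and treats
    // "\r\n" as one unit. $1 re-inserts the captured line break unchanged.
    // preg_replace() returns null on error (e.g. invalid UTF-8); fall back to the input.
    $text = preg_replace('/\h*(\R)\h*/u', '$1', $text) ?? $text;

    // Remove whitespace left at the very start or end of the whole string.
    return trim($text);
}

echo trimAroundLineBreaks("Lorem ipsum \n dolor sit amet \r\n elit sed do ");
```

This would be run after the URL-removing preg_replace() call, in place of the second call.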

magrigry answered Oct 18 '22 15:10
