Removing long words regex

Tags: regex, php
I would like to know how I can remove long words from a string, that is, words longer than a given length n.

I tried the following:

//remove words which have more than 5 characters from string
$s = 'abba bbbbbbbbbbbb 1234567 zxcee ytytytytytytytyt zczc xyz';
echo preg_replace("~\s(.{5,})\s~isU", " ", $s);

This gives the following output, which is incorrect:

abba 1234567 ytytytytytytytyt zczc xyz
Imran Omar Bukhsh asked Feb 24 '23 13:02


2 Answers

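Use this regex: \b\w{5,}\b. It matches words of 5 or more characters; pass it to preg_replace with an empty replacement to remove them.

  1. \b - word boundary
  2. \w{5,} - 5 or more word characters (letters, digits, or underscore)
  3. \b - word boundary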
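
As a minimal sketch of how this pattern could be applied to the string from the question: the helper name removeLongWords and its $maxLength parameter are hypothetical and not part of the answer. Because {N,} means "N or more", the sketch uses $maxLength + 1 so that only words strictly longer than the limit are removed, and it collapses the spaces left behind.

```php
<?php
// Hypothetical helper (not from the original answer): removes every word
// longer than $maxLength characters, then tidies the leftover whitespace.
function removeLongWords(string $text, int $maxLength): string
{
    // {N,} means "N or more", so "longer than $maxLength" is $maxLength + 1.
    $pattern = '/\b\w{' . ($maxLength + 1) . ',}\b/';
    $withoutLongWords = preg_replace($pattern, '', $text);

    // Collapse the runs of spaces left behind by the removed words.
    return trim(preg_replace('/\s{2,}/', ' ', $withoutLongWords));
}

$s = 'abba bbbbbbbbbbbb 1234567 zxcee ytytytytytytytyt zczc xyz';
echo removeLongWords($s, 5), "\n"; // abba zxcee zczc xyz
echo removeLongWords($s, 4), "\n"; // abba zczc xyz
```

With a limit of 5 the five-letter word zxcee survives; with a limit of 4 the pattern becomes \w{5,}, the same as the regex above.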
Kirill Polishchuk answered Feb 26 '23 02:02


<?php
// remove words of 5 or more characters from the string ({5,} means "5 or more")
$s = 'abba bbbbbbbbbbbb 1234567 zxcee ytytytytytytytyt zczc xyz';

$patterns = array(
    'long_words' => '/[^\s]{5,}/',
    'multiple_spaces' => '/\s{2,}/'
);

$replacements = array(
    'long_words' => '',
    'multiple_spaces' => ' '
);
echo trim(preg_replace($patterns, $replacements, $s));
?>

Output:

abba zczc xyz
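
The pattern [^\s]{5,} (equivalent to \S{5,}) treats any run of non-whitespace characters as a word, which can differ from a \w-based pattern when the text contains punctuation. The following is only an illustrative comparison on a made-up input string, not something taken from either answer:

```php
<?php
// Made-up input, chosen to show the difference on hyphenated text.
$s = 'a state-of-the-art fix';

// \w stops at the hyphens, so only the 5-letter run "state" is removed.
$byWordChars = preg_replace('/\b\w{5,}\b/', '', $s);

// \S (same as [^\s]) treats the whole hyphenated token as one word.
$byNonSpace = preg_replace('/\S{5,}/', '', $s);

var_dump($byWordChars); // string(17) "a -of-the-art fix"
var_dump($byNonSpace);  // string(6) "a  fix"
```

Which behaviour is preferable depends on whether punctuation-joined tokens should count as a single word.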

Update: to address the issue you raised in the comments, you can do it like this:

<?php
// remove words of 5 or more characters from the string ({5,} means "5 or more")
$s = '123&nbsp;ReallyLongStringComesHere&nbsp;123';

$patterns = array(
    'html_space' => '/&nbsp;/',
    'long_words' => '/[^\s]{5,}/',
    'multiple_spaces' => '/\s{2,}/'
);

$replacements = array(
    'html_space' => ' ',
    'long_words' => '',
    'multiple_spaces' => ' '
);
echo str_replace(' ', '&nbsp;', trim(preg_replace($patterns, $replacements, $s)));
?>

Output:

123&nbsp;123
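
One detail worth spelling out: when preg_replace receives arrays, the patterns are applied one after another in the order they appear in the array, and the string keys act only as labels. The sketch below, using the same input as above, suggests why the &nbsp; pattern has to come first; the variable names are illustrative.

```php
<?php
$s = '123&nbsp;ReallyLongStringComesHere&nbsp;123';

// Entity converted first: the long word is isolated by real spaces.
$entityFirst = preg_replace(
    array('/&nbsp;/', '/[^\s]{5,}/', '/\s{2,}/'),
    array(' ', '', ' '),
    $s
);

// Long-word pattern first: the input contains no whitespace yet, so the
// whole string counts as one long "word" and is removed.
$longWordsFirst = preg_replace(
    array('/[^\s]{5,}/', '/&nbsp;/', '/\s{2,}/'),
    array('', ' ', ' '),
    $s
);

var_dump($entityFirst);    // string(7) "123 123"
var_dump($longWordsFirst); // string(0) ""
```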
Shef answered Feb 26 '23 03:02