This question has effective solutions for identifying long words: Regex to parse long words
How would I then truncate them at a set length and append "..."?
Basically, I want to apply a preg_replace on a long string and truncate any very long words (not truncate the entire string -- just the long words).
The regex flavor should be PHP.
edit: This works for me
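$pattern = "/(?<=(\s\w{10}))(\w+\s)/";
This pattern matches one or more word characters followed by a whitespace character, provided they are preceded by a whitespace character and exactly 10 word characters. (With \w* instead of \w+, a word of exactly 10 characters would also get "..." appended even though nothing was cut.)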
Then just call something like this:
preg_replace($pattern,"... ",$string);
Hope that helps :)
edited: This should actually use \s instead of a literal space, since \s matches any whitespace character.
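For anyone who wants to try the lookbehind approach end to end, here is a minimal sketch. The sample sentence is made up, and the pattern uses \w+ so that a word of exactly 10 characters is left untouched. Note two limitations of this approach: a long word at the very start of the string (no whitespace before it) or at the very end (no whitespace after it) is not matched, and whatever whitespace followed the word is replaced by a single space.

```php
<?php
// Keep the first 10 characters of any long word that sits between whitespace.
// The lookbehind (?<=\s\w{10}) is fixed-length, which PCRE requires.
$pattern = '/(?<=\s\w{10})\w+\s/';

// Made-up sample input, only for illustration.
$string = "a short word and averyveryverylongword here";

// The matched tail of the word (plus its trailing whitespace) becomes "... ".
$result = preg_replace($pattern, "... ", $string);

echo $result, "\n"; // a short word and averyveryv... here
```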
I think this regular expression does the trick. I tested it with PHP 5.3.6 and it worked fine.
$pattern = "/(\\b\\w{10})\\w+\\b/";
echo preg_replace($pattern, "$1...", "pequeno palavramedia palavrabemgrandemesmo\n");
Replace {10} with the number of characters to keep before the dots. If you want truncated words to be at most N characters long, use {N-3}, because the three dots count toward the length.
It should run fine on large strings: the pattern describes a regular language, so the running time should be O(n). Of course, that depends on the regex implementation.
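One thing to watch with {N-3}: a word of N-2 to N characters also gets cut, even though it already fits. A possible way around that, sketched below, is to require at least four more word characters after the kept prefix, so only words longer than N are touched. The function name truncate_long_words and the default of 13 are my own choices for illustration, not something from the answer above.

```php
<?php
// Hypothetical helper: shorten every word longer than $max characters
// so that the result, dots included, is exactly $max characters long.
function truncate_long_words($text, $max = 13)
{
    if ($max <= 3) {
        throw new InvalidArgumentException('$max must be greater than 3');
    }
    $keep = $max - 3; // characters kept in front of the "..."
    // \w{4,} demands at least four more word characters after the kept
    // prefix, so words of $max characters or fewer never match.
    $pattern = '/\b(\w{' . $keep . '})\w{4,}\b/';
    return preg_replace($pattern, '$1...', $text);
}

echo truncate_long_words("pequeno palavramedia palavrabemgrandemesmo", 13), "\n";
// pequeno palavramedia palavrabem...
```

Here "palavramedia" (12 characters) is left alone because it already fits within 13, while the 21-character word is cut to 10 characters plus the dots.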