For example, I have an article that should be split at sentence boundaries such as ".", "?", "!" and ":".
But as we all know, both preg_split and explode remove the delimiter.
Any help would be really appreciated!
EDIT:
I could only come up with the code below; it works well, though.
$content=preg_replace('/([\.\?\!\:])/',"\\1[D]",$content);
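The marked string can then be split on the placeholder to get the sentences with their delimiters intact. A minimal sketch, assuming the marker [D] never occurs in the text itself; the sample string and the variable names $marked and $sentences are made up for illustration:

```php
<?php
// Sample text, made up for illustration.
$content = "Is it late? Yes! Note: we leave now.";

// Append the placeholder "[D]" after every sentence delimiter.
$marked = preg_replace('/([.?!:])/', '$1[D]', $content);

// Split on the placeholder; each delimiter stays with its sentence.
$sentences = explode('[D]', $marked);

// A text that ends in a delimiter leaves one empty trailing element; drop it.
$sentences = array_values(array_filter($sentences, 'strlen'));

print_r($sentences);
```

Each piece after the first keeps its leading space; apply trim() to the pieces if that matters.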
Thank you, everyone! Three answers in only five minutes! And I must apologize for not reading the PHP manual carefully before asking. Sorry.
Both split() and explode() split a string into an array; the difference is that split() matched a regular-expression pattern whereas explode() uses a literal string. explode() is faster because it does no regular-expression matching. Note that split() was deprecated in PHP 5.3 and removed in PHP 7.0; preg_split() is the regex-based replacement.
The explode() function splits a string on a string delimiter, i.e. it splits the string wherever the delimiter occurs. It returns an array containing the strings formed by splitting the original string.
The explode() function breaks a string into an array. Note: The "separator" parameter cannot be an empty string. Note: This function is binary-safe.
To split a string into words in PHP, use explode() function with space as delimiter. The explode() function returns an array containing words as elements of the array.
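A short sketch of the behaviour described above, showing that explode() discards the delimiter it splits on. The sample strings and the variable names $words and $pieces are made up for illustration:

```php
<?php
// Splitting on a space: the spaces are not part of the result.
$words = explode(' ', 'keep the delimiter');
print_r($words);

// Splitting on a period: the periods are dropped as well, and a string
// that ends in the delimiter produces an empty last element.
$pieces = explode('.', 'One. Two.');
print_r($pieces);
```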
I feel this is worth adding. You can keep the delimiter in the "after" string by using regex lookahead to split:
$input = "The address is http://stackoverflow.com/";
$parts = preg_split('@(?=http://)@', $input);
// $parts[1] is "http://stackoverflow.com/"
And if the delimiter is of fixed length, you can keep the delimiter in the "before" part by using lookbehind:
$input = "The address is http://stackoverflow.com/";
$parts = preg_split('@(?<=http://)@', $input);
// $parts[0] is "The address is http://"
This solution is simpler and cleaner in most cases.
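Applied to the sentence delimiters from the question, the lookbehind approach might look like the sketch below. The sample text and variable names are made up for illustration, and the trimming step is an optional extra, not part of the original answer:

```php
<?php
// Sample text, made up for illustration.
$text = "Is it late? Yes! Note: we leave now.";

// Split at the zero-width position right after ".", "?", "!" or ":".
// PREG_SPLIT_NO_EMPTY drops any empty piece, such as one after a final delimiter.
$sentences = preg_split('/(?<=[.?!:])/', $text, -1, PREG_SPLIT_NO_EMPTY);

// Remove the leading space that each later sentence carries.
$sentences = array_map('trim', $sentences);

print_r($sentences);
```

Because every delimiter in the character class is a single character, the fixed-length requirement for lookbehind is met.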
You can set the flag PREG_SPLIT_DELIM_CAPTURE when using preg_split to capture the delimiters too. Then you can take each pair of elements at indexes 2n and 2n+1 and put them back together:
$parts = preg_split('/([.?!:])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
$sentences = [];
for ($i = 0, $n = count($parts) - 1; $i <= $n; $i += 2) {
    $sentences[] = $parts[$i] . ($parts[$i+1] ?? '');
}
Note that the delimiter pattern must be wrapped in a capturing group; otherwise the delimiters won't be captured.
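For reuse, the pairing loop can be wrapped in a function. This is only a sketch: the name splitKeepingDelimiters is hypothetical, and trimming the pieces and skipping empty ones are additions that the original answer does not make:

```php
<?php
// Hypothetical helper; wraps the PREG_SPLIT_DELIM_CAPTURE pairing loop.
function splitKeepingDelimiters(string $str): array
{
    // Even indexes hold the text, odd indexes hold the captured delimiters.
    $parts = preg_split('/([.?!:])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    $sentences = [];
    for ($i = 0, $n = count($parts); $i < $n; $i += 2) {
        // Re-attach the delimiter (if any) and strip surrounding whitespace.
        $sentence = trim($parts[$i] . ($parts[$i + 1] ?? ''));
        if ($sentence !== '') {
            $sentences[] = $sentence; // skip the empty piece after a final delimiter
        }
    }
    return $sentences;
}

$result = splitKeepingDelimiters("Is it late? Yes! Note: we leave now.");
print_r($result);
```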