
Is there a way to keep the delimiter while using PHP explode or other similar functions?

For example, I have an article that should be split at sentence boundaries such as ".", "?", "!" and ":".

But as we all know, both preg_split and explode remove the delimiter.

Any help would be really appreciated!

EDIT:

I could only come up with the code below, though it works great.

$content = preg_replace('/([.?!:])/', "\\1[D]", $content);
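As a minimal sketch of how the marker might then be used: assuming the text never already contains the literal string "[D]", splitting on the marker with explode() leaves each piece with its punctuation attached. The follow-up explode() step, the variable names, and the sample text are assumptions for illustration, not part of the original post.

```php
<?php
// Sample text (illustrative only).
$content = "Hello world. How are you? Fine!";

// Step from the question: tag every delimiter with a "[D]" marker.
$marked = preg_replace('/([.?!:])/', "\\1[D]", $content);

// Assumed follow-up: split on the marker; the delimiters stay attached.
$pieces = explode('[D]', $marked);

// Trim surrounding spaces and drop the empty piece after the last marker.
$sentences = array_values(array_filter(array_map('trim', $pieces), 'strlen'));

print_r($sentences);
```

The marker approach depends on "[D]" never occurring in the input, which is one reason the regex-only answers may be preferable.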

Thank you, everyone! It took only five minutes to get 3 answers! And I must apologize for not reading the PHP manual carefully before asking the question. Sorry.

user353889 asked May 30 '10 09:05




2 Answers

I feel this is worth adding. You can keep the delimiter in the "after" string by using regex lookahead to split:

$input = "The address is http://stackoverflow.com/";
$parts = preg_split('@(?=http://)@', $input);
// $parts[1] is "http://stackoverflow.com/"

And if the delimiter is of fixed length, you can keep the delimiter in the "before" part by using lookbehind:

$input = "The address is http://stackoverflow.com/";
$parts = preg_split('@(?<=http://)@', $input);
// $parts[0] is "The address is http://"

This solution is simpler and cleaner in most cases.
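Applied to the sentence delimiters from the question, the lookbehind variant might look like the sketch below. The character class `[.?!:]`, the variable names, and the sample text are assumptions for illustration. Each delimiter is a single character, so the fixed-length requirement is met.

```php
<?php
// Sample text (illustrative only).
$text = "Wait: is this right? Yes! It is.";

// Split at the zero-width position right after each delimiter,
// so the delimiter stays at the end of the preceding piece.
// PREG_SPLIT_NO_EMPTY discards any empty piece, such as one after the final ".".
$parts = preg_split('/(?<=[.?!:])/', $text, -1, PREG_SPLIT_NO_EMPTY);

// Remove the space left at the start of each following piece.
$sentences = array_map('trim', $parts);

print_r($sentences);
```

Because the pattern matches a position rather than characters, nothing is consumed by the split and no regrouping step is needed afterwards.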

wavemode answered Sep 28 '22 02:09


You can set the flag PREG_SPLIT_DELIM_CAPTURE when using preg_split and capture the delimiters too. Then you can take each pair of elements 2n and 2n+1 and put them back together:

$parts = preg_split('/([.?!:])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);
$sentences = [];
for ($i = 0, $n = count($parts) - 1; $i <= $n; $i += 2) {
    $sentences[] = $parts[$i] . ($parts[$i+1] ?? '');
}

Note that the delimiter pattern must be wrapped in a capturing group; otherwise the delimiters won't be captured.
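For illustration, the pairing loop can be wrapped in a small helper and run on sample text. The function name `splitKeepingDelimiters` and the input are hypothetical. The sketch also trims each piece and skips the empty one that preg_split returns when the text ends with a delimiter, which goes slightly beyond the loop in this answer.

```php
<?php
// Hypothetical helper: split $str after each delimiter and keep the delimiter.
function splitKeepingDelimiters(string $str): array
{
    // The capturing group makes preg_split return the delimiters as well:
    // even indexes hold text, odd indexes hold the delimiter that followed it.
    $parts = preg_split('/([.?!:])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE);

    $sentences = [];
    for ($i = 0, $n = count($parts); $i < $n; $i += 2) {
        // Rejoin each text piece with its delimiter (absent after the last piece).
        $sentence = trim($parts[$i] . ($parts[$i + 1] ?? ''));
        if ($sentence !== '') {
            $sentences[] = $sentence;
        }
    }
    return $sentences;
}

print_r(splitKeepingDelimiters("Note: it works. Does it? Yes!"));
```

Unlike the lookbehind approach, this one also works when the delimiter pattern has variable length, since the delimiter is captured rather than asserted.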

Gumbo answered Sep 28 '22 02:09