
How do I split a string into an array of words and symbols?

Tags: regex, php, split

I want to split a sentence into words and punctuation marks, for example:

This is a test from php, python, asp and also from other languages. Alash! i cannot get my output as followings.  

Desired result:

array(  
[0]=>"This",  
[1]=>"is",  
[2]=>"a",  
[3]=>"test",  
[4]=>"from",  
[5]=>"php",  
[6]=>",",  
[7]=>"python",  
[8]=>",",  
[9]=>"asp",  
[10]=>"and",  
[11]=>"also",  
[12]=>"from",  
[13]=>"other",  
[14]=>"languages",  
[15]=>".",  
[16]=>"Alash",  
[17]=>"!",  
[18]=>"i",  
[19]=>"cannot",  
[20]=>"get",  
...  
)  

What are my options for doing this in PHP?

KoolKabin asked Jun 22 '11 06:06


2 Answers

Try something like:

preg_split('/\s+|\b/', $string)
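
As written, this call can also return empty strings, because `\b` matches at positions where there is no text between it and the previous split point (for instance at the very start of the string). Below is a minimal sketch, assuming PHP 7 or later, that passes the `PREG_SPLIT_NO_EMPTY` flag to drop those pieces. The function name `splitWordsAndSymbols` is made up for illustration.

```php
<?php
// Hypothetical helper; the name is illustrative and not part of the answer above.
function splitWordsAndSymbols(string $string): array
{
    // Split on runs of whitespace or at word boundaries.
    // -1 means "no limit"; PREG_SPLIT_NO_EMPTY discards empty pieces.
    return preg_split('/\s+|\b/', $string, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(splitWordsAndSymbols('from php, python. Alash! i'));
```

Two caveats with this pattern: adjacent punctuation such as "?!" stays together as one element, and an apostrophe splits a word like "don't" into three pieces.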
Qtax answered Sep 30 '22 05:09


That's a tough one, because you want to keep the "," as well. Here is one way to do it:

$string = "I beg to differ, you can get it as the previous.";
$words = preg_split('/\s+|(?<=[,\.!\?])|(?=[,\.!\?])/',$string);

Note: the character classes inside (?<=...) and (?=...) must list every character that you want returned as its own element, even when there is no space before and/or after it.
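
The note above can be sketched as a small helper that builds both character classes from a single list of symbols, so the two lists cannot drift apart. This is only a sketch, assuming PHP 7 or later: the function name `splitKeepingSymbols` and its `$symbols` parameter are hypothetical, and the `PREG_SPLIT_NO_EMPTY` flag is an addition that drops any empty strings the zero-width matches may leave behind, for example when the input ends with punctuation.

```php
<?php
// Hypothetical helper; $symbols lists the characters to return as separate elements.
function splitKeepingSymbols(string $string, string $symbols = ',.!?'): array
{
    // Escape the symbols so they are safe to place inside a character class.
    $class = '[' . preg_quote($symbols, '/') . ']';

    // Split on whitespace, or at the zero-width position after or before a symbol.
    $pattern = '/\s+|(?<=' . $class . ')|(?=' . $class . ')/';

    // -1 means "no limit"; PREG_SPLIT_NO_EMPTY discards empty pieces.
    return preg_split($pattern, $string, -1, PREG_SPLIT_NO_EMPTY);
}

print_r(splitKeepingSymbols('I beg to differ, you can get it as the previous.'));
```

To treat more characters as separate elements, pass them in the second argument, for example `splitKeepingSymbols($string, ',.!?;:')`.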

SteeveDroz answered Sep 30 '22 03:09