 

Split strings in Java by words

Tags:

java

How can I split the following string into an array of words?

That's the code

into

array
0 That
1 s
2 the
3 code

I tried something like this:

String str = "That's the code";
String[] strs = str.split("\\'");
for (String sstr : strs) {
    System.out.println(sstr);
}

But the output is:

That
s the code
Asked by user2095165 on Dec 22 '13 at 09:12



3 Answers

To specifically split on white space and the apostrophe:

public class Split {
    public static void main(String[] args) {
        // The character class matches a single whitespace character or an apostrophe
        String[] tokens = "That's the code".split("[\\s']");
        for (String s : tokens) {
            System.out.println(s);
        }
    }
}

Or, to split on any non-word character:

public class Split {
    public static void main(String[] args) {
        // \W matches any character other than a letter, a digit, or an underscore
        String[] tokens = "That's the code".split("[\\W]");
        for (String s : tokens) {
            System.out.println(s);
        }
    }
}
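
One caveat with both patterns above: each matches exactly one delimiter character, so two adjacent delimiters (a double space, or a comma followed by a space) leave an empty string in the result. A minimal sketch of the difference, assuming you want runs of delimiters treated as one separator; the class and the splitWords helper are hypothetical names used only for illustration:

```java
import java.util.Arrays;

public class SplitOnRuns {
    // Hypothetical helper: the + quantifier treats a run of whitespace
    // characters or apostrophes as a single separator.
    static String[] splitWords(String text) {
        return text.split("[\\s']+");
    }

    public static void main(String[] args) {
        String input = "That's  the code"; // note the two spaces after "s"

        // Single-character delimiter: an empty token appears between the two spaces.
        System.out.println(Arrays.toString(input.split("[\\s']")));
        // prints [That, s, , the, code]

        // Run of delimiters: no empty token.
        System.out.println(Arrays.toString(splitWords(input)));
        // prints [That, s, the, code]
    }
}
```

For the exact input in the question, both variants give the same four tokens; the quantifier only matters once delimiters can sit next to each other.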
Answered by Kevin Bowersox on Oct 21 '22 at 15:10


The best solution I've found for splitting by words when your string contains accented letters is:

String[] listeMots = phrase.split("\\P{L}+");

For instance, if your String is:

String phrase = "Salut mon homme, comment ça va aujourd'hui? Ce sera Noël puis Pâques bientôt.";

Then you will get the following words (enclosed in quotes and comma-separated for clarity):

"Salut", "mon", "homme", "comment", "ça", "va", "aujourd", "hui", "Ce", 
"sera", "Noël", "puis", "Pâques", "bientôt".

Hope this helps!
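
One thing to watch for with this pattern: if the text begins with a non-letter (an opening quote, for example), split() returns an empty string as the first element, while trailing empty strings are dropped. A minimal sketch of one way to handle that, assuming empty tokens should simply be discarded; the class and the words helper are hypothetical names:

```java
import java.util.Arrays;

public class LetterSplit {
    // Hypothetical helper: splits on runs of non-letters (\P{L}+)
    // and filters out any empty tokens.
    static String[] words(String phrase) {
        return Arrays.stream(phrase.split("\\P{L}+"))
                .filter(w -> !w.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        // The accented e is written as a Unicode escape so that the file
        // compiles regardless of the source encoding.
        String phrase = "\"No\u00ebl\", aujourd'hui!";

        // Raw split: four elements, the first one empty because the
        // phrase starts with a double quote.
        System.out.println(phrase.split("\\P{L}+").length);

        // Filtered: three words, with the accented letter kept inside its word.
        System.out.println(Arrays.toString(words(phrase)));
    }
}
```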

Answered by Pierre C on Oct 21 '22 at 13:10


You can split on non-word characters:

String str = "That's the code";
String[] splitted = str.split("[\\W]");

For your input, the output will be:

That
s
the
code
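
Note that \W is ASCII-only by default in java.util.regex, so accented letters count as non-word characters and words containing them get cut apart. A minimal sketch of the difference, assuming the input may contain such letters; the embedded flag (?U) turns on Pattern.UNICODE_CHARACTER_CLASS, and the class and method names are hypothetical:

```java
import java.util.Arrays;

public class WordCharSplit {
    // Default \W: only [a-zA-Z_0-9] count as word characters.
    static String[] asciiSplit(String text) {
        return text.split("\\W");
    }

    // (?U) enables UNICODE_CHARACTER_CLASS, so \W follows Unicode
    // definitions; + collapses runs of separators.
    static String[] unicodeSplit(String text) {
        return text.split("(?U)\\W+");
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file encoding-independent.
        String text = "No\u00ebl est l\u00e0";

        // The accented letters act as separators here: 4 fragments (No, l, est, l).
        System.out.println(asciiSplit(text).length);

        // With the Unicode flag, the three words stay intact.
        System.out.println(unicodeSplit(text).length);
    }
}
```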
Answered by Maroun on Oct 21 '22 at 15:10