
How to split a string with any whitespace chars as delimiters

What regex pattern would I need to pass to java.lang.String.split() to split a String into an array of substrings, using all whitespace characters (' ', '\t', '\n', etc.) as delimiters?

mcjabberz asked Oct 22 '08 11:10



2 Answers

Something along the lines of

myString.split("\\s+"); 

This treats any run of one or more consecutive whitespace characters as a single delimiter.

So if I have the string:

"Hello[space character][tab character]World" 

This should yield the strings "Hello" and "World", with no empty string for the gap between the [space] and the [tab].
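To make that concrete, here is a minimal sketch of the call on the example string. The class and method names (WhitespaceSplitDemo, splitOnWhitespace) are made up for illustration; only String.split() itself comes from the question.

```java
import java.util.Arrays;

public class WhitespaceSplitDemo {
    // Hypothetical helper: splits on runs of one or more whitespace characters.
    static String[] splitOnWhitespace(String input) {
        return input.split("\\s+");
    }

    public static void main(String[] args) {
        // A space followed by a tab sits between the two words.
        String[] parts = splitOnWhitespace("Hello \tWorld");
        System.out.println(Arrays.toString(parts)); // prints [Hello, World]
    }
}
```

One thing to watch for: if the input starts with whitespace, split() returns an empty string as the first element, so calling trim() on the input first may be worth considering.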

As VonC pointed out, the backslash must be escaped, because the Java compiler processes escape sequences in the string literal first, and only the resulting string is handed to the regex engine. What you want the regex engine to receive is the literal \s, which means you need to write "\\s" in your source code. It can get a bit confusing.

The \\s is equivalent to [ \\t\\n\\x0B\\f\\r] (space, tab, newline, vertical tab, form feed and carriage return).
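If the double backslash still feels murky, a quick sketch like the following may help. It checks that the source literal "\\s+" really holds just three characters, and that splitting with the shorthand gives the same result as splitting with the explicit character class quoted above. The class name, constants and sample string are assumptions made up for this illustration.

```java
import java.util.Arrays;

public class EscapeDemo {
    // In source this is written with two backslashes, but the resulting
    // string holds three characters: backslash, 's', '+'.
    static final String SHORTHAND = "\\s+";
    // The explicit character class from the answer, with '+' appended.
    static final String EXPLICIT = "[ \\t\\n\\x0B\\f\\r]+";

    public static void main(String[] args) {
        System.out.println(SHORTHAND.length()); // prints 3

        // Sample containing space, tab, newline, vertical tab, form feed, carriage return.
        String sample = "a \t\n" + (char) 0x0B + "\f\rb";
        String[] viaShorthand = sample.split(SHORTHAND);
        String[] viaExplicit = sample.split(EXPLICIT);
        System.out.println(Arrays.equals(viaShorthand, viaExplicit)); // prints true
    }
}
```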

Henrik Paul answered Oct 01 '22 13:10


In most regex dialects there is a set of convenient shorthand character classes you can use for this kind of thing. These are good ones to remember:

\w - Matches any word character.

\W - Matches any nonword character.

\s - Matches any white-space character.

\S - Matches anything but white-space characters.

\d - Matches any digit.

\D - Matches anything except digits.
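A rough way to get a feel for these classes in Java is to delete everything each one matches from a sample string and see what is left. The class name, the strip helper and the sample string below are invented for this sketch; replaceAll() is the standard java.lang.String method.

```java
public class ShorthandClassesDemo {
    // Hypothetical helper: removes every character matched by the given regex.
    static String strip(String input, String regex) {
        return input.replaceAll(regex, "");
    }

    public static void main(String[] args) {
        String sample = "a1 b2!";
        System.out.println("[" + strip(sample, "\\w") + "]"); // prints [ !]
        System.out.println("[" + strip(sample, "\\W") + "]"); // prints [a1b2]
        System.out.println("[" + strip(sample, "\\s") + "]"); // prints [a1b2!]
        System.out.println("[" + strip(sample, "\\S") + "]"); // prints [ ]
        System.out.println("[" + strip(sample, "\\d") + "]"); // prints [a b!]
        System.out.println("[" + strip(sample, "\\D") + "]"); // prints [12]
    }
}
```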

A search for "Regex Cheatsheets" should reward you with a whole lot of useful summaries.

glenatron answered Oct 01 '22 13:10