How to split sentence into words separated by multiple spaces?

Tags: scala

The following code:

val sentence = "1 2  3   4".split(" ")

gives me:

Array(1, 2, "", 3, "", "", 4)

but I'd rather have only the words:

Array(1, 2, 3, 4)

How can I split the sentence when the words are separated by multiple spaces?

yalkris asked Jan 22 '13 23:01



3 Answers

Use a regular expression:

scala> "1   2 3".split(" +")
res1: Array[String] = Array(1, 2, 3)

The "+" means "one or more of the previous" (previous being a space).

Better yet, if you want to split on all whitespace:

scala> "1   2 3".split("\\s+")
res2: Array[String] = Array(1, 2, 3)

(Here "\\s" is the regular-expression character class that matches any whitespace character; the java.util.regex.Pattern documentation lists more examples.)
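
One edge case applies to both patterns above: if the input starts with whitespace, split returns an empty first element, because String.split only drops trailing empty strings. A minimal sketch of one way around it, trimming before splitting (the object name SplitWords and the helper words are hypothetical, not part of the answer):

```scala
object SplitWords {
  // Hypothetical helper: trim first, then split on runs of whitespace.
  def words(s: String): Array[String] = {
    val trimmed = s.trim
    // Splitting an empty string yields Array(""), so handle that case explicitly.
    if (trimmed.isEmpty) Array.empty[String] else trimmed.split("\\s+")
  }

  def main(args: Array[String]): Unit = {
    println(" 1   2 3 ".split("\\s+").toList) // List(, 1, 2, 3)
    println(words(" 1   2 3 ").toList)        // List(1, 2, 3)
  }
}
```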

Tim answered Oct 01 '22 17:10


You can filter out the "" from the split Array.

scala> val sentence = "1 2  3   4".split(" ").filterNot(_ == "")
sentence: Array[java.lang.String] = Array(1, 2, 3, 4)
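
The same filtering can also be written with filter(_.nonEmpty). A side effect worth noting is that it discards the empty string produced by leading spaces as well. A small sketch, assuming a hypothetical helper name nonEmptyParts:

```scala
object FilterEmpty {
  // Hypothetical helper: split on single spaces, then drop the empty
  // strings left behind by repeated (or leading) spaces.
  def nonEmptyParts(s: String): Array[String] =
    s.split(" ").filter(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    println(nonEmptyParts(" 1 2  3   4").mkString(",")) // 1,2,3,4
  }
}
```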

Brian answered Oct 01 '22 17:10


The regular expression \\W+ splits on runs of non-word characters and so delivers the (alphanumeric) words, thus

val sentence = "1 2  3   4".split("\\W+")
sentence: Array[String] = Array(1, 2, 3, 4)

For ease of use, in Scala 2.10 and later, consider an implicit value class:

implicit class RichString(val s: String) extends AnyVal {
  def words = s.split("\\W+")
}

Thus,

"1 2  3   4".words
res: Array[String] = Array(1, 2, 3, 4)
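
Note that \\W+ treats every run of non-word characters as a separator, not only whitespace, so punctuation inside a token also splits it. A small comparison on an input chosen purely for illustration (the object and method names are hypothetical):

```scala
object WordVsWhitespace {
  // Split on runs of non-word characters (anything outside [a-zA-Z0-9_]).
  def byNonWord(s: String): List[String] = s.split("\\W+").toList

  // Split on runs of whitespace only.
  def byWhitespace(s: String): List[String] = s.split("\\s+").toList

  def main(args: Array[String]): Unit = {
    val text = "don't stop, 3.14"
    println(byNonWord(text))    // List(don, t, stop, 3, 14)
    println(byWhitespace(text)) // List(don't, stop,, 3.14)
  }
}
```

If the tokens may contain apostrophes, decimal points, or other punctuation, "\\s+" is probably the safer choice.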

elm answered Oct 01 '22 17:10