
Using regex to separate individual words?

I have the following line, which splits a sentence into words on whitespace and stores them in an array:

string[] s = Regex.Split(input, @"\s+");

The problem is that the last word keeps the period that ends the sentence. For example, given the input C# is cool. the code stores:

  1. C#
  2. is
  3. cool.

The question is: how do I keep the period out of the result?

asked Dec 20 '22 21:12 by oneCoderToRuleThemAll


2 Answers

You can use a character class [] that contains the dot . along with any other characters you need to split on. Inside a character class the dot is literal, so it does not need escaping.

string[] s = Regex.Split(input, @"[\s.]+");
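One caveat with the pattern above: when the input ends with a delimiter (here the final period), Regex.Split also returns an empty string after it. The following is a minimal sketch, assuming C# 9 or later for top-level statements and using the sample sentence from the question. The variable names raw and words are illustrative only.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "C# is cool.";

// Split on runs of whitespace and/or periods.
string[] raw = Regex.Split(input, @"[\s.]+");

// The final "." is itself a delimiter match, so Split emits an
// empty string after it. Filter out empty entries.
string[] words = raw.Where(w => w.Length > 0).ToArray();

Console.WriteLine(string.Join("|", words)); // C#|is|cool
```

Filtering with LINQ is one option. Trimming the delimiters from the input before splitting would be another.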


answered Dec 22 '22 10:12 by hwnd


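You can add the dot (and other punctuation marks as needed) to the regular expression. Use a non-capturing group (?:...) for the alternation, because Regex.Split includes any text captured by a capturing group in the resulting array:

string[] s = Regex.Split(input, @"(?:\s|[.;,])+");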
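Regex.Split treats capturing and non-capturing groups differently: text captured by a plain (...) group is added to the output, while (?:...) only groups. The sketch below compares the two on the sample sentence from the question, assuming C# 9 or later for top-level statements. The variable names are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "C# is cool.";

// Capturing group: the captured delimiter text is added to the result,
// so spaces and the period show up as array elements.
string[] capturing = Regex.Split(input, @"(\s|[.;,])+");

// Non-capturing group: delimiters are discarded.
string[] nonCapturing = Regex.Split(input, @"(?:\s|[.;,])+");

Console.WriteLine(string.Join("|", capturing));
Console.WriteLine(string.Join("|", nonCapturing));
```

In both cases the trailing period still yields a final empty string, so filtering out empty entries may be needed.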

answered Dec 22 '22 10:12 by Sergey Kalinichenko