
C# - string to keywords

What is the most efficient way to turn a string into a list of words in C#?

For example:

Hello... world 1, this is amazing3,really ,  amazing! *bla*

should turn into the following list of strings:

["Hello", "world", "1", "this", "is", "amazing3", "really", "amazing", "bla"]

Note that it should support languages other than English.

I need this because I want to collect a list of keywords from a given text.

Thanks.

Alon Gubkin asked Dec 17 '22 22:12


2 Answers

How about using regular expressions? You could make the expression arbitrarily complex, but what I have here should work for most inputs.

new Regex(@"\b\w+\b").Matches(text);
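
As a fuller illustration of the regex approach, here is a minimal sketch that collects the matches into a list. It assumes the standard `System.Text.RegularExpressions.Regex` class; the `KeywordExtractor` class and `ExtractWords` method names are hypothetical, chosen only for this example. In .NET, `\w` matches Unicode letters and digits by default (plus the underscore and other connector punctuation), so this should also handle non-English text.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of any library.
public static class KeywordExtractor
{
    // \w covers Unicode letters, marks, decimal digits, and connector
    // punctuation such as '_' unless RegexOptions.ECMAScript is used.
    private static readonly Regex WordPattern =
        new Regex(@"\b\w+\b", RegexOptions.Compiled);

    public static List<string> ExtractWords(string text)
    {
        var words = new List<string>();
        foreach (Match match in WordPattern.Matches(text))
        {
            words.Add(match.Value);
        }
        return words;
    }
}
```

Calling `KeywordExtractor.ExtractWords` on the sample string from the question should yield the nine words listed there. Note that an underscore counts as a word character here, so `foo_bar` would stay as one word.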
Brian Gideon answered Dec 29 '22 21:12


char[] separators = new char[]{' ', ',', '!', '*', '.'};  // add more if needed

string str = "Hello... world 1, this is amazing3,really ,  amazing! *bla*";
string[] words = str.Split(separators, StringSplitOptions.RemoveEmptyEntries);
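
A fixed separator list has to be extended for every new punctuation mark. As an alternative sketch (not the approach above, but a variation on it), the text can be scanned once and every character that is not a letter or digit treated as a separator, using the standard `char.IsLetterOrDigit` method. The `WordSplitter` class and `SplitWords` method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper for illustration; not part of any library.
public static class WordSplitter
{
    public static List<string> SplitWords(string text)
    {
        var words = new List<string>();
        var current = new StringBuilder();

        foreach (char c in text)
        {
            if (char.IsLetterOrDigit(c))
            {
                // Unicode-aware: accepts letters and digits from any script.
                current.Append(c);
            }
            else if (current.Length > 0)
            {
                // Any other character ends the current word.
                words.Add(current.ToString());
                current.Clear();
            }
        }

        // Flush the last word if the text does not end with a separator.
        if (current.Length > 0)
        {
            words.Add(current.ToString());
        }
        return words;
    }
}
```

This works one UTF-16 `char` at a time, so it is only a sketch: combining marks and characters outside the Basic Multilingual Plane would need extra handling.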
James Curran answered Dec 29 '22 20:12