I am trying to filter a List of strings based on the number of words in each string. I assume you would trim any whitespace at the ends of the string and then count the spaces that remain, so that WordCount = NumberOfSpaces + 1. Is that the most efficient way to do this? I know that the following works fine for filtering by character count; I just can't figure out how to write the word-count version succinctly using C#/LINQ.
if (checkBox_MinMaxChars.Checked)
{
    int minChar = int.Parse(numeric_MinChars.Text);
    int maxChar = int.Parse(numeric_MaxChars.Text);
    myList = myList.Where(x =>
        x.Length >= minChar &&
        x.Length <= maxChar).ToList();
}
Any ideas for counting words?
UPDATE: This worked like a charm. Thanks, Mathew:
int minWords = int.Parse(numeric_MinWords.Text);
int maxWords = int.Parse(numeric_MaxWords.Text);
sortBox1 = sortBox1.Where(x =>
    x.Trim().Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Count() >= minWords &&
    x.Trim().Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Count() <= maxWords).ToList();
I would take a simpler approach, since you have indicated that a space can be used reliably as a delimiter:
var str = " the string to split and count ";
var wordCount = str.Trim().Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Count();
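Because a min/max filter needs the count for both comparisons, one option is to wrap the split in a small helper so each string is split only once. This is only a minimal sketch: `WordFilter`, `CountWords`, and `FilterByWordCount` are hypothetical names that do not appear in the posts above, and it assumes a single space character is the only delimiter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WordFilter
{
    // Hypothetical helper: splits on spaces and drops empty entries, so
    // leading, trailing, and repeated spaces do not inflate the count.
    public static int CountWords(string s)
    {
        return s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    // Hypothetical helper: keeps the strings whose word count lies in
    // the inclusive range [minWords, maxWords], splitting each string once.
    public static List<string> FilterByWordCount(IEnumerable<string> items, int minWords, int maxWords)
    {
        return items
            .Where(s =>
            {
                int count = CountWords(s);
                return count >= minWords && count <= maxWords;
            })
            .ToList();
    }

    public static void Main()
    {
        var input = new List<string> { "one", "two words", "  three  little words ", "" };
        List<string> result = FilterByWordCount(input, 2, 3);
        Console.WriteLine(string.Join("|", result));
    }
}
```

With `RemoveEmptyEntries` the extra `Trim()` call is not needed, since empty entries at either end are discarded anyway.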
EDIT:
If optimal performance is necessary and memory usage is a concern, you could write your own method and leverage IndexOf()
(there are many ways to implement something like this; I just prefer reuse over from-scratch code design):
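public int WordCount(string s) {
    const int DONE = -1;
    var str = s.Trim();
    // An empty or whitespace-only string contains no words
    if (str.Length == 0) return 0;
    // A non-empty trimmed string contains at least one word
    var wordCount = 1;
    var index = str.IndexOf(' ');
    while (index != DONE) {
        // Count a new word only at the end of a run of spaces;
        // str is trimmed, so index + 1 is always a valid position
        if (str[index + 1] != ' ') wordCount++;
        index = str.IndexOf(' ', index + 1);
    }
    return wordCount;
}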
Your approach to counting words is OK. String.Split
will give a similar result at the cost of more memory.
Then just implement your int WordCount(string text)
function and pass it to Where:
myList.Where(s => WordCount(s) > minWordCount)
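As a rough illustration of the trim-and-count-spaces idea from the question, the `WordCount` function could look something like the sketch below. The class name `SpaceCounting` and the sample data are made up for this example, and the method assumes words are separated by exactly one space (repeated spaces would over-count).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SpaceCounting
{
    // Sketch of WordCount = NumberOfSpaces + 1 on the trimmed string.
    // Assumption: exactly one space between words.
    public static int WordCount(string text)
    {
        string trimmed = text.Trim();
        if (trimmed.Length == 0)
        {
            return 0; // no words in an empty or whitespace-only string
        }
        // string implements IEnumerable<char>, so LINQ's Count works here
        return trimmed.Count(c => c == ' ') + 1;
    }

    public static void Main()
    {
        var myList = new List<string> { "a b c", "single", "  padded text  " };
        int minWordCount = 1;
        // Pass the function to Where, as described above
        List<string> result = myList.Where(s => WordCount(s) > minWordCount).ToList();
        Console.WriteLine(result.Count);
    }
}
```

The only allocation per string here is the trimmed copy, whereas `Split` allocates an array plus one string per word.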
You want all strings with word-count in a given range?
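int minCount = 10;
int maxCount = 15;
IEnumerable<string> result = list
    // Split on any whitespace and drop empty entries so repeated spaces are not counted as words
    .Select(s => new { Text = s, Words = s.Split((char[])null, StringSplitOptions.RemoveEmptyEntries) })
    .Where(x => x.Words.Length >= minCount
             && x.Words.Length <= maxCount)
    .Select(x => x.Text);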
How about splitting the string into an array on whitespace and counting the elements?
s.Split().Count()
removed the space :)
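One thing worth checking with the parameterless `Split()` is how it treats leading, trailing, or repeated whitespace: it splits on any whitespace character but keeps the empty entries. The sketch below compares it with the `RemoveEmptyEntries` overload; the class and method names (`SplitComparison`, `NaiveCount`, `CleanCount`) are invented for this illustration.

```csharp
using System;

public static class SplitComparison
{
    // Parameterless Split: whitespace delimiters, empty entries kept.
    public static int NaiveCount(string s)
    {
        return s.Split().Length;
    }

    // A null separator array also means "any whitespace", and
    // RemoveEmptyEntries discards the empty strings between delimiters.
    public static int CleanCount(string s)
    {
        return s.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    public static void Main()
    {
        string sample = " a  b ";
        Console.WriteLine(NaiveCount(sample)); // counts the empty entries too
        Console.WriteLine(CleanCount(sample)); // counts only "a" and "b"
    }
}
```

If the input is known to contain single spaces only and no padding, the two counts agree and the shorter form is fine.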