Question

I am trying to filter a List of strings based on the number of words in each string. My assumption is that you would trim any whitespace from the ends of the string and then count the spaces that remain, so that WordCount = NumberOfSpaces + 1. Is that the most efficient way to do this? For filtering on character count, the following works fine; I just can't figure out how to write the word-count version succinctly using C#/LINQ.

if (checkBox_MinMaxChars.Checked)
{
    int minChar = int.Parse(numeric_MinChars.Text);
    int maxChar = int.Parse(numeric_MaxChars.Text);

    myList = myList.Where(x => 
                              x.Length >= minChar && 
                              x.Length <= maxChar).ToList();
}

Any ideas for counting words?

UPDATE: This worked like a charm. Thanks, Mathew:

int minWords = int.Parse(numeric_MinWords.Text);
int maxWords = int.Parse(numeric_MaxWords.Text);

sortBox1 = sortBox1.Where(x => x.Trim().Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Count() >= minWords &&
                               x.Trim().Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Count() <= maxWords).ToList();

Solution

I would take a simpler approach. Since you have indicated that a space can be used reliably as a delimiter, split on it and count the pieces:

var str = "     the string to split and count        ";
var wordCount = str.Trim().Split(new char[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Count();
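
As a rough sketch of how the split-and-count idea above might plug into a min/max filter like the one in the question, the helper below computes the word count once per string instead of once per comparison. The class and method names (WordFilter, CountWords, FilterByWordCount) are hypothetical and only for illustration. It also assumes, as the question does, that a single space is the only delimiter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class; names are illustrative, not from the original posts.
public static class WordFilter
{
    private static readonly char[] Space = { ' ' };

    // Counts space-separated words. RemoveEmptyEntries drops the empty
    // pieces produced by leading, trailing, or repeated spaces, so no
    // separate Trim() call is needed.
    public static int CountWords(string text)
    {
        return text.Split(Space, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    // Keeps the strings whose word count lies in [minWords, maxWords].
    public static List<string> FilterByWordCount(
        IEnumerable<string> items, int minWords, int maxWords)
    {
        return items
            .Where(s =>
            {
                int count = CountWords(s); // split once per string
                return count >= minWords && count <= maxWords;
            })
            .ToList();
    }
}
```

With this in place, the filter from the question could be written as myList = WordFilter.FilterByWordCount(myList, minWords, maxWords);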

EDIT:

If optimal performance is necessary and memory usage is a concern, you could write your own method and leverage IndexOf(). There are many ways to implement something like this; I just prefer reusing existing methods to designing code from scratch:

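    public int WordCount(string s) {
        const int DONE = -1;
        var str = s.Trim();
        // An empty or all-whitespace string contains no words.
        if (str.Length == 0) return 0;
        var wordCount = 1;
        var index = str.IndexOf(' ');
        while (index != DONE) {
            // Count a new word only when the space is followed by a non-space,
            // so runs of consecutive spaces are not counted more than once.
            // index + 1 is always in range because str was trimmed.
            if (str[index + 1] != ' ') wordCount++;
            index = str.IndexOf(' ', index + 1);
        }
        return wordCount;
    }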
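
The EDIT above notes that there are many ways to implement this. One of them, shown here only as a sketch and not as the answer's own method, is a single pass over the characters that counts transitions from whitespace to non-whitespace. It allocates nothing and, unlike a space-only delimiter, treats tabs and newlines as separators too (an assumption that goes beyond the question). The names WordCounter and Count are hypothetical.

```csharp
using System;

// Hypothetical helper; an alternative sketch, not the method from the answer.
public static class WordCounter
{
    // Counts words by scanning once and counting each transition
    // from whitespace (or start of string) into a non-whitespace run.
    public static int Count(string text)
    {
        int count = 0;
        bool inWord = false;
        foreach (char c in text)
        {
            if (char.IsWhiteSpace(c))
            {
                inWord = false;
            }
            else if (!inWord)
            {
                inWord = true; // first character of a new word
                count++;
            }
        }
        return count;
    }
}
```

It can be passed to Where in the same way as any other counting function, for example myList.Where(s => WordCounter.Count(s) >= minWords).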

OTHER TIPS

Your approach to counting words is OK. String.Split will give a similar result at the cost of more memory.

Then just implement your int WordCount(string text) function and pass it to Where:

myList.Where(s => WordCount(s) > minWordCount)

You want all strings with word-count in a given range?

int minCount = 10;
int maxCount = 15;
IEnumerable<string> result = list
    .Select(s => new { Text = s, Words = s.Split() })
    .Where(x => x.Words.Length >= minCount
             && x.Words.Length <= maxCount)
    .Select(x => x.Text);

How about splitting the string into an array on whitespace and counting the elements?

s.Split().Count()

removed the space :)
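
One caveat worth illustrating: Split() with no arguments splits on whitespace but keeps empty entries, so leading, trailing, or repeated spaces inflate the count. The sketch below contrasts it with the overload that takes StringSplitOptions.RemoveEmptyEntries. The class and method names are hypothetical, and the (char[]) cast is there only to pick the overload unambiguously when passing null (which means "split on whitespace").

```csharp
using System;

// Hypothetical demo class; names are illustrative only.
public static class SplitDemo
{
    // Split() with no arguments: whitespace-delimited, empty entries kept.
    public static int NaiveCount(string s)
    {
        return s.Split().Length;
    }

    // A null separator array also means "split on whitespace";
    // RemoveEmptyEntries discards the empty pieces.
    public static int WhitespaceCount(string s)
    {
        return s.Split((char[])null, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}
```

For the input "a  b" (two spaces), NaiveCount sees the pieces "a", "" and "b", while WhitespaceCount sees only "a" and "b".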

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow