HashSet IntersectWith Count words but only unique
Question
I have a RichTextBox control on a form and a text file. I read the text file into one array and richTextBox1.Text into another, then compare them and count the matching words. The problem is duplicates: say a word such as "name" appears twice in the text file and three times in the RichTextBox. A word that appears twice in the text file cannot count three or more times in the RichTextBox; the third occurrence must be a wrong word and must not be counted. But HashSet only counts unique values and ignores duplicates in the text file. I want to compare every word in the text file with the words in the RichTextBox. (Sorry for my English.)
My code:
StreamReader sr = new StreamReader("c:\\test.txt",Encoding.Default);
string[] word = sr.ReadLine().ToLower().Split(' ');
sr.Close();
string[] word2 = richTextBox1.Text.ToLower().Split(' ');
var set1 = new HashSet<string>(word);
var set2 = new HashSet<string>(word2);
set1.IntersectWith(set2);
MessageBox.Show(set1.Count.ToString());
Solution
Inferring that you want:
file:
foo
foo
foo
bar
text box:
foo
foo
bar
bar
to result in '3' (2 foos and one bar)
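For comparison, here is a minimal sketch of what the set-based approach from the question returns on this input. The class name, the method name and the hard-coded strings are illustrative only; they stand in for the file and the text box.

```csharp
using System;
using System.Collections.Generic;

static class SetIntersectionDemo
{
    // Same logic as the question: number of distinct words present in both inputs.
    public static int CountDistinctCommonWords(string fileText, string boxText)
    {
        var fileWords = new HashSet<string>(fileText.ToLower().Split(' '));
        var boxWords = new HashSet<string>(boxText.ToLower().Split(' '));
        fileWords.IntersectWith(boxWords);
        return fileWords.Count;
    }

    static void Main()
    {
        // Hypothetical stand-ins for the file line and richTextBox1.Text.
        Console.WriteLine(CountDistinctCommonWords("foo foo foo bar", "foo foo bar bar")); // prints 2
    }
}
```

Both sets collapse to {foo, bar}, so the intersection has 2 elements rather than the desired 3. To get 3, the number of occurrences of each word has to be tracked.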
// Count how many times each word occurs in the file.
Dictionary<string,int> fileCounts = new Dictionary<string, int>();
using (var sr = new StreamReader("c:\\test.txt",Encoding.Default))
{
    foreach (var word in sr.ReadLine().ToLower().Split(' '))
    {
        int c = 0;
        if (fileCounts.TryGetValue(word, out c))
        {
            fileCounts[word] = c + 1;
        }
        else
        {
            fileCounts.Add(word, 1);
        }
    }
}
// Each text box word "uses up" one remaining occurrence from the file.
int total = 0;
foreach (var word in richTextBox1.Text.ToLower().Split(' '))
{
    int c = 0;
    if (fileCounts.TryGetValue(word, out c))
    {
        total++;
        if (c - 1 > 0)
            fileCounts[word] = c - 1;
        else
            fileCounts.Remove(word); // no occurrences left to match
    }
}
MessageBox.Show(total.ToString());
Note that this destructively modifies the dictionary built from the file. You can avoid that (so the dictionary only has to be built once) by counting the words in the rich text box in the same way, then taking the Min of the two counts for each word and summing the results.
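A minimal sketch of that non-destructive variant follows. The helper names Tally and CountMatches are hypothetical, and the hard-coded strings stand in for the file contents and richTextBox1.Text.

```csharp
using System;
using System.Collections.Generic;

static class MinCountDemo
{
    // Builds a word -> occurrence count table (hypothetical helper).
    public static Dictionary<string, int> Tally(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>();
        foreach (var word in words)
        {
            int c;
            counts.TryGetValue(word, out c); // c is 0 when the word is not present yet
            counts[word] = c + 1;
        }
        return counts;
    }

    // For every word present in both tables, adds the smaller of the two counts.
    // Neither dictionary is modified.
    public static int CountMatches(Dictionary<string, int> fileCounts,
                                   Dictionary<string, int> boxCounts)
    {
        int total = 0;
        foreach (var pair in fileCounts)
        {
            int boxCount;
            if (boxCounts.TryGetValue(pair.Key, out boxCount))
                total += Math.Min(pair.Value, boxCount);
        }
        return total;
    }

    static void Main()
    {
        var fileCounts = Tally("foo foo foo bar".Split(' '));
        var boxCounts = Tally("foo foo bar bar".Split(' '));
        Console.WriteLine(CountMatches(fileCounts, boxCounts)); // prints 3
    }
}
```

On the example above this gives Min(3, 2) for foo plus Min(1, 2) for bar, which is 3, and fileCounts can be reused for further comparisons.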
OTHER TIPS
Do you need the counts to be the same? Then you need to count the words first...
static Dictionary<string, int> CountWords(string[] words) {
    // use (StringComparer.{your choice}) for case-insensitive
    var result = new Dictionary<string, int>();
    foreach (string word in words) {
        int count;
        if (result.TryGetValue(word, out count)) {
            result[word] = count + 1;
        } else {
            result.Add(word, 1);
        }
    }
    return result;
}
...
var set1 = CountWords(word);
var set2 = CountWords(word2);
var matches = from val in set1
              where set2.ContainsKey(val.Key)
                 && set2[val.Key] == val.Value
              select val.Key;
foreach (string match in matches)
{
    Console.WriteLine(match);
}