Frage

I have an array of ints in C# and I want to get 5% of the whole array, in the way that the new array includes the most frequent similar values. For an example, say I have an array with 100 entries that includes 40 siblings of 20 (15 to 25). What I want is to detect the 20 as the most frequent value (including it's siblings) as a new array and then 5 most frequent values inside the new array. I need to run the code on an ASP.net website and because of that, I need a fast algorithem. Could anyone help me with this please?

War es hilfreich?

Lösung

You can build a simple algorithm by grouping the values, ordering by count, and then taking them until you fill the required 5% array, like this:

// Build a set of {Value, Count} pairs using LINQ
var counts = data
    .GroupBy(v => v)
    .Select(g => new {
        Value = g => Key
    ,   Count = g.Count()
    }).OrderByDescending(p => p.Count)
    .Take(5);

EDIT :

The array may be as big as 1024*1024 in size and the ranges are between 0 and 255

Since the range is very small, you can use counting array instead of a group, like this:

int counts = new int[256];
foreach (var b in data) {
    counts[b]++;
}

Now you can run the Quick Select Algorithm to choose the fifth item. Here is an answer that provides a C# implementation of QuickSelect.

var fifth = QuickSelect(counts, 5);
var res = new List<KeyValuePair<int,int>>();
for (int i = 0 ; i != counts.Length && res.Length != 5 ; i++) {
    if (counts[i] >= fifth) {
        res.Add(new KeyValuePair<int,int>(i, counts[i]));
    }
}

You may want to replace the quick select algorithm with the median-of-medians algorithm, which has the same linear performance, but is not randomized.

Andere Tipps

var numbersByOccurrence = from numbers in yourNumberArrayVariable
                          group numbers by numbers into g
                          select new { Number = g.Key, Count = g.Count() };

var limitedSize = numbersByOccurrence.OrderByDescending(n => n.Count).Take(5);

You now have an variable (you can cast as an array or list) of 5 objects with a Number and Count variable you can easily access.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top