Question

I have a large C# HashSet<T> that I cannot process all at once, so I need to extract chunks of a given size. I know I can iterate over the set and copy each element into an array or list to be processed later, but is there a faster or more elegant way to do that? Something like a one-liner?

    public static IEnumerable<T[]> Slice<T>(this HashSet<T> h, int size)
    {
        if (0 >= size)
        {
            throw new ArgumentOutOfRangeException(nameof(size), "0 or negative slice sizes are not accepted!");
        }

        if (null == h || 0 == h.Count)
        {
            yield return new T[0];
            yield break;
        }

        if (size >= h.Count)
        {
            yield return h.ToArray();
            yield break;
        }

        List<T> to_ret = new List<T>(size);
        foreach (T elem in h)
        {
            if (size == to_ret.Count)
            {
                yield return to_ret.ToArray();
                to_ret.Clear();
            }

            to_ret.Add(elem);
        }

        if (0 < to_ret.Count)
        {
            yield return to_ret.ToArray();
            to_ret.Clear();
        }
    }

This is how I did it, but I was thinking there must be a more elegant way than this. :(


Solution

Older frameworks (before .NET 6) have nothing built in for this.

However, if you use the MoreLINQ library (which is a useful thing to have around anyway), its Batch operator does what you want:

    int batchSize = 1024;

    foreach (var batch in myHashSet.Batch(batchSize))
    {
        foreach (var item in batch)
        {
            ...
        }
    }
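
On .NET 6 or later, LINQ itself ships Enumerable.Chunk, which covers the same case without an extra package. The following is only a minimal sketch: the ChunkDemo class, the ChunkSizes helper, and the sample data are made up for illustration and are not part of the question or of MoreLINQ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkDemo
{
    // Hypothetical helper (illustration only): records the length of each chunk.
    public static List<int> ChunkSizes<T>(HashSet<T> set, int size)
    {
        var sizes = new List<int>();

        // Enumerable.Chunk (System.Linq, .NET 6+) lazily yields T[] arrays
        // holding at most `size` elements; only the last one may be shorter.
        foreach (T[] chunk in set.Chunk(size))
        {
            sizes.Add(chunk.Length);
        }

        return sizes;
    }

    public static void Main()
    {
        // Sample data, assumed for the example: ten distinct integers.
        var myHashSet = new HashSet<int>(Enumerable.Range(1, 10));

        // prints 4,4,2
        Console.WriteLine(string.Join(",", ChunkSizes(myHashSet, 4)));
    }
}
```

Like Batch, Chunk enumerates the set once and hands out one array at a time, so the whole set is never copied in a single step. A HashSet<T> does not guarantee any particular enumeration order, so which elements land in which chunk should not be relied on.
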
Licensed under: CC-BY-SA with attribution