Efficient column operations (sum, average...) on very large IEnumerable<arrayType[]> with LINQ

StackOverflow https://stackoverflow.com/questions/23544070

  •  18-07-2023

Question

Assuming we have the following IEnumerable example:

// Illustrative only: in practice the rows come from a streaming source.
IEnumerable<int[]> veryLargeJaggedArray = new[]
{
    new int[] {1, 3, 5},
    new int[] {0, 2, 4},
    new int[] {11, 22, 6},
    // ...lots of data streaming in
};

where the underlying collection does not implement ICollection (i.e. there is no fast Count() lookup), what is the most efficient way to use C# LINQ to perform aggregate-type column operations?

This question extends my previous question to a more practical case.

Solution

You can easily create your own extension method (declared in a static class) that, given that input, produces a sequence representing a single column:

public static IEnumerable<T> GetColumn<T>(
    this IEnumerable<IList<T>> data, 
    int columnNumber)
{
    return data.Select(row => row[columnNumber]);
}

Now you can write code like:

var firstColumnSum = veryLargeJaggedArray.GetColumn(0).Sum();
var secondColumnAverage = veryLargeJaggedArray.GetColumn(1).Average();
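
Each call above enumerates the source again from the start. If the rows come from a stream that is expensive or impossible to replay, one option is to accumulate every column in a single pass. The following is a minimal sketch under assumptions, not part of the original answer: SumColumns, ColumnExtensions, and Demo.Rows are hypothetical names, and every row is assumed to have at least columnCount elements.

```csharp
using System;
using System.Collections.Generic;

public static class ColumnExtensions
{
    // Hypothetical helper (not part of LINQ): sums every column in one pass.
    // Assumes each row has at least columnCount elements.
    public static (long[] Sums, long RowCount) SumColumns(
        this IEnumerable<int[]> rows, int columnCount)
    {
        var sums = new long[columnCount];
        long rowCount = 0;
        foreach (var row in rows)
        {
            for (int c = 0; c < columnCount; c++)
                sums[c] += row[c];
            rowCount++;
        }
        return (sums, rowCount);
    }
}

public static class Demo
{
    // Stand-in for a streaming source: rows are produced lazily, with no Count.
    public static IEnumerable<int[]> Rows()
    {
        yield return new[] { 1, 3, 5 };
        yield return new[] { 0, 2, 4 };
        yield return new[] { 11, 22, 6 };
    }

    public static void Main()
    {
        var (sums, n) = Rows().SumColumns(3);
        Console.WriteLine(sums[0]);              // first column sum: 12
        Console.WriteLine((double)sums[1] / n);  // second column average: 9
    }
}
```

The row count is tracked alongside the sums so that averages can be derived afterwards without a second enumeration.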

Other answers

The built-in extension methods are already pretty smart. You typically just need something like this:

var result = MyCollection.Sum();

or

var result = MyCollection.Average(i => i.NumericProperty);

Some of these extension methods check the runtime type of the source. Count(), for example, reads the Count property when the underlying collection implements ICollection<T>, and falls back to enumerating when only IEnumerable<T> is available. Sum and Average have to visit every element either way, so they work in a single pass over any IEnumerable<T>, including one with no fast Count.
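
To make the behaviour on a lazy source concrete, here is a small sketch (CountDemo, Lazy, and Yielded are hypothetical names introduced for illustration). It counts how many elements an iterator actually produces: Count() on a List<T> can use the stored count, whereas Count() and Sum() on a plain iterator each walk the whole sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CountDemo
{
    // Number of elements the iterator below has produced so far.
    public static int Yielded;

    // A lazy sequence with no ICollection<T> behind it.
    public static IEnumerable<int> Lazy()
    {
        for (int i = 1; i <= 3; i++)
        {
            Yielded++;
            yield return i;
        }
    }

    public static void Main()
    {
        var list = new List<int> { 1, 2, 3 };
        Console.WriteLine(list.Count());    // 3, taken from List<T>.Count
        Console.WriteLine(Lazy().Count());  // 3, after walking the iterator
        Console.WriteLine(Yielded);         // 3
        Console.WriteLine(Lazy().Sum());    // 6, a second full enumeration
        Console.WriteLine(Yielded);         // 6
    }
}
```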

License: CC-BY-SA with attribution
Not affiliated with StackOverflow