Calculating Weighted Average with LINQ
-
01-10-2019 - |
Question
My goal is to get a weighted average from one table, based on another tables primary key.
Example Data:
Table1
Key WEIGHTED_AVERAGE
0200 0
Table2
ForeignKey Length Value
0200 105 52
0200 105 60
0200 105 54
0200 105 -1
0200 47 55
I need to get a weighted average based on the length of a segment and I need to ignore values of -1. I know how to do this in SQL, but my goal is to do this in LINQ. It looks something like this in SQL:
SELECT Sum(t2.Value*t2.Length)/Sum(t2.Length) AS WEIGHTED_AVERAGE
FROM Table1 t1, Table2 t2
WHERE t2.Value <> -1
AND t2.ForeignKey = t1.Key;
I am still pretty new to LINQ, and having a hard time figuring out how I would translate this. The result weighted average should come out to roughly 55.3. Thank you.
Solution
I do this enough that I created an extension method for LINQ.
public static double WeightedAverage<T>(this IEnumerable<T> records, Func<T, double> value, Func<T, double> weight)
{
double weightedValueSum = records.Sum(x => value(x) * weight(x));
double weightSum = records.Sum(x => weight(x));
if (weightSum != 0)
return weightedValueSum / weightSum;
else
throw new DivideByZeroException("Your message here");
}
After you get your subset of data the call looks like this.
double weightedAverage = records.WeightedAverage(x => x.Value, x => x.Length);
This has become extremely handy because I can get a weighted average of any group of data based on another field within the same record.
Update
I now check for dividing by zero and throw a more detailed exception instead of returning 0. Allows user to catch the exception and handle as needed.
OTHER TIPS
If you're certain that for each foreign key in Table2 there is a corresponding record in Table1, then you can avoid the join just making a group by.
In that case, the LINQ query is like this:
IEnumerable<int> wheighted_averages =
from record in Table2
where record.PCR != -1
group record by record.ForeignKey into bucket
select bucket.Sum(record => record.PCR * record.Length) /
bucket.Sum(record => record.Length);
UPDATE
This is how you can get the wheighted_average
for a specific foreign_key
.
IEnumerable<Record> records =
(from record in Table2
where record.ForeignKey == foreign_key
where record.PCR != -1
select record).ToList();
int wheighted_average = records.Sum(record => record.PCR * record.Length) /
records.Sum(record => record.Length);
The ToList method called when fetching the records, is to avoid executing the query twice while aggregating the records in the two separate Sum operations.
(Answering jsmith's comment to the answer above)
If you don't wish to cycle through some collection, you can try the following:
var filteredList = Table2.Where(x => x.PCR != -1)
.Join(Table1, x => x.ForeignKey, y => y.Key, (x, y) => new { x.PCR, x.Length });
int weightedAvg = filteredList.Sum(x => x.PCR * x.Length)
/ filteredList.Sum(x => x.Length);