Question

I have a collection of about 8,000 test scores in an XML file. Using LINQ and C#, what is one of the most efficient ways to calculate the percentile of a particular test score?

My emphasis is on efficiency, so what is the recommended approach? I am also looking for any built-in LINQ or C# functions suited to this calculation. Is there something called Percentile() or TopPercent?

Solution

It sounds like you're worrying about efficiency before you've verified that you need to worry about it.

I would take the following approach:

  • Load the XML file into memory with LINQ to XML (as the simplest XML API in .NET)
  • Convert the scores into a list of integers (or whatever the score type is)
  • You can now find out the total count easily
  • Use Count with a predicate to find out how many scores are less than your "target" score; dividing that by the total count (and multiplying by 100) gives the percentile

If you need to check multiple scores, you obviously only need to repeat the final step.
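
The steps above can be sketched roughly as follows. This is only an illustration: the element name "score", the class name ScorePercentile, and both method names are assumptions, since the question does not show the XML layout, and "percentile" is taken here to mean the percentage of scores strictly below the target.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class ScorePercentile
{
    // Assumes a hypothetical layout such as
    // <scores><score>72</score><score>85</score></scores>.
    public static List<int> LoadScores(XDocument doc) =>
        doc.Descendants("score")
           .Select(e => (int)e)   // XElement has an explicit conversion to int
           .ToList();

    // Percentage of scores strictly below the target score.
    public static double PercentileOf(List<int> scores, int target)
    {
        if (scores.Count == 0)
            throw new ArgumentException("No scores supplied.", nameof(scores));

        int below = scores.Count(s => s < target);   // Count with a predicate
        return 100.0 * below / scores.Count;
    }
}
```

With a file on disk, the document could be obtained via XDocument.Load("scores.xml") (the file name is a placeholder) and passed to LoadScores once; PercentileOf can then be called for each score of interest. For 8,000 values, a linear scan per lookup is likely to be cheap, but measuring is the only way to be sure.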

My first attempt at optimizing this (for multiple checks) would be to sort the list, so you can then just do a binary search to find the rank of each score. I'd only go that far after benchmarking though.
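
As a sketch of that optimization, assuming the scores have already been sorted in ascending order once (for example with List<int>.Sort()), the rank can be found with a hand-written lower-bound binary search. The class and method names here are hypothetical. A manual search is used rather than List<T>.BinarySearch because, when the list contains duplicates, BinarySearch may return the index of any matching element, not necessarily the first.

```csharp
using System;
using System.Collections.Generic;

public static class SortedScores
{
    // Number of elements strictly less than target.
    // Precondition: 'sorted' is in ascending order.
    public static int CountBelow(List<int> sorted, int target)
    {
        int lo = 0, hi = sorted.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (sorted[mid] < target)
                lo = mid + 1;   // everything up to mid is below the target
            else
                hi = mid;       // mid is >= target; keep searching left
        }
        return lo;              // index of the first element >= target
    }

    // Percentage of scores strictly below the target score.
    public static double PercentileOf(List<int> sorted, int target)
    {
        if (sorted.Count == 0)
            throw new ArgumentException("No scores supplied.", nameof(sorted));

        return 100.0 * CountBelow(sorted, target) / sorted.Count;
    }
}
```

Sorting costs O(n log n) once, after which each lookup is O(log n) instead of O(n). Whether that matters for roughly 8,000 scores is exactly what a benchmark should decide.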

Licensed under: CC-BY-SA with attribution