Question

I've been experimenting with Java 8 streams. Is this the best way to remove the min and max scores before averaging?

private final Set<MatchScore> scores = new HashSet<>(10);

. . .

public double OPR() {
    return scores.stream()
            .mapToDouble(MatchScore::getScore)
            .filter((num) -> { //Exclude min and max score
                return num != scores.stream()
                                    .mapToDouble(MatchScore::getScore)
                                    .max().getAsDouble() 
                        && 
                       num != scores.stream()
                                    .mapToDouble(MatchScore::getScore)
                                    .min().getAsDouble();
            })
            .average().getAsDouble();
}

Solution

A simpler approach would be:

return scores.stream()
        .mapToDouble(MatchScore::getScore)
        .sorted()
        .skip(1)
        .limit(scores.size() - 2)
        .average().getAsDouble();

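Note: this matches the original only when the scores are distinct. skip(1) and limit(size - 2) drop exactly one lowest and one highest value, whereas the filter in the question drops every value equal to the min or max. A set guarantees unique MatchScore elements, but two of them could still share a score unless MatchScore equality is based on the score; with a list there could always be more than one element equal to the min or max score.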
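To make the difference concrete, here is a minimal sketch that runs both variants on a list with a duplicated minimum. The class name TrimDemo and the helper names skipLimit and filterExtremes are made up for illustration, and plain Double values stand in for MatchScore.

```java
import java.util.Arrays;
import java.util.List;

class TrimDemo {
    // skip/limit variant: drops exactly one lowest and one highest value.
    static double skipLimit(List<Double> values) {
        return values.stream()
                .mapToDouble(Double::doubleValue)
                .sorted()
                .skip(1)
                .limit(values.size() - 2)
                .average().getAsDouble();
    }

    // Filter variant (same idea as the question): drops every value equal to min or max.
    static double filterExtremes(List<Double> values) {
        double min = values.stream().mapToDouble(Double::doubleValue).min().getAsDouble();
        double max = values.stream().mapToDouble(Double::doubleValue).max().getAsDouble();
        return values.stream()
                .mapToDouble(Double::doubleValue)
                .filter(v -> v != min && v != max)
                .average().getAsDouble();
    }

    public static void main(String[] args) {
        List<Double> values = Arrays.asList(1.0, 1.0, 2.0, 9.0);
        System.out.println(skipLimit(values));      // 1.5 (averages 1.0 and 2.0)
        System.out.println(filterExtremes(values)); // 2.0 (only 2.0 survives the filter)
    }
}
```

Which result is the right one depends on whether "remove the min and max" means one score each or all tied scores.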


Performance-wise*, skip/limit is significantly faster on a small set of 10 elements (the Mean column shows the average time taken by a sample call, in nanoseconds):

Benchmark                      Mode   Samples         Mean   Mean error    Units
c.a.p.SO22923505.maxMin        avgt         5     6996.190      284.287    ns/op
c.a.p.SO22923505.skipLimit     avgt         5      479.935        4.547    ns/op

*Measured with jmh; the original answer linked to the source code for the tests.

Other tips

One can use DoubleSummaryStatistics to collect the required information in a single pass over the data, and then subtract out the minimum and maximum:

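// Needs: java.util.DoubleSummaryStatistics, java.util.stream.Collectors
@GenerateMicroBenchmark
public double summaryStats() {
    // Double::doubleValue assumes scores holds Double values here;
    // for the question's Set<MatchScore>, map with MatchScore::getScore instead.
    DoubleSummaryStatistics stats =
        scores.stream()
              .collect(Collectors.summarizingDouble(Double::doubleValue));

    // With two or fewer scores nothing is left after dropping min and max
    // (a count of exactly 2 would divide by zero).
    if (stats.getCount() <= 2L) {
        return 0.0; // or something
    } else {
        return (stats.getSum() - stats.getMin() - stats.getMax()) / (stats.getCount() - 2);
    }
}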

Adding this code to assylias' benchmark code gives me the following results. Although my machine is slower overall, the single-pass DoubleSummaryStatistics approach is the fastest of the three in relative terms.

Benchmark                         Mode   Samples         Mean   Mean error    Units
c.a.p.SO22923505.maxMin           avgt         5     9629.166     1051.585    ns/op
c.a.p.SO22923505.skipLimit        avgt         5      682.221       80.504    ns/op
c.a.p.SO22923505.summaryStats     avgt         5      412.740       85.372    ns/op
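Outside the benchmark, the same idea could be applied to the question's data roughly as follows. This is a sketch under assumptions: the MatchScore class below is a hypothetical stand-in with only a getScore() accessor, and the helper name trimmedAverage and the choice to return an empty OptionalDouble for fewer than three scores are illustrative, not part of the original answer. It uses DoubleStream.summaryStatistics(), which collects the same statistics as Collectors.summarizingDouble.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.DoubleSummaryStatistics;
import java.util.OptionalDouble;

class SummaryStatsDemo {
    // Hypothetical stand-in for the question's MatchScore class.
    static class MatchScore {
        private final double score;
        MatchScore(double score) { this.score = score; }
        double getScore() { return score; }
    }

    // Average of all scores except one minimum and one maximum,
    // computed in a single pass; empty when fewer than three scores exist.
    static OptionalDouble trimmedAverage(Collection<MatchScore> scores) {
        DoubleSummaryStatistics stats = scores.stream()
                .mapToDouble(MatchScore::getScore)
                .summaryStatistics();
        if (stats.getCount() <= 2) {
            return OptionalDouble.empty();
        }
        double trimmedSum = stats.getSum() - stats.getMin() - stats.getMax();
        return OptionalDouble.of(trimmedSum / (stats.getCount() - 2));
    }

    public static void main(String[] args) {
        Collection<MatchScore> scores = Arrays.asList(
                new MatchScore(2), new MatchScore(4), new MatchScore(6), new MatchScore(8));
        System.out.println(trimmedAverage(scores).getAsDouble()); // 5.0

        Collection<MatchScore> tooFew = Arrays.asList(new MatchScore(2), new MatchScore(4));
        System.out.println(trimmedAverage(tooFew).isPresent()); // false
    }
}
```

Like skip/limit, this removes exactly one minimum and one maximum, so tied extreme scores are only partly removed.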

I think this will do the job without having to make multiple passes through the stream, or sorting it:

private static class ScoreData {
    public double min, max, sum;
    public int count;
    public ScoreData() {
        min = Double.POSITIVE_INFINITY;
        max = Double.NEGATIVE_INFINITY;
        sum = 0;
        count = 0;
    }
    public void add(double d) {
        if (d < min)
            min = d;
        if (d > max)
            max = d;
        sum += d;
        count++;
    }
    public void combine(ScoreData m) {
        if (m.min < min)
            min = m.min;
        if (m.max > max)
            max = m.max;
        sum += m.sum;
        count += m.count;
    }
}

private static ScoreData getScoreData(DoubleStream ds) {
    return ds.collect(ScoreData::new, ScoreData::add, ScoreData::combine);
}

This works for any DoubleStream. Now you can get the average excluding the extrema like this:

ScoreData sd = getScoreData(scores.stream().mapToDouble(MatchScore::getScore));
double answer = (sd.sum - sd.min - sd.max) / (sd.count - 2);

assuming that sd.count > 2.
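The sd.count > 2 assumption can also be checked in code. The following self-contained sketch restates the accumulator in compact form and wraps the calculation in a hypothetical helper, trimmedAverage, that returns an empty OptionalDouble when there are too few values; the helper name and the empty-result convention are assumptions for illustration. It runs the stream in parallel, which is the case where the combine method is actually needed.

```java
import java.util.OptionalDouble;
import java.util.stream.DoubleStream;

class ScoreDataDemo {
    // Compact restatement of the ScoreData accumulator.
    static class ScoreData {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum = 0;
        int count = 0;

        // Folds one value into this accumulator.
        void add(double d) {
            if (d < min) min = d;
            if (d > max) max = d;
            sum += d;
            count++;
        }

        // Merges another partial result into this one (used by parallel streams).
        void combine(ScoreData other) {
            if (other.min < min) min = other.min;
            if (other.max > max) max = other.max;
            sum += other.sum;
            count += other.count;
        }
    }

    // Hypothetical helper: empty when fewer than three values were seen.
    static OptionalDouble trimmedAverage(DoubleStream ds) {
        ScoreData sd = ds.collect(ScoreData::new, ScoreData::add, ScoreData::combine);
        if (sd.count <= 2) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of((sd.sum - sd.min - sd.max) / (sd.count - 2));
    }

    public static void main(String[] args) {
        OptionalDouble result = trimmedAverage(DoubleStream.of(3, 5, 7, 9, 11).parallel());
        System.out.println(result.getAsDouble()); // 7.0

        System.out.println(trimmedAverage(DoubleStream.of(3, 5)).isPresent()); // false
    }
}
```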

EDIT: Looks like I just reinvented the wheel! Stuart has a better solution using a class that already exists in the JDK.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow