Question

I have been using EmpiricalDistributionImpl from the Apache Commons Math library for quite a while now. Since upgrading from 2.x to 3.3, I have been running into some problems.

First off, NaNs seem to cause problems during load() in this version; I am pretty sure they were not problematic before. The real problem, though, is that getNextValue() on my EDI instance returns negative values even though all of the values I have loaded are strictly positive. Specifically, my values are positive ratios in the (0, +Inf) range, and if I plot them the distribution is pretty top-heavy (i.e. 90-95% of the values end up in the top 3 bins).

FWIW, I have found the following two bug reports (MATH-1132 and MATH-984), but I am not sure they are entirely related.

They both appear to be fixed and scheduled for the 3.4 release, but there is no ETA for that release.

Suggestions?


Solution

MATH-1132 is unrelated, but MATH-984 is likely related to the data range problem you mention. NaNs should be filtered out before the data are passed to load(), as there is no meaningful way to handle them without adding support for a NaNStrategy, which is not currently supported.
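As a minimal sketch of the NaN filtering suggested above, using only the JDK: the class name `NaNFilter` and the method `dropNaNs` are hypothetical helpers made up for illustration, not part of Commons Math.

```java
import java.util.Arrays;

// Hypothetical helper (not part of Commons Math): strips NaNs from the
// sample before it is handed to load().
final class NaNFilter {
    private NaNFilter() {}

    /** Returns a new array holding the non-NaN entries of values, in their original order. */
    static double[] dropNaNs(double[] values) {
        return Arrays.stream(values)
                     .filter(v -> !Double.isNaN(v))
                     .toArray();
    }
}
```

The filtered array would then be passed to load(double[]) in place of the raw data, e.g. `dist.load(NaNFilter.dropNaNs(raw))`, where `dist` and `raw` stand for your own distribution object and input array.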

Version 3.4 was just released.

Please open a new ticket if you still have range problems, and feel free to open one to get NaNs supported via a NaNStrategy.

Licensed under: CC-BY-SA with attribution