Question

Hi, I am currently comparing statistics computed by Matlab with those computed by the Apache Commons Math library, which I call from Java. For the very same data set, stored as a double array (double[]), I get the following results:

---------------------------------------
           Matlab       vs Apache
---------------------------------------
max      = 0.5451       vs 0.5450980392156862
min      = 0.4941       vs 0.49411764705882355
var      = 5.4154e-05   vs 5.415357603461868E-5
std      = 0.0074       vs 0.007358911334879547
mean     = 0.5206       vs 0.5205525290240967
kurtosis = 3.3442       vs 0.35227427833465486
skewness = 0.2643       vs 0.26466432504210746

I checked and rechecked my data: each value used in Matlab is the same as the one used in Java. All the statistics agree closely, except for the kurtosis.

Is it possible that the kurtosis is computed differently by Matlab and by the Apache library? If so, which value should I trust?


EDIT

My data is a subset of an image matrix (containing pixel values). For each subset I compute the statistics above. Every time, all the statistics match except for the kurtosis.

The matlab code for computing the kurtosis of my subset is the following:

kurtosis( sub(:) ); % sub is an n x m matrix

While the one I used in Java is:

import org.apache.commons.math3.stat.descriptive.moment.Kurtosis;
// ...
Kurtosis kurt = new Kurtosis();
System.out.println("-kurtosis: " + kurt.evaluate(subImg) );

Here subImg is a double[] array holding the n x m values.


Solution

You can also compute the Apache Java statistics from within Matlab by importing the class. The Apache function uses an unbiased estimator of the population excess kurtosis. Excess kurtosis means subtracting 3, so that the kurtosis of a normal distribution equals zero.

To demonstrate this, I also wrote a Matlab function that implements the formula given in the Apache documentation:

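function y = kurtosis_apache(x)
    % Bias-corrected excess kurtosis of a vector x (formula from the Apache documentation)
    n = length(x);        % number of samples
    mean_x = mean(x);
    std_x = std(x);       % sample standard deviation (normalized by n-1)

    y = ( (n*(n+1) / ((n-1)*(n-2)*(n-3))) * sum((x - mean_x).^4) / std_x^4 ) - ((3*(n-1)^2) / ((n-2)*(n-3)));
end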
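
For readers working on the Java side only, the same formula can be sketched in plain Java without the library. This is a minimal illustration, not the Apache source: the class name ExcessKurtosisSketch and the method excessKurtosis are hypothetical, and throwing an exception for fewer than four values is a choice made here, not a statement about the library's behaviour.

```java
// Plain-Java sketch of the bias-corrected excess kurtosis formula shown above.
// Names are hypothetical; this is not the Apache Commons Math implementation.
class ExcessKurtosisSketch {

    static double excessKurtosis(double[] x) {
        int count = x.length;
        if (count < 4) {
            // The formula divides by (n-3), so at least four values are needed.
            throw new IllegalArgumentException("need at least 4 values");
        }

        // Sample mean.
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= count;

        // Sums of squared and fourth-power deviations from the mean.
        double sumSq = 0.0;
        double sumFourth = 0.0;
        for (double v : x) {
            double d = v - mean;
            sumSq += d * d;
            sumFourth += d * d * d * d;
        }

        double n = count;                       // use double arithmetic below
        double variance = sumSq / (n - 1);      // sample variance, so std^4 = variance^2
        double first = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
                * sumFourth / (variance * variance);
        double second = 3 * (n - 1) * (n - 1) / ((n - 2) * (n - 3));
        return first - second;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5};
        // Prints a value close to -1.2 for this small example.
        System.out.println(excessKurtosis(data));
    }
}
```

For the values 1 to 5 the two terms work out to 6.8 and 8, so the result is about -1.2, a negative excess kurtosis as expected for evenly spread data.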

Here is my session in the Command Window. It shows the Matlab port of the Apache formula, the Apache Java implementation, and Matlab's own kurtosis (biased and bias-corrected):

javaaddpath('commons-math3-3.2.jar')
import org.apache.commons.math3.stat.descriptive.moment.Kurtosis;

x = randn(1e4,1);

kurtosis_apache(x)

ans = 0.0016

kurt = Kurtosis();
kurt.evaluate(x)

ans = 0.0016

kurtosis(x)

ans = 3.0010

kurtosis(x,0)

ans = 3.0016

See also the Matlab documentation for kurtosis.

So with the flag set to 0, Matlab's bias-corrected kurtosis matches the Apache version once you subtract 3 to turn it into an excess kurtosis. The remaining difference is only floating-point rounding:

(kurtosis(x,0)-3)-kurt.evaluate(x)

ans = 3.8636e-14
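
To see how the two Matlab flags relate, here is a small plain-Java sketch. It assumes the definitions given in the Matlab documentation: the default (flag 1) value is k1 = m4 / m2^2 with central moments normalized by n, and the flag 0 value is k0 = (n-1)/((n-2)(n-3)) * ((n+1)*k1 - 3*(n-1)) + 3. The class and method names are hypothetical.

```java
// Sketch of the biased and bias-corrected kurtosis as described in the Matlab docs.
// Names are hypothetical; no Matlab or Apache code is called here.
class MatlabKurtosisFlagsSketch {

    // Biased kurtosis k1 = m4 / m2^2, with central moments normalized by n
    // (what kurtosis(x) or kurtosis(x,1) is documented to return).
    static double biased(double[] x) {
        int n = x.length;
        double mean = 0.0;
        for (double v : x) {
            mean += v;
        }
        mean /= n;

        double m2 = 0.0;
        double m4 = 0.0;
        for (double v : x) {
            double d = v - mean;
            m2 += d * d;
            m4 += d * d * d * d;
        }
        m2 /= n;
        m4 /= n;
        return m4 / (m2 * m2);
    }

    // Bias-corrected kurtosis k0 (what kurtosis(x,0) is documented to return).
    // Requires at least four values because of the (n-3) factor.
    static double biasCorrected(double[] x) {
        double n = x.length;
        double k1 = biased(x);
        return (n - 1) / ((n - 2) * (n - 3)) * ((n + 1) * k1 - 3 * (n - 1)) + 3;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5};
        System.out.println("biased (flag 1):         " + biased(data));
        System.out.println("bias-corrected (flag 0): " + biasCorrected(data));
        // Subtracting 3 gives the excess kurtosis, the quantity the Apache class reports.
        System.out.println("bias-corrected excess:   " + (biasCorrected(data) - 3));
    }
}
```

For the values 1 to 5 this gives roughly 1.7 (biased), 1.8 (bias-corrected) and -1.2 (bias-corrected excess). Neither library is wrong; they report different estimators, so the choice depends on whether you want the excess and whether you want the bias correction.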

Licensed under: CC-BY-SA with attribution