Question

I want to calculate the variance of each row of numbers in Perl. I've written this subroutine:

################################################################
# variance
#
#
# A subroutine to compute the variance of an array
# division by n-1 is used
#
sub var{
    my ($data) = @_;
    if (@$data ==1) {
        return 0;
    }
    my $mean = mean($data);
    my $sqtotal = 0;
    foreach (@$data) {
        $sqtotal += ($_ - $mean) ** 2;
    }
    my $var = $sqtotal / (scalar @$data - 1);
    return $var;
}
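(The `var` subroutine above calls a `mean` helper that isn't shown in the post; a minimal version, my own sketch, might look like this:)

```perl
# Minimal mean() helper assumed by var() above: sum the elements
# of the array ref and divide by the count. Not in the original post.
sub mean {
    my ($data) = @_;
    my $total = 0;
    $total += $_ for @$data;
    return $total / scalar @$data;
}
```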

If I give it this array of 58 copies of the same number:

[0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98,0.98]

The calculation gave me 1.25421964097639e-30.

I also tried to use the Statistics::Descriptive module (http://metacpan.org/pod/Statistics::Descriptive) and it gave me 2.11916254524942e-15.

And I also tried this site (http://www.alcula.com/calculators/statistics/variance/) and its result is 2.2438191655582E-15.

Why are the results not the same?

I could have just used the module, but it was somehow extremely memory intensive for my file, which consists of about a million lines of 58 numbers each. I'm not sure why it used up so much memory.

Can someone tell me why my calculation gave a different number from the module, and also how to make the module work with less memory? Is the memory usage just an inherent drawback of that module? Several posts seem to suggest so.

Thanks!


Solution

The variance of a constant sequence is zero, so your calculations are all more or less correct, and all more or less the same.

You get results slightly different than zero because you are performing many operations with finite precision floating point numbers. Let's take this code:

 $z = 0;
 $z += 0.98 for 1..58;
 $mean = $z / 58;
 printf "%.20f", $mean;

This code sums 58 instances of the number 0.98 and then divides the sum by 58. You'd expect it to print 0.98000000000000000000, right? No, what I actually get is

 0.97999999999999887201

(and your results might vary).

The canonical What Every Computer Scientist Should Know About Floating-Point Arithmetic can explain the gory details to you.
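One general mitigation (not part of the original answer) is compensated summation. A sketch of Kahan's algorithm in Perl, with the sub name my own:

```perl
# Compensated (Kahan) summation: track the low-order bits lost at
# each addition in a correction term and feed them back in, so the
# final sum is much closer to the exactly rounded result.
sub kahan_sum {
    my @values = @_;
    my ($sum, $c) = (0, 0);
    for my $x (@values) {
        my $y = $x - $c;        # apply the running correction
        my $t = $sum + $y;      # big + small: low bits of $y are lost
        $c = ($t - $sum) - $y;  # recover exactly what was lost
        $sum = $t;
    }
    return $sum;
}

printf "%.20f\n", kahan_sum((0.98) x 58) / 58;
```

This won't make the variance of a constant sequence exactly zero in every case, but it sharply reduces how much error accumulates over long sums.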

OTHER TIPS

Fixed point arithmetic

Do your data use 0.01 (1/100) precision? The example you have provided suggests they do.

YES => you may use fixed-point arithmetic instead of floating-point arithmetic to reduce the accumulation of rounding errors. Since the squared deviations are scaled by 100**2, use a 1/(100**2) = 1/10_000 scaling factor to convert the result back.

https://en.wikipedia.org/wiki/Fixed-point_arithmetic


sub var4{
    my ($data) = @_;
    if (@$data ==1) {
        return 0;
    }
    my $totalD2 = 0;
    foreach (@$data) {
        $totalD2 += $_ * 100;
    }
    my $meanD2 = $totalD2 / (scalar @$data);

    my $sqtotalD4 = 0;
    foreach (@$data) {
        $sqtotalD4 += ($_*100 - $meanD2) ** 2;
    }
    my $varD4 = $sqtotalD4 / (scalar @$data - 1);
    return $varD4 / 10_000; # undo the 100**2 scaling of the squared deviations
}
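A quick sanity check of the idea, inlined rather than calling `var4` so it stands alone (using the constant array from the question):

```perl
# With two-decimal data, $x * 100 rounds to an exact integer
# (0.98 * 100 is exactly 98 as a double), so every subsequent sum
# and difference is exact integer arithmetic.
my @data = (0.98) x 58;

my $total = 0;
$total += $_ * 100 for @data;              # 98 * 58 = 5684, exactly
my $mean = $total / @data;                 # 5684 / 58 = 98, exactly

my $sq = 0;
$sq += ($_ * 100 - $mean) ** 2 for @data;  # (98 - 98)**2 summed = 0, exactly

printf "variance: %g\n", $sq / (@data - 1) / 10_000;  # variance: 0
```

The constant array now yields a variance of exactly 0 instead of 1.25e-30.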

I was waiting for someone to just make the post about floating point arithmetic (ty mob), because that's ultimately the answer.

However, the Statistics::Descriptive module that you linked to was version 2.6 while the current is version 3.0607.

If you look at the source code for 2.6, it uses some stranger math to calculate the mean and the variance:

sub add_data {
  ....

  ##Calculate new mean, pseudo-variance, min and max;
  foreach ( @{ $aref } ) {
    $oldmean = $self->{mean};
    $self->{sum} += $_;
    $self->{count}++;

    ....

    $self->{mean} += ($_ - $oldmean) / $self->{count};
    $self->{pseudo_variance} += ($_ - $oldmean) * ($_ - $self->{mean});
  }

Now this method of calculating the running mean is mathematically accurate and equivalent to Sum(A1..An)/n. However, because of floating-point arithmetic, you're going to see a difference compared to computing the mean straight from the sum/count. Additionally, I suspect that this running variance is mathematically identical as well (I didn't bother doing a paper proof), but you will also see slight differences due to floating-point arithmetic.
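Incidentally, that update rule is Welford's online algorithm; the sample variance falls out as pseudo_variance / (count - 1). A standalone sketch (the sub name is mine, not the module's):

```perl
# Welford's online algorithm: a single pass, no stored data, and
# better numerical stability than the naive sum-of-squares formula.
# Returns the sample variance (n-1 denominator), as var() does.
sub welford_variance {
    my ($data) = @_;
    my ($count, $mean, $m2) = (0, 0, 0);
    for my $x (@$data) {
        $count++;
        my $delta = $x - $mean;
        $mean += $delta / $count;          # running mean
        $m2   += $delta * ($x - $mean);    # running pseudo-variance
    }
    return $count > 1 ? $m2 / ($count - 1) : 0;
}
```

Because it never needs the whole array in memory at once, this is also the natural shape for processing a million-line file row by row.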

The latest version of the module does use the simpler method of calculating the mean and the variance, though, and also has some efficiency improvements from utilizing modules like List::Util. So if your linking to version 2.6 was anything other than an oversight, I suggest that you upgrade to the latest version.
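On the memory question from the post: Statistics::Descriptive::Full keeps every data point it is given, which is what makes it heavy on a million-row file. The Sparse variant in the same distribution keeps only running statistics, so memory stays constant per row. A sketch, assuming Statistics::Descriptive is installed:

```perl
use Statistics::Descriptive;

# Sparse stores only running totals (count, mean, variance, ...),
# never the data itself, so a million rows cost no more memory
# than one row.
my $stat = Statistics::Descriptive::Sparse->new();
$stat->add_data(0.98) for 1 .. 58;
printf "n=%d variance=%g\n", $stat->count(), $stat->variance();
```

You'd create a fresh Sparse object per row (or reuse one via its clear method), rather than loading the whole file into a Full object.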

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow