Coming from this question on Math SE I have the following scenario.
There is a set ($array) with arbitrary values, the number of values in the set ($n), its mean ($mean) and its standard deviation ($s).
$array = array(1, 5, 16, 3, ...);
$n = count($array);
$mean = array_sum($array) / count($array);
$s = sd($array);
The sd() function is taken from the comments on the PHP documentation page for the stats_standard_deviation() function:
// Function to calculate square of value - mean
function sd_square($x, $mean) { return pow($x - $mean,2); }
// Function to calculate standard deviation (uses sd_square)
function sd($array) {
    // square root of sum of squares divided by N-1
    return sqrt(array_sum(array_map("sd_square", $array, array_fill(0, count($array), (array_sum($array) / count($array))))) / (count($array) - 1));
}
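To make the setup concrete, here is a minimal sketch with four made-up values. The helper name sd_sample() is hypothetical; it is meant to give the same result as sd() above (sample standard deviation, dividing by N-1), just written as a plain loop.

```php
<?php
// Sample standard deviation (divides by N - 1).
// sd_sample() is a made-up name for this example, not part of PHP.
function sd_sample(array $values): float {
    $n = count($values);
    $mean = array_sum($values) / $n;
    $sumSquares = 0.0;
    foreach ($values as $v) {
        // accumulate the squared deviation from the mean
        $sumSquares += pow($v - $mean, 2);
    }
    return sqrt($sumSquares / ($n - 1));
}

$array = array(1, 5, 16, 3);   // illustrative values only
$n = count($array);
$mean = array_sum($array) / $n;
$s = sd_sample($array);

printf("n = %d, mean = %.4f, s = %.4f\n", $n, $mean, $s);
// prints: n = 4, mean = 6.2500, s = 6.7020
```

These three numbers ($n, $mean, $s) are everything that is kept once the raw values are gone.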
Now the $array is dropped and its values aren't available anymore (let's say for reasons of anonymity), but a new value $x comes in and has to be incorporated into $mean and $s (the standard deviation).
I try to calculate the new standard deviation with this formula (according to this answer on Math SE):
function m_reverse($n, $mean, $x) {
    return ( $n * $mean + $x ) / ( $n + 1 );
}
function sd_reverse($s, $n, $x, $mean) {
    return sqrt( 1 / $n * ( ( $n - 1 ) * pow( $s, 2 ) + ( $x - $mean ) ) );
}
The m_reverse() function returns the correct new mean, but the sd_reverse() function doesn't. Can anyone figure out what I've done wrong? Maybe an inappropriate use of parentheses?
You can find a code example of my implementation here: http://3v4l.org/5mPDp
Any help appreciated!
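For anyone who wants a reference value to compare against, here is a minimal sketch that recomputes the standard deviation from the full array (brute force) and sets it next to the textbook one-value update (often called Welford's update). The names sd_full() and sd_update() and the sample values are made up for illustration. The update shown is the standard identity, under the assumption that $s is the sample standard deviation (N-1) of the first $n values; it is not necessarily the formula from the linked Math SE answer.

```php
<?php
// Brute-force reference: sample standard deviation (N - 1) of a full array.
// sd_full() is a hypothetical helper name.
function sd_full(array $values): float {
    $n = count($values);
    $mean = array_sum($values) / $n;
    $sumSquares = 0.0;
    foreach ($values as $v) {
        $sumSquares += pow($v - $mean, 2);
    }
    return sqrt($sumSquares / ($n - 1));
}

// Textbook incremental update (hypothetical name sd_update()).
// Assumes $s is the sample SD of the first $n values and $mean is their mean.
function sd_update(float $s, int $n, float $x, float $mean): float {
    $newMean = ($n * $mean + $x) / ($n + 1);
    // new sum of squared deviations
    //   = old sum + (x - old mean) * (x - new mean)
    $sumSquares = ($n - 1) * pow($s, 2) + ($x - $mean) * ($x - $newMean);
    // the new count is n + 1, so the divisor is (n + 1) - 1 = n
    return sqrt($sumSquares / $n);
}

$array = array(1, 5, 16, 3);   // illustrative values only
$x = 7;                        // the value that arrives later
$n = count($array);
$mean = array_sum($array) / $n;
$s = sd_full($array);

$expected = sd_full(array_merge($array, array($x)));  // needs the raw values
$actual = sd_update($s, $n, $x, $mean);               // needs only n, mean, s, x

printf("brute force: %.6f\nincremental: %.6f\n", $expected, $actual);
```

Both lines should print the same value up to floating-point rounding, which gives a quick way to test any candidate update function against the original data while it is still available.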