Question

I have this section of code that is supposed to find the standard deviation of each number in A, where A is a list of lists consisting of 7 values.

def sigma(A):
    diff = 0
    positives = [b for b in A if b >= 0]
    if positives:
        mean = sum(positives) / len(positives)
        for i in positives:  
            diff = ((sum([abs(i - mean)**2 for i in positives]))/(len(positives)))**(0.5)
            return diff
    else:
        return 0

G = map(sigma, zip(*A))
print G

This correctly gives me the standard deviation for the first list of 7 numbers, but shouldn't map(sigma, zip(*A)) make it iterate over all the lists? I have also tried [sigma(A) for col in xrange(len(rows[0]))], but that did not work either. Ideally the standard deviations would also be saved as a list of lists of seven. Any help is appreciated.

UPDATE: This is the code that I have now:

def sigma(A):
    diff = 0
    positives = [b for b in A if b >= 0]
    if positives:
        mean = sum(positives) / len(positives)
        diff += ((sum([abs(i - mean)**2 for i in positives]))/(len(positives)))**(0.5)
        for i in positives:
            if (abs(i - mean)) > (diff*3):
                return -9999.00
            else:
                return i

    else:
        return -9999.00

G = map(sigma, zip(*A))
print G

It does everything I want, but when I run it this way it only outputs a value for the first row. If the 'return' statements are replaced with 'print' and print G is removed, the outputs I want for all the rows are printed. How can I store all these values in a list? I'm assuming the line G = map(sigma, zip(*A)) is the problem. I tried changing it to G = map(sigma, A), but that only gives me the numbers for the first column. Does anyone have any ideas?

Solution

positives = [b for b in A if b >= 0] doesn't do what you think it does. If A is the full list of lists, b is a list of 7 elements; how can a list of 7 elements be greater than or equal to 0?

numpy makes this easy:

import numpy as np

# -9999.0 marks a missing value
A = [[-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
    [-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
    [0.040896, 0.018690, 0.005620, -9999.0, 0.038722, 0.018323, -9999.0],
    [0.039443, 0.017517, 0.003460, -9999.0, 0.035526, 0.011692, -9999.0],
    [-9999.0, 0.017963, 0.005264, -9999.0, 0.03788, 0.014316, -9999.0]]
A = np.array(A)

sigmas = []
for b in A:
    # boolean indexing keeps only the non-negative entries of the row
    b = b[b >= 0]
    print b
    # np.std is the population standard deviation; report 0.0 for an empty row
    sigmas.append(np.std(b) if b.size else 0.0)

gives

[]
[]
[ 0.040896  0.01869   0.00562   0.038722  0.018323]
[ 0.039443  0.017517  0.00346   0.035526  0.011692]
[ 0.017963  0.005264  0.03788   0.014316]

>>> sigmas
[0.0, 0.0, 0.013412289355661845, 0.013828802328473713, 0.011917047544903896]

edit: in response to comment

>>> A=[[1,2,3,4,5,6,7],[2,-3,4,-3,2,1,-9]]
>>> [b for b in A if b>=0]
[[1, 2, 3, 4, 5, 6, 7], [2, -3, 4, -3, 2, 1, -9]]

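Python 2 doesn't give you an error, but it is not comparing the elements in b to 0. It compares the list b itself with the number 0, and Python 2 orders objects of different types by an arbitrary but consistent rule under which a list always compares greater than a number.

Here you can see explicitly what is happening:

>>> [] >= 0
True
>>> [-5] >= 0
True

For every list of 7 numbers b in A the test b >= 0 is therefore True, whatever the list contains. (Python 3 raises a TypeError for this comparison instead.)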
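
To make the difference between testing a whole row and testing its elements concrete, here is a minimal sketch. It is written in Python 3 syntax (an assumption, since the code above is Python 2), where the mixed list/number comparison raises TypeError instead of returning True. The names rows_kept and positives_per_row are illustrative only.

```python
# Two-row sample with the same values as the interactive example above.
A = [[1, 2, 3, 4, 5, 6, 7], [2, -3, 4, -3, 2, 1, -9]]

# Comparing a whole row with 0 is not an element-wise test.
# Python 3 rejects it outright rather than silently returning True.
try:
    rows_kept = [b for b in A if b >= 0]
except TypeError:
    rows_kept = None

# Filtering inside each row compares the numbers themselves.
positives_per_row = [[x for x in row if x >= 0] for row in A]
print(positives_per_row)  # [[1, 2, 3, 4, 5, 6, 7], [2, 4, 2, 1]]
```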

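edit2: I'm an idiot and see now that you were using map, which already passes one list at a time to sigma, so the problem I was talking about is avoided. zip(*A) transposes A, so sigma receives the columns; to process the rows, just change G = map(sigma, zip(*A)) to G = map(sigma, A)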
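
As a small illustration of what each call hands to the mapped function, the sketch below uses a made-up 2x3 list and the built-in sum in place of sigma. It is written for Python 3, where map returns an iterator, hence the list(...) wrappers.

```python
# Hypothetical 2x3 sample, not the data from the question.
A = [[1, 2, 3],
     [4, 5, 6]]

# map over A: the function sees one row at a time.
row_sums = list(map(sum, A))

# map over zip(*A): zip(*A) transposes A, so the function sees columns.
column_sums = list(map(sum, zip(*A)))

print(row_sums)     # [6, 15]
print(column_sums)  # [5, 7, 9]
```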

edit3: You were returning i instead of diff. Here is the code:

def sigma(A):
    positives = [b for b in A if b >= 0]
    if positives:
        mean = sum(positives) / len(positives)
        diff = ((sum([abs(i - mean)**2 for i in positives]))/(len(positives)))**(0.5)
        for i in positives:
            if (abs(i - mean)) > (diff*3):
                return -9999.00
        return diff
    else:
        return -9999.00

A = [[-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
    [-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
    [0.040896, 0.018690, 0.005620, -9999.0, 0.038722, 0.018323, -9999.0],
    [0.039443, 0.017517, 0.003460, -9999.0, 0.035526, 0.011692, -9999.0],
    [-9999.0, 0.017963, 0.005264, -9999.0, 0.03788, 0.014316, -9999.0]]

G = map(sigma, A)

which gives:

>>> G
[-9999.0, -9999.0, 0.013412289355661845, 0.013828802328473713, 0.011917047544903896]
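
The same per-row calculation can also be sketched with the standard library instead of the hand-written formula. This assumes Python 3.4 or later, where statistics.pstdev computes the population standard deviation (dividing by n, as the code above does). The constant MISSING is a name introduced here for the -9999.0 sentinel.

```python
from statistics import pstdev

MISSING = -9999.0  # sentinel for missing data, as in the question

def sigma(row):
    """Population standard deviation of the non-negative values in row."""
    positives = [x for x in row if x >= 0]
    if not positives:
        return MISSING
    mean = sum(positives) / len(positives)
    sd = pstdev(positives)
    # Same 3-sigma rejection rule as the code above.
    if any(abs(x - mean) > 3 * sd for x in positives):
        return MISSING
    return sd

A = [[-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
     [-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
     [0.040896, 0.018690, 0.005620, -9999.0, 0.038722, 0.018323, -9999.0],
     [0.039443, 0.017517, 0.003460, -9999.0, 0.035526, 0.011692, -9999.0],
     [-9999.0, 0.017963, 0.005264, -9999.0, 0.03788, 0.014316, -9999.0]]

# A list comprehension gives a real list in both Python 2 and Python 3.
G = [sigma(row) for row in A]
print(G)
```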

edit4: clarified problem

def sigma(A):
    positives = [b for b in A if b >= 0]
    sq_err=[]
    if positives:
        mean = sum(positives) / len(positives)
        diff = ((sum([abs(i - mean)**2 for i in positives]))/(len(positives)))**(0.5)
        for i in positives:
            if (abs(i - mean)) > (diff*3):
                sq_err.append(-9999.00)
            else:
                sq_err.append(i)
    else:
        return [-9999.00]
    return sq_err

A = [[-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
    [-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
    [0.040896, 0.018690, 0.005620, -9999.0, 0.038722, 0.018323, -9999.0],
    [0.039443, 0.017517, 0.003460, -9999.0, 0.035526, 0.011692, -9999.0],
    [-9999.0, 0.017963, 0.005264, -9999.0, 0.03788, 0.014316, -9999.0]]

G = map(sigma, A)

gives

>>> G
[[-9999.0], [-9999.0], [0.040896, 0.01869, 0.00562, 0.038722, 0.018323], [0.039443, 0.017517, 0.00346, 0.035526, 0.011692], [0.017963, 0.005264, 0.03788, 0.014316]]
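
The rows above come back with different lengths because the missing entries are dropped. If the goal is a list of lists of seven, as the question mentions, one possible variant keeps every value in its original position and writes the sentinel wherever a value is missing or rejected. This is a sketch under that assumption, in Python 3; flag_outliers, MISSING and n_sigma are names introduced here.

```python
from statistics import pstdev

MISSING = -9999.0  # sentinel for missing or rejected values

def flag_outliers(row, n_sigma=3):
    """Return a copy of row, same length, with outliers set to MISSING."""
    valid = [x for x in row if x >= 0]
    if not valid:
        return [MISSING] * len(row)
    mean = sum(valid) / len(valid)
    limit = n_sigma * pstdev(valid)  # population standard deviation
    # Keep a value only if it is valid and within the limit of the mean.
    return [x if x >= 0 and abs(x - mean) <= limit else MISSING
            for x in row]

A = [[-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
     [-9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0, -9999.0],
     [0.040896, 0.018690, 0.005620, -9999.0, 0.038722, 0.018323, -9999.0],
     [0.039443, 0.017517, 0.003460, -9999.0, 0.035526, 0.011692, -9999.0],
     [-9999.0, 0.017963, 0.005264, -9999.0, 0.03788, 0.014316, -9999.0]]

G = [flag_outliers(row) for row in A]
for row in G:
    print(row)
```

With this sample every row comes back unchanged. With the population standard deviation, no value among n can lie more than sqrt(n - 1) standard deviations from the mean (about 2.45 for n = 7), so a 3-sigma test cannot reject anything in rows of seven or fewer values. A smaller n_sigma may be needed if rejection is actually wanted.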
Licensed under: CC-BY-SA with attribution