Changing the values of a list if they are more than a certain value

https://stackoverflow.com/questions/16944445

31-05-2022

Question

I am reading in a list from a text file and taking the standard deviation of this list. I want to know how to make values that lie more than one standard deviation from the mean be treated as exactly one standard deviation from the mean. Here is the part of my code I am working with:

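import numpy as np

a = np.genfromtxt('meanvals2.txt').T[1]
b = np.std(a)   # standard deviation
c = np.mean(a)  # mean
ok = (a > (c - b)) & (a < (c + b))  # True where a value is within one standard deviation of the mean
h = a[ok]       # keeps only the values inside that range
print(h)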

This code just deletes any values outside one standard deviation. How would I change it so that these deleted values are capped at one standard deviation from the mean but kept in the data set?

For example, if my list was [1,2,3,4,5,20], the standard deviation is about 7.08 and the mean is about 5.83, so one standard deviation away from the mean is about 12.92 or -1.25. Currently my code would exclude any numbers outside this range, so the list would be [1,2,3,4,5], but I want the list to actually read [1,2,3,4,5,12.92]. How would I do this?

Solution

I think I would do this in two steps:

import numpy as np

a = np.genfromtxt('meanvals2.txt').T[1]
b = np.std(a)
c = np.mean(a)

# step 1: raise values more than 1 std below the mean up to (mean - std)
ok = a > (c - b)
a[~ok] = c - b

# step 2: lower values more than 1 std above the mean down to (mean + std)
ok = a < (c + b)
a[~ok] = c + b

print(a)

Of course, if you really want a separate array h, you could do h = a.copy() and then work with h instead of a.
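
As an alternative to the two masking steps, the same capping can probably be written as one call to np.clip, which returns a new array and leaves the original untouched. This is only a sketch of that alternative, not the method used in this answer; it assumes the question's sample values in place of the text file, and the names mean, std and h are chosen here for illustration.

```python
import numpy as np

# Sample data from the question; a float dtype so the capped value is not truncated.
a = np.array([1, 2, 3, 4, 5, 20], dtype=float)

mean = np.mean(a)
std = np.std(a)  # NumPy's default (ddof=0), the same call the answer uses

# Values below (mean - std) become (mean - std), values above (mean + std)
# become (mean + std); everything in between is copied unchanged.
h = np.clip(a, mean - std, mean + std)

print(h)  # capped copy
print(a)  # original array, still containing 20
```

Because np.clip builds a new array, no explicit a.copy() is needed when both the original and the capped data should be kept.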

Using your data as an example:

>>> a = np.array([1,2,3,4,5,20],dtype=np.float32)
>>> b = np.std(a)
>>> c = np.mean(a)
>>> print b
6.46572151487
>>> print c
5.83333333333
>>> ok = a > (c - b)
>>> a[~ok] = c - b
>>> ok = a < (c + b)
>>> a[~ok] = c + b
>>> print a
[  1.          2.          3.          4.          5.         12.2990551]
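
Note that the standard deviation printed above (6.47) differs from the 7.08 quoted in the question. A likely explanation, offered here as an assumption about how the question's figure was computed, is that 7.08 is the sample standard deviation (dividing by N - 1), while np.std defaults to the population form (dividing by N). The ddof argument of np.std switches between the two, as this small sketch shows:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 20], dtype=float)

pop_std = np.std(a)             # ddof=0 (default): divide by N
sample_std = np.std(a, ddof=1)  # ddof=1: divide by N - 1

print(round(pop_std, 2))     # 6.47
print(round(sample_std, 2))  # 7.08
```

If the sample form is the one you want, pass ddof=1 when computing b and the rest of the capping code stays the same.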

Licensed under: CC-BY-SA with attribution
