Mean by interval of an array, standard deviation in Python (pandas)

https://stackoverflow.com/questions/22572599

19-06-2023

Question

I'd like to calculate the mean and the standard deviation over many consecutive intervals of two related arrays (listed below), where the first two columns are (let's say) time and distance. The third, fourth and fifth columns are the mean (central) time, the mean distance and the standard deviation. (I made this list by hand.) As you can see in the example, the mean and standard deviation are computed for each group of three consecutive values (but in general the groups could be 4 by 4, 10 by 10 and so on).

I have similar long lists and I want to compute something like this (maybe with pandas, NumPy and/or SciPy) using a loop, creating arrays of mean time, mean distance and standard deviation. I would then be able to plot distance versus time, together with the mean values of time and distance and their standard deviation (the error, known as sigma).

1  1   2  4.6   3.29
2  4   5  25.6  8.17
3  9   8  64.6  13.07
4  16  11 121.6 17.96
5  25  14 196.6 22.86
6  36  17 289.6 27.76
7  49  20 400.6 32.66
8  64
9  81
10 100
11 121
12 144
13 169
14 196
15 225
16 256
17 289
18 324
19 361
20 400
21 441

I plotted this using errorbar, but my problem is how to write the loop over each interval.


Answer

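You can do this with NumPy, without an explicit loop. reshape(-1, 3) groups the data into rows of three consecutive samples, and the statistics are then computed per row with axis=1. This requires the array length to be a multiple of the interval size (here 21 = 7 × 3):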

import numpy as np

# data
time = np.arange(1.0, 22.0)
distance = time ** 2

# group data into chunks of 3 (one chunk per row) to get stats
meanTime = np.mean(time.reshape(-1, 3), axis=1)
meanDistance = np.mean(distance.reshape(-1, 3), axis=1)
std = np.std(distance.reshape(-1, 3), axis=1)
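
The question also asks about other interval sizes (4 by 4, 10 by 10, ...), where the array length may not divide evenly. A minimal sketch of one way to generalize the reshape approach follows. The helper name interval_stats and the choice to drop leftover samples that do not fill a whole interval are assumptions made for illustration, not part of the original answer.

```python
import numpy as np

def interval_stats(time, distance, n):
    """Mean time, mean distance and std of distance over consecutive blocks of n samples.

    Hypothetical helper: the name and the trimming policy are illustrative choices.
    """
    time = np.asarray(time, dtype=float)
    distance = np.asarray(distance, dtype=float)
    # Assumption: trailing samples that do not fill a whole block are dropped.
    usable = (len(time) // n) * n
    t_blocks = time[:usable].reshape(-1, n)
    d_blocks = distance[:usable].reshape(-1, n)
    # ndarray.std uses the population formula (ddof=0) by default;
    # pass ddof=1 to std() if the sample estimate is wanted instead.
    return t_blocks.mean(axis=1), d_blocks.mean(axis=1), d_blocks.std(axis=1)

# Same data as above, grouped 4 by 4: 21 samples give 5 full blocks, 1 sample is dropped.
time = np.arange(1.0, 22.0)
distance = time ** 2
mean_time, mean_distance, std = interval_stats(time, distance, 4)
```

The three returned arrays can then be passed to matplotlib, for example plt.errorbar(mean_time, mean_distance, yerr=std, fmt='o'), to draw the interval means with their error bars on top of the raw distance-versus-time curve.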

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow