Question

I have a txt file that looks like this:

0.065998       81   
0.319601      81   
0.539613      81  
0.768445      81  
1.671893      81  
1.785064      81  
1.881242      954  
1.921503      193  
1.921605      188  
1.943166      81  
2.122283      63  
2.127669      83  
2.444705      81  

The first column is the packet arrival time in seconds and the second is the packet size in bytes.

I need the average number of bytes in each second. For example, in the first second I only have packets of size 81, so the average bitrate is 81*8 = 648 bit/s. Then I want to plot a graph with time in seconds on the x axis and the average bitrate for each second on the y axis.

So far I have only managed to load my data as arrays:

import numpy as np

# Column 0: arrival time in seconds; column 1: packet size in bytes.
d = np.genfromtxt('data.txt')

x = d[:, 0]
y = d[:, 1]

print(x)
print(y * 8)

I'm new to Python, so any help where to start would be much appreciated!

Here is the resulting script:

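import matplotlib.pyplot as plt
import numpy as np

x, y = np.loadtxt('data.txt', unpack=True)
bins = np.arange(60 + 1)  # one-second bins covering 0..60 s
totals, edges = np.histogram(x, weights=y, bins=bins)  # bytes per bin
counts, edges = np.histogram(x, bins=bins)  # packets per bin

# Mean packet size per second in kbit (bytes * 8 / 1000).
# Seconds with no packets give 0/0 = NaN (NumPy warns); the plot shows a gap there.
average = totals * 0.008 / counts

print(counts)
print(average)

plt.plot(average, 'r')
plt.xlabel('time, s')
plt.ylabel('kbit/s')
plt.grid(True)
plt.xlim(0.0, 60.0)
plt.show()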

The script reads the .txt file, which contains packet arrival times and packet sizes in bytes, and plots the average bitrate for each second (the mean packet size in kbit, as defined above) over a 60-second period. I use it to monitor a server's incoming and outgoing traffic.


Solution 2

If you want to use numpy, you can use numpy.histogram:

>>> import numpy as np
>>> x, y = np.loadtxt('data.txt', unpack=True)
>>> bins = np.arange(10+1)
>>> totals, edges = np.histogram(x, weights=y, bins=bins)
>>> totals
array([  324.,  1578.,   227.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.])

This gives the total in each bin, and you could divide by the width of the bin to get an approximate instantaneous rate:

>>> totals/np.diff(bins)
array([  324.,  1578.,   227.,     0.,     0.,     0.,     0.,     0.,
           0.,     0.])

(Okay, since the bin widths were all one, that isn't very interesting.)
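
Dividing by the bin width only changes the numbers once the bins are not one second wide. A minimal sketch with half-second bins, using the sample rows from the question as inline arrays instead of reading data.txt (the 0.5 s width is just an assumption for illustration):

```python
import numpy as np

# Sample rows from the question: arrival time (s) and packet size (bytes).
x = np.array([0.065998, 0.319601, 0.539613, 0.768445, 1.671893, 1.785064,
              1.881242, 1.921503, 1.921605, 1.943166, 2.122283, 2.127669,
              2.444705])
y = np.array([81, 81, 81, 81, 81, 81, 954, 193, 188, 81, 63, 83, 81],
             dtype=float)

# Half-second bin edges: 0.0, 0.5, ..., 3.0 (illustrative choice).
bins = np.arange(0.0, 3.0 + 0.5, 0.5)
totals, edges = np.histogram(x, weights=y, bins=bins)

# Bytes in each bin divided by the bin width gives bytes per second.
rate = totals / np.diff(bins)
print(rate)
```

Here each total is divided by 0.5, so a bin holding 162 bytes corresponds to a rate of 324 bytes per second.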

[update]

I'm not sure I understand your follow-up comment that you need the average packet size in each second. I don't see that mentioned anywhere in your question, but I'm notorious for missing the obvious. :-/ In any case, if you want the number of packets in a time bin, simply leave out the weights (each sample then counts as 1):

>>> counts, edges = np.histogram(x, bins=bins)
>>> counts
array([4, 6, 3, 0, 0, 0, 0, 0, 0, 0])

where counts is the number of packets which arrived in each bin.
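
If the goal is the average packet size per second, the two histograms can be combined by dividing totals by counts. The sketch below is one way to do that while avoiding a division by zero in empty bins; it uses the sample rows from the question as inline arrays, and the names mean_bytes and mean_bits are only illustrative:

```python
import numpy as np

# Sample rows from the question: arrival time (s) and packet size (bytes).
x = np.array([0.065998, 0.319601, 0.539613, 0.768445, 1.671893, 1.785064,
              1.881242, 1.921503, 1.921605, 1.943166, 2.122283, 2.127669,
              2.444705])
y = np.array([81, 81, 81, 81, 81, 81, 954, 193, 188, 81, 63, 83, 81],
             dtype=float)

bins = np.arange(10 + 1)
totals, _ = np.histogram(x, weights=y, bins=bins)  # bytes per second
counts, _ = np.histogram(x, bins=bins)             # packets per second

# Divide only where a bin has packets; empty bins keep the 0 from `out`.
mean_bytes = np.divide(totals, counts,
                       out=np.zeros_like(totals), where=counts > 0)
mean_bits = mean_bytes * 8
print(mean_bits)
```

For the first second this gives 81 * 8 = 648 bits, matching the example in the question. Whether empty seconds should read as 0 or as missing values depends on how the plot is meant to look.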

Other tips

Your data is already sorted by time so I might just use itertools.groupby for this one:

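from itertools import groupby

with open('data.txt') as d:
    # Parse each line into [arrival_time, packet_size].
    data = ([float(x) for x in line.split()] for line in d)
    # Group consecutive rows by the whole-second part of the arrival time.
    for i_time, packet_info in groupby(data, key=lambda x: int(x[0])):
        print(i_time, sum(x[1] for x in packet_info))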

output is:

0 324.0
1 1578.0
2 227.0
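
Note that groupby only merges consecutive rows that share a key, which is why the sorted input matters. If the file might not be sorted by time, accumulating into a dictionary is one alternative. A rough sketch, using a few rows from the question deliberately shuffled (the rows list stands in for the parsed file):

```python
from collections import defaultdict

# A few rows from the question, deliberately out of time order.
rows = [(1.921503, 193), (0.065998, 81), (1.671893, 81), (0.319601, 81)]

# Total bytes keyed by the whole-second part of the arrival time.
totals = defaultdict(float)
for arrival, size in rows:
    totals[int(arrival)] += size

for second in sorted(totals):
    print(second, totals[second])
```

This trades the streaming behaviour of groupby for independence from row order.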

Since the arrival times are irregular, I recommend quantizing them into integer numbers of seconds, and then aggregating total bytes for all arrivals for a given second. With this done, plotting and other analysis gets a lot easier.
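
One possible way to do the quantizing and aggregating with NumPy is np.floor followed by np.bincount. This is a sketch under the assumption that arrival times are non-negative, with the sample rows from the question as inline arrays:

```python
import numpy as np

# Sample rows from the question: arrival time (s) and packet size (bytes).
x = np.array([0.065998, 0.319601, 0.539613, 0.768445, 1.671893, 1.785064,
              1.881242, 1.921503, 1.921605, 1.943166, 2.122283, 2.127669,
              2.444705])
y = np.array([81, 81, 81, 81, 81, 81, 954, 193, 188, 81, 63, 83, 81],
             dtype=float)

# Quantize each arrival time down to its integer second.
seconds = np.floor(x).astype(int)

# Sum packet sizes per second; index i holds the bytes that arrived in second i.
bytes_per_second = np.bincount(seconds, weights=y)
bits_per_second = bytes_per_second * 8
print(bits_per_second)
```

The resulting array is indexed by second, so it can be passed directly to a plotting call such as plt.plot.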

Licensed under: CC-BY-SA with attribution