Question

I have a file that is updated every 60 seconds with some data. I want to be able to read from the bottom of the file backwards so I can plot out the last 6 hours of data into a graph.

I am able to read the first 360 lines (see below), but this is old data. I want to view the bottom 360 lines of the file, which hold the latest data. I have no idea how to do this. Can anyone help?

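import itertools
import numpy
import matplotlib.dates
import pylab

with open('temperature_logging') as t_in:
    # Reads the first 360 lines (the oldest data).
    temp = numpy.genfromtxt(itertools.islice(t_in, 360), dtype=None, usecols=(0))
    # t_in has already advanced, so this reads the next 360 lines, not the same ones.
    time = numpy.genfromtxt(itertools.islice(t_in, 360), dtype=None, usecols=(1))

dates = matplotlib.dates.datestr2num(time)
pylab.savefig('graph.png')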

Solution

If you don't have to use numpy.genfromtxt(), and the lines of the file are of roughly constant length (or vary only slowly), you could compute an offset and use file.seek() to jump to it. For example, use os.stat() to find the file size, subtract 360 times the average line length plus a cushion, seek to that offset, and read from there.
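The seek-based idea can be sketched as follows. This is only an illustration under stated assumptions: the helper name tail_lines, the default average line length of 16 bytes, and the cushion factor of 2 are all hypothetical and should be tuned to your file. If the first guess turns out too small, the sketch doubles the read size and tries again.

```python
import os

def tail_lines(path, n_lines, avg_line_len=16, cushion=2.0):
    """Return the last n_lines lines of path (n_lines >= 1) as a list of str.

    avg_line_len and cushion are guesses, not measured values.
    """
    size = os.stat(path).st_size
    # Estimated number of bytes holding the last n_lines lines, with padding.
    want = int(n_lines * avg_line_len * cushion) or 1
    while True:
        # Offset to seek to, clamped to the start of the file.
        start = max(0, size - want)
        with open(path, 'rb') as f:
            f.seek(start)
            lines = f.read().decode('utf-8', 'replace').splitlines()
        if start > 0:
            # We most likely landed mid-line, so discard the first fragment.
            lines = lines[1:]
        if len(lines) >= n_lines or start == 0:
            return lines[-n_lines:]
        # The estimate was too small: read twice as much next time.
        want *= 2
```

A call such as tail_lines('temperature_logging', 360) would then return the newest 360 lines. numpy.genfromtxt() accepts a list of strings as input, so the result can be parsed the same way as the file itself.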

Using numpy.genfromtxt(), you can probably use the skip_header parameter to specify how many lines to skip before reading data. Because one line is written per minute, the number of lines in the file is roughly the file's age in minutes. For example, read the first line of the file to find out when the file began, and compute the number of minutes since that time; call it minutesOld. Then pass skip_header = minutesOld - 360 in the genfromtxt() parameter list.

To avoid skipping a large number of lines twice, get the time and temp data in the same genfromtxt() call, for example:

data = numpy.genfromtxt(t_in, dtype=None, usecols=(0,1), skip_header = minutesOld - 360)

You can get the time and temp data in the same call via a usecols setting or via a dtype setting; see the Examples section of the genfromtxt documentation. The code shown below illustrates the former, keeping the last 20 lines (toKeep) rather than 360. The temperature logging file for this example has about 1234 lines in it, and its times don't have dates attached; instead, the hours keep counting up past 24. Adjust the code that computes the number of lines in the file to match your own time-representation conventions.

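from numpy import genfromtxt
from itertools import islice
from time import localtime, time

toKeep = 20  # number of trailing lines (minutes) to keep
with open('temperature_logging') as fin:
    # Read only the first line to learn when the log started.
    # encoding=None (NumPy 1.14+) makes string fields come back as str, not bytes.
    start = genfromtxt(islice(fin, 1), dtype=None, usecols=(0, 1), encoding=None)
    hf, mf = map(int, start.tolist()[0].split(':'))
    ti = localtime(time())
    hn, mn = ti.tm_hour, ti.tm_min
    print('File start: {:02d}:{:02d},  Time Now: {:02d}:{:02d}'.format(hf, mf, hn, mn))
    # One line per minute, so minutes since the start approximates the line count.
    minutesOld = (hn - hf) * 60 + mn - mf
    if minutesOld < 0:
        minutesOld += 24 * 60
    # fin is already past the first line; skip all but the last toKeep lines.
    data = genfromtxt(fin, dtype=None, usecols=(0, 1),
                      skip_header=minutesOld - toKeep, encoding=None)

print(data)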

Here is an example output from the above code:

File start: 03:43,  Time Now: 00:16
[('23:57', 66.3) ('23:58', 66.8) ('23:59', 66.7) ('24:00', 67.1)
 ('24:01', 66.7) ('24:02', 67.1) ('24:03', 66.8) ('24:04', 67.2)
 ('24:05', 67.4) ('24:06', 67.7) ('24:07', 67.3) ('24:08', 67.1)
 ('24:09', 66.8) ('24:10', 67.3) ('24:11', 67.8) ('24:12', 67.3)
 ('24:13', 67.6) ('24:14', 67.6) ('24:15', 67.7) ('24:16', 67.3)]
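
The elapsed-minutes arithmetic above works for time-only stamps in a file less than a day old. If your log carries full date stamps instead, the skip count could be computed along these lines. This is a rough sketch only: the helper name lines_to_skip and the '%Y-%m-%d %H:%M' stamp format are assumptions, not something taken from the original file, and it still assumes one line per minute with no gaps and that the first line has already been consumed.

```python
from datetime import datetime

def lines_to_skip(first_stamp, now, to_keep, fmt='%Y-%m-%d %H:%M'):
    """Return a skip_header value that keeps roughly the last to_keep lines.

    first_stamp is the timestamp text from the first line of the log;
    fmt is a hypothetical format and must match your own file.
    """
    started = datetime.strptime(first_stamp, fmt)
    # Whole minutes elapsed since the first line was written.
    minutes_old = int((now - started).total_seconds() // 60)
    # Never return a negative count for a file shorter than to_keep lines.
    return max(0, minutes_old - to_keep)
```

For a log started at 03:43 and read at 00:16 the next day, the elapsed time is 1233 minutes, the same minutesOld as in the sample run above, so lines_to_skip('2024-01-01 03:43', datetime(2024, 1, 2, 0, 16), 20) gives 1213.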
Licensed under: CC-BY-SA with attribution