Pregunta

I have a data file that has values in it like this:

@ DD MM YYYY HH MN SS Hs Hrms Hma x Tz Ts Tc THmax EP S T0 2 Tp Hrms EPS

29 11 2000 13 17 56 2.44 1.71 3.12 9.12 11.94 5.03 12.74 .83 8.95 15.03 1.80 .86
29 11 2000 13 31 16 2.43 1.74 4.16 9.17 11.30 4.96 11.70 .84 8.84 11.86 1.80 .87

I use the following to get the data in:

infile = open ("testfile.txt", 'r')
data = np.genfromtxt(infile,skiprows=2) 

which gives me a numpy.ndarray

I want to be able to interpret the first 0-5 columns as a timestamp (DD:MM:YYY:HH:MN:SS), but this is where I get stumped - there seems to be a million ways to do it and I don't know what's best.

I've been looking at dateutil and pandas - I know there is something blindingly obvious I should do, but am at a loss. Should I convert to a csv format first? Somehow concatenate the values from each row (cols 0-5) using a for loop?

After this I'll plot values from other columns against the timestamps/deltas.

I'm totally new to python, so any pointers appreciated :)

¿Fue útil?

Solución

Here's a pandas solution for you:

test.csv:

29 11 2000 13 17 56 2.44 1.71 3.12 9.12 11.94 5.03 12.74 .83 8.95 15.03 1.80 .86
29 11 2000 13 31 16 2.43 1.74 4.16 9.17 11.30 4.96 11.70 .84 8.84 11.86 1.80 .87

pandas provide a read_csv util for reading the csv, you should give the following parameters to parse your file:

  1. delimiter: the default one is comma, so you need to set it as a space
  2. parse_dates: those date columns (order sensitive)
  3. date_parser: the default is dateutil.parser.parse, but seems it doesn't work for your case, so you should implement your own parser
  4. header: if your csv doesn't have the column name, you should set it as None

Finally, here the sample code:

In [131]: import datetime as dt

In [132]: import pandas as pd

In [133]: pd.read_csv('test.csv', 
                       parse_dates=[[2,1,0,3,4,5]], 
                       date_parser=lambda *arr:dt.datetime(*[int(x) for x in arr]),
                       delimiter=' ', 
                       header=None)
Out[133]:
          2_1_0_3_4_5     6     7     8     9     10    11     12    13    14  \
0 2000-11-29 13:17:56  2.44  1.71  3.12  9.12  11.94  5.03  12.74  0.83  8.95
1 2000-11-29 13:31:16  2.43  1.74  4.16  9.17  11.30  4.96  11.70  0.84  8.84

      15   16    17
0  15.03  1.8  0.86
1  11.86  1.8  0.87

Otros consejos

This is how I would do it:

from datetime import datetime

# assuming you have a row of the data in a list like this
# (also works on ndarrays in numpy, but you need to keep track of the row, 
#  so let's assume you've extracted a row like the one below...)
rowData = [29, 11, 2000, 13, 17, 56, 2.44, 1.71, 3.12, 9.12, 11.94, 5.03, 12.74, 0.83, 8.95, 15.03, 1.8, 0.86] 

# unpack the first six values
day, month, year, hour, min, sec = rowData[:6] 
# create a datetime based on the unpacked values
theDate = datetime(year,month,day,hour,min,sec)

No need to convert the data to a string and parse that. Might be good to check out the datetime documentation.

I barely know anything about numpy, but you can use the datetime module to convert the dates into a date object:

import datetime
line = "29 11 2000 13 17 56 2.44 1.71 3.12 9.12 11.94 5.03 12.74 .83 8.95 15.03 1.80 .86"
times = line.split()[:6]

Now from here you have two options:

print ':'.join(times)
# 29:11:2000:13:17:56

Or, as I said before, use the datetime module:

mydate = datetime.datetime.strptime(':'.join(times), '%d:%m:%Y:%H:%M:%S')
print datetime.datetime.strftime(mydate, '%d:%m:%Y:%H:%M:%S')
# 29:11:2000:13:17:56

Of course, you're probably thinking that the second option is useless, but if you want more information from the dates (i.e like the year), then it's probably better to convert it to a datetime object.

import datetime
import re

import numpy as np

def convert_to_datetime(x):
    return datetime.datetime.strptime(x, '%d:%m:%Y:%H:%M:%S')

infile = open("testfile.txt", 'r')
infile = (re.sub(r'^(\d+) (\d+) (\d+) (\d+) (\d+) (\d+)', r'\1:\2:\3:\4:\5:\6', line, 1) for line in infile)
data = np.genfromtxt(infile, skiprows=2, converters={0: convert_to_datetime})
Licenciado bajo: CC-BY-SA con atribución
No afiliado a StackOverflow
scroll top