Question

['Date,Open,High,Low,Close,Volume,Adj Close', 
 '2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69', 
 '2014-02-11,1180.17,1191.87,1172.21,1190.18,2050800,1190.18', 
 '2014-02-10,1171.80,1182.40,1169.02,1172.93,1945200,1172.93', 
 '2014-02-07,1167.63,1177.90,1160.56,1177.44,2636200,1177.44', 
 '2014-02-06,1151.13,1160.16,1147.55,1159.96,1946600,1159.96', 
 '2014-02-05,1143.38,1150.77,1128.02,1143.20,2394500,1143.20', 
 '2014-02-04,1137.99,1155.00,1137.01,1138.16,2811900,1138.16', 
 '2014-02-03,1179.20,1181.72,1132.01,1133.43,4569100,1133.43']

I need to make a namedtuple for each of the lines in this list. The fields would be the words in the first line, 'Date,Open,High,Low,Close,Volume,Adj Close'. I will then be making some calculations and will need to add 2 more fields at the end of each namedtuple. Any help on how I can do this?

Solution 2

Is there any special reason why you want to use namedtuples? If you want to add fields later, maybe you should use a dictionary. If you really want to go the namedtuple way, though, you could use placeholders like this:

from collections import namedtuple

field_names = data[0].replace(" ", "_").lower().split(",")
field_names += ['placeholder_1', 'placeholder_2']
Entry = namedtuple('Entry', field_names)

list_of_named_tuples = []
mock_data = [None, None]
for row in data[1:]:
    row_data = row.split(",") + mock_data
    list_of_named_tuples.append(Entry(*row_data))
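
Since namedtuples are immutable, the placeholders cannot be assigned in place once the calculations are done. One way to fill them is `_replace`, which returns a new tuple with the given fields changed. The following is a minimal sketch on a two-row sample of the data; the field names `day_range` and `change` and the calculations behind them are assumptions made for illustration, not something the question specifies.

```python
from collections import namedtuple

data = ['Date,Open,High,Low,Close,Volume,Adj Close',
        '2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69',
        '2014-02-11,1180.17,1191.87,1172.21,1190.18,2050800,1190.18']

# Header fields plus two hypothetical computed fields (names are assumptions).
field_names = data[0].replace(" ", "_").lower().split(",")
Entry = namedtuple('Entry', field_names + ['day_range', 'change'])

# Build the entries with None in the two extra slots.
entries = [Entry(*(row.split(",") + [None, None])) for row in data[1:]]

# _replace returns a new tuple; the original entry is left unchanged.
filled = [
    e._replace(day_range=float(e.high) - float(e.low),
               change=float(e.close) - float(e.open))
    for e in entries
]
```

Note that `filled` is a new list; `entries` still holds the tuples with `None` in the last two fields.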

If, instead, you want to parse your data into a list of dictionaries (more pythonic IMO) you should do:

field_names = data[0].split(",")
list_of_dicts = [dict(zip(field_names, row.split(','))) for row in data[1:]]
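
As an alternative sketch of the same idea, the standard library's `csv.DictReader` accepts any iterable of lines and uses the first one as the field names, so it can build the dictionaries directly. The two-row `data` sample here is abbreviated from the question.

```python
import csv

data = ['Date,Open,High,Low,Close,Volume,Adj Close',
        '2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69',
        '2014-02-11,1180.17,1191.87,1172.21,1190.18,2050800,1190.18']

# DictReader takes the first line as the header and yields one
# mapping per remaining line; all values are still strings.
list_of_dicts = [dict(row) for row in csv.DictReader(data)]
```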

EDIT: Note that even though you may use dictionaries instead of namedtuples for the small dataset from your example, doing so with large amounts of data will translate into a higher memory footprint for your program.
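
A rough way to see the difference is to compare the container size of one row stored both ways. This is only a sketch: `sys.getsizeof` measures the container itself and not the strings it refers to, and the exact numbers depend on the Python version and platform.

```python
import sys
from collections import namedtuple

header = 'Date,Open,High,Low,Close,Volume,Adj Close'
row = '2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69'

field_names = header.replace(" ", "_").lower().split(",")
Entry = namedtuple('Entry', field_names)

values = row.split(",")
as_tuple = Entry(*values)
as_dict = dict(zip(field_names, values))

# Per-row container overhead only; the field values are shared strings.
tuple_size = sys.getsizeof(as_tuple)
dict_size = sys.getsizeof(as_dict)
print(tuple_size, dict_size)
```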

Other tips

from collections import namedtuple

data = ['Date,Open,High,Low,Close,Volume,Adj Close', 
        '2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69', 
        '2014-02-11,1180.17,1191.87,1172.21,1190.18,2050800,1190.18', 
        '2014-02-10,1171.80,1182.40,1169.02,1172.93,1945200,1172.93', 
        '2014-02-07,1167.63,1177.90,1160.56,1177.44,2636200,1177.44', 
        '2014-02-06,1151.13,1160.16,1147.55,1159.96,1946600,1159.96', 
        '2014-02-05,1143.38,1150.77,1128.02,1143.20,2394500,1143.20', 
        '2014-02-04,1137.99,1155.00,1137.01,1138.16,2811900,1138.16', 
        '2014-02-03,1179.20,1181.72,1132.01,1133.43,4569100,1133.43']


def convert_to_named_tuples(data):
    # get the names for the named tuple  
    field_names = data[0].split(",")
    # these are your two extra custom fields
    field_names.append("extra1")
    field_names.append("extra2")

    # field names can't have spaces in them (they have to be valid Python
    # identifiers, and "Adj Close" isn't)
    field_names = [field_name.replace(" ", "_") for field_name in field_names]

    # you can do this as many times as you like.. 
    # personally I'd do it manually once at the start and just check you're getting 
    # the field names you expect here...  
    ShareData = namedtuple("ShareData", field_names)

    # unpack the data into the named tuples
    share_data_list = []
    for row in data[1:]:
        fields = row.split(",")
        fields += [None, None]

        share_data = ShareData(*fields)
        share_data_list.append(share_data)

    return share_data_list

# check it works..
share_data_list = convert_to_named_tuples(data)

for share_data in share_data_list:
    print(share_data)

Actually, I think this is better, since it converts the fields into the right types. On the downside, it won't take arbitrary data...

from collections import namedtuple
from datetime import datetime 

data = [...same as before...]

field_names = ["Date","Open","High","Low","Close","Volume", "AdjClose", "Extra1", "Extra2"] 
ShareData = namedtuple("ShareData", field_names)

def convert_to_named_tuples(data):
    share_data_list = []
    for row in data[1:]:
        row = row.split(",")

        fields = (datetime.strptime(row[0], "%Y-%m-%d"),  # date
                  float(row[1]), float(row[2]),
                  float(row[3]), float(row[4]),
                  int(row[5]),   # volume
                  float(row[6]), # adj close
                  None, None)    # extras

        share_data = ShareData(*fields)
        share_data_list.append(share_data)

    return share_data_list

# test
share_data_list = convert_to_named_tuples(data)
for share_data in share_data_list:
    print(share_data)

But I agree with the other posts: why use a namedtuple when you can use a class definition?
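
A minimal sketch of what such a class might look like, assuming the same column order as the data in the question. The class layout, the `from_row` helper, and the `extra1`/`extra2` attribute names are illustrative choices, not part of the original answer. Unlike a namedtuple, the extra attributes can simply be assigned after the calculations are done.

```python
from datetime import datetime


class ShareData(object):
    """One row of share data; attributes can be assigned after creation."""

    def __init__(self, date, open_, high, low, close, volume, adj_close):
        self.date = date
        self.open = open_
        self.high = high
        self.low = low
        self.close = close
        self.volume = volume
        self.adj_close = adj_close
        # placeholders for the two calculated values (hypothetical names)
        self.extra1 = None
        self.extra2 = None

    @classmethod
    def from_row(cls, row):
        """Build an instance from one comma-separated data line."""
        f = row.split(",")
        return cls(datetime.strptime(f[0], "%Y-%m-%d"),
                   float(f[1]), float(f[2]), float(f[3]), float(f[4]),
                   int(f[5]), float(f[6]))


share = ShareData.from_row(
    '2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69')
share.extra1 = share.high - share.low  # hypothetical calculation
```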

Why don't you use a dictionary for the data? Adding additional keys is then easy.

# myData is the list of lines from the question
dataList = []
keys = myData[0].split(',')
# skip the header row (myData[0]); it only supplies the keys
for row in myData[1:]:
    tempdict = dict()
    for index, value in enumerate(row.split(',')):
        tempdict[keys[index]] = value
        # if your additional values are going to be determined here then
        # you can do whatever calculations you need and add them;
        # otherwise you can work with this list elsewhere
    dataList.append(tempdict)
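
To illustrate how easy adding keys is, here is a short sketch on a two-row sample of the data. The key names `Range` and `Change` and the calculations behind them are assumptions chosen for the example; the question does not say what the two extra values are.

```python
my_data = ['Date,Open,High,Low,Close,Volume,Adj Close',
           '2014-02-12,1189.00,1190.00,1181.38,1186.69,1724500,1186.69',
           '2014-02-11,1180.17,1191.87,1172.21,1190.18,2050800,1190.18']

keys = my_data[0].split(',')
rows = [dict(zip(keys, line.split(','))) for line in my_data[1:]]

# Add two calculated values to every row (hypothetical key names).
for row in rows:
    row['Range'] = float(row['High']) - float(row['Low'])
    row['Change'] = float(row['Close']) - float(row['Open'])
```
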
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow