Question

This is my input:

ClientData = {
'ClientName1': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2)],

           'aggregate_Pageviews_VisitsByWeek': [],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                         ('2013-05-12', 1)]

                                       },


'ClientName2': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2),
                                       ('2013-03-24', 1),
      ],
           'aggregate_Pageviews_VisitsByWeek': [],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                      ('2013-03-31', 1),
                                      ('2013-05-12', 1),
                                      ('2013-05-19', 2),
                                      ('2013-06-30', 2)]
                                       }

}

How can I fill the key 'aggregate_Pageviews_VisitsByWeek' with the union of 'aggregate_PageviewsByWeek' and 'aggregate_VisitsByWeek', matched on the date?

The output should look something like this:

{
'ClientName1': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2)],

           'aggregate_Pageviews_VisitsByWeek': [

                                               ('2013-01-06', 2, 0),
                                               ('2013-02-03', 1, 0),
                                               ('2013-02-10', 1, 0),
                                               ('2013-02-24', 1, 0),
                                               ('2013-03-03', 2, 1),
                                               ('2013-05-12', 0, 1)],
           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                         ('2013-05-12', 1)]

                                       },



'ClientName2': {
           'aggregate_PageviewsByWeek': [('2013-01-06', 2),
                                       ('2013-02-03', 1),
                                       ('2013-02-10', 1),
                                       ('2013-02-24', 1),
                                       ('2013-03-03', 2),
                                       ('2013-03-24', 1)],

           'aggregate_Pageviews_VisitsByWeek': [
                                       ('2013-01-06', 2, 0),
                                       ('2013-02-03', 1, 0),
                                       ('2013-02-10', 1, 0),
                                       ('2013-02-24', 1, 0),
                                       ('2013-03-03', 2, 1),
                                       ('2013-03-24', 1, 0),
                                       ('2013-03-31', 0, 1),
                                       ('2013-05-12', 0, 1),
                                       ('2013-05-19', 0, 2),
                                       ('2013-06-30', 0, 2)],

           'aggregate_VisitsByWeek': [('2013-03-03', 1),
                                      ('2013-03-31', 1),
                                      ('2013-05-12', 1),
                                      ('2013-05-19', 2),
                                      ('2013-06-30', 2)]
                                       }

}

If a key (the date, in this case) is missing from the other list, I want to use 0 for that value. Each tuple has the form (Date, aggregate_PageviewsByWeek_Value, aggregate_VisitsByWeek_Value).

Example:
aggregate_PageviewsByWeek: [('2013-01-06', 12)] and aggregate_VisitsByWeek: [('2013-01-13', 30)]

The output will be:
aggregate_Pageviews_VisitsByWeek: [('2013-01-06', 12, 0), ('2013-01-13', 0, 30)]

My goal with this question is to see the trends of page views and visits by date.

Thanks!


Solution

Convert each list to a dict, combine the keys of both dicts, then loop through the sorted keys and build a new list in which each element holds the date, the value from the first dict (or 0), and the value from the second dict (or 0). It is better explained via code :)

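def merge_lists(list1, list2):
    # Index each list by date for quick lookups.
    dict1 = dict(list1)
    dict2 = dict(list2)
    # Union of the dates from both lists, in chronological order
    # (ISO 'YYYY-MM-DD' strings sort correctly as plain strings).
    dates = sorted(set(dict1) | set(dict2))
    merged_list = []
    for date in dates:
        # A date missing from one list contributes 0 for that list.
        merged_list.append([date, dict1.get(date, 0), dict2.get(date, 0)])

    return merged_list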

merged_list = merge_lists([('2013-01-06', 2),
            ('2013-02-03', 1),
            ('2013-02-10', 1),
            ('2013-02-24', 1),
            ('2013-03-03', 2),
            ('2013-03-24', 1)],
            [('2013-03-03', 1),
            ('2013-03-31', 1),
            ('2013-05-12', 1),
            ('2013-05-19', 2),
            ('2013-06-30', 2)])


import pprint
pprint.pprint(merged_list)

Output:

[['2013-01-06', 2, 0],
 ['2013-02-03', 1, 0],
 ['2013-02-10', 1, 0],
 ['2013-02-24', 1, 0],
 ['2013-03-03', 2, 1],
 ['2013-03-24', 1, 0],
 ['2013-03-31', 0, 1],
 ['2013-05-12', 0, 1],
 ['2013-05-19', 0, 2],
 ['2013-06-30', 0, 2]]

You can make it generic so that it merges any number of lists:

def merge_lists(*lists):
    dicts = [dict(l) for l in lists]
    dates = set()
    for d in dicts:
        dates |= set(d.keys())
    dates = list(dates)
    dates.sort()
    merged_list = []
    for date in dates:
        item = [date]
        for d in dicts:
            item.append(d.get(date,0))
        merged_list.append(item)

    return merged_list
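
To store the result back under 'aggregate_Pageviews_VisitsByWeek' as tuples, which is the shape the question asks for, something like the following might work. This is only a sketch: merge_series and client_data are names made up for the example, and the sample data is the small example from the question rather than the full input.

```python
def merge_series(*series):
    """Merge any number of [(date, value), ...] lists into sorted tuples."""
    dicts = [dict(s) for s in series]
    # Union of all dates; ISO 'YYYY-MM-DD' strings sort chronologically.
    dates = sorted(set().union(*dicts))
    return [(date,) + tuple(d.get(date, 0) for d in dicts) for date in dates]


# Hypothetical sample in the same shape as the question's ClientData.
client_data = {
    'ClientName1': {
        'aggregate_PageviewsByWeek': [('2013-01-06', 12)],
        'aggregate_Pageviews_VisitsByWeek': [],
        'aggregate_VisitsByWeek': [('2013-01-13', 30)],
    },
}

# Overwrite the empty list of each client with the merged tuples.
for client in client_data.values():
    client['aggregate_Pageviews_VisitsByWeek'] = merge_series(
        client['aggregate_PageviewsByWeek'],
        client['aggregate_VisitsByWeek'],
    )

print(client_data['ClientName1']['aggregate_Pageviews_VisitsByWeek'])
# [('2013-01-06', 12, 0), ('2013-01-13', 0, 30)]
```

The order of the arguments decides the order of the values in each tuple, so pageviews come first and visits second here.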

Other tips

First, you need a function that merges a single client's entries.

There are two easy ways to merge parallel sequences that might each be missing some values: You can iterate the two in parallel, or you can build a dictionary (or sorted map) of keys, and just handle each sequence separately. You can see an example of the first, e.g., here. But the second is simpler, at least in Python, so long as the keys are hashable. So:

def merge_client(client):
    merged = {}
    for day, views in client['aggregate_PageviewsByWeek']:
        merged[day] = [views, 0]
    for day, visits in client['aggregate_VisitsByWeek']:
        merged.setdefault(day, [0, 0])[1] = visits
    flattened = [tuple([key] + value) for key, value in merged.items()]
    client['aggregate_Pageviews_VisitsByWeek'] = sorted(flattened)

To extend this algorithm to more than two entries, you'd use append. If there may be a huge number of entries, just use a dict instead of a list for each day, so that you don't have to fill in all those default 0's until the end.
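
A rough sketch of that dict-based variant is below. The function name merge_client_series and its parameters source_keys and target_key are invented for illustration; only the values that actually occur are stored, and the 0's are filled in when the tuples are built.

```python
def merge_client_series(client, source_keys, target_key):
    """Merge the series named in source_keys into client[target_key]."""
    merged = {}  # day -> {series index: value}; only seen values are stored
    for index, key in enumerate(source_keys):
        for day, value in client[key]:
            merged.setdefault(day, {})[index] = value
    # Fill in the default 0's only now, while flattening to tuples.
    client[target_key] = [
        (day,) + tuple(merged[day].get(i, 0) for i in range(len(source_keys)))
        for day in sorted(merged)
    ]


# Hypothetical single client, using the small example from the question.
client = {
    'aggregate_PageviewsByWeek': [('2013-01-06', 12)],
    'aggregate_VisitsByWeek': [('2013-01-13', 30)],
    'aggregate_Pageviews_VisitsByWeek': [],
}
merge_client_series(
    client,
    ['aggregate_PageviewsByWeek', 'aggregate_VisitsByWeek'],
    'aggregate_Pageviews_VisitsByWeek',
)
print(client['aggregate_Pageviews_VisitsByWeek'])
# [('2013-01-06', 12, 0), ('2013-01-13', 0, 30)]
```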

Now we just need to call this on each client in the dictionary:

for client in ClientData.values():
    merge_client(client)
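
For comparison, the first approach mentioned above (iterating the two sequences in parallel) could be sketched roughly like this. It assumes both lists are already sorted by date and have no duplicate dates, as in the question's data; merge_sorted is a made-up name.

```python
def merge_sorted(views, visits):
    """Merge two date-sorted [(date, value), ...] lists in one pass."""
    merged = []
    i = j = 0
    while i < len(views) and j < len(visits):
        view_day, visit_day = views[i][0], visits[j][0]
        if view_day == visit_day:
            # Date present in both lists: take both values.
            merged.append((view_day, views[i][1], visits[j][1]))
            i += 1
            j += 1
        elif view_day < visit_day:
            # Date only in views: visits default to 0.
            merged.append((view_day, views[i][1], 0))
            i += 1
        else:
            # Date only in visits: views default to 0.
            merged.append((visit_day, 0, visits[j][1]))
            j += 1
    # At most one of the lists still has entries left.
    merged.extend((day, count, 0) for day, count in views[i:])
    merged.extend((day, 0, count) for day, count in visits[j:])
    return merged


print(merge_sorted([('2013-01-06', 12), ('2013-03-03', 2)],
                   [('2013-01-13', 30), ('2013-03-03', 1)]))
# [('2013-01-06', 12, 0), ('2013-01-13', 0, 30), ('2013-03-03', 2, 1)]
```

This avoids building an intermediate dictionary, at the cost of more bookkeeping and the requirement that the inputs be sorted.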
License: CC-BY-SA with attribution
Not affiliated with StackOverflow