Question

I have a dictionary whose keys and values are both tuples: the key is (queryID, sentence) and the value is (score, documentID). In both tuples the first item is a number and the second is a string.

d={(1,'bla bla'):(10,'doc1'),(1,'yada yada'):(20,'doc2'),(2,'bla bla'):(30,'doc1'),(2,'more of the same'):(40,'doc3')}

I have grouped this dict by query ID and sorted it by score, so for each query ID I have the items sorted by score.

What I would like to do is get, for each query ID, the top k items in the already sorted dict. So if I have 100 items for query ID 1 and the same for query ID 2, I would like to get the top k items for each of them. How can that be done, please?

This is (part of) my code - to get the sorted dict -

import collections

sorted_dict = collections.OrderedDict(sorted(sen_dict.items(), key=lambda x: (-int(x[0][0]), x[1][0]), reverse=True))

Solution

This uses your sorted_dict variable to get the k highest-scoring values for a given query ID.

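k = 2  # how many top values you want to see
query_id = 1  # the query ID to look up (avoids shadowing the built-in id)
# sorted_dict is already in descending score order within each query ID,
# so the first k matches are the top k.
topK = [val for key, val in sorted_dict.items() if key[0] == query_id][:k]
print(topK)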
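
To collect the top k for every query ID in one pass instead of one ID at a time, something along these lines might work. This is only a sketch: it assumes sorted_dict is ordered as in the question (query ID ascending, score descending within each ID), and the helper name top_k_per_query is made up for illustration.

```python
import collections
import itertools

# Sample data from the question.
sen_dict = {(1, 'bla bla'): (10, 'doc1'), (1, 'yada yada'): (20, 'doc2'),
            (2, 'bla bla'): (30, 'doc1'), (2, 'more of the same'): (40, 'doc3')}

# Same ordering as in the question: query ID ascending, score descending.
sorted_dict = collections.OrderedDict(
    sorted(sen_dict.items(), key=lambda x: (-int(x[0][0]), x[1][0]), reverse=True))


def top_k_per_query(ordered, k):
    """Hypothetical helper: map each query ID to its first k values."""
    result = collections.OrderedDict()
    # groupby only merges adjacent items, so this relies on the existing order.
    for query_id, group in itertools.groupby(ordered.items(), key=lambda item: item[0][0]):
        result[query_id] = [value for _, value in itertools.islice(group, k)]
    return result


top = top_k_per_query(sorted_dict, 1)
print(top[1])
print(top[2])
```

Because groupby only groups neighbouring items, this gives wrong results if the dict is not already grouped by query ID.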

OTHER TIPS

You could just loop through the dictionary and append to a results list. I guess something like this should work if the query ID increases by 1 each time:

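results = []
i = 1
currentResult = None

for key in sorted(d):  # visit the keys in query ID order
    if key[0] != i:
        # A new query ID starts: store the best result of the previous one.
        results.append(currentResult)
        currentResult = None
        i += 1
    # Keep the value with the highest score seen so far for this query ID.
    if currentResult is None or d[key][0] > currentResult[0]:
        currentResult = d[key]
results.append(currentResult)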

This only works if there is always just one item with the highest score, but it can easily be extended to handle multiple items with the same score.

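results = []
i = 1
currentResults = []
currentScore = 0  # assumes all scores are positive

for key in sorted(d):  # visit the keys in query ID order
    if key[0] != i:
        # A new query ID starts: store the best results of the previous one.
        results.append(currentResults)
        currentResults = []
        currentScore = 0
        i += 1
    if currentScore == d[key][0]:
        currentResults.append(d[key])  # tie with the current best score
    elif currentScore < d[key][0]:
        currentResults = [d[key]]  # new best score, drop the old ones
        currentScore = d[key][0]
results.append(currentResults)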

I guess something like this should work.
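
If the dictionary has not been sorted yet, another option might be to group the values by query ID and let heapq.nlargest pick the top k from each group. This is a rough sketch, not a tested drop-in: the helper name top_k_unsorted is hypothetical, and it makes no assumption about the query IDs being consecutive.

```python
import collections
import heapq

# Sample data from the question.
d = {(1, 'bla bla'): (10, 'doc1'), (1, 'yada yada'): (20, 'doc2'),
     (2, 'bla bla'): (30, 'doc1'), (2, 'more of the same'): (40, 'doc3')}


def top_k_unsorted(data, k):
    """Hypothetical helper: top k (score, documentID) values per query ID."""
    by_query = collections.defaultdict(list)
    for (query_id, _sentence), value in data.items():
        by_query[query_id].append(value)
    # Tuples compare by score first, so nlargest returns the highest scores
    # first; equal scores are ordered by documentID.
    return {qid: heapq.nlargest(k, values) for qid, values in by_query.items()}


print(top_k_unsorted(d, 1))
```

With k larger than the number of items for a query ID, nlargest simply returns all of that query's values in descending order.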

Licensed under: CC-BY-SA with attribution