Question

Let's say I have a list of lists of strings (stringList):

[['its', 'all', 'ball', 'bearings', 'these', 'days'], 
['its', 'all', 'in', 'a', 'days', 'work']]

and I also have a set of strings (stringSet) containing the unique words from stringList:

{'its', 'all', 'ball', 'bearings', 'these', 'days', 'in', 'a', 'work'}

Using a comprehension, if possible, how can I get a dictionary that maps each word in stringSet to a set of the indexes of the sublists in stringList that contain that word? In the above example, the return value would be:

{'its': {0,1}, 'all':{0,1}, 'ball':{0}, 'bearings':{0}, 'these':{0}, 'days':{0,1}, 'in':{1}, 'a':{1}, 'work':{1}}

My hangup is how to accumulate the indexes into the dictionary. I'm sure it's relatively simple to those further along than I am. Thanks in advance...


Solution

This seems to work:

str_list = [
    ['its', 'all', 'ball', 'bearings', 'these', 'days'], 
    ['its', 'all', 'in', 'a', 'days', 'work']
]
str_set = set(word for sublist in str_list for word in sublist)

str_dict = {word: set(lindex
        for lindex, sublist in enumerate(str_list) if word in sublist)
    for word in str_set}

print(str_dict)
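
The comprehension above tests `word in sublist` against a list, which is a linear scan each time. For larger inputs, one option is to convert each sublist to a set once up front. The following is only a minimal sketch of that idea; `index_words` is a hypothetical helper name, not part of the original answer.

```python
def index_words(str_list):
    """Map each word to the set of indexes of the sublists containing it."""
    # Convert each sublist to a set once so membership tests are O(1) on average.
    sublist_sets = [set(sublist) for sublist in str_list]
    # The union of those sets is the vocabulary (the question's stringSet).
    vocabulary = set().union(*sublist_sets)
    return {
        word: {i for i, words in enumerate(sublist_sets) if word in words}
        for word in vocabulary
    }


str_list = [
    ['its', 'all', 'ball', 'bearings', 'these', 'days'],
    ['its', 'all', 'in', 'a', 'days', 'work'],
]
str_dict = index_words(str_list)
print(str_dict['days'])  # {0, 1}
```

The result has the same shape as the comprehension in the answer; only the membership test changes.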

Other tips

>>> alist = [['its', 'all', 'ball', 'bearings', 'these', 'days'], 
... ['its', 'all', 'in', 'a', 'days', 'work']]
>>> aset = {'its', 'all', 'ball', 'bearings', 'these', 'days', 'in', 'a', 'work'}

>>> {x: {alist.index(y) for y in alist if x in y} for x in aset}
{'a': set([1]), 'all': set([0, 1]), 'ball': set([0]), 'these': set([0]), 'bearings': set([0]), 'work': set([1]), 'days': set([0, 1]), 'in': set([1]), 'its': set([0, 1])}

You can also use enumerate, and using a list as the value makes the result clearer:

>>> {x: [i for i, y in enumerate(alist) if x in y] for x in aset}
{'a': [1], 'all': [0, 1], 'ball': [0], 'these': [0], 'bearings': [0], 'work': [1], 'days': [0, 1], 'in': [1], 'its': [0, 1]}
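
One caveat with the `alist.index(y)` version: `list.index` returns the position of the first equal element, so if two sublists happen to be identical, the later one is never reported. The `enumerate` form does not have this problem. A small illustration, using a made-up `dup_list` that is not from the question:

```python
# Hypothetical input with two identical sublists (not from the question).
dup_list = [['a', 'b'], ['a', 'b'], ['c']]
words = {'a', 'b', 'c'}

# list.index finds the first equal sublist, so index 1 is lost.
via_index = {x: {dup_list.index(y) for y in dup_list if x in y} for x in words}
# enumerate yields the real position of every sublist.
via_enumerate = {x: {i for i, y in enumerate(dup_list) if x in y} for x in words}

print(via_index['a'])      # {0}
print(via_enumerate['a'])  # {0, 1}
```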

Here's my code. It uses a couple of nested loops; I tried to make it easy to read and understand.

def accumulate(stringList,stringSet):
    outputDict = {}
    for setItem in stringSet:
        outputItem = set()
        for i,listItem in enumerate(stringList):
            if setItem in listItem:
                outputItem.add(i)
        outputDict[setItem] = outputItem
    return outputDict

stringList = [['its', 'all', 'ball', 'bearings', 'these', 'days'], ['its', 'all', 'in', 'a', 'days', 'work']]
stringSet = {'its', 'all', 'ball', 'bearings', 'these', 'days', 'in', 'a', 'work'}

print(accumulate(stringList,stringSet))

You can use a nested loop:

result = {}
for w in stringSet:
    result[w] = []
    for i,l in enumerate(stringList):
        if w in l:
            result[w].append(i)

This goes through each word in stringSet, checks whether it appears in the first list, the second list, and so on, and updates the dictionary accordingly.
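
The loops above look at every sublist once per word. If that becomes slow, an alternative is to walk stringList once and record each index as words are encountered. This is a sketch using `collections.defaultdict` rather than the approach in the answer above; it also builds the keys itself, so stringSet is not needed.

```python
from collections import defaultdict

stringList = [
    ['its', 'all', 'ball', 'bearings', 'these', 'days'],
    ['its', 'all', 'in', 'a', 'days', 'work'],
]

# Missing keys start out as an empty set, so no explicit initialisation is needed.
result = defaultdict(set)
for i, sublist in enumerate(stringList):
    for word in sublist:
        result[word].add(i)

# Convert to a plain dict if the defaultdict behaviour is not wanted afterwards.
result = dict(result)
print(result['its'])  # {0, 1}
```

Each word is visited exactly once, so the work grows with the total number of words rather than with the number of unique words times the number of sublists.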

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow