Large anagram search not reading to end of set Python

https://stackoverflow.com/questions/19238553

30-06-2022

Question

I've got a piece of code here that checks anagrams of a long list of words. I'm trying to find out how to search through every word in my long word list to find other anagrams that can match this word. Some words should have more than one anagram in my word list, yet I'm unable to find a solution to join the anagrams found in my list.

set(['biennials', 'fawn', 'unsupportable', 'jinrikishas', 'nunnery', 'deferment', 'surlinesss', 'sonja', 'bioko', 'devon', ...])

Since I've been using sets, the search never seems to reach the end of the set, and it returns only the shortest words. I know there should be more. I've been trying to check my key against my whole word set so I can find every word that is an anagram of it.

anagrams_found = {'diss': 'sids', 'abels': 'basel', 'adens': 'sedna', 'clot': 'colt', 'bellow': 'bowell', 'cds': 'dcs', 'doss': 'sods', 'als': 'las', 'abes': 'base', 'fir': 'fri', 'blot': 'bolt', 'ads': 'das', 'elm': 'mel', 'hops': 'shop', 'achoo': 'ochoa'... and more}

I was wondering where my code has been cutting off short. It should be finding a lot more anagrams from my Linux word dictionary. Can anyone see what's wrong with my piece of code? Simply put, the program first iterates through every word I have, then checks whether the set contains the word's key (its letters in sorted order). The key is stored in my dictionary so that later words matching the same key can be added to it. If there is already a key that I've added an anagram for, I update my dictionary by concatenating the old dict value with the new word (anagram).

    anagram_list = dict()
    words = set(words)
    anagrams_found = []
    for word in words:
        key = "".join(sorted([w for w in word]))
        if (key in words) and (key != word):
            anagrams_found.append(word)
            for name, anagram in anagram_list.iteritems():
                if anagram_list[name] == key:
                    anagram = " ".join([anagram],anagram_found)
                    anagram_list.update({key:anagram})
            anagram_list[key] = word
    return anagram_list

All in all, this program is maybe not efficient. Can someone explain the shortcomings of my code?

Solution
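
Before the fix, a minimal sketch of why the loop in the question finds so few matches. It assumes a tiny hypothetical word set in place of the real word list, and the helper name `sorted_key` is illustrative. The test `key in words` only succeeds when a word's sorted letters happen to spell another word in the set, so most anagram pairs are never seen.

```python
# Hypothetical sample; the real input is the asker's Linux word list.
words = {"sids", "diss", "shop", "hops", "listen", "silent"}

def sorted_key(word):
    """Return the word's letters in alphabetical order."""
    return "".join(sorted(word))

# Same test as the question: the sorted letters must themselves be a word
# in the set, and must differ from the word being checked.
passing = sorted(
    w for w in words
    if sorted_key(w) in words and sorted_key(w) != w
)
print(passing)

# 'listen' and 'silent' are anagrams, but their key is not a real word,
# so the membership test never matches them.
print(sorted_key("listen"), sorted_key("listen") in words)
```

Grouping every word under its sorted-letter key, rather than looking the key up in the word set, avoids this problem.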

anagram_dict = {} # You could also use defaultdict(list) here
for w in words:
    key = "".join(sorted(w))
    if key in anagram_dict:
        anagram_dict[key].append(w)
    else:
        anagram_dict[key] = [w]
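
The comment above mentions `defaultdict(list)`. A minimal sketch of that variant, assuming a small hypothetical word list in place of the real dictionary file:

```python
from collections import defaultdict

# Hypothetical sample words standing in for the full word list.
words = ["colt", "clot", "shop", "hops", "posh", "fawn"]

anagram_dict = defaultdict(list)  # a missing key starts out as an empty list
for w in words:
    key = "".join(sorted(w))      # anagrams share the same sorted-letter key
    anagram_dict[key].append(w)

print(dict(anagram_dict))
```

With `defaultdict` the `if key in anagram_dict` branch is no longer needed, because the first access to a new key creates its list.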

Entries whose list holds only one word have no anagram in the word list, so keep only the lists with more than one word:

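anagram_list = []
# Iterate over the word lists only; iterating over items would yield
# (key, list) tuples, whose length is always 2.
for v in anagram_dict.values():
    if len(v) > 1:
        anagram_list += v  # extends the flat list with every word in the group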
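
If the goal is to keep each set of anagrams together rather than in one flat list, a hedged variant can collect the lists themselves. The name `anagram_groups` and the sample words are illustrative and not part of the original answer.

```python
# Hypothetical sample words; 'fawn' has no anagram here and is filtered out.
words = ["colt", "clot", "shop", "hops", "posh", "fawn"]

anagram_dict = {}
for w in words:
    # setdefault creates the list the first time a key is seen.
    anagram_dict.setdefault("".join(sorted(w)), []).append(w)

# Keep only keys shared by two or more words, one list per group.
anagram_groups = [group for group in anagram_dict.values() if len(group) > 1]
print(anagram_groups)
```

Each inner list then holds words that are anagrams of one another, which matches the question's aim of joining all anagrams found for the same key.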

Licensed under: CC-BY-SA with attribution
