I'm not sure if what I'm thinking of is what you're currently doing in your code, but I can't think of anything better:
from collections import defaultdict
words = 'dog god steer reste trees dog fred steer'.split() # or words from a file
unique_words = set(words)
anagram_dict = defaultdict(list)
for word in unique_words:
key = "".join(sorted(word))
anagram_dict[key].append(word)
for anagram_list in anagram_dict.values():
if len(anagram_list) > 1:
print(*anagram_list)
This will print (in arbitrary order):
god dog
steer trees reste
If you wanted to get the dictionary key value, you could make the final loop be over the items
rather than the values
of anagram_dict
(and you could print out words that don't have any anagrams like 'fred'
in the example above, if you wanted). Note that thanks to the set
, duplicates of words are not sorted multiple times.
Running time should be O(M + U*N*log(N))
where M
is the number of words, U
is the number of unique ones and N
is their average length. Unless you're sorting an organic chemistry textbook or something else that has lots of long words, it should be pretty close to linear in the length of the input.