Grouping items in a list using Python defaultdict

Question

Your code has a few issues:

There are duplicates in your GO_ID values, and you seem to care only about the unique ones, so you need a defaultdict(set) instead of a defaultdict(list).
The split logic that generates the key and the value is buggy.
GO_dict[gene_id] = GO_id simply assigns the latest value to the key, replacing whatever was stored before, instead of adding it to a collection.
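
To illustrate the last point, here is a minimal sketch of the difference between plain assignment and accumulating into a defaultdict(set). The pairs are made up for the illustration and are not taken from your file, and the names `overwritten` and `collected` are only placeholders.

```python
from collections import defaultdict

# Hypothetical (gene_id, GO_id) pairs, invented for this illustration.
pairs = [("A", "12"), ("A", "13"), ("A", "12"), ("B", "1")]

# Plain assignment: each new value replaces the previous one for that key.
overwritten = {}
for gene_id, go_id in pairs:
    overwritten[gene_id] = go_id

# defaultdict(set): a missing key starts as an empty set,
# and add() silently ignores values that are already present.
collected = defaultdict(set)
for gene_id, go_id in pairs:
    collected[gene_id].add(go_id)

print(overwritten)      # only the last value seen per key survives
print(dict(collected))  # every distinct value per key is kept
```

With assignment, key "A" ends up holding a single string; with the set, it holds both distinct IDs and the repeated "12" is stored once.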

A possible corrected solution:

>>> from collections import defaultdict
>>> GO_dict = defaultdict(set)
>>> for GO_names in GO_file:
   gene_id,_,GO_id = GO_names.partition(" ")
   gene_id = gene_id.split("_")[0]
   GO_dict[gene_id].add(GO_id)


>>> print GO_dict
defaultdict(<type 'set'>, {'A': set(['13', '12', '14']), 'B': set(['1', '5'])})

One possible problem with the above code is that the order of the elements is not guaranteed, because sets are unordered. Unfortunately the standard library does not provide an OrderedSet, but we can easily use an OrderedDict to serve our purpose: the keys act as an ordered set and the values are just placeholders.

>>> from collections import defaultdict, OrderedDict
>>> GO_dict = defaultdict(OrderedDict)
>>> for GO_names in GO_file:
   gene_id,_,GO_id = GO_names.partition(" ")
   gene_id = gene_id.split("_")[0]
   GO_dict[gene_id][GO_id] = None


>>> OrderedDict((key, value.keys()) for key, value in sorted(GO_dict.items()))
OrderedDict([('A', ['12', '13', '14']), ('B', ['1', '5'])])
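
For reference, here is a self-contained sketch of the same idea that can be run as a script. It assumes GO_file yields lines in a "<gene>_<suffix> <GO_id>" layout; the sample lines below are made up, and `result` is just an illustrative name. It wraps the keys in list() so that it behaves the same on Python 3, where keys() no longer returns a list.

```python
from collections import OrderedDict, defaultdict

# Made-up sample lines in the assumed "<gene>_<suffix> <GO_id>" layout.
GO_file = ["A_1 12", "A_2 13", "A_3 12", "B_1 1", "A_4 14", "B_2 5"]

GO_dict = defaultdict(OrderedDict)
for GO_names in GO_file:
    # strip() guards against a trailing newline when reading from a real file.
    gene_id, _, GO_id = GO_names.strip().partition(" ")
    gene_id = gene_id.split("_")[0]
    # The keys act as an insertion-ordered set; the value is a placeholder.
    GO_dict[gene_id][GO_id] = None

# Turn each ordered "set" into a plain list of its keys, genes sorted by name.
result = OrderedDict(
    (gene_id, list(GO_ids)) for gene_id, GO_ids in sorted(GO_dict.items())
)
print(result)
```

The repeated "12" for gene A is stored once, and the IDs come out in the order they were first seen.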

However, there are cases, and I believe this is one of them, where an itertools.groupby solution is more elegant than using defaultdict:

>>> from collections import OrderedDict
>>> from itertools import groupby
>>> from operator import itemgetter
>>> GO_file_kv = [(key.split("_")[0], value) 
                   for key, value in (elem.split(" ") for elem in GO_file)]
>>> {key: OrderedDict.fromkeys([e for _, e in value]).keys()
     for key, value in groupby(sorted(GO_file_kv, key=itemgetter(0)),
                       key=itemgetter(0))
 }
{'A': ['12', '13', '14'], 'B': ['1', '5']}
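
One detail worth spelling out about the groupby version: groupby only merges consecutive items with the same key, which is why the input is sorted first. A minimal sketch with made-up pairs (the names `pairs`, `unsorted_keys` and `grouped` are illustrative only):

```python
from itertools import groupby
from operator import itemgetter

# Made-up (gene_id, GO_id) pairs; note that the "A" entries are not adjacent.
pairs = [("A", "12"), ("B", "1"), ("A", "13"), ("B", "5")]

# Without sorting, groupby starts a new group every time the key changes,
# so the same key can show up more than once.
unsorted_keys = [key for key, _ in groupby(pairs, key=itemgetter(0))]

# sorted() is stable, so within each key the values keep their original order.
grouped = {
    key: [go_id for _, go_id in group]
    for key, group in groupby(sorted(pairs, key=itemgetter(0)), key=itemgetter(0))
}

print(unsorted_keys)
print(grouped)
```

If the keys repeated across groups were fed straight into a dict comprehension, later groups would overwrite earlier ones, which is the same overwrite problem as in the original code.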