This block contains two instances of horridly inefficient coding:
if key not in masterIndex.keys():
masterIndex[key]= partialIndex[key]
elif key in masterIndex.keys():
masterIndex[key].extend(partialIndex[key])
First, the key is or is not in masterIndex
, so there's no useful point at all to the elif
test. If the not in
test fails, the in
test must succeed. So this code does the same:
if key not in masterIndex.keys():
masterIndex[key]= partialIndex[key]
else:
masterIndex[key].extend(partialIndex[key])
Second, masterIndex
is a dict
. Dicts support very efficient membership testing without any help from you ;-) By materializing its keys into a list (via .keys()
), you're changing what should be a blazing fast dict lookup into a horridly slow linear search over a list. So do this instead:
if key not in masterIndex:
masterIndex[key]= partialIndex[key]
else:
masterIndex[key].extend(partialIndex[key])
The code will run much faster then.