Firstly, 2000 * 100 chars is'nt that big, you could use a set directly.
Secondly, if your strings are sorted, there is a quick way (which I found here)to compare them, as follows:
def compare(E1, E2):
i, j = 0, 0
I, J = len(E1), len(E2)
while i < I:
if j >= J or E1[i] < E2[j]:
print(E1[i], "is not in E2")
i += 1
elif E1[i] == E2[j]:
print(E1[i], "is in E2")
i, j = i + 1, j + 1
else:
j += 1
It is certainly slower than using a set, but it doesn't need the strings to be hold into memory (only two are needed at the same time).
For the Levenshtein thing, there is a C module which you can find on Pypi, and which is quite fast.