Given the example:
>>> from gensim import corpora
>>> docs = ["this is a foo bar", "you are a foo"]
>>> texts = [doc.lower().split() for doc in docs]
>>> print(texts)
[['this', 'is', 'a', 'foo', 'bar'], ['you', 'are', 'a', 'foo']]
>>> dictionary = corpora.Dictionary(texts)
>>> dictionary.save('foobar.txtdic')
If you use gensim.corpora.Dictionary.save_as_text() (see https://github.com/piskvorky/gensim/blob/develop/gensim/corpora/dictionary.py), you should get a text file like the one below, where each line holds a token's id, the token itself, and its document frequency:
0 a 2
5 are 1
1 bar 1
2 foo 2
3 is 1
4 this 1
6 you 1
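To make the file layout concrete, here is a minimal sketch that reads the listing above back into a mapping using only the standard library. The helper name parse_dictionary_text is hypothetical, and the sketch assumes whitespace-separated fields (gensim itself writes tabs). With gensim you would normally call Dictionary.load_from_text() rather than parse the file by hand.

```python
# Contents of the text file shown above: id, token, document frequency.
# Assumption: fields are whitespace-separated (gensim writes tabs).
saved_text = """\
0 a 2
5 are 1
1 bar 1
2 foo 2
3 is 1
4 this 1
6 you 1
"""


def parse_dictionary_text(text):
    """Hypothetical helper: map each token to (id, document frequency)."""
    entries = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank lines or any header line
        token_id, token, doc_freq = fields
        entries[token] = (int(token_id), int(doc_freq))
    return entries


entries = parse_dictionary_text(saved_text)
print(entries["foo"])  # (2, 2): id 2, appears in 2 documents
```

The document frequency counts how many documents contain the token, which is why "a" and "foo" have 2 while the other tokens have 1.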
If you use the default gensim.corpora.Dictionary.save(), as in the example above, it saves the dictionary as a pickled binary file rather than as text. See class SaveLoad(object) in https://github.com/piskvorky/gensim/blob/develop/gensim/utils.py
For information on pickle, see http://docs.python.org/2/library/pickle.html#pickle-example
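As a rough illustration of why the saved file is not human-readable, the sketch below pickles a plain dict standing in for the dictionary's token-to-id mapping and reads it back. It uses only the standard library. The file name and the stand-in dict are assumptions for illustration, not what gensim writes internally. With gensim, the counterpart of save() is corpora.Dictionary.load('foobar.txtdic').

```python
import os
import pickle
import tempfile

# Stand-in for the token-to-id mapping from the example above (an assumption;
# a real gensim Dictionary object holds more state than this).
token2id = {"a": 0, "bar": 1, "foo": 2, "is": 3, "this": 4, "are": 5, "you": 6}

# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "foobar.pkl")

# Pickle produces bytes, so the file must be opened in binary mode.
with open(path, "wb") as handle:
    pickle.dump(token2id, handle, protocol=2)

with open(path, "rb") as handle:
    raw = handle.read()

# Streams written with protocol 2 or later start with the PROTO opcode 0x80,
# which is why the file looks like garbage in a text editor.
print(raw[:1] == b"\x80")

# Unpickling restores an equal object.
restored = pickle.loads(raw)
print(restored == token2id)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.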