I am using ElasticSearch to index some data, but the indexing performance is poor.
There are only 3000 entries, each with 6 fields, yet it takes 5 minutes to index all 3000 of them.
Because I am new to ElasticSearch, my code and program flow are basic. For each entry I do the following:
- Search to check whether the same data already exists in the index.
- If it does, update it.
- If not, add it.
The code is as follows:
conn = pyes.ES('server:9200')
Search:
searchResult = conn.search(searchDict, indexName, TypeName)
Index:
conn.index(storeDict, indexName, TypeName, id)
Update the Count field of the indexed document:
conn.partial_update(indexName, TypeName, id, "ctx._source.Count += counter", params={"counter" : 1})
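Put together, the per-entry flow looks roughly like the sketch below. So that it runs without a server, FakeES is a hypothetical in-memory stand-in for the pyes connection that only counts calls. The names upsert_entry and entries, the "Name" field, and the way the search result is checked are assumptions for illustration, not my exact code.

```python
class FakeES(object):
    """Hypothetical stand-in for pyes.ES. Each call would be one HTTP request."""

    def __init__(self):
        self.docs = {}      # id -> stored document
        self.requests = 0   # number of calls made so far

    def search(self, searchDict, indexName, TypeName):
        self.requests += 1
        # Return the ids of documents whose fields match searchDict exactly.
        return [doc_id for doc_id, doc in self.docs.items()
                if all(doc.get(k) == v for k, v in searchDict.items())]

    def index(self, storeDict, indexName, TypeName, id):
        self.requests += 1
        self.docs[id] = dict(storeDict)

    def partial_update(self, indexName, TypeName, id, script, params=None):
        self.requests += 1
        # The stub ignores the script text and just applies the counter.
        self.docs[id]["Count"] += (params or {}).get("counter", 0)


def upsert_entry(conn, entry, indexName, TypeName):
    """Search first, then either update the counter or index a new document."""
    doc_id = entry["id"]
    searchDict = {"Name": entry["Name"]}
    if conn.search(searchDict, indexName, TypeName):
        conn.partial_update(indexName, TypeName, doc_id,
                            "ctx._source.Count += counter",
                            params={"counter": 1})
    else:
        storeDict = {"Name": entry["Name"], "Count": 1}
        conn.index(storeDict, indexName, TypeName, doc_id)


conn = FakeES()
entries = [{"id": "1", "Name": "a"},
           {"id": "2", "Name": "b"},
           {"id": "1", "Name": "a"}]
for entry in entries:
    upsert_entry(conn, entry, "my_index", "my_type")

print(conn.requests)  # 6: two calls per entry
```

So each entry costs two sequential requests, which comes to roughly 6000 requests for my 3000 entries.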
Is there any way to improve the performance of my code?
Thank you for your help.