Question

This seems like a pretty easy question, but for some reason I still can't work out how to solve it. I have an Elasticsearch cluster that uses the Twitter river to download tweets. I would like to implement a sentiment analysis module that takes each tweet and computes a score (positive/negative, etc.). The score should be computed for each of the existing tweets as well as for new tweets, and then visualized in Kibana.

However, I am not sure where I should place the call to this sentiment analysis module in the Elasticsearch pipeline.

I have considered modifying the Twitter river plugin, but that would not work retroactively for tweets that are already indexed.

Essentially, I need to answer two questions:
1) How can I call Python/Java code while indexing a document so that I can modify the JSON accordingly?
2) How can I use the same code to modify all the existing documents in ES?

Solution

If you don't want an external application to do the analysis before indexing the documents in Elasticsearch, the best way, I guess, is to write a plugin. The plugin can implement a custom analyzer that performs the sentiment analysis. Then, in the mapping, define the fields you want the analyzer to run on.

See these examples of analysis plugins:
https://github.com/barminator/elasticsearch-analysis-annotation
https://github.com/yakaz/elasticsearch-analysis-combo/

To run the analysis on all existing documents, you will need to reindex them after defining the correct mapping.
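
For comparison, here is a minimal sketch of the external-application route mentioned at the start of this answer, which is not the plugin approach. It assumes the tweet text is stored in a `text` field and writes the score to a hypothetical `sentiment` field. The word lists are a toy placeholder for a real sentiment model. `enrich` would be called on each new tweet before it is indexed, and `update_actions` turns existing search hits into partial-update actions, so the same scoring code serves both cases.

```python
import re

# Toy word lists: a placeholder for a real sentiment model (assumption, not from the answer).
POSITIVE = {"good", "great", "love", "happy"}
NEGATIVE = {"bad", "awful", "hate", "sad"}


def sentiment_score(text):
    """Return (positive word count) - (negative word count) for the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def enrich(tweet):
    """Return a copy of a tweet document with a 'sentiment' field added.

    Assumes the tweet text lives in a 'text' field; use this before indexing new tweets.
    """
    enriched = dict(tweet)
    enriched["sentiment"] = sentiment_score(tweet.get("text", ""))
    return enriched


def update_actions(hits):
    """Turn search hits (dicts with _index, _id, _source) into partial-update actions.

    The action layout follows the bulk helper of the official Python client.
    """
    for hit in hits:
        yield {
            "_op_type": "update",
            "_index": hit["_index"],
            "_id": hit["_id"],
            "doc": {"sentiment": sentiment_score(hit["_source"].get("text", ""))},
        }
```

Assuming the official Python client is used, the hits could come from `elasticsearch.helpers.scan` and the generated actions could be passed to `elasticsearch.helpers.bulk`. Older Elasticsearch versions may also require a `_type` entry in each action.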

Licensed under: CC-BY-SA with attribution