Question

I am trying to use Flume to consume the Twitter Streaming API and index the tweets into my Elasticsearch cluster. I set up my flume.conf to use com.cloudera.flume.source.TwitterSource as the Twitter source (with my dev tokens), and I use the default elasticsearch sink.

I am able to get the tweets (I also save them to HDFS, and when I open the file I can see them), but when I search in Elasticsearch, I get this response:

    {
      _index: twitter-2014-02-14
      _type: tweet-rt
      _id: ilL5ZrBRSlqrZcsVUbnO-g
      _version: 1
      _score: 1
      _source: {
        @message: org.elasticsearch.common.xcontent.XContentBuilder@12da4409
        @timestamp: 2014-02-14T10:16:13.000Z
        @fields: {
          timestamp: 1392372973000
        }
      }
    }

Here is the relevant part of my Flume config:

# - ElasticSearch Sink
TwitterAgent.sinks.ES.type = elasticsearch
TwitterAgent.sinks.ES.channel = FileChannel
TwitterAgent.sinks.ES.hostNames = 192.168.10.100:9300
TwitterAgent.sinks.ES.indexName = twitter
TwitterAgent.sinks.ES.indexType = tweet-rt
TwitterAgent.sinks.ES.clusterName = testou

Do I have to add something else? I don't understand why the tweet body is not serialized correctly into Elasticsearch.

Any ideas?

Thank you.

Solution

This is weird. The @message value looks like the default Object.toString() output (class name plus an identity hash code) of the XContentBuilder rather than the content it built, and that should not happen.
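
To illustrate where a string like XContentBuilder@12da4409 can come from, here is a minimal Java sketch. It does not use Flume or Elasticsearch code: FakeBuilder and its render() method are hypothetical stand-ins, and the assumption is only that the real builder was turned into a string without calling the method that returns the JSON it built.

```java
public class ToStringDemo {

    // Hypothetical stand-in for a builder class that does NOT override toString().
    static class FakeBuilder {
        private final String json;

        FakeBuilder(String json) {
            this.json = json;
        }

        // Hypothetical accessor that returns the content that was built.
        String render() {
            return json;
        }
    }

    // Problem pattern: implicit string conversion falls back to Object.toString(),
    // which yields "<class name>@<hex hash code>" instead of the JSON.
    static String buggyMessage(FakeBuilder builder) {
        return String.valueOf(builder);
    }

    // Intended pattern: explicitly ask the builder for its content.
    static String fixedMessage(FakeBuilder builder) {
        return builder.render();
    }

    public static void main(String[] args) {
        FakeBuilder builder = new FakeBuilder("{\"text\":\"hello\"}");
        System.out.println(buggyMessage(builder)); // class name, '@', hex hash code
        System.out.println(fixedMessage(builder)); // the JSON content
    }
}
```

If the indexed @message has the same class-name-plus-hash shape, the serialization step is most likely stringifying the builder object itself rather than its content.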

I think I'd recommend clearing out Flume and re-installing. I'd be concerned about classpath and JAR dependency issues.

What version of Flume?

Other suggestions

For others who come across this error: this is a bug in the Flume ElasticSearch sink, and it has since been fixed. See https://issues.apache.org/jira/browse/FLUME-2126

If you are on a Flume version earlier than 1.6, you may want to cherry-pick this patch and build the sink against your version.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow