Question

I have a Flume agent writing tweets to an HBase sink.

After a few seconds, transactions to the sink start failing, and every 8-10 seconds the Flume agent log reports that the transaction to HBase failed and was rolled back.

The strange thing is that some tweets still get through and end up in the HBase table. What could be causing this? This is running on a single-node Cloudera QuickStart VM; could it be a resource problem?

This is the agent log

9:20:44.618 PM  ERROR   org.apache.flume.SinkRunner     

Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Could not write events to Hbase. Transaction failed, and rolled back.
    at org.apache.flume.sink.hbase.AsyncHBaseSink.process(AsyncHBaseSink.java:245)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)

9:20:53.883 PM  ERROR   org.apache.flume.SinkRunner     

Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Could not write events to Hbase. Transaction failed, and rolled back.
    at org.apache.flume.sink.hbase.AsyncHBaseSink.process(AsyncHBaseSink.java:245)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)

The debug log also contains some strange entries that may be related:

2014-03-06 09:39:12,069 DEBUG org.apache.zookeeper.client.ZooKeeperSaslClient: Could not retrieve login configuration: java.lang.SecurityException: Unable to locate a login configuration

2014-03-06 09:39:12,298 DEBUG org.apache.zookeeper.ClientCnxn: An exception was thrown while closing send thread for session 0x144965080900029 : Unable to read additional data from server sessionid 0x144965080900029, likely server has closed socket

This is my agent configuration

TwitterAgent.sinks.HBaseTweet.channel = MemChannel
TwitterAgent.sinks.HBaseTweet.type = org.apache.flume.sink.hbase.AsyncHBaseSink
TwitterAgent.sinks.HBaseTweet.table = tweets
TwitterAgent.sinks.HBaseTweet.columnFamily = tweet
TwitterAgent.sinks.HBaseTweet.batchSize = 100
TwitterAgent.sinks.HBaseTweet.serializer = flume_hdfs.hbase.util.AsyncHbaseTwitterEventSerializer 
TwitterAgent.sinks.HBaseTweet.serializer.columns = tweet:id,tweet:created_at,tweet:source,tweet:favourited,tweet:text
TwitterAgent.sinks.HBaseTweet.serializer.delimiter = \\t

TwitterAgent.channels.MemChannel.type = memory
TwitterAgent.channels.MemChannel.capacity = 200
TwitterAgent.channels.MemChannel.transactionCapacity = 100

Some metrics from the log when stopping the agent, which might be relevant:

Component type: CHANNEL, name: MemChannel stopped

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.start.time == 1394093630078

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.stop.time == 1394093894804

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.capacity == 200

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.current.size == 125

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.attempt == 220

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.put.success == 209

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.attempt == 3059

Shutdown Metric for type: CHANNEL, name: MemChannel. channel.event.take.success == 9

Unable to deliver event. Exception follows.
org.apache.flume.EventDeliveryException: Could not write events to Hbase. Transaction failed, and rolled back.
    at org.apache.flume.sink.hbase.AsyncHBaseSink.process(AsyncHBaseSink.java:245)
    at org.apache.flume.sink.DefaultSinkProcessor.process(DefaultSinkProcessor.java:68)
    at org.apache.flume.SinkRunner$PollingRunner.run(SinkRunner.java:147)
    at java.lang.Thread.run(Thread.java:662)

Component type: SINK, name: HBaseTweet stopped

Shutdown Metric for type: SINK, name: HBaseTweet. sink.start.time == 1394093630407

Shutdown Metric for type: SINK, name: HBaseTweet. sink.stop.time == 1394093894833

Shutdown Metric for type: SINK, name: HBaseTweet. sink.batch.complete == 27

Shutdown Metric for type: SINK, name: HBaseTweet. sink.batch.empty == 0

Shutdown Metric for type: SINK, name: HBaseTweet. sink.batch.underflow == 7

Shutdown Metric for type: SINK, name: HBaseTweet. sink.connection.closed.count == 1

Shutdown Metric for type: SINK, name: HBaseTweet. sink.connection.creation.count == 1

Shutdown Metric for type: SINK, name: HBaseTweet. sink.connection.failed.count == 0

Shutdown Metric for type: SINK, name: HBaseTweet. sink.event.drain.attempt == 3053

Shutdown Metric for type: SINK, name: HBaseTweet. sink.event.drain.sucess == 9

HBase Regionserver error

2014-03-08 09:37:44,371 ERROR org.apache.hadoop.hbase.regionserver.HRegionServer: 
org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column family retweet does not exist in region tweets,,1394029330397.953f602dd0790637df8106720396f219. in table 'tweets', {NAME => 'entities', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'retweeted_status', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'tweet', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}, {NAME => 'user', DATA_BLOCK_ENCODING => 'NONE', BLOOMFILTER => 'NONE', REPLICATION_SCOPE => '0', COMPRESSION => 'NONE', VERSIONS => '3', TTL => '2147483647', MIN_VERSIONS => '0', KEEP_DELETED_CELLS => 'false', BLOCKSIZE => '65536', ENCODE_ON_DISK => 'true', IN_MEMORY => 'false', BLOCKCACHE => 'true'}
    at org.apache.hadoop.hbase.regionserver.HRegion.checkFamily(HRegion.java:5475)
    at org.apache.hadoop.hbase.regionserver.HRegion.checkFamilies(HRegion.java:3022)
    at org.apache.hadoop.hbase.regionserver.HRegion.internalPut(HRegion.java:2900)
    at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:2083)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:2239)
    at sun.reflect.GeneratedMethodAccessor17.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:323)
    at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1428)

Solution

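The error message from the HBase log points to a schema mismatch: the agent writes to a column family named retweet, whereas the table schema defines retweeted_status (alongside entities, tweet, and user). That would also explain why some tweets still arrive: presumably only the puts that touch retweet are rejected, while the others are written.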

The fix is either to recompile the agent so that it uses the correct column family name, or to change the schema so that it uses the name the agent expects. I don't know which fix is more appropriate. If you defined this schema yourself, you can most likely just change the column family name. But if the schema was defined externally (e.g., by a script or by following instructions from somewhere), renaming the column family may break something else that depends on the name being retweeted_status. In that case, the source code of Twitter_HBase_Impala should be fixed to use the correct name.
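
To catch this kind of mismatch before the agent starts, a small pre-flight check can compare the column families the serializer writes to against the families the table actually has. The following is only a sketch in plain Java: the class FamilyCheck and the method missingFamilies are hypothetical names, the set of existing families is copied by hand from the region server error above, and the column list containing retweet:id is an assumed example (the serializer.columns setting in the question only lists tweet: columns, so the retweet family presumably comes from inside the custom serializer).

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Flume or HBase.
class FamilyCheck {

    /**
     * Returns the column families referenced in a comma-separated
     * "family:qualifier" list that are absent from existingFamilies.
     */
    static Set<String> missingFamilies(String columns, Set<String> existingFamilies) {
        Set<String> missing = new TreeSet<String>();
        for (String column : columns.split(",")) {
            String trimmed = column.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty entries such as a trailing comma
            }
            int colon = trimmed.indexOf(':');
            // Everything before the first ':' is the column family.
            String family = colon < 0 ? trimmed : trimmed.substring(0, colon);
            if (!existingFamilies.contains(family)) {
                missing.add(family);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Families listed in the NoSuchColumnFamilyException message.
        Set<String> existing = new LinkedHashSet<String>(
                Arrays.asList("entities", "retweeted_status", "tweet", "user"));

        // Assumed example of what the serializer writes; not taken from the question.
        String columns = "tweet:id,tweet:created_at,retweet:id";

        System.out.println("Missing families: " + missingFamilies(columns, existing));
    }
}
```

With the assumed column list, the check reports retweet as missing, which matches the family named in the region server error. In a real setup the set of existing families would be read from the table descriptor rather than hard-coded.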

Licensed under: CC-BY-SA with attribution