Question

We are trying to use the HornetQ store and forward mechanism... however forwarding messages from one standalone HornetQ instance to another using the core bridge is very slow. We have not been able to increase the throughput rate above 200 messages per second.

The surprising fact is that if we point the same client (the one that was publishing messages to the forwarding HornetQ instance) directly at the destination HornetQ instance, we observe a throughput of over 1000 messages per second (this client is JMS based). This suggests that the core bridge configured between the forwarding and destination HornetQ instances is the bottleneck.

The following are the relevant sections for configuring the core bridge on the Forwarding HornetQ:

<connectors>
            <connector name="netty-bridge">
                 <factory-class>org.hornetq.core.remoting.impl.netty.NettyConnectorFactory</factory-class>
                 <param key="host" value="destination.xxx.com"/>
                 <param key="port" value="5445"/>
                 <param key="batch-delay" value="50"/>
                 <param key="tcp-send-buffer-size" value="1048576"/>
                 <param key="tcp-receive-buffer-size" value="1048576"/>
                 <param key="use-nio" value="true"/>
           </connector>
</connectors>
<address-settings>
      <address-setting match="jms.queue.Record">
                <dead-letter-address>jms.queue.RecordDLQ</dead-letter-address>
                <max-size-bytes>262144000</max-size-bytes>
                <page-size-bytes>10485760</page-size-bytes>
                <address-full-policy>PAGE</address-full-policy>
        </address-setting>
</address-settings>
<queues>
         <queue name="jms.queue.Record">
                  <address>jms.queue.Record</address>
         </queue>
</queues>
<bridges>
        <bridge name="core-bridge">
                <queue-name>jms.queue.Record</queue-name>
                <forwarding-address>jms.queue.Record</forwarding-address>
                <retry-interval>1000</retry-interval>
                <retry-interval-multiplier>1.0</retry-interval-multiplier>
                <reconnect-attempts>-1</reconnect-attempts>
                <confirmation-window-size>10485760</confirmation-window-size>
                <static-connectors>
                        <connector-ref>netty-bridge</connector-ref>
                </static-connectors>
        </bridge>
</bridges>

The following are the relevant sections for configuring the core bridge on the Destination HornetQ:

<acceptors>
      <acceptor name="netty">
        <factory-class>org.hornetq.core.remoting.impl.netty.NettyAcceptorFactory</factory-class>
         <param key="host"  value="${hornetq.remoting.netty.host:192.168.2.xxx}"/>
         <param key="port"  value="${hornetq.remoting.netty.port:xxxx}"/>
         <param key="tcp-send-buffer-size"  value="1048576"/>
         <param key="tcp-receive-buffer-size"  value="1048576"/>
         <param key="use-nio"  value="true"/>
         <param key="batch-delay"  value="50"/>
      </acceptor>
</acceptors>
<address-settings>
          <address-setting match="jms.queue.Record">
                    <dead-letter-address>jms.queue.RecordDLQ</dead-letter-address>
                    <max-size-bytes>262144000</max-size-bytes>
                    <page-size-bytes>10485760</page-size-bytes>
                    <address-full-policy>PAGE</address-full-policy>
            </address-setting>
    </address-settings>
    <queues>
             <queue name="jms.queue.Record">
                      <address>jms.queue.Record</address>
             </queue>
    </queues>

All system variables (CPU/Memory/Disk IO/Network/etc.) are underutilized and there are no errors in the logs.

Note: We have tried with both NIO as well as the legacy/old IO. This has been tried both with HornetQ-2.2.5-Final and HornetQ-2.2.8-GA (2.2.8-GA was built from source)

Any idea as to what might be causing this issue and what the resolution might be?

Other observations: It looks like the messages sent through the core bridge are transactional. Is it possible to batch these transactions and have the communication between the two HornetQ instances happen asynchronously?


Solution

OK, I figured this out myself.

When the forwarding HornetQ instance creates a bridge, it internally uses only one thread to send messages over the bridge and opens only one connection to the destination HornetQ instance. As a result, it cannot take advantage of multiple processors, cannot parallelize the sending of messages, and is limited by the network (latency, bandwidth, RTT). Under high load you therefore start hitting a cap (in our case around 200 messages per second). You can raise this cap by tuning the HornetQ connector and acceptor parameters (such as the TCP send and receive buffer sizes) and the bridge settings (confirmation window size), but that only takes you so far (we got the throughput up to around 300 messages per second).

The solution - create multiple bridges between the same pair of Forwarding and Destination HornetQ instances (involving the same queues). This effectively parallelizes the transfer of messages and thus increases the throughput. Creating three bridges almost tripled the throughput to 870 messages per second.
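
For illustration, the multi-bridge setup on the forwarding instance might look roughly like the sketch below. It is not our exact file: the bridge names `core-bridge-1` to `core-bridge-3` are made up (each bridge needs its own unique name), the remaining settings are simply copied from the single-bridge configuration in the question, and the sketch assumes that all bridges can share the one `netty-bridge` connector definition while each bridge still opens its own connection.

```xml
<bridges>
        <!-- Three bridges draining the same source queue into the same
             destination address. Bridge names are hypothetical. -->
        <bridge name="core-bridge-1">
                <queue-name>jms.queue.Record</queue-name>
                <forwarding-address>jms.queue.Record</forwarding-address>
                <retry-interval>1000</retry-interval>
                <retry-interval-multiplier>1.0</retry-interval-multiplier>
                <reconnect-attempts>-1</reconnect-attempts>
                <confirmation-window-size>10485760</confirmation-window-size>
                <static-connectors>
                        <connector-ref>netty-bridge</connector-ref>
                </static-connectors>
        </bridge>
        <bridge name="core-bridge-2">
                <queue-name>jms.queue.Record</queue-name>
                <forwarding-address>jms.queue.Record</forwarding-address>
                <retry-interval>1000</retry-interval>
                <retry-interval-multiplier>1.0</retry-interval-multiplier>
                <reconnect-attempts>-1</reconnect-attempts>
                <confirmation-window-size>10485760</confirmation-window-size>
                <static-connectors>
                        <connector-ref>netty-bridge</connector-ref>
                </static-connectors>
        </bridge>
        <bridge name="core-bridge-3">
                <queue-name>jms.queue.Record</queue-name>
                <forwarding-address>jms.queue.Record</forwarding-address>
                <retry-interval>1000</retry-interval>
                <retry-interval-multiplier>1.0</retry-interval-multiplier>
                <reconnect-attempts>-1</reconnect-attempts>
                <confirmation-window-size>10485760</confirmation-window-size>
                <static-connectors>
                        <connector-ref>netty-bridge</connector-ref>
                </static-connectors>
        </bridge>
</bridges>
```

Because several bridges consume from the same queue concurrently, messages may arrive at the destination in a different order than they were published, so this approach is only suitable if strict ordering is not required.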

Ideally, JBoss should make this parallelization configurable in the core bridge.

OTHER TIPS

I believe you were using 2.2.5 (it's not clear from your post which version you were using), which had a bug in the bridges that caused the issue you describe.

In some versions, the bridge sent messages synchronously instead of relying on the asynchronous confirmations.

Take a look at how it would behave on the latest version.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow