Tomcat waiting to join JGroups cluster

https://stackoverflow.com/questions/13881735

08-12-2021

Question

I use Ehcache + JGroups to replicate the cache of my webapps across 3 Tomcat instances.

<!-- Use JGroups (UDP) to replicate the cache across the cluster -->
<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
    properties="channelName=EH_CACHE_STA::connect=UDP(mcast_addr=229.10.10.10;mcast_port=45567;):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS"
    propertySeparator="::" />

Sometimes a Tomcat instance fails to restart. In the JGroups logs I can see:

[webapp] WARN  2012-12-14 15:36:55,784 [GMS] : join(tc-fr-sta-tomcat1-32427) sent to b0dc40aa-12aa-4045-01e4-c80b013dbb13 timed out (after 5000 ms), retrying
[webapp] WARN  2012-12-14 15:36:55,785 [UDP] : tc-fr-sta-tomcat1-32427: no physical address for b0dc40aa-12aa-4045-01e4-c80b013dbb13, dropping message

It looks as if the node is trying to join itself. We have to restart every Tomcat instance in production to restore the cluster. Can anybody help me resolve this issue?

Solution

Which version of JGroups is this running with (check with java -jar jgroups.jar)? I recommend running the latest stable version. Also, set timer_type="old" in UDP.
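
As a rough sketch of the timer_type suggestion: in the plain-string configuration used in the question, protocol attributes go inside the parentheses after the protocol name, separated by semicolons. Only timer_type=old is new below; everything else is copied from the question. Whether the attribute is recognised depends on the JGroups version in use, so treat this as an assumption to check against that version's documentation.

```xml
<!-- Same factory as in the question; the only change is timer_type=old inside UDP(...) -->
<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
    properties="channelName=EH_CACHE_STA::connect=UDP(mcast_addr=229.10.10.10;mcast_port=45567;timer_type=old):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS"
    propertySeparator="::" />
```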

In addition, it would be better if Ehcache allowed the JGroups configuration to be defined in an XML file; perhaps the latest version does this? (I'm not an Ehcache expert.) Cheers, Bela
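
For illustration only, the protocol stack from the question could look roughly like this as a standalone JGroups XML file. The file name jgroups-udp.xml is hypothetical, the root element and namespace assume a JGroups 2.8 or later style configuration, and timer_type="old" assumes a version that supports that attribute. How (or whether) Ehcache can be pointed at such a file depends on the Ehcache JGroups replication version and is not shown here.

```xml
<!-- jgroups-udp.xml (hypothetical file name): the stack from the question, one element per protocol -->
<config xmlns="urn:org:jgroups">
    <!-- Transport: same multicast address and port as in the question -->
    <UDP mcast_addr="229.10.10.10"
         mcast_port="45567"
         timer_type="old"/>
    <PING/>
    <MERGE2/>
    <FD_SOCK/>
    <VERIFY_SUSPECT/>
    <pbcast.NAKACK/>
    <UNICAST/>
    <pbcast.STABLE/>
    <FRAG/>
    <pbcast.GMS/>
</config>
```

All protocols other than UDP are left at their defaults, as in the original connect string.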

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow