Question

I use EHCache + JGroups to replicate the cache of my webapps across 3 Tomcat instances.

<!-- Use JGroups (UDP) to replicate the cache across the cluster -->
<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
    properties="channelName=EH_CACHE_STA::connect=UDP(mcast_addr=229.10.10.10;mcast_port=45567;):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS"
    propertySeparator="::" />

Sometimes a Tomcat instance fails to restart. In the JGroups logs I can see:

[webapp] WARN  2012-12-14 15:36:55,784 [GMS] : join(tc-fr-sta-tomcat1-32427) sent to b0dc40aa-12aa-4045-01e4-c80b013dbb13 timed out (after 5000 ms), retrying
[webapp] WARN  2012-12-14 15:36:55,785 [UDP] : tc-fr-sta-tomcat1-32427: no physical address for b0dc40aa-12aa-4045-01e4-c80b013dbb13, dropping message

It seems the node is trying to join itself? We have to restart every Tomcat instance in production to restore the cluster. Can anybody help me resolve this issue?


Solution

Which version of JGroups are you running (java -jar jgroups.jar prints it)? I recommend running the latest stable version. Also, set timer_type="old" in UDP.
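As a rough sketch of the timer_type suggestion: in the plain-string format used in the question, the property would go inside the UDP(...) part of the connect string, next to mcast_addr and mcast_port. This assumes the JGroups version in use recognizes timer_type on the transport. Everything else is copied unchanged from the question's configuration.

```xml
<!-- Sketch only: same peer provider as in the question, with timer_type=old added to UDP -->
<cacheManagerPeerProviderFactory
    class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory"
    properties="channelName=EH_CACHE_STA::connect=UDP(mcast_addr=229.10.10.10;mcast_port=45567;timer_type=old):PING:MERGE2:FD_SOCK:VERIFY_SUSPECT:pbcast.NAKACK:UNICAST:pbcast.STABLE:FRAG:pbcast.GMS"
    propertySeparator="::" />
```

The same change has to be made on all three Tomcat instances so that every node runs an identical protocol stack.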

In addition, it would be better if ehcache allowed the JGroups config to be defined in an XML file; perhaps the latest version does this? (I'm not an ehcache expert.) Cheers, Bela

Licensed under: CC-BY-SA with attribution