Question

I have a java program that downloads information and parses it, storing results in a MySQL database. It can be started multiple times, up to 3 at the same time.

I was finding that the server load was going sky high (> 10) at certain points and while I still need to tune my MySQL server, I found that the java programs were halting or GC a lot and the server was grinding to an absolute stop.

After some reading of GC configuration articles, I'm using the following parameters with the intention of reducing the number of full GC sweeps that take place (or at least stop them halting the program):

-d64 
-Xms256m 
-Xmx256m 
-XX:PermSize=128m 
-XX:MaxPermSize=128m 
-server 
-XX:+PrintGCTimeStamps 
-Xloggc:/log/gc_$1_$2.log 
-XX:+UseConcMarkSweepGC 
-verbose:gc 
-XX:-UseAdaptiveSizePolicy

However, now I'm seeing almost constant Full GC on each run. Here's an excerpt of the log:

1057.920: [Full GC 30851K->29270K(258880K), 0.1528210 secs]
1058.965: [Full GC 30303K->29297K(258880K), 0.2060580 secs]
1062.652: [Full GC 29839K->29353K(258880K), 0.2875420 secs]
1063.639: [Full GC 30894K->29356K(258880K), 0.1839510 secs]
1070.604: [Full GC 33956K->29262K(258880K), 0.3445180 secs]
1072.146: [Full GC 29770K->29262K(258880K), 0.2201660 secs]
1074.600: [Full GC 30295K->29321K(258880K), 0.1319390 secs]
1081.234: [Full GC 31373K->29293K(258880K), 0.1931140 secs]
1086.498: [Full GC 30591K->29347K(258880K), 0.2624270 secs]

There's a Full GC pretty much every second. Have I chosen the wrong parameters or is there something else I can use to stop this?

Update:

Here's the first 10mins of a run's gc log output with -XX:+PrintGCDetails. I also removed -XX:-UseAdaptiveSizePolicy

4.794: [GC 5.367: [ParNew: 34112K->4224K(38336K), 0.0814000 secs] 34112K->8605K(520064K), 0.0816920 secs] [Times: user=0.04 sys=0.03, real=0.65 secs] 
5.843: [GC 5.843: [ParNew: 38336K->4224K(38336K), 0.0371760 secs] 42717K->15987K(520064K), 0.0372770 secs] [Times: user=0.02 sys=0.01, real=0.04 secs] 
6.153: [GC 6.153: [ParNew: 38336K->4224K(38336K), 0.0333900 secs] 50099K->24554K(520064K), 0.0334970 secs] [Times: user=0.02 sys=0.02, real=0.04 secs] 
15.668: [GC 15.668: [ParNew: 38336K->4224K(38336K), 0.0471250 secs] 58666K->30766K(520064K), 0.0472230 secs] [Times: user=0.06 sys=0.00, real=0.05 secs] 
220.547: [Full GC (System) 220.548: [CMS: 26542K->31455K(481728K), 3.3996650 secs] 57025K->31455K(520064K), [CMS Perm : 20869K->20848K(131072K)], 3.3998940 secs] [Times: user=0.38 sys=0.05, real=3.40 secs] 
448.238: [Full GC (System) 448.264: [CMS: 31455K->31236K(481728K), 2.3479700 secs] 32783K->31236K(520064K), [CMS Perm : 20898K->20895K(131072K)], 2.5008180 secs] [Times: user=0.30 sys=0.03, real=2.53 secs] 
456.217: [Full GC (System) 456.217: [CMS: 31236K->31177K(481728K), 0.2216600 secs] 33856K->31177K(520064K), [CMS Perm : 20912K->20910K(131072K)], 0.2218030 secs] [Times: user=0.21 sys=0.00, real=0.22 secs] 
480.982: [Full GC (System) 480.982: [CMS: 31177K->31251K(481728K), 0.2026140 secs] 32586K->31251K(520064K), [CMS Perm : 20915K->20912K(131072K)], 0.2027550 secs] [Times: user=0.19 sys=0.00, real=0.20 secs] 
483.419: [Full GC (System) 483.419: [CMS: 31251K->31196K(481728K), 0.2436580 secs] 33189K->31196K(520064K), [CMS Perm : 20928K->20926K(131072K)], 0.2438080 secs] [Times: user=0.23 sys=0.00, real=0.24 secs] 
484.583: [Full GC (System) 484.583: [CMS: 31196K->31233K(481728K), 0.2706090 secs] 31842K->31233K(520064K), [CMS Perm : 20980K->20973K(131072K)], 0.2707580 secs] [Times: user=0.25 sys=0.01, real=0.27 secs] 
486.361: [Full GC (System) 486.424: [CMS: 31233K->31235K(481728K), 0.2614690 secs] 32866K->31235K(520064K), [CMS Perm : 20994K->20993K(131072K)], 0.2618060 secs] [Times: user=0.21 sys=0.02, real=0.33 secs] 
521.280: [Full GC (System) 521.280: [CMS: 31235K->31163K(481728K), 0.2190910 secs] 34502K->31163K(520064K), [CMS Perm : 21010K->21009K(131072K)], 0.2192190 secs] [Times: user=0.15 sys=0.00, real=0.22 secs] 
531.169: [Full GC (System) 531.169: [CMS: 31163K->29453K(481728K), 0.2923910 secs] 32492K->29453K(520064K), [CMS Perm : 21015K->20845K(131072K)], 0.2925890 secs] [Times: user=0.27 sys=0.01, real=0.29 secs] 
532.270: [Full GC (System) 532.271: [CMS: 29453K->29384K(481728K), 0.2542390 secs] 32073K->29384K(520064K), [CMS Perm : 20852K->20850K(131072K)], 0.2780520 secs] [Times: user=0.20 sys=0.01, real=0.28 secs] 
533.320: [Full GC (System) 533.321: [CMS: 29384K->29379K(481728K), 0.2429570 secs] 30030K->29379K(520064K), [CMS Perm : 20853K->20851K(131072K)], 0.2430970 secs] [Times: user=0.17 sys=0.00, real=0.24 secs] 
542.963: [Full GC (System) 542.963: [CMS: 29379K->29443K(481728K), 0.2208060 secs] 30708K->29443K(520064K), [CMS Perm : 20854K->20851K(131072K)], 0.2209180 secs] [Times: user=0.11 sys=0.01, real=0.22 secs] 
544.949: [Full GC (System) 544.949: [CMS: 29443K->29405K(481728K), 0.1397210 secs] 31382K->29405K(520064K), [CMS Perm : 20855K->20853K(131072K)], 1.0028950 secs] [Times: user=0.14 sys=0.00, real=1.00 secs] 
546.517: [Full GC (System) 546.517: [CMS: 29405K->29464K(481728K), 0.2366270 secs] 30051K->29464K(520064K), [CMS Perm : 20858K->20857K(131072K)], 0.2367790 secs] [Times: user=0.20 sys=0.00, real=0.23 secs] 
547.508: [Full GC (System) 547.508: [CMS: 29464K->29467K(481728K), 0.1974860 secs] 30756K->29467K(520064K), [CMS Perm : 20859K->20858K(131072K)], 0.1976260 secs] [Times: user=0.17 sys=0.01, real=0.19 secs] 
560.975: [Full GC (System) 560.975: [CMS: 29467K->29382K(481728K), 0.1750130 secs] 32092K->29382K(520064K), [CMS Perm : 20865K->20864K(131072K)], 0.1751500 secs] [Times: user=0.17 sys=0.00, real=0.17 secs] 
573.975: [Full GC (System) 573.975: [CMS: 29382K->29451K(481728K), 0.2862430 secs] 30711K->29451K(520064K), [CMS Perm : 20868K->20865K(131072K)], 0.2863970 secs] [Times: user=0.19 sys=0.00, real=0.28 secs] 
575.101: [Full GC (System) 575.101: [CMS: 29451K->29381K(481728K), 0.1325170 secs] 31390K->29381K(520064K), [CMS Perm : 20869K->20867K(131072K)], 0.1326370 secs] [Times: user=0.13 sys=0.00, real=0.14 secs] 
580.396: [Full GC (System) 580.396: [CMS: 29381K->29454K(481728K), 0.1913770 secs] 30709K->29454K(520064K), [CMS Perm : 20875K->20872K(131072K)], 0.1928260 secs] [Times: user=0.19 sys=0.00, real=0.19 secs] 
582.300: [Full GC (System) 582.300: [CMS: 29454K->29403K(481728K), 0.1238430 secs] 31392K->29403K(520064K), [CMS Perm : 20875K->20873K(131072K)], 0.1239580 secs] [Times: user=0.12 sys=0.00, real=0.13 secs] 
582.982: [Full GC (System) 582.982: [CMS: 29403K->29457K(481728K), 0.1238720 secs] 30049K->29457K(520064K), [CMS Perm : 20875K->20874K(131072K)], 0.1239860 secs] [Times: user=0.12 sys=0.00, real=0.13 secs] 
584.284: [Full GC (System) 584.284: [CMS: 29457K->29461K(481728K), 0.1820700 secs] 31431K->29461K(520064K), [CMS Perm : 20875K->20874K(131072K)], 0.1823370 secs] [Times: user=0.16 sys=0.01, real=0.18 secs] 
592.363: [Full GC (System) 592.364: [CMS: 29461K->29385K(481728K), 0.2151030 secs] 32081K->29385K(520064K), [CMS Perm : 20892K->20890K(131072K)], 0.2153350 secs] [Times: user=0.18 sys=0.02, real=0.22 secs] 
599.096: [Full GC (System) 599.096: [CMS: 29385K->29408K(481728K), 0.1319160 secs] 30713K->29408K(520064K), [CMS Perm : 20894K->20890K(131072K)], 0.1320350 secs] [Times: user=0.13 sys=0.00, real=0.14 secs]
Was it helpful?

Solution

First, use -XX:+PrintGCDetails to see the full set of GC stats for each collection. Then you should compute the size of Live Data Set, Allocation Rate, and Promotion Rate. This will tell you if you need a bigger Heap or if your eg. Young Gen is too small. See Advanced JVM Tuning for tips howto compute these stats. After you have these stats, you can Tune your Young generation with -XX:InitialSurvivorRatio=3 -XX:SurvivorRatio=3 -XX:TargetSurvivorRatio=90 and sizes with e.g. -Xmn -Xmx -Xms. Adjust the numbers to your Allocation/Promotion rate.

Don't use -XX:-UseAdaptiveSizePolicy just yet. This option is to disable automatic GC ergonomics so that you can tune the Heap parameters. But if you do not know the allocation behavior of your app, then this parameter won't help. Further, the Adaptive Size Policy is enabled only for the Throughput collector (-XX:+UseParallelOldGC).

Also, If you want to tune your GC for throughput, you should be using the -XX:+UseParallelOldGC and e.g. -XX:ParallelGCThreads=4. This will enable multithreaded Young/Old Generation GCs (provided that you run on multicore).

Further, setup the tenuring with -XX:MaxTenuringThreshold=15 and check with -XX:+PrintTenuringDistribution.

UPDATE:

Looking at the logs, please, could you use also -XX:+DisableExplicitGC, because Full GC (System) means that somebody is calling System.gc().

OTHER TIPS

Memory is cheap, and a young gen is much cheaper to do than an old gen GC.

Try maybe 2 or 4 GB heap... should be much fewer full GCs (maybe 0 if your app is well behaved enough). A parnew should only take a few millis.

Based on those parameters, you shouldn't be hitting a full GC based on your old generation running low on space, as the old gen should have a capacity of over 200MB, and you're only apparently using 30MB.

The CMS collector can give you more information about what's going on if you use the PrintGCDetails flag. Add this to your parameters: -XX:+PrintGCDetails.

My first thought was that you would fine that the full GC's are caused by concurrent mode failures. Some reading about those:

Edit:

Based on your detailed GC logs, and some further looking about, it seems that when the logs says Full GC (System), it's an indication of an explicit GC invocation. If you're not calling it yourself, perhaps a library you're using is?

The answer there refers to making a custom Runtime class that can help you track down who's calling GC. One way to do that in Java 6 is to follow these instructions to add things to your bootclasspath.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top