Question

I have two questions. And one of them will be ragarding the topic:)

1) I have faced with a problem of not being able to find the full information about how different garbage collector work in HotSpot. But I am not talking about general descriptions of garbage collector's work (we have a lot of this information on the internet), I am talking about concrete algorithms. I have found this whitepaper (Memory managment in the Java HotSpot Virtual Machine) http://www.oracle.com/technetwork/java/javase/tech/memorymanagement-whitepaper-1-150020.pdf . But it has only general ideas. It has a good description (probably it is not so good - see my second question) of the Parallel compacting algorithm (I mean parallel mark-sweep-compact) but it doesn't explain other garbage collector's algorithms. However this whitepaper is the best information I was able to find on the Internet. What I would like to know is where to get a full description/information about how different garbage collectors (for young gen I mean: ParNew, DefNew, PSYoungGen; for old gen: PSOLdGen, ParOldGen, Concurrent-Mark-Sweep) work. Can't believe that this information is not available to the users.

2) The question regarding the Parallel Compacting Collector algorithm (ParOldGen or Parallel Mark-Sweep-Compact). The whitepaper (see the first question) has a description of it's work. Let me paste a quote from the whitepaper (please, spend a minute to take a look on it): enter image description here

The things I cannot understand are listed below:

Regarding summary phase:

  • Due to compactions from previous collections, it is typical that some portion of the left side of each generation will be dense, containing mostly live objects. The amount of space that could be recovered from such dense regions is not worth the cost of compacting them.

well, does it mean that when we have a region which consist of 98-99% of live objects and 2-1% of dead objects (in other words a very small percentage of the dead objects) than compacting of this region is not worth the space that could be recovered from such a region. However this tiny free spaces (holes) will be eventually filled up and there will be no holes after garbage collections is finished.

  • So the first thing the summary phase does is examine the density of the regions, starting with the leftmost one, until it reaches a point where the space that could be recovered from a region and those to the right of it is worth the cost of compacting those regions.

well, if we have a big percentage of the dead objects than this region is worth compacting, right?

  • The regions to the left of that point are referred to as the dense prefix, and no objects are moved in those regions.

"and no objects are moved in those regions", but there might be some little free spaces in those regions, am I right? Can't uderstand the point

  • The regions to the right of that point will be compacted, eliminating all dead space.

please clarify how they will be compacted. Every region will be compacted separately? I guess no. So maybe some kind of shifting will be here?

  • The summary phase calculates and stores the new location of the first byte of live data for each compacted region.

to understand it I need to understand the previous question I suppose.

Regarding compaction phase:

  • In the compaction phase, the garbage collection threads use the
    summary data to identify regions that need to be filled, and the
    threads can independently copy data into the regions. This produces a heap that is densely packed on one end, with a single large empty
    block at the other end.

I am totally confused. So no compaction happened on the "summary phase"? Was the previous phase's purpose only to find all free spaces?

Please help me to get a clear picture.

Was it helpful?

Solution

This is just a general description of an algorithm. Such descriptions can be of different detail. In this case, it gives you most details, but still leaves some choices for the implementer.

Regarding your questions:

  1. So no compaction happened on the "summary phase"? Was the previous phase's purpose only to find all free spaces? - Yes, that is correct. The summary phase gathers indexing data and basically determines everything necessary, so that the compaction phase can then perform the copying. They don't tell how they implement compaction, but the default way is simply placing every live object right next to the previous object. Basically, all empty space is removed and after the compaction step has completed, you have one contiguous chunk of memory, containing all live objects. I see your confusion with the fourth part, but note that it is written in future tense: 'will be compacted' - So not during summary, but later.
  2. Does it mean [...] compacting of this region in not worth the space that could be recovered from such a region? Yes, that is right. You essentially lose some space, but it is very common to trade of memory for execution speed. The exact density threshold is up to the implementation, but I would ballpark the used-to-total-memory ratio threshold at around 70-90%.

If you want to know all the dirty details, have a look at an open source VM implementations, as suggested in the comments.

OTHER TIPS

If you really need a detailed understanding of hoe the collectors work you can read the code. The reason you don't find many detailed pages on this is that the collectors are designed to take care of memory management for you are if you start worrying about the details you have gone down the wrong road.

The best solution is to use a memory profiler and reduce your allocation rate. No amount of tuning or messing with your command line options (unless you have a mis-configured GC) will compare to reducing this allocation rate.

However to respond to your questions.

parallel mark-sweep-compact

There is no such this. The is the Parallel Collector which compacts and the Concurrent Mark Sweep which doesn't. There is also a G1 collector which is not generational in the same way. i.e. it collects both young and old objects.

Can't believe that this information is not eligible for the users.

By design, developer don't need to know this much detail. Nor is it a good idea to over tune your application because this makes it very brittle to changes in the application or JVM.

What I would like to know is where to get a full description/information about how different garbage collectors

Instead of saying, I would like to know everything there is to know (there is 500+ options alone) without having to read the code, you should try to solve a specific problem, and ask a specific question.

Every region will be compacted separately? I guess no. So maybe some kind of shifting will be here?

Only the tenured space is compacted. The young regions are copied repeatedly and never need compacting.

So no compaction happened on the "summary phase"? Was the previous phase's purpose only to find all free spaces?

The compaction phase does a best effort copy to one end of the region without defragmenting it completely. This leave one end with some objects (mostly large ones I imagine) and the other end very dense.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top