Question

All!

I found strange code in LinkedBlockingQueue:

private E dequeue() {
        // assert takeLock.isHeldByCurrentThread();
        Node<E> h = head;
        Node<E> first = h.next;
        h.next = h; // help GC
        head = first;
        E x = first.item;
        first.item = null;
        return x;
}

Who can explain why do we need local variable h? How can it help for GC?

Was it helpful?

Solution

To better understand what happens let's see how the list looks like after executing the code. First consider an initial list:

1 -> 2 -> 3

Then h points to head and first to h.next:

1 -> 2 -> 3
|    |
h    first

Then h.next points to h and head points to first:

1 -> 2 -> 3
|   / \
h head first

Now, practically we know that there is only active reference pointing to the first element, which is by itself (h.next = h), and we also know that the GC collects objects that have no more active references, so when the method ends, the (old) head of the list ca be safely collected by the GC since h exists only within the scope of the method.

Having said this, it was pointed out, and I agree with this, that even with the classic dequeue method (i.e. just make first point to head.next and head point to first) there are no more references pointing towards the old head. However, in this scenario, the old head is left dangling in memory and still has its next field pointing towards first, whereas in the code you posted, the only thing left is an isolated object pointing to itself. This may be triggering the GC to act faster.

OTHER TIPS

If you look at the jsr166 src then you will find the offending commit

http://gee.cs.oswego.edu/cgi-bin/viewcvs.cgi/jsr166/src/main/java/util/concurrent/LinkedBlockingQueue.java?view=log (see v 1.51)

This shows the answer is in this bug report

http://bugs.sun.com/view_bug.do?bug_id=6805775

The full discussion is in this thread

http://thread.gmane.org/gmane.comp.java.jsr.166-concurrency/5758

The "helping GC" bit is about avoiding things bleeding into tenured.

Cheers

Matt

Maybe a bit late, but the current explanation is completely unsatisfactory to me and I think I've got a more sensible explanation.

First of all every java GC does some kind of tracing from a root set one way or another. This means that if the old head is collected we won't read the next variable anyhow - there's no reason to do so. Hence IF head is collected in the next iteration it doesn't matter.

The IF in the above sentence is the important part here. The difference between setting next to something different doesn't matter for collecting head itself, but may make a difference for other objects.

Let's assume a simple generational GC: If head is in the young set, it will be collected in the next GC anyhow. But if it's in the old set it will only be collected when we do a full GC which happens rarely.

So what happens if head is in the old set and we do a young GC? In this case the JVM assumes that every object in the old heap is still alive and adds every reference from old to young objects to the root set for the young GC. And that's exactly what the assignment avoids here: Writing into the old heap is generally protected with a write barrier or something so that the JVM can catch such assignments and handle them correctly - in our case it removes the object next pointed to from the root set which does have consequences.

Short example:

Assume we have 1 (old) -> 2 (young) -> 3 (xx). If we remove 1 and 2 now from our list, we may expect that both elements would be collected by the next GC. But if only a young GC occurs and we have NOT removed the next pointer in old, both elements 1 and 2 won't be collected. Contrary to this if we have removed the pointer in 1, 2 will be collected by the young GC..

Here's a code sample that illustrates the question: http://pastebin.com/zTsLpUpq. Performing a GC after runWith() and taking a heap dump for both versions says there's only one Item instance.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top