With double-checked locking, does a put to a volatile ConcurrentHashMap have happens-before guarantee?

Question 1

Yes, that is correct: volatile covers only reads and writes of the reference itself, not the state of the object it points to.

No, putting an element into a HashMap held in a volatile field does not create a happens-before relationship through that field, and the same is true for a ConcurrentHashMap: a put() only reads the field, it never writes it.

Note also that ConcurrentHashMap does not take a lock for read operations such as containsKey(). See the ConcurrentHashMap Javadoc.

Update:

Reflecting your updated question: you have to synchronize on the object you put into the CHM. I recommend using a container object instead of storing the Object directly in the map:

public class ObjectContainer {
    volatile boolean isSetupDone = false;
    Object o;
}

static ConcurrentHashMap<String, ObjectContainer> containers = 
    new ConcurrentHashMap<String, ObjectContainer>();

public Object getInstance(String groupId) {
  ObjectContainer oc = containers.get(groupId);
  if (oc == null) {
    // it's enough to sync on the map, don't need the whole class
    synchronized(containers) {
      // double-check not to overwrite the created object
      if (!containers.containsKey(groupId)) {
        oc = new ObjectContainer();
        containers.put(groupId, oc);
      } else {
        // if another thread already created, then use that
        oc = containers.get(groupId);
      }
    } // leave the map-level sync block
  }

  // here we have a valid ObjectContainer, but may not have been initialized

  // same doublechecking for object initialization
  if(!oc.isSetupDone) {
    // now syncing on the ObjectContainer only
    synchronized(oc) {
      if(!oc.isSetupDone) {
        oc.o = new String("typically a more complicated operation");
        oc.isSetupDone = true;
      }        
    }
  }
  return oc.o;
}

Note that at most one thread at a time can create an ObjectContainer, but initialization can run in parallel for different groups (with at most one thread per group).

It may also happen that thread T1 creates the ObjectContainer but thread T2 initializes it.

Yes, it is worth keeping the ConcurrentHashMap, because map reads and writes can happen at the same time (the first get() runs outside the synchronized block). But volatile is not required, since the map reference itself never changes.

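For the second check on the map I used containsKey instead of a second containers.get(groupId), because I was worried that the result of the first get might be reused. That worry is probably unnecessary: the put happens inside synchronized(containers), so any thread that later enters the same synchronized block is guaranteed to see it, and a second get inside the block would work just as well (and would save the extra lookup in the else branch). The volatile isSetupDone flag is what makes the unsynchronized first check on the container safe.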
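If Java 8 or later is available, the same "create at most once per group" behavior can be sketched more compactly with ConcurrentHashMap.computeIfAbsent. This is an alternative to the double-checked version, not a restatement of it; the class name GroupRegistry and the create helper are hypothetical names chosen for illustration.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical registry class, used only to illustrate the idea.
final class GroupRegistry {
    // final is enough here: the map reference never changes after class init
    private static final ConcurrentMap<String, Object> INSTANCES =
        new ConcurrentHashMap<>();

    static Object getInstance(String groupId) {
        // ConcurrentHashMap applies the mapping function at most once per key;
        // other callers for the same key get the already stored value
        return INSTANCES.computeIfAbsent(groupId, GroupRegistry::create);
    }

    // Stand-in for the "typically more complicated" setup (assumption)
    private static Object create(String groupId) {
        return new String("instance for " + groupId);
    }
}
```

One caveat from the Javadoc: other updates to the map may block while the mapping function runs, so this fits best when the setup is reasonably short and does not touch the same map.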

Question 2

Therefore writing an element within that Object (if it is e.g. a standard HashMap, performing a put() operation on it) will not establish such a relationship. Is that correct?

Yes and no. A write to a volatile field always happens-before every subsequent read of that field. The issue in your case is that this only covers access to the field holding the HashMap; there is no memory synchronization or mutual exclusion while you are actually operating on the HashMap. So multiple threads can see different versions of the same HashMap, and racing updates can leave the data structure corrupted.
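To make the distinction concrete, here is a minimal sketch of a pattern where the volatile field alone is sufficient: the map is never mutated after it is published, and every update replaces the reference. The class name SnapshotHolder is made up for this example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder: readers only ever see fully built, immutable maps.
final class SnapshotHolder {
    private volatile Map<String, String> snapshot = Collections.emptyMap();

    // Writers are serialized so that concurrent updates are not lost
    synchronized void put(String key, String value) {
        Map<String, String> copy = new HashMap<>(snapshot);
        copy.put(key, value);                            // mutate the private copy only
        snapshot = Collections.unmodifiableMap(copy);    // volatile write publishes it
    }

    String get(String key) {
        // volatile read: sees everything done to the map before it was published
        return snapshot.get(key);
    }
}
```

Here the happens-before edge comes from writing the field, which is exactly what a plain put() on a shared HashMap never does. The cost is a full copy per update, so this only suits maps that are read far more often than they are written.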

Now, with using a volatile ConcurrentHashMap, will writing an element to it establish the happens-before relationship, i.e. will the above still work?

Typically you do not need to mark a ConcurrentHashMap field as volatile. Memory barriers are crossed inside the ConcurrentHashMap code itself. The only time I'd use volatile is if the field itself is reassigned to a different map, i.e. it is non-final.
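As a rough illustration of those internal barriers, the sketch below hands an object with a plain (non-volatile) field from one thread to another through a final ConcurrentHashMap. Payload and HandOff are hypothetical names; the guarantee relied on is the one documented for java.util.concurrent collections, that actions before placing an object into the collection happen-before actions after another thread retrieves it.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Deliberately plain object: no volatile, no synchronization of its own
final class Payload {
    int value;
}

final class HandOff {
    // final, not volatile: the reference is never reassigned
    private static final ConcurrentMap<String, Payload> MAP =
        new ConcurrentHashMap<>();

    static int publishAndRead() {
        Thread writer = new Thread(() -> {
            Payload p = new Payload();
            p.value = 42;        // plain write, done before the put
            MAP.put("k", p);     // the put publishes p and the write above
        });
        writer.start();

        Payload seen;
        // spin until the reader thread observes the mapping
        while ((seen = MAP.get("k")) == null) {
            Thread.yield();
        }
        return seen.value;       // visible because get follows the put
    }
}
```

With a plain HashMap in place of the ConcurrentHashMap, neither the loop terminating nor the value read would be guaranteed.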

Your code really seems like premature optimization. Has a profiler shown you that it is a performance problem? I would suggest that you just synchronize on the map and be done with it. Having two ConcurrentHashMaps to solve this problem seems like overkill to me.
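A minimal sketch of that simpler approach, assuming a plain HashMap guarded by a single lock (SimpleRegistry is a made-up name, and the String stands in for the real setup work):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry: one lock, no volatile, no double-checking.
final class SimpleRegistry {
    private static final Map<String, Object> INSTANCES = new HashMap<>();

    static Object getInstance(String groupId) {
        // every read and write of the map happens under the same lock
        synchronized (INSTANCES) {
            Object o = INSTANCES.get(groupId);
            if (o == null) {
                o = new String("typically a more complicated operation");
                INSTANCES.put(groupId, o);
            }
            return o;
        }
    }
}
```

The trade-off is that setup for one group blocks lookups for all groups while it runs. Whether that matters is the kind of thing a profiler can answer.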