Is there any penalty in returning members of nested objects?

Question 1

The standard places no constraints on this. A compiler writer with a really twisted mind could, for example, generate a loop which does nothing at the start of every function, with the number of times through the loop depending on the number of letters in the function name. Fully conforming, but... I rather doubt he'd have many users for his compiler.

In practice, it's just (barely) conceivable that a compiler would work out the address of each sub-object step by step; e.g. on an Intel processor, it might do something like:

D::direct:
    mov eax, [ecx + offset m_D]
    ret

D::ind1:
    lea ebx, [ecx + offset instance_A]
    mov eax, [ebx + offset m_D]
    ret

D::ind2:
    lea ebx, [ecx + offset instance_B]
    lea ebx, [ebx + offset m_A_of_B]
    mov eax, [ebx + offset m_D]
    ret

In fact, all of the compilers I've ever seen work out the complete layout of the directly contained objects, and would generate something like:

D::direct:
    mov eax, [ecx + offset m_D]
    ret

D::ind1:
    mov eax, [ecx + offset instance_A + offset m_D]
    ret

D::ind2:
    mov eax, [ecx + offset instance_B + offset m_A_of_B + offset m_D]
    ret

(The additions of the offsets in the square brackets occur in the assembler; each expression corresponds to a single constant within the instruction in the actual executable.)
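The point about the offsets collapsing into one constant can be checked from C++ itself. The following is a minimal sketch, assuming a hypothetical layout for A, B and D (the original class definitions are not shown here, and pad_B is an invented filler member): the per-level offsets add up at compile time, and the sum matches the distance actually measured at run time.

```cpp
#include <cstddef>

// Hypothetical layout, assumed for illustration only.
struct A { int m_A; };
struct B { int pad_B; A m_A_of_B; };   // pad_B is an invented filler member
struct D {
    int m_D;
    A   instance_A;
    B   instance_B;
};

// Sum of the per-level offsets, computed entirely at compile time.
constexpr std::size_t kInd2Offset =
    offsetof(D, instance_B) + offsetof(B, m_A_of_B) + offsetof(A, m_A);

// Distance in bytes from the start of an object to one of its members.
std::size_t byte_distance(const void* base, const void* member) {
    return static_cast<std::size_t>(
        static_cast<const char*>(member) - static_cast<const char*>(base));
}

// True if the measured distance equals the compile-time constant.
bool offsets_match() {
    D d{};
    return byte_distance(&d, &d.instance_B.m_A_of_B.m_A) == kInd2Offset;
}
```

Because kInd2Offset is a constexpr value, nothing is left to add up at run time; this is the source-level counterpart of the single constant in the instruction.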

So in answer to your questions: the answer to 1 is that it's completely compiler-dependent, and the answer to 2 is that in actual practice there will be absolutely no difference.

Finally, all of your functions are inline. And they are simple enough that every compiler will inline them, at least with any degree of optimization activated. Once they are inlined, the optimizer may find additional optimizations: it may be able to detect that you initialized D::instance_B::m_A_of_B::m_A with a constant, for example, in which case it will just use the constant, and there won't be any access whatsoever. In fact, you're wrong to worry about this level of optimization, because the compiler will take care of it for you, better than you can.
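As a rough illustration of the constant case, here is a sketch that marks the accessors constexpr so the compiler is obliged to evaluate the whole nested chain at compile time. The class layout and the constructor are assumptions reconstructed from the member names used in this thread. Whether an optimizer does the same for ordinary (non-constexpr) code depends on the compiler and its settings.

```cpp
// Hypothetical reconstruction of the classes, for illustration only.
struct A { int m_A; };
struct B { A m_A_of_B; };

class D {
public:
    constexpr D(int d, int a, int b)
        : m_D(d), instance_A{a}, instance_B{{b}} {}

    constexpr int direct() const { return m_D; }
    constexpr int ind1()   const { return instance_A.m_A; }
    constexpr int ind2()   const { return instance_B.m_A_of_B.m_A; }

private:
    int m_D;
    A   instance_A;
    B   instance_B;
};

// The nested access is resolved entirely at compile time here.
constexpr D kSample(1, 2, 3);
static_assert(kSample.ind2() == 3, "nested access folds to a constant");

// Ordinary run-time use of the same accessors.
int sum_all() {
    D d(1, 2, 3);
    return d.direct() + d.ind1() + d.ind2();
}
```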

Question 2

My understanding is that there is no difference.

If you have (e.g.) a D object on the stack, then accessing any member or nested member is simply a fixed offset from the stack pointer. If the D object is on the heap, then it's a fixed offset from a pointer instead, which is not really different.

This is because a D object directly contains all its members, each of which directly contains its own members.
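A small sketch of what "directly contains" means in practice, again assuming a hypothetical layout for A, B and D: the nested member's storage lies inside the bytes of the enclosing D object, whether that object lives on the stack or on the heap.

```cpp
#include <cstdint>
#include <memory>

// Hypothetical layout, assumed for illustration only.
struct A { int m_A; };
struct B { A m_A_of_B; };
struct D { int m_D; A instance_A; B instance_B; };

// True if `member` is stored within the byte range occupied by `d`.
bool lies_inside(const D& d, const int& member) {
    const auto begin = reinterpret_cast<std::uintptr_t>(&d);
    const auto end   = begin + sizeof(D);
    const auto p     = reinterpret_cast<std::uintptr_t>(&member);
    return begin <= p && p + sizeof(int) <= end;
}

// D with automatic storage (on the stack).
bool stack_case() {
    D d{};
    return lies_inside(d, d.instance_B.m_A_of_B.m_A);
}

// D with dynamic storage (on the heap).
bool heap_case() {
    std::unique_ptr<D> d(new D{});
    return lies_inside(*d, d->instance_B.m_A_of_B.m_A);
}
```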

Question 3

There is no overhead as long as the object is a direct member (so, not a pointer or reference member): the compiler just calculates the appropriate offset, whether you have one, two, three or fifty-four levels of nesting. This assumes you use a reasonably sane compiler. As the comment says, there is nothing STOPPING some obstinate compiler from producing awful code in this case. The same applies to many cases where, with some experience, you can guess what the compiler will do: there are very few places where the C++ standard forbids the compiler from adding extra code that doesn't do anything particularly useful.

Obviously a reference or a pointer member would incur the overhead of first reading the address of the actual object.
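To make the contrast concrete, here is a minimal sketch with two hypothetical holder types (DirectHolder and PointerHolder are invented names). With the pointer member, the object being read lives elsewhere and can change at run time, so its address has to be loaded before the member can be read; it cannot be folded into a constant offset the way a direct member can.

```cpp
// Hypothetical types, for illustration only.
struct A { int m_A; };

// The A is stored inside the holder: access is base + constant offset.
struct DirectHolder {
    A a;
    int get() const { return a.m_A; }
};

// Only an address is stored: access loads the pointer, then the member.
struct PointerHolder {
    const A* a;
    int get() const { return a->m_A; }
};

int demo() {
    A first{10};
    A second{20};

    DirectHolder  direct{first};     // holds its own copy of first
    PointerHolder indirect{&first};  // refers to first

    const int before = indirect.get();  // reads first.m_A
    indirect.a = &second;               // repoint at run time
    const int after = indirect.get();   // now reads second.m_A

    return direct.get() + before + after;
}
```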
