Question

Consider the following code, regarding nested access of members:

#include <cstddef> // for size_t

struct A
{
  size_t m_A;
};
struct B
{
  A m_A_of_B;
};
class D
{
  B instance_B;
  A instance_A;
  size_t m_D;
public:
  size_t direct (void) { return m_D; }
  size_t ind1 (void) { return instance_A.m_A; }
  size_t ind2 (void) { return instance_B.m_A_of_B.m_A; }
};

I can imagine two different cases here:

1. No difference

To my understanding there should be no difference, since each function returns a value whose position relative to this (i.e. within the class's memory layout) is a compile-time constant.

I expect the compiler to recognize it.

Therefore, I assume that there is no penalty in returning members from nested structures like the ones shown above (or even deeper ones).

2. Pointer indirection

Alternatively, it may be that the whole "indirection" chain is actually carried out. In ind2, for example:

fetch this -> fetch relative position of instance_B -> fetch relative position of m_A_of_B -> return m_A


Questions

  1. Is it compiler-dependent how this nested access is handled?
  2. Is there any difference between those three functions?

I ask because my view of this issue is only an assumption based on what I know about how things work. Since some of my assumptions have proven wrong in the past, I want to ask to be sure here.

Excuse me if this has already been asked; if so, please point me to the appropriate answer.

PS: You don't need to give any hints about "premature optimization being the root of all evil" or about profiling. I can profile this issue with the compiler I am developing with, but the program I am aiming at may be compiled with any conforming compiler. So even if I'm not able to measure any difference, one may still be present.


Solution

The standard places no constraints on this. A compiler writer with a really twisted mind could, for example, generate a loop which does nothing at the start of every function, with the number of times through the loop depending on the number of letters in the function name. Fully conforming, but... I rather doubt he'd have many users for his compiler.

In practice, it's just (barely) conceivable that the compiler would work out the address of each sub-object separately; e.g. on Intel, do something like:

D::direct:
    mov eax, [ecx + offset m_D]
    ret

D::ind1:
    lea ebx, [ecx + offset instance_A]
    mov eax, [ebx + offset m_A]
    ret

D::ind2:
    lea ebx, [ecx + offset instance_B]
    lea ebx, [ebx + offset m_A_of_B]
    mov eax, [ebx + offset m_A]
    ret

In fact, all of the compilers I've ever seen work out the complete layout of the directly contained objects, and would generate something like:

D::direct:
    mov eax, [ecx + offset m_D]
    ret

D::ind1:
    mov eax, [ecx + offset instance_A + offset m_A]
    ret

D::ind2:
    mov eax, [ecx + offset instance_B + offset m_A_of_B + offset m_A]
    ret

(The additions of the offsets within the square brackets occur in the assembler; each expression collapses to a single constant within the instruction in the actual executable.)

So in answer to your questions: 1. it is completely compiler-dependent, and 2. in actual practice, there will be absolutely no difference.

Finally, all of your functions are inline, and they are simple enough that every compiler will inline them, at least with any degree of optimization enabled. Once inlined, the optimizer may find additional optimizations: it may detect, for example, that you initialized instance_B.m_A_of_B.m_A with a constant, in which case it will just use the constant, and there won't be any access whatsoever. In fact, you're wrong to worry about this level of optimization, because the compiler will take care of it for you, better than you can.

OTHER TIPS

My understanding is: there is no difference.

If you have (e.g.) a D object on the stack, then accessing any member or nested member is simply a stack offset. If the D object is on the heap, it's a pointer offset, but not really different.

This is because a D object directly contains all its members, each of which directly contains its own members.

There is no overhead as long as the object is a direct member (so, not a pointer or reference member): the compiler just calculates the appropriate offset, whether you have one, two, three, or fifty-four levels of nesting. This assumes a reasonably sane compiler; as the solution above notes, nothing STOPS an obstinate compiler from producing awful code here. The same caveat applies to many cases where experience lets you guess what a compiler will do: there are very few places where the C++ standard mandates that the compiler not add extra code that does nothing useful.

Obviously, a reference or pointer member would add the overhead of reading the address of the actual object.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow