Overhead for "rich type" struct in C++

https://stackoverflow.com/questions/15689553

30-03-2022

Question

I'd like to keep track of what is essentially "type" information at compile time for a few functions which currently take arguments of the same type. Here's an example: say I have two functions, getThingIndex(uint64_t t) and getThingAtIndex(uint64_t tidx). The first function treats the argument as an encoding of the thing, does a non-trivial computation of an index, and returns it. One can then get the actual "thing" by calling getThingAtIndex. getThingAtIndex, on the other hand, assumes you're querying the structure and already have an index. The latter of the two methods is faster, but, more importantly, I want to avoid the headaches that might result from passing a thing to getThingAtIndex or from passing an index to getThingIndex.

I was thinking about creating types for thing and thing index sort of like so:

struct Thing { uint64_t thing; };
struct ThingIndex { uint64_t idx; };

And then changing the signatures of the functions above to be

getThingIndex(Thing t)
getThingAtIndex(ThingIndex idx)

Now, despite the fact that Thing and ThingIndex encode the same underlying type, they are nonetheless distinct at compile time and I have less opportunity to make stupid mistakes by passing an index to getThingIndex or a thing to getThingAtIndex.
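To make the idea concrete, here is a minimal sketch of the two wrapper types and the retyped functions. The function bodies are placeholders invented for illustration (the real index computation is not shown in the question), and the static_asserts only confirm that neither wrapper converts implicitly to the other or from a bare integer.

```cpp
#include <cstdint>
#include <type_traits>

// The two wrapper types from the question.
struct Thing      { std::uint64_t thing; };
struct ThingIndex { std::uint64_t idx; };

// Hypothetical stand-ins for the real functions; the arithmetic is
// made up purely so the example compiles and can be checked.
constexpr ThingIndex getThingIndex(Thing t) {
    return ThingIndex{t.thing % 1024};
}
constexpr Thing getThingAtIndex(ThingIndex idx) {
    return Thing{idx.idx + 1};
}

// The wrappers are unrelated types, so mixing them up is a compile error.
static_assert(!std::is_convertible<Thing, ThingIndex>::value,
              "a Thing must not be usable as a ThingIndex");
static_assert(!std::is_convertible<ThingIndex, Thing>::value,
              "a ThingIndex must not be usable as a Thing");
static_assert(!std::is_convertible<std::uint64_t, ThingIndex>::value,
              "a bare integer must not silently become a ThingIndex");
```

With these definitions, a call such as getThingAtIndex(Thing{5}) is rejected by the compiler, while getThingAtIndex(getThingIndex(Thing{5})) is accepted.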

However, I'm concerned about the overhead of this approach. The functions are called many (10s-100s of millions of) times, and I'm curious if the compiler will optimize away the creation of these structures which essentially do nothing but encode compile-time type information. If the compiler won't perform such an optimization, is there a way to create these types of "rich types" with zero overhead?

Solution

Take a look at the disassembly. First, a plain unsigned long long:

unsigned long long * x = new unsigned long long;
0110784E  push        8  
01107850  call        operator new (01102E51h)  
01107855  add         esp,4  
01107858  mov         dword ptr [ebp-0D4h],eax  
0110785E  mov         eax,dword ptr [ebp-0D4h]  
01107864  mov         dword ptr [x],eax  
*x = 5;
01107867  mov         eax,dword ptr [x]  
0110786A  mov         dword ptr [eax],5  
01107870  mov         dword ptr [eax+4],0

And the same code using the struct:

struct Thing { unsigned long long a; };
Thing * thing = new Thing;
0133784E  push        8  
01337850  call        operator new (01332E51h)  
01337855  add         esp,4  
01337858  mov         dword ptr [ebp-0D4h],eax  
0133785E  mov         eax,dword ptr [ebp-0D4h]  
01337864  mov         dword ptr [thing],eax  
thing->a = 5;
01337867  mov         eax,dword ptr [thing]  
0133786A  mov         dword ptr [eax],5  
01337870  mov         dword ptr [eax+4],0

There is no difference between the two instruction sequences. The compiler doesn't care that thing->a is a member of a struct; it accesses it exactly as if you had declared a plain unsigned long long.
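You can also ask the compiler to confirm some of this without reading assembly. The sketch below assumes a mainstream ABI, where a struct holding a single uint64_t has the same size and alignment as the integer itself. The language does not strictly guarantee that, which is why it is written as a static_assert instead of taken for granted. rawLookup and wrappedLookup are hypothetical functions used only to compare the two calling styles.

```cpp
#include <cstdint>
#include <type_traits>

struct Thing { std::uint64_t thing; };

// Layout checks: the wrapper should add no bytes and no stricter alignment.
// These hold on common compilers; the build fails if they do not.
static_assert(sizeof(Thing) == sizeof(std::uint64_t),
              "wrapper is expected to add no padding");
static_assert(alignof(Thing) == alignof(std::uint64_t),
              "wrapper is expected to keep the same alignment");

// Trivially copyable and standard layout: copying a Thing is a plain
// 8-byte copy, with no hidden constructor or destructor calls.
static_assert(std::is_trivially_copyable<Thing>::value,
              "wrapper should copy like a plain integer");
static_assert(std::is_standard_layout<Thing>::value,
              "wrapper should have a plain C-style layout");

// Hypothetical pair of functions doing the same work on the raw value
// and on the wrapped value, for comparing generated code.
constexpr std::uint64_t rawLookup(std::uint64_t t) { return t * 2 + 1; }
constexpr std::uint64_t wrappedLookup(Thing t)     { return t.thing * 2 + 1; }
```

These checks cover layout only. Whether a by-value Thing argument or return value travels in a register depends on the platform's calling convention, so for a hot path it is still worth comparing the optimized disassembly of the two versions on your own compiler and target.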

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow