Question

I'm trying to orient my code to use the cache as efficiently as possible using data-oriented design; it's my first time thinking about such things, as it goes. I've worked out a way to loop over the same instructions that draw a sprite on screen: the vectors passed to the function hold the positions and sprites for all game entities.

My question is: does the conditional statement evict the draw function from the instruction cache and therefore ruin my plan? Or is what I'm doing just generally insane?

struct position
{
    position(int x_, int y_) : x(x_), y(y_) {}
    int x, y;
};

vector<position> thePositions;
vector<sprite> theSprites;
vector<int> theNoOfEntities; // e.g. 3 things, 4 thingies, 36 dodahs
int noOfEntitiesTotal;

//invoking the draw function
draw(&thePositions[0], &theSprites[0], &theNoOfEntities[0], noOfEntitiesTotal);

void draw(position* thepos, sprite* thesp, int* theints, int totalsize)
{
    // i walks every entity; j tracks which sprite the current entity uses
    for (int j = 0, i = 0; i < totalsize; i++)
    {
        j += i % theints[j] ? 1 : 0;
        thesp[j].draw(thepos[i]);
    }
}

Solution

Did you verify that the conditional actually survives as a conditional in the generated assembly? Generally, with simple conditionals like the one above, the expression can be optimized into a branchless sequence (either at the machine level using machine-specific instructions, or at the IR level using some fancy bit math).
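For illustration, here is a minimal C++ sketch (variable names hypothetical) of what that folding amounts to: both lines compute the same 0-or-1 value, and a decent optimizer will effectively emit the second, branch-free form for either spelling:

// what the source says: looks like it needs a branch
int inc = (i % n) ? 1 : 0;

// what the optimizer effectively produces: remainder -> 0/1 via flag tricks,
// no conditional jump involved
int inc_branchless = static_cast<int>((i % n) != 0);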

In your case, your conditional gets folded down very nicely on x86 to a flat sequence (and AFAIK this will occur on most non-x86 platforms too, as it's a mathematical optimization, not a machine-specific one):

IDIV DWORD PTR SS:[ARG.1]                ; remainder of i % theints[j] lands in EDX
MOV EAX,EDX                              ; copy the remainder into EAX
NEG EAX                                  ; sets the carry flag if the remainder is nonzero
SBB EAX,EAX                              ; EAX = -1 if carry was set, else 0
NEG EAX                                  ; EAX = 1 if remainder nonzero, else 0 -- a boolean, no branch

So this means there aren't any branches to predict, other than your outer loop, which follows a regular pattern and therefore won't cause mis-predictions (it might mis-predict on exit, depending on the generated assembly, but at that point the loop is done, so it doesn't matter).

This brings up a second point: never assume, always profile and test (this is one of the cases where assembly knowledge helps a lot). That way you spend your time optimizing where it really matters, and you come to understand the inner workings of your code on your target platform better too.
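As a concrete starting point, here's a minimal sketch of how you might check both things yourself (compiler flags and names are examples, not gospel):

// 1) Inspect the assembly: compile with "g++ -O2 -S draw.cpp" (or paste the
//    function into godbolt.org) and look for conditional jumps in the loop body.

// 2) Time the hot loop:
#include <chrono>
#include <cstdio>

template <typename F>
double time_ms(F&& f)
{
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count();
}

// usage:
// printf("draw: %.3f ms\n", time_ms([&]{
//     draw(&thePositions[0], &theSprites[0], &theNoOfEntities[0], noOfEntitiesTotal);
// }));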

If you really are concerned about branch mis-prediction and the penalties incurred, use the resources provided by your target architecture's manufacturer (different architectures behave very differently on a mis-predicted branch), such as this and this from Intel. AMD's CodeAnalyst is a great tool for checking branch mis-prediction and the penalties it may be causing.
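If you're on Linux, hardware counters give you the same information directly via perf (an alternative to the vendor tools above; the binary name here is a placeholder):

perf stat -e branches,branch-misses ./yourgame

That reports raw branch and mis-prediction counts for the whole run, which is usually enough to tell whether a loop like yours is mis-predicting at all.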

OTHER TIPS

Whoa there, buddy! No offence, but it looks like you've read about DOD without fully understanding the how and why of it, and now you're following the guidelines from the articles as if the guidelines themselves were the point. They're not. What matters in DOD is understanding your data, understanding the computer architecture, and understanding how your code can manipulate that data as efficiently as possible given your knowledge of the architecture. The guidelines in DOD articles are only there as reminders of common things to think about.

Want to know when, how, and why to use DOD? Learn about the architecture you're working with. Do you know the cost of a single cache miss? It's really, really low. Do the math. I'm serious, do the math yourself; I could give you some numbers, but then you wouldn't learn much. So find out what you can about the architecture: how a processor works, how memory and caches work, how assembly language works, and what the assembly generated by your compiler looks like. Once you know and understand all of that, DOD is really nothing more than a statement of some almost-obvious guidelines for writing efficient code.
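To make "do the math" concrete, here's the kind of back-of-the-envelope arithmetic meant here (the latency and clock figures are assumed ballpark numbers; measure your own hardware):

// assumed: ~100 ns per main-memory miss, 3 GHz clock, 60 fps frame budget
constexpr double missLatencyNs  = 100.0;
constexpr double cyclesPerMiss  = missLatencyNs * 3.0;            // ~300 cycles
constexpr double frameBudgetNs  = 1e9 / 60.0;                     // ~16.7 ms per frame
constexpr double missesPerFrame = frameBudgetNs / missLatencyNs;  // ~166,000 misses to eat a whole frame

Run numbers like these against your own entity counts and you'll know whether cache behaviour is even worth optimizing in your case.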

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow