How to ensure 16byte code alignment of Delphi routines?

https://stackoverflow.com/questions/1852218

13-09-2019
|

Question

Background:

I have a unit of optimised Delphi/BASM routines, mostly for heavy computations. Some of these routines contain inner loops for which I can achieve a significant speed-up if the loop start is aligned to a DQWORD (16-byte) boundary. I can ensure that the loops in question are aligned as desired IF I know the alignment at the routine entry point.

As far as I can see, the Delphi compiler aligns procedures/functions to DWORD boundaries, and e.g. adding functions to the unit may change the alignment of subsequent ones. However, as long as I pad the end of routines to multiples of 16, I can ensure that subsequent routines are likewise aligned -- or misaligned, depending on the alignment of the first routine. I therefore tried to place the critical routines at the beginning of the unit's implementation section, and put a bit of padding code before them so that the first procedure would be DQWORD aligned.

This looks something like below:

interface

procedure FirstProcInUnit;

implementation

procedure __PadFirstProcTo16;
asm
    // variable number of NOP instructions here to get the desired code length
end;

procedure FirstProcInUnit;
asm //should start at DQWORD boundary
    //do something
    //padding to align the following label to DQWORD boundary
    @Some16BAlignedLabel:
        //code, looping back to @Some16BAlignedLabel
    //do something else
    ret #params
    //padding to get code length to multiple of 16
end;

initialization

__PadFirstProcTo16; //call this here so that it isn't optimised out
ASSERT ((NativeUInt(Pointer(@FirstProcInUnit)) AND $0F) = 0, 'FirstProcInUnit not DQWORD aligned');

end.

This is a bit of a pain in the neck, but I can get this sort of thing to work when necessary. The problem is that when I use such a unit in different projects, or make some changes to other units in the same project, this may still break the alignment of __PadFirstProcTo16 itself. Likewise, recompiling the same project with different compiler versions (e.g. D2009 vs. D2010) typically also breaks the alignment. So, the only way of doing this sort of thing I found was by hand as the pretty much last thing to be done when all the rest of the project is in its final form.

Question 1:

Is there any other way to achieve the desired effect of ensuring that (at least some specific) routines are DQWORD-aligned?

Question 2:

Which are the exact factors that affect the compiler's alignment of code and (how) could I use such specific knowledge to overcome the problem outlined here?

Assume that for the sake of this question "don't worry about code alignment/the associated presumably small speed benefits" is not a permissible answer.

Solution

As of Delphi XE, the problem of code alignment is now easily solved using the $CODEALIGN compiler directive (see this Delphi documentation page):

{$CODEALIGN 16}
procedure MyAlignedProc;
begin
..
end;

OTHER TIPS

One thing that you could do, is to add a 'magic' signature at the end of each routine, after an explicit ret instruction:

asm
  ...
  ret
  db <magic signature bytes>
end;

Now you could create an array containing pointers to each routine, scan the routines at run-time once for the magic signature to find the end of each routine and therefore its length. Then, you can copy them to a new block of memory that you allocate with VirtualAlloc using PAGE_EXECUTE_READWRITE, ensuring this time that each routine starts on a 16-byte boundary.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow