I can't translate this chunk of GAS code to Intel/NASM syntax

Question

I'm still slightly confused by the uses of ## in G. I found a section of the GNU cpp manual which mentions ## after a comma, but it's meant for use in variadic macros, and this isn't one of those.

But I'm going ahead with an explanation anyway, based on the assumption that those ## are not doing anything.

The ## in lookup_32bit, on the other hand, are perfectly normal and necessary.

Let's go up a level from the G macro and see how it's called. One of its calls looks like this:

G(RGI1, RGI2, x1, s0, s1, s2, s3)

Its first argument, RGI1, becomes gi1 in the expansion. The first piece of the G macro:

lookup_32bit(t0, t1, t2, t3, ##gi1, RGS1, shr_next, ##gi1)

expands lookup_32bit with ##gi1 as the 5th and 8th arguments. I'm assuming ##gi1 works the same as gi1, so the 5th and 8th arguments will be RGI1.

Inside the lookup_32bit macro, the 5th and 8th arguments are called src and il_reg, so both of those will expand to RGI1 in this instance. The first instruction in lookup_32bit:

movzbl      src ## bl,        RID1d;

pastes the src argument (RGI1) together with bl (which is not a macro or a macro argument, so it just represents itself), resulting in the pasted token RGI1bl. The instruction now looks like this:

movzbl      RGI1bl,        RID1d;

After the first pass of expanding lookup_32bit is done, the preprocessor will look again for macros to expand, and RGI1bl is a macro defined like this:

#define RGI1bl %dl

Also, RID1d is a macro defined like this:

#define RID1d %ebp

so the instruction ends up being:

movzbl      %dl,        %ebp;

and that's just a zero-extending move from 8-bit register %dl to 32-bit register %ebp.
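The expansion traced above can be reproduced outside the assembler with a small C sketch. Only the RGI1bl and RID1d definitions come from the code under discussion; FIRST_INSN, STR, STR_, expanded_insn and self_check are hypothetical names made up for this illustration. The sketch uses stringification to capture what the preprocessor leaves behind after pasting and rescanning.

```c
#include <assert.h>
#include <string.h>

/* Register aliases quoted in the answer. */
#define RGI1bl %dl
#define RID1d  %ebp

/* Hypothetical stand-in for the first line of lookup_32bit:
   pastes bl onto the src argument, then the result is rescanned. */
#define FIRST_INSN(src) movzbl src ## bl, RID1d;

/* Two-level stringification: STR expands its argument fully,
   STR_ then turns the resulting tokens into a string literal.
   Variadic because the expanded instruction contains a comma. */
#define STR_(...) #__VA_ARGS__
#define STR(...)  STR_(__VA_ARGS__)

/* Returns the text left after the preprocessor has finished. */
const char *expanded_insn(void) {
    return STR(FIRST_INSN(RGI1));
}

/* Checks that the aliases were resolved to real register names. */
void self_check(void) {
    assert(strstr(expanded_insn(), "movzbl") != NULL);
    assert(strstr(expanded_insn(), "%dl") != NULL);
    assert(strstr(expanded_insn(), "%ebp") != NULL);
    assert(strstr(expanded_insn(), "RGI1") == NULL);
}
```

Passing the result of expanded_insn() to puts from a main function should show the same instruction as in the walkthrough, with the exact spacing depending on the preprocessor.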

Looking at the other macros, you can see that there are a bunch of them starting with RGI1, all of which resolve to %rdx or portions of it. With these macros in place, selecting the low 8-bit portion of a 64-bit register can be done by pasting bl onto the end with ##, which wouldn't be possible using the native register names directly (there's no preprocessor operation as sophisticated as "remove the r from the front of this token and change the final x to an l").
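As a rough sketch of how such a family of aliases lets a suffix select a sub-register, here is a C version. Only RGI1bl is quoted in this answer; the RGI1 and RGI1bh definitions are assumptions based on the statement that the family resolves to %rdx or portions of it, and PART, STR, STR_ and the three functions are hypothetical names for illustration only.

```c
#include <assert.h>
#include <string.h>

/* One register family. RGI1bl is quoted in the answer;
   RGI1 and RGI1bh are assumed here for illustration. */
#define RGI1   %rdx
#define RGI1bl %dl
#define RGI1bh %dh

/* Hypothetical selector: pastes a suffix onto an alias name.
   Operands of ## are pasted before they are macro-expanded,
   so PART(RGI1, bl) forms RGI1bl rather than touching %rdx. */
#define PART(reg, suffix) reg ## suffix

/* Two-level stringification to inspect the final expansion. */
#define STR_(...) #__VA_ARGS__
#define STR(...)  STR_(__VA_ARGS__)

const char *whole_reg(void) { return STR(RGI1); }
const char *low_byte(void)  { return STR(PART(RGI1, bl)); }
const char *high_byte(void) { return STR(PART(RGI1, bh)); }

/* Checks that each suffix picks the intended portion of the register. */
void check_family(void) {
    assert(strcmp(whole_reg(), "%rdx") == 0);
    assert(strcmp(low_byte(), "%dl") == 0);
    assert(strcmp(high_byte(), "%dh") == 0);
}
```

The design point is that the alias name, not the native register name, is what gets pasted, so each sub-register needs its own #define.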

The specific names RGI1, RID1, etc. don't look familiar to me. My guess is that they are derived from the Twofish specification.

Token-pasting reference: http://gcc.gnu.org/onlinedocs/cpp/Concatenation.html#Concatenation