First off, calling conventions: what data is sent to the function and where?
According to the documentation, arrays are passed as 32-bit pointers to the data, and integers are passed as values.
According to the same documentation, multiple calling conventions are supported. Unfortunately, the default one isn't documented - explicitly specifying one would be a good idea.
Based on your description that mov ecx, len
doesn't work, I'm guessing the compiler used the register
convention by default, and the arguments were already placed in ecx
and edx
, then your code went and mixed them up. You can either change your code to work with that convention, or tell the compiler to pass the arguments using the stack - use the stdcall
convention. I arbitrarily picked the second option. Whichever one you pick, make sure to specify the calling convention explicitly.
Next, actual function logic.
- Is there a reason why you're working with 16 bit registers instead of full 32-bit ones?
- Your array contains bytes, but you're reading and comparing words.
lea ebx, dword ptr [edx]
is the same asmov ebx, edx
. You're just introducing another temporary variable.- You're comparing elements as if they were signed.
- Modern compilers tend to implement loops without using
loop
. - The documentation also says that
ebx
needs to be preserved - because the function usesebx
, its original value needs to be saved at the start and restored afterwards.
This is how I rewrote your function (using Lazarus, because I haven't touched Delphi in about 8 years - no compiler within reach):
function arrMaxAsm(arr:TMyArray; len:integer):Word; assembler; stdcall;
asm
push ebx { save ebx }
lea edx, arr { Lazarus accepts a simple "mov edx, arr" }
mov edx, [edx] { but Delphi 7 requires this indirection }
mov ecx, len
xor ax, ax { set default max to 0 }
test ecx, ecx
jle @done { if len is <= 0, nothing to do }
movzx ax, byte ptr [edx] { read a byte, zero-extend it to a word }
{ and set it as current max }
@cont:
dec ecx
jz @done { if no elements left, return current max }
@cycle:
inc edx
movzx bx, byte ptr [edx] { read next element, zero-extend it }
cmp bx, ax { compare against current max as unsigned quantities }
jbe @cont
mov ax, bx
jmp @cont
@done:
pop ebx { restore saved ebx }
mov result, ax
end;
It might be possible to optimize it further by reorganizing the loop jumps - YMMV.
Note: this will only work correctly for byte-sized unsigned values. To adapt it to values of different size/signedness, some changes need to be made:
Data size:
- Read the right amount of bytes:
movzx bx, byte ptr [edx] { byte-sized values }
mov bx, word ptr [edx] { word-sized values }
mov ebx, dword ptr [edx] { dword-sized values }
{ note that the full ebx is needed to store this value... }
Mind that this reading is done in two places. If you're dealing with dwords, you'll also need to change the result from ax
to eax
.
- Advance over the right amount of bytes.
@cycle:
inc edx { for an array of bytes }
add edx, 2 { for an array of words }
add edx, 4 { for an array of dwords }
Dealing with signed values:
The value extension, if it's applied, needs to be changed from
movzx
tomovsx
.The conditional jump before setting new maximum needs to be adjusted:
cmp bx, ax { compare against current max as unsigned quantities }
jbe @cont
cmp bx, ax { compare against current max as signed quantities }
jle @cont