I've begun to study assembly and I'm having some difficulty with a sample program.
I wrote a macro that should find the minimum in an array:
%macro min 3
mov ecx, dword[%2]
mov r12, 0
lea rbx, [%1]
movsx eax, word[rbx+r12*4] ; initialize the minimum with the first element of the array
%%minLoop:
cmp eax, [rbx+r12*4]
jl %%notNewMin
movsx eax, word[rbx+r12*4]
%%notNewMin:
inc r12
loop %%minLoop
mov [%3], eax
%endmacro
section .data
EXIT_SUCCESS equ 0
SYS_exit equ 60
list1 dd 4, 5, 2, -3, 1
len1 dd 5
min1 dd 0
section .text
global _start
_start:
min list1, len1, min1
last:
mov rax, SYS_exit ; exit
mov rdi, EXIT_SUCCESS ; success
syscall
This program compiles successfully, but when I debug it (with DDD), in the eax register I get the hex value 0xFFFFFFFD, whose decimal value is 4294967293.
But if I use a calculator, 0xFFFFFFFD is really -3, which is the correct value.
In your opinion, is my program correct?
Thanks in advance for your answers.
It's not correct, though testing it with small values would hide the bug.
There is an inconsistency in the type the elements of the array are treated as. They were defined with dd, and the address calculation is consistent with that (using 4*index). cmp eax, [rbx+r12*4] is also consistent with that. But movsx eax, word[rbx+r12*4] is not: suddenly the upper 16 bits of the element are not used.
This can be fixed very easily by writing mov eax, [rbx+r12*4] instead.
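Not part of the original answer, but as a sketch of what the whole macro might look like with that one change applied (to both places that load an element into eax), keeping everything else from the question as-is:

%macro min 3
mov ecx, dword[%2]   ; ecx = number of elements
mov r12, 0
lea rbx, [%1]
mov eax, [rbx+r12*4] ; initialize the minimum with the first dword element
%%minLoop:
cmp eax, [rbx+r12*4]
jl %%notNewMin       ; current minimum is already smaller, keep it
mov eax, [rbx+r12*4] ; otherwise take the current element as the new minimum
%%notNewMin:
inc r12
loop %%minLoop
mov [%3], eax
%endmacro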
By the way, you should usually not use loop; it's quite slow on most modern processors.
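One possible replacement (my sketch, not part of the answer) is to maintain the counter explicitly with a dec/jnz pair, which modern CPUs handle much better:

; instead of:
;   loop %%minLoop
dec ecx            ; decrement the element counter ourselves
jnz %%minLoop      ; and branch back while it is non-zero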
0xFFFFFFFD is the 32-bit value 1111_1111_1111_1111_1111_1111_1111_1101, which is probably the closest metaphor for what the CPU physically holds (32 cells whose voltage levels or magnetic poles encode the logical values 0 or 1).
Whether you interpret that as -3 or 4294967293 or something completely different (say, 32 independent true/false values) is up to the code that uses the value.
Negative integers usually use the two's complement encoding, which is what you are observing with your -3 value.
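To make that encoding concrete (an illustration of mine, not part of the answer): negating 3 in 32 bits means inverting all the bits and adding one, which is exactly what the neg instruction does:

mov eax, 3         ; eax = 0x00000003
not eax            ; invert all bits -> eax = 0xFFFFFFFC
add eax, 1         ; add one         -> eax = 0xFFFFFFFD, i.e. -3 in two's complement
; starting again from eax = 3, a single "neg eax" would give the same 0xFFFFFFFD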
The debugger doesn't know whether you are interpreting the value as signed or unsigned (unless you specify it with formatting parameters), so it picks one format and displays it that way; in your case as an unsigned 32-bit value, which is why you see 4294967293 instead of -3. Bitwise, those two are identical, and for arithmetic instructions like add/sub/cmp/test/... the value is identical too; only the interpretation of the results (and flags) by the following code decides whether the value was "signed" or "unsigned".
The sign itself is not part of the encoded information. Sometimes the top bit is called the "sign" bit, because all negative values have the top bit set, and that is also why a signed 8-bit value can store only values -128..+127 while an unsigned 8-bit value can store values 0..+255 (both interpretations cover exactly 256 different values, because 8 bits can produce 256 different 0/1 patterns, but the signed interpretation "starts" at 0x80 = -128, while the unsigned interpretation "starts" at 0x00 = 0 and interprets 0x80 as +128). Both interpretations work with only the 8-bit value; there is no additional information attached, like some kind of type.
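A small illustration of this (my own sketch, not part of the answer): the same 8-bit pattern 0x80 extends to very different 32-bit values depending on whether the code treats it as unsigned or signed:

mov al, 0x80       ; 8-bit pattern 1000_0000
movzx ebx, al      ; unsigned interpretation: ebx = 0x00000080 = +128
movsx ecx, al      ; signed interpretation:   ecx = 0xFFFFFF80 = -128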
For example:
cmp eax, ebx ; check if eax is bigger than ebx
; now if the values were meant as unsigned, then use "ja" branch
ja eax_is_bigger_as_unsigned
; but if you meant the values as signed, then you should use "jg" (testing different flags)
jg eax_is_bigger_as_signed
So the cmp itself doesn't care how you interpret that bit pattern; it will set enough flags in the EFLAGS register to make the later conditional branching possible for both cases.
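For a concrete case (values picked by me for illustration): with eax = -1 (bit pattern 0xFFFFFFFF) and ebx = 1, the single cmp sets the flags once, and then ja would be taken while jg would not:

mov eax, -1        ; bit pattern 0xFFFFFFFF
mov ebx, 1
cmp eax, ebx       ; one compare, flags describe both interpretations
ja eax_is_bigger_as_unsigned   ; taken:     0xFFFFFFFF > 1 as unsigned
jg eax_is_bigger_as_signed     ; not taken: -1 < 1 as signed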