I've been trying to teach myself how to accomplish certain tasks in assembly.
Right now, I am working on trying to detect palindromes. I know I could use a stack, or possibly compare strings using Irvine's library, but I'm trying to do it via registers.
The problem is, when it comes to using registers, I'm more than a bit confused.
The following compiles, but when I get to the CMP line, the program breaks and gives me this message:
Unhandled exception at 0x004033FC in Project.exe: 0xC0000005: Access violation reading location 0x0000000F.
I'm assuming it has something to do with how I set the registers, but even using the registers while debugging isn't helping me much.
Any help would be appreciated.
INCLUDE Irvine32.inc

.data
enteredWord BYTE "Please enter the string to check: ", 0
presetWord BYTE "Step on no pets", 0
isAPalindrome BYTE "The word is a palindrome. ", 0
isNotAPalindrome BYTE "The word is not a palindrome. ", 0

.code
main proc
    mov ecx, SIZEOF presetWord - 1
    mov esi,OFFSET presetWord

checkWord:
    MOV eax,[esi]
    CMP [ecx],eax
    JNE NOTPALIN
    inc esi
    dec ecx
    loop checkWord

    mov edx, offset isAPalindrome
    call WriteString
    jmp _exit
main endp

NOTPALIN PROC
    mov edx, offset isNotAPalindrome
    call WriteString
    ret
NOTPALIN endp

_exit:
    exit
end main
A CPU register is a small piece of memory located directly inside the CPU core. "Piece of memory" means some number of bits (0/1); on a 64-bit x86 CPU the general-purpose registers are 64 bits wide, with names rax, rcx, rdx, rbx, ... ecx is the lower 32-bit part of rcx (the upper 32 bits are not accessible under a separate name, only through instructions using rcx). The lower 16 bits are accessible as cx, which is composed of two 8-bit parts: ch (upper) and cl (lower).
So when you use ecx, you can set 32 bits to either 0 or 1. These can be interpreted as an unsigned number from 0 to 2^32-1 (in hex 0 .. 0xFFFFFFFF), or as a signed number from -2^31 to +2^31-1 (0x80000000 .. 0x7FFFFFFF). Or you can interpret the meaning of those bits in any way you wish, and write code for that interpretation.
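The register parts and the signed/unsigned reading described above can be sketched in C, where the same ideas can be compiled and checked on any machine. The helper names (low16, high8, low8, as_signed) are hypothetical and exist only for this illustration; they model how cx, ch and cl relate to ecx, not how the CPU implements them.

```c
#include <stdint.h>
#include <string.h>

/* Models CX: the low 16 bits of a 32-bit register value. */
uint16_t low16(uint32_t ecx)
{
    return (uint16_t)(ecx & 0xFFFFu);
}

/* Models CH: bits 8..15 of the register value. */
uint8_t high8(uint32_t ecx)
{
    return (uint8_t)((ecx >> 8) & 0xFFu);
}

/* Models CL: bits 0..7 of the register value. */
uint8_t low8(uint32_t ecx)
{
    return (uint8_t)(ecx & 0xFFu);
}

/* Reads the same 32 bits as a signed two's-complement number.
   memcpy copies the bit pattern unchanged, with no value conversion. */
int32_t as_signed(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

For example, the bit pattern 0xFFFFFFFF is 4294967295 when read as unsigned and -1 when passed through as_signed; the bits themselves are identical.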
Your code can make use of three common ways to interpret the bits in a CPU register:
; EBX as memory address:
mov ebx,OFFSET presetWord       ; some address into memory (32b unsigned number)
; ECX as numeric value ("unsigned long" in C++)
mov ecx,SIZEOF presetWord - 1   ; 15
; AL as ASCII character (extended 8 bit)
mov al,[ebx]                    ; also shows how memory is referenced by address
; AL == 83 == 'S' => value of memory at address "presetWord"
In your example, cmp [ecx],eax references memory at address 15 (the 0x0000000F in the exception message), which fortunately for you is an illegal address, so it crashes. Had you by accident used an address that is legal for your process (but not the one you really meant), the code would have silently continued and produced an unexpected result.
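You probably wanted cmp [esi+ecx],eax, which references memory at address presetWord+15. Note that with ecx = SIZEOF presetWord - 1 this is the terminating zero; the last char of the string is at presetWord+14. And even that address holds only for the first iteration. Then you do inc esi, so esi points at presetWord+1 (second char), and from then on the address esi+ecx depends on how both registers have changed.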
And you probably wanted to compare only characters, so you should change that eax to al to fetch/compare only a single byte at a time, because the string is encoded in ASCII (8 bits per char). eax would work for UTF-32 encoding.
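The difference between fetching one byte and fetching four can be sketched in C. The function names load_byte and load_dword are hypothetical; they only model mov al,[esi] and mov eax,[esi] respectively, under the assumption that at least four bytes are readable at the given address.

```c
#include <stdint.h>
#include <string.h>

/* Models "mov al,[esi]": fetch exactly one byte (one ASCII character). */
uint8_t load_byte(const char *address)
{
    return (uint8_t)address[0];
}

/* Models "mov eax,[esi]": fetch four consecutive bytes as one 32-bit value.
   The caller must guarantee that four bytes are readable at address. */
uint32_t load_dword(const char *address)
{
    uint32_t value;
    memcpy(&value, address, sizeof value);
    return value;
}
```

On a little-endian machine such as x86, load_dword on "Step on no pets" yields 0x70657453, that is the four characters 'S', 't', 'e', 'p' packed together, while load_byte yields only 0x53 ('S'). Comparing such a four-byte value against a single character at the other end of the string cannot work.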
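To check for a palindrome you may want to load one register ("r1") with the address of the first char, another register ("r2") with the address (!) of the last char, and then do this loop:

    while (r1 < r2) {
        if (byte [r1] != byte [r2]) return false
        ++r1
        --r2
    }
    return true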
This will produce "false" for presetWord, as 'S' != 's', so you may want to introduce case insensitivity to the if (byte [r1]... part, but I would first make it work without that.
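As a minimal sketch of the two-address loop, here it is in C, with pointers r1 and r2 standing in for the two registers. The function name is_palindrome and its ignore_case switch are assumptions made for this example; the switch shows one possible way to add the case insensitivity mentioned above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if text reads the same forwards and backwards, else 0.
   ignore_case is a hypothetical switch added for illustration. */
int is_palindrome(const char *text, int ignore_case)
{
    size_t length = strlen(text);
    if (length == 0)
        return 1;                        /* empty string: nothing to compare */

    const char *r1 = text;               /* address of the first char */
    const char *r2 = text + length - 1;  /* address of the last char  */

    while (r1 < r2) {
        unsigned char a = (unsigned char)*r1;   /* one byte, like mov al,[r1] */
        unsigned char b = (unsigned char)*r2;
        if (ignore_case) {
            a = (unsigned char)tolower(a);
            b = (unsigned char)tolower(b);
        }
        if (a != b)
            return 0;                    /* mismatch: not a palindrome */
        ++r1;                            /* front address moves right */
        --r2;                            /* back address moves left   */
    }
    return 1;
}
```

With this sketch, is_palindrome("Step on no pets", 0) returns 0 because of 'S' versus 's', and is_palindrome("Step on no pets", 1) returns 1. In assembly the same shape would use two address registers (for example esi and edi) and a byte-sized compare.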
While debugging, you should be able to recognize the "class" of some of the numbers in registers. If you load a size into a register, it will very likely be some small number, like 0000000F (15). An address will very likely be some large number like 8040506E. ASCII characters used as a single char should give a value of at most 7F in common cases, but if you do mov al,..., the debugger still displays the whole eax, so the upper three bytes keep their previous value. For example, reading a space character into al while eax is set to 12345678 will change the value of eax to 12345620 (' ' == 0x20 in ASCII).
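The effect of writing only al can be modelled with a one-line C function. The name write_al is hypothetical; it simply mirrors what the debugger shows in eax after a mov al,... instruction.

```c
#include <stdint.h>

/* Models "mov al, value": only the lowest byte of EAX is replaced,
   the upper three bytes keep whatever they held before. */
uint32_t write_al(uint32_t eax, uint8_t value)
{
    return (eax & 0xFFFFFF00u) | value;
}
```

Calling write_al(0x12345678, 0x20) gives 0x12345620, which is the value from the example above.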
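You can also use the memory view to check the content of a particular address in memory. If you would for example change that cmp [ecx],eax to cmp [esi+ecx],eax, and check that address in memory view, you would see that in the first iteration it points at the terminating zero just past the string, and in the second iteration at the last char, not the second-last char. The back end of the comparison is always one byte too far to the right.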
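What the memory view would show for a cmp [esi+ecx],eax variant can be traced on paper or with a small simulation. The sketch below is only a model under stated assumptions: ecx starts at 15 (SIZEOF presetWord - 1), the loop body ends with inc esi, dec ecx and loop checkWord as in the question, and the early exit through JNE is ignored so that only the addresses are followed. The function name back_offset is hypothetical.

```c
#include <stdint.h>

/* Offset (relative to presetWord) that [esi+ecx] would address at the
   start of a given 0-based iteration of the loop in the question. */
uint32_t back_offset(uint32_t iteration)
{
    uint32_t esi = 0;    /* offset of ESI from presetWord */
    uint32_t ecx = 15;   /* SIZEOF presetWord - 1 */
    for (uint32_t i = 0; i < iteration; ++i) {
        esi += 1;        /* inc esi */
        ecx -= 1;        /* dec ecx */
        ecx -= 1;        /* loop: decrements ECX again before testing it */
    }
    return esi + ecx;
}
```

The first three iterations give offsets 15, 14 and 13. Offset 15 is the zero terminator of "Step on no pets" and offset 14 is its last character, so the front (offset 0, 1, 2, ...) is always compared against a byte one position too far right. The trace also shows that ecx drops by two per iteration because loop itself decrements it, which is worth keeping in mind when combining dec ecx with loop.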
All of this is visible and can be checked in the debugger. It is sometimes a bit tedious, but often easier than asking on SO or just thinking about the source code, especially if you have been stuck for a long time.
Finally ... why even registers? Because computer memory is a separate chip. It may look innocent, but an instruction like mov al,[presetWord] may actually stall for hundreds of CPU cycles while the CPU waits for the memory chip to read the content of memory and send it over the bus wires to the CPU. ecx, in contrast, is directly inside the CPU, accessible in the same cycle in which the CPU needs it.
So you may want to keep values in registers if you use them often in your calculation, so that memory does not slow you down (although once the memory content is cached by the L0/1/2/3 caches, the "hundreds" of cycles become a reasonable amount, sometimes effectively 0 cycles with the cache level directly on the CPU chip). But you want to access memory in a predictable pattern (so the cache can read ahead), and in reasonable amounts (caches usually work with sizes like 16-32B up to 4-8k, depending on their level). If within a couple of instructions you access something like 16 different 8k memory pages, you may run out of available cache lines, and then at least one access will suffer a full stall, waiting for a real memory read.
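To make "predictable pattern" a little more concrete, here is a small C sketch with two hypothetical functions that add up the same array in different orders. It makes no timing claim: whether and how much the strided version is slower depends on the array size, the stride and the particular CPU, so treat it as something to measure rather than a rule.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums all elements in address order: the predictable pattern that
   lets the cache read ahead. */
uint64_t sum_sequential(const uint32_t *data, size_t count)
{
    uint64_t total = 0;
    for (size_t i = 0; i < count; ++i)
        total += data[i];
    return total;
}

/* Sums the same elements, but jumps stride elements between accesses,
   touching many distant locations before returning to nearby ones.
   stride must be greater than 0. */
uint64_t sum_strided(const uint32_t *data, size_t count, size_t stride)
{
    uint64_t total = 0;
    for (size_t start = 0; start < stride; ++start)
        for (size_t i = start; i < count; i += stride)
            total += data[i];
    return total;
}
```

Both functions return the same sum for the same input; only the order of memory accesses differs. Timing them on a large array with a large stride is one way to observe the cache behaviour described above on your own machine.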