Why does movaps cause a segmentation fault?


Introduction

I am trying to familiarize myself with the AES instructions so that I can later use libraries built on them with a better understanding. I don't program in assembly regularly: I have some confidence with the language, but I don't consider myself an expert. Using Intel's documentation, I wrote a listing of about 150 lines of assembly to try these instructions out, but I have not made much progress.

The program crashes with a segmentation fault in main when it executes a movaps instruction. I've tried debugging with both gdb and valgrind, and as far as I can tell everything should work, yet it doesn't. Here are the lines that cause the problem.

Code

main:
start_f

    printstr

    movaps (string), %xmm15
==> movaps (key), %xmm0
    call   aes_encript

    movaps %xmm15, string
    printstr

end_f

start_f and end_f are just two macros that begin and end the function. Here is the code for the .data section as well, to show why I expected no problem:

    .data
string:
    .string "string"
    .fill   (128 - (.-string)), 1, 0

newline:
    .byte   0x0a

key:
    .fill   128, 1, 0

    .text
    .global _start

Debugging info

As for the error, I couldn't get any useful information either from static disassembly or from gdb. Valgrind was no help either, which is to be expected since I don't touch the heap at all. Here is a partial disassembly of main from gdb:

   0x0000000000401022 <+0>:     push   %rbp
   0x0000000000401023 <+1>:     mov    %rsp,%rbp
   0x0000000000401026 <+4>:     mov    $0x402000,%rsi
   0x000000000040102d <+11>:    call   0x401156 <write_long>
   0x0000000000401032 <+16>:    mov    $0x1,%rax
   0x0000000000401039 <+23>:    mov    $0x1,%rbp
   0x0000000000401040 <+30>:    mov    $0x402080,%rsi
   0x0000000000401047 <+37>:    mov    $0x1,%rdx
   0x000000000040104e <+44>:    syscall 
   0x0000000000401050 <+46>:    movaps 0x402000,%xmm15
=> 0x0000000000401059 <+55>:    movaps 0x402081,%xmm0
   0x0000000000401061 <+63>:    call   0x4010b6 <aes_encript>
   0x0000000000401066 <+68>:    movaps %xmm15,0x402000
   0x000000000040106f <+77>:    mov    $0x402000,%rsi
   0x0000000000401076 <+84>:    call   0x401156 <write_long>

And here is the content at address 0x402081 (which is perfectly accessible):

(gdb) x/32x 0x402081
0x402081:       0x00000000      0x00000000      0x00000000      0x00000000
0x402091:       0x00000000      0x00000000      0x00000000      0x00000000
0x4020a1:       0x00000000      0x00000000      0x00000000      0x00000000
0x4020b1:       0x00000000      0x00000000      0x00000000      0x00000000
0x4020c1:       0x00000000      0x00000000      0x00000000      0x00000000
0x4020d1:       0x00000000      0x00000000      0x00000000      0x00000000
0x4020e1:       0x00000000      0x00000000      0x00000000      0x00000000
0x4020f1:       0x00000000      0x00000000      0x00000000      0x00000000

Request

I can't rule out that the error is something trivial: I haven't used as for a while. In any case, I would be grateful for any tip.

In case you want to try this code by yourself here is a pastebin with the whole listing: https://paste.debian.net/1194986/

assembly
segmentation-fault
sse
memory-alignment
att
asked on Stack Overflow Apr 24, 2021 by Giovanni Zaccaria • edited Apr 25, 2021 by Peter Cordes

1 Answer


The address of key, 0x402081, is not aligned to 16 bytes.

From the Intel® 64 and IA-32 Architectures Software Developer's Manual, MOVAPS entry:

MOVAPS—Move Aligned Packed Single-Precision Floating-Point Values
...
When the source or destination operand is a memory operand, the operand must be aligned on a 16-byte (128-bit version), 32-byte (VEX.256 encoded version) or 64-byte (EVEX.512 encoded version) boundary or a general-protection exception (#GP) will be generated.

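On Linux that #GP is delivered to the process as SIGSEGV, which is why it shows up as a segmentation fault even though the memory is readable. You could use movups, which has no alignment requirement, but it's usually better to align your constants.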
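A minimal sketch of the movups route, assuming x86-64 Linux and GNU as. The pad byte and the 16-byte size of key are illustrative assumptions, chosen only to force a misaligned address:

```asm
    .data
    .balign 16
pad:
    .byte   0                   # hypothetical filler: pushes key to an address ending in 1
key:
    .fill   16, 1, 0            # 16 bytes, deliberately misaligned

    .text
    .global _start
_start:
    movups  key(%rip), %xmm0    # unaligned load: valid at any address
    mov     $60, %eax           # SYS_exit on x86-64 Linux
    xor     %edi, %edi          # exit status 0
    syscall
```

Replacing movups with movaps in this sketch should reproduce the fault from the question.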

The preceding instruction, movaps (string), %xmm15, loads from 0x402000, which is a multiple of 16, so it doesn't fault.
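An address is 16-byte aligned exactly when its low four bits are zero: 0x402000 & 0xf is 0, while 0x402081 & 0xf is 1. A minimal sketch that reports this at run time, assuming x86-64 Linux and GNU as, and reusing the question's layout (a 128-byte string, a one-byte newline, then key):

```asm
    .data
    .balign 16
string:
    .fill   128, 1, 0           # starts on a 16-byte boundary
newline:
    .byte   0x0a                # one byte, so key starts at offset 129
key:
    .fill   128, 1, 0

    .text
    .global _start
_start:
    lea     key(%rip), %rax     # address of key
    and     $15, %eax           # keep the low four bits (address mod 16)
    mov     %eax, %edi          # exit status = bytes past the last 16-byte boundary
    mov     $60, %eax           # SYS_exit on x86-64 Linux
    syscall
```

Assembled with as and linked with ld, the program should exit with status 1 (visible with echo $?). Adding .balign 16 before key should bring it to 0.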

key could be defined as follows to be aligned to 16 bytes:

    .balign 16
key:
    .fill   128, 1, 0

Also note that .fill 128, 1, 0 emits 128 bytes of zeros, not 128 bits. And since it's all zero, you could have put it in .bss instead of .data.

(Put newline: .byte '\n' after this, so you don't waste 15 bytes on alignment. Or better, put newline in .rodata, or have write_long include a newline in the output it writes.)
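Putting these suggestions together, one possible layout is sketched below. It assumes x86-64 Linux and GNU as, and it assumes a single 128-bit (16-byte) key was intended. The .balign before string is an added precaution that is not in the question.

```asm
    .section .rodata
newline:
    .byte   0x0a                # read-only constant, kept out of the aligned data

    .data
    .balign 16                  # assumption: align string explicitly as well
string:
    .string "string"
    .fill   (128 - (. - string)), 1, 0

    .section .bss
    .balign 16
key:
    .skip   16                  # 16 zero bytes = 128 bits, takes no space in the file

    .text
    .global _start
_start:
    movaps  string(%rip), %xmm15    # aligned load from .data
    movaps  key(%rip), %xmm0        # aligned load from .bss
    mov     $60, %eax               # SYS_exit on x86-64 Linux
    xor     %edi, %edi              # exit status 0
    syscall
```

With both labels on 16-byte boundaries, both movaps loads should execute without a fault.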

answered on Stack Overflow Apr 24, 2021 by Renat • edited Apr 25, 2021 by Peter Cordes

User contributions licensed under CC BY-SA 3.0