Get argv[2] address in assembler x64


On entry to main, edi is argc and rsi is argv:

   0x0000000000400535 <+8>: mov    %edi,-0x4(%rbp)
   0x0000000000400538 <+11>:    mov    %rsi,-0x10(%rbp)

Here I read the argv pointer:

(gdb) x/8x $rbp-0x10
0x7ffdb7cac380: 0xb7cac478  0x00007ffd  0x00000000  0x00000003
0x7ffdb7cac390: 0x00000000  0x00000000  0x1f130b45  0x00007ff3

The pointer is 0x7ffdb7cac478.

So my argv[2] is here:

(gdb) x/8x 0x7ffdb7cac478+16
0x7ffdb7cac488: 0xb7cacd8a  0x00007ffd  0x00000000  0x00000000

It points to address 0x7ffdb7cacd8a.

I need to get the address of argv[2], so I want to write this assembler code:

Pseudocode:

x - load 8 bytes from address $rbp-0x10 // (pointer to argv)

y - load 8 bytes from x value+16 // (pointer to argv[2])

Later I need to jmp to y.

How do I write this in x64 assembly? Which registers can I use for x and y?

I hope it is understandable. I am a beginner.

I am asking here because I don't know where to start my research.

UPDATE:

I tried this:

bits 64
ldr r8, rbp, #0x10
ldr r9, r8, #0x10
jmp r9

But it doesn't even assemble. I am using NASM.

I guess the above was for the ARM architecture; for amd64 (x64) the code below should do this. Is it correct?

UPDATE 2:

bits 64
lea r8, [rbp-0x10]
lea r9, [r8+0x10]
jmp r9

UPDATE 3:

This also doesn't work:

bits 64
lea r8, [rbp-0x10]
mov r9, [r8]
mov r10, [r9+0x10]
jmp r10
assembly
asked on Stack Overflow Mar 8, 2016 by dev • edited Mar 8, 2016 by dev

1 Answer


Are you writing main() or _start?

If you're writing main, it's a normal function that receives its args in registers under the normal calling convention: argc in edi and argv in rsi. See the tag wiki for links to the x86-64 ABI.

If you're writing _start, then the data is on the stack, as documented in the process startup section of the ABI: [rsp] = argc, and above that an array of pointers, char *argv[], starting at rsp+8. It's an actual array right there on the stack, not a pointer to an array like main gets.
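
To make the difference between the two entry points concrete, here is a rough C model of both layouts. The function names `arg_from_main` and `arg_from_start_stack`, and the idea of passing the entry stack as a `uintptr_t` array, are illustrative assumptions and not something the ABI defines. The sketch only assumes that argc and each pointer occupy one pointer-sized slot, as on x86-64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* main() case: argv is a pointer to the array of pointers, so reaching
   argv[i] takes one load from argv + i (byte offset 8*i on x86-64). */
static const char *arg_from_main(int argc, char **argv, int i)
{
    assert(i >= 0 && i < argc);
    return argv[i];
}

/* _start case: `sp` models the value of rsp at process entry.
   sp[0] holds argc, and the argv array itself starts at sp[1],
   so argv[i] lives at byte offset 8 + 8*i from rsp. */
static const char *arg_from_start_stack(const uintptr_t *sp, int i)
{
    int argc = (int)sp[0];
    assert(i >= 0 && i < argc);
    return (const char *)sp[1 + i];
}
```

With a hand-built image holding 3, then the addresses of three strings, then 0, `arg_from_start_stack(image, 2)` returns the same pointer that `argv[2]` would yield inside main. The only difference is that main needs the extra step of first obtaining the array's address from rsi.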

rbp is meaningless unless you initialize it. It has whatever the caller left in it.


Your code fragment is silly, too: you never initialize rbp. You should assume it holds garbage on process entry. Only rsp is guaranteed to be useful.

lea is just a shift & add instruction that uses effective-address syntax / encoding; it computes an address without accessing memory. mov is the mnemonic for load / store.
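
The lea / mov distinction can be sketched in C as follows. The helper names `lea_like` and `mov_load_like` are made up for illustration, and the sketch assumes a flat address space where a pointer round-trips through `uintptr_t`, as on x86-64 Linux.

```c
#include <stdint.h>
#include <string.h>

/* Like `lea r9, [r8+0x10]`: pure arithmetic on the address,
   memory is never touched. */
static uintptr_t lea_like(uintptr_t base, intptr_t disp)
{
    return base + (uintptr_t)disp;
}

/* Like `mov r9, [r8+0x10]`: compute the same address,
   then load 8 bytes from it. */
static uint64_t mov_load_like(uintptr_t base, intptr_t disp)
{
    uint64_t value;
    memcpy(&value, (const void *)(base + (uintptr_t)disp), sizeof value);
    return value;
}
```

Given an array of 8-byte slots, `lea_like(base, 16)` yields the address of the third slot, while `mov_load_like(base, 16)` yields what is stored there. Getting from a saved argv pointer to argv[2] needs two loads of the second kind, which is why a pair of lea instructions cannot do it.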

    ;; your code with comments, also assuming that RBP was initialized
    bits 64
    lea r8, [rbp-0x10]      ; r8 = rbp-0x10
    mov r9, [r8]            ; should have just done mov r9, [rbp-0x10]
    mov r10, [r9+0x10]
    jmp r10                 ; jump to argv[2]???

Did you put machine code bytes in argv[2]? Jumping to a string is not normally useful.

Of course, since rbp isn't initialized, it's not actually accessing argv[2].


Working example

Single-step this in a debugger if you want to see what's going on.

; get argc and argv from the stack, for x86-64 SysV ABI
global _start
_start:
    mov   ecx,  [rsp]             ;   load argc (assuming it's smaller than 2^32)

    cmp   ecx, 3
    jb  .argc_below_3
                                  ;   argv[0] is at rsp+8
    mov   rsi,  [rsp+8 +  8*2]    ;   argv[2]  (the 3rd element)
    movzx eax,  byte [rsi]        ;   first char of argv[2]

    ; if you stop here in a debugger, you can see the character from the second arg.

    ; fall through and exit
.argc_below_3:
    xor edi, edi
    mov eax, 231                  ;  exit_group(0)
    syscall
answered on Stack Overflow Mar 9, 2016 by Peter Cordes • edited Aug 25, 2018 by Peter Cordes

User contributions licensed under CC BY-SA 3.0