I have the following code snippet:
#include <inttypes.h>
#include <stdio.h>
uint64_t
esp_func(void)
{
__asm__("movl %esp, %eax");
}
int
main()
{
uint32_t esp = 0;
__asm__("\t movl %%esp,%0" : "=r"(esp));
printf("esp: 0x%08x\n", esp);
printf("esp: 0x%08lx\n", esp_func());
return 0;
}
Which prints the following upon multiple executions:
❯ clang -g esp.c && ./a.out
esp: 0xbd3b7670
esp: 0x7f8c1c2c5140
❯ clang -g esp.c && ./a.out
esp: 0x403c9040
esp: 0x7f9ee8bd8140
❯ clang -g esp.c && ./a.out
esp: 0xb59b70f0
esp: 0x7fe301f8c140
❯ clang -g esp.c && ./a.out
esp: 0x6efa4110
esp: 0x7fd95941f140
❯ clang -g esp.c && ./a.out
esp: 0x144e72b0
esp: 0x7f246d4ef140
esp_func
shows that ASLR is active with 28 bits of entropy, which makes sense on my modern Linux kernel.
What doesn't make sense is the first value: why is it drastically different?
I took a look at the assembly and it looks weird...
// From main
0x00001150 55 push rbp
0x00001151 4889e5 mov rbp, rsp
0x00001154 4883ec10 sub rsp, 0x10
0x00001158 c745fc000000. mov dword [rbp-0x4], 0
0x0000115f c745f8000000. mov dword [rbp-0x8], 0
0x00001166 89e0 mov eax, esp ; Move esp to eax
0x00001168 8945f8 mov dword [rbp-0x8], eax ; Assign eax to my variable `esp`
0x0000116b 8b75f8 mov esi, dword [rbp-0x8]
0x0000116e 488d3d8f0e00. lea rdi, [0x00002004]
0x00001175 b000 mov al, 0
0x00001177 e8b4feffff call sym.imp.printf ; For whatever reason, the value in [rbp-0x8]
; is assigned here. Why?
// From esp_func
0x00001140 55 push rbp
0x00001141 4889e5 mov rbp, rsp
0x00001144 89e0 mov eax, esp ; Move esp to eax (same instruction as above)
0x00001146 488b45f8 mov rax, qword [rbp-0x8] ; This changes everything. What is this?
0x0000114a 5d pop rbp
0x0000114b c3 ret
0x0000114c 0f1f4000 nop dword [rax]
So my question is, what is in [rbp-0x8]
, how did it get there, and why are the two values different?
No, stack ASLR happens once at program startup. Relative adjustments to RSP between functions are fixed at compile time, and are just the small constants to make space for a function's local vars. (C99 variable-length arrays and alloca
do runtime-variable adjustments to RSP, but not random.)
Your program contains Undefined Behaviour and isn't actually printing RSP; instead some stack address left in a register by the previous printf
call (which appears to be a stack address, so its high bits do vary with ASLR). It tells you nothing about stack-pointer differences between functions, just how not to use GNU C inline asm.
The first value is printing the current ESP correctly, but that's only the low 32 bits of the 64-bit RSP.
Falling off the end of a non-void
function is not safe, and using the return value is Undefined Behaviour. Any caller that uses the return value of esp_func()
necessarily would trigger UB, so the compiler is free to leave whatever it wants in RAX.
If you want to write mov %rsp, %rax
/ ret
, then write that function in pure asm, or mov to an "=r"(tmp)
local variable. Using GNU C inline asm to modify RAX without telling the compiler about it doesn't change anything; the compiler still sees this as a function with no return value.
MSVC inline asm is different: it is apparently supported to use _asm{ mov eax, 123 }
or something and then fall off the end of a non-void function, and MSVC will respect that as the function return value even when inlining. GNU C inline asm doesn't need silly hacks like that: if you want your asm to interact with C values, use Extended asm with an output constraint like you're doing in main
. Remember that GNU C inline asm is not parsed by the compiler, just emit the template string as part of the compiler's asm output to be assembled.
I don't know exactly why clang is reloading a return value from the stack, but that's just an artifact of clang internals and how it does code-gen with optimization disabled. But it's allowed to do this because of the undefined behaviour. It is a non-void function, so it needs to have a return value. The simplest thing would be to just emit a ret
, and is what some compilers happen to do with optimization enabled, but even that doesn't fix the problem because of inter-procedural optimization.
It's actually Undefined Behaviour in C to use the return value of a function that didn't return one. This applies at the C level; using inline asm that modifies a register without telling the compiler about it doesn't change anything as far as the compiler is concerned. Therefore your program as a whole contains UB, because it passes the result to printf
. That's why the compiler is allowed to compile this way: your code was already broken. In practice it's just returning some garbage from stack memory.
TL:DR: this is not a valid way to emit mov %rsp, %rax
/ ret
as the asm definition for a function.
(C++ strengthens this to it being UB to fall off the end in the first place, but in C it's legal as long as the caller doesn't use the return value. If you compile the same source as C++ with optimization, g++ doesn't even emit a ret
instruction after your inline asm template. Probably this is to support C's default-int
return type if you declare a function without a return type.)
This UB is also why your modified version from comments (with the printf format strings fixed), compiled with optimization enabled (https://godbolt.org/z/sE7e84) prints "surprisingly" different "RSP" values: the 2nd one isn't using RSP at all.
#include <inttypes.h>
#include <stdio.h>
uint64_t __attribute__((noinline)) rsp_func(void)
{
__asm__("movq %rsp, %rax");
} // UB if return value used
int main()
{
uint64_t rsp = 0;
__asm__("\t movq %%rsp,%0" : "=r"(rsp));
printf("rsp: 0x%08lx\n", rsp);
printf("rsp: 0x%08lx\n", rsp_func()); // UB here
return 0;
}
Output example:
Compiler stderr
<source>:7:1: warning: non-void function does not return a value [-Wreturn-type]
}
^
1 warning generated.
Program returned: 0
Program stdout
rsp: 0x7fff5c472f30
rsp: 0x7f4b811b7170
clang -O3 asm
output shows that the compiler-visible UB was a problem. Even though you used noinline
, the compiler can still see the function body and try to do inter-procedural optimization. In this case, the UB led it to just give up and not emit a mov %rsp, %rsi
between call rsp_func
and call printf
, so it's printing whatever value the previous printf happened to leave in RSI
# from the Godbolt link
rsp_func: # @rsp_func
mov rax, rsp
ret
main: # @main
push rax
mov rsi, rsp
mov edi, offset .L.str
xor eax, eax
call printf
call rsp_func # return value ignored because of UB.
mov edi, offset .L.str
xor eax, eax
call printf # printf("0x%08lx\n", garbage in RSI left from last printf)
xor eax, eax
pop rcx
ret
.L.str:
.asciz "rsp: 0x%08lx\n"
GNU C Basic asm (without constraints) is not useful for anything (except the body of a __attribute__((naked))
function).
Don't assume the compiler will do what you expect when there is UB visible to it at compile time. (When UB isn't visible at compile time, the compiler has to make code that would work for some callers or callees, and you get the asm you expected. But compile-time-visible UB means all bets are off.)
User contributions licensed under CC BY-SA 3.0