Generate code to ensure that the stack does not grow beyond a certain value, either the value of a register or the address of a symbol. If a larger stack is required, a signal is raised at run time. For most targets, the signal is raised before the stack overruns the boundary, so it is possible to catch the signal without taking special precautions.
For instance, if the stack starts at absolute address ‘0x80000000’ and grows downwards, you can use the flags -fstack-limit-symbol=__stack_limit and -Wl,--defsym,__stack_limit=0x7ffe0000 to enforce a stack limit of 128KB. Note that this may only work with the GNU linker.
You can locally override stack limit checking by using the no_stack_limit function attribute (see Function Attributes).
What is the "signal"? Does it require some OS interface? Should the documentation say which signal ID this is, so that I can catch it?
I just ran into the same question myself.
For x86 the signal is "SIGILL" (illegal instruction). This is because GCC maps the "trap_if" RTL primitive to the x86 ud2 instruction, which raises an invalid-opcode exception when executed; on Linux the kernel delivers that exception to the process as SIGILL.
For example, here is some source code:
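#include <alloca.h>
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>

#ifndef N
#define N 100000
#endif

/* Global register variable holding the limit; matches -fstack-limit-register=r12. */
register uintptr_t stack_limit asm ("r12");

void foo(void);

/* main sets the limit itself, so it is exempted from the check. */
__attribute__((no_stack_limit))
int main(int argc, char **argv) {
    int ret = 0;
    /* Crude limit: 1000 ints (about 4000 bytes) below main's local variable. */
    stack_limit = (uintptr_t)(&ret - 1000);
    foo();
    return ret;
}

void foo(void)
{
    char *ch = (char *)alloca(N);  /* dynamic stack allocation guarded by the check */
    printf("Now in foo\n");
}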
Compiled with:
$ gcc -ggdb -O0 -fstack-limit-register=r12 -o stack-lim stack-lim.c -fstack-usage -DN=3922
Experimentally, -DN=3921 runs fine and -DN=3922 triggers the trap (on my machine), which is in the right ballpark considering how crudely the limit is set.
And here is the resulting disassembly in gdb. Notice that the jae jumps over the "ud2" instruction when enough stack remains; otherwise execution falls through into ud2.
...
B+>│0x555555554707 <foo+8> mov %fs:0x28,%rax
│0x555555554710 <foo+17> mov %rax,-0x8(%rbp)
│0x555555554714 <foo+21> xor %eax,%eax
│0x555555554716 <foo+23> mov $0x10,%eax
│0x55555555471b <foo+28> sub $0x1,%rax
│0x55555555471f <foo+32> add $0xf62,%rax
│0x555555554725 <foo+38> mov $0x10,%ecx
│0x55555555472a <foo+43> mov $0x0,%edx
│0x55555555472f <foo+48> div %rcx
│0x555555554732 <foo+51> imul $0x10,%rax,%rax
│0x555555554736 <foo+55> mov %rsp,%rdx
│0x555555554739 <foo+58> sub %r12,%rdx
│0x55555555473c <foo+61> cmp %rax,%rdx
│0x55555555473f <foo+64> jae 0x555555554743 <foo+68>
│0x555555554741 <foo+66> ud2
│0x555555554743 <foo+68> sub %rax,%rsp
│0x555555554746 <foo+71> mov %rsp,%rax
...
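To make the arithmetic in the listing above easier to follow, here is a rough C model of it. This is only an illustrative sketch: round_up_16 and would_trap are hypothetical helper names introduced here, not anything GCC emits or exports, and the model assumes the 64-bit unsigned comparison implied by the jae.

```c
#include <stdbool.h>
#include <stdint.h>

/* Models foo+23 .. foo+51: (16 - 1 + request) / 16 * 16,
 * i.e. round the requested byte count up to a multiple of 16. */
uint64_t round_up_16(uint64_t request)
{
    return (16 - 1 + request) / 16 * 16;
}

/* Models foo+55 .. foo+66: the room left is sp - limit.
 * jae skips the trap when room >= size (unsigned compare),
 * so the trap fires exactly when room < size. */
bool would_trap(uint64_t sp, uint64_t limit, uint64_t size)
{
    return sp - limit < size;
}
```

With the operand from the listing, round_up_16(0xf62) comes to 3952 bytes, so in this model the trap fires whenever fewer than 3952 bytes remain between %rsp and the value in %r12.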
And here is the relevant snippet of the same assembly, annotated with RTL by using the -dP and -fdump-rtl-final options, like so:
$ gcc -ggdb -O0 -fstack-limit-register=r12 -S -o stack-lim.s stack-lim.c -fstack-usage -DN=3922 -dP -fdump-rtl-final
Notice that the (trap_if) becomes ud2:
...
#(insn 10 39 11 2 (parallel [
# (set (reg:DI 1 dx [94])
# (minus:DI (reg:DI 1 dx [94])
# (reg:DI 41 r12)))
# (clobber (reg:CC 17 flags))
# ]) "stack-lim.c":24 274 {*subdi_1}
# (nil))
subq %r12, %rdx # 10 *subdi_1/1 [length = 3]
#(insn 11 10 12 2 (set (reg:CC 17 flags)
# (compare:CC (reg:DI 1 dx [94])
# (reg:DI 0 ax [92]))) "stack-lim.c":24 8 {*cmpdi_1}
# (nil))
cmpq %rax, %rdx # 11 *cmpdi_1/1 [length = 3]
#(jump_insn 12 11 31 2 (set (pc)
# (if_then_else (geu (reg:CC 17 flags)
# (const_int 0 [0]))
# (label_ref 15)
# (pc))) "stack-lim.c":24 627 {*jcc_1}
# (nil)
# -> 15)
jnb .L5 # 12 *jcc_1 [length = 2]
#(insn 13 31 14 3 (trap_if (const_int 1 [0x1])
# (const_int 6 [0x6])) "stack-lim.c":24 1005 {trap}
# (nil))
ud2 # 13 trap [length = 2]
.L5:
#(insn 16 32 17 4 (parallel [
# (set (reg/f:DI 7 sp)
# (minus:DI (reg/f:DI 7 sp)
# (reg:DI 0 ax [92])))
# (clobber (reg:CC 17 flags))
# ]) "stack-lim.c":24 274 {*subdi_1}
# (nil))
subq %rax, %rsp # 16 *subdi_1/1 [length = 3]
#(insn 17 16 18 4 (set (reg:DI 0 ax [93])
# (reg/f:DI 7 sp)) "stack-lim.c":24 81 {*movdi_internal}
# (nil))
movq %rsp, %rax # 17 *movdi_internal/4 [length = 3]
...
I suspect it is not well documented because both the signal that gets raised and whether the feature is implemented at all are machine-specific. I do agree, though, that the behavior for each target ought to be documented.
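As for catching the signal on x86, one possible approach is a SIGILL handler that jumps back to a recovery point, since simply returning from the handler would re-execute the same ud2. The following is a minimal sketch, assuming a POSIX system (sigaction, sigsetjmp). The names run_guarded, simulated_trap and harmless are hypothetical, and raise(SIGILL) stands in for the ud2 so that the sketch runs without any -fstack-limit flags. Whether it is safe to continue after such a jump depends on the state of your program.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Use no_stack_limit where the compiler knows it, so the handler
 * itself is not instrumented; otherwise expand to nothing. */
#if defined(__has_attribute)
# if __has_attribute(no_stack_limit)
#  define NO_STACK_LIMIT __attribute__((no_stack_limit))
# endif
#endif
#ifndef NO_STACK_LIMIT
# define NO_STACK_LIMIT
#endif

static sigjmp_buf recover_point;

/* Handler: do not return (that would re-run the trapping
 * instruction); jump back to the recovery point instead. */
NO_STACK_LIMIT
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(recover_point, 1);
}

/* Hypothetical helper: runs fn and returns 1 if SIGILL was raised,
 * 0 if fn returned normally, -1 if the handler could not be installed. */
int run_guarded(void (*fn)(void))
{
    struct sigaction sa, old;
    int trapped;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    /* savemask = 1 so the signal mask is restored after the jump. */
    if (sigsetjmp(recover_point, 1) == 0) {
        fn();
        trapped = 0;
    } else {
        trapped = 1;
    }

    sigaction(SIGILL, &old, NULL);  /* put the previous handler back */
    return trapped;
}

/* Stand-in for a function whose stack check fails (assumption:
 * raise(SIGILL) is used here in place of the real ud2). */
void simulated_trap(void) { raise(SIGILL); }

/* Stand-in for a function that stays within the limit. */
void harmless(void) { }
```

In a real build with -fstack-limit-register or -fstack-limit-symbol, you would pass the function that might exceed the limit (foo in the example above) to run_guarded instead of the stand-ins.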