Overwrite EIP in the main function

I'm curious about how overwriting the stack in the main function differs from overwriting it in other functions.

Take this example:

#include <stdio.h>

int main(int argc, char *argv[])
{
    char buf[8]; 
    gets(buf); 
}

In this code, the buffer to be overflowed is declared in the main function, and I get this output from gdb after entering a lot of 'A's:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Program received signal SIGSEGV, Segmentation fault.
0x5655620c in main (argc=<error reading variable: Cannot access memory at address 0x41414141>,
    argv=<error reading variable: Cannot access memory at address 0x41414145>) at source.c:7
7       }
(gdb) info registers eip
eip            0x5655620c          0x5655620c <main+63>

Disassembly for main:

   0x000011cd <+0>:     endbr32
   0x000011d1 <+4>:     lea    ecx,[esp+0x4]
   0x000011d5 <+8>:     and    esp,0xfffffff0
   0x000011d8 <+11>:    push   DWORD PTR [ecx-0x4]
   0x000011db <+14>:    push   ebp
   0x000011dc <+15>:    mov    ebp,esp
   0x000011de <+17>:    push   ebx
   0x000011df <+18>:    push   ecx
   0x000011e0 <+19>:    sub    esp,0x10
   0x000011e3 <+22>:    call   0x120d <__x86.get_pc_thunk.ax>
   0x000011e8 <+27>:    add    eax,0x2df0
   0x000011ed <+32>:    sub    esp,0xc
   0x000011f0 <+35>:    lea    edx,[ebp-0x10]
   0x000011f3 <+38>:    push   edx
   0x000011f4 <+39>:    mov    ebx,eax
   0x000011f6 <+41>:    call   0x1070 <gets@plt>
   0x000011fb <+46>:    add    esp,0x10
   0x000011fe <+49>:    mov    eax,0x0
   0x00001203 <+54>:    lea    esp,[ebp-0x8]
   0x00001206 <+57>:    pop    ecx
   0x00001207 <+58>:    pop    ebx
   0x00001208 <+59>:    pop    ebp
   0x00001209 <+60>:    lea    esp,[ecx-0x4]
   0x0000120c <+63>:    ret

Here, the EIP register was not overwritten, and gdb apparently cannot access memory at an overwritten address.

Compare that with this example, where the buffer is declared and written in another function:

#include <stdio.h>

void over() {
    char buf[8]; 
    gets(buf); 
}

int main(int argc, char *argv[])
{
    over();
}

After entering a lot of 'A's again, gdb reports:

AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) info registers eip
eip            0x41414141          0x41414141

Disassembly for main:

   0x000011f9 <+0>:     endbr32
   0x000011fd <+4>:     push   ebp
   0x000011fe <+5>:     mov    ebp,esp
   0x00001200 <+7>:     and    esp,0xfffffff0
   0x00001203 <+10>:    call   0x1219 <__x86.get_pc_thunk.ax>
   0x00001208 <+15>:    add    eax,0x2dd0
   0x0000120d <+20>:    call   0x11cd <over>
   0x00001212 <+25>:    mov    eax,0x0
   0x00001217 <+30>:    leave
   0x00001218 <+31>:    ret

Disassembly for over:

   0x000011cd <+0>:     endbr32
   0x000011d1 <+4>:     push   ebp
   0x000011d2 <+5>:     mov    ebp,esp
   0x000011d4 <+7>:     push   ebx
   0x000011d5 <+8>:     sub    esp,0x14
   0x000011d8 <+11>:    call   0x1219 <__x86.get_pc_thunk.ax>
   0x000011dd <+16>:    add    eax,0x2dfb
   0x000011e2 <+21>:    sub    esp,0xc
   0x000011e5 <+24>:    lea    edx,[ebp-0x10]
   0x000011e8 <+27>:    push   edx
   0x000011e9 <+28>:    mov    ebx,eax
   0x000011eb <+30>:    call   0x1070 <gets@plt>
   0x000011f0 <+35>:    add    esp,0x10
   0x000011f3 <+38>:    nop
   0x000011f4 <+39>:    mov    ebx,DWORD PTR [ebp-0x4]
   0x000011f7 <+42>:    leave
   0x000011f8 <+43>:    ret

Here gdb prints a slightly different message, and the EIP is overwritten.

Why does this make a difference? Why is the EIP not overwritten when the buffer is created in the main function?

I am using: gcc version 9.3.0 (Ubuntu 9.3.0-17ubuntu1~20.04)

And compiled with: gcc -m32 -g -fno-stack-protector source.c -o vuln -z execstack

asked on Stack Overflow Apr 7, 2021 by Nick Pfeiffer • edited Apr 7, 2021 by Nick Pfeiffer

1 Answer

The difference is pretty arbitrary. The exact prologue/epilogue instruction sequence generated by GCC is different for over() in the second example than it is for main() in the first example. So it crashes in a very different way, from a debugger's point of view. After single-stepping in GDB, you can see why, and I have just killed some time doing so.

The stack is thoroughly corrupt upon returning from gets(), so all bets are off, but anyway, here goes. I run the first example, setting a breakpoint immediately after returning from the call to gets():

(gdb) disassemble main
Dump of assembler code for function main:
   0x0804842b <+0>: lea    0x4(%esp),%ecx
   0x0804842f <+4>: and    $0xfffffff0,%esp
   0x08048432 <+7>: pushl  -0x4(%ecx)
   0x08048435 <+10>:    push   %ebp
   0x08048436 <+11>:    mov    %esp,%ebp
   0x08048438 <+13>:    push   %ecx
   0x08048439 <+14>:    sub    $0x14,%esp
   0x0804843c <+17>:    sub    $0xc,%esp
   0x0804843f <+20>:    lea    -0x10(%ebp),%eax
   0x08048442 <+23>:    push   %eax
   0x08048443 <+24>:    call   0x80482e0 <gets@plt>
   0x08048448 <+29>:    add    $0x10,%esp
   0x0804844b <+32>:    mov    $0x0,%eax
   0x08048450 <+37>:    mov    -0x4(%ebp),%ecx
   0x08048453 <+40>:    leave  
   0x08048454 <+41>:    lea    -0x4(%ecx),%esp
   0x08048457 <+44>:    ret    
End of assembler dump.
(gdb) b *0x08048448
Breakpoint 1 at 0x8048448: file source.c, line 6.
(gdb) 

Now run the program, enter some garbage, hit the breakpoint, and start single-stepping:

(gdb) r
Starting program: /home/lstrand/tmp/vuln 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA

Breakpoint 1, 0x08048448 in main (argc=<error reading variable: Cannot access memory at address 0x41414141>, 
    argv=<error reading variable: Cannot access memory at address 0x41414145>) at source.c:6
6       gets(buf); 
(gdb) disassemble
Dump of assembler code for function main:
   0x0804842b <+0>: lea    0x4(%esp),%ecx
   0x0804842f <+4>: and    $0xfffffff0,%esp
   0x08048432 <+7>: pushl  -0x4(%ecx)
   0x08048435 <+10>:    push   %ebp
   0x08048436 <+11>:    mov    %esp,%ebp
   0x08048438 <+13>:    push   %ecx
   0x08048439 <+14>:    sub    $0x14,%esp
   0x0804843c <+17>:    sub    $0xc,%esp
   0x0804843f <+20>:    lea    -0x10(%ebp),%eax
   0x08048442 <+23>:    push   %eax
   0x08048443 <+24>:    call   0x80482e0 <gets@plt>
=> 0x08048448 <+29>:    add    $0x10,%esp
   0x0804844b <+32>:    mov    $0x0,%eax
   0x08048450 <+37>:    mov    -0x4(%ebp),%ecx
   0x08048453 <+40>:    leave  
   0x08048454 <+41>:    lea    -0x4(%ecx),%esp
   0x08048457 <+44>:    ret    
End of assembler dump.
(gdb) bt
#0  0x08048448 in main (argc=<error reading variable: Cannot access memory at address 0x41414141>, 
    argv=<error reading variable: Cannot access memory at address 0x41414145>) at source.c:6
Backtrace stopped: Cannot access memory at address 0x4141413d
(gdb) stepi
0x0804844b  6       gets(buf); 
(gdb) 
7   }
(gdb) 
0x08048453  7   }
(gdb) 
0x08048454  7   }
(gdb) 
0x08048457  7   }
(gdb) 

Program received signal SIGSEGV, Segmentation fault.
0x08048457 in main (argc=<error reading variable: Cannot access memory at address 0x41414141>, 
    argv=<error reading variable: Cannot access memory at address 0x41414145>) at source.c:7
7   }
(gdb) bt
#0  0x08048457 in main (argc=<error reading variable: Cannot access memory at address 0x41414141>, 
    argv=<error reading variable: Cannot access memory at address 0x41414145>) at source.c:7
Backtrace stopped: Cannot access memory at address 0x4141413d
(gdb) info reg
eax            0x0  0
ecx            0x41414141   1094795585
edx            0xf7fa589c   -134588260
ebx            0x0  0
esp            0x4141413d   0x4141413d
ebp            0x41414141   0x41414141
esi            0xf7fa4000   -134594560
edi            0x0  0
eip            0x8048457    0x8048457 <main+44>
eflags         0x10286  [ PF SF IF RF ]
cs             0x23 35
ss             0x2b 43
ds             0x2b 43
es             0x2b 43
fs             0x0  0
gs             0x63 99
(gdb) 

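Here, we die on the ret instruction in main() because the stack pointer esp has the bad value 0x4141413d. The epilogue reloaded ecx from the overwritten stack (it is now 0x41414141) and then set esp to ecx - 4. GDB correctly pinpoints the failing instruction as being in main().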

But what happens in the over() case? Let's take a look:

lstrand@styx:~/tmp$ gdb ./vuln2
GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
Reading symbols from ./vuln2...done.
(gdb) disassemble over
Dump of assembler code for function over:
   0x0804842b <+0>: push   %ebp
   0x0804842c <+1>: mov    %esp,%ebp
   0x0804842e <+3>: sub    $0x18,%esp
   0x08048431 <+6>: sub    $0xc,%esp
   0x08048434 <+9>: lea    -0x10(%ebp),%eax
   0x08048437 <+12>:    push   %eax
   0x08048438 <+13>:    call   0x80482e0 <gets@plt>
   0x0804843d <+18>:    add    $0x10,%esp
   0x08048440 <+21>:    nop
   0x08048441 <+22>:    leave  
   0x08048442 <+23>:    ret    
End of assembler dump.
(gdb) b *0x0804843d
Breakpoint 1 at 0x804843d: file source2.c, line 5.
(gdb) r
Starting program: /home/lstrand/tmp/vuln2 
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAa

Breakpoint 1, 0x0804843d in over () at source2.c:5
5       gets(buf); 
(gdb) disassemble
Dump of assembler code for function over:
   0x0804842b <+0>: push   %ebp
   0x0804842c <+1>: mov    %esp,%ebp
   0x0804842e <+3>: sub    $0x18,%esp
   0x08048431 <+6>: sub    $0xc,%esp
   0x08048434 <+9>: lea    -0x10(%ebp),%eax
   0x08048437 <+12>:    push   %eax
   0x08048438 <+13>:    call   0x80482e0 <gets@plt>
=> 0x0804843d <+18>:    add    $0x10,%esp
   0x08048440 <+21>:    nop
   0x08048441 <+22>:    leave  
   0x08048442 <+23>:    ret    
End of assembler dump.
(gdb) info reg
eax            0xffffd198   -11880
ecx            0xf7fa45c0   -134593088
edx            0xf7fa589c   -134588260
ebx            0x0  0
esp            0xffffd180   0xffffd180
ebp            0xffffd1a8   0xffffd1a8
esi            0xf7fa4000   -134594560
edi            0x0  0
eip            0x804843d    0x804843d <over+18>
eflags         0x246    [ PF ZF IF ]
cs             0x23 35
ss             0x2b 43
ds             0x2b 43
es             0x2b 43
fs             0x0  0
gs             0x63 99
(gdb) stepi
6   }
(gdb) 
0x08048441  6   }
(gdb) 
0x08048442  6   }
(gdb) stepi
0x41414141 in ?? ()
(gdb) info reg
eax            0xffffd198   -11880
ecx            0xf7fa45c0   -134593088
edx            0xf7fa589c   -134588260
ebx            0x0  0
esp            0xffffd1b0   0xffffd1b0
ebp            0x41414141   0x41414141
esi            0xf7fa4000   -134594560
edi            0x0  0
eip            0x41414141   0x41414141
eflags         0x286    [ PF SF IF ]
cs             0x23 35
ss             0x2b 43
ds             0x2b 43
es             0x2b 43
fs             0x0  0
gs             0x63 99
(gdb) stepi

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) 

Note the subtle difference here. In this case, the epilogue never reloads %esp from memory: "add $0x10,%esp" unwinds it with simple arithmetic (as opposed to restoring it from a value saved on the stack, as in the first case), and 'leave' takes the new %esp from the %ebp register, which is still intact. The 'leave' instruction does pop garbage into the frame pointer %ebp, but %esp remains valid. Then the ret instruction successfully executes, leaving us with a bad instruction pointer, 0x41414141. And then the program dies with SIGSEGV trying to read an instruction from nowhere.

In this case, GDB has no hope of unwinding the stack:

Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
(gdb) bt
#0  0x41414141 in ?? ()
#1  0x41414141 in ?? ()
#2  0x41414141 in ?? ()
#3  0x41414141 in ?? ()
#4  0x41414141 in ?? ()
#5  0xf7006141 in ?? ()
#6  0xf7fa4000 in ?? () from /lib/i386-linux-gnu/libc.so.6
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) 

Recall that in the first case, the program died on the ret instruction itself because %esp was already bad. In the first case GDB can still find where the program is, but in the second case it cannot.

answered on Stack Overflow Apr 7, 2021 by Leif Strand

User contributions licensed under CC BY-SA 3.0