I am playing around with CPython and trying to understand how a debugger works.
Specifically, I am trying to get the location of the last PyFrameObject so that I can traverse that and get the Python backtrace.
In the file ceval.c, line 689 has the definition of the function:
PyObject * PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
What I am interested in getting is the location of f on the stack. When dumping the binary with dwarfdump I get that f is at $rbp-824, but if I dump the binary with objdump I get that the location is $rbp-808 - a discrepancy of 16. Also, when debugging with GDB, I get that the correct answer is $rbp-808, like objdump gives me. Why the discrepancy, and why is dwarfdump incorrect? What am I not understanding?
How to technically recreate the problem:
Download python-2.7.17.tgz from Python website. Extract.
I compiled python-2.7.17 from source with debug symbols (./configure --enable-pydebug && make). Run the following commands on the resulting python binary:
dwarfdump Python-2.7.17/python
has the following output:
DW_AT_name f
DW_AT_decl_file 0x00000001 /home/meir/code/python/Python-2.7.17/Python/ceval.c
DW_AT_decl_line 0x000002b1
DW_AT_type <0x00002916>
DW_AT_location len 0x0003: 91c879: DW_OP_fbreg -824
I know this is the correct f because the line the variable is declared on is 689 (0x2b1). As you can see the location is:
DW_AT_location len 0x0003: 91c879: DW_OP_fbreg -824
Meaning $rbp-824.
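As a sanity check on the dwarfdump line itself, the three bytes 91c879 can be decoded by hand: 0x91 is the DW_OP_fbreg opcode and the remaining bytes are its operand, a signed LEB128 integer. The sketch below is a minimal decoder written for illustration (the helper name decode_sleb128 is made up here, not part of any tool); it only confirms that the offset is -824, not what that offset is relative to.

```python
def decode_sleb128(data: bytes) -> int:
    """Decode one signed LEB128 value from the start of `data`."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        shift += 7
        if not (byte & 0x80):              # high bit clear: this was the last byte
            if byte & 0x40:                # sign bit of the final 7-bit group
                result -= 1 << shift       # sign-extend
            return result
    raise ValueError("truncated SLEB128 value")


DW_OP_FBREG = 0x91                         # opcode value from the DWARF standard
expr = bytes.fromhex("91c879")             # bytes shown in DW_AT_location
assert expr[0] == DW_OP_FBREG
offset = decode_sleb128(expr[1:])
print(offset)                              # -824
```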
Running the command objdump -S Python-2.7.17/python has the following output:
PyEval_EvalFrameEx(PyFrameObject *f, int throwflag)
{
f7577: 55 push %rbp
f7578: 48 89 e5 mov %rsp,%rbp
f757b: 41 57 push %r15
f757d: 41 56 push %r14
f757f: 41 55 push %r13
f7581: 41 54 push %r12
f7583: 53 push %rbx
f7584: 48 81 ec 38 03 00 00 sub $0x338,%rsp
f758b: 48 89 bd d8 fc ff ff mov %rdi,-0x328(%rbp)
f7592: 89 b5 d4 fc ff ff mov %esi,-0x32c(%rbp)
f7598: 64 48 8b 04 25 28 00 mov %fs:0x28,%rax
f759f: 00 00
f75a1: 48 89 45 c8 mov %rax,-0x38(%rbp)
f75a5: 31 c0 xor %eax,%eax
Debugging this output will show you that the relevant line is:
f758b: 48 89 bd d8 fc ff ff mov %rdi,-0x328(%rbp)
where you can clearly see that f (passed in %rdi) is being stored to -0x328(%rbp), which is $rbp-808. GDB also supports this finding.
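The displacement can also be read straight from the instruction bytes in the listing: the last four bytes of 48 89 bd d8 fc ff ff are a little-endian signed 32-bit displacement. A small check, using only the bytes shown above:

```python
import struct

# Last four bytes of the instruction at f758b: the rbp-relative displacement.
disp_bytes = bytes.fromhex("d8fcffff")
(disp,) = struct.unpack("<i", disp_bytes)   # little-endian signed 32-bit
print(disp, hex(disp))                      # -808 -0x328
```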
So again, the question is: what am I missing, and why the 16-byte discrepancy between dwarfdump and reality?
Thanks
Edit:
The dwarfdump output including the function above is:
< 1><0x00004519> DW_TAG_subprogram
DW_AT_external yes(1)
DW_AT_name PyEval_EvalFrameEx
DW_AT_decl_file 0x00000001 /home/meir/code/python/Python-2.7.17/Python/ceval.c
DW_AT_decl_line 0x000002b1
DW_AT_prototyped yes(1)
DW_AT_type <0x00000817>
DW_AT_low_pc 0x000f7577
DW_AT_high_pc <offset-from-lowpc>53969
DW_AT_frame_base len 0x0001: 9c: DW_OP_call_frame_cfa
DW_AT_GNU_all_tail_call_sites yes(1)
DW_AT_sibling <0x00005bbe>
< 2><0x0000453b> DW_TAG_formal_parameter
DW_AT_name f
DW_AT_decl_file 0x00000001 /home/meir/code/python/Python-2.7.17/Python/ceval.c
DW_AT_decl_line 0x000002b1
DW_AT_type <0x00002916>
DW_AT_location len 0x0003: 91c879: DW_OP_fbreg -824
According to the answer below, DW_OP_fbreg is an offset from the frame base, which in my case is DW_OP_call_frame_cfa. I am having trouble identifying the frame base. My registers are as follows:
(gdb) info registers
rax 0xfffffffffffffdfe -514
rbx 0x7f6a4887d040 140094460121152
rcx 0x7f6a48e83ff7 140094466441207
rdx 0x0 0
rsi 0x0 0
rdi 0x0 0
rbp 0x7ffd24bcef00 0x7ffd24bcef00
rsp 0x7ffd24bceba0 0x7ffd24bceba0
r8 0x7ffd24bcea50 140725219813968
r9 0x0 0
r10 0x0 0
r11 0x246 582
r12 0x7f6a48870df0 140094460071408
r13 0x7f6a48874b58 140094460087128
r14 0x1 1
r15 0x7f6a48873794 140094460082068
rip 0x5559834e99c0 0x5559834e99c0 <PyEval_EvalFrameEx+46153>
eflags 0x246 [ PF ZF IF ]
cs 0x33 51
ss 0x2b 43
ds 0x0 0
es 0x0 0
fs 0x0 0
gs 0x0 0
As stated above, I already know that %rbp-808 works. What is the correct way to do it with the registers that I have?
Edit:
I finally understood the answer. I needed to unwind one more frame and find the place my function was called from. There, the value of $rsp really was the frame base I was looking for, and $rsp-824 was correct.
DW_OP_fbreg -824: Meaning $rbp-824
It does not mean that. It means an offset of -824 from the frame base (a virtual register), which is not necessarily (nor usually) equal to $rbp.
You need to look for DW_AT_frame_base to know what the frame base in the current function is.
Most likely it's defined as DW_OP_call_frame_cfa, which is the value of $RSP just before the current function was called, and is equal to $RBP+16 (8 bytes for the return address pushed by the CALL instruction, and 8 bytes for the previous $RBP saved by the first instruction of your function). That gives $RBP+16-824 = $RBP-808, which matches what objdump and GDB show.
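To check this against the numbers in the question: with the push %rbp / mov %rsp,%rbp prologue visible in the objdump listing, the CFA lies 16 bytes above $rbp. The sketch below plugs in the rbp value from the posted info registers output. It assumes $rbp still holds the frame pointer at the point where the registers were captured, which is what one would expect in an unoptimized --with-pydebug build, but is an assumption here.

```python
RBP = 0x7ffd24bcef00              # rbp from `info registers` in the question

# Assumed prologue: CALL pushes the return address (8 bytes), then
# `push %rbp` saves the caller's rbp (8 bytes), then `mov %rsp,%rbp`.
CFA = RBP + 16                    # caller's rsp just before the CALL

f_slot_dwarf = CFA - 824          # DW_OP_fbreg -824, frame base = CFA
f_slot_objdump = RBP - 0x328      # mov %rdi,-0x328(%rbp)

assert f_slot_dwarf == f_slot_objdump
print(hex(f_slot_dwarf))          # 0x7ffd24bcebd8
```

Both routes land on the same stack slot, so dwarfdump and objdump agree once the frame base is taken to be the CFA rather than $rbp.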