I have an ARM binary, and I need to find the exact addresses at which each function's prologue ends and its epilogue begins. In other words, I need the boundaries of the function bodies. For instance, if I have a function whose assembly is something like:
0x00000320 <+0>: push {r7, lr}
0x00000322 <+2>: sub sp, #16
0x00000324 <+4>: add r7, sp, #0
0x00000326 <+6>: str r0, [r7, #4]
0x00000328 <+8>: (Function body starts here)
...
0x0000034c <+44>: (Function body ends here)
0x0000034e <+46>: mov sp, r7
0x00000350 <+48>: pop {r7, pc}
I need a way to quickly find either 0x00000326 and 0x0000034e (prologue end/epilogue start) or 0x00000328 and 0x0000034c (function body start/end) using something like readelf or objdump. Simply disassembling it and inspecting the code won't do (ideally I'd be using a script to parse the output of readelf or whatever program I'm using to get the DWARF info).
According to the DWARF 4 standard, the .debug_line section supposedly has line number info which includes "prologue_end" and "epilogue_begin", which is exactly what I need. However, the output of arm-linux-readelf --debug-dump=rawline,decodedline doesn't give me that info.
I'm compiling using gcc 4.8.2 with the -ggdb3 flag.
EDIT: Some more info: both objdump and readelf show me something like this:
Line Number Statements:
[0x00000074] Extended opcode 2: set Address to 0x100
[0x0000007b] Advance Line by 302 to 303
[0x0000007e] Copy
[0x0000007f] Special opcode 34: advance Address by 4 to 0x104 and Line by 1 to 304
[0x00000080] Special opcode 34: advance Address by 4 to 0x108 and Line by 1 to 305
[0x00000081] Special opcode 37: advance Address by 4 to 0x10c and Line by 4 to 309
[0x00000082] Special opcode 34: advance Address by 4 to 0x110 and Line by 1 to 310
[0x00000083] Special opcode 20: advance Address by 2 to 0x112 and Line by 1 to 311
[0x00000084] Special opcode 37: advance Address by 4 to 0x116 and Line by 4 to 315
[0x00000085] Special opcode 34: advance Address by 4 to 0x11a and Line by 1 to 316
[0x00000086] Advance Line by -13 to 303
[0x00000088] Special opcode 19: advance Address by 2 to 0x11c and Line by 0 to 303
[0x00000089] Special opcode 34: advance Address by 4 to 0x120 and Line by 1 to 304
[0x0000008a] Advance PC by 4 to 0x124
[0x0000008c] Extended opcode 1: End of Sequence
Looking at the source of binutils' dwarf.c, it seems that it should be printing something like "Set prologue_end to true" and "Set epilogue_begin to true" in the line info dump. However, all of the opcodes seem to be special instead of standard.
Try readelf -wi and look for DW_AT_low_pc and DW_AT_high_pc for the subroutine you are looking at.
The DWARF spec says:
A subroutine entry may have either a DW_AT_low_pc and DW_AT_high_pc pair of attributes or a DW_AT_ranges attribute whose values encode the contiguous or non-contiguous address ranges, respectively, of the machine instructions generated for the subroutine (see Section 2.17).
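Note that this pair bounds the whole subroutine, prologue and epilogue included, rather than the body alone: DW_AT_low_pc is the address of the subroutine's first instruction, and DW_AT_high_pc marks the first location past its last instruction (in DWARF 4 it may be encoded as an offset from DW_AT_low_pc instead of an address).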
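If you would rather script this than read the readelf -wi dump by eye, here is a minimal Python sketch using pyelftools (a third-party package, assumed to be installed). The helper names resolve_high_pc and subprogram_ranges are made up for illustration. The only subtle part is that DW_AT_high_pc is an absolute address when its form is DW_FORM_addr and an offset from DW_AT_low_pc otherwise.

```python
def resolve_high_pc(low_pc, high_pc_value, high_pc_form):
    """Return the first address past the subroutine (exclusive end).

    DW_AT_high_pc is an absolute address when its form is DW_FORM_addr;
    with any constant form it is an offset from DW_AT_low_pc.
    """
    if high_pc_form == "DW_FORM_addr":
        return high_pc_value
    return low_pc + high_pc_value


def subprogram_ranges(path):
    """Yield (name, low_pc, end) for each subprogram DIE that has both attributes.

    Hypothetical helper: subroutines described by DW_AT_ranges are skipped.
    """
    from elftools.elf.elffile import ELFFile  # third-party: pyelftools

    with open(path, "rb") as f:
        dwarf = ELFFile(f).get_dwarf_info()
        for cu in dwarf.iter_CUs():
            for die in cu.iter_DIEs():
                attrs = die.attributes
                if die.tag != "DW_TAG_subprogram":
                    continue
                if "DW_AT_low_pc" not in attrs or "DW_AT_high_pc" not in attrs:
                    continue
                low = attrs["DW_AT_low_pc"].value
                high = attrs["DW_AT_high_pc"]
                # pyelftools returns string attribute values as bytes
                name = attrs["DW_AT_name"].value.decode() if "DW_AT_name" in attrs else "?"
                yield name, low, resolve_high_pc(low, high.value, high.form)
```

Calling subprogram_ranges on the binary's path (any ELF file with DWARF info) should yield one tuple per function, which you can then compare against the disassembly.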
Don't worry about the opcodes being 'special'; that just means that each one encodes both its address advance and its line advance in a single byte, with no arguments, to save space in the encoded line number program.
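To make that concrete, here is a small Python sketch of how a special opcode unpacks into the two advances. The header values are assumptions, since the dump in the question does not show the line program header: line_base = -5, line_range = 14 and a minimum instruction length of 2 were picked because they reproduce the dump's entries. It also assumes the number readelf prints after "Special opcode" is already the adjusted opcode (the raw opcode minus opcode_base).

```python
def decode_special(adjusted_opcode, line_base=-5, line_range=14, min_inst_length=2):
    """Split an adjusted special opcode into (address advance, line advance).

    The defaults are assumed header values, not taken from the question's dump.
    """
    address_advance = (adjusted_opcode // line_range) * min_inst_length
    line_advance = line_base + (adjusted_opcode % line_range)
    return address_advance, line_advance


# "Special opcode 34: advance Address by 4 ... and Line by 1"
print(decode_special(34))  # (4, 1)
# "Special opcode 37: advance Address by 4 ... and Line by 4"
print(decode_special(37))  # (4, 4)
```

So a dump made up only of special opcodes is normal. The prologue_end and epilogue_begin flags would show up as separate standard opcodes, and only if the compiler emitted them.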
This is an extremely belated response, but: if you're using LLVM, then you can get this information from the DWARF line program state machine: LLVM emits DW_LNS_set_prologue_end and DW_LNS_set_epilogue_begin as flags for each line program entry.
GCC apparently doesn't (yet) support these attributes, so it's likely that its surrounding tooling (like readelf and gdb) doesn't either. But you should be able to parse the prologues and epilogues out with pyelftools.
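A rough sketch of what that could look like with pyelftools (third-party, assumed to be installed). The helper names line_states and flagged_addresses are made up for illustration. The sketch walks every compile unit's line program and collects the addresses of rows whose prologue_end or epilogue_begin flag is set. If the compiler never emitted those opcodes, both lists simply come back empty.

```python
def line_states(path):
    """Yield the state of every line program entry in the ELF file at `path`."""
    from elftools.elf.elffile import ELFFile  # third-party: pyelftools

    with open(path, "rb") as f:
        elf = ELFFile(f)
        if not elf.has_dwarf_info():
            return
        dwarf = elf.get_dwarf_info()
        for cu in dwarf.iter_CUs():
            program = dwarf.line_program_for_CU(cu)
            if program is None:
                continue
            for entry in program.get_entries():
                # state is None for entries that do not add a row
                yield entry.state


def flagged_addresses(states):
    """Return (prologue_end addresses, epilogue_begin addresses)."""
    prologue_ends, epilogue_begins = [], []
    for state in states:
        if state is None:
            continue
        if state.prologue_end:
            prologue_ends.append(state.address)
        if state.epilogue_begin:
            epilogue_begins.append(state.address)
    return prologue_ends, epilogue_begins
```

Usage would be something like flagged_addresses(line_states(path)) with the path to your binary. Keeping the filtering separate from the file handling means it can be tried on plain objects without an ELF file at hand.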