We are looking for a procedure through which we can easily list all the files that are compiled together to make an executable.
Use case: Suppose we have a large repository and we want to know which files in the repository are compiled to make an executable (e.g. a.out).
For example:
    dwarfdump a.out | grep "NS uri"
    0x0000064a [ 9, 0] NS uri: "/home/main.c"
    0x000006dd [ 2, 0] NS uri: "/home/zzzz.c"
    0x000006f1 [ 2, 0] NS uri: "/home/yyyy.c"
    0x00000705 [ 2, 0] NS uri: "/home/xxxx.c"
    0x00000719 [ 2, 0] NS uri: "/home/wwww.c"
But it doesn't list all the header files. Please suggest.
How to Extract Source Code From an Executable with Debug Symbols Available?
You cannot do that. I guess you are on Linux/x86-64 (your question is specific to the operating system, the ABI, and the debugging format). Of course, you should pass -g (or even -g3) to all the gcc compilation commands for your executable. Without that option used to compile every translation unit (including perhaps those of shared libraries!), you might not have enough information.
Even with debug information in DWARF format, the ELF executable doesn't contain source code, only references to source code (e.g. source file path, position as line and column numbers). So the debug information contains things like file src/foo.c, line 34, column 5 (but gives nothing about the content of src/foo.c near that position). Of course, once gdb knows the file path src/foo.c, it is able to read that source file (if it is available and up to date w.r.t. the executable), so it can list it.
Extracting that debugging metadata is a different question. Once you have understood DWARF, you could use tools like addr2line, dwarfdump, or libdwarf; you could also script gdb (recent versions of GDB can be extended in Python or in Guile) and use it on your ELF executable.
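As a rough illustration of the kind of reference the debug information gives you, here is a minimal Python sketch that parses one line of addr2line output into a file path and a line number. It assumes the usual "path:line" output shape, with "??" for unknown locations and an optional " (discriminator N)" suffix that some binutils versions print. The function name parse_addr2line is hypothetical.

```python
def parse_addr2line(line):
    """Split one line of addr2line output ("path:line") into (path, line_number).

    Returns None when addr2line could not resolve the address ("??:0" or "??:?").
    The line number is None when it is reported as "?".
    """
    text = line.strip()
    # Assumption: some versions append a suffix such as " (discriminator 2)".
    text = text.split(" (", 1)[0]
    path, _, number = text.rpartition(":")
    if path in ("", "??"):
        return None
    return path, (int(number) if number.isdigit() else None)


# Typical use (the address below is only a placeholder):
#   addr2line -e a.out 0x401136
# and feed each printed line to parse_addr2line().
```

Note that this yields only the position; reading the text of the source file at that position is still up to you, and works only if the file is available.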
Perhaps you should consider Ian Taylor's libbacktrace. It uses the DWARF information to provide nice looking backtraces at runtime.
cgdb is (like ddd) only a front-end to gdb, which does all the real work of processing that DWARF information. It is free software, so you can study its source code.
I have only a.out, then I want to list down the file names
You might try dwarfdump -i a.out | grep DW_AT_decl_file, and you could use some GNU awk command instead of grep. You need to dive into the details of the DWARF specifications, and you need to understand more about the elf(5) format.
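If you prefer a script to a grep or awk pipeline, the filtering can be sketched in Python. This is only a sketch under one assumption: that your dwarfdump prints file paths in fields of the form uri: "/path/file.c", as in the output quoted in the question. Other dwarfdump versions format their output differently, so the pattern would need adjusting. The function names are hypothetical.

```python
import re
import subprocess

# Assumption: paths appear as  uri: "/some/path.c"  like in the question's output.
URI_PATTERN = re.compile(r'uri:\s*"([^"]+)"')


def source_files_from_dwarfdump(text):
    """Return the unique file paths found in dwarfdump text, in first-seen order."""
    seen = []
    for match in URI_PATTERN.finditer(text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen


def list_sources(executable):
    """Run dwarfdump on the executable and extract the file paths it mentions."""
    result = subprocess.run(
        ["dwarfdump", executable], capture_output=True, text=True, check=True
    )
    return source_files_from_dwarfdump(result.stdout)
```

Calling list_sources("a.out") requires dwarfdump to be installed; source_files_from_dwarfdump can be tried on saved output without it.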
It doesn't list all the header files
This is expected. Most header files don't contain any code, only declarations (e.g. printf is not implemented in <stdio.h> but in some C source file of your C standard library, e.g. in tree/src/stdio/printf.c if you use musl-libc; it is just declared in /usr/include/stdio.h). DWARF (like other debug information formats) describes the binary code. And some header files get included only to give access to a few preprocessor macros (which get expanded or skipped at preprocessing time).
Maybe you dream of homoiconic programming languages; if so, try Common Lisp (e.g. with SBCL).
If your question is how to use gdb, then please read the Debugging with GDB manual.
If your question is about decompilers, be aware that it is an impossible task in general (e.g. because of Rice's theorem). BTW, programs inside most Linux distributions are generally free software, so it is quite easy to get the source code (and you could even avoid using proprietary software on Linux).
BTW, you could also do more things at compilation time by passing more flags to gcc. You might pass -M (etc.) to gcc (in addition to -g). You could even consider writing your own GCC plugin to collect the information you want in some database (but that is probably not worth the effort). You could also consider improving your build automation (e.g. adding more to your Makefile) to collect such information. BTW, many large C programs use some metaprogramming techniques, with some .c files perhaps containing #line directives generated by tools (e.g. bison) or scripts; then what kind of file path do you want to keep?
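For instance, gcc -M prints a Makefile-style rule whose prerequisites are the source file and every header it includes. The Python sketch below turns such a rule into a list of files. It assumes paths contain no spaces or colons (Make escapes those, and this sketch does not handle the escapes); the function name parse_make_rule is hypothetical.

```python
def parse_make_rule(text):
    """Parse Makefile-style rules ("target: dep dep ...") as printed by gcc -M.

    Returns a dict mapping each target to the list of its prerequisites.
    Assumption: no path contains spaces or colons.
    """
    # A backslash at end of line continues the rule on the next line.
    joined = text.replace("\\\n", " ")
    rules = {}
    for line in joined.splitlines():
        if ":" not in line:
            continue
        target, _, prerequisites = line.partition(":")
        rules.setdefault(target.strip(), []).extend(prerequisites.split())
    return rules


# Typical use, once per translation unit:
#   gcc -M main.c
# and pass the printed text to parse_make_rule().
```

Merging the prerequisite lists of all translation units gives the headers as well as the .c files, which is the part that the debug information alone did not provide.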
We are looking for a procedure through which we can easily list all the files which are compiled together to make an executable.
If you are writing that executable and compiling it from its source code, I would suggest collecting that information at build time. It could be as trivial as passing some -H flag to gcc, perhaps saving the result into some generated timestamp.c file (see this for inspiration; but your timestamp.c might contain information provided by gcc -M etc.). Your timestamp file might contain git version control metadata (like that generated in this Makefile). Read also about reproducible builds and about package managers.
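One possible shape for such a generated file is sketched below in Python: it renders a small C source file that embeds a build timestamp and the list of files collected at build time. The variable names build_timestamp and build_source_files, and the layout of the generated file, are purely illustrative assumptions, not something taken from the Makefile mentioned above.

```python
def c_string(text):
    """Return text as a C string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def render_timestamp_c(source_files, build_time):
    """Render the text of a C file listing the files that went into the build."""
    lines = [
        "/* Generated at build time; do not edit by hand. */",
        f"const char build_timestamp[] = {c_string(build_time)};",
        "const char *const build_source_files[] = {",
    ]
    lines += [f"  {c_string(path)}," for path in source_files]
    lines += ["  0  /* end marker */", "};", ""]
    return "\n".join(lines)
```

A build rule could write this text to timestamp.c and compile it into the executable, so the list can later be printed by the program itself or found in the binary.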