I looked at the process image with pmap under Linux:
08048000 0 4 0 r-x-- [my program]
08049000 0 4 4 rw--- [my program]
The three segments above are the code, rodata and data segments, which are all aligned to the page size (4K). But when I run the command objdump -h, the ELF headers are displayed as follows:
read-only code segment
Load off 0x00000000 vaddr 0x08048000 paddr 0x08048000 align 2**12
filesz 0x00000448 memsz 0x00000448 flags r-x
read/write data segment
Load off 0x00000448 vaddr 0x08049448 paddr 0x08049448 align 2**12
filesz 0x000000e8 memsz 0x00000104 flags rw-
According to the ELF headers, the code segment and the data segment start at virtual addresses 0x08048000 and 0x08049448 respectively, which differs from the process image in memory. I know the code and data segments should be placed on different pages so that they can be given different protection permissions. However, how can the program execute if the actual virtual addresses differ from the ones in the ELF binary?
ELF program loading (and memory mapping from files in general) works on a page basis. So for each mapping, the address, the offset in the file, and the size must all be multiples of the page size.
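To see the page-granularity rule in action, here is a minimal sketch using Python's mmap module. The helper name can_map_at and the scratch file are made up for illustration; the only assumption is that the OS rejects a file mapping whose offset is not a multiple of mmap.ALLOCATIONGRANULARITY (the page size on Linux).

```python
import mmap
import tempfile

def can_map_at(offset, length=0x100):
    """Return True if the OS accepts a file mapping that starts at `offset`.

    Hypothetical helper for illustration only.
    """
    granularity = mmap.ALLOCATIONGRANULARITY  # equals the page size on Linux
    with tempfile.TemporaryFile() as f:
        # Scratch file large enough for either attempt below.
        f.write(b"\0" * (2 * granularity))
        f.flush()
        try:
            m = mmap.mmap(f.fileno(), length,
                          access=mmap.ACCESS_READ, offset=offset)
        except (OSError, ValueError):
            # The kernel (or the mmap module) refused the unaligned offset.
            return False
        m.close()
        return True

print(can_map_at(0))      # aligned offset, like the code segment
print(can_map_at(0x448))  # unaligned offset, like the data segment
```

This is why a loader cannot map the data segment directly at file offset 0x448 and has to round first.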
However, the program loader is smart enough to deal with segments that do not begin or end exactly on a page boundary: it rounds the start down and the end up to page boundaries, mapping more than is strictly required. So some extra data will be loaded from the file to fill out the page, but the program shouldn't access it, so that should not matter.
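The rounding can be sketched in a few lines of Python. This is only the arithmetic, not the actual loader code; the function name page_span and the fixed 4K page size are assumptions taken from the question.

```python
PAGE_SIZE = 0x1000  # assumed 4K pages, as in the question

def page_span(vaddr, offset, memsz, page_size=PAGE_SIZE):
    """Return (map_start, map_offset, map_length) for the page-aligned
    mapping that covers a segment. Hypothetical helper for illustration."""
    mask = page_size - 1
    map_start = vaddr & ~mask                 # round the address down
    map_offset = offset & ~mask               # round the file offset down
    map_end = (vaddr + memsz + mask) & ~mask  # round the end up
    return map_start, map_offset, map_end - map_start

# Values from the two LOAD headers in the question.
for name, vaddr, offset, memsz in [
    ("code", 0x08048000, 0x000, 0x448),
    ("data", 0x08049448, 0x448, 0x104),
]:
    start, off, length = page_span(vaddr, offset, memsz)
    print(f"{name}: map {length:#x} bytes at {start:#010x} from file offset {off:#x}")
```

Both segments come out as a single 0x1000-byte mapping starting at file offset 0, at 0x08048000 and 0x08049000 respectively, which matches the two lines pmap shows.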
In your example, the code segment loads at address 0x08048000 from offset 0x0, with a size of 0x448. The address and offset are aligned, so just the size needs to be rounded up to a full page. The data segment loads at 0x08049448 from offset 0x448. Those aren't aligned, but they are compatible: both have the same offset within a page (0x448), so the loader rounds both down to a page multiple (0x08049000 and 0x000) and maps in that page. Note that this ends up being the same page from the file as the code segment, so that page is loaded at two different addresses, one read-only, the other read-write and non-shared. So the code and data both end up visible in two places in the process image, but that is unimportant: the code ends up r-x at 0x8048000..0x8048447 and the data ends up rw- at 0x8049448..0x804954b, which is all that matters.
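The "same file page at two addresses" effect can be reproduced from user space. The following is a rough sketch with Python's mmap module, not what the kernel's ELF loader literally does: the scratch file, the fill bytes and the name map_page_twice are invented for illustration, and ACCESS_COPY is used as a stand-in for the private (non-shared) read-write mapping.

```python
import mmap
import tempfile

CODE_SIZE = 0x448  # filesz of the code segment in the question
DATA_SIZE = 0xe8   # filesz of the data segment in the question

def map_page_twice():
    """Map one file page twice: read-only and private read-write."""
    with tempfile.TemporaryFile() as f:
        # Fake "ELF file": code bytes followed directly by data bytes.
        f.write(b"C" * CODE_SIZE + b"D" * DATA_SIZE)
        f.flush()
        size = CODE_SIZE + DATA_SIZE
        ro = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ)  # like r-x
        rw = mmap.mmap(f.fileno(), size, access=mmap.ACCESS_COPY)  # like rw-, private
        # Write to the "data" through the private mapping (copy-on-write).
        rw[CODE_SIZE:CODE_SIZE + 4] = b"EDIT"
        seen_rw = rw[CODE_SIZE:CODE_SIZE + 4]
        seen_ro = ro[CODE_SIZE:CODE_SIZE + 4]
        f.seek(CODE_SIZE)
        on_disk = f.read(4)
        ro.close()
        rw.close()
        return seen_rw, seen_ro, on_disk

print(map_page_twice())
```

The write shows up only in the private mapping; the read-only mapping and the file itself still hold the original bytes, which is why mapping the shared page a second time as writable is harmless.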