Does the loader modify relocation information on program startup?

Question

I have always believed that resolving absolute addresses is completely the linker's job. That is, after the linker combines all object files into one executable, it modifies all absolute addresses to reflect their new locations inside the executable. But after reading that the loader does not have to place the program's text at the address specified by the linker, I got really confused.

Take this code, for example:

Main.c

void printMe(void);  /* defined in Foo.c */

int main(void) {
    printMe();
    return 0;
}

Foo.c

#include <stdio.h>

/* Lots of other functions */
void printMe(void) {
    printf("Hello");
}

Say that after linking, the code for main is placed at address 0x00000010 and the code for printMe at address 0x00000020. If the loader then loads main and printMe at exactly those virtual addresses when the program is launched, everything works. But if the loader does not load the program that way, won't that break all absolute address references?

linux

compilation

linker

loader

virtual-memory

asked on Stack Overflow Apr 29, 2018 by

GamefanA

2 Answers

A program is normally composed of several modules created by a linker: the executable and usually a number of shared libraries. On some systems, one executable can load another executable and call its starting routine as a function.

If all of these compiled modules had fixed addresses, conflicts would be likely at load time. If two linked modules used the same address, the application could not load.

For decades, relocatable code has been the solution to that problem. A module can be loaded anywhere. Some systems take this a step further and place modules at random locations in memory for security.

There are some situations where code cannot be purely relocatable.

If you have something like this:

static int b, *a = &b ;

the initialization depends on where the module is placed in memory (and therefore on where "b" is located). Linkers usually generate relocation information for such constructs so that the loader can fix them up.

Thus, this is not correct:

I have always believed that resolving absolute addresses is completely the linker's job.

answered on Stack Overflow Apr 30, 2018 by

user3344003

As far as I know, that is not a problem here.

If it is linked statically, then the address of the function is calculated statically by the linker. Because the relative distance between the caller and the function is known, a relative function call is issued, and everything will be fine wherever the code is loaded.

If it is linked dynamically, then ld.so comes in and loads the library. The symbol is resolved either by Load-time relocation of shared libraries or by Position Independent Code (PIC) in shared libraries (these two articles weren't written by me).

Simply put:

Load-time relocation rewrites the code so that it contains the correct addresses, which means the code can no longer be write-protected or shared among different processes.
PIC works by adding two sections, the GOT and the PLT, at offsets that are known at link time. A call to a function in a dynamic library first calls a ...@plt stub (e.g. printf@plt), which then jumps to *GOT[offset]. On the first call, that GOT entry holds the address of the stub's next instruction, which calls the dynamic loader to resolve the function. On later calls, the entry holds the address of the function itself. As you can see, this costs additional memory and time compared to normal code.

answered on Stack Overflow Apr 29, 2018 by

JiaHao Xu

User contributions licensed under CC BY-SA 3.0