Discrepancy in behavior of Linux loaders (ld-linux-x86-64) between Glibc 2.12 and Glibc 2.17

0

I'm trying to compile the same lib on two x86 separate machines. Both use the same toolchain (exactly same set of files) but have different Glibc versions.

When I run command LD_DEBUG=libs /lib64/ld-linux-x86-64.so.2 --list ./libl2ps.so I notice the following discrepancy between the 2 Linux loaders:

Machine 1 (with Glibc 2.12):

 19943: find library=libm.so.6 [0]; searching
 19943:  search path=/ebs/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64:...:/ebs/frperies/repo/gnb/uplane/build_bbp/l2_ps/build/.      (RPATH from file ./libl2ps.so)
 19943:   trying file=/ebs/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm.so.6
 19943: 
 19943: find library=libgcc_s.so.1 [0]; searching
 ...

In this case the Linux loader selects lib libm.so.6 from the toolchain path based on RPATH of lib libl2ps.so.

Machine 2 (with Glibc 2.17):

 10699: find library=libm.so.6 [0]; searching
 10699:  search path=/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64:/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/lib64:/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib:/home/frperies/repo/gnb/uplane/build_bbp/l2_ps/build/.        (RPATH from file ./libl2ps.so)
 10699:   trying file=/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm.so.6
 10699:   trying file=/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/lib64/libm.so.6
 10699:   trying file=/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib/libm.so.6
 10699:   trying file=/home/frperies/repo/gnb/uplane/build_bbp/l2_ps/build/./libm.so.6
 10699:  search cache=/etc/ld.so.cache
 10699:   trying file=/lib64/libm.so.6

As for Machine 1, the loader attempts from RPATH of libl2ps.so to select lib libm.so.6 from toolchain path but skip it for some reason and try further other paths. Finally it selects libm.so.6from the system path /lib64/.

The RPATH of the 2 libs lib2ps.so are exactly the same. The two files libm.so.6 are also exactly the same on both machines (checked with md5sum).

I don't understand this differences of behavior between the 2 Linux loaders. Do you see any reason what would explain this discrepancy ?

Thank you very much for your answers.

Update:

Thank you yugr for your answer.

Output of readelf -h gives only differences on fields "Entry point address" and "Start of section headers" and there is no other differences so I think it will not help.

Regarding using dlopen()/dlerror(), I've done a little executable with the following statement:

dlopen("/home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm-2.28.so", RTLD_LAZY);

On machine 1 it works as expected:

C++ dlopen demo

Opening libm-2.28.so...
Closing library...

On machine 2 it fails and dlerror() gives the following output:

Cannot open library: /home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm-2.28.so: cannot open shared object file: No such file or directory

but the file libm-2-28.so really exists on my file system:

$ ls -l /home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic-linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm-2.28.so
-rwxr-xr-x 1 frperies linseeusers_lte_espoo 1682944 Oct  5 13:50 /home/frperies/repo/gnb/uplane/build/prefix-root/asik-x86_64-ps_lfs-dynamic- linker-on/toolchain/sysroots/core2-64-pc-linux-gnu/usr/lib64/libm-2.28.so

This is very weird, what could lead to this situation ???

Thanks

Update 2:

That is true that I haven't pointed out that machine 1 is a RHEL6.8 distro while machine 2 is RHEL7.4 distro. I (naively?) didn't think this really matters...

On machine 1:

$ cat /proc/sys/kernel/osrelease
4.4.115-1.NSN.el6.x86_64

$ uname -a
Linux sq24-3 4.4.115-1.NSN.el6.x86_64 #1 SMP Mon Feb 12 12:35:46 CET 2018 x86_64 x86_64 x86_64 GNU/Linux

$ readelf -n libl2ps.so 

Notes at offset 0x00000270 with length 0x00000024:
  Owner                 Data size   Description
  GNU                  0x00000014   NT_GNU_BUILD_ID (unique build ID bitstring)
  Build ID: b598468830fdf2f61eda25553b9a367c4d28cdc9

On machine 2:

$ cat /proc/sys/kernel/osrelease
3.10.0-693.el7.x86_64

$ uname -a
Linux localhost.localdomain 3.10.0-693.el7.x86_64 #1 SMP Thu Jul 6 19:56:57 EDT 2017 x86_64 x86_64 x86_64 GNU/Linux

$ readelf -n libl2ps.so 

Displaying notes found at file offset 0x00000270 with length 0x00000024:
  Owner                 Data size   Description 
  GNU                  0x00000014   NT_GNU_BUILD_ID (unique build ID bitstring)
  Build ID: 5829181bc0502233748149369108915ea7b10e8f

Does it help ?

Thanks

Update 3:

$ readelf -n libm.so.6 

Notes at offset 0x00000238 with length 0x00000024:
  Owner                 Data size   Description
  GNU                  0x00000014   NT_GNU_BUILD_ID (unique build ID bitstring)
  Build ID: 0d84c7247dd76008c096719043e5592735a1c4bd

Notes at offset 0x0000025c with length 0x00000020:
  Owner                 Data size   Description
  GNU                  0x00000010   NT_GNU_ABI_TAG (ABI version tag)
  OS: Linux, ABI: 4.4.0

So, how to interpret this ABI version number set to 4.4.0 ??

Thanks

Thank you yugr and Employed Russian for your answers!! I will give it a try by upgrading my Kernel version on Machine 2.

Thanks Regards

linker
shared-libraries
glibc
ldd
rpath
asked on Stack Overflow Oct 11, 2018 by frperies • edited Oct 14, 2018 by frperies

1 Answer

1

The error message that you see is the infamously confusing ENOENT errno. I see two instances of it in dl-load.c:

I suspect the first one fails in your case which would mean that OS kernel is incompatible between two machines. ld.so manpage indeed says that

Each shared object can inform the dynamic linker of the minimum kernel ABI version that it requires. (This requirement is encoded in an ELF note section that is viewable via readelf -n as a section labeled NT_GNU_ABI_TAG.) At run time, the dynamic linker determines the ABI version of the running kernel and will reject loading shared objects that specify minimum ABI versions that exceed that ABI version.

NT_GNU_ABI_TAG is 4.4.0 which means that you run a program expecting a minimum 4.4 kernel on a 3.10 kernel. Theoretically newer Glibc should run on older kernels as well but in your case Glibc was probly built with explicit --enable-kernel flag which prevents it's usage on kernels before 4.4 (see e.g. this explanation of --enable-kernel).

As a workaround, you may try to fool Glibc by overriding kernel version on machine 2 via

export LD_ASSUME_KERNEL=4.4.0

but it may not work if libm makes 4.4-specific syscalls that are not really present on 3.10.

answered on Stack Overflow Oct 12, 2018 by yugr • edited Oct 16, 2018 by yugr

User contributions licensed under CC BY-SA 3.0