Objcopy symbols are mixed or invalid in executable

As a simple example of my problem, let's say we have two data arrays to embed into an executable to be used in a C program: chars and shorts. These data arrays are stored on disk as chars.raw and shorts.raw.

Using objcopy I can create object files that contain the data.

objcopy --input binary --output elf64-x86-64 chars.raw char_data.o
objcopy --input binary --output elf64-x86-64 shorts.raw short_data.o

objdump shows that the data is stored correctly and exported through the symbols _binary_chars_raw_start, _binary_chars_raw_end, and _binary_chars_raw_size.

$ objdump -x char_data.o 

char_data.o:     file format elf64-x86-64
char_data.o
architecture: i386:x86-64, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .data         0000000e  0000000000000000  0000000000000000  00000040  2**0
                  CONTENTS, ALLOC, LOAD, DATA
SYMBOL TABLE:
0000000000000000 l    d  .data  0000000000000000 .data
0000000000000000 g       .data  0000000000000000 _binary_chars_raw_start
000000000000000e g       .data  0000000000000000 _binary_chars_raw_end
000000000000000e g       *ABS*  0000000000000000 _binary_chars_raw_size

(Similar output for short_data.o)

However, when I link these object files with my code into an executable, I run into problems. For example:

#include <stdio.h>

extern char _binary_chars_raw_start[];
extern char _binary_chars_raw_end[];
extern int _binary_chars_raw_size;

extern short _binary_shorts_raw_start[];
extern short _binary_shorts_raw_end[];
extern int _binary_shorts_raw_size;

int main(int argc, char **argv) {
        printf("%ld == %ld\n", _binary_chars_raw_end - _binary_chars_raw_start, _binary_chars_raw_size / sizeof(char));
        printf("%ld == %ld\n", _binary_shorts_raw_end - _binary_shorts_raw_start, _binary_shorts_raw_size / sizeof(short));
}

(compiled with gcc main.c char_data.o short_data.o -o main) prints

14 == 196608
7 == 98304

on my computer. The value of _binary_chars_raw_size (and of _binary_shorts_raw_size) is not correct, and I don't know why.

Similarly, if the _starts or _ends are used to initialize anything, then they may not even be located near each other in the executable (_end - _start is not equal to the size, and may even be negative).

What am I doing wrong?

Tags: c, gcc, linker, ld, object-files
asked on Stack Overflow May 21, 2020 by pizzapants184

1 Answer

The lines:

extern char _binary_chars_raw_start[];
extern char _binary_chars_raw_end[];
extern int _binary_chars_raw_size;

extern short _binary_shorts_raw_start[];
extern short _binary_shorts_raw_end[];
extern int _binary_shorts_raw_size;

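These names are not ordinary variables that hold values. They are symbols that objcopy places at the beginning and end of the region, so their addresses mark the start and end of the region. The array declarations of _start and _end already decay to those addresses, which is why the left-hand numbers in the question are correct. The wrong number comes from reading _size as an int, because its value is also encoded as the symbol's address. Take the addresses instead of reading the symbols: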

#include <stdio.h>
#include <stdint.h>

extern char _binary_chars_raw_start;
extern char _binary_chars_raw_end;
extern char _binary_chars_raw_size;

int main(void) {
    // print ptrdiff_t with %td
    printf("%td == %d\n",
           // the __difference in addresses__ of these variables
           &_binary_chars_raw_end - &_binary_chars_raw_start,
           // the size is encoded as the address of the symbol
           (int)(intptr_t)&_binary_chars_raw_size);
    // note: also print a size_t, such as the result of `sizeof(..)`, with %zu
    return 0;
}

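Edit: _size is handled the same way. Its value is encoded as the symbol's address, so take the address of _binary_chars_raw_size rather than reading it.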
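To illustrate the address arithmetic described above, here is a minimal sketch that compiles without the objcopy objects. The helper names region_bytes, region_count, and symbol_value are hypothetical and exist only for this example; they are not part of binutils or of the code in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of bytes between two boundary addresses.
   Casting to unsigned char * makes the subtraction count bytes,
   whatever element type the data is later read as. */
static size_t region_bytes(const void *start, const void *end)
{
    const unsigned char *s = (const unsigned char *)start;
    const unsigned char *e = (const unsigned char *)end;
    assert(e >= s); /* assumes both addresses bound the same region */
    return (size_t)(e - s);
}

/* Hypothetical helper: number of whole elements of elem_size bytes
   that fit between the two boundary addresses. */
static size_t region_count(const void *start, const void *end, size_t elem_size)
{
    return region_bytes(start, end) / elem_size;
}

/* Hypothetical helper: recover an integer that was encoded as a
   symbol's address, as objcopy does for the _size symbols. */
static size_t symbol_value(const void *symbol_address)
{
    return (size_t)(uintptr_t)symbol_address;
}
```

With the objects from the question linked in, a call such as region_count(_binary_shorts_raw_start, _binary_shorts_raw_end, sizeof(short)) should match the 7 printed in the question, and symbol_value(&_binary_chars_raw_size) corresponds to the cast in the code above. Computing the size from the start and end addresses avoids depending on how the _size symbol is resolved at link time.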

answered on Stack Overflow May 21, 2020 by KamilCuk • edited May 21, 2020 by KamilCuk

User contributions licensed under CC BY-SA 3.0