Assembly code different from gdb display of code


I'm learning about operating systems from the book Operating Systems from 0 to 1. I'm trying to display the code of my kernel, called main, in GDB. However, the code GDB displays is not the same as the objdump disassembly, even though I jumped to the address that is the entry point.

bootloader.asm

;*************************************************
; bootloader.asm 
; A Simple Bootloader
;*************************************************
bits 16
start: jmp  boot

;; constants and variable definitions
msg db "Welcome to My Operating System!", 0ah, 0dh, 0h

boot:

    cli ; no interrupts 
    cld ; all that we need to init
    
    mov ax, 0x0000

    ;; set buffer
    mov es, ax  
    mov bx, 0x0600

    mov al, 1   ; read one sector
    mov ch, 0   ; track 0
    mov cl, 2       ; sector to read
    mov dh, 0   ; head number
    mov dl, 0   ; drive number
        
    mov ah, 0x02    ; read sectors from disk    
    int 0x13      ; call the BIOS routine
    jmp 0x0000:0x0600   ; jump and execute the sector!
    
    hlt ; halt the system 

; We have to be 512 bytes. Clear the rest of the bytes with 0

times 510 - ($-$$) db 0
dw 0xAA55 ; Boot Signature

readelf -h main

ELF Header:
  Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF32
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Intel 80386
  Version:                           0x1
  Entry point address:               0x600
  Start of program headers:          52 (bytes into file)
  Start of section headers:          12888 (bytes into file)
  Flags:                             0x0
  Size of this header:               52 (bytes)
  Size of program headers:           32 (bytes)
  Number of program headers:         3
  Size of section headers:           40 (bytes)
  Number of section headers:         12
  Section header string table index: 11

readelf -l main


Elf file type is EXEC (Executable file)
Entry point 0x600
There are 3 program headers, starting at offset 52

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000000 0x00000000 0x00000000 0x00094 0x00094 R   0x4
  LOAD           0x000000 0x00000000 0x00000000 0x00094 0x00094 R   0x4
  LOAD           0x000100 0x00000600 0x00000600 0x00006 0x00006 R E 0x100

 Section to Segment mapping:
  Segment Sections...
   00     
   01     
   02     .text 

main.c

void main(){}

objdump -z -M intel -S -D build/os/main

Disassembly of section .text:

00000600 <main>:
void main(){}
 600:   55                      push   ebp
 601:   89 e5                   mov    ebp,esp
 603:   90                      nop
 604:   5d                      pop    ebp
 605:   c3                      ret    

But this is GDB's output after setting a breakpoint at main (0x600):

   0x600 <main>    jg     0x647
   0x602 <main+2>  dec    esp
   0x603 <main+3>  inc    esi
   0x604 <main+4>  add    DWORD PTR [ecx],eax

Why is this happening? Am I loading at the wrong address? How do I find the correct address to load at?

Edit: here are the commands I use to build and run:

nasm -f elf bootloader.asm -F dwarf -g -o ../build/bootloader/bootloader.o
ld -m elf_i386 -T bootloader.lds ../build/bootloader/bootloader.o -o ../build/bootloader/bootloader.o.elf
objcopy -O binary ../build/bootloader/bootloader.o.elf ../build/bootloader/bootloader.o
gcc -ffreestanding -nostdlib -fno-pic -gdwarf-4 -m16 -ggdb3 -c main.c -o ../build/os/main.o
ld -m elf_i386 -nmagic -T os.lds  ../build/os/main.o -o ../build/os/main
dd if=/dev/zero of=disk.img bs=512 count=2880
2880+0 records in
2880+0 records out
1474560 bytes (1.5 MB, 1.4 MiB) copied, 0.0150958 s, 97.7 MB/s
dd conv=notrunc if=build/bootloader/bootloader.o of=disk.img bs=512 count=1 seek=0
1+0 records in
1+0 records out
512 bytes copied, 0.000127745 s, 4.0 MB/s
dd conv=notrunc if=build/os/main.o of=disk.img bs=512 count=$((8504/512)) seek=1
16+0 records in
16+0 records out
8192 bytes (8.2 kB, 8.0 KiB) copied, 0.000184251 s, 44.5 MB/s
qemu-system-i386 -machine q35 -fda disk.img -gdb tcp::26000 -S

And the GDB commands for displaying the code of main:

set architecture i8086
target remote localhost:26000
b *0x7c00
set disassembly-flavor intel
layout asm
layout reg
symbol-file build/os/main
b main
c
assembly
gdb
kernel
bootloader
asked on Stack Overflow Aug 21, 2020 by USER149372 • edited Aug 22, 2020 by USER149372

1 Answer


jg / dec esp / inc esi is the ELF magic number, not machine code! You'll see the same thing from the start of the output of ndisasm -b32 /bin/ls. (ndisasm always treats its input as a flat binary; it doesn't look for any metadata.)

7F 45 4C 46 is the string "ELF" after a 0x7F byte, the ELF magic number that identifies the file format as ELF. It's followed by more ELF header bytes before the actual machine code for main. objdump -D disassembles all ELF sections, but it still parses the ELF headers as metadata instead of disassembling them the way ndisasm would. So you still end up seeing only the code from the .text section, because the other sections are empty (because you linked this executable without libc or CRT startfiles, and with C main as the ELF entry point?!?)
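As a rough check of the point above, the bytes GDB disassembled can be decoded by hand. This is only a sketch: the names ELF_MAGIC, jg_target and header are made up for illustration, and the byte string is copied from the Magic line of the readelf output in the question. In x86 machine code, 0x7F is the opcode for jg with an 8-bit relative displacement, so the 0x45 byte (ASCII "E") gets read as a jump distance.

```python
ELF_MAGIC = b"\x7fELF"  # 7F 45 4C 46


def jg_target(address: int, rel8: int) -> int:
    """Target of a 2-byte short jump (opcode + rel8) located at `address`."""
    # rel8 is a signed 8-bit displacement, relative to the next instruction.
    if rel8 >= 0x80:
        rel8 -= 0x100
    return address + 2 + rel8


# First bytes of the file, taken from the readelf "Magic" line in the question.
header = bytes.fromhex("7f454c46010101")

print(header[:4] == ELF_MAGIC)            # True
print(hex(jg_target(0x600, header[1])))   # 0x647
```

The computed target matches the jg 0x647 that GDB printed at 0x600, which is consistent with the start of the ELF file (not the .text section) sitting at that address.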

You're jumping to the start of the ELF file as if it were a flat binary. It's not, and writing an ELF program loader is not that simple. The ELF program headers (which readelf can parse) tell you which file offset goes at which address. The start of the .text section will be at some offset into the file, not overlapping the ELF magic number for obvious reasons. (Although it can overlap with the ELF header if you can find a way to make it fit: http://www.muppetlabs.com/~breadbox/software/tiny/teensy.html)

Then once you have the file mapped into memory as specified in the program headers, you jump to the ELF entry point address (0x600 in your case). The entry point is normally not a function: under a real OS like Linux, you can't ret from it and instead need to make an exit system call. You can't ret here either, because you reach it with jmp instead of call, so there is no return address on the stack.
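To make the "map the segments, then jump to the entry point" steps more concrete, here is a minimal sketch of what a loader has to do, written in Python purely for illustration (a real bootloader would do this in assembly or C). The function names and the synthetic image are assumptions: make_demo_elf builds a tiny fake ELF32 file that mirrors the question's readelf output (entry 0x600, one LOAD segment at file offset 0x100, 6 bytes of code) instead of reading the real build/os/main.

```python
import struct

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # ELF32 file header, 52 bytes
PHDR = struct.Struct("<8I")                # ELF32 program header, 32 bytes
PT_LOAD = 1


def load_segments(image: bytes):
    """Return (entry, [(vaddr, file_bytes, memsz), ...]) for PT_LOAD segments."""
    fields = EHDR.unpack_from(image, 0)
    ident, entry, phoff = fields[0], fields[4], fields[5]
    phentsize, phnum = fields[9], fields[10]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    segments = []
    for i in range(phnum):
        (p_type, p_offset, p_vaddr, _paddr,
         p_filesz, p_memsz, _flags, _align) = PHDR.unpack_from(
            image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            segments.append(
                (p_vaddr, image[p_offset:p_offset + p_filesz], p_memsz))
    return entry, segments


def make_demo_elf() -> bytes:
    """Synthetic stand-in for build/os/main (hypothetical, for the demo only)."""
    code = bytes.fromhex("5589e5905dc3")  # push ebp / mov ebp,esp / nop / pop ebp / ret
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = EHDR.pack(ident, 2, 3, 1, 0x600, 52, 0, 0, 52, 32, 1, 40, 0, 0)
    phdr = PHDR.pack(PT_LOAD, 0x100, 0x600, 0x600, len(code), len(code), 5, 0x100)
    image = bytearray(0x100 + len(code))
    image[0:52] = ehdr
    image[52:84] = phdr
    image[0x100:] = code
    return bytes(image)


entry, segments = load_segments(make_demo_elf())
memory = bytearray(0x1000)  # pretend low memory
for vaddr, data, memsz in segments:
    # Copy file bytes to their load address; bytes past filesz stay zero.
    memory[vaddr:vaddr + len(data)] = data

print(hex(entry), memory[entry:entry + 6].hex())  # 0x600 5589e5905dc3
```

In this sketch the machine code ends up at 0x600 only because the segment's file offset (0x100) is honored. Copying the file from offset 0 to 0x600, as the bootloader in the question effectively does, would put the ELF header there instead.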

This is why _start is separate from main; building a program with a compiler-generated main as its entry point doesn't work.

Of course most of this effort is doomed because you're jumping to your main with the CPU still in 16-bit real mode. But your main is compiled/assembled for 32-bit mode. You could somewhat work around that with gcc -m16 to assemble gcc output for 16-bit mode, using operand-size + address-size prefixes as necessary.

The machine code for that do-nothing main will actually work in both 16-bit and 32-bit mode. If you'd used return 0 without optimization, that wouldn't be the case: the opcode (without prefixes) for mov eax, imm32 implies a different instruction length depending on the mode the CPU decodes it in, so in 16-bit mode it would decode as a write to AX and leave the upper 2 bytes of the immediate (zeros) to be decoded as the next instruction.
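The length difference can be sketched with a few lines of arithmetic. This is an illustrative model of just one opcode (B8, mov eAX, imm), not a disassembler, and the function name mov_eax_imm_length is made up. It assumes the usual x86 rule that the default operand size follows the CPU mode and that a 0x66 prefix switches to the other size.

```python
def mov_eax_imm_length(mode_bits: int, operand_size_prefix: bool = False) -> int:
    """Length in bytes of opcode B8 (mov eAX, imm), with an optional 0x66 prefix."""
    default_operand = 32 if mode_bits == 32 else 16
    operand = default_operand
    if operand_size_prefix:
        # The 0x66 prefix selects the non-default operand size.
        operand = 16 if default_operand == 32 else 32
    return (1 if operand_size_prefix else 0) + 1 + operand // 8


# mov eax, 0 / pop ebp / ret, assembled as 32-bit code with no prefixes
code = bytes.fromhex("b8000000005dc3")
for mode in (16, 32):
    n = mov_eax_imm_length(mode)
    print(mode, n, code[n:].hex())
# 16 3 00005dc3
# 32 5 5dc3
```

In 16-bit mode the decoder consumes only 3 bytes, so the two leftover zero bytes are decoded as an instruction of their own before pop ebp is reached. With the operand-size prefix that gcc -m16 relies on, the same model gives 6 bytes in 16-bit mode, covering the full 32-bit immediate.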


Most likely the easiest thing to do is turn your "kernel" into a flat binary, instead of writing an ELF program loader in your bootloader. Follow an osdev tutorial because lots can go wrong, and you have to be careful about static data for example.

Or see How to make the kernel for my bootloader? for an example bootloader that calls a C function after switching to 32-bit protected mode.

See more links in https://stackoverflow.com/tags/x86/info.

answered on Stack Overflow Aug 21, 2020 by Peter Cordes • edited Aug 21, 2020 by Peter Cordes

User contributions licensed under CC BY-SA 3.0