QEMU "-bios" vs. "-kernel" vs. "-device loader,file=..."

For background, I'm running bare-metal code on QEMU 4.1.0, targeting aarch64.

There are several ways to get QEMU to load compiled code into memory. I'd like to understand what the underlying differences are, because I see very different behavior and the documentation doesn't shed any light.

Consider this first command line:

qemu-system-aarch64 \
  -s -S \
  -machine virt,secure=on,virtualization=on \
  -m 512M \
  -smp 4 \
  -display none \
  -nographic \
  -semihosting \
  -serial mon:stdio \
  -kernel my_file.elf \
  -device loader,addr=0x40004000,cpu_num=0 \
  -device loader,addr=0x40004000,cpu_num=1 \
  -device loader,addr=0x40004000,cpu_num=2 \
  -device loader,addr=0x40004000,cpu_num=3 \
  ;

In another shell, if I launch gdb to see what QEMU has loaded into memory, it corresponds exactly to what I expect. In fact, gdb has a built-in command for this...

(gdb) compare-sections
Section .start, range 0x40004000 -- 0x40006164: matched.
Section .vectors, range 0x40006800 -- 0x40006f90: matched.
Section .text, range 0x40006fc0 -- 0x4002ca7c: matched.
...
Section .stacks, range 0x4207c120 -- 0x420bc120: matched.
(gdb) x/10x 0x40004000
0x40004000 <_start>:  0x14000800 0x00000000 0x00000000 0x00000000
...

Perfect! Everything in my ELF is at 0x40004000, and I see it all in memory, just how I expect! My first core boots and runs as I expect.

It is interesting to note that if I dump what is at location zero in memory, then there is stuff loaded down there. I didn't ask for it. I didn't explicitly load it. I don't execute it. It's not in my ELF file. I don't know what it is or where it came from. My GUESS is that QEMU has ASSUMED that I want some BIOS in the flash and has put one there. I don't know for sure. It also places something (small) at 0x40000000. I don't know what that is either... I do want to be careful that if I load something, we won't step on each other...

  1. My first question then becomes: Can I enable some debug messages to understand what QEMU is loading where, and perhaps even WHY?

Continuing... If I change my command line to REPLACE the "-kernel my_file.elf" switch with "-bios my_file.elf" switch (changing nothing else), and I repeat my run/gdb, then I see two things that are different...

First, I see that all my cores are running. I don't need to use PSCI calls to start them. Okay, but I don't think that's relevant to my issue. Second (and VERY important) is that my memory does NOT contain what I expect!

(gdb) compare-sections
Section .start, range 0x40004000 -- 0x40006164: MIS-MATCHED!
Section .vectors, range 0x40006800 -- 0x40006f90: MIS-MATCHED!
Section .text, range 0x40006fc0 -- 0x4002ca7c: MIS-MATCHED!
...
Section .stacks, range 0x4207c120 -- 0x420bc120: matched.
(gdb) x/8x 0x40004000
0x40004000 <_start>:  0x00000000 0x00000000 0x00000000 0x00000000
0x40004010 <_start+16>:  0x00000000 0x00000000 0x00000000 0x00000000
(gdb) x/8x 0x40006800
0x40006800 <my_vector_name>:  0x00000000 0x00000000 0x00000000 0x00000000
0x40006810 <my_vector_name+16>:  0x00000000 0x00000000 0x00000000 0x00000000
(gdb) x/8x 0x40006fc0
0x40006fc0 <my_symbol_name>:  0x00000000 0x00000000 0x00000000 0x00000000
0x40006fd0 <my_symbol_name+16>:  0x00000000 0x00000000 0x00000000 0x00000000

Everything is zero. I don't see MY code anywhere, although the mysterious code is still getting loaded at both 0x0 and 0x40000000. As you might also expect, the cores immediately die with an "Undefined Instruction" exception as soon as I issue "nexti" in my gdb.

Hmmm...

Okay, now I'll change the "-bios my_file.elf" to "-device loader,file=my_file.elf". I get the same result. I cannot find my code in memory.

  2. What happens differently under the hood in QEMU between "-bios" and "-kernel"? Where is that documented, or where in the source can I follow it? How can I best debug this?

Thank you kind Sir/Madam!


Edit:

For debugging, all the good/relevant stuff seems to be in "virt.c"...


More Edit (to add info from "-device loader,file=NEW_AT_ZERO.elf")

My command line is:

/tools/gnu/qemu-4.1.0/bin/qemu-system-aarch64 \
  -s -S \
  -machine virt,secure=on,virtualization=on \
  -cpu cortex-a53 \
  -d int \
  -m 512M \
  -smp 4 \
  -display none \
  -nographic \
  -semihosting \
  -serial mon:stdio \
  -device loader,file=NEW_AT_ZERO.elf \
  ;

Here's some of the relevant section of NEW_AT_ZERO.dis:

NEW_AT_ZERO.elf:     file format elf64-littleaarch64
NEW_AT_ZERO.elf
architecture: aarch64, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0000000000000000

Program Header:
    LOAD off    0x0000000000010000 vaddr 0x0000000000000000 paddr 0x0000000000000000 align 2**16
       filesz 0x00000000020b8120 memsz 0x00000000020b8120 flags rwx
    NOTE off    0x0000000000043484 vaddr 0x0000000000033484 paddr 0x0000000000033484 align 2**2
       filesz 0x0000000000000024 memsz 0x0000000000000024 flags r--
private flags = 0:

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .start        00002164  0000000000000000  0000000000000000  00010000  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  1 .vectors      00000790  0000000000002800  0000000000002800  00012800  2**11
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  2 .text         00025bbc  0000000000002fc0  0000000000002fc0  00012fc0  2**6
                  CONTENTS, ALLOC, LOAD, READONLY, CODE
  3 .bss          0000a904  0000000000028b80  0000000000028b80  00038b7c  2**3
              ALLOC
....
Contents of section .start:
 0000 00080014 00000000 00000000 00000000  ................
....
 1000 fd170094 a00038d5 01044092 020c7892  ......8...@...x.
 1010 261842aa 660000b5 00038052 00005ed4  &.B.f......R..^.
 1020 7f2003d5 ffffff17 00000000 00000000  . ..............
....

...but of course...

GNU gdb (Linaro_GDB-2017.05.09) 7.12.1.20170417-git
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "--host=x86_64-unknown-linux-gnu --target=aarch64-none-elf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Remote debugging using localhost:1234

warning: No executable has been specified and target does not support
determining executable automatically.  Try using the "file" command.
0x0000000000000000 in ?? ()
Reading symbols from ./NEW_AT_ZERO.elf...done.
(gdb) compare_sections
Undefined command: "compare_sections".  Try "help".
(gdb) compare-sections
Section .start, range 0x0 -- 0x2164: MIS-MATCHED!
Section .vectors, range 0x2800 -- 0x2f90: MIS-MATCHED!
Section .text, range 0x2fc0 -- 0x28b7c: MIS-MATCHED!
....
Section .stacks, range 0x2078120 -- 0x20b8120: matched.
warning: One or more sections of the target image does not match
the loaded file

(gdb) x/4x 0
0x0 <_start>:   0x00000000  0x00000000  0x00000000  0x00000000
(gdb) x/12x 0x1000
0x1000 <symbol>:    0x00000000  0x00000000  0x00000000  0x00000000
0x1010 <symbol+16>: 0x00000000  0x00000000  0x00000000  0x00000000
0x1020 <end_symbol>:    0x00000000  0x00000000  0x00000000  0x00000000
qemu
arm64
asked on Stack Overflow Oct 16, 2019 by Lance E.T. Compte • edited Oct 17, 2019 by Lance E.T. Compte

1 Answer

QEMU has several command line options for loading code into the guest, and they often have different semantics between architectures, or even between machine types for the same architecture. This is unfortunate, but it is the result of backwards compatibility with older QEMU versions and a gradual accumulation of "it would be nice to Do The Right Thing for this image file type" special cases.

Broad summary:

-kernel is the "load a Linux kernel" option. It will load and boot the kernel in whatever way seems best for the architecture being used. For instance, for the x86 PC machine it will just provide the file to the guest BIOS and rely on the guest BIOS to do the actual loading of the file into RAM. On Arm, loading a Linux kernel means that we follow the rules the kernel lays down for how to boot it (https://www.kernel.org/doc/Documentation/arm64/booting.txt for 64-bit or https://www.kernel.org/doc/Documentation/arm/Booting for 32-bit), and we achieve that with a little bit of stub bootloader code (this is what you are seeing in low memory). The kernel boot rules also require that we provide it with a device tree blob in RAM, and this is the data at 0x40000000. We also, in accordance with Linux kernel boot expectations, handle secondary CPUs by either keeping them in PSCI powered-off state or via a little bit of secondary-CPU bootloader code which uses a WFI loop so that the primary can wake them up. (Which we do depends on the board model being used, because we do what the real board does, which especially for 32-bit boards varies a lot.)
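One way to see what QEMU has registered for loading, and at which addresses, is the monitor's "info roms" command. The following is only a sketch: it assumes qemu-system-aarch64 is on the PATH and that my_file.elf (the image from the question) is in the current directory. It feeds two monitor commands on stdin and keeps the CPUs halted with -S, so nothing in the guest runs. What exactly gets listed depends on the QEMU version and on the image type.

```shell
#!/bin/sh
# Sketch: ask the QEMU monitor which blobs it has registered and where.
# Assumptions: qemu-system-aarch64 is installed and my_file.elf is the
# question's image. If either is missing, the script only prints a note.
QEMU=${QEMU:-qemu-system-aarch64}
IMAGE=${IMAGE:-my_file.elf}

if command -v "$QEMU" >/dev/null 2>&1 && [ -f "$IMAGE" ]; then
    # -S keeps the CPUs halted. "info roms" lists each registered blob
    # with its address and size; "quit" then ends the session.
    printf 'info roms\nquit\n' | "$QEMU" \
        -machine virt,secure=on,virtualization=on \
        -cpu cortex-a53 -m 512M -smp 4 \
        -display none -S \
        -kernel "$IMAGE" \
        -monitor stdio
    STATUS=ran
else
    echo "skipped: need $QEMU and $IMAGE"
    STATUS=skipped
fi
```

Swapping "-kernel" for "-bios" or "-device loader,file=..." in the same script gives a side-by-side view of what each option actually registers.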

As an oddball exception, for Arm if you pass an ELF file to -kernel, we'll assume that it is not a Linux kernel, and will boot it by just starting at the ELF entrypoint. We place the DTB at the base of RAM, but only if it wouldn't overlap with the loaded ELF file. (Aside: for 'virt' in particular you want the DTB anyway, because we don't guarantee to keep devices at the same physical addresses between QEMU versions -- the DTB is how we tell guest code where it should look for things. You can rely on flash at 0x0 and RAM starting at 0x4000_0000, but really should pull all other device addresses from the DTB. In practice we have made efforts to avoid rearranging the board memory map, but reading the DTB is the right thing for guest code to do.)
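To look at the device tree that 'virt' generates without writing any guest code, the machine option "dumpdtb" can write it to a file. This is a hedged sketch: the output name virt.dtb is made up for illustration, the machine and CPU options are copied from the question, and dtc (the device tree compiler) is assumed to be installed for the optional decompile step.

```shell
#!/bin/sh
# Sketch: dump the device tree the 'virt' board would hand to the guest.
# Assumptions: qemu-system-aarch64 is installed; virt.dtb is a
# hypothetical output file name chosen for this example.
QEMU=${QEMU:-qemu-system-aarch64}
DTB=${DTB:-virt.dtb}

if command -v "$QEMU" >/dev/null 2>&1; then
    # dumpdtb=FILE writes the generated DTB to FILE; QEMU then exits
    # instead of starting the guest.
    "$QEMU" -machine virt,secure=on,virtualization=on,dumpdtb="$DTB" \
        -cpu cortex-a53 -m 512M -smp 4 -display none
    # Decompile to readable source if dtc is available.
    if command -v dtc >/dev/null 2>&1 && [ -f "$DTB" ]; then
        dtc -I dtb -O dts -o "${DTB%.dtb}.dts" "$DTB"
    fi
    DTB_STATUS=attempted
else
    echo "skipped: $QEMU not found"
    DTB_STATUS=skipped
fi
```

The dump should use the same -machine, -cpu, -m and -smp values as the real run, since the generated tree depends on them.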

-device loader is the "generic loader", which behaves the same on any architecture. It just loads an ELF image into guest RAM, and doesn't do anything to change the CPU reset behaviour. This is a good choice if you have a completely bare-metal image which includes the exception vector table and want to have it start in the same way the hardware would out of reset.
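Before handing an ELF to the generic loader, it can help to check which physical addresses its loadable segments ask for, and to compare them with the memory map mentioned above (flash at 0x0, RAM from 0x4000_0000). The sketch below assumes a readelf from binutils is installed and reuses my_file.elf from the question. It prints the QEMU command line rather than running it. The "cpu-num" property is the spelling used in QEMU's generic loader documentation, where it is described as setting that CPU's PC to the ELF entry point; leave it off to keep the plain reset behaviour described above.

```shell
#!/bin/sh
# Sketch: inspect an ELF's load addresses, then build a generic-loader
# command line for it.
# Assumptions: readelf (binutils) is installed; my_file.elf is the
# question's image.
IMAGE=${IMAGE:-my_file.elf}
READELF=${READELF:-readelf}

if [ -f "$IMAGE" ] && command -v "$READELF" >/dev/null 2>&1; then
    # -l prints the program headers, -W keeps each one on a single line.
    # The PhysAddr column of the LOAD rows is the address to compare
    # against the board's memory map.
    "$READELF" -lW "$IMAGE" | grep -E 'Type|LOAD'
fi

# Printed rather than executed, so the line can be reviewed first.
LOADER_ARGS="-device loader,file=$IMAGE,cpu-num=0"
echo "qemu-system-aarch64 -machine virt,secure=on,virtualization=on" \
     "-cpu cortex-a53 -m 512M -smp 4 -display none -s -S $LOADER_ARGS"
```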

-bios is the "load a BIOS image, in whatever way seems good for this machine model" option. Again, this is a "do what I mean" kind of option whose specifics vary from machine model to machine model and from architecture to architecture; some machines don't support it at all. Some machines (e.g. the x86 PC) will always load a BIOS, using a default binary if the user didn't specify one. Some will load a BIOS if the user asks, but not otherwise (the Arm virt board is like this). Generally a BIOS image is expected to be a "bare metal raw binary" image, which gets loaded into the flash or ROM that sits wherever the hardware starts execution when it comes out of reset. Note that BIOS images should not be ELF files; they're expected to be just the raw data to put into the ROMs.

On at least some machines, including 'virt', you can instead provide the contents of the flash/ROM devices using a command line like "-drive if=pflash,...". This is an example of a common pattern in QEMU where you can either use a short "do what I mean" option that is convenient but has a lot of magic under the hood, or a longer "orthogonal" option which lets you specify lots of sub-options and get exactly the behaviour you want.
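A raw image of the kind -bios expects can be produced from an ELF with objcopy, and padded into a flash-sized file for the pflash route. Treat this as a sketch under stated assumptions: the toolchain prefix aarch64-none-elf is taken from the gdb banner in the question, the output names are invented, and the 64 MiB flash bank size for 'virt' is my understanding of the board rather than something stated above. If the ELF or the cross objcopy is missing, a four-byte stand-in (the first instruction word of .start from the question's dump) is used so the padding step can still be tried.

```shell
#!/bin/sh
# Sketch: turn an ELF linked at 0x0 into a raw image, then pad it into
# a flash-sized file.
# Assumptions: aarch64-none-elf-objcopy is the cross objcopy;
# NEW_AT_ZERO.bin and flash0.img are hypothetical output names;
# one 'virt' flash bank is assumed to be 64 MiB.
ELF=${ELF:-NEW_AT_ZERO.elf}
BIN=${BIN:-NEW_AT_ZERO.bin}
FLASH=${FLASH:-flash0.img}
OBJCOPY=${OBJCOPY:-aarch64-none-elf-objcopy}

if [ -f "$ELF" ] && command -v "$OBJCOPY" >/dev/null 2>&1; then
    # Strip the ELF container and keep only the bytes to be loaded.
    "$OBJCOPY" -O binary "$ELF" "$BIN"
else
    # Stand-in image: bytes 00 08 00 14, i.e. the word 0x14000800.
    printf '\000\010\000\024' > "$BIN"
fi

# Create a zero-filled 64 MiB file and copy the image to its start
# without truncating the rest.
rm -f "$FLASH"
truncate -s 64M "$FLASH"
dd if="$BIN" of="$FLASH" conv=notrunc 2>/dev/null

# The two alternative options to add to the QEMU command line.
echo "-bios $BIN"
echo "-drive if=pflash,format=raw,file=$FLASH"
```

Either way the image executes from flash at 0x0, so anything it writes to (data, bss, stacks) has to be linked into RAM instead.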

A lot of this is undocumented, because "I want to run a bare metal program of my own devising" is a very niche use case and because we don't have a good place in our documentation to make it easy to document the specifics of different board models.

answered on Stack Overflow Oct 17, 2019 by Peter Maydell

User contributions licensed under CC BY-SA 3.0