Precise details of writing a byte into PCIe address space from CPU

I am extremely confused about the exact series of steps involved in having the CPU write a value into a PCIe card's memory. It's very difficult to understand the precise meaning of stuff you read on the internet, so I'm hoping someone can read my theory of what's happening and point out any mistakes.

Setup

Suppose I have a PCIe card with some memory on it. For the sake of discussion, assume the following concrete setup:

  • It has 4 MB to be accessed via base address register 0 (whatever that means)
  • It is the only PCIe card in the whole system
  • It is plugged into a PCIe slot, which connects to the root complex via copper wires in the motherboard
  • There is a root complex connected directly to the CPU's bus (is this the normal way things are connected?)
  • The PCIe card is somehow configured to be device number 0 (how is this done?)
  • We're using Linux.

Let's also settle on terminology:

  • The system bus is the CPU's own bus.
  • The PCIe bus refers to the literal wires on the motherboard between the CPU and PCIe slot
  • A driver is a Linux kernel module
  • A device is a literal physical object
  • A device struct is the pci_dev structure filled by the kernel
  • A BAR (base address register) is the field inside a PCIe device's configuration space
  • A BAR space is the address space which is indicated (?) by the value in a BAR.

My theory of what's going on

  1. At boot time, Linux starts probing different addresses to see if there's anything there.

    • Since the PCIe bus wires are only connected to the system bus wires via a bridge (i.e. the PCIe controller on the system bus) Linux must know how to interact with the PCIe controller.
    • Linux sends special commands to the PCIe controller (through memory-mapped IO?) that end up triggering the correct series of voltages on the PCIe wires
    • If it gets a response, it will fill in a pci_dev struct with other information in the configuration space
    • At some point in the future (when?) the kernel will iterate through the list of PCI drivers to try and match them to devices
  2. When Linux detects a device, it will map its BAR space into the system bus. (How is this done?) Suppose it maps the BAR0 space to address 0x55500000 to 0x5550FFFF on the system bus.
    • Linux must have to tell the root complex to listen to these addresses, and that they correspond to the PCIe card it detected.
    • By the way, Linux will set base address register 0 on our PCIe card to have 0x55500000 in its base address field (why bother?)
  3. Subsequent writes on the system bus between 0x55500000 and 0x5550FFFF are "caught" by the root complex, and issued out to the PCIe card
    • The root complex will essentially build a packet with all the headers and checksums and the like and blast it out over the motherboard's copper wires to the PCIe slot
  4. Supposing the CPU wrote 0xDEADBEEF to address 0x55501230 on the system bus, and that the root complex sent out the packet to the PCIe card, the card receives the packet and writes 0xDEADBEEF to 0x01230 in its local 4 MB memory
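
A minimal sketch of the address arithmetic in steps 3 and 4, using the example addresses above. The function name is hypothetical, and it only models the range check and the subtraction, not any real hardware or kernel interface.

```python
def offset_in_window(address, base, limit):
    """Return the offset of `address` inside [base, limit], or None if outside."""
    if base <= address <= limit:
        return address - base
    return None


# Window and write address taken from the example above.
BAR0_BASE = 0x55500000
BAR0_LIMIT = 0x5550FFFF

print(hex(offset_in_window(0x55501230, BAR0_BASE, BAR0_LIMIT)))  # 0x1230
print(offset_in_window(0x55510000, BAR0_BASE, BAR0_LIMIT))       # None
```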

So: what parts of this are right and which are wrong?

linux
pci-e
asked on Stack Overflow Oct 18, 2019 by Marco Merlini

1 Answer

My experience is with Intel processors, so some of the details below may be specific to Intel processors, but it is mostly general. Also I don't know the details of how Linux identifies the driver to load for each device, so I skipped that question.

Modern CPUs don't have a system bus (except hidden within the CPU itself). They have memory channels, PCIe root ports, and a DMI port that connects to the chipset (also called the platform controller hub, or PCH). The PCH contains additional devices and may have additional root ports. The root complex comprises circuitry integrated into both the CPU and PCH. (Some CPU SoCs don't have DMI or a PCH, and all the root complex circuitry is within the CPU SoC.)

Even if your card is the only PCIe card in the whole system, there are other PCIe devices integrated into the root complex (called RCiEPs, or root complex integrated endpoints). These may be within the CPU or in the PCH.

Your device, connected to a PCIe root port, will be configured as device 0 on some non-zero bus number. The bus number is dependent on the PCIe root port (i.e., slot) that the device is connected to and the way in which the BIOS configures the PCIe bus. (The same slot will generally have the same bus number, but it may not, depending on what is connected to the other PCIe root ports.)

The rest of your assumptions and terminology are fine.

Software accesses PCI config space either by using in/out instructions to the I/O ports 0xcf8 (address) and 0xcfc (data), or by using the memory-mapped config space. The memory address range of the memory-mapped config space is set up by the BIOS, and software finds out the address by looking at the ACPI tables (specifically the MCFG table). The mechanism by which these I/O or memory accesses are converted to PCIe signals is entirely within the root complex hardware.
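
As a rough illustration of the I/O-port method, here is how the 32-bit value written to port 0xcf8 is conventionally laid out. This is only a sketch of the bit packing; the function name is made up for the example, and actually issuing in/out instructions requires kernel or privileged code that is not shown.

```python
CONFIG_ADDRESS = 0xCF8  # I/O port: software writes the target address here
CONFIG_DATA = 0xCFC     # I/O port: software then reads or writes the data here


def config_address(bus, device, function, register):
    """Build the 32-bit value for port 0xCF8 (hypothetical helper name)."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= register < 256:
        raise ValueError("this method only reaches the first 256 bytes")
    return (
        0x80000000           # bit 31: enable bit
        | (bus << 16)        # bits 23:16: bus number
        | (device << 11)     # bits 15:11: device number
        | (function << 8)    # bits 10:8: function number
        | (register & 0xFC)  # bits 7:2: dword-aligned register offset
    )


# Address of BAR0 (register offset 0x10) of device 0:0.0
print(hex(config_address(0, 0, 0, 0x10)))  # 0x80000010
```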

The offset into the PCI config space address range controls which device/register software is accessing. For example, an access to MMCFG + 0 accesses register offset 0 of device 0:0.0. An access to MMCFG + 0x1000 accesses register offset 0 of device 0:0.1, and an access to MMCFG + 0x102000 accesses register offset 0 of device 1:0.2.
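
The three examples above follow from giving each function a 4 KB window, with the bus, device, and function numbers packed into the upper bits of the offset. A small sketch of that calculation (the function name is just for illustration):

```python
def mmcfg_offset(bus, device, function, register=0):
    """Offset from the MMCFG base for a given bus:device.function and register."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= register < 4096:
        raise ValueError("each function has 4 KB of config space")
    return (bus << 20) | (device << 15) | (function << 12) | register


print(hex(mmcfg_offset(0, 0, 0)))  # 0x0      -> device 0:0.0
print(hex(mmcfg_offset(0, 0, 1)))  # 0x1000   -> device 0:0.1
print(hex(mmcfg_offset(1, 0, 2)))  # 0x102000 -> device 1:0.2
```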

Software reads the vendor ID and device ID registers (bytes 3:0, i.e. the 32-bit word at offset 0) of each device address to detect whether a device exists at that device address. If no device is present, the read returns 0xffffffff. If a device is present, the device returns its vendor ID (low 16 bits) and device ID (high 16 bits), allowing software to determine the type of device.
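
A sketch of that presence check, operating on the raw first four bytes of a function's config space. The helper name and the sample device ID are made up for the example. On Linux the same bytes should be readable from the `config` file under /sys/bus/pci/devices/ for each device, though that step is not shown here.

```python
import struct


def decode_ids(config_bytes):
    """Return (vendor_id, device_id), or None if no device responded."""
    # Config space is little-endian: vendor ID at bytes 1:0, device ID at bytes 3:2.
    vendor_id, device_id = struct.unpack_from("<HH", config_bytes, 0)
    if vendor_id == 0xFFFF:
        return None  # all ones: nothing answered at this address
    return vendor_id, device_id


print(decode_ids(b"\xff\xff\xff\xff"))  # None

# 0x8086 is Intel's vendor ID; the device ID 0x1234 is an arbitrary placeholder.
ids = decode_ids(bytes.fromhex("86803412"))
print([hex(x) for x in ids])  # ['0x8086', '0x1234']
```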

Each endpoint function has up to 6 BAR registers, at offsets 0x10, 0x14, ... 0x24 (bridges have only the first two). If a device supports 64-bit BARs, two adjacent BAR registers are used to configure a single region. Normally the BIOS configures the BARs of every device and also configures other (hidden) registers within the root complex to enable it to route memory accesses to the proper device. Software normally only writes to the BAR registers to detect the region size (by writing all ones and reading back which address bits the device implements) and then restores the values that were set by the BIOS. Depending on the root complex hardware, software may or may not be able to change the BAR values and still have accesses work properly.
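
To make the size detection concrete, here is a sketch of how the value read back from a 32-bit memory BAR after writing 0xffffffff could be interpreted. It only does the bit arithmetic; the function names are hypothetical, and the 64-bit case (which combines two adjacent registers) is left out.

```python
def memory_bar_size(readback):
    """Region size in bytes, from the value read back after writing all ones."""
    if readback & 0x1:
        raise ValueError("bit 0 is set: this is an I/O BAR, not a memory BAR")
    mask = readback & 0xFFFFFFF0  # drop the type and prefetchable flag bits
    if mask == 0:
        return 0                  # BAR is not implemented
    return (~mask + 1) & 0xFFFFFFFF


def describe_memory_bar(value):
    """Split a programmed 32-bit memory BAR value into its fields."""
    return {
        "base": value & 0xFFFFFFF0,
        "is_64bit": ((value >> 1) & 0x3) == 0x2,  # bits 2:1 == 10b
        "prefetchable": bool(value & 0x8),        # bit 3
    }


# A 4 MB region: the device hardwires address bits 21:0 to zero.
print(hex(memory_bar_size(0xFFC00000)))  # 0x400000
print(describe_memory_bar(0x55500008))
```

The low address bits read back as zero because the device does not implement them, which is also why a region is always aligned to its own size.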

answered on Stack Overflow Nov 1, 2019 by prl

User contributions licensed under CC BY-SA 3.0