06

The Bus System

How the CPU talks to every other component through shared communication channels.

A CPU by itself can compute, but it cannot do anything useful without talking to other hardware: disks to read from, displays to write to, network adapters to send packets through. The wires and protocols that connect the CPU to everything else are called buses. The bus system is how a computer becomes more than a processor in a void.

During boot, the firmware must discover every device attached to every bus, assign each one an address, and configure it so the operating system can use it later. This process is called bus enumeration, and it happens before the OS loads.

What a Bus Actually Is

A bus is a shared communication channel. In the simplest case, it is a set of parallel wires connecting two or more components. One device puts data on the wires, another reads it off. A protocol defines who gets to talk when and how addresses are specified.

Think of a bus as a shared hallway in an apartment building. Anyone can walk down it, but there are rules: you do not block the hallway, you knock on the right door, and you wait your turn. The hallway itself is just space -- the rules are what make it work.

Key term: Bus A communication pathway shared by multiple devices. A bus includes the physical wires (or lanes), a protocol defining how data is transferred, and an addressing scheme so each device knows which messages are meant for it.

Early personal computers had a single bus for everything. The original IBM PC's ISA bus carried CPU instructions, memory access, and I/O device communication on the same set of wires. Modern systems split this into several specialized buses, each optimized for different types of traffic.

The Modern Bus Hierarchy

A modern x86 system has a layered bus hierarchy. The CPU sits at the top. It connects to memory through dedicated high-speed channels (the memory bus, which we covered in the previous article). It connects to everything else through a hierarchy of buses that branch out like a tree.

Fig. 06a -- Modern bus hierarchy
CPU
├─ DDR Memory
├─ PCIe Root Complex
│   ├─ GPU (x16)
│   └─ NVMe SSD (x4)
└─ Chipset (PCH)
    ├─ SATA controller — HDD / SSD
    ├─ USB controller — Keyboard / Mouse
    ├─ Ethernet (GbE) — Network cable
    └─ Audio

Bandwidth decreases as you move away from the CPU

The CPU connects to memory directly and to everything else through PCIe. High-bandwidth devices (GPU, NVMe) connect to the CPU's PCIe root complex. Lower-speed devices connect through the chipset, which itself is a PCIe device.

The key component between the CPU and most peripherals is the chipset, which Intel calls the Platform Controller Hub (PCH) and AMD calls the Fusion Controller Hub (FCH). The chipset connects to the CPU through a dedicated link (DMI on Intel systems, itself based on PCIe) and provides the buses for lower-speed devices: SATA ports, USB controllers, audio, Ethernet, and additional PCIe slots.

PCIe: The Backbone

PCI Express (PCIe) is the dominant interconnect in modern computers. It replaced the older PCI and AGP buses. Unlike those parallel buses, PCIe uses serial point-to-point connections called lanes. Each lane consists of two differential signal pairs -- one for transmitting, one for receiving -- so data flows in both directions simultaneously.

Key term: PCIe Lane A single bidirectional serial link made of two pairs of wires (four wires total). Each lane can transfer data in both directions at the same time. Devices are connected with x1, x4, x8, or x16 lanes depending on how much bandwidth they need.

A PCIe x1 link has one lane. An x16 slot has sixteen. A single PCIe 4.0 lane delivers about 2 GB/s in each direction, so an x16 GPU slot provides roughly 32 GB/s of bandwidth to the graphics card. An NVMe SSD typically uses an x4 link, giving it about 8 GB/s.

PCIe is not a shared bus in the traditional sense. Each device has its own dedicated link to either the CPU's root complex or the chipset. There is no contention for wire time the way there was on the old PCI bus. If two PCIe devices want to send data at the same time, they can -- they are on separate wires.

PCIe Enumeration

At boot, the firmware must discover every PCIe device in the system. It does this by walking the bus hierarchy.

PCIe uses a tree topology. The root complex is at the top. Below it are bridges (or switches) that fan out to more devices. Each device has a configuration space -- a block of registers at a known address that describes the device: its vendor ID, device ID, what class of device it is (storage, network, display, etc.), and what resources it needs (memory ranges, I/O ports, interrupt lines).

The firmware scans this tree by reading the configuration space at every possible bus/device/function address. If something responds, the firmware records it and assigns it resources. If nothing responds, the firmware moves on. This process is called enumeration.

Fig. 06b -- PCIe enumeration: scanning the tree

PCIe Configuration Space Scan

Root Complex (Bus 0)
├─ Bus 0, Dev 0 — GPU: 10de:2684          (device found)
├─ Bus 0, Dev 1 — PCIe Bridge → Bus 1     (bridge: scan sub-bus)
│   ├─ Bus 1, Dev 0 — Ethernet: 8086:15b8 (device found)
│   └─ Bus 1, Dev 1 — no response         (no device: skip)
└─ Bus 0, Dev 2 — NVMe: 144d:a80a         (device found)
The firmware reads configuration registers at each possible address. Devices that respond get recorded and assigned resources. Bridges reveal additional buses to scan.

Every PCI device has a 256-byte configuration space; PCIe extends it to 4096 bytes. The first 64 bytes are standardized. Bytes 0 and 1 hold the vendor ID, bytes 2 and 3 the device ID. The class code at offsets 0x09-0x0B tells the firmware what kind of device it is -- mass storage, network controller, display adapter, and so on. The firmware does not need a device-specific driver to enumerate PCIe; it just reads these standard registers.

Address Spaces: Memory-Mapped I/O

When the firmware assigns resources to a device, it gives that device a range of memory addresses. But these addresses do not point to RAM. They point to the device itself. When the CPU reads or writes to one of these addresses, the transaction travels over the PCIe bus to the device instead of to DRAM. This is called memory-mapped I/O, or MMIO.

Key term: Memory-Mapped I/O (MMIO) A scheme where hardware device registers are assigned addresses in the same address space as system memory. The CPU accesses device registers using the same load and store instructions it uses for RAM. The chipset routes the transaction to the correct bus based on the address.

From the CPU's perspective, the address space is one big range of numbers. Some ranges map to RAM. Other ranges map to PCIe devices. The chipset and root complex route each memory access to the right destination based on address ranges configured during enumeration. The firmware sets up these mappings; the OS can adjust them later.

This is why a 32-bit system with 4 GB of RAM might only report 3.2 GB available. The missing 800 MB of address space is reserved for MMIO -- GPU frame buffers, device registers, and the firmware itself. On 64-bit systems, MMIO regions are typically mapped above the 4 GB line, so all physical RAM is accessible.

The firmware enumerates every bus and assigns every device an address range before the operating system loads. The OS inherits this map and can modify it, but the initial discovery is the firmware's job. Without enumeration, the OS would have no idea what hardware is present.

USB: A Different Model

USB (Universal Serial Bus) takes a different approach from PCIe. Where PCIe devices are discovered by scanning a static tree of addresses, USB devices announce themselves when they are plugged in. The USB host controller detects a voltage change on the port, then initiates a conversation with the new device to learn its type and capabilities.

USB is also hierarchical. A host controller connects to a root hub. The root hub has ports. External hubs can be connected to those ports, creating a tree up to seven levels deep with up to 127 devices. Each device gets a dynamic address assigned by the host controller during enumeration.

The firmware needs USB working early for a practical reason: keyboards and mice. If you want to enter the firmware setup screen or select a boot device, the firmware must have a USB driver running. UEFI firmware includes USB drivers as part of its DXE phase; legacy BIOS implementations typically emulated a PS/2 keyboard on behalf of USB devices (usually via System Management Mode) so that old interrupt-based keyboard services kept working.

SATA and Storage Buses

SATA (Serial ATA) connects traditional hard drives and older SSDs. Like PCIe, it uses serial point-to-point links, but it has a simpler protocol optimized for block storage. Each SATA port connects to exactly one device. The SATA controller typically lives on the chipset and presents itself to the firmware as a PCIe device.

Modern NVMe SSDs bypass SATA entirely. They connect directly via PCIe, which gives them access to the full PCIe bandwidth and a much simpler command protocol. A SATA SSD maxes out at about 550 MB/s. An NVMe SSD on PCIe 4.0 x4 can reach 7,000 MB/s. The difference is not just speed -- NVMe supports 65,535 command queues compared to SATA's single queue, which matters enormously for parallel workloads.

Fig. 06c -- Storage bus bandwidth comparison

Storage Interface Bandwidth

SATA III        0.6 GB/s
NVMe PCIe 3.0   3.5 GB/s
NVMe PCIe 4.0   7.0 GB/s
NVMe PCIe 5.0  14.0 GB/s
Each generation of PCIe roughly doubles the bandwidth available to storage. NVMe SSDs on PCIe 5.0 are more than 20 times faster than a SATA III drive.

DMA: Letting Devices Access Memory Directly

Normally, data moves through the CPU: the CPU reads from the device, then writes to memory, or vice versa (a technique called programmed I/O). This works but wastes CPU cycles on what is essentially a copy operation.

DMA -- Direct Memory Access -- lets devices read from and write to system memory without involving the CPU at all. A network adapter receiving a packet can write the data directly into a buffer in RAM, then interrupt the CPU to say "your data is ready." The CPU never touches the data during the transfer.

Key term: DMA (Direct Memory Access) A capability that allows hardware devices to transfer data directly to or from system memory without CPU involvement. The device is given a memory address and a length, and it handles the transfer autonomously. The CPU is free to do other work during the transfer.

DMA is essential for high-throughput devices. A PCIe 4.0 NVMe drive can push 7 GB/s. If the CPU had to mediate every byte of that transfer, it would spend most of its time just copying data. DMA lets the drive write directly to RAM while the CPU runs application code.

The security implication is significant. A device with DMA access can read any physical memory address. A malicious PCIe device -- or a compromised firmware on a legitimate device -- could read encryption keys, passwords, or any other data in RAM. This is why modern systems include an IOMMU (Input/Output Memory Management Unit) that restricts which memory regions each device can access. The IOMMU is configured by the firmware and managed by the OS.

The bus system is the circulatory system of the computer. PCIe provides the high-bandwidth backbone, USB handles human-interface and peripheral devices, and SATA connects traditional storage. The firmware discovers all of these during enumeration, assigns addresses, and builds the device map that the operating system will inherit. DMA lets devices move data without burdening the CPU, but it requires careful access control.

With memory working and the bus system enumerated, the firmware knows what hardware is available. The next step is finding something to boot from.

Next: Finding the Boot Device