GRUB has loaded the kernel image into memory and jumped to its entry point. But the code now running is not yet the Linux kernel proper. It is boot code embedded at the front of the bzImage: the real-mode setup code, followed by the decompression stub -- a small, self-contained program whose job is to decompress the actual kernel, move it to its final address, and jump to it.
Between the bootloader's jump and the kernel's first real instruction, two major transitions happen. First, the compressed kernel must be extracted into a working executable. Second, the CPU must switch from the constrained 16-bit real mode it has been running in since power-on to the full 32-bit (or 64-bit) protected mode that the kernel requires. These two operations are deeply intertwined.
Why the Kernel Is Compressed
A modern Linux kernel, compiled with a typical desktop or server configuration, produces an uncompressed vmlinux binary of 30 to 80 megabytes. That is a lot of data to load from disk during boot. Disk reads at this stage are slow, because the bootloader reads through firmware I/O services rather than an optimized driver -- even on an NVMe drive, reading 60 MB takes measurably longer than reading 12 MB.
Compression shrinks the kernel to roughly one-fifth its original size. The trade-off is CPU time: the decompressor must run after loading to recover the original data. But CPUs are fast and disk I/O is relatively slow, so decompressing in memory is almost always faster than loading the uncompressed image from disk. The net boot time is shorter with compression.
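The trade-off can be put into numbers with a small model. The sketch below is only an illustration: the disk and decompression rates are assumed values, not measurements, and the function names are hypothetical.

```python
def load_seconds(image_mb, disk_mb_per_s):
    """Time to read an image of the given size from disk."""
    return image_mb / disk_mb_per_s


def decompress_seconds(unpacked_mb, decompress_mb_per_s):
    """Time for the CPU to produce the unpacked image in memory."""
    return unpacked_mb / decompress_mb_per_s


def break_even_rate(unpacked_mb, packed_mb, disk_mb_per_s):
    """Decompressor output rate above which compression shortens boot."""
    return disk_mb_per_s * unpacked_mb / (unpacked_mb - packed_mb)


# Illustrative assumptions, not measurements.
UNPACKED_MB = 60.0           # uncompressed kernel size
PACKED_MB = 12.0             # roughly one-fifth of the original
DISK_MB_PER_S = 100.0        # assumed read rate during early boot
DECOMPRESS_MB_PER_S = 500.0  # assumed decompressor output rate

plain = load_seconds(UNPACKED_MB, DISK_MB_PER_S)
packed = (load_seconds(PACKED_MB, DISK_MB_PER_S)
          + decompress_seconds(UNPACKED_MB, DECOMPRESS_MB_PER_S))
threshold = break_even_rate(UNPACKED_MB, PACKED_MB, DISK_MB_PER_S)

print(f"uncompressed image: {plain:.2f} s")
print(f"compressed image:   {packed:.2f} s")
print(f"break-even rate:    {threshold:.0f} MB/s")
```

Under these assumptions the compressed path wins as long as the decompressor produces output faster than the break-even rate. With a faster disk or a slower decompressor the balance shifts, which is why the choice of algorithm matters.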
The Linux kernel build system supports several compression algorithms:
- gzip -- the original default. Good compression ratio, moderate speed.
- LZ4 -- faster decompression, slightly larger output.
- LZMA / XZ -- higher compression ratio, slower decompression.
- Zstandard (zstd) -- good balance of ratio and speed. Increasingly the default on modern distributions.
The choice is made at kernel compile time (CONFIG_KERNEL_GZIP, CONFIG_KERNEL_LZ4, etc.). The decompression stub is built to match: it contains exactly one decompressor, for whichever algorithm was selected.
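To get a feel for the ratio-versus-speed trade-off, the following sketch compresses the same buffer with two algorithms from the Python standard library. It is a rough userspace illustration, not the kernel's build tooling: the sample data is synthetic, the helper name is hypothetical, and LZ4 and zstd are left out only because the sketch sticks to the standard library.

```python
import lzma
import time
import zlib


def measure(name, compress, decompress, data):
    """Compress once, time one decompression, and check the round trip."""
    packed = compress(data)
    start = time.perf_counter()
    unpacked = decompress(packed)
    elapsed = time.perf_counter() - start
    if unpacked != data:
        raise ValueError(f"{name}: round trip failed")
    return {"name": name, "ratio": len(packed) / len(data), "seconds": elapsed}


# Synthetic, repetitive sample data standing in for a kernel image.
sample = b"".join(b"symbol_%05d: mov eax, %d\n" % (i, i * 7) for i in range(50000))

results = [
    measure("gzip (zlib)", lambda d: zlib.compress(d, 9), zlib.decompress, sample),
    measure("xz (lzma)", lzma.compress, lzma.decompress, sample),
]

for r in results:
    print(f"{r['name']:12s} ratio={r['ratio']:.3f} decompress={r['seconds'] * 1000:.2f} ms")
```

The exact figures depend on the data and the machine, so they say little about a real kernel image. The point is only that each algorithm sits at a different spot between output size and decompression time.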
The First Instructions: Real-Mode Setup
When GRUB jumps to the kernel's entry point, execution begins in the real-mode setup code -- the part of the bzImage that GRUB loaded to low memory (around 0x10000). This code runs with the CPU still in 16-bit real mode. It performs a series of sanity checks and hardware queries before the transition to protected mode.
The real-mode setup code does the following:
- Normalizes the CPU state. Sets segment registers to known values, clears the direction flag, and establishes a stack.
- Queries the BIOS for hardware information. This is the last chance to use BIOS services, because they only work in real mode. The code calls INT 0x15 to get the system memory map (the E820 map), INT 0x10 to query video modes, and other interrupts to detect hardware features. All results are stored in the boot parameters structure where the kernel will read them later.
- Enables the A20 line. The original 8086 had only 20 address lines, so addresses just past 1 MB wrapped around to zero, and some programs relied on that wraparound. To stay compatible, the IBM PC/AT shipped with address line 20 gated off by default. Enabling the A20 line lets the CPU address memory above 1 MB correctly. Without it, bit 20 of every physical address is forced to zero, so every odd-numbered megabyte is unreachable, even in protected mode. The A20 enabling code tries several methods (keyboard controller, Fast A20 via port 0x92, BIOS INT 0x15) because different hardware requires different approaches.
- Prepares for the mode switch. The code loads a temporary Global Descriptor Table (GDT) and Interrupt Descriptor Table (IDT) -- the data structures that the CPU requires before it can enter protected mode.
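The effect of the A20 gate is easy to show with real-mode address arithmetic. The following is a minimal model, not kernel code: it forms a linear address from a segment and an offset, then forces bit 20 to zero when the gate is closed. The function name is hypothetical.

```python
A20_BIT = 1 << 20  # address line 20 carries the 1 MB bit


def real_mode_address(segment, offset, a20_enabled):
    """Linear address for segment:offset, with optional A20 masking."""
    linear = (segment << 4) + offset  # real-mode address formation
    if not a20_enabled:
        linear &= ~A20_BIT            # gate closed: bit 20 is forced low
    return linear


# FFFF:0010 is the first byte above 1 MB.
wrapped = real_mode_address(0xFFFF, 0x0010, a20_enabled=False)
direct = real_mode_address(0xFFFF, 0x0010, a20_enabled=True)

print(hex(wrapped))  # 0x0
print(hex(direct))   # 0x100000
```

With the gate closed, the access silently lands at address zero, which is the 8086 wraparound behavior that old programs expected and that the kernel must switch off.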
The Mode Switch
The transition from real mode to protected mode is one of the most important moments in the boot process. In real mode, the CPU acts like a fast 1981 processor: 16-bit operations, 1 MB address space, no memory protection. In protected mode, the CPU gains 32-bit operations, a 4 GB address space, memory segmentation with privilege levels, and the ability to run the kind of code that a modern operating system requires.
The switch itself comes down to a single register write: setting bit 0 (the PE bit, for Protection Enable) in the CR0 control register. But that write only works if the ground has been prepared. The GDT must be loaded. The A20 line must be enabled. Interrupts must be disabled (the real-mode interrupt handlers would crash if called in protected mode).
The sequence looks like this in pseudocode:
cli ; disable interrupts
lgdt [gdt_pointer] ; load the Global Descriptor Table
mov eax, cr0 ; read CR0
or eax, 1 ; set PE bit
mov cr0, eax ; write CR0 -- NOW IN PROTECTED MODE
jmp 0x10:protected_entry ; far jump to reload CS with a 32-bit selector
The far jump immediately after setting CR0 is essential. It flushes the CPU's instruction pipeline and loads the code segment register (CS) with a selector that points to a 32-bit code segment in the GDT. Without this jump, the CPU would continue fetching instructions using the old 16-bit CS value, and the next instruction would be decoded incorrectly.
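It helps to see what the far jump's selector actually refers to. The sketch below packs a flat 32-bit code segment descriptor into its 8-byte form and decodes the selector 0x10 from the pseudocode. The table layout is an assumption chosen so that 0x10 lands on the code segment, and the helper names are hypothetical.

```python
import struct


def gdt_descriptor(base, limit, access, flags):
    """Pack a segment descriptor into its 8-byte in-memory form."""
    value = limit & 0xFFFF                 # limit bits 0-15
    value |= (base & 0xFFFFFF) << 16       # base bits 0-23
    value |= (access & 0xFF) << 40         # present, privilege, type
    value |= ((limit >> 16) & 0xF) << 48   # limit bits 16-19
    value |= (flags & 0xF) << 52           # granularity and default size
    value |= ((base >> 24) & 0xFF) << 56   # base bits 24-31
    return struct.pack("<Q", value)


def decode_selector(selector):
    """Split a segment selector into table index, table type, and RPL."""
    return {
        "index": selector >> 3,
        "table": "LDT" if selector & 0x4 else "GDT",
        "rpl": selector & 0x3,
    }


# Access 0x9A: present, ring 0, executable, readable code segment.
# Access 0x92: present, ring 0, writable data segment.
# Flags 0xC: 4 KB granularity, 32-bit default operand size.
FLAT_CODE32 = gdt_descriptor(base=0, limit=0xFFFFF, access=0x9A, flags=0xC)
FLAT_DATA32 = gdt_descriptor(base=0, limit=0xFFFFF, access=0x92, flags=0xC)

# Assumed layout: entry 0 is the mandatory null descriptor, entry 1 is unused.
gdt = [bytes(8), bytes(8), FLAT_CODE32, FLAT_DATA32]

code_selector = decode_selector(0x10)
print(code_selector)
print(gdt[code_selector["index"]].hex())
```

The selector is an index, not an address: 0x10 means entry 2 of the GDT at privilege level 0. The descriptor found there tells the CPU to treat code as 32-bit, which is why CS must be reloaded right after the PE bit is set.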
On 64-Bit Systems: The Long Mode Jump
Modern x86-64 systems take the mode switching further. The kernel needs 64-bit long mode to access the full address space and use 64-bit registers. On these systems, the decompression stub typically transitions through protected mode and then into long mode before decompressing.
The transition to long mode requires additional steps beyond protected mode:
- Set up page tables. Long mode requires paging to be enabled. The stub creates a minimal identity-mapped page table -- a page table where virtual addresses equal physical addresses. This is temporary; the kernel will build its own page tables later.
- Enable PAE. Physical Address Extension (bit 5 of CR4) must be set. Long mode requires it.
- Set the LME bit. The Long Mode Enable bit in the EFER (Extended Feature Enable Register) MSR signals the CPU to prepare for 64-bit operation.
- Enable paging. Setting the PG bit (bit 31) in CR0, with LME already set, activates long mode.
- Far jump to 64-bit code. Like the real-to-protected switch, a far jump reloads CS with a 64-bit code segment selector.
After this sequence, the CPU is in 64-bit mode with identity-mapped paging, and the decompressor can use the full address space.
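The steps above can be modeled in a few lines. The sketch below tracks the control bits involved and builds an identity map out of 2 MB pages, then translates an address through it. It is a simplified simulation: the three upper paging levels are flattened into one dictionary, the choice of 2 MB pages is an assumption, and all function names are hypothetical.

```python
CR0_PE = 1 << 0    # protected mode enable
CR0_PG = 1 << 31   # paging enable
CR4_PAE = 1 << 5   # physical address extension
EFER_LME = 1 << 8  # long mode enable (bit 8 of the EFER MSR)

PRESENT = 1 << 0
WRITABLE = 1 << 1
PAGE_SIZE_2M = 1 << 7  # PS bit: the entry maps a 2 MB page directly
TWO_MB = 2 * 1024 * 1024


def long_mode_active(cr0, cr4, efer):
    """Long mode needs PE, PAE, and LME set, and then paging turned on."""
    return all([cr0 & CR0_PE, cr0 & CR0_PG, cr4 & CR4_PAE, efer & EFER_LME])


def split_vaddr(vaddr):
    """Return (PML4 index, PDPT index, PD index, offset in the 2 MB page)."""
    return ((vaddr >> 39) & 0x1FF, (vaddr >> 30) & 0x1FF,
            (vaddr >> 21) & 0x1FF, vaddr & (TWO_MB - 1))


def build_identity_map(gigabytes):
    """Map every 2 MB page in the first N gigabytes onto itself."""
    table = {}
    for page in range(gigabytes * 512):
        phys = page * TWO_MB
        pml4, pdpt, pd, _ = split_vaddr(phys)
        table[(pml4, pdpt, pd)] = phys | PRESENT | WRITABLE | PAGE_SIZE_2M
    return table


def translate(table, vaddr):
    """Walk the flattened table and return the physical address."""
    pml4, pdpt, pd, offset = split_vaddr(vaddr)
    entry = table.get((pml4, pdpt, pd))
    if entry is None or not entry & PRESENT:
        raise LookupError(f"page fault at {vaddr:#x}")
    return (entry & ~(TWO_MB - 1)) | offset


# Replay the sequence: start in protected mode with paging off.
cr0, cr4, efer = CR0_PE, 0, 0
table = build_identity_map(gigabytes=4)  # step 1: page tables
cr4 |= CR4_PAE                           # step 2: enable PAE
efer |= EFER_LME                         # step 3: set LME
before_paging = long_mode_active(cr0, cr4, efer)
cr0 |= CR0_PG                            # step 4: enable paging
after_paging = long_mode_active(cr0, cr4, efer)

print(before_paging, after_paging)
print(hex(translate(table, 0x1000000)))
```

Because every entry points back at its own address, turning paging on does not change where any instruction or data byte appears to live. That property is what lets the stub keep executing across the moment paging is enabled.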
The Decompression
With the CPU in the correct mode, the decompression stub can do its primary job: decompress the kernel.
The stub knows the exact location and size of the compressed data because these values were embedded in the image at build time. The compressed kernel payload sits immediately after the stub code in the bzImage. The stub also knows the expected size of the decompressed output.
The process is straightforward in concept:
- Determine the output address. The stub decompresses the kernel to a location that does not overlap with the compressed data or the stub itself. The kernel build system calculates safe addresses at compile time.
- Call the decompressor. The stub invokes the appropriate decompression function (gzip's inflate, LZ4's LZ4_decompress, zstd's ZSTD_decompress, etc.). This function reads the compressed data and writes the decompressed kernel to the output address.
- Verify the output. Some configurations check a CRC or other integrity value to confirm the decompression produced correct data.
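The three steps can be sketched in userspace. In the example below, zlib stands in for whichever decompressor was built into the stub, the "kernel" is a synthetic byte string, and the build-time values (expected length and CRC) are stored in a hypothetical dictionary rather than embedded in an image.

```python
import zlib


def regions_overlap(start_a, size_a, start_b, size_b):
    """True if two [start, start + size) memory regions share any byte."""
    return start_a < start_b + size_b and start_b < start_a + size_a


def extract_kernel(payload, expected_size, expected_crc):
    """Decompress the payload and verify its size and CRC-32."""
    output = zlib.decompress(payload)
    if len(output) != expected_size:
        raise ValueError("decompressed size does not match build-time value")
    if zlib.crc32(output) != expected_crc:
        raise ValueError("CRC mismatch: decompressed data is corrupt")
    return output


# Build-time side (hypothetical): compress the image and record its properties.
fake_vmlinux = b"\x7fELF" + bytes(range(256)) * 64
payload = zlib.compress(fake_vmlinux, 9)
build_info = {"output_len": len(fake_vmlinux), "crc32": zlib.crc32(fake_vmlinux)}

# Boot-time side: pick a non-overlapping output region, then extract.
payload_addr, output_addr = 0x2000000, 0x1000000  # assumed addresses
safe = not regions_overlap(payload_addr, len(payload),
                           output_addr, build_info["output_len"])
kernel = extract_kernel(payload, build_info["output_len"], build_info["crc32"])

print(safe, len(payload), len(kernel))
```

The overlap check matters because the decompressor writes its output while still reading its input. If the two regions collided, the stub could overwrite compressed data it has not consumed yet.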
The decompressed output is vmlinux -- the raw kernel ELF binary, stripped of debug symbols. This is the actual Linux kernel: the scheduler, the memory manager, the device driver framework, the system call interface, everything.
Relocation
The decompressed kernel must end up at a suitable physical address. On x86-64 the default is 0x1000000 (16 MB), the value of CONFIG_PHYSICAL_START, which is also the address the kernel is linked to run at.
If the decompressed data is not already at the correct address, the stub copies it there. Modern kernels are usually built as relocatable images (with CONFIG_RELOCATABLE=y), which means they can run at addresses other than their link address; any internal references that depend on the load address are adjusted early in startup. Even so, the decompression stub still places the image in a clean region of memory before jumping.
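Relocation is, at its core, one subtraction and many additions. The sketch below assumes a 2 MB alignment requirement and a hypothetical list of absolute addresses that were computed for the link address; it picks an aligned load address and shifts each pointer by the difference.

```python
LINK_ADDRESS = 0x1000000  # 16 MB, the address the image was linked for
ALIGNMENT = 0x200000      # assumed 2 MB alignment requirement


def align_up(value, alignment):
    """Round value up to the next multiple of a power-of-two alignment."""
    return (value + alignment - 1) & ~(alignment - 1)


def relocate(pointers, link_address, load_address):
    """Shift absolute addresses by the distance the image was moved."""
    delta = load_address - link_address
    return [pointer + delta for pointer in pointers]


# Hypothetical: the first free address found in memory is not aligned.
first_free = 0x1234567
load_address = align_up(first_free, ALIGNMENT)

# Hypothetical absolute references inside the image.
linked_pointers = [0x1000000, 0x1000400, 0x1A3C010]
moved_pointers = relocate(linked_pointers, LINK_ADDRESS, load_address)

print(hex(load_address))
print([hex(p) for p in moved_pointers])
```

Code that only uses relative addressing needs no adjustment at all, so in practice only the references that encode an absolute address have to be patched.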
The Handoff
The final act of the decompression stub is a jump to the kernel's entry point -- the startup_64 function (on x86-64) or startup_32 (on 32-bit systems). At this instant, control passes from boot code to kernel code.
The kernel inherits a specific machine state:
- CPU mode: 64-bit long mode (or 32-bit protected mode on older systems), with identity-mapped paging.
- Interrupts: disabled. The kernel will set up its own interrupt handlers before re-enabling them.
- Boot parameters: in a known memory location. These contain the memory map, command line, initrd location, video mode, and other data collected by the real-mode setup code and GRUB.
- Memory contents: the decompressed kernel at its final address, the initramfs at its GRUB-assigned address, and the boot parameters structure in low memory.
The kernel does not depend on GRUB's code or data structures. It does not call back to the bootloader. From this point, GRUB might as well not exist. The kernel reads the boot parameters, initializes its own data structures, and begins the process of bringing up a full operating system.
The Boot Parameters (Zero Page)
The boot parameters structure deserves a closer look, because it is the only communication channel between the bootloader world and the kernel world. It occupies the first page (4096 bytes) of the real-mode code segment -- historically called the "zero page" because it starts at offset zero of that segment.
The zero page contains:
- The E820 memory map from BIOS (or EFI memory map on UEFI systems).
- The kernel command-line pointer and length.
- The initrd address and size.
- The video mode and framebuffer information.
- The number of setup sectors and the boot protocol version.
- Hardware detection results from the real-mode setup code.
The kernel's first initialization code reads this structure to learn about the machine. Without it, the kernel would not know how much RAM exists, what video mode is active, where the initrd is, or what command-line parameters the user requested.
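To make the layout concrete, the sketch below builds a synthetic 4096-byte page and reads a few fields back out of it. The offsets follow the x86 boot protocol documentation as commonly described and should be checked against the kernel version in use; the sample values and helper names are hypothetical.

```python
import struct

# Field offsets within the zero page (assumed from the boot protocol docs).
OFF_E820_ENTRIES = 0x1E8   # u8: number of memory map entries
OFF_HEADER_MAGIC = 0x202   # 4 bytes: "HdrS"
OFF_VERSION = 0x206        # u16: boot protocol version
OFF_RAMDISK_IMAGE = 0x218  # u32: initrd load address
OFF_RAMDISK_SIZE = 0x21C   # u32: initrd size
OFF_CMD_LINE_PTR = 0x228   # u32: address of the command line
OFF_E820_TABLE = 0x2D0     # start of the memory map entries
E820_ENTRY = struct.Struct("<QQI")  # address, size, type (20 bytes, packed)


def make_sample_page():
    """Build a synthetic zero page with made-up values."""
    page = bytearray(4096)
    page[OFF_HEADER_MAGIC:OFF_HEADER_MAGIC + 4] = b"HdrS"
    struct.pack_into("<H", page, OFF_VERSION, 0x020F)
    struct.pack_into("<II", page, OFF_RAMDISK_IMAGE, 0x36000000, 0x01800000)
    struct.pack_into("<I", page, OFF_CMD_LINE_PTR, 0x00020000)
    regions = [(0x0, 0x9FC00, 1), (0x100000, 0x7FF00000, 1)]  # type 1: usable RAM
    page[OFF_E820_ENTRIES] = len(regions)
    for i, region in enumerate(regions):
        E820_ENTRY.pack_into(page, OFF_E820_TABLE + i * E820_ENTRY.size, *region)
    return bytes(page)


def parse_zero_page(page):
    """Extract a handful of boot parameters from a 4096-byte page."""
    if len(page) != 4096:
        raise ValueError("the zero page is exactly 4096 bytes")
    if page[OFF_HEADER_MAGIC:OFF_HEADER_MAGIC + 4] != b"HdrS":
        raise ValueError("missing HdrS signature")
    (version,) = struct.unpack_from("<H", page, OFF_VERSION)
    initrd_addr, initrd_size = struct.unpack_from("<II", page, OFF_RAMDISK_IMAGE)
    (cmdline_ptr,) = struct.unpack_from("<I", page, OFF_CMD_LINE_PTR)
    count = page[OFF_E820_ENTRIES]
    e820 = [E820_ENTRY.unpack_from(page, OFF_E820_TABLE + i * E820_ENTRY.size)
            for i in range(count)]
    return {
        "protocol": f"{version >> 8}.{version & 0xFF:02d}",
        "initrd": (initrd_addr, initrd_size),
        "cmdline_ptr": cmdline_ptr,
        "usable_bytes": sum(size for _, size, kind in e820 if kind == 1),
    }


params = parse_zero_page(make_sample_page())
print(params)
```

Note that the command line itself is not stored in the page, only a pointer to it. The structure is a fixed-layout table of facts and addresses, which is what allows two independently built programs to agree on it.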
Summary of the Journey So Far
From GRUB's jump to the kernel's first real instruction, a remarkable amount happens:
- The real-mode setup code queries the BIOS for hardware information.
- The A20 line is enabled.
- A temporary GDT is loaded.
- The CPU switches to protected mode, then to long mode.
- The decompression stub extracts the compressed kernel.
- The decompressed kernel is placed at its final physical address.
- The stub jumps to the kernel entry point.
All of this happens in a fraction of a second. The user sees nothing -- perhaps a brief flash on screen if verbose boot is enabled. But in that fraction of a second, the machine transforms from a BIOS-era 16-bit environment into a 64-bit system ready to initialize a modern operating system.
The kernel is now running. It has the CPU. It has the memory map. It has the initramfs. The next phase -- kernel initialization -- is where Linux truly comes to life.