03b

Why Linux?

Why this series uses Linux as its reference operating system.

This series is about what happens between the moment you press the power button and the moment a shell prompt appears. We have covered the hardware and firmware layers so far. From here on, we need to talk about an operating system -- the software that takes control after the firmware finishes its work.

We are going to use Linux. This article explains why.

This is not a value judgment. It is not an argument that Linux is "better" than Windows or macOS or FreeBSD. It is a practical decision based on one overriding requirement: you should be able to see, read, and understand every piece of the system we discuss.

The Requirement: Inspectability

When this series describes a piece of the boot process, you should be able to open the source code and see exactly what happens. Not a summary. Not a vendor's documentation of what they say happens. The actual code that runs on the machine.

With proprietary operating systems, this is impossible. The Windows kernel source code is not available to the public. macOS is built on an open-source foundation (XNU/Darwin), but large portions of the system are closed. When we say "the kernel does X at this stage of boot," you would have to take our word for it.

Key term: Open source -- Software whose source code is publicly available for anyone to read, modify, and redistribute, subject to the terms of its license. Linux is released under the GNU General Public License version 2 (GPLv2), which requires that anyone distributing the software also make the source code available.

With Linux, every claim in this series is verifiable. When we say the kernel initializes the memory manager at a certain point during boot, you can open init/main.c and follow start_kernel(), the function that drives early initialization. When we say the init system starts services in a particular order, you can read the systemd source code or the init scripts. Nothing is hidden.

This matters because understanding a computer is not about memorizing facts. It is about building a mental model that you can verify against reality. If the model and the reality disagree, you need access to the reality to figure out which one is wrong.

Where Linux Runs

Linux is not a niche operating system. It is the dominant operating system in several of the largest computing domains.

Fig. 01 -- Linux deployment across computing domains
  • Servers: ~80% of public web servers
  • Cloud: ~90% of cloud workloads
  • HPC: 100% of TOP500 supercomputers
  • Embedded: routers, TVs, cars, cameras
  • Mobile: ~72% (Android = Linux kernel)
  • Containers: ~99% of Docker/Kubernetes workloads run on Linux

Understanding Linux boot means understanding how most of the world's computing infrastructure starts up.

Linux dominates servers, cloud, supercomputing, containers, and embedded devices. Android phones run the Linux kernel. Learning the Linux boot process gives you knowledge that applies across the broadest range of real systems.

Servers: Roughly 80% of public-facing web servers run Linux. If you visit a website, there is a strong chance the machine serving the page is running a Linux kernel.

Cloud computing: AWS, Google Cloud, and Azure all offer Linux instances, and the vast majority of cloud workloads run on Linux. AWS even built their own Linux distribution (Amazon Linux) for their infrastructure.

Supercomputers: Every single system on the TOP500 list of the world's fastest supercomputers runs Linux. One hundred percent. This has been true since 2017.

Embedded systems: Your home router almost certainly runs Linux. Many smart TVs, security cameras, automotive infotainment systems, and industrial controllers run Linux. The kernel's configurability makes it adaptable to hardware ranging from a small embedded board with a few megabytes of RAM to a server with terabytes.

Mobile: Android, which runs on roughly 72% of the world's smartphones, uses the Linux kernel. The userspace is different from what you would find on a server, but the kernel -- the part we will study in this series -- is the same codebase.

Containers: Docker and Kubernetes, the dominant container technologies, run on Linux. Container isolation uses Linux-specific kernel features (namespaces and cgroups) that do not exist in other kernels.
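Those isolation primitives are visible on any Linux machine, container or not. A quick look, assuming /proc is mounted as usual:

```shell
# Every process belongs to one namespace of each type; the kernel exposes
# membership as symlinks under /proc/<pid>/ns. Container runtimes isolate
# processes by creating fresh namespaces and placing them in new cgroups.
ls /proc/self/ns        # e.g. cgroup ipc mnt net pid user uts
cat /proc/self/cgroup   # control-group membership of this shell
```

Two shells on the same host will show identical namespace IDs; a shell inside a container will not, and that difference is the entire basis of container isolation.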

If you understand how Linux boots, you understand how most of the world's computing infrastructure starts up.

You Can Read the Source

The Linux kernel source code is available at kernel.org. As of early 2025, it contains roughly 36 million lines of code across approximately 80,000 files. That sounds overwhelming, but the boot path -- the code that executes between the kernel being loaded and the first user process starting -- is a small, well-defined subset.

Key term: Kernel -- The core of an operating system. The kernel manages hardware resources (CPU time, memory, devices), enforces security boundaries between processes, and provides the system call interface that all programs use to interact with hardware. In Linux, the kernel is shipped as a single compressed image (vmlinuz) that the bootloader loads into memory.

Here is a concrete example. During boot, the Linux kernel prints messages to the console. You have probably seen them -- lines scrolling by during startup, each prefixed with a timestamp. The function that handles this is printk(), defined in kernel/printk/printk.c. You can read it. You can see how the ring buffer works, how log levels are handled, how early boot messages are stored before the console is initialized.
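You can look at printk()'s output directly. A minimal sketch; note that some systems restrict the ring buffer to root (kernel.dmesg_restrict=1), hence the fallback:

```shell
# Read the kernel's message ring buffer: every line was emitted by a
# printk() call somewhere in the kernel source.
if dmesg >/dev/null 2>&1; then
  dmesg | head -n 5   # the earliest messages, buffered before any console existed
else
  echo "dmesg is restricted here; try: sudo dmesg | head -n 5"
fi
```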

This level of transparency is rare. FreeBSD and OpenBSD offer it too, and they are excellent systems, but no other widely-deployed operating system combines complete access to its internals with Linux's reach across servers, cloud, embedded, and mobile.

The Tooling Ecosystem

Linux comes with tools that let you inspect the boot process while it is happening and after it has completed:

  • dmesg -- Prints the kernel's message buffer, showing every hardware detection, driver initialization, and boot event in chronological order.
  • journalctl -- On systemd-based systems, shows the complete log of the boot process, including both kernel and userspace messages.
  • /proc and /sys -- Virtual filesystems that expose kernel state at runtime. Want to know what the kernel detected about your CPU? Read /proc/cpuinfo. Want to see the memory map? Read /proc/iomem.
  • strace -- Traces system calls made by any process, letting you see exactly how programs interact with the kernel.
  • ftrace / perf -- Kernel tracing tools that can instrument the boot process itself, showing you the exact sequence of function calls and their timing.
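The /proc entries above can be exercised from any shell with no special tooling; exact file contents vary by architecture (ARM's /proc/cpuinfo, for instance, has no "model name" field):

```shell
# /proc files are generated on demand: reading them queries live kernel
# state rather than anything stored on disk.
grep -m 1 "model name" /proc/cpuinfo   # CPU model as detected (x86 field name)
cat /proc/uptime                       # seconds since boot, seconds idle
head -n 3 /proc/iomem                  # physical memory map (addresses appear as 0 for non-root)
```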
Fig. 02 -- Inspecting the boot process with Linux tools
Boot phases: early kernel init → driver probing → root mount → userspace init → shell prompt
  • $ dmesg -- kernel ring buffer: hardware + driver messages
  • $ journalctl -b -- full boot log: kernel + systemd + services
  • $ cat /proc/iomem -- physical memory map: what the kernel detected
  • $ systemd-analyze blame -- time spent in each service during boot
  • ftrace -- function-level kernel tracing
Linux provides tools to inspect every phase of the boot process, from kernel ring buffer messages to systemd timing analysis to runtime kernel tracing.

These tools are not afterthoughts. They are built into the system specifically because Linux is designed to be understood. The kernel developers themselves use these tools to debug boot issues, and they are available to anyone running a Linux system.

Linux provides complete source code and built-in inspection tools (dmesg, journalctl, /proc, /sys, ftrace) that let you verify every claim about how the system works. No other widely-deployed OS offers this combination of transparency, tooling, and market reach.

What "Linux" Means in This Series

When people say "Linux," they sometimes mean just the kernel, and sometimes mean an entire operating system including the kernel, system libraries, package manager, desktop environment, and applications. In this series, we will be precise about which we mean.

The Linux kernel is the software written by Linus Torvalds and thousands of contributors, maintained at kernel.org. It handles hardware management, process scheduling, memory management, filesystems, networking, and security. When we say "the kernel does X," we mean this specific piece of software.

A Linux distribution (distro) is a complete operating system built around the Linux kernel. It includes a bootloader (often GRUB), an init system (often systemd), system libraries (glibc or musl), a package manager (apt, dnf, pacman), and a collection of user applications. Examples include Debian, Fedora, Arch Linux, Ubuntu, and Alpine.
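Each of these components can be identified on a running system. A sketch using common (not universal) file locations:

```shell
uname -r                                   # kernel release
grep PRETTY_NAME /etc/os-release 2>/dev/null || echo "no os-release file"  # distro identity
# Which init system is PID 1? (/sbin/init is conventional; absent in minimal containers)
[ -e /sbin/init ] && readlink -f /sbin/init || echo "no /sbin/init here"
# C library: glibc reports its version via ldd; musl behaves differently
ldd --version 2>/dev/null | head -n 1
```

On a systemd distribution the readlink typically resolves to /usr/lib/systemd/systemd, which is a quick way to see that "init" is just a symlink into the init system the distro chose.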

Fig. 03 -- Components of a Linux distribution
  • Hardware: CPU, RAM, disk, network
  • Linux kernel: process mgmt | memory mgmt | filesystems | drivers | networking
  • System libraries: glibc / musl
  • Init system: systemd / OpenRC / runit
  • Shell + utilities: bash, coreutils
  • Package manager: apt / dnf / pacman
  • Applications: nginx, postgres, ...
  • Bootloader (GRUB / systemd-boot): loads the kernel from disk into memory

A Linux distribution is a complete system built in layers. This series covers the full stack from bootloader through kernel initialization to the first shell prompt.

This series covers the boot process from firmware handoff (where the bootloader takes over) through kernel initialization to the moment the first user process runs. That path touches the bootloader, the kernel, the init system, and the early userspace -- components found in every Linux distribution.

Where distribution-specific details matter (such as how GRUB is configured or how systemd orders its services), we will note which distribution we are using and point out where other distributions differ.

Why Not Windows? Why Not BSD?

Windows is a fine operating system. It boots, it runs applications, it handles hardware. But its boot process is closed. You cannot read the Windows kernel source code. You cannot see what ntoskrnl.exe does during initialization. You cannot trace the exact sequence of operations from bootloader to desktop. You can study Microsoft's documentation, which is extensive, but documentation is not source code.

FreeBSD and OpenBSD are open-source, well-documented, and technically excellent. Their boot processes are worth studying. But they serve a smaller user base. If you learn the FreeBSD boot process, you understand how a few percent of servers start up. If you learn the Linux boot process, you understand how the majority of the world's servers, all of its supercomputers, most of its cloud infrastructure, and billions of mobile devices start up.

Key term: Distribution (distro) -- A complete operating system package that includes the Linux kernel, bootloader, init system, system libraries, package manager, and applications. Different distributions make different choices about these components, but they all share the same kernel. Debian, Fedora, Arch, and Ubuntu are all Linux distributions.

This is not about allegiance. It is about reach. The knowledge you gain from studying the Linux boot process transfers to more real-world systems than any alternative.

What You Will Need

To follow along with this series, you will want access to a Linux system. This could be:

  • A physical machine running any Linux distribution. Debian, Fedora, Ubuntu, Arch -- any will work.
  • A virtual machine using VirtualBox, QEMU/KVM, or VMware. This is the safest option for experimentation, since you can snapshot and restore.
  • A cloud instance on any provider. A small Debian or Ubuntu instance is inexpensive and gives you full root access.
  • Windows Subsystem for Linux (WSL2) -- usable for many exercises, though it skips the normal boot path: Microsoft supplies a real Linux kernel that starts inside a lightweight VM, so there is no firmware or bootloader stage to observe.

The specific distribution does not matter much for this series. We will point out distribution-specific details when they arise. What matters is that you have root access and can inspect the system freely.

This series uses Linux because it is open-source (you can read every line of code), dominant (it runs most of the world's servers, cloud, and mobile devices), and well-tooled (dmesg, /proc, ftrace give you direct visibility into the boot process). It is a practical choice, not a tribal one.

What Comes Next

We have covered the hardware layer (power delivery, the reset vector) and the firmware layer (POST and the BIOS). We briefly discussed why the legacy BIOS has limitations. The next article tackles the firmware that replaced it: UEFI.

UEFI changes the boot process significantly. It replaces the 512-byte MBR with a proper filesystem on a dedicated partition. It runs in 32-bit or 64-bit mode instead of real mode. It provides Secure Boot to verify the integrity of everything in the boot chain. And it standardizes the interface between firmware and operating system in ways the legacy BIOS never did.

Next: UEFI vs Legacy BIOS