19

Standard I/O

Every process has three channels: standard input, standard output, and standard error.

Every program needs to communicate. It needs to receive input, produce output, and report problems. Unix solved this decades ago with an elegant design: every process is born with three open communication channels. No setup required. No configuration files. They are just there.

These three channels are called standard input, standard output, and standard error. Together, they form the foundation of how every command-line program talks to the world.

Three Channels, Always

Imagine a factory machine. It has a conveyor belt feeding raw materials in one side. It has a second conveyor belt carrying finished products out the other side. And it has an alarm buzzer that goes off when something goes wrong. The raw materials belt is standard input. The finished products belt is standard output. The alarm is standard error.

By convention, every process on a Unix system starts with these three channels open:

  • Standard input (stdin) -- where the program reads incoming data
  • Standard output (stdout) -- where the program writes its normal results
  • Standard error (stderr) -- where the program writes error messages and diagnostics

By default, stdin is connected to your keyboard (through the terminal), and both stdout and stderr are connected to your screen. But as we will see in the next article, these connections can be rewired.

Key term: Standard I/O -- The three default communication channels that every Unix process receives at birth: standard input (stdin) for reading data, standard output (stdout) for writing results, and standard error (stderr) for writing error messages. They are the universal interface between programs and the outside world.

Fig. 19.0 -- The three standard channels

[Figure: the keyboard feeds stdin (fd 0) of a process (e.g., grep); the process's stdout (fd 1) and stderr (fd 2) both go to the screen. These are the default connections.]

Every process starts with three channels. By default, stdin reads from the keyboard and both stdout and stderr write to the screen. The numbers 0, 1, and 2 are file descriptors -- the kernel's internal names for these channels.

File Descriptors: The Kernel's Numbering System

The kernel does not think in terms of names like "stdin" and "stdout." It uses numbers. Each open channel in a process is identified by a small non-negative integer called a file descriptor.

The first three are always assigned the same way:

  • 0 -- standard input
  • 1 -- standard output
  • 2 -- standard error

If a process opens additional files, each one gets the lowest number not already in use: typically 3, then 4, then 5, and so on. But 0, 1, and 2 are reserved by convention, and nearly every program assumes they are there.

Key term: File descriptor -- A small non-negative integer that the kernel assigns to each open file, socket, pipe, or I/O channel within a process. File descriptors 0, 1, and 2 are pre-assigned to stdin, stdout, and stderr. Every read or write operation in a Unix program ultimately references a file descriptor number.

This numbering system is important because it is how you target specific channels when redirecting. Want to redirect just error messages to a file while keeping normal output on screen? You need to reference file descriptor 2 specifically. We will do exactly that in the next article.
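To make the numbering concrete, here is a minimal sketch in which the shell opens one extra descriptor for a group of commands. The scratch file demo_file is a hypothetical name used only for this example, and the 3> and >&3 notation belongs to the redirection syntax covered in the next article. One difference worth noting: a shell script names the descriptor number itself, whereas a program that opens a file is handed the lowest free number by the kernel.

```shell
# demo_file is a hypothetical scratch file created only for this sketch.
demo_file=$(mktemp)

# Within the braces, descriptor 3 is open and points at demo_file.
{
    echo "to stdout"                # written to descriptor 1 as usual
    echo "written via fd 3" >&3     # written to descriptor 3 instead
} 3>"$demo_file"
# Past the closing brace, descriptor 3 is no longer open on demo_file.

fd3_contents=$(cat "$demo_file")
echo "fd 3 received: $fd3_contents"
rm -f "$demo_file"
```

Only the second echo ends up in the scratch file; the first still travels through descriptor 1.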

Standard Output: Where Results Go

When a program wants to display results, it writes to stdout (file descriptor 1). The echo command writes its arguments to stdout. The ls command writes its file listing to stdout. The cat command reads files and writes their contents to stdout.

$ echo "Hello, world"
Hello, world

The text "Hello, world" went to stdout, which was connected to your terminal, which displayed it on screen. Simple.

Here is the critical thing: the program does not know or care where stdout actually goes. It just writes bytes to file descriptor 1. Whether those bytes end up on a screen, in a file, or fed into another program is decided by whoever set up the file descriptors before the program started. Usually, that is the shell.

Fig. 19.1 -- stdout does not know its destination

[Figure: echo writes to fd 1 (stdout), which the shell can connect to the screen (default), a file (> out.txt), or the next program (| grep). The program writes to fd 1 the same way in all three cases; only the shell knows where the bytes actually go.]

The echo command writes bytes to file descriptor 1 without knowing where they end up. The shell can connect that file descriptor to the screen (default), a file, or the input of another program.

Programs do not write "to the screen" or "to a file." They write to a file descriptor. Where that file descriptor points is decided externally, usually by the shell. This separation is what makes pipes and redirection possible.
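The same idea can be tried directly. In this sketch, greet is a hypothetical stand-in for any program that writes to stdout, and out_file is a scratch file invented for the example. The function body never changes; only the caller decides where descriptor 1 leads.

```shell
# greet is a hypothetical stand-in for any program that writes to stdout.
greet() {
    echo "Hello, world"
}

out_file=$(mktemp)   # hypothetical scratch file for this sketch

greet                                            # fd 1 is whatever the script inherited
greet > "$out_file"                              # fd 1 is a file
piped=$(greet | tr '[:lower:]' '[:upper:]')      # fd 1 is a pipe into tr

from_file=$(cat "$out_file")
echo "file received: $from_file"
echo "tr received and produced: $piped"
rm -f "$out_file"
```

In all three calls greet runs the identical echo; the destination is set up before it starts.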

Standard Error: A Separate Channel for Problems

Why do error messages need their own channel? Consider this situation: you run a program and send its output to a file. The program encounters an error halfway through. If the error message goes to stdout, it ends up mixed into the output file, corrupting the data. If the error message goes to stderr, it still appears on your screen even though stdout was redirected elsewhere.

$ ls /home /nonexistent
ls: cannot access '/nonexistent': No such file or directory
/home:
user1  user2

In this example, the file listing goes to stdout and the error message goes to stderr. Both happen to be connected to your screen, so you see them interleaved. But if you redirect stdout to a file, the error still shows up on your terminal:

$ ls /home /nonexistent > listing.txt
ls: cannot access '/nonexistent': No such file or directory

The file listing.txt contains the directory listing, clean and uncorrupted. The error message appeared on screen because stderr was not redirected -- only stdout was.

This is why stderr exists: to keep error messages out of the data stream.
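A small sketch shows the separation at work. The function report is hypothetical: it stands in for a program that produces one line of data and one diagnostic. The >&2 and 2> notation is redirection syntax that the next article explains; here it only serves to send text to descriptor 2 and to capture it.

```shell
# report is a hypothetical command: one data line, one diagnostic.
report() {
    echo "data line"                               # goes to fd 1 (stdout)
    echo "warning: something went wrong" >&2       # goes to fd 2 (stderr)
}

err_file=$(mktemp)   # hypothetical scratch file for the diagnostics

# $(...) captures stdout only; 2> sends stderr to the scratch file.
captured_out=$(report 2>"$err_file")
captured_err=$(cat "$err_file")

echo "stdout carried: $captured_out"
echo "stderr carried: $captured_err"
rm -f "$err_file"
```

The captured data contains no trace of the warning, which is the whole point of the second channel.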

Standard Input: How Programs Read

Most interactive programs read from stdin (file descriptor 0). By default, this is connected to your keyboard through the terminal.

When you run cat with no arguments, it reads from stdin:

$ cat
Hello
Hello
World
World

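Each line you type is echoed back because cat reads stdin and writes to stdout. Press Ctrl+D on an empty line to signal "end of input." No special character is delivered to the program; instead, the terminal makes cat's next read return zero bytes, which is how a program recognizes EOF (end-of-file) and knows there is nothing more to read.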

Programs that process data -- like sort, grep, wc -- can all read from stdin when given no filename argument. This is what makes pipes work, but we are getting ahead of ourselves.

Fig. 19.2 -- stdin sources

[Figure: fd 0 can be fed by the keyboard (default), a file (< data.txt), or a pipe from another program (echo hi |). Only one source is connected at a time, and the program (here, sort) reads from fd 0 the same way regardless of source.]

File descriptor 0 (stdin) can be connected to the keyboard, a file, or the output of another program. The receiving program reads from fd 0 identically in all three cases.
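As a rough illustration, the sketch below defines count_lines, a hypothetical helper that reads descriptor 0 until end-of-file, and feeds it from two different sources. The scratch file data_file is also invented for the example.

```shell
# count_lines is a hypothetical helper: it reads fd 0 line by line until EOF.
count_lines() {
    n=0
    while IFS= read -r _line; do
        n=$((n + 1))
    done
    echo "$n"
}

data_file=$(mktemp)   # hypothetical scratch file
printf 'pear\napple\nfig\n' > "$data_file"

lines_from_file=$(count_lines < "$data_file")        # stdin is a file
lines_from_pipe=$(printf 'a\nb\n' | count_lines)     # stdin is a pipe

echo "from file: $lines_from_file"
echo "from pipe: $lines_from_pipe"
rm -f "$data_file"
```

The loop inside count_lines is the same in both calls; it simply reads until nothing is left.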

The /dev/null Black Hole

Sometimes you want to throw output away entirely. Maybe a program prints progress messages you do not care about. Maybe you want to run a command silently.

Unix provides a special file for this: /dev/null. It is a device file that accepts any data written to it and discards it immediately. Reading from it immediately returns end-of-file.

$ echo "This disappears" > /dev/null
$ ls /nonexistent 2> /dev/null

The first line sends stdout to /dev/null -- the text vanishes. The second line sends stderr to /dev/null -- the error message is suppressed. (The 2> syntax means "redirect file descriptor 2." We will cover this syntax fully in the next article.)

Key term: /dev/null -- A special device file that discards all data written to it and returns end-of-file on reads. It is the Unix "black hole" -- useful for silencing unwanted output or error messages. The name literally means the "null device."

Think of /dev/null as a trash can with no bottom. You can throw anything in. Nothing ever comes back out. Nothing accumulates. It is bottomless and always empty.
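Both halves of that behavior can be checked in a few lines. This is only a sketch; the variable names write_status and bytes_read are chosen for the example.

```shell
# Writing: the bytes are accepted and thrown away.
echo "This disappears" > /dev/null
write_status=$?          # exit status of the echo; 0 means the write succeeded

# Reading: end-of-file arrives immediately, so wc has nothing to count.
# tr strips the padding that some wc implementations add around the number.
bytes_read=$(wc -c < /dev/null | tr -d ' ')

echo "write exit status: $write_status"
echo "bytes read back: $bytes_read"
```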

Everything Is a File (Descriptor)

One of the defining ideas in Unix is that I/O is uniform. Whether a program is reading from a keyboard, a file on disk, a network socket, or a pipe from another program, it uses the same system calls: read() and write(), operating on file descriptors.

This is why the standard channels are called "file" descriptors even though stdin often comes from a keyboard. The kernel presents all I/O sources through the same interface. A program that reads from file descriptor 0 does not know -- and does not need to know -- whether it is reading keystrokes, file contents, or output from another program.

This uniformity is not an accident. It is a deliberate design choice that makes it trivial to compose programs. If every program reads from stdin and writes to stdout using the same interface, then any program's output can be connected to any other program's input. No special adapters. No format conversion. Just bytes flowing through file descriptors.

The power of standard I/O comes from its uniformity. Every program reads and writes the same way, regardless of what is on the other end of the file descriptor. This uniformity is what makes the entire Unix pipeline model possible.

Fig. 19.3 -- Uniform I/O through file descriptors

[Figure: a program calls read(fd, buf, n) and write(fd, buf, n) through the kernel's system call interface. To the program, fd 0 is "just a number"; behind it could be a terminal (keyboard), a regular file on disk, a pipe from another program, or a network socket. The same read()/write() calls work on any type of file descriptor.]

A program calls read() and write() on file descriptor numbers without knowing what type of resource is on the other end. The kernel handles the translation between the uniform interface and the actual device or file.
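A short preview of what this uniformity buys, using the pipe symbol that the next article covers properly. The word list is made up for the example. Each stage reads descriptor 0 and writes descriptor 1, and none of them knows what its neighbors are.

```shell
# printf, sort, uniq, wc, and tr were written independently of one another,
# yet they connect because each reads fd 0 and writes fd 1.
unique_words=$(printf 'pear\napple\npear\nfig\n' | sort | uniq)
unique_count=$(printf '%s\n' "$unique_words" | wc -l | tr -d ' ')

echo "$unique_words"
echo "distinct words: $unique_count"
```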

Seeing File Descriptors in Action

You can inspect the file descriptors of a running process. On Linux, every process has a directory under /proc that lists its open file descriptors:

$ ls -l /proc/self/fd
lrwx------ 1 user user 64 Jan 23 10:00 0 -> /dev/pts/0
lrwx------ 1 user user 64 Jan 23 10:00 1 -> /dev/pts/0
lrwx------ 1 user user 64 Jan 23 10:00 2 -> /dev/pts/0

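The /proc/self directory always refers to whichever process is reading it, which here is ls itself. All three standard file descriptors point to /dev/pts/0, a pseudo-terminal device: the terminal you are typing in, which ls inherited from the shell. (Actual output usually includes one more entry, typically descriptor 3, which ls opened in order to read the directory.)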

If you redirect stdout to a file and check again, you will see file descriptor 1 pointing to that file instead of the terminal.
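One way to try this is sketched below. It assumes a Linux system with /proc mounted and the readlink utility available, and it falls back to a placeholder message elsewhere. The scratch file fd_report is a hypothetical name for the example.

```shell
fd_report=$(mktemp)   # hypothetical scratch file that will receive stdout

if [ -d /proc/self/fd ]; then
    # readlink runs with its fd 1 redirected, so the link it prints
    # names the very file its output is being written to.
    readlink /proc/self/fd/1 > "$fd_report"
    fd1_target=$(cat "$fd_report")
else
    fd1_target="(no /proc on this system)"
fi

echo "fd 1 pointed to: $fd1_target"
rm -f "$fd_report"
```

On Linux, the path printed should be the scratch file rather than a /dev/pts entry.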

What You Have Learned

Every Unix process starts with three open file descriptors: stdin (0) for reading input, stdout (1) for writing results, and stderr (2) for writing errors. Programs interact with these channels through uniform read and write system calls without knowing what is on the other end. Standard error exists as a separate channel so that error messages do not contaminate the data stream. The special file /dev/null discards anything written to it.

This system of standard channels is not just a convenience. It is the architectural foundation that makes the next topic possible: connecting programs together with pipes and redirection.

Next: Pipes and Redirection