Files and file systems
When a process completes (or crashes), any data stored in the main memory is lost. But data stored on a hard disk drive (HDD) or solid-state drive (SSD) is “persistent;” that is, it survives after the process completes, even if the computer shuts down.
Hard disk drives are complicated. Data is stored in blocks, which are laid out in sectors, which make up tracks, which are arranged in concentric circles on platters.
Solid-state drives are simpler in one sense because blocks are numbered sequentially, but they raise a different complication: each block can be written a limited number of times before it becomes unreliable.
As a programmer, you don’t want to deal with these complications. What you want is an appropriate abstraction of persistent storage hardware. The most common abstraction is called a “file system.”
- A “file system” is a mapping from each file’s name to its contents. If you think of the names as keys, and the contents as values, a file system is a kind of key-value database (see https://en.Wikipedia.org/wiki/Key-value_database).
- A “file” is a sequence of bytes.
File names are usually strings, and they are usually “hierarchical”; that is, the string specifies a path from a top-level directory (or folder), through a series of subdirectories, to a specific file.
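To make the idea of a hierarchical name concrete, here is a minimal sketch that breaks a path string into the directory and file names it contains. The helper name split_path and the MAX_PARTS limit are assumptions made for illustration; they are not part of the C library or of any file system.

```c
#include <assert.h>
#include <string.h>

#define MAX_PARTS 16   /* arbitrary limit chosen for this sketch */

/* Splits `path` in place at each '/' and stores pointers to the
 * components in `parts`. Returns the number of components found.
 * split_path is a hypothetical helper, not a library function. */
int split_path(char *path, char *parts[], int max_parts)
{
    assert(path != NULL && parts != NULL);

    int n = 0;
    char *tok = strtok(path, "/");
    while (tok != NULL && n < max_parts) {
        parts[n++] = tok;
        tok = strtok(NULL, "/");
    }
    return n;
}
```

Applied to a writable copy of "/home/downey/file.txt", this yields the three components home, downey, and file.txt, which is the order in which a lookup would walk the directory tree.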
The primary difference between the abstraction and the underlying mechanism is that files are byte-based and persistent storage is block-based. The operating system translates byte-based file operations in the C library into block-based operations on storage devices. Typical block sizes are 1–8 KiB.
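The core of that translation is simple arithmetic: a byte position within a file corresponds to a block number and an offset inside that block. The sketch below shows the calculation; the names BlockAddr and locate_byte are hypothetical, and a real file system would go on to map the file-relative block number to a location on the device.

```c
#include <assert.h>
#include <stddef.h>

/* Where a byte lives, relative to the start of its file. */
typedef struct {
    size_t block;    /* index of the block within the file */
    size_t offset;   /* position of the byte inside that block */
} BlockAddr;

/* Converts a byte position into a (block, offset) pair.
 * locate_byte is an illustrative helper, not an OS interface. */
BlockAddr locate_byte(size_t pos, size_t block_size)
{
    assert(block_size > 0);
    BlockAddr addr = { pos / block_size, pos % block_size };
    return addr;
}
```

With 4096-byte blocks, for instance, byte 5000 of a file falls in block 1 at offset 904.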
For example, the following code opens a file and reads the first byte:
    FILE *fp = fopen("/home/downey/file.txt", "r");
    int c = fgetc(fp);   /* fgetc returns an int so EOF can be told apart from data */
    fclose(fp);
When this code runs:
- fopen uses the filename to find the top-level directory, called /, the subdirectory home, and the sub-subdirectory downey.
- It finds the file named file.txt and “opens” it for reading, which means it creates a data structure that represents the file being read. Among other things, this data structure keeps track of how much of the file has been read, called the “file position”.
In DOS, this data structure is called a File Control Block, but I want to avoid that term because in UNIX it means something else. In UNIX, there seems to be no good name for it. It is an entry in the open file table, so I will call it an OpenFileTableEntry.
- When we call fgetc, the operating system checks whether the next character of the file is already in memory. If so, it reads the next character, advances the file position, and returns the result.
- If the next character is not in memory, the operating system issues an I/O request to get the next block. Disk drives are slow, so a process waiting for a block from disk is usually interrupted so another process can run until the data arrives.
- When the I/O operation is complete, the new block of data is stored in memory, and the process resumes. It reads the first character and stores it as a local variable.
- When the process closes the file, the operating system completes or cancels any pending operations, removes data stored in memory, and frees the OpenFileTableEntry.
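The steps above can be sketched as a toy model in which an array stands in for the disk. This is only an illustration under stated assumptions: the struct fields, the tiny BLOCK_SIZE, and the functions toy_open and toy_getc are invented for the example and do not correspond to any real kernel data structure.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLOCK_SIZE 8   /* tiny block size so the example is easy to trace */

/* A toy version of the per-open-file state described above. */
typedef struct {
    const unsigned char *disk;         /* stands in for the storage device */
    size_t file_size;                  /* length of the file in bytes */
    size_t position;                   /* the "file position" */
    long cached_block;                 /* block currently in memory, or -1 */
    unsigned char buffer[BLOCK_SIZE];  /* in-memory copy of that block */
    int block_reads;                   /* number of simulated I/O requests */
} OpenFileTableEntry;

void toy_open(OpenFileTableEntry *e, const unsigned char *disk, size_t size)
{
    assert(e != NULL && disk != NULL);
    e->disk = disk;
    e->file_size = size;
    e->position = 0;
    e->cached_block = -1;
    e->block_reads = 0;
}

/* Returns the next byte of the file, or -1 at end of file. */
int toy_getc(OpenFileTableEntry *e)
{
    if (e->position >= e->file_size) return -1;

    long block = (long)(e->position / BLOCK_SIZE);
    if (block != e->cached_block) {
        /* Not in memory: "issue an I/O request" for the whole block. */
        size_t start = (size_t)block * BLOCK_SIZE;
        size_t len = e->file_size - start;
        if (len > BLOCK_SIZE) len = BLOCK_SIZE;
        memcpy(e->buffer, e->disk + start, len);
        e->cached_block = block;
        e->block_reads++;
    }
    /* The byte is in memory: return it and advance the file position. */
    return e->buffer[e->position++ % BLOCK_SIZE];
}
```

Reading a 10-byte file one character at a time with this model triggers only two block reads, one for bytes 0 through 7 and one for bytes 8 and 9; every other call is served from memory.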
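The process for writing a file is similar, but there are some additional steps. Here is an example that opens a file for writing and writes the first character. Note that mode "w" discards the previous contents of an existing file; to change a byte in place, a program would open the file with mode "r+" instead.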
    FILE *fp = fopen("/home/downey/file.txt", "w");
    fputc('b', fp);
    fclose(fp);
When this code runs:
- fopen uses the filename to find the file. If it does not already exist, it creates a new file and adds an entry in the parent directory, /home/downey.
- The operating system creates an OpenFileTableEntry that indicates that the file is open for writing, and sets the file position to 0.
- fputc attempts to write (or re-write) the first byte of the file. If the file already exists, the operating system has to load the first block into memory. Otherwise it allocates a new block in memory and requests a new block on disk.
- After the block in memory is modified, it might not be copied back to the disk right away. In general, data written to a file is “buffered”, which means it is stored in memory and only written to disk when there is at least one block to write.
- When the file is closed, any buffered data is written to disk and the OpenFileTableEntry is freed.
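Buffering can be observed from a program, at least at the level of the C library, which keeps its own buffer on top of the one maintained by the operating system. The following sketch writes one byte and checks what a second reader of the same file sees before and after fflush. The helper names are hypothetical, and the "before" result depends on the library implementation: with full buffering it is typically EOF, but the C standard does not guarantee that.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the first byte currently visible in the file at `path`,
 * or EOF if the file is empty or cannot be opened.
 * first_byte_visible is a hypothetical helper for this sketch. */
int first_byte_visible(const char *path)
{
    assert(path != NULL);
    FILE *fp = fopen(path, "r");
    if (fp == NULL) return EOF;
    int c = fgetc(fp);
    fclose(fp);
    return c;
}

/* Writes one byte to `path` and reports what a second reader sees
 * before and after the stream is flushed. Returns 0 on success. */
int buffered_write_demo(const char *path, int *before, int *after)
{
    assert(before != NULL && after != NULL);
    FILE *fp = fopen(path, "w");
    if (fp == NULL) return -1;

    /* Ask for full buffering so the byte stays in the library's buffer. */
    setvbuf(fp, NULL, _IOFBF, BUFSIZ);

    fputc('b', fp);
    *before = first_byte_visible(path);  /* typically EOF: not written yet */

    fflush(fp);                          /* hand the buffered data to the OS */
    *after = first_byte_visible(path);   /* now the byte is visible */

    return fclose(fp) == 0 ? 0 : -1;
}
```

Even after fflush, the operating system may hold the block in memory for a while before copying it to the device, so this shows only the first of the two layers of buffering.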
To summarize, the C library provides the abstraction of a file system that maps from file names to streams of bytes. This abstraction is built on top of storage devices that are actually organized in blocks.
Everything is a file
The file abstraction is really a “stream of bytes” abstraction, which turns out to be useful for many things, not just file systems.
One example is the UNIX pipe, which is a simple form of inter-process communication. Processes can be set up so that output from one process is taken as input into another process. For the first process, the pipe behaves like a file open for writing, so it can use C library functions like fprintf. For the second process, the pipe behaves like a file open for reading, so it can use input functions like fgets and fscanf.
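As a rough illustration, the sketch below creates a pipe with the POSIX pipe call, wraps each end in a FILE stream with fdopen, and then uses ordinary stdio functions on both ends. To keep it short, both ends live in a single process, which works only because the message is much smaller than the pipe's capacity; the function name pipe_round_trip is an assumption made for the example.

```c
#define _POSIX_C_SOURCE 200809L   /* expose fdopen when compiling in strict ISO C mode */
#include <assert.h>
#include <stdio.h>
#include <unistd.h>   /* pipe, close (POSIX) */

/* Sends `msg` plus a newline through a pipe using stdio calls and reads
 * one line back into `out`. Returns 0 on success, -1 on failure.
 * pipe_round_trip is a hypothetical helper for this sketch. */
int pipe_round_trip(const char *msg, char *out, int out_size)
{
    assert(msg != NULL && out != NULL && out_size > 0);

    int fd[2];
    if (pipe(fd) != 0) return -1;

    FILE *writer = fdopen(fd[1], "w");   /* write end behaves like a file */
    FILE *reader = fdopen(fd[0], "r");   /* read end behaves like a file */
    if (writer == NULL || reader == NULL) {
        if (writer) fclose(writer); else close(fd[1]);
        if (reader) fclose(reader); else close(fd[0]);
        return -1;
    }

    fprintf(writer, "%s\n", msg);
    fclose(writer);                      /* flush and signal end of input */

    char *line = fgets(out, out_size, reader);
    fclose(reader);
    return line != NULL ? 0 : -1;
}
```

In a more realistic program the two ends would belong to different processes, for example a parent and a child created with fork, but the stdio calls on each side would look the same.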
Network communication also uses the stream of bytes abstraction. A UNIX socket is a data structure that represents a communication channel between processes on different computers (usually). Again, processes can read data from and write data to a socket using “file” handling functions.
Reusing the file abstraction makes life easier for programmers, since they only have to learn one API (application programming interface). It also makes programs more versatile, since a program intended to work with files can also work with data coming from pipes and other sources.
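A small example of that versatility: a function written against FILE * does not need to know what is on the other end of the stream. The sketch below counts newline characters; count_lines is a hypothetical name chosen for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the newline characters available from any stream: a regular
 * file, a pipe, or standard input. count_lines is a hypothetical
 * helper, not a library function. */
long count_lines(FILE *in)
{
    assert(in != NULL);

    long lines = 0;
    int c;
    while ((c = fgetc(in)) != EOF) {
        if (c == '\n') lines++;
    }
    return lines;
}
```

The same function could be called on a stream returned by fopen, on the read end of a pipe, or on stdin when the program is run at the end of a shell pipeline.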