When a process completes (or crashes), any data stored in main memory is lost. But data stored on a hard disk drive (HDD) or solid state drive (SSD) is “persistent;” that is, it survives after the process completes, even if the computer shuts down.
Hard disk drives are complicated. Data is stored in blocks, which are laid out in sectors, which make up tracks, which are arranged in concentric circles on platters.
Solid state drives are simpler in one sense, because blocks are numbered sequentially, but they raise a different complication: each block can be written a limited number of times before it becomes unreliable.
As a programmer, you don’t want to deal with these complications. What you want is an appropriate abstraction of persistent storage hardware. The most common abstraction is called a “file system.”
- A “file system” is a mapping from each file’s name to its contents. If you think of the names as keys, and the contents as values, a file system is a kind of key-value database (see https://en.Wikipedia.org/wiki/Key-value_database).
- A “file” is a sequence of bytes.
File names are usually strings, and they are usually “hierarchical”; that is, the string specifies a path from a top-level directory (or folder), through a series of subdirectories, to a specific file.
The primary difference between the abstraction and the underlying mechanism is that files are byte-based and persistent storage is block-based. The operating system translates byte-based file operations in the C library into block-based operations on storage devices. Typical block sizes are 1–8 KiB.
For example, the following code opens a file and reads the first byte:
FILE *fp = fopen("/home/downey/file.txt", "r"); char c = fgetc(fp); fclose(fp);
When this code runs:
fopenuses the filename to find the top-level directory, called
/, the subdirectory
home, and the sub-subdirectory
- It finds the file named
file.txtand “opens” it for reading, which means it creates a data structure that represents the file being read. Among other things, this data structure keeps track of how much of the file has been read, called the “file position”.
In DOS, this data structure is called a File Control Block, but I want to avoid that term because in UNIX it means something else. In UNIX, there seems to be no good name for it. It is an entry in the open file table, so I will call it an OpenFileTableEntry.
- When we call
fgetc, the operating system checks whether the next character of the file is already in memory. If so, it reads the next character, advances the file position, and returns the result.
- If the next character is not in memory, the operating system issues an I/O request to get the next block. Disk drives are slow, so a process waiting for a block from disk is usually interrupted so another process can run until the data arrives.
- When the I/O operation is complete, the new block of data is stored in memory, and the process resumes. It reads the first character and stores it as a local variable.
- When the process closes the file, the operating system completes or cancels any pending operations, removes data stored in memory, and frees the OpenFileTableEntry.
The process for writing a file is similar, but there are some additional steps. Here is an example that opens a file for writing and changes the first character.
FILE *fp = fopen("/home/downey/file.txt", "w"); fputc('b', fp); fclose(fp);
When this code runs:
fopenuses the filename to find the file. If it does not already exist, it creates a new file and adds an entry in the parent directory,
- The operating system creates an OpenFileTableEntry that indicates that the file is open for writing, and sets the file position to 0.
fputcattempts to write (or re-write) the first byte of the file. If the file already exists, the operating system has to load the first block into memory. Otherwise it allocates a new block in memory and requests a new block on disk.
- After the block in memory is modified, it might not be copied back to the disk right away. In general, data written to a file is “buffered”, which means it is stored in memory and only written to disk when there is at least one block to write.
- When the file is closed, any buffered data is written to disk and the OpenFileTableEntry is freed.
To summarize, the C library provides the abstraction of a file system that maps from file names to streams of bytes. This abstraction is built on top of storage devices that are actually organized in blocks.