Skip to main content
Engineering LibreTexts

11.1: Streams and Files

  • Page ID
    15126
  • \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    As was noted in Chapter 4, all input and output (I/O) in Java is accomplished through the use of input streams and output streams. You are already familiar with input and output streams because we have routinely used the System.out output stream and and the System.in input stream (Fig. [fig-systemout]) in this text’s examples. Recall that System.out usually connects your program (source) to the screen (destination) and System.in usually connects the keyboard (source) to the running program (destination). What you have learned about streams will also be a key for connecting files to a program.

    The Data Hierarchy

    Data, or information, are the contents that flow through Java streams or stored in files. All data are comprised of binary digits or bits. A bit is simply a 0 or a 1, the electronic states that correspond to these values. As we learned in Chapter 5, a bit is the smallest unit of data.

    However, it would be tedious if a program had to work with data in units as small as bits. Therefore, most operations involve various-sized aggregates of data such as an 8-bit byte, a 16-bit short, a 16-bit char, a 32-bit int, a 64-bit long, a 32-bit float, or a 64-bit double. As we know, these are Java’s primitive numeric types. In addition to these aggregates, we can group together a sequence of char to form a String.

    It is also possible to group data of different types into objects. A record, which corresponds closely to a Java object, can have fields that contain different types of data. For example, a student record might contain fields for the student’s name and address represented by (Strings), expected year of graduation (int), and current grade point average represented by (double). Collections of these records are typically grouped into files. For example, your registrar’s office may have a separate file for each of its graduating classes. These are typically organized into a collection of related files, which is called a database.

    Taken together, the different kinds of data that are processed by a computer or stored in a file can be organized into a data hierarchy (Fig. 112).

    It’s important to recognize that while we, the programmers, may group data into various types of abstract entities, the information flowing through an input or output stream is just a sequence of bits. There are no natural boundaries that mark where one byte (or one int or one record) ends and the next one begins. Therefore, it will be up to us to provide the boundaries as we process the data.

    Binary Files and Text Files

    As we noted in chapter 4, there are two types of files in Java: binary files and text files. Both kinds store data as a sequence of bits—that is, a sequence of 0’s and 1’s. Thus, the difference between the two types of files lies in the way they are interpreted by the programs that read and write them. A binary file is processed as a sequence of bytes, whereas a text file is processed as a sequence of characters.

    Text editors and other programs that process text files interpret the file’s sequence of bits as a sequence of characters—that is, as a string. Your Java source programs (.java) are text files, and so are the HTML files that populate the World Wide Web. The big advantage of text files is their portability. Because their data are represented in the ASCII code they can be read and written by just about any text-processing program. Thus, a text file created by a program on a Windows/Intel computer can be read by a Macintosh program.

    In non-Java environments, data in binary files are stored as bytes, and the representation used varies from computer to computer. The manner in which a computer’s memory stores binary data determines how it is represented in a file. Thus, binary data are not very portable. For example, a binary file of integers created on a Macintosh cannot be read by a Windows/Intel program.

    One reason for the lack of portability is that each type of computer uses its own definition for how an integer is defined. On some systems an integer might be 16 bits, and on others it might be 32 bits, so even if you know that a Macintosh binary file contains integers, that still won’t make it readable by Windows/Intel programs. Another problem is that even if two computers use the same number of bits to represent an integer, they might use different representation schemes. For example, some computers might use 10000101 as the 8-bit representation of the number 133, whereas other computers might use the reverse, 10100001, to represent 133.

    The good news for us is that Java’s designers have made its binary files platform independent by carefully defining the exact size and representation that must be used for integers and all other primitive types. Thus, binary files created by Java programs can be interpreted by Java programs on any platform.

    Input and Output Streams

    Java has a wide variety of streams for performing I/O. They are defined in the java.io package, which must be imported by any program that does I/O. They are generally organized into the hierarchy illustrated in Figure [fig-streamhier]. We will cover only a small portion of the hierarchy in this text. Generally speaking, binary files are processed by subclasses of InputStream and OutputStream. Text files are processed by subclasses of Reader and Writer, both of which are streams, despite their names.

    InputStream and OutputStream are abstract classes that serve as the root classes for reading and writing binary data. Their most commonly used subclasses are DataInputStream and DataOutputStream, which are used for processing String data and data of any of Java’s primitive types—char, boolean, int, double, and so on. The analogues of these classes for processing text data are the Reader and Writer classes, which serve as the root classes for all text I/O.

    The various subclasses of these root classes perform various specialized I/O operations. For example, FileInputStream and FileOutputStream are used for performing binary input and output on files. The PrintStream class contains methods for outputting various primitive data—integers, floats, and so forth—as text. The System.out stream, one of the most widely used output streams, is an object of this type. The PrintWriter class, which was introduced in JDK 1.1 contains the same methods as PrintStream but the methods are designed to support platform independence and internationalized I/O—that is, I/O that works in different languages and alphabets.

    The various methods defined in PrintWriter are designed to output a particular type of primitive data (Fig. 11.4). As you would expect, there is both a print() and println() method for each kind of data that the programmer wants to output.

    Table 11.1 briefly describes Java’s most commonly used input and output streams. In addition to the ones we’ve already mentioned, you are already familiar with methods from the BufferedReader and File classes, which were used in Chapter 4.

    Filtering refers to performing operations on data while the data are being input or output. Methods in the FilterInputStream and FilterReader classes can be used to filter binary and text data during input. Methods in the FilterOutputStream and FilterWriter can be used to filter output data. These classes serve as the root classes for various filtering subclasses. They can also be subclassed to perform customized data filtering.

    ll
    Class & Description

    InputStream & Abstract root class of all binary input streams & Provides methods for reading bytes from a binary file & Provides methods required to filter data & Provides input data buffering for reading large files & Provides methods for reading an array as if it were a stream & Provides methods for reading Java’s primitive data types & Provides methods for reading piped data from another thread

    OutputStream & Abstract root class of all binary output streams & Provides methods for writing bytes to a binary file & Provides methods required to filter data & Provides output data buffering for writing large files & Provides methods for writing an array as if it were a stream & Provides methods for writing Java’s primitive data types & Provides methods for writing piped data to another thread & Provides methods for writing primitive data as text

    Reader & Abstract root class for all text input streams & Provides buffering for character input streams & Provides input operations on char arrays & Provides methods for character input on files & Provides methods to filter character input & Provides input operations on Strings

    Writer & Abstract root class for all text output streams & Provides buffering for character output streams & Provides output operations to char arrays & Provides methods for output to text files & Provides methods to filter character output & Provides methods for printing binary data as characters & Provides output operations to Strings

    One type of filtering is buffering, which is provided by several buffered streams, including BufferedInputStream and BufferedReader, for performing binary and text input, and BufferedOutputStream and BufferedWriter, for buffered output operations. As was discussed in chapter 4, a buffer is a relatively large region of memory used to temporarily store data while they are being input or output. When buffering is used, a program will transfer a large number of bytes into the buffer from the relatively slow input device and then transfer these to the program as each read operation is performed. The transfer from the buffer to the program’s memory is very fast.

    Similarly, when buffering is used during output, data are transferred directly to the buffer and then written to the disk when the buffer fills up or when the flush() method is called.

    You can also define your own data filtering subclasses to perform customized filtering. For example, suppose you want to add line numbers to a text editor’s printed output. To perform this task, you could define a FilterWriter subclass and override its methods to perform the desired filtering operation. Similarly, to remove the line numbers from such a file during input, you could define a FilterReader subclass. In that case, you would override its read() methods to suit your goals for the program.

    There are several classes that provide I/O-like operations on various internal memory structures. ByteArrayInputStream, ByteArrayOutputStream, CharArrayReader, and CharArrayWriter are four classes that take input from or send output to arrays in the program’s memory. Methods in these classes can be useful for performing various operations on data during input or output. For example, suppose a program reads an entire line of integer data from a binary file into a ByteArray. It might then transform the data by, say, computing the remainder modulo N of each value. The program now can read these transformed data by treating the byte array as an input stream. A similar example would apply for some kind of output transformation.

    The StringReader and StringWriter classes provide methods for treating Strings and StringBuffers as I/O streams. These methods can be useful for performing certain data conversions.


    This page titled 11.1: Streams and Files is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Ralph Morelli & Ralph Wade via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.