
11.8: Special Topic- Databases and Personal Privacy


    During a typical day we all come in contact with many electronic databases that store information about us. If you use a supermarket discount card, every purchase you make is logged against your name in the supermarket’s database. When you use your bank card at an ATM, your financial transaction is logged against your account. When you charge gasoline or buy dinner, those transactions are logged against your credit card account. If you visit the doctor or dentist, a detailed record of your visit is transmitted to your medical insurance company’s database. If you receive a college loan, detailed financial information about you is entered into several different credit service bureaus. And so on.

    Should we be worried about how this information is used? Many privacy advocates say yes. With the computerization of medical records, phone records, financial transactions, driving records, and many other records, there is an enormous amount of personal information held in databases. At the same time, there are pressures from a number of sources for access to this information. Law-enforcement agencies want to use this information to monitor individuals. Corporations want to use it to help them market their products. Political organizations want to use it to help them market their candidates.

    Recently there has been pressure from government and industry in the United States to use the social security number (SSN) as a unique identifier. Such an identifier would make it easy to match personal information across different databases. Right now, the only thing your bank records, medical records, and supermarket records may have in common is your name, which is not a unique identifier. If all online databases were based on your SSN, it would be much simpler to create a complete profile. While this might improve services and reduce fraud and crime, it might also pose a significant threat to our privacy.

    Online databases serve many useful purposes. They help fight crime and reduce the cost of doing business. They help improve government and commercial services on which we have come to depend. On the other hand, databases can be and have been misused. They can be used by unauthorized individuals or agencies or in unauthorized ways. When they contain inaccurate information, they can cause personal inconvenience or even harm.

    A number of organizations have sprung up to address the privacy issues raised by online databases. If you’re interested in learning more about this issue, a good place to start is the Web site maintained by the Electronic Privacy Information Center (EPIC) at

    http://www.epic.org/

    absolute path name

    binary file

    buffering

    database

    data hierarchy

    directory

    end-of-file

    field

    file

    filtering

    input

    object serialization

    output

    path

    record

    relative path name

    Unicode Text Format (UTF)

    A file is a collection of data stored on a disk. A stream is an object that delivers data to and from other objects.

    An InputStream is a stream that delivers data to a program from an external source—such as the keyboard or a file. System.in is an example of an InputStream. An OutputStream is a stream that delivers data from a program to an external destination—such as the screen or a file. System.out is an example of an OutputStream.

    Data can be viewed as a hierarchy. From highest to lowest, a database is a collection of files. A file is a collection of records. A record is a collection of fields. A field is a collection of bytes. A byte is a collection of 8 bits. A bit is one binary digit, either 0 or 1.

    A binary file is a sequence of 0s and 1s that is interpreted as a sequence of bytes. A text file is a sequence of 0s and 1s that is interpreted as a sequence of characters. A text file can be read by any text editor, but a binary file cannot. InputStream and OutputStream are abstract classes that serve as the root classes for reading and writing binary data. Reader and Writer serve as root classes for text I/O.

    Buffering is a technique in which a buffer, a temporary region of memory, is used to store data while they are being input or output.
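
    As a minimal sketch of buffering, the following wraps a FileWriter in a BufferedWriter so that output accumulates in memory and is written to the file in larger blocks. The class name BufferDemo and the method writeLines() are hypothetical names chosen for illustration; they do not come from the chapter's programs.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

class BufferDemo {
    // Writes n numbered lines to the named text file through a buffer.
    static void writeLines(String fileName, int n) throws IOException {
        // The BufferedWriter holds output in a temporary region of memory
        // and passes it on to the FileWriter in larger chunks.
        try (BufferedWriter out = new BufferedWriter(new FileWriter(fileName))) {
            for (int i = 1; i <= n; i++) {
                out.write("line " + i);
                out.newLine();          // platform-specific line separator
            }
        } // leaving the try block closes the stream, which flushes the buffer
    }

    public static void main(String[] args) throws IOException {
        writeLines("buffered.txt", 3);  // hypothetical file name
    }
}
```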

    A text file contains a sequence of characters divided into lines by the \n character and ending with a special end-of-file character.

    The standard algorithm for performing I/O on a file consists of three steps: (1) Open a stream to the file, (2) perform the I/O, and (3) close the stream.

    Designing effective I/O routines answers two questions: (1) What streams should I use to perform the I/O? (2) What methods should I use to do the reading or writing?

    To prevent damage to files when a program terminates abnormally, streams should be closed when they are no longer needed.

    Most I/O operations can throw an IOException, which should be caught in the I/O methods.
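
    The three-step algorithm, together with the advice about closing streams and catching IOException, might be sketched as follows. This is an illustrative example only: the class ThreeStepDemo and the method writeTextFile() are hypothetical names, and returning a boolean to signal success is an assumption made to keep the sketch short.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

class ThreeStepDemo {
    // Writes text to a file; returns true on success, false on an I/O error.
    static boolean writeTextFile(String fileName, String text) {
        PrintWriter out = null;
        try {
            out = new PrintWriter(new FileWriter(fileName)); // (1) open a stream
            out.print(text);                                 // (2) perform the I/O
            // PrintWriter does not throw on write errors; checkError()
            // flushes the stream and reports whether one occurred.
            return !out.checkError();
        } catch (IOException e) {        // e.g., the file could not be opened
            System.err.println("I/O error: " + e.getMessage());
            return false;
        } finally {
            if (out != null) {
                out.close();                                 // (3) close the stream
            }
        }
    }
}
```

    Placing close() in the finally clause means the stream is closed whether or not the write succeeds.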

    Text input uses a different technique to determine when the end of a file has been reached. Text input methods return null or -1 when they attempt to read the special end-of-file character. Binary files don’t contain an end-of-file character, so binary read methods throw an EOFException when they attempt to read past the end of the file.
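
    For text input, the two end-of-file tests can be sketched as below. The class TextEofDemo and its methods are hypothetical; the sketch assumes a BufferedReader wrapped around a FileReader, which is one common way to read a text file.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

class TextEofDemo {
    // Line-oriented input: readLine() returns null at the end of the file.
    static int countLines(String fileName) throws IOException {
        int lines = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
            String line = in.readLine();   // attempt a read before the loop
            while (line != null) {
                lines++;
                line = in.readLine();
            }
        }
        return lines;
    }

    // Character-oriented input: read() returns an int, and -1 marks the end.
    static int countChars(String fileName) throws IOException {
        int chars = 0;
        try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
            int ch = in.read();
            while (ch != -1) {             // test for -1 before casting to char
                chars++;
                ch = in.read();
            }
        }
        return chars;
    }
}
```

    Because each loop tests the result of a read before using it, both methods return 0 for an empty file.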

    The java.io.File class provides methods that enable a program to interact with a file system. Its methods can be used to check a file’s attributes, including its name, directory, and path.

    Streams can be joined together to perform I/O. For example, a DataOutputStream and a FileOutputStream can be joined to perform output to a binary file.
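
    As an illustration of joining streams, the sketch below connects a DataOutputStream to a FileOutputStream to write one record, and a DataInputStream to a FileInputStream to read it back. The class name JoinedStreamsDemo and the record layout (a name, an int, and a double) are assumptions made for this example.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

class JoinedStreamsDemo {
    // FileOutputStream writes raw bytes to the file; DataOutputStream,
    // layered on top, converts Java values into those bytes.
    static void writeRecord(String fileName, String name, int age, double rate)
            throws IOException {
        try (DataOutputStream out =
                 new DataOutputStream(new FileOutputStream(fileName))) {
            out.writeUTF(name);
            out.writeInt(age);
            out.writeDouble(rate);
        }
    }

    // The fields must be read in the same order and with the same types
    // in which they were written.
    static String readRecord(String fileName) throws IOException {
        try (DataInputStream in =
                 new DataInputStream(new FileInputStream(fileName))) {
            String name = in.readUTF();
            int age = in.readInt();
            double rate = in.readDouble();
            return name + " " + age + " " + rate;
        }
    }
}
```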

    A binary file is “raw” in the sense that it contains no markers within it that allow you to tell where one data element ends and another begins. The interpretation of binary data is up to the program that reads or writes the file.
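
    To make this concrete, the following sketch writes a single int and then interprets the same four bytes in three different ways. It uses in-memory byte streams rather than a file to stay short, and the class RawBytesDemo and its method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class RawBytesDemo {
    // Returns the four bytes that writeInt() produces for the given value.
    static byte[] intToBytes(int value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(value);
        }
        return bytes.toByteArray();
    }

    // Interpretation 1: the four bytes are one int.
    static int asInt(byte[] data) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(data)).readInt();
    }

    // Interpretation 2: the same four bytes are one float.
    static float asFloat(byte[] data) throws IOException {
        return new DataInputStream(new ByteArrayInputStream(data)).readFloat();
    }

    // Interpretation 3: each byte is a separate one-byte character code.
    static String asChars(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append((char) b);
        }
        return sb.toString();
    }
}
```

    Nothing in the bytes themselves says which reading is intended; the reading code has to match the writing code.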

    Object serialization is the process of writing an object to an output stream. Object deserialization is the reverse process of reading a serialized object from an input stream. These processes use the java.io.ObjectOutputStream and java.io.ObjectInputStream classes.
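
    A minimal sketch of serialization and deserialization follows. The Contact class is a hypothetical stand-in for whatever class is being stored, and SerializationDemo with its save() and load() methods is likewise an assumption for illustration; the only requirement taken from the text is that the stored class implements Serializable.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class; it must implement Serializable to be written.
class Contact implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final int age;

    Contact(String name, int age) {
        this.name = name;
        this.age = age;
    }
}

class SerializationDemo {
    // Serialization: write the whole object to an output stream.
    static void save(Contact c, String fileName) throws IOException {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new FileOutputStream(fileName))) {
            out.writeObject(c);
        }
    }

    // Deserialization: read the object back and cast it to its class.
    static Contact load(String fileName)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new FileInputStream(fileName))) {
            return (Contact) in.readObject();
        }
    }
}
```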

    The JFileChooser class provides a dialog box that enables the user to select a file and directory when opening or saving a file.

    Because FileWriter contains a constructor that takes a file name argument, FileWriter(String), it can be used with PrintWriter to perform output to a text file:

    PrintWriter outStream =     //  Create output stream
        new PrintWriter(new FileWriter(fileName)); // Open file
    outStream.print(display.getText()); // Write the text to the file
    outStream.close();                  // Close output stream

    An empty file doesn’t affect this loop. If the file is empty, it will print a null line. The test line != null should come right after the readLine(), as it does in the while loop.

    This loop won’t work on an empty text file. In that case, ch would be set to \(-1\), and the attempt to cast it into a char would cause an error.

    // Requires: import java.io.File;
    public void getFileAttributes(String fileName) {
        File file = new File(fileName);
        System.out.println(fileName);   // Name as given by the caller
        System.out.println("absolute path:"
               + file.getAbsolutePath());
        System.out.println("length:" + file.length());  // Size in bytes
        if (file.isDirectory())
            System.out.println("Directory");
        else
            System.out.println("Not a Directory");
    } // getFileAttributes()

    The inStream.close() statement is misplaced in readIntegers(). Because it is inside the same try/catch block as the read loop, it is skipped when the loop throws an exception, and the stream is not closed. The EOFException should be caught in a separate try/catch block from other exceptions, and it should just cause the read loop to exit.
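
    The readIntegers() method itself is not reproduced here, so the following is only a hypothetical sketch of the structure this answer describes: the EOFException is caught in its own inner block, and close() is moved to a finally clause. The class name ReadIntegersDemo and the choice to return the values as a List are assumptions made so the sketch can stand alone.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class ReadIntegersDemo {
    // Reads every int in a binary file and returns the values in order.
    static List<Integer> readIntegers(String fileName) {
        List<Integer> values = new ArrayList<>();
        DataInputStream inStream = null;
        try {
            inStream = new DataInputStream(new FileInputStream(fileName));
            try {
                while (true) {                    // exited only by an exception
                    values.add(inStream.readInt());
                }
            } catch (EOFException e) {
                // Expected: end of file reached, so just leave the read loop.
            }
        } catch (IOException e) {                 // any other I/O failure
            System.err.println("I/O error: " + e.getMessage());
        } finally {
            if (inStream != null) {
                try {
                    inStream.close();             // runs on every exit path
                } catch (IOException e) {
                    System.err.println("Close failed: " + e.getMessage());
                }
            }
        }
        return values;
    }
}
```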

    Yes, a binary file containing several SomeObjects would be “readable” by the BinaryIO program because the program will read a String followed by 64 bytes. However, BinaryIO would misinterpret the data, because it will assume that n1 and n2 together comprise a single int, and n3 (64 bits) will be interpreted as a double. A file of SomeObjects could not be read by the ObjectIO program, because SomeObject does not implement the Serializable interface.

    Explain the difference between each of the following pairs of terms:

    System.in and System.out.

    File and directory.

    Buffering and filtering.

    Absolute and relative path name.

    Input stream and output stream.

    File and database.

    Record and field.

    Binary file and text file.

    Directory and database.

    Fill in the blanks.

    Unlike text files, binary files do not have a special ______ character.

    In Java, the String array parameter in the main() method is used for ______.

    ______ files are portable and platform independent.

    A ______ file created on one computer can’t be read by another

    Arrange the following kinds of data into their correct hierarchical relationships: bit, field, byte, record, database, file, String, char.

    In what different ways can the following string of 32 bits be interpreted?

    00010101111000110100000110011110

    When reading a binary file, why is it necessary to use an infinite loop that’s exited only when an exception occurs?

    Is it possible to have a text file with 10 characters and 0 lines? Explain.

    In reading a file, why is it necessary to attempt to read from the file before entering the read loop?

    When designing binary I/O, why is it especially important to design the input and output routines together?

    What’s the difference between ASCII code and UTF code?

    Could the following string of bits possibly be a Java object? Explain.

    00010111000111101010101010000111001000100
    11010010010101010010101001000001000000111

    Write a method that could be added to the TextIO program to read a text file and print all lines containing a certain word. This should be a void method that takes two parameters: The name of the file and the word to search for. Lines not containing the word should not be printed.

    Write a program that reads a text file and reports the number of characters and lines contained in the file.

    Modify the program in the previous exercise so that it also counts the number of words in the file. (Hint: The StringTokenizer class might be useful for this task.)

    Modify the ObjectIO program so that it allows the user to designate a file and then input Student data with the help of a GUI. As the user inputs data, each record should be written to the file.

    Write a program that will read a file of ints into memory, sort them in ascending order, and output the sorted data to a second file.

    Write a program that will read two files of ints, which are already sorted into ascending order, and merge their data. For example, if one file contains 1, 3, 5, 7, 9, and the other contains 2, 4, 6, 8, 10, then the merged file should contain 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.

    Suppose you have a file of data for a geological survey. Each record consists of a longitude, a latitude, and an amount of rainfall, all represented by doubles. Write a method to read this file’s data and print them on the screen, one record per line. The method should be void and it should take the name of the file as its only parameter.

    Suppose you have the same data as in the previous exercise. Write a method that will generate 1,000 records of random data and write them to a file. The method should be void and should take the file’s name as its parameter. Assume that longitudes have values in the range \(+/-\) 0 to 180 degrees, latitudes have values in the range \(+/-\) 0 to 90 degrees, and rainfalls have values in the range 0 to 20 inches.

    Design and write a file copy program that will work for either text files or binary files. The program should prompt the user for the names of each file and copy the data from the source file into the destination file. It should not overwrite an existing file, however. (Hint: Read and write the file as a file of byte.)

    Design a class, similar to Student, to represent an Address. It should consist of street, city, state, and zip code and should contain its own readFromFile() and writeToFile() methods.

    Using the class designed in the previous exercise, modify the Student class so that it contains an Address field. Modify the ObjectIO program to accommodate this new definition of Student and test your program.

    Write a program called Directory, which provides a listing of any directory contained in the current directory. This program should prompt the user for the name of the directory. It should then print a listing of that directory. The listing should contain the full path name of the directory, followed by the file name, length, last modified date, and a read/write code for each file. The read/write code should be an r if the file is readable and a w if the file is writable, in that order. Use a “-” to indicate not readable or not writable. For example, a file that is readable but not writable will have the code r-. Here’s an example listing:

    Listing for directory: myfiles
      name          length modified   code
      index.html    548    129098     rw
      index.gif     78     129190     rw
      me.html       682    128001     r-
      private.txt   1001   129000     --

    Note that File.lastModified() returns a long, which gives the modification time of the file in milliseconds since January 1, 1970 (UTC). For this exercise, just report its value rather than converting it into a date.

    Challenge: Modify the OneRowNimGUI class that is listed in Chapter 4’s Figure 4-25 so that the user can save the position of the game to a file or open and read a game position from a file. You should add two new JButtons to the GUI interface. Use the object serialization example as a model for your input and output streams.

    Challenge: In Unix systems, there’s a program named grep that can list the lines in a text file containing a certain string. Write a Java version of this program that prompts the user for the name of the file and the string to search for.

    Challenge: Write a program in Java named Copy to copy one file into another. The program should prompt the user for two file names, filename1 and filename2. Both filename1 and filename2 must exist or the program should throw a FileNotFoundException. Although filename1 must be the name of a file (not a directory), filename2 may be either a file or a directory. If filename2 is a file, then the program should copy filename1 to filename2. If filename2 is a directory, then the program should simply copy filename1 into filename2. That is, it should create a new file with the name filename1 inside the filename2 directory, copy the old file to the new file, and then delete the old file.


    This page titled 11.8: Special Topic- Databases and Personal Privacy is shared under a CC BY 4.0 license and was authored, remixed, and/or curated by Ralph Morelli & Ralph Wade via source content that was edited to the style and standards of the LibreTexts platform; a detailed edit history is available upon request.