6.2.4: Problem Sets 31-32

Last updated
Save as PDF

Page ID: 76327

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

Problem Set 31: Confidential Bitdiddler\(^*\)

\(^*\) Credit for developing this problem set goes to Sam Madden.

(Chapter 5)

2007–3–16

Ben Bitdiddle is designing a file system for a new handheld computer, the Bitdiddler, which is designed to be especially simple and has a file system without directories. The disk is physically partitioned into three regions: an inode list, a free list, and a collection of 4K data blocks, much like the UNIX file system. Unlike in the UNIX file system, each inode contains the name of the file it corresponds to, as well as a bit indicating whether or not the inode is in use. Like the UNIX file system, the inode also contains a list of blocks that compose the file, as well as metadata about the file, including permission bits, its length in bytes, and modification and creation timestamps. The free list is a bitmap, with one bit per data block indicating whether that block is free or in use. There are no indirect blocks in Ben's file system. The file system provides six primary calls: CREATE, OPEN, READ, WRITE, CLOSE, and UNLINK. Ben implements all six correctly and in a straightforward way. All updates to the disk are synchronous; that is, when a call to write a block of data to the disk returns, that block is definitely installed on the disk. Individual block writes are atomic.

Ben stores many files in the file system on this handheld computer, and runs out of disk space quickly. He looks at the blocks on the disk and discovers that many blocks have the same content. To reduce space consumption he augments the file system implementation as follows:

The file system keeps a table in memory that records for each allocated block a 32-bit non-cryptographic hash of that block. (When the file system starts, it computes this table from the on-disk state.) Ben talks to a hashing expert, who tells Ben to use the b -bit (here b = 32) non-cryptographic hash function H(block) = block modulo P where P is a large b -bit prime number that yields a uniform distribution of hash values throughout the interval [0... 2b- 1].
When the file system writes a block b for a file, it checks if the table contains a block number d whose block content on disk has the same hash value as the hash value for block b . If so, the file system frees b and inserts d into the file’s inode. If there is no block d , the file system writes b to the disk, and puts b 's block number and its hash in the table.

To keep things simple, let's ignore what happens when a user unlinks a file.

Q31.1 Occasionally, Ben finds that his system has files with incorrect contents. He suspects hash collisions are to blame. These might be caused by:

A. Accidental collisions: different data blocks hash to the same 32-bit value.
B. Engineered collisions: adversaries can fabricate blocks that hash to the same 32 bit value.
C. A block whose hash is the same as its block number.

Q31.2 For each of the following proposed fixes, list which of the problem causes listed in Question 31.1 (A, B, or C) it is likely to fix:

A. Use a b = 160-bit non-cryptographic hash in step A of the algorithm.
B. Use a 160-bit cryptographic hash such as SHA-1 in step A of the algorithm.
C. Modify step B of the algorithm so that when a matching hash is found, it compares the contents of the stored block to the data block and treats the blocks as different unless their contents match.

Ben decides he wants to encrypt the contents of the files on disk so that if someone steals his handheld computer, they cannot read the files stored on it. Ben considers two encryption plans:

User-key encryption: One plan is to give each user a different key and use a secure block encryption scheme with no cryptographic vulnerabilities to encrypt the user’s files. Ben implements this by storing a table of (user name, key) pairs, which the system stores securely on disk.
Convergent encryption: One problem with user-key encryption is that it doesn't provide the space saving if blocks in different files of different users have the same content. To address this problem, Ben proposes to use convergent encryption (also called "content hash keying"), which encrypts a block using a cryptographic hash of the content of that block as a shared-secret key (that is, ENCRYPT (block, HASH (block)) . Ben reasons that since the output of the cryptographic hash is pseudorandom, this is just as good as choosing a fresh random key. Ben implements this scheme by modifying the file system to use the table of hash values as before, but now the file system writes encrypted blocks to the disk instead of plaintext ones. This way blocks are encrypted but, because duplicate blocks have the same hash and thus encrypt to the same ciphertext, Ben still gets the space savings for blocks with the same content. The file system maintains a secure table of block hash values so that it can decrypt blocks when an authorized user requests a read operation.

Q31.3 Which of the following statements are true of convergent encryption?

A. If Alyssa can guess the contents of a block (by enumerating all possibilities, or by guessing based on the file metadata, etc), it is easy for her to verify whether her guess of a block's data is correct.
B. If Alyssa can discover the 32-bit block numbers referenced by inodes in the file system, she can learn something about the contents of Ben's files.
C. The file system can detect when an adversary changes the content of a block on disk.

Q31.4 Which of the following statements are true of user-key encryption?

A. If Alyssa can guess the contents of a block but doesn't know Ben's key, it is easy for her to verify whether her guess of a block's data is correct.
B. If Alyssa can discover the 32-bit block numbers referenced by inodes in the file system, she can learn something about the contents of Ben's files.
C. The file system can detect when an adversary changes the content of a block on disk.

Problem Set 32: Beyond Stack Smashing\(^*\)

\(^*\) Credit for developing this problem set goes to Lewis D. Girod. This problem set was inspired by a paper by Jonathan Pincus and Brandon Baker, "Beyond stack smashing: recent advances in exploiting buffer overruns," IEEE Security & Privacy 2, 4 (July/August 2004) pages 20–27. Further details and explanations can be found in that paper.

(Chapter 5)

2008–3-8

You are hired by a well-known OS vendor to help them defend their products against buffer overrun attacks of the kind described in Sidebar 5.2.4. Their team presents several proposed strategies to foil buffer exploits:

Random stack: Place the stack in an area of memory randomly chosen for each new process, rather than at the same address for every process.
Non-executable stack: Set the permissions on the virtual memory containing the stack to allow reading and writing but not execution as a program. Set the permissions on the memory containing the program instructions to read and execute but not write.
Bounds checking: Use a language such as Java or Scheme that checks that all array/buffer indices are valid.

You are aware of several buffer overrun attacks, including the following:

Simple buffer overrun: The victim program has an array on the stack as follows:

procedure VICTIM (data, length) integer buffer[100] COPY (buffer, data, length) // overruns array buffer if length > 100 // ....

The attacker supplies a length > 100 together with an array data that includes some new instructions and places the address of the first instruction in the position where the procedure return is stored. When VICTIM reaches its return , it returns to the attacker's code in the stack rather than the program that originally called VICTIM .
Trampoline: The victim program has an array on the stack as in the code fragment above, but the attacker cannot predict its address, so replacing the procedure return address with the address of the attacker's code won't work. However, the attacker knows that subroutine VICTIM () leaves an address in some register (say R5 ) that points to a known, fixed offset within the array. The other thing that is needed is an instruction anywhere in memory at a known address x that jumps to wherever R5 is pointing. The attacker overruns the array with his new instructions and overwrites the procedure return address with the address x . When VICTIM reaches its return, it returns to address x , which jumps to the address in R5 , which transfers control to the attacker's code.
Arc injection (return-to-library): Taking advantage again of knowledge that VICTIM leaves the address of a fixed offset within the array in register R5 , the attacker provides some carefully selected data at that offset and also overruns the buffer with a new procedure return address. The new procedure return address is chosen to be in some system or library program known to reside elsewhere in the current address space, preferably to a place within that library program after it has checked the validity of its parameters, and is about to do something using the contents of register R5 as the address of one of its parameters. A particularly good library program to jump into is one that calls a procedure whose string name is supplied as an argument. The attacker's carefully selected data is chosen to be the string name of an existing program that the attacker would like to execute.

In the following questions, an attack is considered prevented if the attacker can no longer execute the intended malicious code, even if an overflow can still overwrite data or crash or disrupt the program.

Q32.1 Which of the following attack methods are prevented by the use of the random stack technique?

A. Simple buffer overrun
B. Trampoline
C. Arc injection (return-to-library)

Q32.2 Which of the following attack methods are prevented by the use of the non-executable stack technique?

A. Simple buffer overrun
B. Trampoline
C. Arc injection (return-to-library)

Q32.3 Which of the following attack methods are prevented by the use of the bounds checking technique?

A. Simple buffer overrun
B. Trampoline
C. Arc injection (return-to-library)