# 11.4: Exercises

- Page ID
- 7393

Sometimes when I verify an MD5 hash visually, I just check the first few and the last few hex digits, and don’t really look at the middle of the hash.

Generate two files with opposite meanings, whose MD5 hashes agree in their first 16 bits (4 hex digits) and in their last 16 bits (4 hex digits). It could be two text files that say opposite things. It could be an image of Mario and an image of Bowser. I don’t know, be creative.

As an example, the strings "subtitle illusive planes" and "wantings premises forego" actually agree in the first 20 and last 20 bits (first and last 5 hex digits) of their MD5 hashes, but it’s not clear that they’re very meaningful.

Describe how you generated the files, and how many MD5 evaluations you had to make.

Let \(h:\{0,1\}^{n+t} \rightarrow\{0,1\}^{n}\) be a fixed-length compression function. Suppose we forgot a few of the important features of the Merkle-Damgård transformation, and construct a hash function \(H\) from \(h\) as follows:

- Let \(x\) be the input.
- Split \(x\) into pieces \(y_{0}, x_{1}, x_{2}, \ldots, x_{k}\), where \(y_{0}\) is \(n\) bits, and each \(x_{i}\) is \(t\) bits. The last piece \(x_{k}\) should be padded with zeroes if necessary.
- For \(i=1\) to \(k\), set \(y_{i}=h\left(y_{i-1} \| x_{i}\right)\).
- Output \(y_{k}\)

Basically, it is similar to the Merkle-Damgård except we lost the IV and we lost the final padding block.

- Describe an easy way to find two messages that are broken up into the same number of pieces, which have the same hash value under \(H\).
- Describe an easy way to find two messages that are broken up into different number of pieces, which have the same hash value under \(H\).

**Hint:**

Neither of your collisions above should involve finding a collision in \(h\).

I’ve designed a hash function \(H:\{0,1\}^{*} \rightarrow\{0,1\}^{n}\). One of my ideas is to make \(H(x)=x\) if \(x\) is an \(n\)-bit string (assume the behavior of \(H\) is much more complicated on inputs of other lengths). That way, we know with certainty that there are no collisions among \(n\)-bit strings. Have I made a good design decision?

Same as above, but now if \(x\) is \(n\) bits long, then \(H(x)=x \oplus m\), where \(m\) is a fixed, public string. Can this be a good hash function?

Let \(H\) be a hash function and let \(t\) be a fixed constant. Define \(H^{(t)}\) as: \[H^{(t)}(x)=\underbrace{H(\cdots H(H(x)) \cdots)}_{t \text { times }}\] Show that if you are given a collision under \(H^{(t)}\) then you can efficiently find a collision under \(H\).

In this problem, if \(x\) and \(y\) are strings of the same length, then we write \(x \sqsubseteq y\) if \(x=y\) or \(x\) comes before \(y\) in standard dictionary ordering.

Suppose a function \(H:\{0,1\}^{*} \rightarrow\{0,1\}^{n}\) has the following property. For all strings \(x\) and \(y\) of the same length, if \(x \sqsubseteq y\) then \(H(x) \sqsubseteq H(y)\). Show that \(H\) is not collision resistant (describe how to efficiently find a collision in such a function).

**Hint:**

Suppose a function \(H:\{0,1\}^{*} \rightarrow\{0,1\}^{n}\) has the following property. For all strings \(x\) and \(y\) of the same length, \(H(x \oplus y)=H(x) \oplus H(y)\). Show that \(H\) is not collision resistant (describe how to efficiently find a collision in such a function).

Let \(H\) be a salted hash function with \(n\) bits of output, and define the following function:

Note that \(H^{*}\) can take inputs of any length \((k)\). Show how to find collisions in \(H^{*}\) when \(k>n\)

Generalize the Merkle-Damgård construction so that it works for arbitrary input lengths (and arbitrary values of \(t\) in the compression function). Extend the proof of Claim \(11.3\) to your new construction.

Let \(F\) be a secure PRF with \(n\)-bit inputs, and let \(H\) be a collision-resistant (salted) hash function with \(n\)-bit outputs. Define the new function \(F^{\prime}((k, s), x)=F(k, H(s, x))\), where we interpret \((k, s)\) to be its key. Prove that \(F^{\prime}\) is a secure PRF with arbitrary-length inputs.

Let MAC be a secure MAC algorithm with \(n\)-bit inputs, and let \(H\) be a collision-resistant (salted) hash function with \(n\)-bit outputs. Define the new function \(\operatorname{MAC}^{\prime}((k, s), x)=\) \(\operatorname{MAC}(k, H(s, x))\), where we interpret \((k, s)\) to be its key. Prove that MAC \({ }^{\prime}\) is a secure \(M A C\) with arbitrary-length inputs.

More exotic issues with the Merkle-Damgård construction:

(a) Let \(H\) be a hash function with \(n\)-bit output, based on the Merkle-Damgård construction. Show how to compute (with high probability) 4 messages that all hash to the same value under \(H\), using only \(\sim 2 \cdot 2^{n / 2}\) calls to \(H\).

**Hint: **

(b) Show how to construct \(2^{d}\) messages that all hash to the same value under \(H\), using only \(O\left(d \cdot 2^{n / 2}\right)\) evaluations of \(H\).

(c) Suppose \(H_{1}\) and \(H_{2}\) are (different) hash functions, both with \(n\)-bit output. Consider the function \(H^{*}(x)=H_{1}(x) \| H_{2}(x)\). Since \(H^{*}\) has \(2 n\)-bit output, it is tempting to think that finding a collision in \(H^{*}\) will take \(2^{(2 n) / 2}=2^{n}\) effort.

However, this intuition is not true when \(H_{1}\) is a Merkle-Damgård hash. Show that when \(H_{1}\) is Merkle-Damgård, then it is possible to find collisions in \(H^{*}\) with only \(O\left(n 2^{n / 2}\right.\) ) effort. The attack should assume nothing about \(H_{2}\) (i.e., \(H_{2}\) need not be Merkle-Damgård)

**Hint:**

Let \(H\) be a collision-resistant hash function with output length \(n\). Let \(H^{*}\) denote iterating \(H\) in a manner similar to \(\mathrm{CBC}-\mathrm{MAC}\):

Show that \(H^{*}\) is** not** collision-resistant. Describe a successful attack.

Show that a bare PRP is not collision resistant. In other words, if \(F\) is a secure PRP, then show how to efficiently find collisions in \(H(x \| y)=F(x, y)\).

Show that the CBC-MAC construction applied to a PRP is not collision-resistant. More precisely, let \(F\) be a secure PRP. Show how to efficiently find collisions in the following salted hash function \(H\) :

Here we are interpreting \(k\) as the salt. This is yet another example of how collisionresistance is different than authenticity (MAC).

Let \(H:\{0,1\}^{\lambda} \rightarrow\{0,1\}^{\lambda}\) be any function, and define the following function \(H^{*}:\) \(\{\theta, 1\}^{2 \lambda} \rightarrow\{0,1\}^{\lambda}\)

Show how to succeed in an efficient second-preimage attack on \(H^{*}\).

Adding a plain hash to a plaintext does not result in CCA security. Consider the following approach for encryption, that uses a plain (unsalted) hash function \(H\). To encrypt plaintext \(m\), simply encrypt \(m \| H(m)\) under CTR mode. To decrypt, use normal CTR mode decryption but return err if the plaintext does not have the form \(m \| H(m)(\) i.e., if the last \(n\) bits are not a hash of the rest of the CTR-plaintext).

Show that the scheme does not have CCA security.

In the discussion of length-extension attacks, we noted that a natural way to stop them is to "do something different" for the last block of Merkle-Damgård. Suppose after performing the final call to \(h\) in Merkle-Damgård, we complement the value \(\left(y_{k+1}\right)\). Does this modified scheme still have length-extension properties?