# 11.2: Merkle-Damgård Construction

- Page ID
- 7391

In this section we’ll see a simple application of collision resistance. It is a common theme in cryptography, that instead of dealing with large data it is often sufficient to deal with only a hash of that data. This theme is true in particular in the context of MACs.

One particularly simple way to construct a secure MAC is to use a PRF directly as a MAC. However, PRFs (and in particular, block ciphers) often have a short fixed input length, making them suitable only for MACs of such short messages. To extend such a MAC to longer inputs, it suffices to compute a MAC of the hash of the data. This idea is formalized in the following construction:

**Construction 12.2: Hash-Then-MAC**

*Let M be a MAC scheme with message space* ? = {0,1}* ^{n} and let* ℋ

*be a hash family with output length n. Then*

**hash-then-MAC**(HtM) refers to the following MAC scheme:**Claim 12.3**

*Construction 12.2 is a secure MAC, if* ℋ

*is collision-resistant and M is a secure MAC.*

**Proof**

We prove the security of *HtM* using a standard hybrid approach.

The starting point is ℒ^{HtM}_{mac-real}, shown here with the details of *HtM* filled in. Our goal is to eventually reach ℒ_{mac-fake}, where the VER subroutine returns false unless (*m*,*t*) was generated by the GETMAC subroutine.

We have applied the MAC security of *M*, omitting the usual details (factor out, swap libraries, inline). Now VER returns false unless (*H*(*m*),*t*) ∈ ?.

Here we have applied the security of the hash family ℋ. We have factored out all calls to *H* in terms of ℒ^{ℋ}_{cr-real}, replaced ℒ_{cr-real} with ℒ_{cr-fake}, and then inlined.

Now we have simply added another set ?’ which is never actually used. We point out two things: First, in GETMAC, (*y*,*t*) added to ? if and only if (*H*^{−1} [*y*],*t*) is added to ?’. Second, if the last line of VER is reached, then the library has not self destructed. So *H*^{−1} [*y*] = *m*, and in fact *H*^{−1} [*y*] has never been defined to be anything else. This means the last line of VER is equivalent to:

(*y*,*t*) ∈ ? ⇔ (*H*^{−1} [*y*],*t*) ∈ ?’ ⇔ (*m*,*t*) ∈ ?’

We have replaced the condition (*H*(*m*),*t*) ∈ ? with (*m*,*t*) ∈ ?’ . But we just argued that these statements are logically equivalent within the library

We remove the variable *H*^{−1} and **self destruct** statements via a standard sequence of changes (*i.e.*, factor out in terms of ℒ_{cr-fake}, replace with ℒ_{cr-real}, inline). We also remove the now-unused variable ?. The result is ℒ^{HtM}_{mac-fake}, as desired.

The next-to-last hybrid is the key step where collision-resistance comes into our reasoning. Indeed, a collision would break down the argument. If the adversary manages to find a collision *y* = *H*(*m*) = *H*(*m’*), then it could be that (*y,t*) ∈ ? and (*m,t*) ∈ ?’ but (*m’,t*) ∉ ?. This corresponds to a forgery of *HtM* in a natural way: Ask for a MAC of *m*, which is *t* = MAC(*k,H*(*m*)); then *t* is also a valid MAC of *m’*.