11.2: Merkle-Damgård Construction

Last updated

Dec 27, 2022


Building a hash function, especially one that accepts inputs of arbitrary length, seems like a challenging task. In this section, we’ll see one approach for constructing hash functions, called the Merkle-Damgård construction.

Instead of a full-fledged hash function, imagine that we had a collision-resistant function whose inputs were of a single fixed length, but longer than its outputs. In other words, $h:\{0,1\}^{n+t} \rightarrow\{0,1\}^{n}$, where $t>0$. We call such an $h$ a compression function. This is not compression in the usual sense of the word: we are not concerned about recovering the input from the output. We call it a compression function because it "compresses" its input by $t$ bits (analogous to how a pseudorandom generator "stretches" its input by some amount).

The following construction is one way to build a full-fledged hash function (supporting inputs of arbitrary length) out of such a compression function:

Construction 11.2 (Merkle-Damgård)

Let $h:\{0,1\}^{n+t} \rightarrow\{0,1\}^{n}$ be a compression function. Then the Merkle-Damgård transformation of $h$ is the function $\mathrm{MD}_{h}:\{0,1\}^{*} \rightarrow\{0,1\}^{n}$ computed as outlined below.

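The idea of the Merkle-Damgård construction is to split the input $x$ into blocks of size $t$. The end of the string is filled out with $0$s if necessary. A final block called the "padding block" is added, which encodes the (original) length of $x$ in binary. Writing the resulting blocks as $x_{1}, \ldots, x_{k+1}$, the hash is computed by chaining: starting from a fixed value $y_{0}$, compute $y_{i}=h\left(y_{i-1} \| x_{i}\right)$ for $i=1, \ldots, k+1$, and output $y_{k+1}$.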
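The padding and chaining steps can be sketched in Python. This is a minimal illustration under several assumptions that are not part of the text: bit strings are represented as Python strings of `'0'` and `'1'`, the compression function `h` is a toy stand-in obtained by truncating SHA-256 to $n=32$ bits (with $t=16$), and the initialization vector is taken to be the all-zero string. The names `md_pad` and `md_hash` are hypothetical.

```python
import hashlib

N = 32  # output length n of the toy compression function, in bits
T = 16  # block length t, in bits


def h(block: str) -> str:
    """Toy compression function {0,1}^(n+t) -> {0,1}^n.

    Assumption for illustration only: truncated SHA-256 stands in for h.
    """
    assert len(block) == N + T and set(block) <= {"0", "1"}
    data = int(block, 2).to_bytes((N + T) // 8, "big")
    digest = hashlib.sha256(data).digest()
    return format(int.from_bytes(digest[: N // 8], "big"), f"0{N}b")


def md_pad(x: str, t: int = T) -> str:
    """Fill x with 0s up to a multiple of t bits, then append |x| as a t-bit block."""
    length = len(x)
    if length >= 2 ** t:
        raise ValueError("simplified version: |x| must fit into t bits")
    while len(x) % t != 0:
        x += "0"
    return x + format(length, f"0{t}b")


def md_hash(x: str) -> str:
    """Chain h over the padded blocks: y_i = h(y_{i-1} || x_i)."""
    padded = md_pad(x)
    blocks = [padded[i:i + T] for i in range(0, len(padded), T)]
    y = "0" * N  # y_0, the IV; the all-zero value is an assumption
    for block in blocks:
        y = h(y + block)
    return y
```

For a 40-bit input, `md_pad` appends 8 zero bits and one 16-bit length block, so `md_hash` makes four calls to `h`.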

Example

Suppose we have a compression function $h:\{0,1\}^{48} \rightarrow\{0,1\}^{32}$ , so that $t=16$ . We build a Merkle-Damgård hash function out of this compression function and wish to compute the hash of the following 5-byte (40-bit) string:

We must first pad $x$ appropriately, that is, compute $\operatorname{MDPAD}(x)$:

- Since $|x|=40$ is not a multiple of $t=16$, we need to append 8 bits (all $0$s) to make it so.
- Since $|x|=40$, we need to add an extra 16-bit block that encodes the number 40 in binary ($101000$, padded on the left with $0$s to 16 bits).

After this padding, and splitting the result into blocks of length 16, we have the following:
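As a rough check of this padding arithmetic, the following sketch pads a 40-bit string and splits it into 16-bit blocks. The input `x` is a hypothetical placeholder chosen for illustration, not the string from the example, and bit strings are again modeled as Python strings of `'0'` and `'1'`.

```python
T = 16  # block length t from the example

# Hypothetical 40-bit input (placeholder, not the example's actual string).
x = "10110011" * 5

zeros_needed = (-len(x)) % T              # bits needed to reach a multiple of t
length_block = format(len(x), f"0{T}b")   # |x| = 40 encoded as a 16-bit block
padded = x + "0" * zeros_needed + length_block
blocks = [padded[i:i + T] for i in range(0, len(padded), T)]

print(zeros_needed)   # 8
print(length_block)   # 0000000000101000
print(len(blocks))    # 4
```

The last block is the padding block, and the block before it ends with the 8 appended zero bits.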

The final hash of $x$ is computed as follows:

We are presenting a simplified version, in which $\mathrm{MD}_{h}$ accepts inputs whose maximum length is $2^{t}-1$ bits (the length of the input must fit into $t$ bits). By using multiple padding blocks (when necessary) and a suitable encoding of the original string length, the construction can be made to accommodate inputs of arbitrary length (see the exercises).

The value $y_{0}$ is called the initialization vector (IV), and it is a hard-coded part of the algorithm.

As discussed above, we will not be making provable security claims using the library-style definitions. However, we can justify the Merkle-Damgård construction with the following claim:

Claim 11.3

Suppose $h$ is a compression function and $\mathrm{MD}_{h}$ is the Merkle-Damgård construction applied to $h$. Given a collision $x, x^{\prime}$ in $\mathrm{MD}_{h}$, it is easy to find a collision in $h$. In other words, if it is hard to find a collision in $h$, then it must also be hard to find a collision in $\mathrm{MD}_{h}$.

Proof

Suppose that $x, x^{\prime}$ are a collision under $\mathrm{MD}_{h}$. Define the values $x_{1}, \ldots, x_{k+1}$ and $y_{1}, \ldots, y_{k+1}$ as in the computation of $\mathrm{MD}_{h}(x)$. Similarly, define $x_{1}^{\prime}, \ldots, x_{k^{\prime}+1}^{\prime}$ and $y_{1}^{\prime}, \ldots, y_{k^{\prime}+1}^{\prime}$ as in the computation of $\mathrm{MD}_{h}\left(x^{\prime}\right)$. Note that, in general, $k$ may not equal $k^{\prime}$.

Recall that:

$\begin{gathered} \mathrm{MD}_{h}(x)=y_{k+1}=h\left(y_{k} \| x_{k+1}\right) \\ \mathrm{MD}_{h}\left(x^{\prime}\right)=y_{k^{\prime}+1}^{\prime}=h\left(y_{k^{\prime}}^{\prime} \| x_{k^{\prime}+1}^{\prime}\right) \end{gathered}$

Since we are assuming $\mathrm{MD}_{h}(x)=\mathrm{MD}_{h}\left(x^{\prime}\right)$, we have $y_{k+1}=y_{k^{\prime}+1}^{\prime}$. We consider two cases:

Case 1: If $|x| \neq\left|x^{\prime}\right|$, then the padding blocks $x_{k+1}$ and $x_{k^{\prime}+1}^{\prime}$, which encode $|x|$ and $\left|x^{\prime}\right|$, are not equal. Hence we have $y_{k} \| x_{k+1} \neq y_{k^{\prime}}^{\prime} \| x_{k^{\prime}+1}^{\prime}$. These two distinct inputs have the same output under $h$ (namely $y_{k+1}=y_{k^{\prime}+1}^{\prime}$), so $y_{k} \| x_{k+1}$ and $y_{k^{\prime}}^{\prime} \| x_{k^{\prime}+1}^{\prime}$ are a collision under $h$ and we are done.

Case 2: If $|x|=\left|x^{\prime}\right|$, then $x$ and $x^{\prime}$ are broken into the same number of blocks, so $k=k^{\prime}$. Let us work backwards from the final step in the computations of $\mathrm{MD}_{h}(x)$ and $\mathrm{MD}_{h}\left(x^{\prime}\right)$.

We know that: $h\left(y_{k} \| x_{k+1}\right)=y_{k+1}=y_{k+1}^{\prime}=h\left(y_{k}^{\prime} \| x_{k+1}^{\prime}\right)$. If $y_{k} \| x_{k+1}$ and $y_{k}^{\prime} \| x_{k+1}^{\prime}$ are not equal, then they are a collision under $h$ and we are done. Otherwise, the two inputs are equal, so in particular $y_{k}=y_{k}^{\prime}$, and we can apply the same logic again to $y_{k}$ and $y_{k}^{\prime}$.

More generally, if $y_{i}=y_{i}^{\prime}$, then either $y_{i-1} \| x_{i}$ and $y_{i-1}^{\prime} \| x_{i}^{\prime}$ are a collision under $h$ (and we say we are "lucky"), or else $y_{i-1} \| x_{i}=y_{i-1}^{\prime} \| x_{i}^{\prime}$, which means $y_{i-1}=y_{i-1}^{\prime}$ and $x_{i}=x_{i}^{\prime}$ (and we say we are "unlucky"). We start with the premise that $y_{k+1}=y_{k+1}^{\prime}$. Can we ever get "unlucky" every time, and not encounter a collision when propagating this logic back through the computations of $\mathrm{MD}_{h}(x)$ and $\mathrm{MD}_{h}\left(x^{\prime}\right)$? The answer is no, because encountering the unlucky case every time would imply that $x_{i}=x_{i}^{\prime}$ for all $i$. That is, the padded strings are identical, and since $|x|=\left|x^{\prime}\right|$ the same amount of padding was appended to each, so $x=x^{\prime}$. But this contradicts our original assumption that $x \neq x^{\prime}$. Hence we must encounter some "lucky" case and therefore a collision in $h$.
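The backwards walk in this proof can be sketched as a short procedure. The code below is an illustration under stated assumptions, not part of the text: bit strings are Python strings of `'0'` and `'1'`, the compression function `h` is a deliberately weak toy (SHA-256 truncated to $n=8$ bits, with $t=8$) so that a collision in $\mathrm{MD}_{h}$ can be found by brute force, the IV is all zeros, and all function names are hypothetical.

```python
import hashlib

N, T = 8, 8  # tiny toy parameters so that collisions are easy to find


def h(block: str) -> str:
    """Toy compression function {0,1}^(n+t) -> {0,1}^n (assumption: truncated SHA-256)."""
    data = int(block, 2).to_bytes((N + T) // 8, "big")
    return format(hashlib.sha256(data).digest()[0], f"0{N}b")


def md_chain(x: str):
    """Return every input y_{i-1} || x_i passed to h, plus the final hash y_{k+1}."""
    padded = x + "0" * (-len(x) % T) + format(len(x), f"0{T}b")
    y = "0" * N  # y_0 (all-zero IV, an assumption)
    inputs = []
    for i in range(0, len(padded), T):
        inputs.append(y + padded[i:i + T])
        y = h(inputs[-1])
    return inputs, y


def find_md_collision():
    """Brute-force two distinct 16-bit strings with the same MD hash."""
    seen = {}
    for i in range(2 ** 16):
        x = format(i, "016b")
        digest = md_chain(x)[1]
        if digest in seen:
            return seen[digest], x
        seen[digest] = x
    raise AssertionError("unreachable: only 2**N digests exist")


def extract_h_collision(x: str, x2: str):
    """Walk backwards from the final call to h until the two inputs differ."""
    ins, out = md_chain(x)
    ins2, out2 = md_chain(x2)
    assert x != x2 and out == out2
    for a, b in zip(reversed(ins), reversed(ins2)):
        if a != b:        # "lucky": distinct inputs with the same output
            return a, b
        # "unlucky": a == b, so the previous chaining values agree; keep going
    raise AssertionError("unreachable: would imply x == x2")


x, x2 = find_md_collision()
a, b = extract_h_collision(x, x2)
print(a != b, h(a) == h(b))  # True True
```

When the two inputs have different lengths, the loop returns on its first iteration, because the padding blocks already differ (Case 1). Otherwise it performs the step-by-step search of Case 2.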
