10.3: MACs for Long Messages
- Page ID
- 7364
\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)
\( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)
( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)
\( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)
\( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)
\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)
\( \newcommand{\Span}{\mathrm{span}}\)
\( \newcommand{\id}{\mathrm{id}}\)
\( \newcommand{\Span}{\mathrm{span}}\)
\( \newcommand{\kernel}{\mathrm{null}\,}\)
\( \newcommand{\range}{\mathrm{range}\,}\)
\( \newcommand{\RealPart}{\mathrm{Re}}\)
\( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)
\( \newcommand{\Argument}{\mathrm{Arg}}\)
\( \newcommand{\norm}[1]{\| #1 \|}\)
\( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)
\( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)
\( \newcommand{\vectorA}[1]{\vec{#1}} % arrow\)
\( \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow\)
\( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vectorC}[1]{\textbf{#1}} \)
\( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)
\( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)
\( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)
\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)
\( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)
\(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)Using a PRF as a MAC is useful only for short, fixed-length messages, since most PRFs that exist in practice are limited to such inputs. Can we somehow extend a PRF to construct a MAC scheme for long messages, similar to how we used block cipher modes to construct encryption for long messages?
How NOT to do it
To understand the challenges of constructing a MAC for long messages, we first explore some approaches that don’t work. The things that can go wrong in an insecure MAC are quite different in character to the things that can go wrong in a block cipher mode, so pay attention closely!
Let \(F\) be a PRF with in \(=\) out \(=\lambda\). Below is a MAC approach for messages of length \(2 \lambda\). It is inspired by ECB mode, so you know it’s going to be a disaster:
One problem with this approach is that, although the PRF authenticates each block \(m_{1}, m_{2}\) individually, it does nothing to authenticate that \(m_{1}\) is the first block but \(m_{2}\) is the second one Translating this observation into an attack, an adversary can ask for the \(M A C\) tag of \(m_{1} \| m_{2}\) and then predict/forge the tag for \(m_{2} \| m_{1}\) :
When \(\mathcal{A}\) is linked to \(\mathcal{L}_{\text {mac-real, }}\), it always return true, since we can tell that \(t_{2} \| t_{1}\) is indeed the valid tag for \(1^{\lambda} \| 0^{\lambda}\). When \(\mathcal{A}\) is linked to \(\mathcal{L}_{\text {mac-fake, }}\), it always return false, since the calling program never called GETTAG with input \(1^{\lambda} \| 0^{\lambda}\). Hence, \(\mathcal{A}\) distinguishes the libraries with advantage \(1 .\)
This silly MAC construction treats both \(m_{1}\) and \(m_{2}\) identically, and an obvious way to try to fix the problem is to treat the different blocks differently somehow:
Let \(F\) be a PRF with in \(=\lambda+1\) and out \(=\lambda\). Below is another MAC approach for messages of length \(2 \lambda\) :
This MAC construction does better, as it treats the two message blocks \(m_{1}\) and \(m_{2}\) differently. Certainly the previous attack of swapping the order of \(m_{1}\) and \(m_{2}\) doesn’t work anymore. (Can you see why?)
The construction authenticates (in some sense) the fact that \(m_{1}\) is the first message block, and \(m_{2}\) is the second block. However, this construction doesn’t authenticate the fact that this particular \(m_{1}\) and \(m_{2}\) belong together. More concretely, we can "mix and match" blocks of the tag corresponding to different messages:
In this attack, we combine the \(t_{1}\) block from the first tag and the \(t_{2}\) block from the second tag
We are starting to see the challenges involved in constructing a MAC scheme for long messages. A secure MAC should authenticate each message block, the order of the message blocks, and the fact that these particular message blocks are appearing in a single message. In short, it must authenticate the entirety of the message.
Think about how authentication is significantly different than privacy/hiding in this respect. At least for CPA security, we can hide an entire plaintext by hiding each individual piece of the plaintext separately (encrypting it with a CPA-secure encryption). Authentication is fundamentally different.
How to do it: CBC-MAC
We have seen some insecure ways to construct a MAC for longer messages. Now let’s see a secure way. A common approach to constructing a MAC for long messages involves the CBC block cipher mode.
Let \(F\) be a \(P R F\) with in \(=\) out \(=\lambda . C B C-M A C\) refers to the following \(M A C\) scheme:
Unlike CBC encryption, CBC-MAC uses no initialization vector (or, you can think of it as using the all-zeroes IV), and it outputs only the last block.
If \(F\) is a secure PRF with in \(=\) out \(=\lambda\), then for any fixed \(\ell, C B C\) - \(M A C\) is a secure \(M A C\) when used with message space \(\mathcal{M}=\{0,1\}^{\lambda \ell}\).
Pay close attention to the security statement. It says that if you only ever authenticate 4-block messages, CBC-MAC is secure. If you only ever authenticate 24-block messages, \(\mathrm{CBC}-\mathrm{MAC}\) is secure. However, if you want to authenticate both 4-block and 24-block messages (i.e., under the same key), then CBC-MAC is not secure. In particular, seeing the CBC-MAC of several 4-block messages allows an attacker to generate a forgery of a 24-block message. The exercises explore this property.
More Robust CBC-MAC
If \(C B C-M A C\) is so fragile, is there a way to extend it to work for messages of mixed lengths? One approach is called ECBC-MAC, and is shown below. It works by treating the last block differently - specifically, it uses an independent PRF key for the last block in the CBC chain.
Let \(F\) be a PRF with in \(=\) out \(=\lambda .\) ECBC-MAC refers to the following scheme:
If \(F\) is a secure PRF with in \(=\) out \(=\lambda\), then \(E C B C-M A C\) is a secure MAC for message space \(\mathcal{M}=\left(\{0,1\}^{\lambda}\right)^{*}\)
In other words, \(\mathrm{ECBC}-\mathrm{MAC}\) is safe to use with messages of any length (that is a multiple of the block length). To extend ECBC-MAC to messages of any length (not necessarily a multiple of the block length), one can use a padding scheme as in the case of encryption. \({ }^{1}\)
\({ }^{1}\) Note that if the message is already a multiple of the block length, then padding adds an extra block. There exist clever ways to avoid an extra padding block in the case of MACs, which we don’t discuss further.