4.3: Indistinguishability

Last updated
Save as PDF

Page ID: 7336

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

So far we have been writing formal security definitions in terms of interchangeable libraries, which requires that two libraries have exactly the same effect on every calling program. Going forward, our security definitions will not be quite as demanding. First, we only consider polynomial-time calling programs; second, we don’t require the libraries to have exactly the same effect on the calling program, only that the difference in effects is negligible.

Definition \(4.5\) (Indistinguishable)

Let \(\mathcal{L}_{\text {left }}\) and \(\mathcal{L}_{\text {right }}\) be two libraries with a common interface. We say that \(\mathcal{L}_{\text {left }}\) and \(\mathcal{L}_{\text {right }}\) are indistinguishable, and write \(\mathcal{L}_{\text {left }} \approx \mathcal{L}_{\text {right }}\), if for all polynomial-time programs \(\mathcal{A}\) that output a single bit, \(\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1\right] \approx \operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1\right]\)

We call the quantity \(\left|\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1\right]-\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1\right]\right|\) the advantage or bias of \(\mathcal{A}\) in distinguishing \(\mathcal{L}_{\text {left }}\) from \(\mathcal{L}_{\text {right }}\). Two libraries are therefore indistinguishable if all polynomial-time calling programs have negligible advantage in distinguishing them.

From the properties of the " \(\approx\) symbol, we can see that indistinguishability of libraries is also transitive, which allows us to carry out hybrid proofs of security in the same way as before.

Example

Here is a very simple example of two indistinguishable libraries:

Imagine the calling program trying to predict which string will be chosen when uniformly sampling from \(\{0,1\}^{\lambda}\). The left library tells the calling program whether its prediction was correct. The right library doesn’t even bother sampling a string, it just always says "sorry, your prediction was wrong."

Here is one obvious strategy (maybe not the best one, we will see) to distinguish these libraries. The calling program \(\mathcal{A}_{\text {obvious }}\) calls PREDICT many times and outputs 1 if it ever received true as a response. Since it seems like the argument to PREDICT might not have any effect, let’s just use the string of all-Os as argument every time.

\(-\mathcal{L}_{\text {right }}\) can never return true, so \(\operatorname{Pr}\left[\mathcal{A}_{\text {obvious }} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1\right]=0\).
In \(\mathcal{L}_{\text {left }}\) each call to PREDICT has an independent probability \(1 / 2^{\lambda}\) of returning true. So \(\operatorname{Pr}\left[\mathcal{A}_{\text {obvious }} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1\right]\) is surely non-zero. Actually, the exact probability is a bit cumbersome to write:
\[\begin{aligned} \operatorname{Pr}\left[\mathcal{A}_{\text {obvious }} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1\right] &=1-\operatorname{Pr}\left[\mathcal{A}_{\text {obvious }} \diamond \mathcal{L}_{\text {left }} \Rightarrow 0\right] \\ &=1-\operatorname{Pr}[\text { all } q \text { independent calls to pREDICT return false] }\\ &=1-\left(1-\frac{1}{2^{\lambda}}\right)^{q} \end{aligned}\]

Rather than understand this probability, we can just compute an upper bound for it. Using the union bound, we get:\[\begin{aligned} & \operatorname{Pr}\left[\mathcal{A}_{\text {obvious }} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1\right] \leqslant \operatorname{Pr}[\text { first call to PREDICT returns true] } \\ & +\operatorname{Pr}[\text { second call to PREDICT returns true }]+\cdots \\ & =q \frac{1}{2^{\lambda}} \end{aligned}\]

This is an overestimate of some probabilities (e.g., if the first call to PREDICT returns true, then the second call isn’t made). More fundamentally, \(q / 2^{\lambda}\) exceeds 1 when \(q\) is large. But nevertheless, \(\operatorname{Pr}\left[\mathcal{A}_{\text {obvious }} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1\right] \leqslant q / 2^{\lambda} .\)

We showed that \(\mathcal{A l}_{\text {obvious }}\) has non-zero advantage. This is enough to show that \(\mathcal{L}_{\text {left }} \equiv \mathcal{L}_{\text {right }}\).

We also showed that \(\mathcal{A}_{\text {obvious }}\) has advantage at most \(q / 2^{\lambda}\). Since \(\mathcal{A}_{\text {obvious }}\) runs in polynomial time, it can only make a polynomial number \(q\) of queries to the library, so \(q / 2^{\lambda}\) is negligible. However, this is not enough to show that \(\mathcal{L}_{\text {left }} \approx \mathcal{L}_{\text {right }}\) since it considers only a single calling program. To show that the libraries are indistinguishable, we must show that every calling program’s advantage is negligible.

In a few pages, we will prove that for any \(\mathcal{A}\) that makes \(q\) calls to PREDICT,

\[\left|\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\mathrm{left}} \Rightarrow 1\right]-\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1\right]\right| \leqslant \frac{q}{2^{\lambda}} .\]

For any polynomial-time \(\mathcal{A}\), the number \(q\) of calls to PREDICT will be a polynomial in \(\lambda\), making \(q / 2^{\lambda}\) a negligible function. Hence, \(\mathcal{L}_{\text {left }} \approx \mathcal{L}_{\text {right }} .\)

Other Properties

Lemma \(4.6\) \((\approx\) facts)

If \(\mathcal{L}_{1} \equiv \mathcal{L}_{2}\) then \(\mathcal{L}_{1} \approx \mathcal{L}_{2}\). Also, if \(\mathcal{L}_{1} \approx \mathcal{L}_{2} \approx \mathcal{L}_{3}\) then \(\mathcal{L}_{1} \approx \mathcal{L}_{3}\).

Analogous to Lemma \(2.11\), we also have the following library chaining lemma, which you are asked to prove as an exercise:

Lemma \(4.7\) (Chaining)

If \(\mathcal{L}_{\text {left }} \approx \mathcal{L}_{\text {right }}\) then \(\mathcal{L}^{*} \diamond \mathcal{L}_{\text {left }} \approx \mathcal{L}^{*} \diamond \mathcal{L}_{\text {right }}\) for any polynomial-time library \(\mathcal{L}^{*}\).

Bad-Event Lemma

A common situation is when two libraries are expected to execute exactly the same statements, until some rare & exceptional condition happens. In that case, we can bound an attacker’s distinguishing advantage by the probability of the exceptional condition.

More formally,

Lemma \(4.8\) (Bad events)

Let \(\mathcal{L}_{\text {left }}\) and \(\mathcal{L}_{\text {right }}\) be libraries that each define a variable named ’bad’ that is initialized to \(\quad 0 .\) If \(\mathcal{L}_{\text {left }}\) and \(\mathcal{L}_{\text {right }}\) have identical code, except for code blocks reachable only when bad \(=1\) (e.g., guarded by an "if bad \(=1\) "statement), then \[\left|\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1\right]-\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1\right]\right| \leqslant \operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \text { sets bad }=1\right] .\]

\(\star\) Proof

Fix an arbitrary calling program \(\mathcal{A}\). In this proof, we use conditional probabilites \({ }^{8}\) to isolate the cases where bad is changed to 1 . We define the following events:

\(-\mathcal{B}_{\text {left }}:\) the event that \(\mathcal{A} \diamond \mathcal{L}_{\text {left }}\) sets bad to 1 at some point.
\(-\mathcal{B}_{\text {right }}\) : the event that \(\mathcal{A} \diamond \mathcal{L}_{\text {right }}\) sets bad to 1 at some point.

We also write \(\overline{\mathcal{B}_{\text {left }}}\) and \(\overline{\mathcal{B}_{\text {right }}}\) to denote the corresponding complement events. From conditional probability, we can write:

\[\begin{aligned} \operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1\right]=\operatorname{Pr}[&\left.\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1 \mid \mathcal{B}_{\text {left }}\right] \operatorname{Pr}\left[\mathcal{B}_{\text {left }}\right] \\ &+\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1 \mid \overline{\mathcal{B}_{\text {left }}}\right] \operatorname{Pr}\left[\overline{\mathcal{B}_{\text {left }}}\right] \\ \operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1\right]=\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1 \mid \mathcal{B}_{\text {right }}\right] \operatorname{Pr}\left[\mathcal{B}_{\text {right }}\right] \\ &+\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1 \mid \overline{\mathcal{B}_{\text {right }}}\right] \operatorname{Pr}\left[\overline{\mathcal{B}_{\text {right }}}\right] \end{aligned}\] Our first observation is that \(\operatorname{Pr}\left[\mathcal{B}_{\text {left }}\right]=\operatorname{Pr}\left[\mathcal{B}_{\text {right }}\right]\). This is because at the time bad is changed to 1 for the first time, the library has only been executing instructions that are the same in \(\mathcal{L}_{\text {left }}\) and \(\mathcal{L}_{\text {right. }}\). In other words, the choice to set bad to 1 is determined by the same sequence of instructions in both libraries, so it occurs with the same probability in both libraries.

As a shorthand notation, we define \(p^{*} \stackrel{\operatorname{def}}{=} \operatorname{Pr}\left[\mathcal{B}_{\text {left }}\right]=\operatorname{Pr}\left[\mathcal{B}_{\text {right }}\right]\). Then we can write the advantage of \(\mathcal{A}\) as:

\(\operatorname{advantage}_{\mathcal{A}}=\left|\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1\right]-\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1\right]\right|\)

\[\begin{gathered} =\left|\begin{array}{c} \left(\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1 \mid \mathcal{B}_{\text {left }}\right] \cdot p^{*}+\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1 \mid \overline{\mathcal{B}_{\text {left }}}\right]\left(1-p^{*}\right)\right) \\ -\left(\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1 \mid \mathcal{B}_{\text {right }}\right] \cdot p^{*}+\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1 \mid \overline{\mathcal{B}_{\text {right }}}\right]\left(1-p^{*}\right)\right) \\ p^{*}\left(\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1 \mid \mathcal{B}_{\text {left }}\right]-\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1 \mid \mathcal{B}_{\text {right }}\right]\right) \\ \left(1-p^{*}\right)\left(\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1 \mid \overline{\mathcal{B}_{\text {left }}}\right]-\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1 \mid \overline{\mathcal{B}_{\text {right }}}\right]\right) \end{array}\right| \end{gathered}\]

In both of the expressions \(\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1 \mid \overline{\mathcal{B}_{\text {left }}}\right]\) and \(\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1 \mid \overline{\mathcal{B}_{\text {right }}}\right]\), we are conditioning on bad never being set to 0 . In this case, both libraries are executing the same sequence of instructions, so the probabilities are equal (and the difference of the probabilities is zero). Substituting in, we get:

\[\operatorname{advantage}_{\mathcal{A}}=p^{*}\left|\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \Rightarrow 1 \mid \mathcal{B}_{\text {left }}\right]-\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {right }} \Rightarrow 1 \mid \mathcal{B}_{\text {right }}\right]\right|\]

Intuitively, the proof is confirming the idea that differences can only be noticed between \(\mathcal{L}_{\text {left }}\) and \(\mathcal{L}_{\text {right }}\) when bad is set to 1 (corresponding to our conditioning on \(\mathcal{B}_{\text {left }}\) and \(\mathcal{B}_{\text {right }}\) ).

The quantity within the absolute value is the difference of two probabilities, so the largest it can be is 1 . Therefore, \[\text { advantage }_{\mathcal{A}} \leqslant p^{*} \stackrel{\text { def }}{=} \operatorname{Pr}\left[\mathcal{B}_{\text {left }}\right]=\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {left }} \text { sets bad }=1\right]\] This completes the proof.

Example

Consider \(\mathcal{L}_{\text {left }}\) and \(\mathcal{L}_{\text {right }}\) from the previous example (where the calling program tries to "predict" the result of uniformly sampling a \(\lambda\)-bit string). We can prove that they are indistinguishable with the following sequence of hybrids:

Let us justify each of the steps:

\(\mathcal{L}_{\text {left }} \equiv \mathcal{L}_{\mathrm{hyb}-\mathrm{L}}:\) The only difference is that \(\mathcal{L}_{\mathrm{hyb}-\mathrm{L}}\) maintains a variable "bad." Since it never actually reads from this variable, the change can have no effect.
\(\mathcal{L}_{\mathrm{hyb}-\mathrm{L}}\) and \(\mathcal{L}_{\mathrm{hyb}-\mathrm{R}}\) differ only in the highlighted line, which can only be reached when bad \(=1\). Therefore, from the bad-event lemma:\[\left|\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {hyb-L }} \Rightarrow 1\right]-\operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {hyb-R }} \Rightarrow 1\right]\right| \leqslant \operatorname{Pr}\left[\mathcal{A} \diamond \mathcal{L}_{\text {hyb-L }} \text { sets bad }=1\right]\] But \(\mathcal{A} \diamond \mathcal{L}_{\mathrm{hyb}-\mathrm{L}}\) only sets bad \(=1\) if the calling program successfully predicts \(s\) in one of the calls to PREDICT. With q calls to PREDICT, the total probability of this happening is at most \(q / 2^{\lambda}\), which is negligible when the calling program runs in polynomial time. Hence \(\mathcal{L}_{\text {hyb-L }} \approx \mathcal{L}_{\text {hyb-R. }}\)
\(\mathcal{L}_{\mathrm{hyb}-\mathrm{R}} \equiv \mathcal{L}_{\text {right }}\) : Similar to above, note how the first 3 lines of PREDICT in \(\mathcal{L}_{\text {hyb-R }}\) don’t actually do anything. The subroutine is going to return false no matter what. Both libraries have identical behavior.

Since \(\mathcal{L}_{\text {left }} \equiv \mathcal{L}_{\text {hyb-L }} \approx \mathcal{L}_{\text {hyb-R }} \equiv \mathcal{L}_{\text {right }}\), this proves that \(\mathcal{L}_{\text {left }} \approx \mathcal{L}_{\text {right }}\)

\({ }^{8}\) The use of conditional probabilites here is delicate and prone to subtle mistakes. For a discussion of the pitfalls, consult the paper where this lemma first appeared: Mihir Bellare & Phillip Rogaway: "Code-Based Game-Playing Proofs and the Security of Triple Encryption," in Eurocrypt 2006. ia.cr/2004/331