5.4: Extending the Stretch of a PRG

Last updated
Save as PDF

Page ID: 86457

\( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \) \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)\(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\) \(\newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\) \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\) \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\) \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\) \( \newcommand{\Span}{\mathrm{span}}\)\(\newcommand{\AA}{\unicode[.8,0]{x212B}}\)

The stretch of a PRG measures how much longer its output is than its input. Can we use a PRG with small stretch to construct a PRG with larger stretch? The answer is yes, but only if you do it the right way!

Two Approaches to Increase Stretch

Suppose \(G:\{0,1\}^{\lambda} \rightarrow\{0,1\}^{2 \lambda}\) is a length-doubling PRG (i.e., a PRG with stretch \(\lambda\) ). Below are two ideas for constructing a PRG with longer stretch:

Although the constructions are similar, only one of them is secure. Before reading any further, can you guess which of \(H_{1}, H_{2}\) is a secure PRG and which is insecure? By carefully comparing these two approaches, I hope you develop a better understanding of the PRG security definition.

A Security Proof

I think it’s helpful to illustrate the "stragey" of security proofs by starting from the desired conclusion and working backwards. What better way to do this than as a Socratic dialogue in the style of Galileo? \(?^{2}\)

SALVIATI:	I’m sure that \(H_{1}\) is the secure PRG.
SIMPLICIO:	If I understand the security definition for PRGs correctly, you mean that the output of \(H_{1}\) looks indistinguishable from uniform, when the input to \(H_{1}\) is uniform. Why do you say that?
SALVIATI:	Simple! \(H_{1}\) ’s output consists of segments called \(x, u\), and \(v\). Each of these are outputs of \(G\), and since \(G\) itself is a PRG its outputs look uniform.
SIMPLICIO:	I wish I had your boldness, Salviati. I myself am more cautious. If \(G\) is a secure PRG, then its outputs are indeed indistinguishable from uniform, but surely only when its input is uniform! Are you so sure that’s the case here?
SALVIATI:	You raise a good point, Simplicio. In these endeavors it is always preferable to err on the side of caution. When we want to claim that \(H_{1}\) is a secure PRG, we consider the nature of its outputs when its seeds is uniform. Since \(H_{1}\) sends that seed s directly into \(G\), your concern is addressed.
SImPLICIO:	Yes, I can see how in the expression \(x \\| y:=G(s)\) the input to \(G\) is uniform, and so its outputs \(x\) and \(y\) are indistinguishable from random. Since \(x\) is part of \(H_{1}\) ’s output, we are making progress towards showing that the entire output of \(H_{1}\) is indistinguishable from random! However, the output of \(H_{1}\) also contains terms \(u\) and \(v\). When I examine how they are generated, as \(u \\| v:=G(y)\), I become concerned again. Surely \(y\) is not uniform, so I see no way to apply the security if \(G\) !
SALVIATI:	Oh, bless your heart. The answer could not be any more obvious! It is true that \(y\) is not uniformly distributed. But did you not just convince yourself that \(y\) is indistinguishable from uniform? Should that suffice?
SIMPLICIO:	Incredible! I believe I understand now. Let me try to summarize: We suppose the input \(s\) to \(H_{1}\) is chosen uniformly, and examine what happens to \(H_{1}\) ’s outputs. In the expression \(x \\| y:=G(s)\), the input to \(G\) is uniform, and thus \(x\) and \(y\) are indistinguishable from uniform. Now, considering the expression \(u \\| v:=G(y)\), the result is indistinguishable from a scenario in which \(y\) is truly uniform. But if \(y\) were truly uniform, those outputs \(u\) and \(v\) would be indistinguishable from uniform! Altogether, \(x, u\), and \(v\) (the outputs of \(H_{1}\) ) are each indistinguishable from uniform!

I hope that was as fun for you as it was for me. \({ }^{3}\) The formal security proof and its sequence of hybrids will follow the outline given in Simplicio’s summary. We start by applying the PRG security definition to the first call to \(G\), and replace its outputs with truly uniform values. After this change, the input to the second call to \(G\) becomes uniform, allowing us to apply the PRG security definition again.

Claim 5.5

If \(G\) is a secure length-doubling \(P R G\), then \(H_{1}\) (defined above) is a secure (length-tripling) PRG.

Proof

One of the trickier aspects of this proof is that we are using a secure PRG \(G\) to prove the security of another PRG \(H_{1}\). That means both \(\mathcal{L}_{\mathrm{prg}-\star}^{H_{1}}\) and \(\mathcal{L}_{\mathrm{prg}-\star}^{G}\) will appear in this proof. Both libraries/interfaces have a subroutine named "QUERY", and we will rename these subroutines QUERY \(H_{1}\) and QUERY QU disambiguate. \(^{\text {and }}\) to

We want to show that \(\mathcal{L}_{\text {prg-real }}^{H_{1}} \approx \mathcal{L}_{\text {prg-rand }}^{H_{1}} .\) As usual, we do so with a hybrid sequence. Since we assume that \(G\) is a secure \(P R G\), we are allowed to use the fact that \(\mathcal{L}_{\text {prg-real }}^{G} \approx\) \(\mathcal{L}_{\text {prg-rand }}^{G}\).

\(\mathcal{L}_{\text {prg-real }}^{H}:\left\{\begin{array}{l}\mathcal{L}_{\mathrm{prg} \text {-real }}^{H_{1}} \\ \frac{\text { QUERY }_{H_{1}}():}{s \leftarrow\{0,1\}^{\lambda}} \\ x \\| y:=G(s) \\ u \\| v:=G(y) \\ \text { return } x\\|u\\| v\end{array}\right\} H_{1}(s)\)	The starting point is \(\mathcal{L}_{\text {prg-real }}^{H_{1}}\), shown here with the details of \(H_{1}\) filled in.
	The first invocation of \(G\) has been factored out into a subroutine. The resulting hybrid library includes an instance of \(\mathcal{L}_{\mathrm{prg}-\mathrm{real}}^{G}\)
	From the PRG security of \(G\), we can replace the instance of \(\mathcal{L}_{\text {prg-real }}^{G}\) with \(\mathcal{L}_{\text {prg-rand }}^{G}\). The resulting hybrid library is indistinguishable.
	A subroutine has been inlined.
	Choosing \(2 \lambda\) uniformly random bits and then splitting them into two halves has exactly the same effect as choosing \(\lambda\) uniformly random bits and independently choosing \(\lambda\) more.
	The remaining appearance of \(G\) has been factored out into a subroutine. Now \(\mathcal{L}_{\mathrm{prg}-\mathrm{real}}^{G}\) makes its second appearance.
	Again, the PRG security of \(G\) lets us replace \(\mathcal{L}_{\text {prg-real }}^{G}\) with \(\mathcal{L}_{\text {prg-rand }}^{G}\). The resulting hybrid library is indistinguishable.
	A subroutine has been inlined.
	Similar to above, concatenating \(\lambda\) uniform bits with \(2 \lambda\) independently uniform bits has the same effect as sampling \(3 \lambda\) uniform bits. The result of this change is \(\mathcal{L}_{\mathrm{prg} \text {-rand }}^{H_{1}}\).

Through this sequence of hybrid libraries, we showed that: \[\mathcal{L}_{\text {prg-real }}^{H_{1}} \equiv \mathcal{L}_{\text {hyb-1 }} \approx \mathcal{L}_{\text {hyb-2 }} \equiv \mathcal{L}_{\text {hyb-3 }} \equiv \mathcal{L}_{\text {hyb-4 }} \equiv \mathcal{L}_{\text {hyb-5 }} \approx \mathcal{L}_{\text {hyb-6 }} \equiv \mathcal{L}_{\text {hyb-7 }} \equiv \mathcal{L}_{\text {prg-rand }}^{H_{1}}\]

Hence, \(H_{1}\) is a secure PRG.

Where the Proof Breaks Down for \(\mathrm{H}_{2}\)

The only difference between \(H_{1}\) and \(H_{2}\) is that the variable \(y\) is included in the output. How does that minor change affect the reasoning that we applied to \(H_{1}\)?

We argued that outputs \(u\) and \(v\) are indistinguishable from uniform since its input \(y\) is also indistinguishable from random. But it’s not quite so simple: A PRG’s output is indistinguishable from random if (1) its seed is uniform, and (2) the seed is not used for anything else! This construction \(\mathrm{H}_{2}\) violates condition (2) because it includes the "seed" \(y\) in the output.

We can see this idea reflected in the formal PRG definition. In \(\mathcal{L}_{\text {prg-real, }}\), the seed \(s\) is chosen uniformly, given as input to \(G\), and then goes out of scope! If we try to reproduce the security proof for \(H_{1}\) with \(H_{2}\) instead, we’ll get stuck when we are trying to factor out the second call to \(G\) in terms of \(\mathcal{L}_{\text {prg-real }}\):

We are trying to factor out the two highlighted lines into a separate library, renaming \(y\) into \(s\) in the process. But \(s\) can only exist inside the private scope of the new library, while there still exists a "dangling reference" \(y\) in the original library.

Speaking more generally about PRGs, suppose we have a call to \(G\) somewhere and want to argue that its outputs are pseudorandom. We can only express this call to \(G\) in terms of \(\mathcal{L}_{\mathrm{prg} \text {-real }}^{G}\) if the input to \(G\) is uniform and is used nowhere else. That’s not true here - we can’t express one of the calls to \(G\) in terms of \(\mathcal{L}_{\mathrm{prg} \text {-real' }}^{G}\), so we can’t be sure that the outputs of that call to \(G\) look random.

These subtle issues are not limited to PRGs. Every hybrid security proof in this course includes steps where we factor out some statements in terms of some pre-existing library. Don’t take these steps for granted! They will fail (often because of scoping issues) if the construction isn’t using the building block correctly. You should always treat such "factoring out" steps as "sanity checks" for your proof.

A Concrete Attack on \(\mathrm{H}_{2}\)

So far, we’ve only demonstrated that we get stuck when trying to prove the security of \(\mathrm{H}_{2}\). But that doesn’t necessarily mean that \(\mathrm{H}_{2}\) is insecure - it could mean that we’re just not clever enough to see a different security proof. To show that \(\mathrm{H}_{2}\) is actually insecure, we must demonstrate a successful distinguishing attack.

Attacking a PRG amounts to finding "patterns" in their outputs. Does \(H_{2}\) have a pattern in its outputs? Yes, in this case the pattern is that if you write the output in the form \(x\|y\| u \| v\), then \(u \| v\) is always equal to \(G(y)\). The calling program can check for this condition, which is unlikely to happen for truly uniform values.

You may wonder, is it legal for the calling program to compute \(G(y)\) ? Well, \(G\) is a publicly known algorithm (Kerckhoffs’ principle!), and \(y\) is right there as part of the input. Nothing prevents the calling program from running \(G\) "in its head".

Claim 5.6

Construction \(\mathrm{H}_{2}\) is not a secure \(P R G\), even if \(G\) is.

Proof

Consider the following distinguisher \(\mathcal{A}\) : \[\begin{aligned} &x\|y\| u \| v:=\text { QUERY }() \\ &\text { return } G(y) \stackrel{?}{=} u \| v \end{aligned}\] When \(\mathcal{A}\) is linked to \(\mathcal{L}_{\text {prg-real }}^{H_{2}}\), the outputs indeed satisfy the condition \(G(y)=u \| v\), so \(\mathcal{A}\) outputs true with probability \(1 .\)

When \(\mathcal{A}\) is linked to \(\mathcal{L}_{\text {prg-rand }}^{H_{2}}\), the outputs are truly uniform. It is helpful to imagine \(x\) and \(y\) being chosen before \(u\) and \(v\). As soon as \(y\) is chosen, the value \(G(y)\) is uniquely determined, since \(G\) is a deterministic algorithm. Then \(\mathcal{A}\) will output true if \(u \| v\) is chosen exactly to equal this \(G(y)\). Since \(u\) and \(v\) are chosen uniformly, and are a total of \(2 \kappa\) bits long, this event happens with probability \(1 / 2^{2 \kappa}\).

\(\mathcal{A}\) ’s advantage is the difference in these probabilities: \(1-1 / 2^{2 \kappa}\), which is nonnegligible.

Discussion

In the attack on \(\mathrm{H}_{2}\), we never tried to distinguish the output of \(G\) from uniform. \(\mathrm{H}_{2}\) is insecure even if \(G\) is the best PRG in the world! It’s insecure because of the incorrect way it uses \(G\).

From now on in this book, we’ll be studying higher-level constructions that are assembled from various building blocks \(-\) in this chapter, fancy PRGs constructed from simpler PRGs. "Security" means: if the building blocks are secure then the construction is secure. "Insecurity" means: even if the building blocks are secure, the construction can be insecure. So when you’re showing insecurity, you shouldn’t directly attack the building blocks! You should assume the building blocks are secure and attack the way that the building blocks are being used.

\({ }^{2}\) Don’t answer that.

\({ }^{3}\) If you’re wondering what the hell just happened: In Galileo’s 1632 book Dialogue Concerning the Two Chief World Systems, he lays out the arguments for heliocentrism using a dialog between Salviati (who advocated the heliocentric model) and Simplicio (who believed the geocentric model).

\({ }^{4}\) Compare to the case of distinguishing \(G(s)\) from uniform, for a secure \(G\). The calling program knows the algorithm \(G\) but doesn’t have the seed \(s-\) it only knows the output \(G(s)\). In the case of \(H_{2}\), the calling program learns both \(y\) and \(G(y)\) !