Skip to main content
[ "article:topic", "showtoc:no", "license:ccbync", "authorname:mrosulek" ]
Engineering LibreTexts

2.1: Reasoning about Information Hiding via Code Libraries

  • Page ID
    7320
  • All of the security definitions in this course are defined using a common methodology, which is based on familiar concepts from programming. The main idea is to formally define the “allowed” usage of a cryptographic scheme through a programmatic interface, and to define what information is hidden in terms of two implementations of that interface (libraries).

    Defintion \(\PageIndex{1}\) : Libraries

    library \(\mathscr{L}\) is a collection of subroutines and private/static variables. A library’s interface consists of the names, argument types, and output type of all of its subroutines. If a program ? includes calls to subroutines in the interface of  \(\mathscr{L}\), then we write \(?\diamondsuit\mathscr{L}\) to denote the result of linking ? to \(\mathscr{L}\) in the natural way (answering those subroutine calls using the implementation specified in \(\mathscr{L}\)). We write \(?\diamondsuit\mathscr{L}\Rightarrow z\) to denote the event that program \(?\diamondsuit\mathscr{L}\) outputs the value \(z\).

    Some more specifics:

    • If \(?\diamondsuit\mathscr{L}\) is a program that makes random choices, then its output is also a random variable.
    • We can consider compound programs like \(?\diamondsuit\mathscr{L}_1\diamondsuit\mathscr{L}_2\). Our convention is that subroutine calls only happen from left to right across the \(\diamondsuit\) symbol, so in this example, \(\mathscr{L}_2\) doesn’t call subroutines of \(?\). We can then think of \(?\diamondsuit\mathscr{L}_1\diamondsuit\mathscr{L}_2\) as (\(?\diamondsuit\mathscr{L}_1)\diamondsuit\mathscr{L}_2\) (a compound program linked to \(\mathscr{L}_2)) or as \(?\diamondsuit(\mathscr{L}_1\diamondsuit\mathscr{L}_2)(?\) linked to a compound library), whichever is convenient.
    • We try to make formal security definitions less daunting by expressing them in terms of elementary CS concepts like libraries, scope, etc. But one must not forget that the ultimate goal of security definitions is to be mathematically precise enough that we can actually prove things about security.

    For this reason, we need to have a handle on exactly what information the calling program can obtain from the library. We assume that the library’s explicit interface is the only way information gets in and out of the library. This is at odds with real-world software, where you can find implicit channels of information (e.g., peeking into a library’s internal memory, measuring the response time of a subroutine call, etc.).\(^{[1]}\) In short, don’t get too carried away with the terminology of real-world software — you should think of these libraries more as mathematical abstractions than software.

     

    Imagine the interface of a sorting library: when you call a subroutine SORT(A), you expect back a list that contains a sorted arrangement of the contents of A. As you know, there might be many ways to implement such a subroutine SORT. If you have access to only the input/output of SORT, then you would not be able to determine whether SORT was being implemented by mergesort or quicksort, for example, because both algorithms realize the same input-output behavior (sorting a list).

    This kind of idea (that there can be two implementations of the same input-output behavior) is the basis for all of our security definitions. Since we will consider libraries that use internal randomness, the outputs of subroutines may be probabilistic and we have to be careful about what we mean by “same input-output behavior.” A particularly convenient way to express this is to say that two libraries \(\mathscr{L}_{left}\) and \(\mathscr{L}_{right}\) have the same input-output behavior if no calling program behaves differently when linking to \(\mathscr{L}_{left}\) vs. \(\mathscr{L}_{right}\). More formally,

    Definition \(\PageIndex{2}\) : Interchangeable

    Let \(\mathscr{L}_{left}\) and \(\mathscr{L}_{right}\) be two libraries with a common interface. We say that \(\mathscr{L}_{left}\) and \(\mathscr{L}_{right}\) are interchangeable, and write \(\mathscr{L}_{left} \space\equiv\space\mathscr{L}_{right}\), if for all programs \(?\) that output a single bit, \(Pr[A\diamondsuit\mathscr{L}_{left)\Rightarrow\space1] = Pr[A\diamondsuit\mathscr{L}_{right}\Rightarrow\space 1]\).

    Discussion:

    • There is nothing special about defining interchangeability in terms of the calling program giving output 1. Since the only possible outputs are 0 and 1, we have:

    \[Pr[A\diamondsuit\mathscr{L}_{left}\Rightarrow\space1] = Pr[A\diamondsuit\mathscr{L}_{right}\Rightarrow\space 1]\\ \Leftrightarrow \space\space\space\space\space\space 1-Pr[A\diamondsuit\mathscr{L}_{left}\Rightarrow\space 1] = 1-Pr[A\diamondsuit\mathscr{L}_{right}\Rightarrow\space 1]\\\Leftrightarrow \space\space\space\space\space\space(Pr[A\diamondsuit\mathscr{L}_{left}\Rightarrow\space 0] = Pr[A\diamondsuit\mathscr{L}_{right}\Rightarrow\space 0]\nonumber\].

    • It is a common pitfall to imagine the program ? being simultaneously linked to both libraries. But in the definition, calling program ? is only ever linked to one of the libraries at a time.
    • We have restricted the calling program ? to a single bit of output, which might seem unnecessarily restrictive. However, the definition says that the two libraries have the same effect on all calling programs. In particular, the libraries must have the same effect on a calling program ? whose only goal is to distinguish between these particular libraries. A single output bit is necessary for this distinguishing task — just interpret the output bit as a “guess” for which library ? thinks it is linked to. For this reason, we will often refer to the calling program ? as a distinguisher.
    • Taking the previous observation even further, the definition applies against calling programs ? that “know everything” about (more formally, whose code is allowed to depend arbitrarily on) the two libraries. This is a reflection of Kerckhoffs’ principle, which roughly says “assume that the attacker has full knowledge of the system.”\(^{[2]}\) 

      There is, however, a subtlety that deserves some careful attention, though. Our definitions will typically involve libraries that use internal randomness. Kerckhoffs’ principle allows the calling program to know which libraries are used, which in this case corresponds to how a library will choose randomness (i.e., from which distribution). It doesn’t mean that the adversary will know the result of the libraries’ choice of randomness (i.e., the values of all internal variables in the library). It’s the difference between knowing that you will choose a random card from a deck (i.e., the uniform distribution on a set of 52 items) versus reading your mind to know exactly what card you chose. This subtlety shows up in our definitions in the following way. First, we specify two libraries, then we consider a particular distinguisher, and only then do we link and execute the distinguisher with a library. The distinguisher cannot depend on the random choices made by the library, since the choice of randomness “happens after” the distinguisher is fixed.

      In the context of security definitions, think of Kerckhoffs’ principle as: assume that the distinguisher knows every fact in the universe, except for: (1) which of the two libraries it is linked to, and (2) the outcomes of random choices made by the library (often assigned to privately-scoped variables within the library).

    • The definitions here have been chosen to ease gently into future concepts. We will eventually require libraries that have multiple subroutines and maintain state (via static variables) between different subroutine calls.

    Defining interchangeability in terms of distinguishers may not seem entirely natural, but it allows us to later have more granularity. Instead of requiring two libraries to have identical input-output behavior, we will be content with libraries that are “similar enough”, and distinguishers provide a conceptually simple way to measure exactly how similar two libraries are.

    The following lemma will be useful throughout the course as we deal with libraries in security proofs:

    Lemma \(\PageIndex{3}\) : Composition:

    If \(\mathscr{L}_{left}\equiv\space\mathscr{L}_{right}\) then \(\mathscr{L}^{\ast}\diamondsuit\mathscr{L}_{left}\equiv\space\mathscr{L}\diamondsuit\mathscr{L}_{right}\) for any library \(\mathscr{L}\).

    Proof:

    Take an arbitrary calling program ? and consider the compound program \(?\diamondsuit\mathscr{L}^{\ast}\diamondsuit\mathscr{L}_{left}\). We can interpret this program as a calling program ? linked to the library \(\mathscr{L}^{\ast}\diamondsuit\mathscr{L}_{left}\), or alternatively as a calling program \(?\diamondsuit\mathscr{L}^{\ast}\) linked to the library \(\mathscr{L}^{\ast}\). After all, \(?\diamondsuit\mathscr{L}^{\ast}\) is some program that makes calls to the interface of \(\mathscr{L}^{\ast}\). Since \(\mathscr{L}_{left}\equiv\space\mathscr{L}_{right}\), swapping \(\mathscr{L}_{left}\space\text{for}\space\mathscr{L}_{right}\) has no effect on the output of any calling program. In particular, it has no effect when the calling program happens to be \(?\diamondsuit\mathscr{L}^{\ast}\). Hence we have:

    \[Pr[\mathcal{A}(\mathscr{L}^{\ast}\diamondsuit\mathscr{L}_{left})\space\Rightarrow\space 1]=Pr[(\mathcal{A}\diamondsuit \mathscr{L}^{\ast})\diamondsuit\space\mathscr{L}_{left}\space\Rightarrow\space 1]\tag{change of perspective}\]

    \[\space\space\space\space\space\space\space\space\space\space\space\space =Pr[(\mathcal{A}\diamondsuit \mathscr{L}^{\ast})\diamondsuit\space\mathscr{L}_{right}\space\Rightarrow\space 1]\space\space\space(\text{since}\space \mathscr{L}_{left}=\mathscr{L}_{right})\nonumber\]

    \[\space\space\space\space\space\space\space\space\space\space\space\space =Pr[\mathcal{A}\diamondsuit \mathscr{L}^{\ast}\diamondsuit\space(\mathscr{L}_{right})\space\Rightarrow\space 1]\tag{change of perspective}\]

    Since \(?\) was arbitrary, we have proved the lemma.

    What Does Any of This Have to do with Security Definitions?

    Suppose two libraries \(\mathscr{L}_{left}\) and \(\mathscr{L}_{right}\) are interchangeable: two libraries that are different but appearing the same to all calling programs. Think of the internal differences between the two libraries as information that is perfectly hidden to the calling program. If the information weren’t perfectly hidden, then the calling program could get a whiff of whether it was linked to \(\mathscr{L}_{left}\) or \(\mathscr{L}_{right}\), and use it to act differently in those two cases. But such a calling program would contradict the fact that \(\mathscr{L}_{left}\equiv\mathscr{L}_{right}\).

    This line of reasoning leads to:

    \[\text{The Central Principle of Security Definitions}^{\text{TM}}:\nonumber\]

    If two libraries are interchangeable, then their common interface leaks no information about their internal differences.

    We can use this principle to define security, as we will see shortly (and throughout the entire course). It is typical in cryptography to want to hide some sensitive information. To argue that the information is really hidden, we define two libraries with a common interface, which formally specifies what an adversary is allowed to do & learn. The two libraries are typically identical except in the choice of the sensitive information. The Principle tells us that when the resulting two libraries are interchangeable, the sensitive information is indeed hidden when an adversary is allowed to do the things permitted by the libraries’ interface.

    References

    1. This doesn’t mean that it’s impossible to reason about attacks where an adversary has implicit channels of information on our cryptographic implementations. It’s just that such channels must be made explicit as far as the definitions go, if one wishes to prove something about what an adversary can learn by that channel. You can’t reason about information that your definition has no way of expressing.

    2. “Il faut qu’il n’exige pas le secret, et qu’il puisse sans inconvénient tomber entre les mains de l’ennemi.” Auguste Kerckhos, 1883. Translation: [The method] must not be required to be secret, and it can fall into the enemy’s hands without causing inconvenience.