5.2: Computability

Last updated
Save as PDF

Page ID: 6737

$ \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } $

$ \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} $

$ \newcommand{\id}{\mathrm{id}}$ $ \newcommand{\Span}{\mathrm{span}}$

( \newcommand{\kernel}{\mathrm{null}\,}\) $ \newcommand{\range}{\mathrm{range}\,}$

$ \newcommand{\RealPart}{\mathrm{Re}}$ $ \newcommand{\ImaginaryPart}{\mathrm{Im}}$

$ \newcommand{\Argument}{\mathrm{Arg}}$ $ \newcommand{\norm}[1]{\| #1 \|}$

$ \newcommand{\inner}[2]{\langle #1, #2 \rangle}$

$ \newcommand{\Span}{\mathrm{span}}$

$ \newcommand{\id}{\mathrm{id}}$

$ \newcommand{\Span}{\mathrm{span}}$

$ \newcommand{\kernel}{\mathrm{null}\,}$

$ \newcommand{\range}{\mathrm{range}\,}$

$ \newcommand{\RealPart}{\mathrm{Re}}$

$ \newcommand{\ImaginaryPart}{\mathrm{Im}}$

$ \newcommand{\Argument}{\mathrm{Arg}}$

$ \newcommand{\norm}[1]{\| #1 \|}$

$ \newcommand{\inner}[2]{\langle #1, #2 \rangle}$

$ \newcommand{\Span}{\mathrm{span}}$ $ \newcommand{\AA}{\unicode[.8,0]{x212B}}$

$ \newcommand{\vectorA}[1]{\vec{#1}} % arrow$

$ \newcommand{\vectorAt}[1]{\vec{\text{#1}}} % arrow$

$ \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } $

$ \newcommand{\vectorC}[1]{\textbf{#1}} $

$ \newcommand{\vectorD}[1]{\overrightarrow{#1}} $

$ \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} $

$ \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} $

$ \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } $

$ \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} $

$\newcommand{\avec}{\mathbf a}$ $\newcommand{\bvec}{\mathbf b}$ $\newcommand{\cvec}{\mathbf c}$ $\newcommand{\dvec}{\mathbf d}$ $\newcommand{\dtil}{\widetilde{\mathbf d}}$ $\newcommand{\evec}{\mathbf e}$ $\newcommand{\fvec}{\mathbf f}$ $\newcommand{\nvec}{\mathbf n}$ $\newcommand{\pvec}{\mathbf p}$ $\newcommand{\qvec}{\mathbf q}$ $\newcommand{\svec}{\mathbf s}$ $\newcommand{\tvec}{\mathbf t}$ $\newcommand{\uvec}{\mathbf u}$ $\newcommand{\vvec}{\mathbf v}$ $\newcommand{\wvec}{\mathbf w}$ $\newcommand{\xvec}{\mathbf x}$ $\newcommand{\yvec}{\mathbf y}$ $\newcommand{\zvec}{\mathbf z}$ $\newcommand{\rvec}{\mathbf r}$ $\newcommand{\mvec}{\mathbf m}$ $\newcommand{\zerovec}{\mathbf 0}$ $\newcommand{\onevec}{\mathbf 1}$ $\newcommand{\real}{\mathbb R}$ $\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}$ $\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}$ $\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}$ $\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}$ $\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}$ $\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}$ $\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}$ $\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}$ $\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}$ $\newcommand{\laspan}[1]{\text{Span}\{#1\}}$ $\newcommand{\bcal}{\cal B}$ $\newcommand{\ccal}{\cal C}$ $\newcommand{\scal}{\cal S}$ $\newcommand{\wcal}{\cal W}$ $\newcommand{\ecal}{\cal E}$ $\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}$ $\newcommand{\gray}[1]{\color{gray}{#1}}$ $\newcommand{\lgray}[1]{\color{lightgray}{#1}}$ $\newcommand{\rank}{\operatorname{rank}}$ $\newcommand{\row}{\text{Row}}$ $\newcommand{\col}{\text{Col}}$ $\renewcommand{\row}{\text{Row}}$ $\newcommand{\nul}{\text{Nul}}$ $\newcommand{\var}{\text{Var}}$ $\newcommand{\corr}{\text{corr}}$ $\newcommand{\len}[1]{\left|#1\right|}$ $\newcommand{\bbar}{\overline{\bvec}}$ $\newcommand{\bhat}{\widehat{\bvec}}$ $\newcommand{\bperp}{\bvec^\perp}$ $\newcommand{\xhat}{\widehat{\xvec}}$ $\newcommand{\vhat}{\widehat{\vvec}}$ $\newcommand{\uhat}{\widehat{\uvec}}$ $\newcommand{\what}{\widehat{\wvec}}$ $\newcommand{\Sighat}{\widehat{\Sigma}}$ $\newcommand{\lt}{<}$ $\newcommand{\gt}{>}$ $\newcommand{\amp}{&}$ $\definecolor{fillinmathshade}{gray}{0.9}$

At this point, it would be useful to look at increasingly complex Turing machines, which compute increasingly complex functions and languages. Although Turing machines are very simple devices, it turns out that they can perform very sophisticated computations. In fact, any computation that can be carried out by a modern digital computer—even one with an unlimited amount of memory—can be carried out by a Turing machine. Although it is not something that can be proved, it is widely believed that anything that can reasonably be called “computation” can be done by a Turing machine. This claim is known as the Church-Turing Thesis.

We do not have time to look at enough examples to convince you that Turing machines are as powerful as computers, but the proof reduces to the fact that computers are actually fairly simple in their basic operation. Everything that a computer does comes down to copying data from one place to another, making simple comparisons between two pieces of data, and performing some basic arithmetic operations. It’s possible for Turing machines to do all these things. In fact, it’s possible to build a Turing ma- chine to simulate the step-by-step operation of a given computer. Doing so proves that the Turing machine can do any computation that the computer could do, although it will, of course, work much, much more slowly.

We can, however, look briefly at some other models of computation and see how they compare with Turing machines. For example, there are various ways in which we might try to increase the power of a Turing machine. For example, consider a two-tape Turing machine that has two tapes, with a read/write head on each tape. In each step of its computation, a two- tape Turing machine reads the symbols under its read/write heads on both tapes. Based on these symbols and on its current state, it can write a new symbol onto each tape, independently move the read/write head on each tape one cell to the left or right, and change state.

It might seem that with two tapes available, two-tape Turing machines might be able to do computations that are impossible for ordinary one-tape machines. In fact, though, this is not the case. The reason, again, is simulation: Given any two-tape Turing machine, it is possible to build a one-tape Turing machine that simulates the step-by-step computation of the two- tape machine. Let M be a two-tape Turing machine. To simulate M with a one-tape machine, K, we must store the contents of both of M’s tapes on one tape, and we must keep track of the positions of both of M’s read/write heads. Let @ and $ be symbols that are not in the alphabet of M . The @will be used to mark the position of a read/write head, and the $ will be used to delimit the parts of K’s tape that represent the two tapes of M. For example, suppose that one of M’s tapes contains the symbols “abb##cca” with the read/write head on the first b, and that the other tape contains “01#111#001” with the read/write head on the final 1. This configuration would be represented on K’s tape as “$a@bb##cca$01#111#00@1$”. To simulate one step of M’s computation, K must scan its entire tape, looking for the @’s and noting the symbol to the right of each @. Based on this information, K can update its tape and its own state to reflect M’s new configuration after one step of computation. Obviously, K will take more steps than M and it will operate much more slowly, but this argument makes it clear that one-tape Turing machines can do anything that can be done by two-tape machines.

We needn’t stop there. We can imagine n-tape Turing machines, forn > 2. We might allow a Turing machine to have multiple read/write heads that move independently on each tape. We could even allow two or three-dimensional tapes. None of this makes any difference as far as computational power goes, since each type of Turing machine can simulate any of the other types.

We have used Turing machines to define Turing-acceptable languages and Turing-decidable languages. The definitions seem to depend very much on the peculiarities of Turing machines. But the same classes of languages can be defined in other ways. For example, we could use programs running on an idealized computer, with an unlimited amount of memory, to accept or decide languages. Or we could use n-tape Turing machines. The resulting classes of languages would be exactly the same as the Turing-acceptable and Turing-decidable languages.

We could look at other ways of specifying languages “computationally.” One of the most natural is to imagine a Turing machine or computer pro- gram that runs forever and outputs an infinite list of strings over some alphabet Σ. In the case of Turing machines, it’s convenient to think of a two-tape Turing machine that lists the strings on its second tape. The strings in the list form a language over Σ. A language that can be listed in this way is said to be recursively enumerable. Note that we make no assumption that the strings must be listed in any particular order, and we allow the same string to appear in the output any number of times. Clearly, a recursively enumerable language is “computable” in some sense. Perhaps we have found a new type of computable language. But no—it turns out that we have just found another way of describing the Turing-acceptable languages. The following theorem makes this fact official and adds one more way of describing the same class of languages:

Theorem 5.1.

Let Σ be an alphabet and let L be a language over Σ. Then the following are equivalent:

There is a Turing machine that accepts L.
There is a two-tape Turing machine that runs forever, making a list of strings on its second tape, such that a string w is in the list if and only if w ∈ L.
There is a Turing-computable function $f :\{a\}^{*} \rightarrow \Sigma^{*}$ such that Lis the range of the function f.

While I will not give a complete, formal proof of this theorem, it’s not too hard to see why it is true. Consider a language that satisfies property 3. We can use the fact that L is the range of a Turing-computable function, f, to build a two-tape Turing machine that lists L. The machine will consider each of the strings $a^{n},$ for $n \in \mathbb{N},$ in turn, and it will compute $f\left(a^{n}\right)$ for each n. Once the value of $f\left(a^{n}\right)$ has been computed, it can be copied onto the machine’s second tape, and the machine can move on to do the same with $a^{n+1} .$ This machine writes all the elements of $L$ (the range of $f )$ onto its second tape, so L satisfies property 2. Conversely, suppose that there is a two-tape Turing machine, $M,$ that lists $L .$ Define a function $g :\{a\}^{*} \rightarrow \Sigma^{*}$ such that for $n \in \mathbb{N}, g\left(a^{n}\right)$ is the $(n+1)^{t h}$ item in the list produced by M. Then the range of g is L, and g is Turing-computable since it can be computed as follows: On input $a^{n},$ simulate the computation of $M$ until it has produced $n+1$ strings, then halt, giving the $(n+1)^{t h}$ string as output. This shows that property 2 implies property 3, so these properties are in fact equivalent.

We can also check that property 2 is equivalent to property 1. Suppose that L satisfies property 2. Consider a two-tape Turing machine, T, that lists the elements of L. We must build a Turing machine, M, which accepts L. We do this as follows: Given an input $w \in \Sigma^{*}, M$ will simulate the computation of T. Every time the simulated T produces a string in the list, M compares that string to w. If they are the same, M halts. If $w \in L,$ eventually it will be produced by $T,$ so $M$ will eventually halt. If $w \notin L,$ then it will never turn up in the list produced by $T,$ so $M$ will never halt. Thus, M accepts the language L. This shows that property 2 implies property 1.

The fact that property 1 implies property 2 is somewhat harder to see. First, we note that it is possible for a Turing machine to generate every possible string in $\Sigma^{*},$ one-by-one, in some definite order (such as order of increasing length, with something like alphabetical order for strings of the same length). Now, suppose that L is Turing-acceptable and that M is a Turing machine that accepts L. We need a two-tape Turing machine,T that makes a list of all the elements of L. Unfortunately, the following idea does not work: Generate each of the elements in $\Sigma^{*}$ in turn, and see whether M accepts it. If so, then add it to the list on the second tape. It looks like we have a machine that lists all the elements of L. The problem is that the only way for T to “see whether M accepts” a string is to simulate the computation of M. Unfortunately, as soon as we try this for any stringw that is not in L, the computation never ends! T will get stuck in the simulation and will never even move on to the next string. To avoid this problem, T must simulate multiple computations of M at the same time.T can keep track of these computations in different regions of its first tape (separated by $\$$ 's ). Let the list of all strings in $\Sigma^{*}$ be $x_{1}, x_{2}, x_{3}, \ldots$ . Then

T should operate as follows:

Set up the simulation of $M$ on input $x_{1},$ and simulate one step of the computation for $x_{1}$
Set up the simulation of $M$ on input $x_{2},$ and simulate one step of the computation for $x_{1}$ and one step of the computation for $x_{2}$
Set up the simulation of $M$ on input $x_{3},$ and simulate one step of each of the computations, for $x_{1}, x_{2},$ and $x_{3}$
...

n. Set up the simulation of $M$ on input $x_{n},$ and simulate one step of each of the computations, for $x_{1}, x_{2}, \ldots, x_{n}$

and so on. Each time one of the computations halts, T should write the corresponding xi onto its second tape. Over the course of time, T simulates the computation of $M$ for each input $w \in \Sigma^{*}$ for an arbitrary number of steps. If $w \in L,$ the simulated computation will eventually end and $w$ will appear on $T^{\prime}$ 's second tape. On the other hand, if $w \notin L,$ then the simulated computation will never end, so w will not appear in the list. So we see thatT does in fact make a list of all the elements, and only the elements of L. This completes an outline of the proof of the theorem.

Next, we compare Turing machines to a completely different method of specifying languages: general grammars. Suppose $G=(V, \Sigma, P, S)$ is a general grammar and that L is the language generated by G. Then there is a Turing machine, M, that accepts the same language, L. The alphabet for $M$ will be $V \cup \Sigma \cup\{\$, \#\},$ where $\$$ is a symbol that is not in $V \cup \Sigma .$ (We also assume that $\#$ is not in $V \cup \Sigma .$ ) Suppose that $M$ is started with input $w,$ where $w \in \Sigma^{*} .$ We have to design $M$ so that it will halt if and only if $w \in L .$ The idea is to have $M$ find each string that can be derived from the start symbol S. The strings will be written to M’s tape and separated by $’s. M can begin by writing the start symbol, S, on its tape, separated from w by a $. Then it repeats the following process indefinitely: For each string on the tape and for each production rule, $x \longrightarrow y,$ of $G,$ search the string for occurrences of x. When one is found, add a $ to the end of the tape and copy the string to the end of the tape, replacing the occurrence of x by y. The new string represents the results of applying the production rule $x \longrightarrow y$ to the string. Each time $M$ produces a new string, it compares that string to w. If they are equal, then M halts. If w is in fact in L, then eventually M will produce the string w and will halt. Conversely, if w is not in L, then M will go on producing strings forever without ever finding w, so M will never halt. This shows that, in fact, the language L is accepted by M.

Conversely, suppose that $L$ is a language over an alphabet $\Sigma,$ and that $L$ is Turing-acceptable. Then it is possible to find a grammar G that generatesL. To do this, it’s convenient to use the fact that, as discussed above, there is a Turing-computable function $f :\{a\}^{*} \rightarrow \Sigma$ such that $L$ is the range of $f$ Let $M=\left(Q, \Lambda, q_{0}, \delta\right)$ be a Turing machine that computes the function $f$ We can build a grammar, G, that imitates the computations performed byM. The idea is that most of the production rules of G will imitate steps in the computation of M. Some additional rules are added to get things started, to clean up, and to otherwise bridge the conceptual gap between grammars and Turing machines.

The terminal symbols of G will be the symbols from the alphabet, Σ. For the non-terminal symbols, we use: the states of M, every member of Λ that is not in Σ, two special symbols < and >, and two additional symbolsS and A. (We can assume that all these symbols are distinct.) S will be the start symbol of G. As for production rules, we begin with the following three rules:

$S \longrightarrow<q_{0} A>$

$A \longrightarrow a A$

$A \longrightarrow \varepsilon$

These rules make it possible to produce any string of the form $<q_{0} a^{n}>$ This is the only role that S and A play in the grammar. Once we’ve gotten rid of S and A, strings of the remaining terminal and non-terminal symbols represent configurations of the Turing machine M. The string will contain exactly one of the states of M (which is, remember, one of the non-terminal symbols of G). This tells us which state M is in. The position of the state- symbol tells us where M is positioned on the tape: the state-symbol is located in the string to the left of the symbol on which M is positioned. And the special symbols < and > just represent the beginning and the end of a portion of the tape of $M .$ So, the initial string $<q_{0} a^{n}>$ represents a configuration in which M is in its start state, and is positioned on the firsta in a string of n a’s. This is the starting configuration of M when it is run with input $a^{n}$.

Now, we need some production rules that will allow the grammar to simulate the computations performed by $M .$ For each state $q_{i}$ and each symbol $\sigma \in \Lambda,$ we need a production rule that imitates the transition rule $\delta\left(q_{i}, \sigma\right)=\left(\tau, d, q_{j}\right) .$ If $d=R,$ that is if the machine moves to the right, then all we need is the rule

$q_{i} \sigma \longrightarrow \tau q_{j}$

This represents that fact that $M$ converts the $\sigma$ to a $\tau,$ moves to the right, and changes to state qj . If d = L, that is if the machine moves to the left, then we will need several rules one rule for each $\lambda \in \Lambda,$ namely

$\lambda q_{i} \sigma \longrightarrow q_{j} \lambda \tau$

This rule says that M changes the $\sigma$ to $a \tau$, moves left, and changes to state $q_{j} .$ The $\lambda$ doesn't affect the application of the rule, but is necessary to represent the fact that M moves left.

Each application of one of these rules represents one step in the compu- tation of M. There is one remaining requirement for correctly simulatingM. Since M’s tape contains an infinite number of cells and we are only representing a finite portion of that tape, we need a way to add and remove#’s at the ends of the string. We can use the following four rules to do this:

$<\longrightarrow<\#$
$<\# \longrightarrow$
$>\#>$
$\#>$

These rules allow blank symbols to appear at the ends of the string when they are needed to continue the computation, and to disappear from the ends of the string whenever we like.

Now, suppose that $w$ is some element of $L .$ Then $w=f\left(a^{n}\right)$ for some $n \in \mathbb{N} .$ We know that on input $a^{n}, M$ halts with output $w .$ If we translate the computation of M into the corresponding sequence of production rules in $G,$ we see that for the grammar $G,<q_{0} a^{n}>\Longrightarrow<h w>,$ where $h$ is the halt state of $M .$ since we already know that $S \Longrightarrow^{*}<q_{0} a^{n}>,$ for every $n \in \mathbb{N},$ we see that in fact $S \Longrightarrow^{*}<h w>$ for each $w \in L .$ We almost have it! We want to show that $S \Longrightarrow^{*} w .$ If we can just get rid of the $<,$ the $h$ and the $>,$ we will have that $<h w>\Longrightarrow^{*} w$ and we can then deduce that $S \Longrightarrow^{*} w$ for each $w \in L,$ as desired. We can do this by adding just a few more rules to G. We want to let the h eliminate the <, move through thew, and then eliminate the > along with itself. We need the rules

$<h \longrightarrow h$
$h>\longrightarrow \varepsilon$

and, for each $\sigma \in \Sigma$,

$h \sigma \longrightarrow \sigma h$

We have constructed G so that it generates every string in L. It is not difficult to see that the strings in L are in fact the only strings that are generated by G. That is, L is precisely L(G).

We have now shown, somewhat informally, that a language L is Turing- acceptable if and only if there is a grammar G that generates L. Even though Turing machines and grammars are very different things, they are equivalent in terms of their ability to describe languages. We state this as a theorem:

Theorem 5.2.

A language L is Turing acceptable (equivalently, recursively enumerable) if and only if there is a general grammar that generates L.

In this section, we have been talking mostly about recursively enumerable languages (also known as the Turing-acceptable languages). What about the Turing-decidable languages? We already know that if a language L is Turing-decidable, then it is Turing-acceptable. The converse is not true (although we won’t be able to prove this until the next section). However, suppose that L is a language over the alphabet Σ and that both $L$ and its complement, $\overline{L}=\Sigma^{*}>L,$ are Turing-acceptable. Then $L$ is Turing-decidable.

For suppose that M is a Turing machine that accepts the language L and that $M^{\prime}$ is a Turing machine that accepts $\overline{L} .$ We must show that $L$ is Turing-decidable. That is, we have to build a Turing machine T that decides $L .$ For each $w \in \Sigma^{*},$ when $T$ is run with input $w,$ it should halt with output 1 if $w \in L$ and with output 0 if $w \notin L .$ To do this, $T$ will simulate the computation of both M and M′ on input w. (It will simulate one step in the computation of M, then one step in the computation ofM′, then one step of M, then one step of M′, and so on.) If and when the simulated computation of M halts, then T will halt with output 1; since M accepts $L,$ this will happen if and only if $w \in L .$ If and when the simulated computation of M′ halts, then T will halt with output 0; since M accepts $L,$ this will happen if and only if $w \notin L .$ So, for any $w \in \Sigma^{*}, T$ halts with the desired output. This means that T does in fact decide the language L.

It is easy to prove the converse, and the proof is left as an exercise. So we see that a language is Turing-decidable if and only if both it and its complement are Turing-acceptable. Since Turing-acceptability can be defined using other forms of computation besides Turing machines, so can Turing-decidability. For example, a language is Turing-decidable if and only if both it and its complement can be generated by general gram- mars. We introduced the term “recursively enumerable” as a synonym for Turing-acceptable, to get away from the association with a particular form of computation. Similarly, we define the term “recursive” as a synonym for Turing-decidable. That is, a language L is said to be recursive if and only if it is Turing-decidable. We then have the theorem:

Theorem 5.3.

Let Σ be an alphabet and let L be a language over Σ. Then L is recursive if and only if both $L$ and its complement, $\Sigma^{*} \backslash L$ are recursively enumerable.

Exercises

The language $L=\left\{a^{m} | m>0\right\}$ is the range of the function $f\left(a^{n}\right)=a^{n+1}$. Design a Turing machine that computes this function, and find the grammar that generates the language L by imitating the computation of that machine.
Complete the proof of Theorem 5.3 by proving the following: If L is a recursive language over an alphabet Σ, then both L and $\Sigma^{*} \backslash L$ are recursively enumerable.
Show that a language L over an alphabet Σ is recursive if and only if there are grammars G and H such that the language generated by G is L and the language generated by $H$ is $\Sigma^{*} \backslash L$
This section discusses recursive languages and recursively enumerable languages. How could one define recursive subsets of $\mathbb{N}$ and recursively enumerable subsets of $\mathbb{N} ?$
Give an informal argument to show that a subset $X \subseteq \mathbb{N}$ is recursive if and only if there is a computer program that prints out the elements of X in increasing order.

Search

Text Color

Text Size

Margin Size

Font Type