Skip to main content
[ "article:topic", "license:ccbyncsa", "showtoc:no", "tableaux", "authorname:mkeet" ]
Engineering LibreTexts

2.2: Reasoning

  • Page ID
  • Having a logic language with a semantics is one thing, but it may be only a means to an end rather than an end in itself. From the computational angle—especially from a logician’s perspective—the really interesting aspect of having such a language and (someone else) having put in the effort to formalise some subject domain, is to reason over it to infer implicit knowledge. Here, we are not talking about making a truth table, which is computationally way too costly when one has to analyse many sentences, but deploying other techniques so that it can be scaled up compared to the manual efforts. Automated reasoning, then, concerns computing systems that automate the ability to make inferences by designing a formal language in which a problem’s assumptions and conclusion can be written and providing correct algorithms to solve the problem with a computer in an efficient way.

    How does one find out whether a formula is valid or not? How do we find out whether our knowledge base is satisfiable? The main proof technique for DL-based ontologies is tableaux, although there are several others3. The following subsections first provide a general introduction, the essential ingredients for automated reasoning, and then describes deduction, abduction, and induction.



    People employ reasoning informally by taking a set of premises and somehow arriving at a conclusion, i.e., it is entailed by the premises (deduction), arriving at a hypothesis (abduction), or generalizing from facts to an assumption (induction). Mathematicians and computer scientists developed ways to capture this formally with logic languages to represent the knowledge and rules that may be applied to the axioms so that one can construct a formal proof that the conclusion can be derived from the premises. This can be done by hand [Sol05] for small theories, but that does not scale up when one has, say, 80 or more axioms even though there are much larger theories that require a formal analysis, such as checking that the theory can indeed have a model and thus does not contradict itself. To this end, much work has gone into automating reasoning. The remainder of this section introduces briefly several of the many purposes and usages of automated reasoning, its limitations, and types of automated reasoners.


    Automated reasoning, and deduction in particular, has found applications in ‘every day life’. A notable example is hardware and (critical) software verification, which gained prominence after Intel had shipped its Pentium processors with a floating point unit error in 1994 that lost the company about $500 million. Since then, chips are routinely automatically proven to function correctly according to specification before taken into production. A different scenario is scheduling problems at schools to find an optimal combination of course, lecturer, and timing for the class or degree program, which used to take a summer to do manually, but can now be computed in a fraction of it using constraint programming. In addition to such general application domains, it is also used for specific scenarios, such as the demonstration of discovering (more precisely: deriving) novel knowledge about protein phosphatases [WSH07]. They represented the knowledge about the subject domain of protein phosphatases in humans in a formal bio-ontology and classified the enzymes of both human and the fungus Aspergillus fumigatus using an automated reasoner, which showed that (i) the reasoner was as good as human expert classification, (ii) it identified additional p-domains (an aspect of the phosphatases) so that the human-originated classification could be refined, and (iii) it identified a novel type of calcineurin phosphatase like in other pathogenic fungi. The fact that one can use an automated reasoner (in this case: deduction, using a Description Logics knowledge base) as a viable method in science is an encouragement to explore such avenues further.


    While many advances have been made in specific application areas, the main limitation of the implementations are due to the computational complexity of the chosen representation language and the desired automated reasoning services. This is being addressed by implementations of optimisations of the algorithms or by limiting the expressiveness of the language, or both. One family of logics that focus principally on ‘computationally well-behaved’ languages is Description Logics, which are decidable fragments of first order logic [BCM+08]; that is, they are languages such that the corresponding reasoning services are guaranteed to terminate with an answer. Description Logics form the basis of most of the Web Ontology Languages OWL and OWL 2 and are gaining increasing importance in the Semantic Web applications area. Giving up expressiveness, however, does lead to criticism from the modellers’ community, as a computationally nice language may not have the features deemed necessary to represent the subject domain adequately.


    There are many tools for automated reasoning, which differ in which language they accept, the reasoning services they provide, and, with that, the purpose they aim to serve.

    There are, among others, generic first- and higher order logic theorem provers (e.g., Prover9, MACE4, Vampire, HOL4), SAT solvers that compute if there is a model for the formal theory (e.g., GRASP, Satz), Constraint Satisfaction Programming for solving, e.g., scheduling problems and reasoning with soft constraints (e.g., Eclipse), DL reasoners that are used for reasoning over OWL ontologies using deductive reasoning to compute satisfiability, consistency, and perform taxonomic and instance classification (e.g., Fact++, RacerPro, Hermit, CEL, QuOnto), and inductive logic programming tools (e.g., PROGOL and Aleph).

    Basic Idea

    Essential to automated reasoning are:

    1. The choice of the class of problems the software program has to solve, such as checking the consistency of a theory (i.e., whether there are no contradictions) or computing a classification hierarchy of concepts subsuming one another based on the properties represented in the logical theory;
    2. The formal language in which to represent the problems, which may have more or less features to represent the subject domain knowledge, such as cardinality constraints (e.g., that spiders have as part exactly eight legs), probabilities, or temporal knowledge (e.g., that a butterfly is a transformation of a caterpillar);
    3. The way how the program has to compute the solution, such as using natural deduction or resolution; and
    4. How to do this efficiently, be this achieved through constraining the language into one of low complexity, or optimising the algorithms to compute the solution, or both.

    Concerning the first item, with a problem being, e.g., “is my theory is consistent?”, then the problem’s assumptions are the axioms in the logical theory and the problem’s conclusion that is computed by the automated reasoner is a “yes” or a “no” (provided the language in which the assumptions are represented is decidable and thus guaranteed to terminate with an answer). With respect to how this is done (item iii), two properties are important for the calculus used: soundness and completeness. To define them, note/recall that “\(\vdash\)” means ‘derivable with a set of inference rules’ and “\(\models\)” denotes ‘implies’, i.e., every truth assignment that satisfies \(\Gamma\) also satisfies \(\phi\). The two properties are defined as follows:

    • Completeness: if \(\Gamma\models\phi\) then \(\Gamma\vdash\phi\) 
    • Soundness: if \(\Gamma\vdash\phi\) then \(\Gamma\models\phi\) 

    If the algorithm it is incomplete, then there exist entailments that cannot be computed (hence, ‘missing’ some results), if it is unsound then false conclusions can be derived from true premises, which is even more undesirable.

    An example is included, once the other ingredients have been introduced as well: proving the validity of a formula (the class of the problem) in propositional logic (the formal language) using tableau reasoning (the way how to compute the solution) with the Tree Proof Generator4 (the automated reasoner); more detail, with reflection and other techniques, can be found in [Por10] among others.

    Deduction, Abduction, and Induction

    There are three principle ways of making the inferences—deduction, abduction, and induction—that are described now.


    Deduction is a way to ascertain if a theory \(T\) represented in a logic language entails an axiom \(\alpha\) that is not explicitly asserted in \(T\) (written as \(T\models\alpha\) ), i.e., whether \(\alpha\) can be derived from the premises through repeated application of deduction rules. For instance, a theory that states that “each Arachnid has as part exactly 8 legs” and “each Tarantula is an Arachnid” then one can deduce—it is entailed in the theory—that “Each Tarantula has as part 8 legs”. An example is included further below (after having introduced a reasoning technique), which formally demonstrates that a formula is entailed in a theory \(T\) using said rules.

    Thus, strictly speaking, a deduction does not reveal novel knowledge, but only that what was already represented implicitly in the theory. Nevertheless, with large theories, it is often difficult to oversee all implications of the represented knowledge and, hence, the deductions may be perceived as novel from a domain expert perspective, such as with the example about the protein phosphatases. (This is in contrast to Abduction and Induction, where the reasoner ‘guesses’ knowledge that is not already entailed in the theory; see below).

    There are various ways how to ascertain \(T\models\alpha\), be it manually or automatically. One can construct a step-by-step proof ‘forward’ from the premises by applying the deduction rules or prove it indirectly such that \(T\cup\) {\(\neg\alpha\)} must lead to a contradiction. The former approach is called natural deduction, whereas the latter is based on techniques such as resolution, matrix connection methods, and sequent deduction (which includes tableaux). How exactly that is done for tableau is described later.


    One tries to infer \(a\) as an explanation of \(b\). That is, we have a set of observations, a theory of the domain of the observations, and a set of (possible, hypothesised) explanations that one would hope to find. For each explanation, it should be the case that the set of observations follows from the combination of the theory and the set of explanations, noting that the combination of the theory and the set of explanation has to be consistent. One can add additional machinery to these basics to, e.g., find out which of the explanations are the most interesting.

    Compared to deduction, there is less permeation of automated reasoning for abduction. From a scientist’s perspective, automation of abduction may seem appealing, because it would help one to generate a hypothesis based on the facts put into the reasoner [Ali04]. Practically, it has been used for, for instance, fault detection: given the knowledge about a system and the observed defective state, find the likely fault in the system. To formally capture theory with assumptions and facts and find the conclusion, several approaches have been proposed, each with their specific application areas; for instance, sequent calculus, belief revision, probabilistic abductive reasoning, and Bayesian networks.


    With induction, one generalises toward a conclusion based on a set of individuals. However, the conclusion is not a logical consequence of the premise. Thus, it allows one to arrive at a conclusion that actually may be false even though the premises are true. The premises provide a degree of support so as to infer \(a\) as an explanation of \(b\). Such a ‘degree’ can be based on probabilities (a statistical syllogism) or analogy. For instance, we have a premise that “The proportion of bacteria that acquire genes through horizontal gene transfer is 95%” and the fact that “Staphylococcus aureus is a bacterium”, then we induce that the probability that S. aureus acquires genes through horizontal gene transfer is 95%.

    Induction by analogy is weaker version of reasoning, in particular in logicbased systems, and yields very different answers than deduction. For instance, let us encode that some instance, Tibbles, is a cat and we know that all cats have the properties of having a tail and four legs and that they are furry. When we encode that another animal, Tib, who happens to have four legs and is also furry, then by inductive reasoning by analogy, we conclude that Tib is also a cat, even though in reality it may well be an instance of cheetah. On the other hand, by deductive reasoning, Tib will not be classified as being an instance of cat (but may be an instance of a superclass of cats (e.g., still within the suborder Feliformia), provided that the superclass has declared that all instances have four legs and are furry. Given that humans do perform such reasoning, there are attempts to mimic this process in software applications, most notably in the area of machine learning and inductive logic programming. The principal approach with inductive logic programming is to take as input positive examples \(+\) negative examples \(+\) background knowledge and then derive a hypothesised logic program that entails all the positive and none of the negative examples.

    Proofs With Tableaux

    Simply put, a proof is a convincing argument expressed in the language of mathematics. The steps in the process of the (automated) reasoning provide a proof. Several outcomes are possible for a given formula:

    • A formula is valid if it holds under every assignment5; this is denoted as “\(\models\phi\)”. A valid formula is called a tautology.
    • A formula is satisfiable if it holds under some assignment.
    • A formula is unsatisfiable if it holds under no assignment. An unsatisfiable formula is called a contradiction.

    The questions that need to be answered to realise the next step are:

    • How do we find out whether a formula is valid or not?
    • How do we find out whether our theory is satisfiable?

    A rather unpractical approach is truth tables, which may be fine in a paper-based logic course, but won’t do for computing. While there are several tools with several techniques, we will look at one that is, at the time of writing, the ‘winner’ in the realm of ontologies: tableaux-based reasoning, which is also the principal approach for DL reasoners and the OWL tools. A tableau provides a sound and complete procedure that decides satisfiability by checking the existence of a model6. It exhaustively looks at all the possibilities, so that it can eventually prove that no model could be found for unsatisfiable formulas. That is:

    • \(\phi\models\psi\) iff \(\phi\wedge\neg\psi\) is NOT satisfiable—if it is satisfiable, we have found a counterexample

    It does this by decomposing the formula in top-down fashion. Tableaux calculus works only if the formula has been translated into Negation Normal Form, however, so the first step in the process is:

    1. Push the negations inside to convert a sentence into Negation Normal Form, if applicable.

    We use the aforementioned equivalences for that. For instance, one of the De Morgan rules is \(\neg(\phi\wedge\psi)\equiv\neg\phi\vee\neg\psi\), where the ‘outer’ negation outside the brackets is pushed inside right in front of the formula, which can be done likewise with the quantifiers, such as substituting \(\neg\forall x.\phi\) with \(\exists x.\neg\phi\). 

    Now it is ready to enter the tableau. Use any or all of the following rules, as applicable (in a ‘smart’ order):


    1. If a model satisfies a conjunction, then it also satisfies each of the conjuncts:



    2. If a model satisfies a disjunction, then it also satisfies one of the disjuncts (which is non-deterministic):

    \(\dfrac{\phi\vee\psi}{\phi | \psi}\)

    3. If a model satisfies a universally quantified formula \((\forall)\), then it also satisfies the formula where the quantified variable has been substituted with some term (and the prescription is to use all the terms which appear in the tableaux),

    \(\dfrac{\forall x.\phi}{\phi\{X/t\}}\)

    \(\forall x.\phi\)

    4. For an existentially quantified formula, if a model satisfies it, then it also satisfies the formula where the quantified variable has been substituted with a new Skolem constant,

    \(\dfrac{\exists x.\phi}{\phi\{X/a\}}\)

    To complete the proof:

    3. Apply the completion rules until either:

    1. an explicit contradiction is generated in each branch due to the presence of two opposite literals in a node (called a clash), or

    2. there is a completed branch where no more rule is applicable.

    4. Determine the outcome:

    1. If all branches result in clashes, i.e., there is no completed branch, then \(\phi\wedge\neg\psi\) is NOT satisfiable, which makes the original one,  \(\phi\models\psi\), satisfiable.

    2. If there is a completed branch, then we have found a model for \(\phi\wedge\neg\psi\), hence, have found a counterexample for some assignments of the original \(\phi\models\psi\), hence \(\phi\nvDash\psi\).

    This completes the procedure.

    One can also do the above for individual formulas and not bother with the negation at the start and then simply check if one can find a model for some formula; i.e., ‘try to build a model’ (to see if everything can be instantiated) with the completion rules cf. the ‘check there is no model (when negated)’ of the tableau procedure above. In that case, the conclusions one should draw are the opposite, i.e., then if there’s a completed branch it means that that formula is satisfiable for it has found a model, and if there are only clashes, the tableau cannot find a model so there’s some contradiction in the formula. For instance, the formula \(\exists x(p(x)\wedge\neg q(x))\wedge\forall y(\neg p(y)\vee q(y))\) is unsatisfiable.

    Let’s have a look at how to apply all this, which is depicted in Figure 2.2.1 and described in the example below.

    Example \(\PageIndex{1}\):

    Let us take some arbitrary theory \(T\) that contains two axioms stating that relation R is reflexive (i.e., \(\forall x(R(x,x))\), a thing relates to itself) and asymmetric (i.e., \(\forall x,y(R(x,y)\to\neg R(y,x))\), if a thing \(a\) relates to \(b\) through relation R, then \(b\) does not relate back to \(a\)). We then can deduce, among others, that \(T\cup\{\neg\forall x,y(R(x,y))\}\) is satisfiable. We do this by demonstrating that the negation of the axiom is unsatisfiable.

    To enter the tableau, we first rewrite the asymmetry into a disjunction using equivalences, i.e., \(\forall x,y(R(x,y)\to\neg R(y,x))\) is equivalent to \(\forall x,y(\neg R(x,y)\vee R(y,x))\), thanks to applying the implication rule. Then add a negation to \(\{\neg\forall x,y(R(x,y))\}\), which thus becomes  \(\forall x,y(R(x,y))\). So, to start the tableau, we have three axioms (1, 2, 3), and subsequently the full tableau as in Figure 2.2.1.

    Screenshot (57).png

    Figure 2.2.1: Tableau example (using notation with a “.” not the brackets).

    This is a fairly simple example that uses rules 2.4 and 2.2 and only a few steps. It can quickly get more elaborate even for simpler languages such as propositional logic. If you have not seen propositional logic, you are free to skip the following example; else you may want to read through it: all the 19 (!) steps use only rules 2.1 and 2.2 (as propositional logic does not have the variables and quantifiers).

    Example \(\PageIndex{2}\):

    A sample computation to prove automatically whether the propositional formula \(((p\vee (q\wedge r))\to ((p\vee q)\wedge (p\vee r))\) is valid or not is included in Figure 2.2.2 and Figure 2.3.1, using tableau reasoning (see Deduction). The tableau method is a decision procedure that checks the existence of a model (i.e., that it can be instantiated). It exhaustively looks at all the possibilities, so that it can eventually prove that no model could be found for unsatisfiable formulas (if it is satisfiable, we have found a counterexample). This is done by decomposing the formula in top-down fashion after it has been translated into Negation Normal Form (i.e., all the negations have been pushed inside), which can be achieved using equivalences. Further, if a model satisfies a conjunction, then it also satisfies each of the conjuncts (“\(\wedge\)”), and if a model satisfies a disjunction (“\(\vee\)”), then it also satisfies one of the disjuncts (this is a non-deterministic rule and it generates two alternative branches). Last, one has to apply these completion rules until either (a) an explicit contradiction is obtained due to the presence of two opposite literals in a node (a clash) is generated in each branch, or (b) there is a completed branch where no more rule is applicable.

    Screenshot (58).png

    Figure 2.2.2: Sample computation using semantic tableau proving that the propositional formula is valid; see Figure 2.3.1 for an explanation.



    3The remainder of Section 2.2 consists of amended versions of my “Reasoning, automated” essay and related definitions that have been published in Springer’s Encyclopedia of Systems Biology.


    5typically, the assignments are ‘true’ and ‘false’, but there are also other logics that allow more/other assignments

    6model in the sense of logic (recall Definition 2.1.11), not in the sense of ‘conceptual data model’ like a UML class diagram.