Philosophical Insights on Deduction

Curtis Franks
1. Preview
Attempts to articulate the real meaning or ultimate significance of a famous theorem
comprise a major vein of philosophical writing about mathematics. The subfield of mathe-
matical logic has supplied more than its fair share of case studies to this genre, Gödel’s (1931)
incompleteness theorem being probably the most frequent exhibit. This note is about the De-
duction Theorem—another result from mathematical logic, of roughly the same vintage. I aim
to make clear, in the simplest possible terms, what the theorem says in the several guises it
has taken, how to prove it in each associated framework, and especially the role it has played
in logic’s development over nearly two centuries.
But do not expect the theorem to submit, here, to anything like a final analysis. I intend
the exercise to serve as an ancillary to a thesis contrary to such ambitions: that the meaning of
important mathematics is unfixed—expanding over time, informed by new perspectives and
theoretical advances, and in turn giving rise to new developments. I want to avoid any im-
pression that the Deduction Theorem is the sort of thing that can be fully understood. Ideally,
familiarity with the ways in which it has taken on new meaning over time will prepare readers
for its continued reinvention, even to become the authors of its future embellishments.
Other histories of the Deduction Theorem (Pogorzelski 1968, Porte 1982, Czelakowski
1985) have been written, but our focus will differ from theirs. Rather than document when the
theorem was proved for different systems, when its minimal conditions were specified, or how
it was generalized from the setting of sentential logic to larger classes of algebraic structures,
we will not look beyond the basic phenomenon. My plan is that by taking a second look,
and then a third and a fourth, we might extract its several meanings according to the different
perspectives from which logicians have encountered it.
2. Classical truth
To begin thinking about the Deduction Theorem, consider first the concept of validity in
classical propositional logic. In this setting, we have compound formulas built up recursively
from atoms with propositional connectives ∨, ∧, ¬, and ⊃. An interpretation of a formula
is an assignment of truth values from {T, F} to its atoms. The truth value of a formula
on an interpretation is then determined according to a composition of functions, where each
propositional connective is a ‘truth function’ mapping a truth value (in the case of ¬) or pair
of truth values (in the other cases) to another truth value. (For example, ⊃ is interpreted as the
‘material conditional’ which maps the pair ⟨T, F⟩ to F and all other pairs to T.)
The classical propositional validities are the formulas that receive the value T on every interpretation. When a propositional formula A is classically valid, this fact is denoted ⊨_CPL A (here CPL abbreviates classical propositional logic). In his logical writings¹ W. V. O. Quine insisted that for these validity claims and their generalizations as described below, one write ⊨_CPL ‘A’ and ‘A’ ⊨_CPL ‘B’, indicating with quotation marks that the turnstile symbol abbreviates ordinary language, whereas one mentions but does not use the formulas A and B. Preferring legibility and conformity with modern usage, we will not heed Quine’s scruples on this matter. But it is important to be aware that Quine is correct that in making a validity claim, one is saying that the formula is valid, not making a compound statement out of the formula as one does with expressions like ¬A and A ⊃ B.
Generalizing the concept of validity, one also writes A ⊨_CPL B to say that on every interpretation on which A receives the value T, so too does B. Even more generally, if Γ is a set of classical propositional formulas, then Γ ⊨_CPL B means that B receives the value T on every interpretation on which every formula in Γ does.² Thus just as the material conditional A ⊃ B is the classical analysis of the conditional utterance ‘If A, then B’, Γ ⊨_CPL B is the classical analysis of the claim that, taken jointly, the formulas in Γ logically imply the formula B. In the case when Γ is empty, this last claim is just that B is logically implied without any assumptions at all, i.e., that B is logically valid in its own right. That is the sense in which classical implication is a generalization of classical validity.
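Because a formula has only finitely many interpretations, these definitions can be checked mechanically. The following is a minimal Python sketch of the semantics just described; the encoding of formulas as nested tuples and the function names are illustrative choices of mine, not notation from the text.

```python
from itertools import product

# Formulas as nested tuples: ('atom', 'p'), ('not', f), ('and', f, g),
# ('or', f, g), ('imp', f, g).  The encoding is an illustrative choice.

def atoms(formula):
    """Collect the atoms occurring in a formula."""
    if formula[0] == 'atom':
        return {formula[1]}
    return set().union(*(atoms(sub) for sub in formula[1:]))

def value(formula, interp):
    """Truth value of a formula on an interpretation, a dict from atoms to True/False."""
    tag = formula[0]
    if tag == 'atom':
        return interp[formula[1]]
    if tag == 'not':
        return not value(formula[1], interp)
    if tag == 'and':
        return value(formula[1], interp) and value(formula[2], interp)
    if tag == 'or':
        return value(formula[1], interp) or value(formula[2], interp)
    if tag == 'imp':  # the material conditional: F only on the pair <T, F>
        return (not value(formula[1], interp)) or value(formula[2], interp)
    raise ValueError(tag)

def consequence(gamma, b):
    """Gamma |= b: b receives T on every interpretation on which every member of gamma does."""
    names = sorted(set().union(atoms(b), *(atoms(g) for g in gamma)))
    return all(value(b, dict(zip(names, vals)))
               for vals in product([True, False], repeat=len(names))
               if all(value(g, dict(zip(names, vals))) for g in gamma))

def valid(a):
    """|= a: a receives T on every interpretation, i.e. is implied by the empty set."""
    return consequence([], a)

p, q = ('atom', 'p'), ('atom', 'q')
assert valid(('or', p, ('not', p)))          # a classical validity
assert consequence([p, ('imp', p, q)], q)    # p together with p ⊃ q implies q
```

The `valid` function simply calls `consequence` with an empty set of premises, mirroring the remark above that validity is the special case of implication from no assumptions at all.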
Now a certain similarity between the material conditional of classical logic and the con-
cept of logical implication, classically analyzed, is hard to miss. But many beginners (and,
famously, occasional experts like Bertrand Russell) are misled by the similarity to simply iden-
tify the two. So even though it is an elementary point, let us make their relationship precise.
The claim that, if you arrive at my house before 3pm, then I will give you a ride to the concert
can be formalized A ⊃ B. Whether or not this conditional claim is true on the classical anal-
ysis depends only on the truth or falsity of A and B. But even on this analysis, if you arrive
at my house before 3pm and I do give you a ride to the concert, so that A ⊃ B turns up true,
it would be wrong to say that A logically implies B. The truth of the claim that A logically
implies B does not depend in any way on whether you show up on time or what I proceed to
do afterwards. In fact, if A did logically imply B, then there would be no reason for me to
assure you of anything with an utterance like A ⊃ B. The utterance would be uninformative, because you could just do some logic in private and determine that since B is what you want and B is implied by A, you need only worry yourself with A, knowing that if you succeed, B is automatic.

¹ See any of the papers in Quine 1995.

² We follow the convention of denoting the set containing a single formula A by ‘A’ and denoting the union of Γ and ∆ by ‘Γ, ∆’.
So what is the relationship? Assume that Γ, A logically implies B, i.e. Γ, A ⊨_CPL B. Then by definition it is not possible to assign truth values to atoms so that each formula in Γ and A all turn up T and B turns up F. That means that on every interpretation of Γ, A, and B, either at least one formula among Γ and A receives the value F or B receives the value T. Either way, observe that Γ ⊨_CPL A ⊃ B. Assume conversely that Γ ⊨_CPL A ⊃ B. That means that A ⊃ B receives the value T on every interpretation that assigns T to each formula in Γ. For this to be so, there must be no interpretation on which each formula in Γ and A all receive T and B receives F. Therefore Γ, A ⊨_CPL B.

We have just verified the fundamental identity

    Γ, A ⊨_CPL B if, and only if, Γ ⊨_CPL A ⊃ B.     (1)
On the classical analysis, logical implication is the same, not as the truth of a conditional state-
ment, but as the validity of one. (The same identity holds, and by the same line of reasoning,
for classical quantification theory.)
This ‘theorem’ might be too trivial to deserve a name. As is typical for the verification
of identities among constructions in an elementary semantic environment, our demonstration
just involved verifying that the definitions amount to the same thing. But it does encode some
remarkable depth. Neither the classical analysis of logical implication nor the material condi-
tional is without controversy. But they developed in historical isolation from one another. It
is not hard to locate advocates of classical implication who reject the truth-functional analysis
of conditional expressions, and vice versa. That these two analyses are so closely related,
however trivial it is to verify, is therefore surprising.
On the other hand, one might have the intuition that logical implication ought to corre-
spond to the validity of a conditional statement, independently of any specification of either
concept. This leads to a more abstract understanding of (1), untethered from the classical
theory of truth. Such appears to have been Bolzano’s approach to conditionals. In §224 of
Wissenschaftslehre Bolzano claimed that from the observation that (a) one can ‘deduce’ a
proposition from a set of premises one can conclude that (b) it is possible to ‘deduce’ from a
subset of those premises a conditional whose antecedent is the conjunction of the remaining
sentences from the original premises and whose consequent is the conclusion of the origi-
nal ‘deduction’. Some commentators (see van Benthem 1985, Šebestik 2016) have read in
Bolzano’s inference from (a) to (b) an anticipation of the identity (2) articulated in the follow-
ing section. However, Bolzano’s concept of deduction (Ableitbarkeit) deals not with formal
derivability as understood by modern logicians but with a characteristically semantic notion,
the preservation of truth under reassignments of ‘ideas’ to the terms appearing in sentences.3
It is clear that his claim is instead closer to (1). But for Bolzano, there was no question of
verifying this claim. In his direct treatment of conditionals (§179), he does not provide any-
thing like a full analysis in these terms. Instead, the claim of §224 is his official account of
conditional language.
The reason I say that Bolzano’s claim is ‘close to’ rather than identical to (1) is that
Bolzano’s concept of ‘deducibility’ differs in important ways from the classical account of logical consequence.⁴ Already from this it is clear that the form of identity (1) does not de-
pend essentially on all the details of classical logic. More importantly, we see Bolzano using
a version of (1) to define conditionals in terms of his (non-classical) concept of deducibil-
ity. Following Bolzano, rather than view (1) as an identity in need of verification, one could
just assume it. Then beginning with the classical analysis of logical implication, one could
establish via (1) the truth-functional analysis of conditionals.
3. Classical inference
The identity between logical implication, on the one hand, and the validity of a corre-
sponding conditional claim, on the other hand, is a template for another putative correspon-
dence. Running parallel to the analysis of logic in terms of truth conditions is an alterna-
tive analysis in terms of inference. According to this analysis, the logicality of a formula
amounts to its derivability in some predetermined systematic manner, and that of an implica-
tion amounts to one formula’s systematic derivability from others. One may ask: whenever an
implication is verified with a derivation in this manner, is a corresponding conditional claim
also necessarily derivable?
Obviously, the clarity and full meaning of this question depends on the specification of
the manner of derivation. The earliest articulation of a formal system of logical deduction
sufficient to pose this question meaningfully appears in the (1879) Begriffsschrift of Gottlob
Frege. There, Frege introduced the idea of a formal proof system for logic with designated
axiomatic formulas and rules of inference. In modern notation, the propositional axioms of
Begriffsschrift are
1. A ⊃ (B ⊃ A)
2. (C ⊃ (B ⊃ A)) ⊃ ((C ⊃ B) ⊃ (C ⊃ A))
3. (B ⊃ A) ⊃ (¬A ⊃ ¬B)
4. A ⊃ ¬¬A
5. ¬¬A ⊃ A

and the single rule of inference is modus ponens, licensing the inference from the premises A ⊃ B and A to the conclusion B.⁵ A formal proof in such a system as Frege’s is a finite list of formulas, each of which is either an axiom or follows from previous entries according to one of the inference rules. The propositional fragment of the ‘laws of thought’ then comprises the formulas that can be obtained as the final line of a formal proof.

³ Further details about Bolzano’s theory of deduction and its relationship to his theory of ground and consequent can be found in Franks 2014. The second of these is the theory more closely related to the modern concept of derivability.

⁴ Again, see Franks 2014.
The turnstile notation introduced above in fact originates in Frege’s work, where he de-
scribes the symbol as being made up of two parts, the horizontal ‘content stroke’ and the
vertical ‘judgement stroke’.6 There is debate about the significance of the turnstile and its parts
in Frege’s thought, fueled in large part by some perplexing remarks that Frege made. There
is, for example, the notorious doctrine, repeated by Frege enthusiastically on many occasions,
that it is not possible to infer anything from ‘mere assumptions’. Frege insisted that inference
could only be performed from judgements. This sentiment is what drove Frege’s critique of
the method of testing the consistency of some arbitrary hypotheses by seeing whether it is pos-
sible to infer from them a contradiction. According to Frege, this enterprise makes no sense,
because if you haven’t already ‘judged’ the truth of your hypotheses, then you cannot infer
anything from them, and if you have judged their truth, then you know in advance that they
are consistent and have nothing to test by inferring from them (apparently one cannot ‘judge’
incorrectly).
A related idea appears to motivate Frege’s theory of the conditional. Frege is often at-
tributed with advocacy of the material conditional, but the truth of the matter is more subtle.
He did, in point of fact, define the expression A ⊃ B as the denial of the case where A ob-
tains and B doesn’t. However, on numerous occasions he explicitly denied that this is an
adequate translation of conditional expressions. In Begriffsschrift itself, he emphasized that
‘the causal connection implicit in the word “if” . . . is not expressed by our symbols’. Indeed,
after pointing out that the nested expression A ⊃ (B ⊃ C) ‘denies the case in which C is
denied and B and A are affirmed’ he added that ‘if a causal connection is present, we can also
say . . . “If the circumstances B and A occur, then C occurs also”’. But then Frege remarked that ‘a judgement of this kind can be made only on the basis of such a connection’. So we see that on Frege’s view material conditionals do not express ordinary ‘if . . . then’ usage, and yet they can only be judged to be true based on the sort of connection such usage indicates. Thus Frege’s analysis of conditional expressions is in terms, not of the expression A ⊃ B, but of the expression ⊢ A ⊃ B.

⁵ Frege also listed (D ⊃ (B ⊃ A)) ⊃ (B ⊃ (D ⊃ A)) (formula 8) as an axiom, although it can be derived from the five designated here (in fact, from just the first two). Following Frege, we will call any substitution instance of one of the above axioms an axiom and do without a ‘rule of substitution’. We thereby avoid the complication that a substitution rule introduces when one considers Γ-derivations, as defined below. With such a rule, one has to keep track of the pedigree of every formula in a derivation to know whether substitution is allowed.

⁶ Frege’s turnstile looked like this, ⊢, with single vertical and horizontal strokes, even though they expanded continuously into sometimes quite complicated arrays of ‘conditional strokes’ and associated continuations. A standard modern usage reserves turnstiles with single horizontal strokes to represent derivability in formal proof systems, distinguished from turnstiles (like those in §2) with double horizontal strokes which stand for semantic consequence.
Perhaps because of the intrigue of what I called Frege’s notorious doctrine, it is typically
forgotten that in a discussion of expressions without the judgement stroke in §2 of Begriffs-
schrift Frege wrote that one might present such an assertion ‘in order to derive some con-
clusions from it and with these test the correctness of the thought’. Also, in a letter to Hugo
Dingler in which he again objected to the idea of ‘draw[ing] conclusions from the propositions
of a group’ without ‘first exclud[ing] all propositions whose truth is doubtful’, Frege never-
theless mentioned that ‘from the thought that 2 is less than 1 and the thought that if something
is less than 1 then it is greater than 2, one can derive that 2 is greater than 2’ (1917, p. 17).
Clearly Frege maintained that ‘when we infer, we recognize a truth on the basis of other
previously recognized truths according to a logical law’ (1917, p. 17) but that a type of rea-
soning unworthy of the label ‘inference’ is also possible: A derivation, not from judgements
but from ‘thoughts’, can establish that a conditional statement is true. Here is how Frege
elaborated this point in 1910: ‘I can, indeed, investigate what consequences result from the
supposition that A is true without having recognized the truth of A; but the result will then
contain the condition if A is true’. Given what we have observed about Frege’s account of
conditionals, it seems that Frege is saying that such derivations from suppositions establish
more than mere statements of the form A ⊃ B—they establish judgements of such state-
ments’ truth. Indeed, Frege continued: ‘Under [such] circumstances we can, by means of a
chain of conditions, obtain a concluding judgement of the form A ⊃ (B ⊃ (C ⊃ D))’.
Now, I am aware that the suggestion cuts against a long-established orthodoxy of Frege
interpretation, and I doubt that Frege himself had entirely consistent views on the matter, but
the ideas assembled here strongly suggest that Frege’s turnstile served a role analogous to the
modern turnstile of formal derivation.7 There are, at any rate, only three ways that Frege ever
goes about establishing a judgement: informal verification of an axiom, proof in the system
of Begriffsschrift, and derivation of one formula from another, taken merely as a supposition,
establishing thereby a conditional.
⁷ Before the establishment of an orthodoxy, Quine spotted the same analogy. In 1951, he said that the meaning Frege attached to this symbol, though ‘somewhat obscure’, was ‘near enough’ to formal derivability for him to retain the notation (p. 88).

Let us define a Γ-derivation as a finite list of formulas, each of which is either a member of the set Γ, an axiom, or the result of an application of the rule modus ponens to two formulas occurring earlier in the list. With the label B indicating the propositional fragment of Begriffsschrift, we index Frege’s turnstile and denote the existence of a proof of a formula A by ⊢_B A and the existence of a Γ-derivation of A by Γ ⊢_B A. Clearly, in case Γ ⊢_B A ⊃ B, it follows that Γ, A ⊢_B B: just append to the end of a given Γ-derivation of A ⊃ B two final entries, A and B, and observe that because B follows by modus ponens from A and A ⊃ B, the amended list is a Γ, A-derivation of B. Frege appears to be committed also to the converse claim, A ⊢_B B only if ⊢_B A ⊃ B. This generalizes to a syntactic analogue to the identity (1) from the previous section:

    Γ, A ⊢_B B if, and only if, Γ ⊢_B A ⊃ B.     (2)
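To make the definitions concrete, here is a minimal Python sketch of Γ-derivations in the propositional fragment B, restricted to the two axiom schemas that the argument of §4 will actually use; the tuple encoding of formulas and the names `imp`, `is_axiom`, and `is_gamma_derivation` are my own, not anything from the text.

```python
# Formulas: atoms are strings; A ⊃ B is ('imp', A, B); ¬A would be ('not', A).
# A derivation is a plain Python list of formulas.

def imp(a, b):
    return ('imp', a, b)

def is_axiom(f):
    """Substitution instances of Frege's axioms 1 and 2, the only two that the
    proof of the Deduction Theorem invokes; axioms 3-5 could be added similarly."""
    if not (isinstance(f, tuple) and f[0] == 'imp'):
        return False
    _, a, b = f
    # Axiom 1:  A ⊃ (B ⊃ A)
    if isinstance(b, tuple) and b[0] == 'imp' and b[2] == a:
        return True
    # Axiom 2:  (C ⊃ (B ⊃ A)) ⊃ ((C ⊃ B) ⊃ (C ⊃ A))
    if (isinstance(a, tuple) and a[0] == 'imp' and
            isinstance(a[2], tuple) and a[2][0] == 'imp' and
            isinstance(b, tuple) and b[0] == 'imp'):
        c, ba = a[1], a[2]
        if b[1] == imp(c, ba[1]) and b[2] == imp(c, ba[2]):
            return True
    return False

def is_gamma_derivation(lines, gamma, conclusion=None):
    """Every entry must be a member of gamma, an axiom, or follow from two earlier
    entries by modus ponens; optionally require a particular final entry."""
    for i, f in enumerate(lines):
        if f in gamma or is_axiom(f):
            continue
        if any(lines[j] == imp(lines[k], f) for j in range(i) for k in range(i)):
            continue
        return False
    return conclusion is None or (bool(lines) and lines[-1] == conclusion)
```

The easy direction of (2) just noted corresponds to the observation that if `lines` passes `is_gamma_derivation(lines, gamma)` and ends with `imp(a, b)`, then `lines + [a, b]` passes `is_gamma_derivation(lines + [a, b], gamma + [a])`.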
4. Two proofs
It is natural to wonder why Frege, given his celebrated exactitude, did not bother to
verify the identity (2) that closed the last section. He does tell us, also without providing
any justification, that his formal system is complete in the sense that all ‘laws of thought’
(presumably all the formulas that would submit to the sort of informal verification that Frege
provides for his axioms) have proofs. Frege’s conviction that derivation from assumptions
generates laws of thought in conditional form supplies us with a convenient way to test his
completeness claim: A verification of (2) would be considerable evidence that the claim is
correct. Alternatively, one could be so convinced of the completeness of the proof system as
to use it to determine whether derivation from assumptions does in fact yield true judgements:
again, verification of (2) does the trick.
Speculation about Frege’s lack of interest in this question can lead in any number of di-
rections. I would only point out that the observation from §3, that Frege’s remarks ‘strongly
suggest’ an analogy between his turnstile and our own, can only be made in hindsight. It
is an anachronism, but a useful one. Whatever Frege might have ultimately thought about
the significance of the judgement stroke—and, again, it is not at all clear that his various
pronouncements are all consistent—he did not have in mind what we think of today. Our
provability turnstile is informed by, but is a refinement of, his distinction between mere asser-
tion and judgement. Armed with it, we are able to read in Frege’s writing a question that he
appears not to have noticed himself.
Setting aside the origins of (2) and turning to its verification, one thought that might
occur to you is that given the soundness and completeness of the propositional fragment of
the system in Begriffsschrift (Frege was proven correct about this a few decades later by Paul
Bernays (Bernays 1918)) with respect to the classical truth functional semantics described in
§1, a very simple verification is possible.
Proof. Recall that we already verified one direction of (2), from ⊢_B A ⊃ B to A ⊢_B B. Here is an alternative demonstration of the same fact: Assume ⊢_B A ⊃ B. Because the system is sound, it follows that ⊨_CPL A ⊃ B. Then by (1), A ⊨_CPL B, and by the system’s completeness, A ⊢_B B. Because of the simplicity of the argument in §3, this alternative demonstration seems to be needlessly baroque, appealing as it does to such notions as truth-functions that do not pertain to the question at hand. But it has the advantage of being easily reversible: From the assumption A ⊢_B B, soundness gives A ⊨_CPL B, which by (1) implies ⊨_CPL A ⊃ B, from which ⊢_B A ⊃ B follows by completeness.
This proof of (2) uses only the identity (1) and the semantic completeness of proposi-
tional logic, ideas that were explicit already in Paul Bernays’s dissertation and David Hilbert’s
lectures from 1917–18 (see Franks 2017 for details). But neither the proof nor the identity
was noted by anyone in that decade. There are several reasons why this is so.
First of all, the proof is unnatural. As already indicated, (2) is a question wholly in the
realm of syntax. It is intrusive to tether this context to the theory of truth functions in order to
establish how the syntax hangs together. Certainly, given their aversion to semantic arguments,
the Hilbert school would have been disinclined to proceed in this way (Franks 2017 again has
references).
It is also uninformative. The proof tells us that a certain list of formulas exists, but
it doesn’t indicate how to construct it. The whole interest in ‘proof theoretical’ facts like
(2) derives from an interest in actually obtaining proofs, not merely from knowing that they
exist. Recall from §3 the simple proof of the inference from ⊢_B A ⊃ B to A ⊢_B B. The
A-derivation witnessing that second claim is such a simple adjustment of the original proof
that witnesses the first claim that verifying the inference without spotting this construction just
seems unacceptable.
Notice as well that the proof’s simplicity is an illusion. The proof of the completeness of
propositional logic, suppressed here, is complex relative to the problem at hand. A proof that
does not rely on an established correspondence between syntax and semantics promises to be
less complex, not only more natural. This is especially true when one leaves the context of
propositional logic. The same identity holds in first-order quantification theory. Its verification
via a completeness theorem is again available. But Gödel’s completeness theorem is highly
complex, incorporating infinitary principles of reasoning.
This leads to a final point. The proof just given lacks modularity. To prove (2), we appeal
to Bernays’s completeness theorem. To prove its analog for first-order quantification theory,
we invoke Gödel’s result. What about second-order logic or intuitionistic logic? We may be
lacking completeness results for some of these theories. Or we may have completeness, but
the analog of (1) might be difficult to verify directly. In any case, it doesn’t seem at all right
that the verifications of each analog of (2) should differ so greatly. The right proof of (2)
should pick out a reason why the identity holds that applies in all the analogous settings.
Contrast the following proof of Γ, A ⊢_B B only if Γ ⊢_B A ⊃ B, using the principle of mathematical induction on the length of the Γ, A-derivation of B:

(BASE CASE) A Γ, A-derivation of B that is one line long consists of a single formula B that is either (1) an axiom, (2) the formula A itself, or (3) a member of Γ. The following three Γ-derivations exhibit the fact that Γ ⊢_B A ⊃ B in case 1, 2, and 3, respectively:

    ⟨B, B ⊃ (A ⊃ B), A ⊃ B⟩

    ⟨A ⊃ ((A ⊃ A) ⊃ A), (A ⊃ ((A ⊃ A) ⊃ A)) ⊃ ((A ⊃ (A ⊃ A)) ⊃ (A ⊃ A)), (A ⊃ (A ⊃ A)) ⊃ (A ⊃ A), A ⊃ (A ⊃ A), A ⊃ A⟩

    ⟨B, B ⊃ (A ⊃ B), A ⊃ B⟩

It is easy to check that each line of each derivation is either a member of Γ, an axiom, or the result of an application of modus ponens.⁸ The middle sequence qualifies as a Γ-derivation of A ⊃ B because of the assumption that A is B.
(INDUCTION STEP) Assume that, for any formula E, whenever there is a Γ, A-derivation of E that is n or fewer lines long, there is a Γ-derivation of A ⊃ E. Now suppose there is an n+1 line long Γ, A-
derivation (called D) of B. It is possible that the justification for line n+1 is one of the possi-
bilities from the base case—that B is an axiom, is from the set Γ, or is identical to A. In any
of those cases, the Γ-derivation of A ⊃ B would be constructed as in the base case.
Ordinarily, though, a multi-line derivation is long for a reason, and its last line appears as
the conclusion from an application of modus ponens. In this case, the formulas C ⊃ B and C
occur as earlier lines in D. Therefore Γ, A-derivations that are n lines long or shorter of C ⊃ B
and C appear as subsequences of D. The induction hypothesis then guarantees the existence
of Γ-derivations of A ⊃ (C ⊃ B) and of A ⊃ C. Let D1 = ⟨S1, S2, . . . , A ⊃ (C ⊃ B)⟩ and D2 = ⟨T1, T2, . . . , A ⊃ C⟩ be examples of such derivations, and consider the sequence:
D1
D2
(A ⊃ (C ⊃ B)) ⊃ ((A ⊃ C) ⊃ (A ⊃ B))
(A ⊃ C) ⊃ (A ⊃ B)
A ⊃ B

This is a Γ-derivation of A ⊃ B.

⁸ See footnote 5.
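The construction just given is effective, so it can be carried out by a short program. The sketch below is mine: it reuses the hypothetical `imp` and `is_axiom` helpers from the earlier sketch, assumes its input really is a Γ, A-derivation, and simply follows the base and inductive cases of the proof.

```python
def deduct(lines, gamma, a):
    """Turn a derivation of B from gamma together with the extra assumption a
    into a gamma-derivation whose final entry is a ⊃ B."""
    out = []
    for i, f in enumerate(lines):
        if f == a:
            # base case: the assumption itself; derive a ⊃ a from axioms 1 and 2
            aa = imp(a, a)
            out += [imp(a, imp(aa, a)),
                    imp(imp(a, imp(aa, a)), imp(imp(a, aa), aa)),
                    imp(imp(a, aa), aa),
                    imp(a, aa),
                    aa]
        elif f in gamma or is_axiom(f):
            # base cases: f stands on its own; axiom 1 then modus ponens give a ⊃ f
            out += [f, imp(f, imp(a, f)), imp(a, f)]
        else:
            # inductive case: f came by modus ponens from some earlier c and c ⊃ f,
            # so a ⊃ c and a ⊃ (c ⊃ f) already occur in out; axiom 2 does the rest
            c = next(lines[k] for j in range(i) for k in range(i)
                     if lines[j] == imp(lines[k], f))
            out += [imp(imp(a, imp(c, f)), imp(imp(a, c), imp(a, f))),
                    imp(imp(a, c), imp(a, f)),
                    imp(a, f)]
    return out
```

Every entry of the returned list is an axiom, a member of `gamma`, or a modus ponens consequence of earlier entries, and the list ends with a ⊃ B when the input ends with B, so it passes the `is_gamma_derivation` check from before.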
Observe that this proof has all the features that the first proof was missing. It is natural in
the sense that we expect reasoning about what derivations must exist based on what derivations
are known already to exist to be about those derivations themselves. The proof does not change
topics in order to reach its conclusions.
It is also informative, in that it tells us how actually to use the original derivation as
raw material and construct out of it the derivation we are interested in. This is a popular
pedagogical use of the Deduction Theorem: To find a Frege-style proof, say, of (A ⊃ B) ⊃
((C ⊃ D) ⊃ ((B ⊃ C) ⊃ (A ⊃ D))) might be quite challenging. The proof of the Deduction
Theorem just given makes it easy: Begin with a seven-line A ⊃ B, C ⊃ D, B ⊃ C, A-derivation of D, transform this according to the construction given in the proof into an A ⊃ B, C ⊃
D, B ⊃ C-derivation of A ⊃ D, and continue in similar fashion. More importantly, though,
the proof is for many students a first encounter with a ‘proof-theoretical’ result, an argument
that an object with certain properties must exist based on available manipulations of related
objects assumed already to exist. We will have an occasion to say something more about the
significance of this later on.
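Under the same assumptions as the sketches above (the hypothetical `imp`, `is_gamma_derivation`, and `deduct` helpers), the pedagogical recipe just described can be mechanized by discharging the assumptions one at a time:

```python
# Reuses imp(), is_gamma_derivation(), and deduct() from the sketches above.
A, B, C, D = 'A', 'B', 'C', 'D'
gamma = [imp(A, B), imp(C, D), imp(B, C)]

# the seven-line derivation of D from gamma together with the assumption A
d0 = [A, imp(A, B), B, imp(B, C), C, imp(C, D), D]
assert is_gamma_derivation(d0, gamma + [A], D)

d1 = deduct(d0, gamma, A)                  # ends with A ⊃ D
d2 = deduct(d1, gamma[:2], imp(B, C))      # ends with (B ⊃ C) ⊃ (A ⊃ D)
d3 = deduct(d2, gamma[:1], imp(C, D))
d4 = deduct(d3, [], imp(A, B))             # ends with the target formula
target = imp(imp(A, B), imp(imp(C, D), imp(imp(B, C), imp(A, D))))
assert is_gamma_derivation(d4, [], target)
```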
The proof is certainly simpler than any line of reasoning that courses through the com-
pleteness theorem. In fact, the proof makes evident that (2) does not depend even on all the
features of classical propositional logic. Frege’s axioms 3, 4, and 5 are never invoked.
Its modularity lies in its simplicity. Immediately we see that the same identity holds for
the full system of Begriffsschrift, not just for its propositional fragment. All that matters is
the presence of axioms 1 and 2. Recall that the first proof of (2) suggested that its analog
for classical quantification theory would depend on Gödel’s completeness theorem and left us
wondering if the same identity even holds in intuitionistic logic. We see now that (2) depends
on no such high-level properties of any formal system and is true in quantification theory for
the same reason that it is true in propositional logic, in their classical, intuitionistic, and many
other varieties.
5. Origins
The proof of (2) just presented is customarily attributed to Jacques Herbrand. In chapter
3 of his (1930) thesis, Herbrand proved that if a mathematical theory T can be axiomatized in
the language of the system of Principia Mathematica, then
• because of the finite nature of proofs, any formula P that can be shown to be derivable
in T can be derived in fact from some finite number of T’s axioms (2.41)
• if we let H denote the conjunction of a finite number of T’s axioms, then a necessary
and sufficient condition for the derivability of P from H is the provability of H ⊃ P (2.4)
• if T1 and T2 differ only in the inclusion of the axiom A in the former, then a necessary
and sufficient condition for the derivability of P from T1 is the derivability of A ⊃ P
from T2 (2.43)
Herbrand’s proof of 2.4 is not particularly lucid or memorable. He remarks that the rea-
soning behind it is recursive, distinguishes the relevant base and inductive cases, and indicates
the key formulas that underlie the proof transformations that can be made. But, unhelpfully,
he throws considerations about quantified formula transformations into the mix, deals with
two separate propositional transformation rules (a rule of ‘simplification’ in addition to modus
ponens), and does not highlight at all the fact that proofs are to be constructed in the way
indicated in the last section. For these reasons, one must have some significant acquaintance
with the system he is using—which is not suited to make the essential components of the
proof perspicuous—and various facts he has already proved about it, as well, it seems, as
some intuition about the kind of construction he is aiming for in order to follow the argument.
By contrast, we have seen, the system in Begriffsschrift seems almost to have been reverse-
engineered to accommodate the proof of (2) given in §4: among other things, its first two
axioms are just the formulas the proof calls for, despite Frege’s observation that ‘of course,
it must be admitted that the reduction’ of the infinitely many laws of thought to a few basic
axioms ‘is possible in other ways besides this particular one’ (1879, §13).
But the proof of the Deduction Theorem is elementary in any setting, and it is a mistake
to think that Herbrand’s accomplishment is its discovery. There is no doubt that Frege would
have produced it readily had it occurred to him to look for such a thing. He did not look,
because the relationship he observed between derivability from assumptions and direct proof
did not strike him as the sort of thing that stood in need of verification. It is worth considering
why Frege’s view down this path did not extend as far as Herbrand’s.
Already we have noted that despite certain affinities between his usage and our own, the
turnstile did not represent for Frege exactly what it represents for modern logicians. In his
own words, it indicates only that a proposition is judged to be true, and he does not clearly
distinguish logical and factual truth. This alone explains little, however. Herbrand followed
Frege (and Russell) on this score: Like Frege, he said that the turnstile flagged a proposition
as ‘true’, rather than ‘valid’ or ‘provable’; and like Herbrand, Frege described no method for
the discovery of a ‘truth’ worthy of a turnstile other than proof in his formal system and ver-
ification of a conditional by deriving, in the same system, its consequent from its antecedent.
Evidently the vivid distinction between truth and validity, and between conditional statements
and implication claims9 , that figures so prominently in our understanding of the Deduction
Theorem played no role in Herbrand’s discovery.
Probably a greater obstacle was Frege’s ambiguous and somewhat strained attitude about
reasoning from arbitrary assumptions. As indicated in §3, such reasoning does not qualify as
inference, and therefore among other things it cannot be used to test the mutual compatibility
of a set of hypotheses. Although he did, when pressed, grant a legitimate use of such rea-
soning, these acknowledgements were never occasions for recommendation. Frege stressed:
Yes, derivation from assumptions is possible, but all you will generate is a conditional. His
negative tone, together with his primary agenda of defending his peculiar concept of inference
from any intrusions of this sort, indicates that it did not occur to him that this might in fact be
a particularly convenient or efficient way of judging a conditional’s truth.
Most importantly, though, the general research setting that inspired Herbrand’s whole
discussion of these matters was very distant from Frege’s thought. Herbrand viewed logic
primarily as a vehicle for extracting consequences of mathematical axioms. He hastened to
point out, for example, that his fundamental theorem—establishing that the modus ponens
rule could be systematically eliminated from derivations of pure logic—does not extend to
formal axiomatizations of arithmetic, where ‘the rule remains necessary’ (1930, chapter 5,
6.1), that although the rule is ‘useless in logic’, it ‘remains indispensable in mathematical
theories’ (1929). By contrast, Frege conceived of logic as the basis of mathematical theories,
especially of arithmetic, whose axioms, being rules of thought, would be demonstrable in
the system of Begriffsschrift. Frege (wrongly) believed that there would be no occasion for
establishing conditionals like H ⊃ P, neither directly in Begriffsschrift nor by deriving P
from a mathematical theory: Because arithmetical truths are laws of thought, so are their
consequences. Therefore, if H is a conjunction of axioms of arithmetic, then instead of proving
H ⊃ P one could just prove P on its own.
In any event, Herbrand’s breakthrough was conceptual. Because he realized that logical derivation from arithmetical axioms differs in important ways from reasoning in pure logic, the question of how these activities align became, for him, central and urgent. That explains why,
despite it holding for arbitrary assumptions, Herbrand frames the Deduction Theorem in the
specific terms of the axioms of a mathematical theory. It also explains why the initial insight
that we saw Frege gesturing towards, that derivation from an assumption corresponds to proof of a conditional, not only struck Herbrand as standing in need of proof but came in the fully general form ‘derivation from assumptions corresponds to derivation of a conditional with one fewer assumption’—our (2) and Herbrand’s (2.43): for Frege derivation from assumptions is a shift away from reasoning from the laws of thought, something to be explained away in terms of those laws and, in the event that the assumption was an arithmetical axiom, to be further exposed as a needless activity. Because Herbrand saw derivation from assumptions instead as essential to mathematical thought, he turned to logic as a means for understanding and working with it, not for eliminating it.

⁹ Like Russell, Herbrand called ⊃ the ‘implication’ sign and referred to modus ponens as the ‘rule of implication’. In this he seems to have been further removed from the modern conception of the turnstile than Frege, who we saw distinguished the assertion of A ⊃ B, which he said is not an accurate translation of conditional expressions, from its judgement.
6. Abstraction
Some writers, including Tarski himself, have disputed the attribution of the Deduction
Theorem to Herbrand, indicating that Alfred Tarski had known and used the result years ear-
lier. What is certain is that Axiom 8 in Tarski 1930 says that ‘if . . . z ∈ Cn(X + y), then
c(y, z) ∈ Cn(X)’. In Tarski’s work, Cn is an operator that, applied to a set of sentences,
generates the set of those sentences’ (syntactic) consequences, i.e., everything derivable from
them according to inference rules; c is an operator that, applied to an ordered pair of sen-
tences, returns another sentence. Axiom 8 and its converse, Axiom 7, are understood by
Tarski as a definition of c. Clearly, to say of the sentence denoted by z that it is a mem-
ber of Cn(X) is just to say that X ⊢ z. Therefore, Axioms 7 and 8 together express the identity Γ, A ⊢ B if, and only if, Γ ⊢ c(A, B). As Church (1947) observed, Tarski seems at
least to have established this version of the Deduction Theorem, ‘independently and nearly
simultaneously’ with Herbrand.
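Read as a constraint rather than a theorem, the condition expressed by Axioms 7 and 8 lends itself to a brute-force check over any finite stock of sentences. The sketch below is mine, not Tarski’s formalism; the names `cn`, `c`, and `is_implicative` are illustrative only.

```python
from itertools import chain, combinations

# The condition jointly imposed by Axioms 7 and 8 on a consequence operator Cn
# and a binary operation c on sentences:
#     z in Cn(X + {y})   if and only if   c(y, z) in Cn(X).

def subsets(sentences):
    s = list(sentences)
    return (frozenset(combo) for combo in
            chain.from_iterable(combinations(s, n) for n in range(len(s) + 1)))

def is_implicative(sentences, cn, c):
    """Check the condition over every subset X of a finite stock of sentences.

    cn maps a frozenset of sentences to the set of their consequences;
    c maps an ordered pair of sentences to a sentence.
    """
    return all((z in cn(X | {y})) == (c(y, z) in cn(X))
               for X in subsets(sentences)
               for y in sentences
               for z in sentences)
```

Nothing in the check assumes that `c(y, z)` is built from a connective of the system under study, which is exactly the point emphasized below.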
Did Tarski in fact have a proof of the Deduction Theorem as early as 1921? I see no rea-
son to doubt his (1956, p. 32) report that he had, although his first discussion of establishing
it for a specific deductive system is for theorem 2 of Tarski 1933. But the question deflects at-
tention to a far more important achievement of Tarski’s, an advance beyond anything found in
Herbrand’s work. For, as Church observed in the continuation of his comparison of Herbrand
and Tarski, the identity in Tarski 1930 appears as ‘a general methodological postulate’. In
Tarski’s hands, what for Herbrand was a demonstrable feature of a specific deductive system
is reworked as a condition on all deductive systems. By subsuming in this way the ‘theorem’
into the axiomatic framework, Tarski was able to investigate deductive systems as a class of
structures. This abstract perspective ushered in several novel considerations.10
¹⁰ The general investigation of deductive systems as a class of structures originated in the work of Paul Hertz (1929). In important ways, Hertz’s framework was more purely logical than Tarski’s because it did not rely on set theory. Unlike Tarski, Hertz did not consider a language with a sentential connective like ⊃ that would allow for an expression of the Deduction Theorem. Franks 2010 and Franks 2014 present Hertz’s contributions here, especially as precursors of Gentzen’s analysis of logical consequence.
Most obviously, Tarski was able to distinguish deductive systems for which Axioms 7
and 8 hold (which we may call implicative) from those for which they fail. This leads natu-
rally to an interest in proving results that hold for all deductive systems as well as results that
hold only for implicative systems. More profoundly, it uncovers hidden relationships among
distinct deductive systems. For example, familiar axiomatizations of the classical proposi-
tional calculus and of the intuitionistic propositional calculus each fall into this class—in each
framework the conditional symbol meets the conditions for the operator c—so one can de-
rive axiomatically facts that pertain to both of these logical frameworks. But observe that, as
emphasized in Łukasiewicz and Tarski 1930, c does not denote a connective in any particular
deductive system (although as we just observed, a sentential connective could satisfy the con-
ditions of c in some cases).11 Therefore, an implicative system needn’t have a connective that
meets the conditions of axioms 7 and 8: there need only be some operation, denoted in the
general axiomatic theory of deductive systems, mapping pairs of sentences to sentences in a
way that satisfies these axioms. Such a mapping might not pick out any syntax recognizable
as expressing conditionalization or implication, or it might not be one-to-one.
This perspective, where a theorem is inverted into an axiom, might seem at first glance to
bring us back to Frege’s position. Like Tarski, Frege did not bother to prove (2) for the system
in Begriffsschrift. He simply announced it. Could he have been defining the conditional?
In fact, Tarski’s view is better understood as opposing Frege’s. Frege defined the condi-
tional twice. He defined it indirectly in terms of the classical truth conditions of the expression
A ⊃ B together with the understanding that although such an inscription does not express a
conditional, such could only be judged true based on one. He also defined it directly by
specifying a canonical way to judge such expressions’ truth, in terms of formal derivability:
Conditionals are just those statements of the form A ⊃ B that can be proved from the axioms
of Begriffsschrift and modus ponens. Why Frege felt confident that these two definitions—one
semantic and the other syntactic—would be extensionally equivalent is a matter of specula-
tion. But it is certain that Frege did not have in addition to these a third definition of the
conditional in terms of deducibility from assumptions. That is what makes the question so
urgent to Frege’s modern readers: How can he just announce that conditionals are always true
when their consequents are derivable from their antecedents? Doesn’t this need proof?
Tarski, by contrast, by axiomatizing the conditional operator, has things the other way
around: The conditional is defined by Axioms 7 and 8. Individual deductive systems (and
other structures) might fail to be implicative by not satisfying those constraints. From such a ‘failure of the Deduction Theorem’, the thing to conclude is, not that these systems’ ‘conditionals’ fail to properly reflect the consequence operation, but that such systems cannot express conditionals.

¹¹ Cf. p. 39. Łukasiewicz and Tarski distinguished A ⊃ B, a sentence in the sentential calculus, from c(x, y), a name of that sentence ‘in the metasentential calculus’. In light of Quine’s analysis, rehearsed in §2, it is disorienting to find this use/mention distinction emphasized in a logical monograph together with the statement that A ⊃ B ‘expresses the implication between “A” and “B”’, but those are the words of Łukasiewicz and Tarski. One could also press the point further and stress that the operator c makes perfect sense even when it does not ‘name’ any sentence in the system one is studying.
7. Natural deduction
The next stage in the Deduction Theorem’s evolution is to my mind the most radical,
even if again it appears in our retrospective view to have been preconditioned by the work that
preceded it. This is the collapse, in the hands of Gentzen and Jaśkowski12 , of Łukasiewicz and
Tarski’s distinction between the sentential and the meta-sentential calculi of formal logic.
Consider, once again, a natural understanding of the identity (2) in the context of the
system in Frege’s Begriffsschrift: It tells us that proofs of statements of the form A ⊃ B are
available whenever there is a A-derivation of B. As indicated in §4, a constructive proof of this
theorem, such as the one Herbrand provided, even indicates how to build such a proof, using as
raw material the A-derivation of B. There might be occasions when building such proofs is of interest, but more often knowing that they exist suffices. For those purposes, it might make sense to supplement the system of Begriffsschrift with a ‘virtual rule of inference’, leading from the ‘premise’ Γ, A ⊢_B B to the ‘conclusion’ Γ ⊢_B A ⊃ B. There is a textbook tradition of
proceeding in this way: first introducing a sentential calculus and notion of formal proof, then
verifying the Deduction Theorem, and then supplementing the sentential calculus with a meta-
sentential or virtual inference rule.13 The resulting hybrid system B+ is far more efficient and
student-friendly than the original system B. Some care is needed, though, because in B+ one
needs to allow applications of modus ponens not only to theorems but also to assumptions and
sentences derived from assumptions—an allowance that cannot be extended to a ‘substitution
rule’. In B+, modus ponens also is ‘meta-sentential’. And importantly, what the Deduction
Theorem means in this context is that anything provable in B+ is in fact provable in B—the
virtual rule is in principle ‘eliminable’. Notice that even when working in B+, the turnstile
refers back to the ‘object level’, to provability in B.
What might occur to you—although it seldom occurs to students—is that by instead
retaining the ‘virtual rule’ allowing the inference of Γ ⊢ A ⊃ B from Γ, A ⊢ B, one could eliminate other features of B+, specifically the two axioms (1 and 2) used in the proof
of the Deduction Theorem. Call the resulting system G. Shortly we will verify that axioms 1
and 2 of the Begriffsschrift are in fact provable in G, and therefore that G, B+, and B are all
equivalent. But when working in G the turnstile refers, no longer to provability in B or any other object-level sentential calculus, but to provability in G itself. The turnstile has become self-referential!

¹² See Jaśkowski 1934.

¹³ This was Quine’s approach in several monographs, which inspired many later texts. Church’s (1947) review of one of them describes the move as follows: ‘By this device the development is greatly simplified . . . , though at the cost of allowing a primitive rule of different and much more complex character than is necessary’.
The way of thinking leading from Frege and Russell through Herbrand to Quine and the
standard textbook tradition of formal logic in the 20th Century obscures the possibility of this
move. In that tradition, the conditional (or implication) is defined by axioms of a sentential
calculus, and the Deduction Theorem is a meta-theoretic fact about how derivability relates to
such statements’ validity (or truth). Derivability cannot refer to reasoning carried out at the
meta-level where the Deduction Theorem lives. But if one follows Tarski (and before him,
Bolzano) and understands the identity of which (1) and (2) are instances as a definition of the
conditional, this whole framework is inverted. The Deduction Theorem is merely a verification
that one of a specific formal system’s sentential connectives operates as a conditional. What
is important is the conditional operator itself, though, which one might as well build directly
into the notion of formal derivability.
This perspective culminated in Gerhard Gentzen’s (1934–35) Untersuchungen über das
logische Schließen. There, Gentzen designed ‘natural deduction’ proof systems for proposi-
tional and quantificational logic in which not only the conditional14 but in fact all the logical
operators are characterized by pairs of inference rules, an ‘elimination rule’ characterizing
how to reason from a statement governed by that operator and an ‘introduction rule’ charac-
terizing how to reason to such a statement. For example, ¬ is defined by the elimination rule licensing the inference from the premises ¬A and A to ⊥, and by the introduction rule licensing the inference to ¬A from a derivation of ⊥ from the assumption A (an assumption that the rule discharges); and ∧ is defined by the elimination rule allowing the inference from A ∧ B to each of A and B, and by the introduction rule allowing the inference from the premises A and B to A ∧ B.
To this day it is routinely said that the value of natural deduction lies principally in its
efficiency and the intuitive manner in which its proofs can be constructed. Gentzen himself
made occasional remarks along these lines. For example in the synopsis of the Untersuchun-
gen he wrote, ‘I intended first to set up a formal system which comes as close as possible to
actual reasoning’. But two points are worth stressing. First, at best a formal system could
capture the flow of reasoning required to follow the written record of a mathematical discov-
ery. (Gentzen himself claimed that his introduction and elimination rules are the result of an
empirical discovery of such written records.) This should not be mistaken for the claim that
these same rules are constitutive of the original context of mathematical discovery. Second,
even if Gentzen is to be believed that his original intention was to emulate actual mathemat-
ical reasoning, and that it only ‘turned out’ afterwards that the systems designed to this end
had ‘certain special properties’, the value of natural deduction to logical theory lies in those
properties.
¹⁴ Gentzen actually perpetuated the practice of referring to ⊃ as the implication operator.
The property of natural deduction that I want to focus on here15 is the way that the In-
tro/Elim scheme captures the meaning of a logical particle. This is perhaps most vivid when
the rules are rewritten in turnstile notation. For example, the rules for ∧ can be written thus:
    A ∧ B ⊢ A and A ∧ B ⊢ B
    For all C, if C ⊢ A and C ⊢ B, then C ⊢ A ∧ B

The first condition tells us that A ∧ B is an object from which one can infer each of A and B, and the second that any object from which both A and B can be inferred is at least as strong as A ∧ B. The rules for ¬ can be written:

    ¬A, A ⊢ ⊥
    For all C, if C, A ⊢ ⊥, then C ⊢ ¬A
Again the first condition tells us that ¬A is an object that, together with A allows the inference
to ⊥ (absurdity), and the second condition tells us that this is the full account of ¬A, that any
other object that pairs with A to allow an inference to ⊥ is at least as strong as ¬A in the sense
that from it one can infer ¬A directly.
This same scheme presents a novel route to the Deduction Theorem, as implicitly con-
tained in the rule modus ponens. For if one understands modus ponens not just as one among
many patterns of valid inference but as saying that it is definitive of the conditional A ⊃ B that
it licenses the inference from A to B, then part of what one means is that any sentence that can
pair up with A to allow the inference to B expresses something equivalent to or stronger than
A ⊃ B. To cast this into the setting of natural deduction, one has to first understand modus
ponens as applying not just to Fregean judgements, but even to arbitrary assumptions. Then it
is just ⊃-Elim, rewritten in the turnstile notation as
    A ⊃ B, A ⊢ B

Understood as a definition of ⊃, this determines that ⊃-Intro must be the following form of the Deduction Theorem, stating that A ⊃ B can be inferred from any other formula that fills this role:

    For all C, if C, A ⊢ B, then C ⊢ A ⊃ B.     (3)
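By (1), the two conditions also hold for the classical consequence relation of §2, and there they can be spot-checked mechanically. The self-contained Python sketch below (encodings and names are illustrative choices of mine) verifies the Elim condition and, for a handful of sample formulas C, the Intro condition.

```python
from itertools import product

# Spot check: A ⊃ B together with A entails B, and any sampled C that together
# with A entails B itself entails A ⊃ B.

def ev(f, i):
    if f[0] == 'atom':
        return i[f[1]]
    if f[0] == 'not':
        return not ev(f[1], i)
    if f[0] == 'and':
        return ev(f[1], i) and ev(f[2], i)
    if f[0] == 'imp':
        return (not ev(f[1], i)) or ev(f[2], i)
    raise ValueError(f[0])

def names(f):
    return {f[1]} if f[0] == 'atom' else set().union(*map(names, f[1:]))

def entails(gamma, b):
    ns = sorted(set().union(names(b), *map(names, gamma)))
    return all(ev(b, dict(zip(ns, vs)))
               for vs in product([True, False], repeat=len(ns))
               if all(ev(g, dict(zip(ns, vs))) for g in gamma))

A, B = ('atom', 'A'), ('atom', 'B')
cond = ('imp', A, B)
assert entails([cond, A], B)                          # the Elim condition
for C in [cond, ('and', cond, A), ('not', A), B, ('and', ('not', A), B)]:
    if entails([C, A], B):                            # the Intro condition, spot-checked
        assert entails([C], cond)
```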
In the language of category theory, this understanding of the way Intro/Elim rules define
a logical operator is called a ‘universal mapping property’. Thus the ‘categorial product’ of
two objects A, B is another object that ‘points’ to them both, and not only that, but it is the
quintessential such object in the sense that any other object that points to both A and B must
also point to their product. Dually, the categorial coproduct of A and B is the quintessential
thing they both point to, quintessential in the sense that anything else pointed to by both A and
B is pointed to by this coproduct.16
One might say that the fundamental idea of natural deduction is that propositions can
be viewed as the objects of a mathematical category, where the partial order on propositions
given by the deducibility relation supplies the category’s arrows. Of course, this understanding can
only be applied to Gentzen, who wrote in the 1930’s, with the benefit of hindsight. But in
essence his calculi of natural deduction are a presentation of logical operators defined in terms
of universal mapping properties on this category of propositions, related by deducibility. For
his own part, Gentzen wrote: ‘The introductions represent . . . the “definitions” of the symbols
concerned, and the eliminations are no more, in the final analysis, than the consequences of
these definitions. . . . By making these ideas more precise it should be possible to display the
Elim-rules as unique functions of their corresponding Intro-rules’ (§II 5.13). The concept
of a universal mapping property is the precisification Gentzen foresaw: Elimination rules
are not consequences of their corresponding introduction rules, but they do follow from the
conception of those introduction rules as definitions. Conjunction is the categorial product;
disjunction, the categorial coproduct. What conditionalization is will be described in §9.17
When Gentzen announced that his natural deduction calculi ‘turned out to have certain
special properties’, he specified one in particular. In the synopsis he wrote that ‘the law of excluded middle, which the intuitionists reject, occupies a special position’. In paragraph 5.3 of §II, he observed that classical logic could be formulated with an additional rule schema for double negation, licensing the passage from ¬¬A to A, ‘in place of the basic formula schema’ for the law of excluded middle, and then specified the special position that this schema occupies: ‘Such a schema still falls outside the [introduction/elimination] framework . . . , because it represents a new elimination of the negation whose admissibility does not follow at all from our method of introducing the ¬-symbol by the ¬-Intro [rule]’. In contemporary terms, Gentzen’s point is that the law of excluded middle and the double negation rule are in no way necessitated by the meaning of the negation operator, when one thinks of its meaning as given by the universal mapping property underlying the ¬-Intro/¬-Elim rules.

¹⁶ The full elaboration of this understanding results in what is known as a Heyting category, which is a presentation of the variety of semi-lattices known as Heyting algebras in the language of category theory. We do not present that full elaboration here. Instead, in §9, we consider a refinement of this construction that results from using as the fundamental notion, instead of the modal concept of derivability (‘what can be inferred’), particular derivations. This allows the articulation of an aspect of the Deduction Theorem not visible in the Heyting category.

¹⁷ A sizeable literature has developed around the question of the proper relationship between introduction and elimination rules. The reader will have noticed that in the famous passage just presented, Gentzen described the introduction rules as definitive and the elimination rules as determined by them, whereas in the presentation of ∧ and ⊃ given here, things are the other way around: for example the Deduction Theorem, conceived of as an introduction rule for ⊃, is shown to be determined by the conception of modus ponens (i.e., ⊃-Elim) as definitional. The point stressed here is that the question as to the priority of the elimination or introduction rules is a distraction. Neither rule ‘follows’ from the other, although the conception of one of them as, not merely an incidental property of a connective, but as definitive of that connective determines what the other must be. In the case of disjunction, the natural presentation begins, as Gentzen described it, with the introduction rule and ends with the elimination rule (see the definition of coproduct in §9). This is the sense in which hindsight sheds light on Gentzen’s words. His claim that one rule is a consequence of another has proven to be hard to evaluate in the terms that he used but makes perfect sense in terms of universal mapping properties, and in these terms the question of priority of one rule over the other falls away.
To see how unappreciated Gentzen’s theoretical advance is, one need only survey text-
book presentations of natural deduction. In the very popular Language, Proof, and Logic
by Barwise, Etchemendy, and Barker-Plummer, for example, the inference figures licensing the passage from ¬¬A to A, from ¬A and A to ⊥, and from ⊥ to A are called ‘¬-Elim’, ‘⊥-Intro’, and ‘⊥-Elim’. The same nomenclature
appears even in Richard Kaye’s lovely The Mathematics of Logic. In neither book is there any
mention that the first of these combines with no other rule to specify a universal mapping prop-
erty or that the second is actually the elimination rule for ¬, defining that operator together
with its introduction rule and having no particular relationship at all to the third with which it
is paired. Yet Gentzen’s stated reason for publishing his system of natural deduction, even if
it wasn’t his original motivation for devising it, was to point out that the double negation and
ex falso quodlibet rules ‘fall outside the Intro/Elim framework’ that he devised to define the
various connectives.18
It is this feature of natural deduction that leads to a direct justification of the Deduc-
tion Theorem (formulated as (3)), not as a verifiable fact about conditionals in any particular
sentential calculus, but as the other side of the coin of the modus ponens rule. When condition-
alization is understood as an operation that maps A and B to the proposition that together with
A licenses an inference to B, modus ponens captures the part about licensing this inference,
and (3) captures the part about being the thing that so licenses it, so that any other proposition
that allows an inference from A to B only does so via A ⊃ B. And as expected, the verification
of the Deduction Theorem in the form (2), based on a definition of the conditional in terms of
Frege’s basic laws 1 and 2, can be inverted: By defining the conditional as a universal mapping
property (specified by Gentzen’s introduction and elimination rules), the Deduction Theorem in the form (3) and modus ponens can be used to verify Frege’s first two basic laws. For the first: since A, B ⊢ A, two applications of (3), discharging B and then A, yield ⊢ A ⊃ (B ⊃ A). For the second: three applications of modus ponens give C ⊃ (B ⊃ A), C ⊃ B, C ⊢ A, and three applications of (3) then yield ⊢ (C ⊃ (B ⊃ A)) ⊃ ((C ⊃ B) ⊃ (C ⊃ A)).

¹⁸ This tendency to overlook the conceptual advance of natural deduction might originate with Quine 1950, where Quine advanced what he described as a system that ‘enhances’ the advantages of Gentzen’s system over axiomatic frameworks. Quine’s system suppresses all propositional inference except for one rule corresponding to the Deduction Theorem, citing the decidability of classical propositional logic as obviating such features. He even said there that natural deduction generally ‘lacks certain traits of elegance which grace’ axiomatic systems (p. 93).
8. Logical structure
The history recounted so far twice touches on the concept of admissibility, a focus of
contemporary logic to which the Deduction Theorem is intimately connected. We shall see
that the Deduction Theorem played a crucial role in the identification of admissibility as a
phenomenon distinct from the more general notion of deductive validity, and it returned as the
key to characterizing logical structures in light of the questions that admissibility poses.
Intuitively, a rule of inference is called ‘admissible’ if it leads reliably from one logical
truth to another. More precisely, an admissible inference rule is one whose conclusion has
substitution instances that are logical theorems whenever the corresponding substitution in-
stances of its premises are. In algebraic terminology, admissible rules are operations under
which a logic’s set of theorems is closed.
In §3 we observed that Frege’s conception of logical inference seems to correspond with
this notion, for he called ‘inferring’ the practice of reasoning from ‘already established truths’
to ‘other truths’. The more general activity of reasoning from hypotheses or arbitrary assump-
tions did not qualify as inference in Frege’s eyes. He called it ‘mere derivation’. We observed
further that Frege’s distinction between genuine inference and mere derivation facilitated one
of the earliest articulations of the Deduction Theorem, in his (unproved) claim that the logical
validity of a conditional expression ( B A ⊃ B) follows from the derivability in the system
of Begriffsschrift of its consequent from its antecedent (A B B).
Frege’s distinction survives in contemporary discussions as the distinction between ad-
missibility and derivability. A logical system’s derivable rules are those from which one can
reliably reason, not just from theorems but from arbitrary formulas. The sense in which a
logical system’s admissible rules are reliable is straightforward: The result of applying them
to theorems is another theorem. The sense in which derivable rules are reliable is harder to
express and sets up the general problem of defining deductive consequence: One clearly wants
more from an inference rule than that it lead to another arbitrary formula.
In classical propositional logic this something more can be specified in terms of the con-
cept of an interpretation. There a rule is derivable if, whenever its premises are true in an
interpretation (under an assignment of truth values to atomic formulas), its conclusion is true
in that same interpretation. Similar specifications are available in richer settings (quantifica-
tion theory, intuitionistic logic) according to obvious reformulations in terms of set-models
or forcing relations on frames. An alternative, proof-theoretical, specification is available,
though, simply in terms of a designated set of primitive rules (for example, the Intro/Elim
rules of natural deduction). Then an inference rule is said to be derivable if a finite sequence
of applications of primitive rules leads from its premises to its conclusion.
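To make the model-theoretic specification concrete, here is a minimal sketch in Haskell of the truth-table test for derivability in classical propositional logic. The names (Formula, eval, derivable, and so on) are illustrative only and are not drawn from any of the systems discussed in the text.

import Data.List (nub)

-- Propositional formulas over named atoms.
data Formula = Atom String
             | Neg Formula
             | Formula :/\: Formula
             | Formula :\/: Formula
             | Formula :=>: Formula
  deriving (Eq, Show)

-- The atoms occurring in a formula.
atoms :: Formula -> [String]
atoms (Atom p)   = [p]
atoms (Neg a)    = atoms a
atoms (a :/\: b) = nub (atoms a ++ atoms b)
atoms (a :\/: b) = nub (atoms a ++ atoms b)
atoms (a :=>: b) = nub (atoms a ++ atoms b)

-- Truth value of a formula on an interpretation of its atoms.
eval :: [(String, Bool)] -> Formula -> Bool
eval v (Atom p)   = maybe False id (lookup p v)
eval v (Neg a)    = not (eval v a)
eval v (a :/\: b) = eval v a && eval v b
eval v (a :\/: b) = eval v a || eval v b
eval v (a :=>: b) = not (eval v a) || eval v b

-- All interpretations of a given list of atoms.
assignments :: [String] -> [[(String, Bool)]]
assignments = mapM (\p -> [(p, False), (p, True)])

-- A rule with premises gamma and conclusion a is derivable (in the
-- model-theoretic sense) when every interpretation making all premises
-- true also makes the conclusion true.
derivable :: [Formula] -> Formula -> Bool
derivable gamma a =
  all (\v -> not (all (eval v) gamma) || eval v a)
      (assignments (nub (concatMap atoms (a : gamma))))

-- Validity is derivability from no premises.
valid :: Formula -> Bool
valid = derivable []

main :: IO ()
main = do
  let p = Atom "p"; q = Atom "q"
  print (derivable [p :=>: q, p] q)  -- True: modus ponens is derivable
  print (derivable [p :\/: q] p)     -- False: this rule is not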
With this in mind, it should be clear that in any framework whatsoever, a derivable rule
will also be admissible. Consider again the model-theoretic specification of classical propo-
sitional logic: If a rule preserves truth in any interpretation (i.e., if it is derivable), then it
certainly preserves the property of being true in every interpretation (i.e., it is admissible).
Equally trivial is the verification in proof-theoretic terms: If a rule’s conclusion can be de-
rived primitively from its premises (i.e., it is derivable), and if, further, each of its premises is
itself primitively derivable with no assumptions at all, then so too must be its conclusion
(i.e., it is admissible).
Frege appears to have assumed the converse, that any rules available for ‘inference’
among ‘judgements’ (available, that is, to reliably establish theorem-hood on the basis of
previously established theorems) are also available to form derivations among arbitrary for-
mulas. This assumption paid off, as it led to Frege’s articulation of the Deduction Theorem
and shortcut method of verifying the validity of conditional judgements. Its status as an as-
sumption, though, is ambiguous. On the one hand, Frege was correct that the admissible rules
of Begriffsschrift are also derivable. On the other hand, it is not generally true that admissi-
bility implies derivability—the fact that it does in the propositional fragment of the system in
Begriffsschrift was Frege’s good fortune.
Logical systems whose admissible rules are all derivable are called ‘structurally com-
plete’. It is commonly said that classical propositional logic is structurally complete, and
this property appears to underlie some of Frege’s maneuvering. Not all logical systems are
structurally complete, though. Most notoriously, the traditional presentations of intuitionis-
tic propositional logic have admissible rules that are not derivable. The most well-known
example is Harrop's rule19:
¬A ⊃ (B ∨ C)
(¬A ⊃ B) ∨ (¬A ⊃ C)
This rule is often cited as the earliest discovered example of an admissible but underivable
rule, although Kreisel and Putnam, in their 1957 study of it, point out that in 1953 G. F. Rose
19
First presented in Harrop 1956.
had made the same observation about the rule:
(¬¬A ⊃ A) ⊃ (¬¬A ∨ ¬A)
¬¬A ∨ ¬A          (4)
More recently, Tim van der Molen (2016) observed that Ingebrigt Johansson had noted
the admissibility and non-derivability of the rules ⊥ / A (ex falso quodlibet) and A ∨ B, ¬B / A
(disjunctive syllogism) in minimal propositional logic already in 1935–36 correspondence
with Arend Heyting. The difference between minimal and intuitionistic logic, in other words,
is that the latter has strictly more derivable rules. They agree about admissibility.
One need not look to studies about non-classical logics to witness the phenomenon of
structural incompleteness, though. We encountered it already in our review of Herbrand’s
pioneering work on the Deduction Theorem. In §5 we observed that one of Herbrand’s princi-
pal applications of the Deduction Theorem is his demonstration that the inference rule modus
ponens can be systematically eliminated from what Herbrand called ‘purely logical’ deriva-
tions. He contrasted purely logical derivations with derivations from arithmetical or other
mathematical axioms, in which modus ponens can be essential. (Although the conceptual re-
lationship between modus ponens and the Deduction Theorem—first exposed by Gentzen in
terms of the Intro/Elim rules and later clarified in the categorial terms of universal mapping
properties—was not explicit in Herbrand’s thought, it operates behind the scenes of his proof.)
Herbrand’s observation is readily recast in terms of admissibility: If one removes the rule
modus ponens from Herbrand’s formulation of classical logic, the result is a system in which
modus ponens is admissible but not derivable. (This same observation is more commonly
made in terms of the sequent calculus, by saying that in certain cut-free calculi, the cut rule is
admissible but not derivable.)
A similar encounter with structural incompleteness occurred in our review of the tran-
sition from Frege-style presentations of logic to natural deduction. Recall the hybrid system
B+ that we encountered along the way: It has all the axioms and inference rules of Begriffs-
schrift together with the ‘virtual’ rule corresponding to the Deduction Theorem. Because the
Deduction Theorem holds for B, B+ will have no more theorems than B has. But one must
be careful, we noted, when working in B+ to apply the substitution rule only to logical axioms
or to formulas derived purely from them, because although substitution instances of theorems
are again theorems, unrestricted use of substitution leads quickly to inconsistency (from A ∧ B
infer A ∧ ¬A, or, even more directly, from A infer B ∧ ¬B). Substitution is admissible but not
derivable.
These early and rudimentary encounters with structural incompleteness highlight a gen-
eral feature of the theory of admissibility. Tarski (e.g., in 1936) defined a logical theory as a
set of theorems closed under a consequence relation. But on this Tarskian conception, there is
no fact of the matter about whether a logical theory is structurally complete or not. Structural
completeness, in other words, is not a property of Tarskian theories but of the consequence
relations that generate them. Because distinct consequence relations can agree about logical
theorem-hood, knowledge of the set of theorems leaves undetermined further facts about the
consequence relation. In particular, although it does fix the set of admissible inference rules,
that information does not fix the set of derivable rules.20
To illustrate the Deduction Theorem’s bearing on the natural questions raised by these
phenomena, it is helpful to present the notation of the abstract theory of consequence relations.
Following Hertz 1929, Gentzen 1932, and Tarski 1936,21 one finds the concept of a
(single-conclusion) finitary consequence relation ⊢ defined as a relation between finite subsets of a set S of formulas and formulas of S that satisfies, for all finite Γ, ∆ ⊆ S and all formulas A, B ∈ S:
1. A ⊢ A (reflexivity)
2. if Γ ⊢ A then B, Γ ⊢ A (monotonicity)
3. if Γ ⊢ A and A, ∆ ⊢ B then Γ, ∆ ⊢ B (transitivity)
• A rule Γ / A is derivable in ⊢ if Γ ⊢ A.
• A rule r = Γ / A is admissible in ⊢ if Thm(⊢) = Thm(⊢r), where Thm(⊢) is the set of formulas A such that ⊢ A and ⊢r is the smallest consequence relation extending ⊢ in which r is derivable. The notation Γ |∼ A is used to express the fact that Γ / A is admissible in ⊢.
20
This is not to say that the set of admissible inference rules can be effectively determined from information
about the set of theorems. Chagrov (1992) identified logical systems whose set of derivable rules, and hence
theorems, is decidable, although the set of admissible rules is undecidable. And generally, the question of the
admissibility of inference rules is computationally more complex than the corresponding question of derivability.
21
See Franks 2018 for an account of the history of this notion and of its relationship to a competing definition
of logical consequence in terms of substitution.
It is easy to verify that if ⊢ is a consequence relation, then |∼ is also a consequence relation.22 Of course these two relations have the same theorems. More generally, if ⊢1 and ⊢2 are
consequence relations such that Thm(⊢1) = Thm(⊢2), then |∼1 = |∼2.
This observation has led some logicians to remark that, because they depend only on the
set of theorems, a logic’s true characterization is in terms of its admissible rules. The derivable
rules, by contrast, depend on a ‘design choice’, i.e., on which proof system or semantic frame-
work one chooses to generate its theorems. See especially Rybakov 1997. Similarly Iemhoff
2016 remarked that whether or not a logic is structurally complete ‘depends very much on the
particular consequence relation one uses for a logic', whereas 'admissibility solely depends
on the [logic's] theorems', a set that is invariant across all such consequence relations.
As was mentioned above, one ordinarily says that classical propositional logic (CPC) is
structurally complete, whereas intuitionistic propositional logic (IPC) is not. Indeed, IPC is
the standard reference for the structural incompleteness phenomenon. Can this sentiment be
maintained in light of the fact that if ⊢IPC is any consequence relation generating the theorems
of IPC, then |∼IPC is a structurally complete relation with the exact same theorems? A tempting, but ultimately unsuccessful, response is to try to pin the distinction between CPC and
IPC to the mere existence of structurally incomplete relations generating the theorems of the
latter. Such a response fails because, as Rosalie Iemhoff pointed out, the smallest consequence
relation ⊢RI that generates the theorems of CPC, given by
Γ ⊢RI A if, and only if, either A ∈ Γ or A ∈ Thm(⊢CPC),
is structurally incomplete.23
In sum, both CPC and IPC have structurally complete and structurally incomplete pre-
sentations, in the sense that the theorems of each are generated by both sorts of consequence
relations. What can be said in favor of the compelling idea that structural incompleteness is a
hallmark of intuitionistic logic? The Deduction Theorem provides an answer.
A consequence relation ⊢ such that Thm(⊢) = Thm(⊢IPC) can evidently be structurally complete only by violating the Deduction Theorem: In any such relation, (¬¬A ⊃ A) ⊃ (¬¬A ∨ ¬A) ⊢ ¬¬A ∨ ¬A, corresponding to the admissibility of Rose's Rule (4), so that
22
See, for example, Iemhoff 2016, corollary 4.3.
23
See Iemhoff 2016 for verification that ⊢RI satisfies the conditions of a consequence relation and is the minimal
element of {⊢ : Thm(⊢) = Thm(⊢CPC)}.
the left to right direction of (2) would force ⊢ ((¬¬A ⊃ A) ⊃ (¬¬A ∨ ¬A)) ⊃ (¬¬A ∨ ¬A).
But ((¬¬A ⊃ A) ⊃ (¬¬A ∨ ¬A)) ⊃ (¬¬A ∨ ¬A) is certainly not a theorem of IPC.
Conversely, consider any consequence relation ⊢ with Thm(⊢) = Thm(⊢CPC) but
with a rule A / B for which A |∼ B but not A ⊢ B. Because A |∼ B, A |∼CPC B, and so
⊢CPC A ⊃ B. Therefore also ⊢ A ⊃ B. Applying the right to left direction of (2) leads to
A ⊢ B, contradicting our assumption about ⊢. For example, ¬¬A |∼RI A, but not ¬¬A ⊢RI A,
even though ⊢RI ¬¬A ⊃ A.
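The relation ⊢RI is easy to model computationally. A small follow-on sketch, reusing the Formula type and the valid test from the Haskell sketch above (again with illustrative names only):

-- Iemhoff's minimal consequence relation generating the classical theorems:
-- Γ ⊢RI A just in case A is one of the hypotheses or is classically valid.
entailsRI :: [Formula] -> Formula -> Bool
entailsRI gamma a = a `elem` gamma || valid a

demoRI :: IO ()
demoRI = do
  let p = Atom "p"
  print (entailsRI [Neg (Neg p)] p)          -- False: ¬¬A / A is not derivable in ⊢RI
  print (entailsRI [] (Neg (Neg p) :=>: p))  -- True:  ⊢RI ¬¬A ⊃ A
  -- The rule ¬¬A / A is nevertheless admissible in ⊢RI, since whenever ¬¬A is
  -- a classical theorem so is A, and admissibility depends only on the theorems.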
In 2016, Iemhoff wrote that some consequence relations that generate the theorems of
familiar logics like CPC and IPC are ‘far more natural than others’. The Deduction Theorem
clarifies this intuition. Among the connectives of CPC and IPC is ⊃. Following Gentzen
and Bolzano, the inferential meaning of this connective is given by the Deduction Theorem
and the (‘metasentential’ version of the) modus ponens rule. Therefore in any consequence
relation in which the Deduction Theorem fails, ⊃ will not have its intended meaning. If propo-
sitional logic is supposed to characterize the rules of reasoning about conjunction, negation,
conditionalization, and the like, then the Deduction Theorem is a constraint over and above
the definitive conditions of a consequence relation that must be satisfied in order for a relation
to adequately represent a propositional logic. When this constraint is met, all the admissible
rules of classical logic are derivable, whereas intuitionistic logic has underivable admissible
rules.
The Deduction Theorem led Herbrand to the earliest observation of the difference be-
tween admissibility and derivability: The system that results from removing modus ponens
from his formulation of classical logic has modus ponens still as an admissible, though not as
a derivable, rule. If all one were interested in were a presentation of the theorems of classical
logic, the lesson from this observation would be that there are two equally good presentations
to choose from—one being more efficient, the other less redundant. This is not all a logician
might be interested in, though. As Herbrand pointed out, modus ponens ought to be an explicit
rule so that one can apply logic also to non-logical axioms, such as the formulas that define
arithmetical operations or the principle of mathematical induction. Even more basically, we
observed that having the right set of theorems and satisfying the conditions of a consequence
relation might be enough for some purposes, but that it doesn’t suffice to adequately represent
propositional logics. For that purpose, the Deduction Theorem is an additional constraint,
one that picks out the structurally complete presentation of classical logic and a structurally
incomplete presentation of intuitionistic logic.
9. Adjunction
The Deduction Theorem makes precise the intuition that classical logic is structurally
complete whereas intuitionistic logic isn’t. But there is another sort of logical completeness
that applies to intuitionistic and classical logic alike, and again the Deduction Theorem under-
lies the phenomenon. As Kosta Došen wrote in 1996, the Deduction Theorem
. . . is a completeness theorem of some sort. It tells us that the system is strong
enough to mirror its extensions in the same language. . . . For the deductions we
shall be able to make in the extension we already have corresponding implica-
tions in the system. However, this completeness is not such that subsystems or
extensions of our system cannot have it. In classical logic we have the deduction
theorem, but in intuitionistic logic we have it, too. (p. 243–4)
Following Došen, we could call this ‘deductive completeness’ in contrast to ‘structural com-
pleteness’ and ‘semantic completeness’.24
The extensions of a logical system that Došen referred to are just systems supplemented
with non-logical axioms. So his concept of completeness is about a ‘correspondence’ between
derivations from hypotheses and ‘implications’, i.e., hypothesis-free proofs of conditionals.
Clearly the Deduction Theorem in the form (2), because it is a biconditional claim, describes
a parity between proofs of conditionals and derivations of those conditionals’ consequents
from their antecedents. But the correspondence Došen had in mind runs deeper than this,
as indicated by his expression of a system ‘mirroring’ its extensions. He also described the
extended systems being ‘contained in’ the original system, or there being ‘a reflection’ of each
of the former in the latter (p. 274).
This is a provocative reading of the Deduction Theorem, but one not at all expressed in
its usual formulation. Recall that in the standard proof of (2), one gets more than a verification
of a biconditional: one gets a recipe for building a proof of a conditional, using a derivation
of that conditional’s consequent from its antecedent as raw ingredient. For that reason, the
proof conveys a tremendous amount of information about how any hypothetical derivation
and its corresponding hypothesis-free proof are related—information that is lost in the state-
ment of the theorem. Even the verification of (1) in the theory of classical truth exposes a
close relationship between classical entailment and the validity of an associated conditional,
whereas the biconditional (1) says only that when you have one you will have the other as
well. It is natural to seek a fuller expression of the way in which the structure of derivation
from assumptions is contained already within the structure of a system’s theorems.
The vocabulary for expressing just this sort of sameness of structure can be found in
24
Došen used the term ‘deductive completeness’ to refer to a generalization and extension of the property de-
scribed here—one that applies across the full variety of Lambek’s ‘deductive systems’, of which logical systems
ordinarily construed are only a special type. Lambek himself used the term ‘functional completeness’ in 1974
and Lambek and Scott 1986.
category theory.
Definition A category C consists of a collection ob(C) of objects (A, B, . . .) and a collection
hom(C) of morphisms (f, g, . . .) between objects, denoted by (and conventionally also called)
arrows whose sources and targets are objects in ob(C), together with a binary operation ◦, called
composition of morphisms, that takes an ordered pair of arrows f : A −→ B and g : B −→ C,
the target of the first being the source of the second, and returns an arrow g ◦ f : A −→ C, and,
for every object C, a special25 identity morphism denoted 1C , such that:
• h ◦ (g ◦ f ) = (h ◦ g) ◦ f for any f : A −→ B, g : B −→ C, and h : C −→ D.
(Associativity Law)
• 1B ◦ f = f = f ◦ 1A , for any A, B and f : A −→ B. (Identity Law)
Often one wants to refer to the restriction of hom(C) just to arrows with source A and target
B. This collection is denoted homC (A, B).
The laws of category theory can also be expressed in terms of the ‘commutativity’ of the
following diagrams, meaning that any two paths from one object to another object in such a
diagram made by composition of arrows yield identical morphisms:
[Two commutative diagrams: on the left, arrows f : A −→ B, g : B −→ C, and h : C −→ D with the composites g ◦ f and h ◦ g, illustrating the Associativity Law; on the right, an arrow f : A −→ B with the identities 1A and 1B, illustrating the Identity Law.]
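For readers who know Haskell, the definition just given can be sketched compactly as a type class (essentially Control.Category from the base library), with types playing the role of objects and values of cat a b playing the role of arrows; the two laws are recorded only as comments, since Haskell does not check them:

import Prelude hiding (id, (.))

class Category cat where
  id  :: cat a a                        -- the identity morphism 1_A
  (.) :: cat b c -> cat a b -> cat a c  -- composition g ◦ f

-- Laws (not machine-checked):
--   h . (g . f) = (h . g) . f          -- Associativity Law
--   id . f = f = f . id                -- Identity Law

-- The category whose objects are Haskell types and whose arrows are functions:
instance Category (->) where
  id x  = x
  g . f = \x -> g (f x)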
A first example of a universal mapping property is the definition of a product of two objects A and B: an object A ⊗ B together with 'projection' arrows pA : A ⊗ B −→ A and pB : A ⊗ B −→ B such that, for any object C and arrows f : C −→ A and g : C −→ B, there is a unique arrow h : C −→ A ⊗ B with f = pA ◦ h and g = pB ◦ h.
Such universal mapping properties abound in category theory, characterizing objects in
terms of (1) their relation to other objects or arrows and (2) their being the uniquely extreme
objects so related. Of course it does not follow from the rules of category theory that there
will always be objects satisfying a given universal mapping property. For example, there may
be no product of A and B, either because no one object is the source both of an arrow whose
target is A and an arrow whose target is B, or because, among such objects, there is none to
which every other maps by a unique arrow in the required way.
always have products for all its objects.
Let us consider now the conception of a logical system L as a category L, as introduced
by Joachim Lambek in 1968, 1969, and 1972.26 In this scheme, ob(L) are the formulas of
L , and hom(L) are derivations. In particular homL (A, B) is the collection of all derivations
from A to B. One must identify certain formal derivations in L for this scheme to satisfy the
rules of categories.27 When one identifies all formal derivations from A to B with a single
arrow, L is just the partially ordered set of provability, and hom(L) is just the consequence
relation on L-formulas. But the flexibility of the categorial framework allows an investigation
of more nuanced relationships among proofs that are hidden by the consequence relation. As
Došen emphasized, ‘[in] categorial proof theory we are not concerned with a consequence
relation, but with a consequence graph, where more than one arrow, i.e., deduction, can join
the same pair of objects, i.e., propositions’ (2006, p. 643). To bring these more nuanced
relationships to the fore, one may identify, e.g., only formal derivations that resolve to the
same normal form under the familiar normalization procedures of natural deduction. Thus the
26
A particularly lucid introduction to this conception, from the motivation of its basic framework to open
questions about the coherence of the individuation criteria involved, is Harnik and Makkai 1992.
27
The references in this paragraph explain why.
arrows in homL (A, B) are abstract derivations from A to B: we distinguish two such arrows
only according to differences in how they associate with other arrows, though they all are of
the same ‘type’.28
Observe that in L, ◦ corresponds to the fact that a derivation of C from the assumption
B can be appended to a derivation of B from the assumption A, resulting in a derivation of
C from the assumption A. In Frege systems, appending is just concatenation of strings; in
natural deduction, all leaf nodes in the proof-tree of C that contain ‘open’ occurrences of B
are replaced with copies of the derivation of B from A. This leads to a reading of claims
like ‘for all arrows f : A −→ B there is a unique arrow zf : A −→ C for which the
C
zf
diagram i commutes’ as ‘any derivation from an assumption to B must actually
A f
B
be a derivation from that same assumption to C followed by some fixed derivation of B from
C, the same one that recurs in all such derivations of B’.
Returning now to the concept of the product of A and B as understood in L, we see that
such an object (if one exists) is more than just a formula from which it is possible to prove
both A and B and which can be proved from any other formula that suffices for derivations
of both A and B (as was stressed in §7)—such is the meaning of a product available when
one considers only the partially ordered set of provability. By generalizing to the category
of abstract proofs, this same construction means that the derivations of A and B, from any
formula C admitting such derivations, are themselves just canonical derivations of A and B
from A ⊗ B appended to derivations of A ⊗ B from C. This example illustrates why L is the
category of abstract proofs rather than concrete formal proofs: Only under the assumption that
the f and g in the product diagram are normal does it follow that they must proceed initially
to the product (the conjunction of A and B) and then via the projections (the elimination rules
for conjunction).
The central concepts from category theory that feature in the version of the Deduction
Theorem that we are approaching are structure-preserving maps between categories called
functors and a type of relation between functors, given by families of morphisms in their
target category, called natural transformations:
Definition A functor F from C to D (F : C −→ D) is an assignment of
• an object F(A) from ob(D) to each A in ob(C)
28
The question about how best to individuate proofs is usually traced back to a 1971 lecture of Dag Prawitz
(see especially §1). Došen 2003 is a contemporary survey of the options and obstacles in the way of a satisfactory
answer to this question for different logical systems.
• an arrow F(f ) : F(A) −→ F(B) from hom(D) to each f : A −→ B in hom(C)
that ‘preserves’ units and compositionality, i.e.:
1. for all A in ob(C), F(1A ) = 1F(A)
2. for all f : A −→ B and g : B −→ C in hom(C), either
(a) F(g ◦ f ) = F(g) ◦ F(f ) or
(b) F(g ◦ f ) = F(f ) ◦ F(g)
The preservation of compositionality of functors can be expressed in terms of commuta-
tive diagrams, by saying that whenever the diagram
[Diagram: a triangle with f : A −→ B, g : B −→ C, and h : A −→ C, where h = g ◦ f.]
commutes, so too must one of the following diagrams:
[Diagrams: (a) the corresponding triangle with F(f), F(g), and F(h), commuting as F(h) = F(g) ◦ F(f); (b) the triangle with the images of g and h reversed, commuting as F(h) = F(f) ◦ F(g).]
F is called a covariant functor in case (a) and a contravariant functor in case (b).
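The covariant case has a familiar rendering in Haskell, where a Functor instance is a functor from the category of types and functions to itself; the sketch below uses its own class name, with the preservation conditions recorded only as comments:

class MyFunctor f where
  fmap' :: (a -> b) -> (f a -> f b)   -- the action of the functor on arrows

-- Conditions (not machine-checked):
--   fmap' id = id                         -- preservation of identity morphisms
--   fmap' (g . f) = fmap' g . fmap' f     -- preservation of compositionality

-- Example: the Maybe construction is such a functor.
instance MyFunctor Maybe where
  fmap' _ Nothing  = Nothing
  fmap' g (Just x) = Just (g x)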
The idea of a functor is that by preserving identity morphisms and compositionality, the
functor will also preserve, in its target category, all the other structural features of its source.
Intuitively, one can think of a functor F : C −→ D as projecting an image of C into D. Suppose
now that there are two such functors, F : C −→ D and G : C −→ D. One could ask whether
the similarity between the images of C provided by F and G is such that one can only detect
it ‘externally’ by verifying the functoriality of F and G, or if in addition there are arrows in D
that track the similarity ‘internally’.
Definition This idea is made precise by the concept of a natural transformation between F
and G: a family τ of arrows τA : F(A) −→ G(A) in hom(D) (one for each A in ob(C)) such
that for each f : A −→ B in hom(C), the following diagram commutes:
[Diagram: the naturality square with τA : F(A) −→ G(A) and τB : F(B) −→ G(B) on the horizontal sides and F(f) : F(A) −→ F(B) and G(f) : G(A) −→ G(B) on the vertical sides, so that G(f) ◦ τA = τB ◦ F(f).]
A natural transformation is a family of morphisms that relate F and G by systematically
shifting the F-image of C into the G-image of C. (When, further, the inverses of each τA
together form a natural transformation between G and F, τ is called a natural isomorphism.)
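In the Haskell rendering, a natural transformation between two such functors is simply a polymorphic function, and the commutativity of the naturality square comes for free (it is the parametricity 'free theorem' for the type). A sketch, with illustrative names:

{-# LANGUAGE RankNTypes #-}
type Nat f g = forall a. f a -> g a

-- Example: the components safeHead_A : [A] -> Maybe A form a natural
-- transformation from the list functor to the Maybe functor.
safeHead :: Nat [] Maybe
safeHead []      = Nothing
safeHead (x : _) = Just x

-- Naturality, spelled out for any f :: a -> b:
--   fmap f . safeHead = safeHead . fmap f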
Now, one of the fundamental ideas of category theory is that objects, morphisms, etc.,
can be individuated in terms just of their ‘structural’ properties, i.e., the relations they stand in
with other objects, morphisms, etc., in their environment. This idea gives rise to the concept
of objects being ‘isomorphic’, i.e., maybe not literally identical, but the same in terms of
their structural properties. The definition of isomorphism that captures the idea of sharing all
structural properties is very simple:
Definition A morphism f : A −→ B is an isomorphism if there is a morphism g : B −→ A
inverse to f in the sense that g ◦ f = 1A and f ◦ g = 1B . Two objects A and B are said to be
isomorphic if there is an isomorphism between them.
One would like to generalize this definition in a way that extends the idea of structural
identity to categories. Two categories C and D could be called isomorphic if there is a functor
F : C −→ D with inverse G : D −→ C such that G ◦ F and F ◦ G are the identity functors 1C
and 1D , respectively. In this case we would have, for every A ∈ ob(C), G ◦ F(A) = A and
for every B ∈ ob(D), F ◦ G(B) = B. But this violates the spirit of the structural approach
to individuation, according to which G ◦ F(A) should be considered the same as A so long
as they are isomorphic, even if they are not literally identical. (In Robert Goldblatt’s quip,
for categories to be structurally the same, they need only be ‘isomorphic up to isomorphism’
(2006, p. 200).)
There are two closely related ways to make this idea precise.
First, one might require that there be natural isomorphisms τ : 1C −→ G ◦ F and σ :
1D −→ F ◦ G. This is what is called an equivalence of categories. Alternatively, one might
require only ‘one-way’ natural transformations from 1C to G ◦ F and from F ◦ G to 1D , so
that the isomorphism structure is spread out across the two categories. This second option
describes what is called an adjoint situation.
Definition An adjunction between two categories C and D is a pair of functors L : C −→ D
and R : D −→ C and a pair of natural transformations η : 1C −→ R ◦ L and ε : L ◦ R −→ 1D
satisfying the universal mapping properties:
(universality of η) Given C ∈ ob(C), D ∈ ob(D), and f ∈ homC (C, R(D)), there is a
unique g ∈ homD (L(C), D) making this diagram commute:
[Diagram: the triangle with ηC : C −→ R ◦ L(C), R(g) : R ◦ L(C) −→ R(D), and f : C −→ R(D), so that f = R(g) ◦ ηC .]
When an adjoint situation holds, this is written L ⊣ R, and L is called the 'left adjoint
functor’ of R. (Correspondingly, R is called the ‘right adjoint functor’ of L.) The natural
transformations η and ε are called the ‘unit’ and ‘counit’ of the adjunction. As suggested in
the motivating remarks above, although neither the unit nor the counit of an adjunction need
be natural isomorphisms (one has η : 1C −→ R ◦ L but typically no inverse transformation
R ◦ L −→ 1C ; ε : L ◦ R −→ 1D but no 1D −→ L ◦ R), together they give rise to an
isomorphism of another sort that captures a type of structural correspondence between C and
D.
Define a natural transformation θ componentwise for C ∈ ob(C), D ∈ ob(D), and
g ∈ homD (L(C), D) by θCD (g) = R(g) ◦ ηC and a dual natural transformation (given D, C,
and f : C −→ R(D)) τ by τDC (f ) = εD ◦ L(f ). By the universality of η and ε, θCD and
τDC are inverses of one another (for example, any f just is R(g) ◦ ηC for g = τDC (f )). Therefore θ is actually
a natural isomorphism with components θCD : homD (L(C), D) −→ homC (C, R(D)). The
unit η is a family of morphisms in C describing a relation between the functors R ◦ L and 1C ,
as the counit ε relates L ◦ R and 1D in D. θ operates one level removed from η and ε: It is a
family of morphisms in the category of sets, relating functors that map an arbitrary C ∈ ob(C)
to homC (C, R(D)) and to homD (L(C), D).
Such a relation establishes, not an equivalence between C and D, but a reflection of the
arrow structure of C in that of D. Because θ is an isomorphism, the adjunction guarantees an
exact correspondence between C-arrows and D-arrows: given C and D, any morphism from
C to R(D) is matched uniquely with a morphism from L(C) to D. Because θ is natural in C
and D, the adjunction further guarantees that all categorial structure is preserved as C and D
vary smoothly across C and D (cf. Goldblatt 2006, p. 439).
Returning to the presentation of a logical system as the category L, what would it mean
for the structure of derivation from an additional assumption to be captured already by deriva-
tions in L? L contains only arrows sourced at single formulas, so the first stage in phrasing the
condition is to construct a category whose objects correspond to multiple formulas and whose
arrows correspond to (abstract) derivations from multiple hypotheses. The structure of such a
category is what we want to recover within L.
The construction needed is the product category L × L. Objects in L × L are pairs ⟨A, B⟩
of objects from ob(L), and arrows in L × L are pairs ⟨f, g⟩ of arrows from hom(L). It is
important to recognize that, conceived of as a logical system, L × L is a seriously deficient
characterization of reasoning from multiple hypotheses. If a formula C follows from formulas
A and B taken jointly, though from neither assumption alone, perhaps no arrow in L × L
corresponds to this fact. Our interest in L × L is just in its ability to represent having and
reasoning from multiple hypotheses, not in whether its scheme includes all such derivations
we would deem valid.
To represent pairs of formulas as objects in L by specifying an adjunction we need a
functor F that maps arbitrary objects ⟨A, B⟩ in L × L into ob(L). How this functor should
handle arrows, and even which arrows it should operate on, is less obvious. However, the
other functor from the adjunction is easy to determine. It must take arbitrary objects from L
into ob(L × L). The only choice that both preserves the source data and takes advantage of
the target data-type is the 'diagonal functor' ∆ defined by ∆(C) = ⟨C, C⟩, ∆(f) = ⟨f, f⟩.
Now consider what it would mean for F to be left or right adjoint to ∆.
In case F ⊣ ∆, the adjunction between L × L and L establishes a correspondence between
L × L-arrows ⟨A, B⟩ −→ ⟨C, C⟩ and L-arrows F(⟨A, B⟩) −→ C. Thus, there will be a
derivation of C from F(⟨A, B⟩) precisely when there is a derivation of ⟨C, C⟩ from ⟨A, B⟩, i.e.,
two derivations of C: one from A and one from B. In this adjunction, the object F(⟨A, B⟩)
represents having available (an unknown) one of A and B from which to reason.
In case ∆ ⊣ F, the adjunction establishes a correspondence between L × L-arrows
⟨C, C⟩ −→ ⟨A, B⟩ and L-arrows C −→ F(⟨A, B⟩). Thus, there will be a derivation of
F(⟨A, B⟩) from C precisely when there are derivations of both A and B from C. In this
adjunction, the object F(⟨A, B⟩) represents having both A and B available as hypotheses.
The right choice for our purpose of representing the idea of reasoning from multiple
hypotheses in L is for F to be a right adjoint functor to ∆. The counit of this adjoint situation,
ε : ∆ ◦ F −→ 1L×L , is a family of morphisms in L × L, i.e., a family of pairs of
L-arrows. Given ⟨A, B⟩ ∈ ob(L × L), ε⟨A,B⟩ : ∆ ◦ F(⟨A, B⟩) −→ ⟨A, B⟩ is just ⟨pA :
F(⟨A, B⟩) −→ A, pB : F(⟨A, B⟩) −→ B⟩. Therefore the universality of ε guarantees, given
C ∈ ob(L), ⟨A, B⟩ ∈ ob(L × L), and ⟨f, g⟩ ∈ homL×L (∆(C), ⟨A, B⟩), a unique
h ∈ homL (C, F(⟨A, B⟩)) making this diagram commute:
[Diagram: the triangle in L × L with ε⟨A,B⟩ : ∆ ◦ F(⟨A, B⟩) −→ ⟨A, B⟩, ∆(h) : ∆(C) −→ ∆ ◦ F(⟨A, B⟩), and ⟨f, g⟩ : ∆(C) −→ ⟨A, B⟩, so that ⟨f, g⟩ = ε⟨A,B⟩ ◦ ∆(h); componentwise, the projections are pA : F(⟨A, B⟩) −→ A and pB : F(⟨A, B⟩) −→ B.]
This is precisely the universal mapping property of the product: we have rediscovered in F the product functor. (Analogous reasoning establishes that the
left adjoint functor to ∆ is the coproduct.)
We have verified the general fact that ∆ : C −→ C × C has a right adjoint if, and only
if, C has binary products. The interpretation of this fact for the category of proofs L is that
in order for L to have the structure needed to represent derivation from multiple assumptions,
it must have for every A and B a formula corresponding to the conjunction of A and B—a
formula A ∧ B from which there are proofs pA : A ∧ B −→ A and pB : A ∧ B −→ B such that
any proofs cA : C −→ A and cB : C −→ B must in fact be pA ◦ cA∧B and pB ◦ cA∧B for some
cA∧B : C −→ A ∧ B.
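In the category of Haskell types and functions, for instance, binary products exist and are given by pairing; the following sketch (the pairing combinator is the function-specialised (&&&) of Control.Arrow, here written out) shows the projections and the mediating arrow:

-- (a, b) plays the role of A ⊗ B; fst and snd play the projections p_A and p_B.
fork :: (c -> a) -> (c -> b) -> (c -> (a, b))
fork f g = \x -> (f x, g x)

-- Universal property (not machine-checked): fork f g is the unique h with
--   fst . h = f   and   snd . h = g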
Building from this we can ask whether L has within its arrow structure a reflection of the
structure obtained by extending the underlying logic with an additional axiom: Under what
conditions is the structure of derivation from conjunctions present already in the structure
of derivation from a single conjunct? We seek a natural isomorphism between arrows A ∧ B −→ C
and arrows with source object A. The adjunction giving rise to this isomorphism
will have a left adjoint functor mapping objects to products. This is the 'right product functor'
(−) ⊗ B : L −→ L, which maps any formula A to A ∧ B and any arrow f : C −→ D to
f ⊗ 1B : C ∧ B −→ D ∧ B. Let us determine the right adjoint functor to (−) ⊗ B, customarily
denoted (−)^B. The counit of the adjunction is again described by a natural transformation
from the composition of the left adjoint functor with the right adjoint functor to the identity
functor: ε^B : ((−)^B) ⊗ B −→ 1L . This is a family of morphisms in L: Given C ∈ ob(L),
ε^B_C : C^B ⊗ B −→ C. By the universality of ε^B, given A and f : A ⊗ B −→ C, there is a unique
g : A −→ C^B making this diagram commute:
[Diagram (5): the triangle with ε^B_C : C^B ⊗ B −→ C, g ⊗ 1B : A ⊗ B −→ C^B ⊗ B, and f : A ⊗ B −→ C, so that f = ε^B_C ◦ (g ⊗ 1B).]
In the intended interpretation of L, this diagram describes a special derivation ε^B_C from
C^B ∧ B to C, special in that any formula A that can 'fill the role' of C^B, by generating a
derivation f of C from A ∧ B, must do so via a unique derivation g of C^B from A followed by ε^B_C.
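Readers who know Haskell may recognize this adjunction: in the category of Haskell types, pairing with a fixed type b is left adjoint to the function-space construction (b ->), and the hom-set isomorphism is exactly currying (the Prelude's own curry and uncurry implement it). A minimal sketch, with fresh names to avoid shadowing the Prelude:

-- (-, b) plays the role of (−) ⊗ B and (b ->) the role of (−)^B.
curryD :: ((a, b) -> c) -> (a -> (b -> c))
curryD f = \x y -> f (x, y)

uncurryD :: (a -> (b -> c)) -> ((a, b) -> c)
uncurryD g = \(x, y) -> g x y

-- Counit ε^B_C : C^B ⊗ B −→ C, i.e. evaluation, the categorical image of
-- modus ponens.
eval :: (b -> c, b) -> c
eval (g, y) = g y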
10. Hindsight
The thought of architects of modern logic such as Bolzano, Frege, Tarski, Herbrand, and
Gentzen can only be fully appreciated from a retrospective point of view, informed by their
contemporaries’ rival conceptions and by later developments of their work that they could not
have imagined themselves. It is possible to derive the Deduction Theorem as a property of a
canonical proof system, à la Herbrand. And if this is all one knows about the theorem, Frege’s
use of the property to demonstrate theorems in his own system appears unlicensed, careless,
and above all, mysterious.
But as Tarski showed, it is also possible to impose the Deduction Theorem as a constraint
on any reasonable logical system. Tarski’s abstract perspective has been hard for historians
to appreciate, as they searched for clues that he had a concrete demonstration of the fact for
some particular system—a proof that could be compared with Herbrand’s. But when one
sees in the modern theory of admissibility how several distinct consequence relations can
generate the same set of theorems, Tarski’s abstract approach is just what is called for: It is a
condition, over and above the definition of a consequence relation, needed to pick out logics
that adequately capture the concept of propositional inference.
There are, it turns out, two different ways to motivate the Deduction Theorem as a prin-
ciple of logic, rather than a property of a specific system. First, one could see it as the ‘other
side of the coin’ of modus ponens, as was Gentzen’s idea with natural deduction, where one
views modus ponens and the Deduction Theorem as together defining conditionalization via
a universal mapping property. According to this perspective, the Deduction Theorem stands
no more in need of justification than modus ponens itself, and in fact the same analysis of
conditional language justifies them both simultaneously. For Gentzen, this perspective was
hard-won against a current of thinking that distinguished logic’s object-level and meta-level,
according to which axioms and rules of the first sort can be engineered in light of concep-
tual analysis whereas relations of the second sort stand in need of mathematical verification.
Equipped with hindsight, we see that Gentzen’s dissolution of this distinction led him back to
the perspective of Bolzano, for whom defining conditionals simultaneously in terms of modus
ponens and the Deduction Theorem was natural.
Alternatively, the Deduction Theorem can arise indirectly, simply by stipulating that the
structure of reasoning from additional hypotheses be present in a logic’s theorem structure. On
this route to the Deduction Theorem, rather than being imposed from the outside, conjunction
and conditionalization are discovered as the functors required to make the desired structure
arise. But amazingly, these two constraints turn out to be interderivable: The unique right
adjoint to product is exponentiation, and, conversely, if exponentiation is definable for all
pairs, then product has a right adjoint.
Perhaps most surprisingly, this understanding of the Deduction Theorem as an adjunction
is the one most easily read back into Frege’s thought. We asked how Frege could have assumed
the Deduction Theorem and advocated its use without verification. One answer that cannot
be given is that Frege defined his conditional stroke in terms of the Deduction Theorem, as
Bolzano had done decades before and Gentzen would do again as many years later. Frege had
other, explicit definitions of the conditional stroke on offer, and the same question could just
be rephrased as a puzzle about why he was confident that his multiple definitions align. On
the other hand, as a property of the propositional fragment of Begriffsschrift, the Deduction
Theorem stands in need of just the sort of verification Herbrand would eventually provide and
Frege clearly did not seek.
But if Frege simply conceived of the system of Begriffsschrift as deductively complete
and structurally complete for the same reason he conceived of it as semantically complete,
his appeal to the identity (2) makes perfect sense. By semantic completeness, all laws of
thought are theorems. By structural completeness, those same laws of thought are available for
‘deducing’ things from arbitrary hypotheses, even if this can’t be described as ‘inference’. And
by deductive completeness, no relations between thoughts uncovered through these deductions
could fail to be expressed again as theorems, for theorems simply reflect those very laws of
deduction. Whether Frege’s (1883) insistence that the Begriffsschrift is ‘not a mere calculus
ratiocinator’ but instead ‘a lingua characteristica in the Leibnizian sense’ justifies any of
these assumptions is better left for others to debate. However that debate might resolve, the
rediscovery of the Deduction Theorem as an adjunction provides the hindsight to make sense
of Frege’s remarks.
References
Barwise, J., J. Etchemendy, and D. Barker-Plummer. 2011. Language, Proof, and Logic
(Second Edition). CSLI.
van Benthem, J. 1985. ‘The variety of consequence, according to Bolzano’, Studia Logica,
44(4), 389–403.
Bernays, P. 1918. Beiträge zur axiomatischen Behandlung des Logik-Kalküls, Habilitationss-
chrift, University of Göttingen.
Bolzano, B. 1837. Wissenschaftslehre. Versuch einer ausführlichen und größtentheils neuen
Darstellung der Logik mit steter Rücksicht auf deren bisherige Bearbeiter. Sulzbach: Seidel.
Trans. by B. Terrel in B. Bolzano Theory of Science, J. Berg (ed.) 1973. Boston: D. Reidel
Publishing Company.
Bynum, T. W. (ed.) 1972. Conceptual Notation and Related Articles, New York: Oxford
University Press.
Chagrov, A. V. 1992. 'A decidable modal logic with the undecidable admissibility problem for
inference rules’, Algebra and Logic, 31, 53–55.
Church, A. 1947. ‘Review of W. V. Quine A Short Course in Logic’, Journal of Symbolic
Logic, 12(2), 60–61.
Czelakowski, J. 1985. ‘Algebraic aspects of deduction theorems’, Studia Logica, 44(4), 369–
87.
Došen, K. 1996. ‘Deductive completeness’, Bulletin of Symbolic Logic, 2(3), 243–83.
Došen, K. 2003. ‘Identity of proofs based on normalization and generality’, Bulletin of Sym-
bolic Logic, 9(4), 477–503.
Došen, K. 2006. ‘Models of deduction’, Synthese, 148(3), 639–57.
Ewald, W. and W. Sieg (eds.) 2010. David Hilbert’s Lectures on the Foundations of Arithmetic
and Logic, 1917-1933, vol. 3. Springer.
Feferman, S., J. W. Dawson, Jr., S.C. Kleene, G. H. Moore, R. M. Solovay, and J. van Hei-
jenoort (eds.) 1986. Kurt Gödel, Collected Works, Vol I: Publications 1929-1936, New York:
Oxford University Press.
Franks, C. 2010. ‘Cut as consequence’, History and Philosophy of Logic, 31(4), 349–79.
Franks, C. 2014. ‘Logical completeness, form, and content: an archaeology’, in J. Kennedy
(ed.) Interpreting Gödel: Critical Essays. New York: Cambridge University Press.
Franks, C. 2017. ‘Hilbert’s logic’, in A. P. Malpass and M. Antonutti-Marfori (eds.) The
History of Philosophical and Formal Logic. Bloomsbury.
Franks, C. 2018, ‘The context of inference’, History and Philosophy of Logic, 39(4), 365–95.
Frege, G. 1879. Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des
reinen Denkens Halle: L. Nebert. Translated by T. W. Bynum as Conceptual Notation: a
formula language of pure thought modeled upon the formula language of arithmetic in Bynum
1972, 101-208.
Frege, G. 1883. ‘Über den Zweck der Begriffsschrift’, Sitzungsberichte der Jenaischen Gesellschaft
für Medicin und Naturwissenschaft, JZN 16, 1-10. Translated by T. W. Bynum as ‘On the aim
of the “Conceptual Notation”’ in Bynum 1972, 90-100.
Frege, G. 1910. ‘Letter to Jourdain’, translated and reprinted in Frege 1980.
Frege, G. 1917. ‘Letter to Dingler’, translated and reprinted in Frege 1980.
Frege, G. 1980. Philosophical and Mathematical Correspondence, G. Gabriel, et al. (eds.)
Oxford: Blackwell Publishers.
Gentzen, G. 1932. ‘Über die Existenz unabhängiger Axiomensysteme zu unendlichen Satzsys-
temen’, Mathematische Annalen 107, 329-50. Translated as ‘On the existence of independent
axiom systems for infinite sentence systems' in Szabo 1969, 29-52.
Gentzen, G. 1934–35. ‘Untersuchungen über das logische Schließen’. Gentzen’s doctoral
thesis at the University of Göttingen, translated as ‘Investigations into logical deduction’ in
Szabo 1969, 68-131.
Gödel, K. 1931. ‘Über formal unentscheidbare Sätze der Principia Mathematica und ver-
wandter Systeme I’, Monatshefte für Mathematik und Physik 38, 173-98, translation by J. van
Heijenoort as ‘On formally undecidable propositions of Principia Mathematica and related
systems I’ reprinted in Feferman et al. 1986, 144-95.
Gödel, K. 1929. ‘Über die Vollständigkeit des Logikkalküls’, Gödel’s doctoral thesis at the
University of Vienna, translation by S. Bauer-Mengelberg and Jean van Heijenoort as ‘On the
completeness of the calculus of logic’ reprinted in Feferman et al. 1986, 60-101.
Goldblatt, R. 2006. Topoi: the Categorical Analysis of Logic. Revised edition. Dover.
Goldfarb, W. (ed.) 1971. Jacques Herbrand: Logical Writings. Cambridge: Harvard Univer-
sity Press.
Harnik, V. and M. Makkai. 1992. 'Lambek's categorical proof theory and Läuchli's abstract
realizability’, Journal of Symbolic Logic, 57(1), 200–230.
Harrop, R. 1956. ‘On disjunctions and existential statements in intuitionistic systems of logic’,
Mathematische Annalen, 132, 347–61.
Herbrand, J. 1929. 'Sur quelques propriétés des propositions vraies et leurs applications'.
Translated by W. Goldfarb as ‘On several properties of true propositions and their applications’
in Goldfarb 1971, 38–40.
Herbrand, J. 1930. Recherches sur la théorie de la démonstration. Herbrand’s doctoral thesis
at the University of Paris. Translated by W. Goldfarb, except pp. 133-88 trans. by B. Dreben
and J. van Heijenoort, as ‘Investigations in proof theory’ in Goldfarb 1971, 44-202.
Hertz, P. 1929. ‘Über Axiomensysteme für beliebige Satzsysteme’, Mathematische Annalen,
101, 457–514.
Iemhoff, R. 2016. ‘Consequence relations and admissible rules’, Journal of Philosophical
Logic, 45 (3), 327–48.
Jaśkowski, S. 1934. 'On the rules of supposition in formal logic', Studia Logica, 1, 5–32.
Johansson, I. 1937. ‘Der Minimalkalkül, ein reduzierter intuitionistischer Formalismus’.
Compositio Mathematica 4, 119-136.
Kaye, R. 2007. The Mathematics of Logic. New York: Cambridge University Press.
Kreisel, G. and H. Putnam 1957. ‘Eine Unableitbarkeitsbeweismethode für den Intuitionistis-
chen Aussagenkalkül’, Zeitschrift für Mathematische Logik und Grundlagen der Mathematik
3, 74–78.
Lambek, J. 1968. ‘Deductive systems and categories I’, Mathematical Systems Theory, 2,
287–318.
Lambek, J. 1969. ‘Deductive systems and categories II: Standard constructions and closed
categories’, Lecture Notes in Mathematics (Category theory, homology theory and their ap-
plications), 86. Berlin: Springer-Verlag, 76–122.
Lambek, J. 1972. ‘Deductive systems and categories III: Cartesian closed categories, intuition-
ist propositional calculus and combinatory logic’, Lecture Notes in Mathematics (Toposes,
algebraic geometry and logic), 274. Berlin: Springer-Verlag, 57–82.
Lambek, J. 1974. ‘Functional completeness of cartesian categories’, Annals of Mathematical
Logic, 6, 259–92.
Lambek, J. and P. J. Scott. 1986. Introduction to Higher-order Categorical Logic. New York:
Cambridge University Press.
Łukasiewicz, J. and A. Tarski. 1930. ‘Untersuchungen über den Aussagenkalkül’. Comptes
rendus de la Société des sciences et des lettres de Varsovie, cl. iii, 23, 1–21. Translated by J.
H. Woodger as ‘Investigations into the sentential calculus’ in Tarski 1956, 38–59.
van der Molen, T. 2016. ‘The Johansson/Heyting letters and the birth of minimal logic’,
Institute for Logic, Language, and Computation, Amsterdam.
Pogorzelski, W. A. 1968. ‘On the scope of the classical deduction theorem’, Journal of Sym-
bolic Logic, 33(1), 77–81.
Porte, J. 1982. ‘Fifty years of deduction theorems’, Studies in Logic and the Foundations of
Mathematics, 107, 243–50.
Prawitz, D. 1971. ‘Towards a foundation of a general proof theory’, in J. E. Fenstad (ed.)
Ideas and Results in Proof Theory. Amsterdam: North Holland. 235–307.
Quine, W. V. O. 1950. ‘On natural deduction’, Journal of Symbolic Logic, 15(2), 93–102.
Quine, W. V. O. 1951. Mathematical Logic (Revised Edition). Cambridge: Harvard University
Press.
Quine, W. V. O. 1995. Selected Logic Papers. Enlarged edition. Cambridge: Harvard Univer-
sity Press.
Rose, G. F. 1953. ‘Propositional calculus and realizability’, Transactions of the American
Mathematical Society, 75, 1–19.
Rybakov, V. 1997. Admissibility of logical inference rules. Amsterdam: Elsevier.
Šebestik, J. 2016. ‘Bolzano’s logic’, Stanford Encyclopedia of Philosophy:
https://plato.stanford.edu/entries/bolzano-logic/
Szabo, M. E. 1969. The Collected Papers of Gerhard Gentzen, London: North Holland.
Tarski, A. 1930. ‘Über einige fundamentale Begriffe der Metamathematik’. Translated by J.
H. Woodger as ‘On some fundamental concepts of metamathematics’ in Tarski 1956, 30–37.
Tarski, A. 1933. ‘Einige Betrachtungen über die Begriffe ω-Widerspruchsfreiheit und der
ω-Vollständigkeit’, Monatshefte für Mathematik und Physik, 40.
Tarski, A. 1936. ‘On the concept of logical consequence’, first English translation of a 1935
address at the International Congress of Scientific Philosophy in Paris, in Tarski 1956.
Tarski, A. 1956. Logic, Semantics, Metamathematics. J. H. Woodger (trans.). New York:
Oxford University Press.