
Applied Combinatorics

SECOND EDITION

FRED S. ROBERTS
BARRY TESMAN
Chapman & Hall/CRC
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742
© 2009 by Taylor and Francis Group, LLC
Chapman & Hall/CRC is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed in the United States of America on acid-free paper


10 9 8 7 6 5 4 3 2 1

International Standard Book Number: 978-1-4200-9982-9 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been
made to publish reliable data and information, but the author and publisher cannot assume responsibility for the valid-
ity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright
holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this
form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may
rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or uti-
lized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopy-
ing, microfilming, and recording, or in any information storage or retrieval system, without written permission from the
publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://
www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923,
978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For
organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for
identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Roberts, Fred S.
Applied combinatorics / Fred Roberts. -- 2nd ed. / Barry Tesman.
p. cm.
Originally published: 2nd ed. Upper Saddle River, N.J. : Pearson Education/Prentice-Hall, c2005.
Includes bibliographical references and index.
ISBN 978-1-4200-9982-9 (hardcover : alk. paper)
1. Combinatorial analysis. I. Tesman, Barry. II. Title.

QA164.R6 2009
511’.6--dc22 2009013043

Visit the Taylor & Francis Web site at
http://www.taylorandfrancis.com
and the CRC Press Web site at
http://www.crcpress.com
To: Helen,
David, and
Sarah

-F.S.R.

To: Johanna,
Emma, and
Lucy

-B.T.

Contents
Preface xvii
Notation xxvii
1 What Is Combinatorics? 1
1.1 The Three Problems of Combinatorics . . . . . . . . . . . . . . . . . 1
1.2 The History and Applications of Combinatorics . . . . . . . . . . . . 8
References for Chapter 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
PART I The Basic Tools of Combinatorics 15
2 Basic Counting Rules 15
2.1 The Product Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2 The Sum Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3 Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.4 Complexity of Computation . . . . . . . . . . . . . . . . . . . . . . . 27
2.5 r-Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.6 Subsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.7 r-Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.8 Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.9 Sampling with Replacement . . . . . . . . . . . . . . . . . . . . . . . 47
2.10 Occupancy Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.10.1 The Types of Occupancy Problems . . . . . . . . . . . . . . . 51
2.10.2 Case 1: Distinguishable Balls and Distinguishable Cells . . . 53
2.10.3 Case 2: Indistinguishable Balls and Distinguishable Cells . . 53
2.10.4 Case 3: Distinguishable Balls and Indistinguishable Cells . . 54
2.10.5 Case 4: Indistinguishable Balls and Indistinguishable Cells . 55
2.10.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.11 Multinomial Coefficients . . . . . . . . . . . . . . . . . . . . . 59
2.11.1 Occupancy Problems with a Specified Distribution . . . . . . 59
2.11.2 Permutations with Classes of Indistinguishable Objects . . . 62
2.12 Complete Digest by Enzymes . . . . . . . . . . . . . . . . . . . . . . 64


2.13 Permutations with Classes of Indistinguishable Objects Revisited . . 68


2.14 The Binomial Expansion . . . . . . . . . . . . . . . . . . . . . . . . . 70
2.15 Power in Simple Games . . . . . . . . . . . . . . . . . . . . . . . . . 73
2.15.1 Examples of Simple Games . . . . . . . . . . . . . . . . . . . 73
2.15.2 The Shapley-Shubik Power Index . . . . . . . . . . . . . . . . 75
2.15.3 The U.N. Security Council . . . . . . . . . . . . . . . . . . . 78
2.15.4 Bicameral Legislatures . . . . . . . . . . . . . . . . . . . . . . 78
2.15.5 Cost Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . 79
2.15.6 Characteristic Functions . . . . . . . . . . . . . . . . . . . . . 80
2.16 Generating Permutations and Combinations . . . . . . . . . . . . . . 84
2.16.1 An Algorithm for Generating Permutations . . . . . . . . . . 84
2.16.2 An Algorithm for Generating Subsets of Sets . . . . . . . . . 86
2.16.3 An Algorithm for Generating Combinations . . . . . . . . . 88
2.17 Inversion Distance Between Permutations and the Study of
Mutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
2.18 Good Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
2.18.1 Asymptotic Analysis . . . . . . . . . . . . . . . . . . . . . . . 96
2.18.2 NP-Complete Problems . . . . . . . . . . . . . . . . . . . . . 99
2.19 Pigeonhole Principle and Its Generalizations . . . . . . . . . . . . . . 101
2.19.1 The Simplest Version of the Pigeonhole Principle . . . . . . . 101
2.19.2 Generalizations and Applications of the Pigeonhole
Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
2.19.3 Ramsey Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 106
Additional Exercises for Chapter 2 . . . . . . . . . . . . . . . . . . . . . . 111
References for Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
3 Introduction to Graph Theory 119
3.1 Fundamental Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . 119
3.1.1 Some Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 119
3.1.2 Definition of Digraph and Graph . . . . . . . . . . . . . . 124
3.1.3 Labeled Digraphs and the Isomorphism Problem . . . . . . . 127
3.2 Connectedness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
3.2.1 Reaching in Digraphs . . . . . . . . . . . . . . . . . . . . . . 133
3.2.2 Joining in Graphs . . . . . . . . . . . . . . . . . . . . . . . . 135
3.2.3 Strongly Connected Digraphs and Connected Graphs . . . . . 135
3.2.4 Subgraphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
3.2.5 Connected Components . . . . . . . . . . . . . . . . . . . . . 138
3.3 Graph Coloring and Its Applications . . . . . . . . . . . . . . . . . . 145
3.3.1 Some Applications . . . . . . . . . . . . . . . . . . . . . . . . 145
3.3.2 Planar Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . 151
3.3.3 Calculating the Chromatic Number . . . . . . . . . . . . . . . 154
3.3.4 2-Colorable Graphs . . . . . . . . . . . . . . . . . . . . . . . . 155

3.3.5 Graph-Coloring Variants . . . . . . . . . . . . . . . . . . . . . 159


3.4 Chromatic Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . 172
3.4.1 Definitions and Examples . . . . . . . . . . . . . . . . . . 172
3.4.2 Reduction Theorems . . . . . . . . . . . . . . . . . . . . . . . 175
3.4.3 Properties of Chromatic Polynomials . . . . . . . . . . . . . . 179
3.5 Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
3.5.1 Definition of a Tree and Examples . . . . . . . . . . . . . 185
3.5.2 Properties of Trees . . . . . . . . . . . . . . . . . . . . . . . . 188
3.5.3 Proof of Theorem 3.15 . . . . . . . . . . . . . . . . . . . . . . 188
3.5.4 Spanning Trees . . . . . . . . . . . . . . . . . . . . . . . . . . 189
3.5.5 Proof of Theorem 3.16 and a Related Result . . . . . . . . . 192
3.5.6 Chemical Bonds and the Number of Trees . . . . . . . . . . . 193
3.5.7 Phylogenetic Tree Reconstruction . . . . . . . . . . . . . . . . 196
3.6 Applications of Rooted Trees to Searching, Sorting, and
Phylogeny Reconstruction . . . . . . . . . . . . . . . . . . . . . . . . 202
3.6.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . 202
3.6.2 Search Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
3.6.3 Proof of Theorem 3.24 . . . . . . . . . . . . . . . . . . . . . . 206
3.6.4 Sorting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
3.6.5 The Perfect Phylogeny Problem . . . . . . . . . . . . . . . . . 211
3.7 Representing a Graph in the Computer . . . . . . . . . . . . . . . . 219
3.8 Ramsey Numbers Revisited . . . . . . . . . . . . . . . . . . . . . . . 224
References for Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
4 Relations 235
4.1 Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
4.1.1 Binary Relations . . . . . . . . . . . . . . . . . . . . . . . . . 235
4.1.2 Properties of Relations/Patterns in Digraphs . . . . . . . . . 240
4.2 Order Relations and Their Variants . . . . . . . . . . . . . . . . . . . 247
4.2.1 Defining the Concept of Order Relation . . . . . . . . . . . 247
4.2.2 The Diagram of an Order Relation . . . . . . . . . . . . . . . 250
4.2.3 Linear Orders . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
4.2.4 Weak Orders . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
4.2.5 Stable Marriages . . . . . . . . . . . . . . . . . . . . . . . . . 256
4.3 Linear Extensions of Partial Orders . . . . . . . . . . . . . . . . . . . 260
4.3.1 Linear Extensions and Dimension . . . . . . . . . . . . . . . . 260
4.3.2 Chains and Antichains . . . . . . . . . . . . . . . . . . . . . . 265
4.3.3 Interval Orders . . . . . . . . . . . . . . . . . . . . . . . . . . 270
4.4 Lattices and Boolean Algebras . . . . . . . . . . . . . . . . . . . . . 274
4.4.1 Lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
4.4.2 Boolean Algebras . . . . . . . . . . . . . . . . . . . . . . . . . 276
References for Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 282

PART II The Counting Problem 285


5 Generating Functions and Their Applications 285
5.1 Examples of Generating Functions . . . . . . . . . . . . . . . . . . . 285
5.1.1 Power Series . . . . . . . . . . . . . . . . . . . . . . . . . . . 286
5.1.2 Generating Functions . . . . . . . . . . . . . . . . . . . . . . 288
5.2 Operating on Generating Functions . . . . . . . . . . . . . . . . . . . 297
5.3 Applications to Counting . . . . . . . . . . . . . . . . . . . . . . . . 302
5.3.1 Sampling Problems . . . . . . . . . . . . . . . . . . . . . . . . 302
5.3.2 A Comment on Occupancy Problems . . . . . . . . . . . . . . 309
5.4 The Binomial Theorem . . . . . . . . . . . . . . . . . . . . . . . . . 312
5.5 Exponential Generating Functions and Generating Functions for
Permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
5.5.1 Definition of Exponential Generating Function . . . . . . . 320
5.5.2 Applications to Counting Permutations . . . . . . . . . . . . 321
5.5.3 Distributions of Distinguishable Balls into Indistinguishable
Cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
5.6 Probability Generating Functions . . . . . . . . . . . . . . . . . . . . 328
5.7 The Coleman and Banzhaf Power Indices . . . . . . . . . . . . . . . 333
References for Chapter 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 337
6 Recurrence Relations 339
6.1 Some Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 339
6.1.1 Some Simple Recurrences . . . . . . . . . . . . . . . . . . . . 339
6.1.2 Fibonacci Numbers and Their Applications . . . . . . . . . . 346
6.1.3 Derangements . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
6.1.4 Recurrences Involving More than One Sequence . . . . . . . . 354
6.2 The Method of Characteristic Roots . . . . . . . . . . . . . . . . . . 360
6.2.1 The Case of Distinct Roots . . . . . . . . . . . . . . . . . . . 360
6.2.2 Computation of the kth Fibonacci Number . . . . . . . . . . 363
6.2.3 The Case of Multiple Roots . . . . . . . . . . . . . . . . . . . 364
6.3 Solving Recurrences Using Generating Functions . . . . . . . . . . 369
6.3.1 The Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 369
6.3.2 Derangements . . . . . . . . . . . . . . . . . . . . . . . . . . . 375
6.3.3 Simultaneous Equations for Generating Functions . . . . . . 377
6.4 Some Recurrences Involving Convolutions . . . . . . . . . . . . . . . 382
6.4.1 The Number of Simple, Ordered, Rooted Trees . . . . . . . . 382
6.4.2 The Ways to Multiply a Sequence of Numbers in a
Computer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
6.4.3 Secondary Structure in RNA . . . . . . . . . . . . . . . . . . 389

6.4.4 Organic Compounds Built Up from Benzene Rings . . . . . . 391


References for Chapter 6 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
7 The Principle of Inclusion and Exclusion 403
7.1 The Principle and Some of Its Applications . . . . . . . . . . . . . . 403
7.1.1 Some Simple Examples . . . . . . . . . . . . . . . . . . . . . 403
7.1.2 Proof of Theorem 7.1 . . . . . . . . . . . . . . . . . . . 406
7.1.3 Prime Numbers, Cryptography, and Sieves . . . . . . . . . . 407
7.1.4 The Probabilistic Case . . . . . . . . . . . . . . . . . . . . . . 412
7.1.5 The Occupancy Problem with Distinguishable Balls and
Cells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
7.1.6 Chromatic Polynomials . . . . . . . . . . . . . . . . . . . . . 414
7.1.7 Derangements . . . . . . . . . . . . . . . . . . . . . . . . . . . 417
7.1.8 Counting Combinations . . . . . . . . . . . . . . . . . . . . . 418
7.1.9 Rook Polynomials . . . . . . . . . . . . . . . . . . . . . . . . 419
7.2 The Number of Objects Having Exactly m Properties . . . . . . . . 425
7.2.1 The Main Result and Its Applications . . . . . . . . . . . . . 425
7.2.2 Proofs of Theorems 7.4 and 7.5 . . . . . . . . . . . . . . . . . 431
References for Chapter 7 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
8 The Polya Theory of Counting 439
8.1 Equivalence Relations . . . . . . . . . . . . . . . . . . . . . . . . . . 439
8.1.1 Distinct Configurations and Databases . . . . . . . . . . . 439
8.1.2 Definition of Equivalence Relations . . . . . . . . . . . . 440
8.1.3 Equivalence Classes . . . . . . . . . . . . . . . . . . . . . . . 445
8.2 Permutation Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . 449
8.2.1 Definition of a Permutation Group . . . . . . . . . . . . . 449
8.2.2 The Equivalence Relation Induced by a Permutation Group . 452
8.2.3 Automorphisms of Graphs . . . . . . . . . . . . . . . . . . . . 453
8.3 Burnside's Lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
8.3.1 Statement of Burnside's Lemma . . . . . . . . . . . . . . . . 457
8.3.2 Proof of Burnside's Lemma . . . . . . . . . . . . . . . . . . . 459
8.4 Distinct Colorings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 462
8.4.1 Definition of a Coloring . . . . . . . . . . . . . . . . . . 462
8.4.2 Equivalent Colorings . . . . . . . . . . . . . . . . . . . . . . . 464
8.4.3 Graph Colorings Equivalent under Automorphisms . . . . . . 466
8.4.4 The Case of Switching Functions . . . . . . . . . . . . . . . . 467
8.5 The Cycle Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 472
8.5.1 Permutations as Products of Cycles . . . . . . . . . . . . . . 472
8.5.2 A Special Case of Polya's Theorem . . . . . . . . . . . . . . . 474
8.5.3 Graph Colorings Equivalent under Automorphisms
Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475

8.5.4 The Case of Switching Functions . . . . . . . . . . . . . . . . 476


8.5.5 The Cycle Index of a Permutation Group . . . . . . . . . . . 476
8.5.6 Proof of Theorem 8.6 . . . . . . . . . . . . . . . . . . . . . . 477
8.6 Polya's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 480
8.6.1 The Inventory of Colorings . . . . . . . . . . . . . . . . . . . 480
8.6.2 Computing the Pattern Inventory . . . . . . . . . . . . . . . . 482
8.6.3 The Case of Switching Functions . . . . . . . . . . . . . . . . 484
8.6.4 Proof of Polya's Theorem . . . . . . . . . . . . . . . . . . . . 485
References for Chapter 8 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
PART III The Existence Problem 489
9 Combinatorial Designs 489
9.1 Block Designs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 489
9.2 Latin Squares . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 494
9.2.1 Some Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 494
9.2.2 Orthogonal Latin Squares . . . . . . . . . . . . . . . . . . . . 497
9.2.3 Existence Results for Orthogonal Families . . . . . . . . . . . 500
9.2.4 Proof of Theorem 9.3 . . . . . . . . . . . . . . . . . . . . . . 505
9.2.5 Orthogonal Arrays with Applications to Cryptography . . . . 506
9.3 Finite Fields and Complete Orthogonal Families of Latin Squares . . 513
9.3.1 Modular Arithmetic . . . . . . . . . . . . . . . . . . . . . . . 513
9.3.2 Modular Arithmetic and the RSA Cryptosystem . . . . . . . 514
9.3.3 The Finite Fields GF(p^k) . . . . . . . . . . . . . . . . . 516
9.3.4 Construction of a Complete Orthogonal Family of n × n Latin
Squares if n Is a Power of a Prime . . . . . . . . . . . . . . . 519
9.3.5 Justification of the Construction of a Complete Orthogonal
Family if n = p^k . . . . . . . . . . . . . . . . . . . . . . 521
9.4 Balanced Incomplete Block Designs . . . . . . . . . . . . . . . . . . . 525
9.4.1 (b, v, r, k, λ)-Designs . . . . . . . . . . . . . . . . . . . 525
9.4.2 Necessary Conditions for the Existence of
(b, v, r, k, λ)-Designs . . . . . . . . . . . . . . . . . . 528
9.4.3 Proof of Fisher's Inequality . . . . . . . . . . . . . . . . . . . 530
9.4.4 Resolvable Designs . . . . . . . . . . . . . . . . . . . . . . . . 532
9.4.5 Steiner Triple Systems . . . . . . . . . . . . . . . . . . . . . . 533
9.4.6 Symmetric Balanced Incomplete Block Designs . . . . . . . . 536
9.4.7 Building New (b, v, r, k, λ)-Designs from Existing Ones . . . . 537
9.4.8 Group Testing and Its Applications . . . . . . . . . . . . . . . 539
9.4.9 Steiner Systems and the National Lottery . . . . . . . . . . . 542
9.5 Finite Projective Planes . . . . . . . . . . . . . . . . . . . . . . . . . 549
9.5.1 Basic Properties . . . . . . . . . . . . . . . . . . . . . . . . . 549

9.5.2 Projective Planes, Latin Squares, and (v, k, λ)-Designs . . . . 553


References for Chapter 9 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 558
10 Coding Theory 561
10.1 Information Transmission . . . . . . . . . . . . . . . . . . . . . . . . 561
10.2 Encoding and Decoding . . . . . . . . . . . . . . . . . . . . . . . . . 562
10.3 Error-Correcting Codes . . . . . . . . . . . . . . . . . . . . . . . . . 567
10.3.1 Error Correction and Hamming Distance . . . . . . . . . . . 567
10.3.2 The Hamming Bound . . . . . . . . . . . . . . . . . . . . . . 570
10.3.3 The Probability of Error . . . . . . . . . . . . . . . . . . . . . 571
10.3.4 Consensus Decoding and Its Connection to Finding Patterns
in Molecular Sequences . . . . . . . . . . . . . . . . . . . . . 573
10.4 Linear Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 582
10.4.1 Generator Matrices . . . . . . . . . . . . . . . . . . . . . . . . 582
10.4.2 Error Correction Using Linear Codes . . . . . . . . . . . . . . 584
10.4.3 Hamming Codes . . . . . . . . . . . . . . . . . . . . . . . . . 587
10.5 The Use of Block Designs to Find Error-Correcting Codes . . . . . . 591
10.5.1 Hadamard Codes . . . . . . . . . . . . . . . . . . . . . . . . . 591
10.5.2 Constructing Hadamard Designs . . . . . . . . . . . . . . . . 592
10.5.3 The Richest (n, d)-Codes . . . . . . . . . . . . . . . . . . 597
10.5.4 Some Applications . . . . . . . . . . . . . . . . . . . . . . . . 602
References for Chapter 10 . . . . . . . . . . . . . . . . . . . . . . . . . . . 605
11 Existence Problems in Graph Theory 609
11.1 Depth-First Search: A Test for Connectedness . . . . . . . . . . . . . 610
11.1.1 Depth-First Search . . . . . . . . . . . . . . . . . . . . . . . . 610
11.1.2 The Computational Complexity of Depth-First Search . . . . 612
11.1.3 A Formal Statement of the Algorithm . . . . . . . . . . . . . 612
11.1.4 Testing for Connectedness of Truly Massive Graphs . . . . . 613
11.2 The One-Way Street Problem . . . . . . . . . . . . . . . . . . . . . . 616
11.2.1 Robbins' Theorem . . . . . . . . . . . . . . . . . . . . . . . . 616
11.2.2 A Depth-First Search Algorithm . . . . . . . . . . . . . . . . 619
11.2.3 Efficient One-Way Street Assignments . . . . . . . . . . . 621
11.2.4 Efficient One-Way Street Assignments for Grids . . . . . . . 623
11.2.5 Annular Cities and Communications in Interconnection
Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 625
11.3 Eulerian Chains and Paths . . . . . . . . . . . . . . . . . . . . . . . 632
11.3.1 The Königsberg Bridge Problem . . . . . . . . . . . . . . 632
11.3.2 An Algorithm for Finding an Eulerian Closed Chain . . . . . 633
11.3.3 Further Results about Eulerian Chains and Paths . . . . . . 635
11.4 Applications of Eulerian Chains and Paths . . . . . . . . . . . . . . . 640
11.4.1 The "Chinese Postman" Problem . . . . . . . . . . . . . . 640

11.4.2 Computer Graph Plotting . . . . . . . . . . . . . . . . . . . . 642


11.4.3 Street Sweeping . . . . . . . . . . . . . . . . . . . . . . . . . . 642
11.4.4 Finding Unknown RNA/DNA Chains . . . . . . . . . . . . . 645
11.4.5 A Coding Application . . . . . . . . . . . . . . . . . . . . . . 648
11.4.6 De Bruijn Sequences and Telecommunications . . . . . . . . . 650
11.5 Hamiltonian Chains and Paths . . . . . . . . . . . . . . . . . . . . . 656
11.5.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . 656
11.5.2 Sufficient Conditions for the Existence of a Hamiltonian
Circuit in a Graph . . . . . . . . . . . . . . . . . . . . . . 658
11.5.3 Sufficient Conditions for the Existence of a Hamiltonian Cycle
in a Digraph . . . . . . . . . . . . . . . . . . . . . . . . . . . 660
11.6 Applications of Hamiltonian Chains and Paths . . . . . . . . . . . . 666
11.6.1 Tournaments . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
11.6.2 Topological Sorting . . . . . . . . . . . . . . . . . . . . . . . . 669
11.6.3 Scheduling Problems in Operations Research . . . . . . . . . 670
11.6.4 Facilities Design . . . . . . . . . . . . . . . . . . . . . . . . . 671
11.6.5 Sequencing by Hybridization . . . . . . . . . . . . . . . . . . 673
References for Chapter 11 . . . . . . . . . . . . . . . . . . . . . . . . . . . 678
PART IV Combinatorial Optimization 683
12 Matching and Covering 683
12.1 Some Matching Problems . . . . . . . . . . . . . . . . . . . . . . . . 683
12.2 Some Existence Results: Bipartite Matching and Systems of Distinct
Representatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 690
12.2.1 Bipartite Matching . . . . . . . . . . . . . . . . . . . . . . . . 690
12.2.2 Systems of Distinct Representatives . . . . . . . . . . . . . . 692
12.3 The Existence of Perfect Matchings for Arbitrary Graphs . . . . . . 699
12.4 Maximum Matchings and Minimum Coverings . . . . . . . . . . . . 702
12.4.1 Vertex Coverings . . . . . . . . . . . . . . . . . . . . . . . . . 702
12.4.2 Edge Coverings . . . . . . . . . . . . . . . . . . . . . . . . . . 704
12.5 Finding a Maximum Matching . . . . . . . . . . . . . . . . . . . . . 706
12.5.1 M-Augmenting Chains . . . . . . . . . . . . . . . . . . . . . . 706
12.5.2 Proof of Theorem 12.7 . . . . . . . . . . . . . . . . . . . . . . 707
12.5.3 An Algorithm for Finding a Maximum Matching . . . . . . . 709
12.6 Matching as Many Elements of X as Possible . . . . . . . . . . . . . 714
12.7 Maximum-Weight Matching . . . . . . . . . . . . . . . . . . . . . . . 716
12.7.1 The "Chinese Postman" Problem Revisited . . . . . . . . . 717
12.7.2 An Algorithm for the Optimal Assignment Problem
(Maximum-Weight Matching) . . . . . . . . . . . . . . . . . . 718
12.8 Stable Matchings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
12.8.1 Gale-Shapley Algorithm . . . . . . . . . . . . . . . . . . . . . 726

12.8.2 Numbers of Stable Matchings . . . . . . . . . . . . . . . . . . 727


12.8.3 Structure of Stable Matchings . . . . . . . . . . . . . . . . . . 729
12.8.4 Stable Marriage Extensions . . . . . . . . . . . . . . . . . . . 731
References for Chapter 12 . . . . . . . . . . . . . . . . . . . . . . . . . . . 735
13 Optimization Problems for Graphs and Networks 737
13.1 Minimum Spanning Trees . . . . . . . . . . . . . . . . . . . . . . . . 737
13.1.1 Kruskal's Algorithm . . . . . . . . . . . . . . . . . . . . . . . 737
13.1.2 Proof of Theorem 13.1 . . . . . . . . . . . . . . . . . . . . . . 740
13.1.3 Prim's Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . 741
13.2 The Shortest Route Problem . . . . . . . . . . . . . . . . . . . . . . 745
13.2.1 The Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 745
13.2.2 Dijkstra's Algorithm . . . . . . . . . . . . . . . . . . . . . . . 748
13.2.3 Applications to Scheduling Problems . . . . . . . . . . . . . . 751
13.3 Network Flows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757
13.3.1 The Maximum-Flow Problem . . . . . . . . . . . . . . . . . . 757
13.3.2 Cuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 760
13.3.3 A Faulty Max-Flow Algorithm . . . . . . . . . . . . . . . . . 763
13.3.4 Augmenting Chains . . . . . . . . . . . . . . . . . . . . . . . 764
13.3.5 The Max-Flow Algorithm . . . . . . . . . . . . . . . . . . . . 768
13.3.6 A Labeling Procedure for Finding Augmenting Chains . . . . 770
13.3.7 Complexity of the Max-Flow Algorithm . . . . . . . . . . . . 772
13.3.8 Matching Revisited . . . . . . . . . . . . . . . . . . . . . . . . 773
13.3.9 Menger's Theorems . . . . . . . . . . . . . . . . . . . . . . . . 776
13.4 Minimum-Cost Flow Problems . . . . . . . . . . . . . . . . . . . . . 785
13.4.1 Some Examples . . . . . . . . . . . . . . . . . . . . . . . . . . 785
References for Chapter 13 . . . . . . . . . . . . . . . . . . . . . . . . . . . 792
Appendix: Answers to Selected Exercises 797
Author Index 833
Subject Index 841
Preface
The second edition of Applied Combinatorics comes 20 years after the first one. It
has been substantially rewritten, with more than 200 pages of new material and sig-
nificant changes in numerous sections. There are many new examples and exercises.
On the other hand, the main philosophy of the book is unchanged. The following
three paragraphs are from the preface of the first edition, and these words still ring
true today.
Perhaps the fastest growing area of modern mathematics is combinatorics.
A major reason for this rapid growth is its wealth of applications, to com-
puter science, communications, transportation, genetics, experimental design,
scheduling, and so on. This book introduces the reader to the tools of combi-
natorics from an applied point of view.
Much of the growth of combinatorics has gone hand in hand with the
development of the computer. Today's high-speed computers make it possi-
ble to implement solutions to practical combinatorial problems from a wide
variety of fields, solutions that could not be implemented until quite recently.
This has resulted in increased emphasis on the development of solutions to
combinatorial problems. At the same time, the development of computer sci-
ence has brought with it numerous challenging combinatorial problems of its
own. Thus, it is hard to separate combinatorial mathematics from computing.
The reader will see the emphasis on computing here by the frequent use of
examples from computer science, the frequent discussion of algorithms, and
so on. On the other hand, the general point of view taken in this book is that
combinatorics has a wealth of applications to a large number of subjects, and
this book has tried to emphasize the variety of these applications rather than
just focusing on one.
Many of the mathematical topics presented here are relatively standard
topics from the rapidly growing textbook literature of combinatorics. Others
are taken from the current research literature, or are chosen because they
illustrate interesting applications of the subject. The book is distinguished,
we believe, by its wide-ranging treatment of applications. Entire sections
are devoted to such applications as switching functions, the use of enzymes
to uncover unknown RNA chains, searching and sorting problems of infor-
mation retrieval, construction of error-correcting codes, counting of chemical
compounds, calculation of power in voting situations, and uses of Fibonacci
numbers. There are entire sections on applications of recurrences involving
convolutions, applications of eulerian chains, applications of generating func-
tions, and so on, that are unique to the literature.
WHAT'S NEW IN THIS EDITION?
Much of the appeal of this book has stemmed from its references to modern
literature and real applications. The applications that motivate the development
and use of combinatorics are expanding greatly, especially in the natural and social
sciences. In particular, computer science and biology are primary sources of many of
the new applications appearing in this second edition. Along with these additions,
we have also added some major new topics, deleted some specialized ones, made
organizational changes, and updated and improved the examples, exercises, and
references to the literature.
Some of the major changes in the second edition are the following:
Chapter 1 (What Is Combinatorics?): We have added major new material on
list colorings, expanding discussion of scheduling legislative committees. List
colorings are returned to in various places in the book.
Chapter 2 (Basic Counting Rules): Section 2.16, which previously only discuss-
ed algorithmic methods for generating permutations, has been substantially
expanded and broken into subsections. Section 2.18, which introduces the
notion of "good algorithms" and NP-completeness, has been substantially
rewritten and modernized. A new section on the pigeonhole principle has
been added. The section consists of the material from Section 8.1 of the first
edition and some of the material from Section 8.2 that deals with Ramsey the-
ory. We have also added a substantial new section on the inversion distance
between permutations and the study of mutations in evolutionary biology.
Chapter 3 (Introduction to Graph Theory): A major new subsection has
been added to Section 3.3, the graph coloring section. This new subsection
deals with the generalizations of graph coloring, such as set coloring, list col-
oring, and T-coloring, that have been motivated by practical problems such
as mobile radio telephone problems, traffic phasing, and channel assignment.
We have also introduced a major new subsection on phylogenetic tree recon-
struction. Much of the material on Ramsey theory from Chapter 8 of the first
edition, and not covered in Chapter 2, has been updated and presented in a
new section.
Chapter 4 (Relations): This chapter is brand new. Concepts of binary rela-
tions are defined and connected to digraphs. Orders are introduced using
digraphs and relations, and parts of the new chapter deal with linear and
weak orders; partial orders; linear extensions and dimension; chains and com-
parability graphs; lattices; Boolean algebras; and switching functions and gate
networks. The chapter is closely tied to applications ranging from informa-
tion theory to utility theory to searching and sorting, as well as returning
to the earlier applications such as switching functions. This chapter includes
some applications not widely discussed in the combinatorics literature, such
as preference, search engines, sequencing by hybridization, and psychophysi-
cal scaling. Examples based on Chapter 4 concepts have also been added to
many subsequent chapters. Coverage of Chapter 4 can be delayed until after
Chapter 11.
Chapter 5 (Generating Functions and Their Applications): In the first
edition, this was Chapter 4. Many new concepts and examples introduced in
earlier chapters are revisited here, for example, weak orders from Chapter 4
and list colorings from Chapter 3.
Chapter 6 (Recurrence Relations): This was Chapter 5 in the first edition.
New material on DNA sequence alignment has been added as has material on
the "transposition average" of permutations.
Chapter 7 (The Principle of Inclusion and Exclusion): This was Chapter 6
in the first edition. We have added major new material on cryptography and
factoring integers throughout the chapter (and revisited it later in the book).
Old Chapter 8, first edition (Pigeonhole Principle): This chapter has been
dropped, with important parts of the material added to Chapter 2 and other
parts included from time to time throughout the book.
Chapter 8 (The Polya Theory of Counting): This was Chapter 7 in the first
edition. Some examples based on newly added Chapter 4 concepts such as
weak order run through the chapter. A subsection on automorphisms of
graphs has been added and returned to throughout the chapter.
Chapter 9 (Combinatorial Designs): Major additions to this chapter include
a section on orthogonal arrays and cryptography, including authentication
codes and secret sharing. There is also a new section on connections be-
tween modular arithmetic and the RSA cryptosystem and one on resolvable
designs with applications to secret sharing. A new section on "Group Test-
ing" includes applications to identifying defective products, screening diseases,
mapping genomes, and satellite communication.
Chapter 10 (Coding Theory): There is a new subsection on "consensus decod-
ing" with connections to finding proteins in molecular sequences, and there
are added connections of error-correcting codes to compact disks. Material
on "reading" DNA to produce proteins is also new.
Chapter 11 (Existence Problems in Graph Theory): We have added new
subsections to Section 11.2 that deal with the one-way street problem. These
new subsections deal with recent results about orientations of square and an-
nular grids reflecting different kinds of cities. We have added a new subsection
on testing for connectedness of truly massive graphs, arising from modern ap-
plications involving telecommunications traffic and web data. There is also a
new subsection on sequencing DNA by hybridization.
Chapter 12 (Matching and Covering): There are many new examples illus-
trating the concepts of this chapter, including examples involving smallpox
vaccinations, sound systems, and oil drilling. We have introduced a new
section dealing with stable marriages and their many modern applications,
including the assignment of interns to hospitals, dynamic labor markets, and
strategic behavior. A section on maximum-weight matching, which was in
Chapter 13 of the first edition, has been moved to this chapter.
Chapter 13 (Optimization Problems for Graphs and Networks): We have
introduced a new subsection on Menger's Theorems. There are also many
new examples throughout the chapter, addressing such problems as building
evacuation, clustering and data mining, and distributed computing.
Appendix (Answers to Selected Exercises): Answers to Selected Exercises was
included in the 1st edition of the book but it has been greatly expanded in
this edition.
CONTINUING FEATURES
While the second edition has been substantially changed from the first, this
edition continues to emphasize the features that make this book unique:
• Its emphasis on applications from a variety of fields, the treatment of applica-
tions as major topics of their own rather than as isolated examples, and the
use of applications from the current literature.
• Many examples, especially ones that tie in new topics with old ones and are
revisited throughout the book.
• An emphasis on problem solving through a variety of exercises that test rou-
tine ideas, introduce new concepts and applications, or attempt to challenge
the reader to use the combinatorial techniques developed. The book contin-
ues to be based on the philosophy that the best way to learn combinatorial
mathematics, indeed any kind of mathematics, is through problem solving.
• A mix of difficulty in topics with careful annotation that makes it possible to
use this book in a variety of courses at a variety of levels.
• An organization that allows the use of the topics in a wide variety of orders,
reflecting the somewhat independent nature of the topics in combinatorics
while at the same time using topics from different chapters to reinforce each
other.
THE ORGANIZATION OF THE BOOK
The book is divided into four parts. The first part (Chapters 2, 3, and 4)
introduces the basic tools of combinatorics and their applications. It introduces
fundamental counting rules and the tools of graph theory and relations. The re-
maining three parts are organized around the three basic problems of combinatorics:
the counting problem, the existence problem, and the optimization problem. These
problems are discussed in Chapter 1. Part II of the book is concerned with more
advanced tools for dealing with the counting problem: generating functions, recur-
rences, inclusion/exclusion, and Polya Theory. Part III deals with the existence
problem. It discusses combinatorial design, coding theory, and special problems
in graph theory. It also begins a series of three chapters on graphs and networks
(Chapters 11-13, spanning Parts III and IV) and begins an introduction to graph
algorithms. Part IV deals with combinatorial optimization, illustrating the basic
ideas through a continued study of graphs and networks. It begins with a tran-
sitional chapter on matching and covering that starts with the existence problem
and ends with the optimization problem. Then Part IV ends with a discussion of
optimization problems for graphs and networks. The division of the book into four
parts is somewhat arbitrary, and many topics illustrate several different aspects of
combinatorics, for instance both existence and optimization questions. However,
dividing the book into four parts seemed to be a reasonable way to organize the
large amount of material that is modern combinatorics.
PREREQUISITES
This book can be used at a variety of levels. Most of the book is written for a
junior/senior audience, in a course populated by math and computer science ma-
jors and nonmajors. It could also be appropriate for sophomores with sufficient
mathematical maturity. (Topics that can be omitted in elementary treatments are
indicated throughout.) On the other hand, at a fast pace, there is more than
enough material for a challenging graduate course. In the undergraduate courses
for which the material has been used at Rutgers, the majority of the enrollees come
from mathematics and computer science, and the rest from such disciplines as busi-
ness, economics, biology, and psychology. At Dickinson, the material has been used
primarily for junior/senior-level mathematics majors. The prerequisites for these
courses, and for the book, include familiarity with the language of functions and
sets usually attained by taking at least one course in calculus. Infinite sequences
and series are used in Chapters 5 and 6 (though much of Chapter 6 uses only the
most elementary facts about infinite sequences, and does not require the notion of
limit). Other traditional topics of calculus are not needed. However, the mathemat-
ical sophistication attained by taking a course like calculus is a prerequisite. Also
required are some tools of linear algebra, specically familiarity with matrix manip-
ulations. An understanding of mathematical induction is also assumed. (There are
those instructors who will want to review mathematical induction in some detail at
an early point in their course, and who will want to quickly review the language
of sets.) A few optional sections of the book require probability beyond what is
developed in the text. Other sections introduce topics in modern algebra, such as
groups and finite fields. These sections are self-contained, but they would be too
fast-paced for a student without sufficient background.
ALGORITHMS
Many parts of the book put an emphasis on algorithms. This is inevitable, as
combinatorics is increasingly connected to the development of precise and efficient
procedures for solving complicated problems, and because the development of com-
binatorics is so closely tied to computer science. Our aim is to introduce students
to the notion of an algorithm and to introduce them to some important examples
of algorithms. For the most part, we have adopted a relatively informal style in
presenting algorithms. The style presumes little exposure to the notion of an al-
gorithm and how to describe it. The major goal is to present the basic idea of a
procedure, without attempting to present it in its most concise or most computer-
oriented form. There are those who will disagree with this method of presenting
algorithms. Our own view is that no combinatorics course is going to replace the
learning of algorithms. The computer science student needs a separate course in
algorithms that includes discussion of implementing the data structures for the al-
gorithms presented. However, all students of combinatorics need to be exposed to
the idea of algorithm, and to the algorithmic way of thinking, a way of thinking
that is so central and basic to the subject. We realize that our compromise on how
to present algorithms will not make everyone happy. However, it should be pointed
out that for students with a background in computer science, it would make for
interesting, indeed important, exercises to translate the informal algorithms of the
text into more precise computer algorithms or even computer programs.
ROLE OF EXAMPLES AND APPLICATIONS
Applications play a central role in this book and are a feature that makes the
book unique among combinatorics books. The instructor is advised to pick and
choose among the applications or to assign them for outside reading. Many of the
applications are presented as Examples that are returned to as the book progresses.
It is not necessary for either the instructor or the student to be an expert in the
area of application represented in the various examples and subsections of the book.
They tend to be self-contained and, when not, should be readily understood with
some appropriate searching of the Internet.
The connection between combinatorics and computer science is well understood
and vitally important and does not need specic emphasis in this discussion.
Of particular importance in this book are examples from the biological sciences.
Our emphasis on such examples stems from our observation that the connection
between the biological and the mathematical sciences is growing extremely fast.
Methods of mathematics and computer science have played and are playing a ma-
jor role in modern biology, for example in the "human genome project" and in the
modeling of the spread of disease. Increasingly, it is vitally important for mathe-
matical scientists to understand such modern applications and also for students of
the biological sciences to understand the importance for their discipline of mathe-
matical methods such as combinatorics. This interdisciplinarity is reflected in the
growing number of schools that have courses or programs at the interface between
the mathematical and the biological sciences.
While less advanced than the connection between the mathematical and the
biological sciences, the connection between the mathematical and the social sciences
is also growing rapidly as more and more complex problems of the social sciences
are tackled using tools of computer science and mathematical modeling. Thus, we
have introduced a variety of applications that arise from the social sciences, with
an emphasis on decisionmaking and voting.
PROOFS
Proving things is an essential aspect of mathematics that distinguishes it from
other sciences. Combinatorics can be a wonderful mechanism for introducing stu-
dents to the notion of mathematical proof and teaching them how to write good
proofs. Some schools use the combinatorics course as the introduction to proofs
course. That is not our purpose with this book. While the instructor using this
book should include proofs, we tend to treat proofs as rather informal and do not
put emphasis on writing them. Many of the harder proofs in the book are starred
as optional.
EXERCISES
The exercises play a central role in this book. They test routine ideas, introduce
new concepts and applications, and attempt to challenge the reader to use the
combinatorial techniques developed in the text. It is the nature of combinatorics,
indeed the nature of most of mathematics, that it is best mastered by doing many
problems. We have tried to include a wide variety of both applied and theoretical
exercises, of varying degrees of difficulty, throughout the book.
WAYS TO USE THE BOOK IN VARIOUS SETTINGS
This book is appropriate for a variety of courses at a variety of levels. We have
both used the material of the book for several courses, in particular a one-semester
course entitled Combinatorics and a one-semester course entitled Applied Graph
Theory. The combinatorics course, taught to juniors and seniors, covers much of
the material of Chapters 1, 2, 3, 5, 6, 7, 9, and 10, omitting the sections indicated
by footnotes in the text. (These are often proofs.) At Rutgers, a faster-paced course
that Fred Roberts has used with first-year graduate students puts more emphasis on
proofs, includes many of the optional sections, and also covers the material of either
Chapter 8 or Chapter 12. In an undergraduate or a graduate course, the instructor
could also substitute for Chapters 9 and 10 either Chapter 8 or Chapter 11 and
parts of Chapters 12 and 13. Including Chapter 11 is especially recommended at
institutions that do not have a separate course in graph theory. Similarly, including
parts of Chapter 13 is especially recommended for institutions that do not have a
course in operations research. At Rutgers, there are separate (both undergraduate
and graduate) courses that cover much of the material of Chapters 11 to 13.
Other one-semester or one-quarter courses could be designed from this material,
as most of the chapters are relatively independent. (See the discussion below.) At
Rutgers, the applied graph theory course that is taught is built around Chapters 3
and 11, supplemented with graph-theoretical topics from the rest of the book (Chap-
ters 4, 12, and 13) and elsewhere. (A quick treatment of Sections 2.1 through 2.7,
plus perhaps Section 2.18, is needed background.) Chapters 3, 11, 12, and 13 would
also be appropriate for a course introducing graph algorithms or a course called
Graphs and Networks. The entire book would make a very appropriate one-year
introduction to modern combinatorial mathematics and its applications. A course
emphasizing applications of combinatorics for those who have previously studied
combinatorics could be constructed out of the applied subsections and examples in
the text.
This book could be used for a one-semester or one-quarter sophomore-level
course. Such a course would cover much of Chapters 1, 2, and 3, skip Chapters 4
and 5, and cover only Sections 6.1 and 6.2 of Chapter 6. It would then cover Chap-
ter 7 and parts of Chapter 11. Starred sections and most proofs would be omitted.
Other topics would be added at the discretion of the instructor.
DEPENDENCIES AMONG TOPICS
AND ORDERS IN WHICH TO USE THE BOOK
In organizing any course, the instructor will wish to take note of the relative
independence of the topics here. There is no well-accepted order in which to present
an introduction to the subject matter of combinatorics, and there is no universal
agreement on the topics that make up such an introduction. We have tried to write
this book in such a way that the chapters are quite independent and can be covered
in various orders.
Chapter 2 is basic to the book. It introduces the basic counting rules that are
used throughout. Chapter 3 develops just enough graph theory to introduce the
subject. It emphasizes graph-theoretical topics that illustrate the counting rules
developed in Chapter 2. The ideas introduced in Chapter 3 are referred to in
places throughout the book, and most heavily in Chapters 4, 11, 12, and 13. It is
possible to use this book for a one-semester or one-quarter course in combinatorics
without covering Chapter 3. However, in our opinion, at least the material on graph
coloring (Sections 3.3 and 3.4) should be included. The major dependencies beyond
Chapter 3 are that Chapter 4 depends on Chapter 3; Chapter 6 after Section 6.2
depends on Chapter 5; Chapter 7 refers to examples developed in Chapters 3 and
6; Chapters 11, 12, and 13 depend on Chapter 3; and Section 10.5 depends on
Chapter 9. Ideas from Chapter 12 are used in Chapter 13, Section 13.3.8.
COMBINATORICS IS RAPIDLY CHANGING
Finally, it should be emphasized that combinatorics is a rapidly growing subject
and one whose techniques are being rapidly developed and whose applications are
being rapidly explored. Many of the topics presented here are close to the frontiers
of research. It is typical of the subject that it is possible to bring a newcomer
to the frontiers very quickly. We have tried to include references to the literature
of combinatorics and its applications that will allow the interested reader to delve
more deeply into the topics discussed here.
ACKNOWLEDGMENTS
Fred Roberts started on the first edition of this book in 1976, when he produced
a short set of notes for his undergraduate course in combinatorics at Rutgers. Over
the years that this book has changed and grown, he has used it regularly as the text
for that course and for the other courses described earlier, as has Barry Tesman. It
has also been a great benefit to the authors that others have used this material as the
text for their courses and have sent extensive comments. They would particularly
like to thank for their very helpful comments: Midge Cozzens, who used drafts of
the first edition at Northeastern; Fred Hoffman, who used them at Florida Atlantic;
Doug West, who used them at Princeton; Garth Isaak, who used drafts of the second
edition at Lehigh; and Buck McMorris, who used drafts of the second edition at
Illinois Institute of Technology.
We would especially like to thank present and former students who have helped
in numerous ways in the preparation of both editions of the book, by proofreading,
checking exercises, catching numerous mistakes, and making nasty comments. Fred
Roberts wants to acknowledge the help of Midge Cozzens, Shelly Leibowitz, Bob
Opsut, Arundhati Ray-Chaudhuri, Sam Rosenbaum, and Jeff Steif. Barry Tesman
acknowledges the help of Kathy Clawson and John Costango. Both authors would
also like to thank Aritanan Gruber, David Papp, and Paul Raff for their help with
working out answers to selected exercises.
We have received comments on this material from many people. We would specif-
ically like to thank the following individuals, who made extremely helpful comments
on the first edition at various stages during the reviewing process for that edition, as
well as at other times: John Cozzens, Paul Duvall, Marty Golumbic, Fred Hoffman,
Steve Maurer, Ronald Mullin, Robert Tarjan, Tom Trotter, and Alan Tucker. As we
were preparing the second edition, we received very helpful comments on the first
edition from Steve Maurer. Jeff Dinitz gave us detailed comments on drafts of Chap-
ters 9 and 10. For the second edition, we received extremely helpful comments from
the following reviewers: Edward Allen, Martin Billik, John Elwin, Rodney W. For-
cade, Kendra Killpatrick, Joachim Rosenthal, Sung-Yell Song, Vladimir Tonchev,
and Cun-Quan Zhang. Although we have received a great deal of help with this
material, errors will almost surely remain. We alone are responsible for them.
As the first edition of this book grew, it was typed and retyped, copied and
recopied, cut (literally with scissors), pasted together (literally with glue), uncut,
glued, and on and on. Fred Roberts had tremendous help with this from Lynn
Braun, Carol Brouillard, Mary Anne Jablonski, Kathy King, Annette Roselli, and
Dotty Westgate. It is quite remarkable how the business of publishing has changed.
For the second edition, Barry Tesman did the typing, retyping, (electronic) cutting
and pasting, etc. Without an electronic copy of the first edition, the task of scanning
the complete first edition went to Barry Tesman's former student Jennifer Becker.
He acknowledges her herculean task. He also would like to thank members of
his department, LIS, and the administration at Dickinson for their support and
contributions.
The authors would like to thank Pearson Education for permission to modify
and reproduce material from Fred Roberts' book Discrete Mathematical Models,
with Applications to Social, Biological, and Environmental Problems, 1976, Pearson
Education, Inc. All rights are reserved by Pearson Education. In particular we have
modified Tables 2.1 and 8.1, reproduced exercises from Section 3.6, and reproduced
selections from pages 8-14, 21-23, 25-29, 31-32, 81-82, 84-86, 156-157, and 165-166.
Both of us would like to thank our families for their support. Those who have
written a book will understand the number of hours it takes away from one's fam-
ily: cutting short telephone calls to proofread, canceling trips to write, postponing
outings to create exercises, stealing away to make just one more improvement. Our
families have been extremely understanding and helpful. Fred Roberts would like
to thank his late parents, Louis and Frances Roberts, for their love and support.
He would like to thank his mother-in-law, the late Lily Marcus, for her assistance,
technical and otherwise. He would like to thank his wife, Helen, who, it seems, is
always a "book widow." She has helped not only by her continued support and
guidance, and inspiration, but she has also co-authored one chapter of this book,
and introduced him to a wide variety of topics and examples which she developed
for her courses and which we have freely scattered throughout this book. Finally,
he would like to thank Sarah and David. When the first edition was being written,
their major contribution to it was to keep him in constant good humor. Remark-
ably, as his children have grown to adulthood, they have grown to contribute to
his work in other ways: for instance, Sarah by introducing him to ideas of public
health that are reected in some of his current mathematical interests and in this
book and David by explaining numerous aspects of computer science, earning him
in particular an acknowledgment in an important footnote later in the book. Fred
Roberts does not need the counting techniques of combinatorics to count his bless-
ings. Barry Tesman would like to thank his parents, Shirley and Harvey Tesman,
for their love and support. He would like to thank his wife, Johanna, who was his
silent partner in this undertaking and has been his (nonsilent) partner and best
friend for the last 20 years. Finally, he would like to thank Emma and Lucy, for
being Emma and Lucy.
Fred S. Roberts Barry Tesman
[email protected] [email protected]
Notation
Set-theoretic Notation
∪         union
∩         intersection
⊆         subset (contained in)
⊂         proper subset
⊄         is not a subset
⊇         contains (superset)
∈         member of
∉         not a member of
∅         empty set
{ ... }       the set ...
{ · : · }     the set of all ... such that ...
Aᶜ        complement of A
A − B     A ∩ Bᶜ
|A|       cardinality of A, the number of elements in A

Logical Notation
∼         not
→         implies
↔         if and only if (equivalence)
iff       if and only if

Miscellaneous
⌊x⌋       the greatest integer less than or equal to x
⌈x⌉       the least integer greater than or equal to x
f ∘ g     composition of the two functions f and g
f(A)      the image of the set A under the function f, that is, {f(a) : a ∈ A}
(a, b)    the open interval consisting of all real numbers c with a < c < b
[a, b]    the closed interval consisting of all real numbers c with a ≤ c ≤ b
≈         approximately equal to
≡         congruent to
Aᵀ        the transpose of the matrix A
∏         product
∑         sum
∫         integral
Re        the set of real numbers
Chapter 1
What Is Combinatorics?
1.1 THE THREE PROBLEMS OF COMBINATORICS
Perhaps the fastest-growing area of modern mathematics is combinatorics. Com-
binatorics is concerned with the study of arrangements, patterns, designs, assign-
ments, schedules, connections, and configurations. In the modern world, people in
almost every area of activity find it necessary to solve problems of a combinatorial
nature. A computer scientist considers patterns of digits and switches to encode
complicated statements. A shop supervisor prepares assignments of workers to tools
or to work areas. An agronomist assigns test crops to different fields. An electrical
engineer considers alternative configurations for a circuit. A banker studies alter-
native patterns for electronically transferring funds, and a space scientist studies
such patterns for transferring messages to distant satellites. An industrial engineer
considers alternative production schedules and workplace configurations to maxi-
mize efficient production. A university scheduling officer arranges class meeting
times and students' schedules. A chemist considers possible connections between
various atoms and molecules, and arrangements of atoms into molecules. A trans-
portation officer arranges bus or plane schedules. A linguist considers arrangements
of words in unknown alphabets. A geneticist considers arrangements of bases into
chains of DNA, RNA, and so on. A statistician considers alternative designs for an
experiment.
There are three basic problems of combinatorics. They are the existence prob-
lem, the counting problem, and the optimization problem. The existence problem
deals with the question: Is there at least one arrangement of a particular kind?
The counting problem asks: How many arrangements are there? The optimization
problem is concerned with choosing, among all possible arrangements, that which
is best according to some criteria. We shall illustrate these three problems with a
number of examples.
Example 1.1 Design of Experiments Let us consider an experiment designed
to test the effect on human beings of five different drugs. Let the drugs be labeled
1, 2, 3, 4, 5. We could pick out five subjects and give each subject a different drug.
Table 1.1: A Design for a Drug Experiment (a)

                       Day
                M   Tu   W   Th   F
             A  1   2    3   4    5
             B  1   2    3   4    5
Subject      C  1   2    3   4    5
             D  1   2    3   4    5
             E  1   2    3   4    5

(a) The entry in the row corresponding to a given subject and the column
corresponding to a given day shows the drug taken by that subject on that day.

Unfortunately, certain subjects might be allergic to a particular drug, or immune
to its effects. Thus, we could get very biased results. A more effective use of five
subjects would be to give each subject each of the drugs, say on five consecutive
days. Table 1.1 shows one possible arrangement of the experiment. What is wrong
with this arrangement? For one thing, the day of the week a drug is taken may
affect the result. (People with Monday morning hangovers may never respond well
to a drug on Monday.) Also, drugs taken earlier might affect the performance of
drugs taken later. Thus, giving each subject the drugs in the same order might lead
to biased results. One way around these problems is simply to require that no two
people get the same drug on the same day. Then the experimental design calls for
a 5 × 5 table, with each entry being one of the integers 1, 2, 3, 4, 5, and with each
row having all its entries different and each column having all its entries different.
This is a particular kind of pattern. The crucial question for the designer of the
drug experiment is this: Does such a design exist? This is the existence problem of
combinatorics.
Let us formulate the problem more generally. We define a Latin square(1) as an
n × n table that uses the numbers 1, 2, ..., n as entries, and does so in such a way
that no number appears more than once in the same row or column. Equivalently,
it is required that each number appear exactly once in each row and column. A
typical existence problem is the following: Is there a 2 × 2 Latin square? The answer
is yes; Table 1.2 shows such a square. Similarly, one may ask if there is a 3 × 3
Latin square. Again, the answer is yes; Table 1.3 shows one.
Our specific question asks whether or not there is a 5 × 5 Latin square. Table 1.4
shows that the answer is yes. (Is there an n × n Latin square for every n? The
answer is known and is left to the reader.)
(1) The term "Latin square" comes from the fact that the elements were usually represented by
letters of the Latin alphabet.

Table 1.2: A 2  2 Table 1.3: A 3  3 Table 1.4: A 5  5


Latin Square Latin Square Latin Square
1 2 1 2 3 1 2 3 4 5
2 1 2 3 1 2 3 4 5 1
3 1 2 3 4 5 1 2
4 5 1 2 3
5 1 2 3 4

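The defining property of a Latin square is easy to check mechanically. The following short Python sketch (ours, not the text's; the function name is an invention for illustration) rebuilds Table 1.4, whose rows are successive cyclic shifts of 1 2 3 4 5, and verifies the property:

```python
def is_latin_square(square):
    """Check the defining property: each of 1, ..., n appears
    exactly once in every row and every column of an n x n table."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok

# Table 1.4: row i is the cyclic shift of 1 2 3 4 5 starting at i + 1.
table_1_4 = [[(i + j) % 5 + 1 for j in range(5)] for i in range(5)]
print(is_latin_square(table_1_4))  # True
```

Checking rows and columns against the same symbol set is enough, since an n × n table over {1, ..., n} has no number repeated in a line exactly when each line contains every number once.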
Note that the Latin square is still not a complete solution to the problem that
order eects may take place. To avoid any possible order eects, we should ideally
have enough subjects so that each possible ordering of the 5 drugs can be tested.
How many such orderings are there? This is the counting problem, the second basic
type of problem encountered in combinatorics. It turns out that there are 5! = 120
such orderings, as will be clear from the methods of Section 2.3. Thus, we would
need 120 subjects. If only 5 subjects are available, we could try to avoid order
effects by choosing the Latin square we use at random. How many possible 5 × 5
Latin squares are there from which to choose? We address this counting problem
in Section 6.1.3.
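The count of 5! = 120 orderings can be confirmed by brute-force enumeration, a sketch using only the standard library (the counting rule itself is developed in Section 2.3):

```python
from itertools import permutations

# Every ordering in which the 5 drugs could be administered over 5 days
orderings = list(permutations([1, 2, 3, 4, 5]))
print(len(orderings))  # 120, i.e., 5 * 4 * 3 * 2 * 1 = 5!
```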
As this very brief discussion suggests, questions of experimental design have been
a major stimulus to the development of combinatorics.2 We return to experimental
design in detail in Chapter 9.
Example 1.2 Bit Strings and Binary Codes A bit or binary digit is a zero
or a one. A bit string is defined to be a sequence of bits, such as 0001, 1101, or
1010. Bit strings are the crucial carriers of information in modern computers. A bit
string can be used to encode detailed instructions, and in turn is translated into a
sequence of on-off instructions for switches in the computer. A binary code (binary
block code) for a collection of symbols assigns a different bit string to each of the
symbols. Let us consider a binary code for the 26 letters in the alphabet. A typical
such code is the Morse code which, in its more traditional form, uses dots for zeros
and dashes for ones. Some typical letters in Morse code are given as follows:
O: 111    A: 01    K: 101    C: 1010.
If we are restricted to bit strings consisting of either one or two bits, can we encode
all 26 letters of the alphabet? The answer is no, for the only possible strings are
the following:
0, 1, 00, 01, 10, 11.
There are only six such strings. Notice that to answer the question posed, we had
to count the number of possible arrangements. This was an example of a solution to
a counting problem. In this case we counted by enumerating or listing all possible
2 See Herzberg and Stanton [1984].
4 Chapter 1. What Is Combinatorics?
arrangements. Usually, this will be too tedious or time consuming for us, and we
will want to develop shortcuts for counting without enumerating. Let us ask if bit
strings of three or fewer bits would do for encoding all 26 letters of the alphabet.
The answer is again no. A simple enumeration shows that there are only 14 such
strings. (Can you list them?) However, strings of four or fewer bits will suffice.
(How many such strings are there?) The Morse code, indeed, uses only strings of
four or fewer symbols. Not every possible string is used. (Why?) In Section 2.1 we
shall encounter a very similar counting problem in studying the genetic code. DNA
chains encode the basic genetic information required to determine long strings of
amino acids called proteins. We shall try to explain how long a segment in a DNA
chain is required to be to encode for an amino acid. Codes will arise in other parts
of this book as well, not just in the context of genetics or of communication with
modern computers. For instance, in Chapter 10 we study the error-correcting codes
that are used to send and receive messages to and from distant space probes, to fire
missiles, and so on.
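The enumerations above are easy to mechanize. This sketch lists all bit strings of length up to a given bound and confirms that lengths 1-3 yield only 2 + 4 + 8 = 14 strings, too few for 26 letters, while allowing length 4 raises the total to 30:

```python
from itertools import product

def bit_strings(max_len):
    """All bit strings of length 1 up to max_len, in order of length."""
    return [''.join(bits)
            for length in range(1, max_len + 1)
            for bits in product('01', repeat=length)]

print(len(bit_strings(3)))  # 14: not enough for the 26 letters
print(len(bit_strings(4)))  # 30: enough for the 26 letters
```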
Example 1.3 The Best Design for a Gas Pipeline The flow of natural gas
through a pipe depends on the diameter of the pipe, its length, the pressures at the
endpoints, the temperature, various properties of the gas, and so on. The problem
of designing an offshore gas pipeline system involves, among other things, decisions
about what sizes (diameters) of pipe to use at various junctions or links so as to
minimize total cost of both construction and operation. A standard approach to
this problem has been to use "engineering judgment" to pick reasonable sizes of
pipe and then to hope for the best. Any chance of doing better seems, at first
glance, to be hopeless. For example, a modest network of 40 links, with 7 possible
pipe sizes for each link, would give rise to 7^40 possible networks, as we show in
Section 2.1. Now 7^40, as we shall see, is a very large number. Our problem is to
find the least expensive network out of these 7^40 possibilities. This is an example of
the third kind of combinatorial problem, an optimization problem, a problem where
we seek to find the optimum (best, maximum, minimum, etc.) design or pattern or
arrangement.
It should be pointed out that progress in solving combinatorial optimization
problems has gone hand in hand with the development of the computer. Today
it is possible to solve on a machine problems whose solution would have seemed
inconceivable only a few years ago. Thus, the development of the computer has been
a major impetus behind the very rapid development of the field of combinatorial
optimization. However, there are limitations to what a computing machine can
accomplish. We shall see this next.
Any finite problem can be solved in principle by considering all possibilities.
However, how long would this particular problem take to solve by enumerating all
possible pipeline networks? To get some idea, note that 7^40 is approximately 6 ×
10^33, that is, 6 followed by 33 zeros. This is a huge number. Indeed, even a computer
that could analyze 1 billion (10^9) different pipeline networks in 1 second (one each
nanosecond) would take approximately 1.9 × 10^17 = 190,000,000,000,000,000 years
to analyze all 7^40 possible pipeline networks!3
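The back-of-the-envelope estimate (spelled out in the footnote) can be reproduced directly; this sketch uses exact integer arithmetic for 7^40 and the rough figure of 3.15 × 10^7 seconds per year:

```python
networks = 7 ** 40                  # possible pipeline designs
per_second = 10 ** 9                # networks analyzed each second
seconds_per_year = 3.15 * 10 ** 7   # roughly 365 days of seconds
years = networks / (per_second * seconds_per_year)
print(f"{networks:.2e}")  # about 6.37e+33
print(f"{years:.2e}")     # about 2.02e+17 years
```

The text's 1.9 × 10^17 figure rounds 7^40 down to 6 × 10^33 first; either way, enumeration is hopeless.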
Much of modern combinatorics is concerned with developing procedures or algo-
rithms for solving existence, counting, or optimization problems. From a practical
point of view, it is a very important problem in computer science to analyze an
algorithm for solving a problem in terms of how long it would take to solve or how
much storage capacity would be required to solve it. Before embarking on a com-
putation (such as trying all possibilities) on a machine, we would like to know that
the computation can be carried out within a reasonable time or within the available
storage capacity of the machine. We return to these points in our discussion of
computational complexity in Sections 2.4 and 2.18.
The pipeline problem we have been discussing is a problem that, even with the
use of today's high-speed computer tools, does not seem tractable by examining all
cases. Any foreseeable improvements in computing speed would make a negligible
change in this conclusion. However, a simple procedure gives rise to a method
for finding the optimum network in only about 7 × 40 = 280 steps, rather than
7^40 steps. The procedure was implemented in the Gulf of Mexico at a savings of
millions of dollars. See Frank and Frisch [1970], Kleitman [1976], Rothfarb, et al.
[1970], or Zadeh [1973] for references. This is an example of the power of techniques
for combinatorial optimization.
Example 1.4 Scheduling Meetings of Legislative Committees Commit-
tees in a state legislature are to be scheduled for a regular meeting once each week.
In assigning meeting times, the aide to the Speaker of the legislature must be careful
not to schedule simultaneous meetings of two committees that have a member in
common. Let us suppose that in a hypothetical situation, there are only three meet-
ing times available: Tuesday, Wednesday, and Thursday mornings. The committees
whose meetings must be scheduled are Finance, Environment, Health, Transporta-
tion, Education, and Housing. Let us suppose that Table 1.5 summarizes which
committees have a common member. A convenient way to represent the informa-
tion of Table 1.5 is to draw a picture in which the committees are represented by
dots or points and two points are joined by an undirected line if and only if the
corresponding committees have a common member. The resulting diagram is called
a graph.
Figure 1.1 shows the graph obtained in this way for the data of Table 1.5.
Graphs of this kind have a large number of applications, for instance in computer
science, operations research, electrical engineering, ecology, policy and decision sci-
ence, and in the social sciences. We discuss graphs and their applications in detail
in Chapters 3 and 11 and elsewhere.
Our first question is this: Given the three available meeting times, can we find
an assignment of committees to meeting times so that no member has to be at
3 There are roughly 3.15 × 10^7 seconds per year, so 3.15 × 10^7 × 10^9 or 3.15 × 10^16 networks
could be analyzed in a year. Then the number of years it takes to check 6 × 10^33 networks is
(6 × 10^33) / (3.15 × 10^16) ≈ 1.9 × 10^17.
Table 1.5: Common Membership in Committees^a
Finance Environment Health Transportation Education Housing
Finance 0 0 0 0 0 1
Environment 0 0 1 0 1 0
Health 0 1 0 1 1 1
Transportation 0 0 1 0 0 1
Education 0 1 1 0 0 1
Housing 1 0 1 1 1 0
^a The i, j entry is 1 if committees i and j have a common member, and 0 otherwise.
(The diagonal entries are taken to be 0 by convention.)
Figure 1.1: The graph obtained from the data of Table 1.5. (The points are the six
committees Finance, Environment, Health, Transportation, Education, and Housing;
lines join committees with a common member.)
two meetings at once? This is an existence question. In terms of the graph we
have drawn, we would like to assign a meeting time to each point so that if two
points are joined by a line, they get different meeting times. Can we find such an
assignment? The answer in our case, after some analysis, is yes. One assignment
that works is this: Let the Housing and Environment committees meet on Tuesday,
the Education and Transportation committees on Wednesday, and the Finance and
Health committees on Thursday.
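The claim that this assignment works can be checked mechanically against Table 1.5: for every pair of committees with a common member, the assigned days must differ. A sketch (the conflict pairs are read directly off the 1 entries of Table 1.5):

```python
# Pairs of committees sharing a member, from Table 1.5
conflicts = [("Finance", "Housing"),
             ("Environment", "Health"), ("Environment", "Education"),
             ("Health", "Transportation"), ("Health", "Education"),
             ("Health", "Housing"), ("Transportation", "Housing"),
             ("Education", "Housing")]

# The assignment proposed in the text
day = {"Housing": "Tue", "Environment": "Tue",
       "Education": "Wed", "Transportation": "Wed",
       "Finance": "Thu", "Health": "Thu"}

clash = [(a, b) for a, b in conflicts if day[a] == day[b]]
print(clash)  # []  -- no member must attend two meetings at once
```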
Problems analogous to the one we have been discussing arise in scheduling final
exams or class meeting times in a university, in scheduling job assignments in a
factory, and in many other scheduling situations. We shall return to such problems
in Chapter 3 when we look at these questions as questions of graph coloring and
think of the meeting times, for example, as corresponding to \colors."
The problem gets more realistic if each committee chair indicates a list of accept-
able meeting times. We then ask if there is an assignment of committees to meeting
times so that each committee is assigned an acceptable time and no member has
to be at two meetings at once. For instance, suppose that the acceptable meeting
times for Transportation are Tuesday and Thursday, for Education is Wednesday,
and all other committees would accept any of the three days. It is not hard to show
that there is no solution (see Exercise 13). We will then have solved the existence
problem in the negative. This is an example of a scheduling problem known as a
Table 1.6: First Choice of Meeting Times

Committee      Finance   Environment   Health     Transportation   Education   Housing
Chair's
first choice   Tuesday   Thursday      Thursday   Tuesday          Tuesday     Wednesday
list-coloring problem, a graph coloring problem where assigned colors (in this case
representing \days of the week") are chosen from a list of acceptable ones. We
return to this problem in Example 3.22. List colorings have been widely studied
in recent years. See Alon [1993] and Kratochvíl, Tuza, and Voigt [1999] for recent
surveys.
We might ask next: Suppose that each committee chair indicates his or her first
choice for a meeting time. What is the assignment of meeting times that satisfies our
original requirements (if there is such an assignment) and gives the largest number
of committee chairs their first choice? This is an optimization question. Let us
again take a hypothetical situation and analyze how we might answer this question.
Suppose that Table 1.6 gives the first choice of each committee chair. One approach
to the optimization question is simply to try to identify all possible satisfactory
assignments of meeting times and for each to count how many committee chairs get
their first choice. Before implementing any approach to a combinatorial problem,
as we have observed before, we would like to get a feeling for how long the approach
will take. How many possibilities will have to be analyzed? This is a counting
problem. We shall solve this counting problem by enumeration. It is easy to see
from the graph of Figure 1.1 that Housing, Education, and Health must get different
times. (Each one has a line joining it to the other two.) Similarly, Transportation
must get a different time from Housing and Health. (Why?) Hence, since only
three meeting times are available, Transportation must meet at the same time as
Education. Similarly, Environment must meet at the same time as Housing. Finally,
Finance cannot meet at the same time as Housing, and therefore as Environment,
but could meet simultaneously with any of the other committees. Thus, there are
only two possible meeting patterns. They are as follows.
Pattern 1. Transportation and Education meet at one time, Environ-
ment and Housing at a second time, and Finance and Health meet
at the third time.
Pattern 2. Transportation, Education, and Finance meet at one time,
Environment and Housing meet at a second time, and Health meets
at the third time.
It follows that Table 1.7 gives all possible assignments of meeting times. In
all, there are 12 possible. Our counting problem has been solved by enumerating
all possibilities. (In Section 3.4.1 we do this counting another way.) It should
be clear from Example 1.3 that enumeration could not always suffice for solving
combinatorial problems. Indeed, if there are more committees and more possible
meeting times, the problem we have been discussing gets completely out of hand.
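For this small instance, the count of 12 can be double-checked by exhaustive search over all 3^6 = 729 ways to give each committee one of the three days, keeping only those in which conflicting committees meet on different days. A sketch, again reading the conflict pairs off Table 1.5:

```python
from itertools import product

committees = ["Finance", "Environment", "Health",
              "Transportation", "Education", "Housing"]
conflicts = [("Finance", "Housing"),
             ("Environment", "Health"), ("Environment", "Education"),
             ("Health", "Transportation"), ("Health", "Education"),
             ("Health", "Housing"), ("Transportation", "Housing"),
             ("Education", "Housing")]
days = ["Tue", "Wed", "Thu"]

valid = []
for combo in product(days, repeat=len(committees)):
    day = dict(zip(committees, combo))
    if all(day[a] != day[b] for a, b in conflicts):
        valid.append(day)

print(len(valid))  # 12, matching the enumeration in the text
```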
Having succeeded in enumerating in our example, we can easily solve the opti-
mization problem. Table 1.7 shows the number of committee chairs getting their
first choice under each assignment. Clearly, assignment number 7 is the best from
this point of view. Here, only the chair of the Environment committee does not get
his or her first choice. For further reference on assignment of meeting times for state
legislative committees, see Bodin and Friedman [1971]. For work on other schedul-
ing problems where the schedule is repeated periodically (e.g., every week), see, for
instance, Ahuja, Magnanti, and Orlin [1993], Baker [1976], Bartholdi, Orlin, and
Ratliff [1980], Chretienne [2000], Crama, et al. [2000], Karp and Orlin [1981], Orlin
[1982], or Tucker [1975]. For surveys of various workforce scheduling algorithms,
see Brucker [1998], Kovalev, et al. [1989], or Tien and Kamiyama [1982].
This book is organized around the three basic problems of combinatorics that
we have been discussing. It has four parts. After an introductory part consisting
of Chapters 2 to 4, the remaining three parts deal with these three problems: the
counting problem (Chapters 5 to 8), the existence problem (Chapters 9 to 11), and
the optimization problem (Chapters 12 and 13).
1.2 THE HISTORY AND APPLICATIONS OF COMBINATORICS4
The four examples described in Section 1.1 illustrate some of the problems with
which combinatorics is concerned. They were chosen from a variety of fields to
illustrate the variety of applications of combinatorics in modern times.
Although combinatorics has achieved its greatest impetus in modern times, it is
an old branch of mathematics. According to legend, the Chinese Emperor Yu (in
approximately 2200 B.C.) observed a magic square on the back of a divine tortoise.
(A magic square is a square array of numbers in which the sum of all rows, all
columns, and both diagonals is the same. An example of such a square is shown in
Table 1.8. The reader might wish to find a different 3 × 3 magic square.)
Permutations or arrangements in order were known in China before 1100 B.C.
The binomial expansion [the expansion of (a + b)^n] was known to Euclid about 300
B.C. for the case n = 2. Applications of the formula for the number of permutations
of an n-element set can be found in an anonymous Hebrew work, Sefer Yetzirah,
written between A.D. 200 and 500. The formula itself was known at least 2500 years
ago. In A.D. 1100, Rabbi Ibn Ezra knew the formula for the number of combinations
of n things taken r at a time, the binomial coefficient. Shortly thereafter, Chinese,
Hindu, and Arab works began mentioning binomial coefficients in a primitive way.
In more modern times, the seventeenth-century scholars Pascal and Fermat pur-
sued studies of combinatorial problems in connection with gambling; among other
things, they figured out odds. (Pascal's famous triangle was in fact known to Chu
4 For a more detailed discussion of the history of combinatorics, see Biggs, Lloyd, and Wilson
[1995] or David [1962]. For the history of graph theory, see Biggs, Lloyd, and Wilson [1976].
Table 1.7: Possible Assignments of Meeting Times

Assignment                                                                                            Number of chairs getting
number    Tuesday                            Wednesday                          Thursday                          their first choice
 1        Transportation-Education           Environment-Housing                Finance-Health                    4
 2        Transportation-Education           Finance-Health                     Environment-Housing               3
 3        Environment-Housing                Transportation-Education           Finance-Health                    1
 4        Environment-Housing                Finance-Health                     Transportation-Education          0
 5        Finance-Health                     Transportation-Education           Environment-Housing               2
 6        Finance-Health                     Environment-Housing                Transportation-Education          2
 7        Transportation-Education-Finance   Environment-Housing                Health                            5
 8        Transportation-Education-Finance   Health                             Environment-Housing               4
 9        Environment-Housing                Transportation-Education-Finance   Health                            1
10        Environment-Housing                Health                             Transportation-Education-Finance  0
11        Health                             Transportation-Education-Finance   Environment-Housing               1
12        Health                             Environment-Housing                Transportation-Education-Finance  1
Table 1.8: A Magic Square
4 9 2
3 5 7
8 1 6
Shih-Chieh in China in 1303.) The work of Pascal and Fermat laid the ground-
work for probability theory; in the eighteenth century, Laplace defined probability
in terms of number of favorable cases. Also in the eighteenth century, Euler in-
vented graph theory in connection with the famous Königsberg bridge problem
and Bernoulli published the first book presenting combinatorial methods, Ars Con-
jectandi. In the eighteenth and nineteenth centuries, combinatorial techniques were
applied to study puzzles and games, by Hamilton and others. In the nineteenth
century, Kirchhoff developed a graph-theoretical approach to electrical networks
and Cayley developed techniques of enumeration to study organic chemistry. In
modern times, the techniques of combinatorics have come to have far-reaching, sig-
nificant applications in computer science, transportation, information processing,
industrial planning, electrical engineering, experimental design, sampling, coding,
genetics, political science, and a variety of other important fields. In this book we
always keep the applications close at hand, remembering that they are not only a
significant benefit derived from the development of the mathematical techniques,
but they are also a stimulus to the continuing development of these techniques.
EXERCISES FOR CHAPTER 1
1. Find a 4 × 4 Latin square.
2. Find all possible 3 × 3 Latin squares.
3. Describe how to create an n × n Latin square.
4. (Liu [1972]) Suppose that we have two types of drugs to test simultaneously, such
as headache remedies and fever remedies. In this situation, we might try to design
an experiment in which each type of drug is tested using a Latin square design.
However, we also want to make sure that, if at all possible, all combinations of
headache and fever remedies are tested. For example, Table 1.9 shows two Latin
square designs if we have 3 headache remedies and 3 fever remedies. Also shown in
Table 1.9 is a third square, which lists as its i, j entry the i, j entries from both of
the first two squares. We demand that each entry of this third square be different.
This is not true in Table 1.9.
(a) Find an example with 3 headache and 3 fever drugs where the combined square
has the desired property.
(b) Find another example with 4 headache and 4 fever drugs. (In Chapter 9
we observe that with 6 headache and 6 fever drugs, this is impossible. The
existence problem has a negative solution.) Note: If you start with one Latin
square design for the headache drugs and cannot find one for the fever drugs
so that the combined square has the desired property, you should start with a
different design for the headache drugs.
5. Show by enumeration that there are 14 bit strings of length at most 3.
6. Use enumeration to nd the number of bit strings of length at most 4.
7. Suppose that we want to build a trinary code for the 26 letters of the alphabet,
using strings in which each symbol is 0, 1, or −1.
Table 1.9: A Latin Square Design for Testing Headache Drugs 1, 2, and 3, a
Latin Square Design for Testing Fever Drugs a, b, and c, and a Combination
of the Two.^a

              Headache Drugs    Fever Drugs    Combination
              Day               Day            Day
              1  2  3           1  2  3        1   2   3
Subject:  1   1  2  3           a  b  c        1a  2b  3c
          2   2  3  1           b  c  a        2b  3c  1a
          3   3  1  2           c  a  b        3c  1a  2b

^a The third square has as its i, j entry the headache drug and the fever
drug shown in the i, j entries of the first two squares, respectively.
Table 1.10: Overlap Data^a
English Calculus History Physics
English 0 1 0 0
Calculus 1 0 1 1
History 0 1 0 1
Physics 0 1 1 0
^a The i, j entry is 1 if the ith and jth courses have a
common member, and 0 otherwise.
(a) Could we encode all 26 letters using strings of length at most 2? Answer this
question by enumeration.
(b) What about using strings of length exactly 3?
8. The genetic code embodied in the DNA molecule, a code we describe in Section 2.1,
consists of strings of symbols, each of which is one of the four letters T, C, A, or
G. Find by enumeration the number of different codewords or strings using these
letters and having length 3 or less.
9. Suppose that in designing a gas pipeline network, we have 2 possible pipe sizes,
small (S) and large (L). If there are 4 possible links, enumerate all possible pipeline
networks. (A typical one could be abbreviated LSLL, where the ith letter indicates
the size of the ith pipe.)
10. In Example 1.3, suppose that a computer could analyze as many as 100 billion
different pipeline networks in a second, a 100-fold improvement over the speed we
assumed in the text. Would this make a significant difference in our conclusions?
Why? (Do a computation in giving your answer.)
11. Tables 1.10 and 1.11 give data of overlap in class rosters for several courses in a
university.
(a) Translate Table 1.10 into a graph as in Example 1.4.
Table 1.11: More Overlap Data^a
English Calculus History Physics Economics
English 0 1 0 0 0
Calculus 1 0 1 1 1
History 0 1 0 1 1
Physics 0 1 1 0 1
Economics 0 1 1 1 0
^a The i, j entry is 1 if the ith and jth courses have a common member,
and 0 otherwise.
Table 1.12: Acceptable Exam Times

Course        English    Calculus   History    Physics
Acceptable    Thur. AM   Wed. AM    Tues. AM   Tues. AM
exam times               Thur. AM   Wed. AM    Wed. AM
(b) Repeat part (a) for Table 1.11.
12. (a) Suppose that there are only two possible final examination times for the courses
considered in Table 1.10. Is there an assignment of final exam times so that
any two classes having a common member get a different exam time? If so,
find such an assignment. If not, why not?
(b) Repeat part (a) for Table 1.10 if there are three possible final exam times.
(c) Repeat part (a) for Table 1.11 if there are three possible final exam times.
(d) Repeat part (a) for Table 1.11 if there are four possible final exam times.
13. Suppose that in the situation of Table 1.5, the acceptable meeting times for Trans-
portation are Tuesday and Thursday, for Education is Wednesday, and for all others
are Tuesday, Wednesday, and Thursday. Show that no assignment of meeting times
is possible.
14. (a) Suppose that in the situation of Table 1.10, the acceptable exam time schedules
for each course are given in Table 1.12. Answer Exercise 12(b) if, in addition,
each exam must be scheduled at an acceptable time.
(b) Suppose that in the situation of Table 1.11, the acceptable exam time schedules
for each course are given in Table 1.13. Answer Exercise 12(d) if, in addition,
each exam must be scheduled at an acceptable time.
15. Suppose that there are three possible final exam times, Tuesday, Wednesday, and
Thursday mornings. Suppose that each instructor of the courses listed in Table 1.10
requests Tuesday morning as a first choice for final exam time. What assignment
(assignments) of exam times, if any exist, gives the largest number of instructors
their first choice?
Table 1.13: More Acceptable Exam Times

Course        English   Calculus   History    Physics    Economics
Acceptable    Wed. AM   Tues. AM   Tues. AM   Tues. AM   Mon. AM
exam times              Wed. AM    Wed. AM    Thur. AM   Wed. AM
REFERENCES FOR CHAPTER 1

Ahuja, R. K., Magnanti, T. L., and Orlin, J. B., Network Flows: Theory, Algorithms,
and Applications, Prentice Hall, Englewood Cliffs, NJ, 1993.
Alon, N., "Restricted Colorings of Graphs," in K. Walker (ed.), Surveys in Combi-
natorics, Proceedings 14th British Combinatorial Conference, London Math. Soc.
Lecture Note Series, Vol. 187, Cambridge University Press, Cambridge, 1993, 1-33.
Baker, K. R., "Workforce Allocation in Cyclical Scheduling Problems," Oper. Res.
Quart., 27 (1976), 155-167.
Bartholdi, J. J., III, Orlin, J. B., and Ratliff, H. D., "Cyclic Scheduling via Integer
Programs with Circular Ones," Oper. Res., 28 (1980), 1074-1085.
Biggs, N. L., Lloyd, E. K., and Wilson, R. J., Graph Theory 1736-1936, Oxford
University Press, London, 1976.
Biggs, N. L., Lloyd, E. K., and Wilson, R. J., "The History of Combinatorics," in
R. L. Graham, M. Grötschel, and L. Lovász (eds.), Handbook of Combinatorics,
Elsevier, Amsterdam, 1995, 2163-2198.
Bodin, L. D., and Friedman, A. J., "Scheduling of Committees for the New York
State Assembly," Tech. Report USE No. 71-9, Urban Science and Engineering,
State University of New York, Stony Brook, NY, 1971.
Brucker, P., Scheduling Algorithms, Springer-Verlag, Berlin, 1998.
Chretienne, P., "On Graham's Bound for Cyclic Scheduling," Parallel Comput., 26
(2000), 1163-1174.
Crama, Y., Kats, V., van de Klundert, J., and Levner, E., "Cyclic Scheduling in
Robotic Flowshops," Ann. Oper. Res., 96 (2000), 97-124.
David, F. N., Games, Gods, and Gambling, Hafner Press, New York, 1962. (Reprinted
by Dover, New York, 1998.)
Frank, H., and Frisch, I. T., "Network Analysis," Sci. Amer., 223 (1970), 94-103.
Herzberg, A. M., and Stanton, R. G., "The Relation Between Combinatorics and
the Statistical Design of Experiments," J. Combin. Inform. System Sci., 9 (1984),
217-232.
Karp, R. M., and Orlin, J. B., "Parametric Shortest Path Algorithms with an Appli-
cation to Cyclic Staffing," Discrete Appl. Math., 3 (1981), 37-45.
Kleitman, D. J., "Comments on the First Two Days' Sessions and a Brief Description
of a Gas Pipeline Network Construction Problem," in F. S. Roberts (ed.), Energy:
Mathematics and Models, SIAM, Philadelphia, 1976, 239-252.
Kovalev, M. Ya., Shafranskij, Ya. M., Strusevich, V. A., Tanaev, V. S., and
Tuzikov, A. V., "Approximation Scheduling Algorithms: A Survey," Optimization,
20 (1989), 859-878.
Kratochvíl, J., Tuza, Z., and Voigt, M., "New Trends in the Theory of Graph
Colorings: Choosability and List Coloring," in R. L. Graham, J. Kratochvíl, J.
Nešetřil, and F. S. Roberts (eds.), Contemporary Trends in Discrete Mathematics,
DIMACS Series, Vol. 49, American Mathematical Society, Providence, RI, 1999,
183-197.
Liu, C. L., Topics in Combinatorial Mathematics, Mathematical Association of America,
Washington, DC, 1972.
Orlin, J. B., "Minimizing the Number of Vehicles to Meet a Fixed Periodic Schedule:
An Application of Periodic Posets," Oper. Res., 30 (1982), 760-776.
Rothfarb, B., Frank, H., Rosenbaum, D. M., Steiglitz, K., and Kleitman, D. J.,
"Optimal Design of Offshore Natural-Gas Pipeline Systems," Oper. Res., 18 (1970),
992-1020.
Tien, J. M., and Kamiyama, A., "On Manpower Scheduling Algorithms," SIAM Rev.,
24 (1982), 275-287.
Tucker, A. C., "Coloring a Family of Circular Arcs," SIAM J. Appl. Math., 29 (1975),
493-502.
Zadeh, N., "Construction of Efficient Tree Networks: The Pipeline Problem," Networks,
3 (1973), 1-32.
PART I. The Basic Tools of Combinatorics
Chapter 2
Basic Counting Rules 1
2.1 THE PRODUCT RULE
Some basic counting rules underlie all of combinatorics. We summarize them in this
chapter. The reader who is already familiar with these rules may wish to review
them rather quickly. This chapter also introduces a widely used tool for proving
that a certain kind of arrangement or pattern exists. In reading this chapter the
reader already familiar with counting may wish to concentrate on the variety of
applications that may not be as familiar, many of which are returned to in later
chapters.
Example 2.1 Bit Strings and Binary Codes (Example 1.2 Revisited) Let
us return to our binary code example (Example 1.2), and ask again how many
letters of the alphabet can be encoded if there are exactly 2 bits. Let us get the
answer by drawing a tree diagram. We do that in Figure 2.1. There are 4 possible
strings of 2 bits, as we noted before. The reader will observe that there are 2 choices
for the rst bit, and for each of these choices, there are 2 choices for the second bit,
and 4 is 2 × 2.
Example 2.2 DNA The total of all the genetic information of an organism is its
genome. It is convenient to think of the genome as one long deoxyribonucleic acid
(DNA) molecule. (The genome is actually made up of pieces of DNA representing
the individual chromosomes.) The DNA (or chromosomes) is composed of a string of
building blocks known as nucleotides. The genome size can be expressed in terms
of the total number of nucleotides. Since DNA is actually double-stranded with
the two strands held together by virtue of pairings between specic bases (a base
being one of the three subcomponents of a nucleotide), genome sizes are usually
1 This chapter was written by Helen Marcus-Roberts, Fred S. Roberts, and Barry A. Tesman.
Figure 2.1: A tree diagram for counting the number of bit strings of length 2.
(Two choices, 0 or 1, for the first bit; for each, two choices for the second bit,
giving the four strings 00, 01, 10, 11.)
expressed in terms of base pairs (bp). Each base in a nucleotide is one of four
possible chemicals: thymine (T), cytosine (C), adenine (A), or guanine (G). The
sequence of bases encodes certain genetic information. In particular, it determines
long chains of amino acids which are known as proteins. There are 20 basic amino
acids. A sequence of bases in a DNA molecule will encode one such amino acid.
How long does a string of a DNA molecule have to be for there to be enough
possible bases to encode 20 different amino acids? For example, can a 2-element
DNA sequence encode for the 20 different basic amino acids? To answer this, we
need to ask: How many 2-element DNA sequences are there? The answer to this
question is again given by a tree diagram, as shown in Figure 2.2. We see that
there are 16 possible 2-element DNA sequences. There are 4 choices for the first
element, and for each of these choices, there are 4 choices for the second element;
the reader will notice that 16 is 4 × 4. Notice that there are not enough 2-element
sequences to encode for all 20 different basic amino acids. In fact, a sequence of
3 elements does the encoding in practice. A simple counting procedure has shown
why at least 3 elements are needed.
The two examples given above illustrate the following basic rule.
PRODUCT RULE: If something can happen in n1 ways, and no matter how
the first thing happens, a second thing can happen in n2 ways, then the two things
together can happen in n1 × n2 ways. More generally, if something can happen in
n1 ways, and no matter how the first thing happens, a second thing can happen in
n2 ways, and no matter how the first two things happen, a third thing can happen
in n3 ways, and ..., then all the things together can happen in
n1 × n2 × n3 × · · ·
ways.
Returning to bit strings, we see immediately by the product rule that the number
of strings of exactly 3 bits is given by 2 × 2 × 2 = 2^3 = 8 since there are two choices
for the first bit (0 or 1), and no matter how it is chosen, there are two choices for
First   Second   Chain
T       T        TT
        C        TC
        A        TA
        G        TG
C       T        CT
        C        CC
        A        CA
        G        CG
A       T        AT
        C        AC
        A        AA
        G        AG
G       T        GT
        C        GC
        A        GA
        G        GG

Figure 2.2: A tree diagram for counting the number of 2-element DNA
sequences.

the second bit, and no matter how the first 2 bits are chosen, there are two choices
for the third bit. Similarly, in the pipeline problem of Example 1.3, if there are 7
choices of pipe size for each of 3 links, there are
7 × 7 × 7 = 7^3 = 343
different possible networks. If there are 40 links, there are
7 × 7 × · · · × 7 = 7^40
different possible networks. Note that by our observations in Chapter 1, it is
infeasible to count the number of possible pipeline networks by enumerating them
(listing all of them). Some method of counting other than enumeration is needed.
The product rule gives such a method. In the early part of this book, we shall be
concerned with such simple methods of counting.
Next, suppose that A is a set of a objects and B is a set of b objects. Then the
number of ways to pick one object from A and then one object from B is a × b.
This statement is a more precise version of the product rule.
To give one final example, the number of 3-element DNA sequences is
4 × 4 × 4 = 4^3 = 64.
That is why there are enough different 3-element sequences to encode for all 20
different basic amino acids; indeed, several different chains encode for the same

Table 2.1: The Number of Possible DNA Sequences for Various Organisms

Phylum       Genus and Species   Genome size (bp)   Number of possible sequences
Algae        P. salina           6.6 × 10^5         4^(6.6 × 10^5) > 10^(3.97 × 10^5)
Mycoplasma   M. pneumoniae       1.0 × 10^6         4^(1.0 × 10^6) > 10^(6.02 × 10^5)
Bacterium    E. coli             4.2 × 10^6         4^(4.2 × 10^6) > 10^(2.52 × 10^6)
Yeast        S. cerevisiae       1.3 × 10^7         4^(1.3 × 10^7) > 10^(7.82 × 10^6)
Slime mold   D. discoideum       5.4 × 10^7         4^(5.4 × 10^7) > 10^(3.25 × 10^7)
Nematode     C. elegans          8.0 × 10^7         4^(8.0 × 10^7) > 10^(4.81 × 10^7)
Insect       D. melanogaster     1.4 × 10^8         4^(1.4 × 10^8) > 10^(8.42 × 10^7)
Bird         G. domesticus       1.2 × 10^9         4^(1.2 × 10^9) > 10^(7.22 × 10^8)
Amphibian    X. laevis           3.1 × 10^9         4^(3.1 × 10^9) > 10^(1.86 × 10^9)
Mammal       H. sapiens          3.3 × 10^9         4^(3.3 × 10^9) > 10^(1.98 × 10^9)

Source: Lewin [2000].

amino acid. This is different from the situation in Morse code, where strings of up
to 4 bits are required to encode for all 26 letters of the alphabet, but not every
possible string is used. In Section 2.9 we consider Gamow's [1954a,b] suggestion
about which 3-element sequences encode for the same amino acid.
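The counts in this example are small enough to check by direct enumeration. The following sketch (not from the book) uses Python's `itertools.product`, which generates exactly the Cartesian products that the product rule counts:

```python
# Enumerate all 2-element and 3-element DNA sequences over the four
# bases and count them, confirming the product-rule values 4^2 = 16
# and 4^3 = 64.
from itertools import product

bases = "TCAG"
two_element = ["".join(s) for s in product(bases, repeat=2)]
three_element = ["".join(s) for s in product(bases, repeat=3)]

print(len(two_element))    # 16: too few to encode 20 amino acids
print(len(three_element))  # 64: enough for 20 amino acids
```

The 16 strings produced for length 2 are precisely the chains listed in Figure 2.2.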
Continuing with DNA molecules, we see that the number of sequences of 4 bases
is 44, the number with 100 bases is 4100. How long is a full-edged DNA molecule?
Some answers are given in Table 2.1. Notice that in slime mold (D. discoideum ),
the genome has 5:4 107 bases or base pairs. Thus, the number of such sequences
is
45:4 107 
which is greater than
103:2 107 :
This number is 1 followed by 3:2 107 zeros or 32 million zeros! It is a number that
is too large to comprehend. Similar results hold for other organisms. By a simple
counting of all possibilities, we can understand the tremendous possible variation
in genetic makeup. It is not at all surprising, given the number of possible DNA
sequences, that there is such an amazing variety in nature, and that two individ-
uals are never the same. It should be noted once more that given the tremendous
magnitude of the number of possibilities, it would not have been possible to count
these possibilities by the simple expedient of enumerating them. It was necessary
to develop rules or procedures for counting, which counted the number of possi-
bilities without simply listing them. That is one of the three basic problems in
combinatorics: developing procedures for counting without enumerating.
As large as the number of DNA sequences is, it has become feasible, in part
due to the use of methods of combinatorial mathematics, to "sequence" and "map"

 1        2 ABC    3 DEF
 4 GHI    5 JKL    6 MNO
 7 PRS    8 TUV    9 WXY
 *        0        #

Figure 2.3: A telephone pad.
genomes of different organisms, including humans. A gene is a strip of DNA that
carries the code for making a particular protein. "Mapping" the genome would
require localizing each of its genes; "sequencing" it would require determining the
exact order of the bases making up each gene. In humans, this involves approximately
100,000 genes, each with a thousand or more bases. For more on the use
of combinatorial mathematics in mapping and sequencing the genome, see Clote
and Backofen [2000], Congress of the United States [1988], Farach-Colton, et al.
[1999], Gusfield [1997], Lander and Waterman [1995], Pevzner [2000], Setubal and
Meidanis [1997], or Waterman [1995].
Example 2.3 Telephone Numbers At one time, a local telephone number was
given by a sequence of two letters followed by five numbers. How many different
telephone numbers were there? Using the product rule, one is led to the answer:
26 × 26 × 10 × 10 × 10 × 10 × 10 = 26^2 × 10^5.
While the count is correct, it doesn't give a good answer, for two letters in the
same place on the pad led to the same telephone number. The reader might wish
to envision a telephone pad. (A rendering of one is given in Figure 2.3.) There are
three letters on all digits, except that 1 and 0 have no letters. Hence, letters A, B,
and C were equivalent; so were W, X, and Y; and so on. There were, in effect, only
8 different letters. The number of different telephone numbers was therefore
8^2 × 10^5 = 6.4 × 10^6.
Thus, there were a little over 6 million such numbers. In the 1950s and 1960s,
most local numbers were changed to become simply seven-digit numbers, with the
restriction that neither of the first two digits could be 0 or 1. The number of
telephone numbers was still 8^2 × 10^5. Direct distance dialing was accomplished by
adding a three-digit area code. The area code could not begin with a 0 or 1, and
it had to have 0 or 1 in the middle. Using these restrictions, we compute that the
number of possible telephone numbers was
8 × 2 × 10 × 8^2 × 10^5 = 1.024 × 10^9.
Table 2.2: Two Switching Functions

Bit string x   S(x)   T(x)
00             1      0
01             0      0
10             0      1
11             1      1

That was enough to service over 1 billion customers. To service even more customers,
direct distance dialing was changed to include a starting 1 as an 11th digit
for long-distance calls. This freed up the restriction that an area code must have a
0 or 1 in the middle. The number of telephone numbers grew to
1 × 8 × 10 × 10 × 8^2 × 10^5 = 5.12 × 10^9.
With increasingly better technology, the telecommunications industry could boast
that with the leading 1, there are no restrictions on the next 10 digits. Thus, there
are now 10^10 possible telephone numbers. However, demand continues to increase
at a very fast pace (e.g., multiple lines, fax machines, cellular phones, pagers, etc.).
What will we do when 10^10 numbers are not enough?
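The successive counts in this example follow mechanically from the product rule, and a short computation (a sketch, not part of the book) reproduces them:

```python
# Telephone-number counts from Example 2.3, stage by stage.
local = 8**2 * 10**5           # 2 effective letters (8 choices each) + 5 digits
print(local)                   # 6,400,000 local numbers

area_codes_old = 8 * 2 * 10    # first digit 2-9, middle digit 0 or 1, last any
print(area_codes_old * local)  # 1,024,000,000 numbers with area code

area_codes_new = 8 * 10 * 10   # middle-digit restriction dropped
print(area_codes_new * local)  # 5,120,000,000 numbers

print(10**10)                  # today: no restrictions on the 10 digits
```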

Example 2.4 Switching Functions Let Bn be the set of all bit strings of length
n. A switching function (Boolean function) of n variables is a function that assigns
to each bit string of length n a number 0 or 1. For instance, let n = 2. Then
B2 = {00, 01, 10, 11}. Two switching functions S and T defined on B2 are given in
Table 2.2. The problem of making a detailed design of a digital computer usually
involves finding a practical circuit implementation of certain functional behavior.
A computer device implements a switching function of two, three, or four variables.
Now every switching function can be realized in numerous ways by an electrical
network of interconnections. Rather than trying to figure out from scratch an
efficient design for a given switching function, a computer engineer would like to have
a catalog that lists, for every switching function, an efficient network realization.
Unfortunately, this seems at first to be an impractical goal. For how many switching
functions of n variables are there? There are 2^n elements in the set Bn by a
generalization of Example 2.1. Hence, by the product rule, there are 2 × 2 × · · · × 2
different n-variable switching functions, where there are 2^n terms in the product.
In total, there are 2^(2^n) different n-variable switching functions. Even the number
of such functions for n = 4 is 65,536, and the number grows astronomically fast.
Fortunately, by taking advantage of symmetries, we can consider certain switching
functions equivalent as far as what they compute is concerned. Then we need not
identify the best design for every switching function; we need do it only for enough
switching functions so that every other switching function is equivalent to one of
those for which we have identified the best design. While the first computers were
being built, a team of researchers at Harvard painstakingly enumerated all possible
switching functions of 4 variables, and determined which were equivalent. They
discovered that it was possible to reduce every switching function to one of 222

types (Harvard Computation Laboratory Staff [1951]). In Chapter 8 we show how
to derive results such as this from a powerful theorem due to George Polya. For
a more detailed discussion of switching functions, see Deo [1974, Ch. 12], Harrison
[1965], Hill and Peterson [1968], Kohavi [1970], Liu [1977], Muroga [1990], Pattavina
[1998], Prather [1976], or Stone [1973].
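The double-exponential count 2^(2^n) is easy to evaluate, and for small n the functions themselves can even be listed as truth-table columns. A quick sketch (ours, not the book's):

```python
# Count n-variable switching functions: each of the 2**n bit strings
# independently gets value 0 or 1, so there are 2**(2**n) functions.
from itertools import product

def num_switching_functions(n):
    return 2 ** (2 ** n)

print(num_switching_functions(2))  # 16
print(num_switching_functions(4))  # 65536

# For n = 2, actually list all 16 functions as output columns over
# the inputs 00, 01, 10, 11 (each column is one truth table).
tables = list(product((0, 1), repeat=4))
print(len(tables))  # 16
```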
Example 2.5 Food Allergies An allergist sees a patient who often develops
a severe upset stomach after eating. Certain foods are suspected of causing the
problem: tomatoes, chocolate, corn, and peanuts. It is not clear if the problem
arises because of one of these foods or a combination of them acting together.
The allergist tells the patient to try different combinations of these foods to see
whether there is a reaction. How many different combinations must be tried? Each
food can be absent or present. Thus, there are 2 × 2 × 2 × 2 = 2^4 = 16 possible
combinations. In principle, there are 2^(2^4) possible manifestations of food allergies
based on these four foods; each possible combination of foods can either bring forth
an allergic reaction or not. Each person's individual sensitivity to combinations
of these foods corresponds to a switching function S(x1, x2, x3, x4), where x1 is 1 if
there are tomatoes in the diet, x2 is 1 if there is chocolate in the diet, x3 is 1 if there
is corn in the diet, and x4 is 1 if there are peanuts in the diet. For instance, a person
who develops an allergic reaction any time tomatoes are in the diet or any time both
corn and peanuts are in the diet would demonstrate the switching function S which
has S(1, 0, 0, 0) = 1, S(1, 1, 0, 0) = 1, S(0, 0, 1, 1) = 1, S(0, 1, 1, 0) = 0, and so on.
In practice, it is impossible to know the value of a switching function on all possible
bit strings if the number of variables is large; there are just too many possible bit
strings. Then the practical problem is to develop methods to guess the value of a
switching function that is only partially defined. There is much recent work on this
problem. See, for example, Boros, et al. [1995], Boros, Ibaraki, and Makino [1998],
Crama, Hammer, and Ibaraki [1988], and Ekin, Hammer, and Kogan [2000]. Similar
cause-and-effect problems occur in diagnosing failure of a complicated electronic
system given a record of failures when certain components fail (we shall have more
to say about this in Example 2.21), and in teaching a robot to maneuver in an area
filled with obstacles, where an obstacle might appear as a certain pattern of dark or
light pixels, and in some situations the pattern of pixels corresponds to an object
and in others it does not. For other applications, see Boros, et al. [2000].
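The particular sensitivity described above ("tomatoes, or corn and peanuts together") can be written out as a small function; this sketch (not from the book) evaluates it on all 16 diet combinations:

```python
# The switching function S of Example 2.5: reaction whenever tomatoes
# (x1) are present, or corn (x3) and peanuts (x4) are both present.
from itertools import product

def S(x1, x2, x3, x4):
    return 1 if (x1 == 1 or (x3 == 1 and x4 == 1)) else 0

print(S(1, 0, 0, 0))  # 1
print(S(1, 1, 0, 0))  # 1
print(S(0, 0, 1, 1))  # 1
print(S(0, 1, 1, 0))  # 0

# Tabulate S on all 2^4 = 16 combinations.
table = {bits: S(*bits) for bits in product((0, 1), repeat=4)}
print(len(table))  # 16
```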
EXERCISES FOR SECTION 2.1²
1. The population of Carlisle, Pennsylvania, is about 20,000. If each resident has three
initials, is it true that there must be at least two residents with the same initials?
Give a justification of your answer.
² Note to reader: In the exercises in Chapter 2, exercises after each section can be assumed
to use techniques of some previous (nonoptional) section, not necessarily exactly the techniques
just introduced. Also, there are additional exercises at the end of the chapter. Indeed, sometimes
an exercise is included which does not make use of the techniques of the current section. To
understand a new technique, one must understand when it does not apply as well as when it
applies.
2. A library has 1,700,000 books, and the librarian wants to encode each using a
codeword consisting of 3 letters followed by 3 numbers. Are there enough codewords
to encode all 1,700,000 books with different codewords?
3. (a) Continuing with Exercise 7 of Chapter 1, compute the maximum number of
strings of length at most 3 in a trinary code.
(b) Repeat for length at most 4.
(c) Repeat for length exactly 4, but beginning with a 0 or 1.
4. In our discussion of telephone numbers, suppose that we maintain the original
restrictions on area code as in Example 2.3. Suppose that we lengthen the local phone
number, allowing it to be any eight-digit number with the restriction that none of
the first three digits can be 0 or 1. How many local phone numbers are there? How
many phone numbers are there including area code?
5. If we want to use bit strings of length at most n to encode not only all 26 letters
of the alphabet, but also all 10 decimal digits, what is the smallest number n that
works? (What is n for Morse code?)
6. How many m × n matrices are there, each of whose entries is 0 or 1?
7. A musical band has to have at least one member. It can contain at most one
drummer, at most one pianist, at most one bassist, at most one lead singer, and at
most two background singers. How many possible bands are there if we consider any
two drummers indistinguishable, and the same holds true for the other categories,
and hence call two bands the same if they have the same number of members of
each category? Justify your answer.
8. How many nonnegative integers less than 1 million contain the digit 2?
9. Enumerate all switching functions of 2 variables.
10. If a function assigns 0 or 1 to each switching function of n variables, how many such
functions are there?
11. A switching function S is called self-dual if the value of S on a bit string is unchanged
when 0's and 1's are interchanged. For instance, the function S of Table 2.2 is
self-dual, but the function T of that table is not. How many self-dual switching
functions of n variables are there?
12. (Stanat and McAllister [1977]) In some computers, an integer (positive or negative)
is represented by using bit strings of length p. The last bit in the string represents
the sign, and the first p − 1 bits are used to encode the integer. What is the largest
number of distinct integers that can be represented in this way for a given p? What
if 0 must be one of these integers? (The sign of 0 is + or −.)
13. (Stanat and McAllister [1977]) Every integer can be represented (nonuniquely) in
the form a · 2^b, where a and b are integers. The floating-point representation for an
integer uses a bit string of length p to represent an integer by using the first m bits
to encode a and the remaining p − m bits to encode b, with the latter two encodings
performed as described in Exercise 12.
(a) What is the largest number of distinct integers that can be represented using
the floating-point notation for a given p?
(b) Repeat part (a) if the floating-point representation is carried out in such a way
that the leading bit for encoding the number a is 1.
(c) Repeat part (a) if 0 must be included.

14. When acting on loan applications, it can be concluded, based on historical records,
that loan applicants having certain combinations of features can be expected to
repay their loans and those who have other combinations of features cannot. As
their main features, suppose that a bank uses:
Marital Status: Married, Single (never married), Single (previously married).
Past Loan: Previous default, No previous default.
Employment: Employed, Unemployed (within 1 year), Unemployed (more than 1
year).
(a) How many different loan applications are possible when considering these
features?
(b) How many manifestations of loan repayment/default are possible when
considering these features?

2.2 THE SUM RULE


We turn now to the second fundamental counting rule. Consider the following
example.
Example 2.6 Congressional Delegations There are 100 senators and 435
members of the House of Representatives. A delegation is being selected to see
the President. In how many different ways can such a delegation be picked if it
consists of one senator and one representative? The answer, by the product rule, is
100 × 435 = 43,500.
What if the delegation is to consist of one member of the Senate or one member of
the House? Then there are
100 + 435 = 535
possible delegations. This computation illustrates the second basic rule of counting,
the sum rule.
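The contrast between the two questions can be captured in two lines of code (a trivial sketch, not from the book): "and" multiplies the counts, "or" adds them.

```python
# Example 2.6: "one senator AND one representative" vs.
# "one senator OR one representative".
senators, representatives = 100, 435
print(senators * representatives)  # 43,500 delegations of one of each
print(senators + representatives)  # 535 delegations of one of either
```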
SUM RULE: If one event can occur in n1 ways and a second event in n2
(different) ways, then there are n1 + n2 ways in which either the first event or the
second event can occur (but not both). More generally, if one event can occur in n1
ways, a second event can occur in n2 (different) ways, a third event can occur in n3
(still different) ways, ..., then there are
n1 + n2 + n3 + · · ·
ways in which (exactly) one of the events can occur.
In Example 2.6 we have italicized the words "and" and "or." These key words
usually indicate whether the sum rule or the product rule is appropriate. The word
"and" suggests the product rule; the word "or" suggests the sum rule.
Example 2.7 Draft Picks A professional football team has two draft choices
to make and has limited the choice to 3 quarterbacks, 4 linebackers, and 5 wide
receivers. To pick a quarterback and linebacker there are 3 × 4 = 12 ways, by the
product rule. How many ways are there to pick two players if they must play different
positions? You can pick either a quarterback and linebacker, a quarterback and
wide receiver, or a linebacker and wide receiver. There are, by previous computation,
12 ways of doing the first. There are 15 ways of doing the second (why?) and 20
ways of doing the third (why?). Hence, by the sum rule, the number of ways of
choosing the two players from different positions is
12 + 15 + 20 = 47.
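The product-then-sum structure of this example can be checked by a short computation (our sketch, not the book's): apply the product rule to each pair of positions, then the sum rule over the pairs.

```python
# Example 2.7: pairs of players from different positions.
from itertools import combinations

positions = {"QB": 3, "LB": 4, "WR": 5}
total = 0
for p, q in combinations(positions, 2):
    total += positions[p] * positions[q]  # product rule for each pair
print(total)  # 12 + 15 + 20 = 47
```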
Example 2.8 Variables in BASIC and JAVA The programming language
BASIC (standing for Beginner's All-Purpose Symbolic Instruction Code) dates back
to 1964. Variable names in early implementations of BASIC could either be a letter,
a letter followed by a letter, or a letter followed by a decimal digit, that is, one of
the numbers 0, 1, ..., 9. How many different variable names were possible? By the
product rule, there were 26 × 26 = 676 and 26 × 10 = 260 names of the latter two
kinds, respectively. By the sum rule, there were 26 + 676 + 260 = 962 variable names
in all.
The need for more variables was but one reason for more advanced programming
languages. For example, the JAVA programming language, introduced in 1995, has
variable name lengths ranging from 1 to 65,535 characters. Each character can be
a letter (uppercase or lowercase), an underscore, a dollar sign, or a decimal digit,
except that the first character cannot be a decimal digit. By using the sum rule,
we see that the number of possible characters is 26 + 26 + 1 + 1 + 10 = 64, except
for the first character, which has only 64 − 10 = 54 possibilities. Finally, by using
the sum and product rules, we see that the number of variable names is
54 · 64^65534 + 54 · 64^65533 + · · · + 54 · 64 + 54.
This certainly allows for more than enough variables.³
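Both counts are easy to reproduce; Python's arbitrary-precision integers handle the enormous JAVA total exactly. Summing the series term by term would be slow, so this sketch (ours, not the book's) uses the geometric-series closed form 54 · (64^65535 − 1)/63, which equals the displayed sum:

```python
# Variable-name counts from Example 2.8.
basic = 26 + 26 * 26 + 26 * 10
print(basic)  # 962

# JAVA names of lengths 1..65,535: sum_{k=0}^{65534} 54 * 64**k,
# evaluated in closed form (64 ≡ 1 mod 63, so the division is exact).
java = 54 * (64**65535 - 1) // 63
print(len(str(java)))  # the total has over 118,000 decimal digits
```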
In closing this section, let us restate the sum rule this way. Suppose that A and
B are disjoint sets and we wish to pick exactly one element, picking it from A or
from B. Then the number of ways to pick this element is the number of elements
in A plus the number of elements in B.

EXERCISES FOR SECTION 2.2


1. How many bit strings have length 3, 4, or 5?
2. A committee is to be chosen from among 8 scientists, 7 psychics, and 12 clerics.
If the committee is to have two members of different backgrounds, how many such
committees are there?
³ The value of just the first term in the sum, 54 · 64^65534, is approximately 8.527 × 10^118367.
Arguably, there are at least on the order of 10^80 atomic particles in the universe (e.g., see Dembski
[1998]).

3. How many numbers are there which have five digits, each being a number in
{1, 2, ..., 9}, and either having all digits odd or having all digits even?
4. Each American Express card has a 15-digit number for computer identification
purposes. If each digit can be any number between 0 and 9, are there enough different
account numbers for 10 million credit-card holders? Would there be if the digits
were only 0 or 1?
5. How many 5-letter words either start with d or do not have the letter d?
6. In how many ways can we get a sum of 3 or a sum of 4 when two dice are rolled?
7. Suppose that a pipeline network is to have 30 links. For each link, there are 2
choices: The pipe may be any one of 7 sizes and any one of 3 materials. How many
different pipeline networks are there?
8. How many DNA chains of length 3 have no C's at all or have no T's in the first
position?

2.3 PERMUTATIONS
In combinatorics we frequently talk about n-element sets, sets consisting of n
distinct elements. It is convenient to call these n-sets. A permutation of an n-set is an
arrangement of the elements of the set in order. It is often important to count the
number of permutations of an n-set.
Example 2.9 Job Interviews Three people, Ms. Jordan, Mr. Harper, and Ms.
Gabler, are scheduled for job interviews. In how many different orders can they be
interviewed? We can list all possible orders, as follows:
1. Jordan, Harper, Gabler
2. Jordan, Gabler, Harper
3. Harper, Jordan, Gabler
4. Harper, Gabler, Jordan
5. Gabler, Jordan, Harper
6. Gabler, Harper, Jordan
We see that there are 6 possible orders. Alternatively, we can observe that there
are 3 choices for the first person being interviewed. For each of these choices, there
are 2 remaining choices for the second person. For each of these choices, there is 1
remaining choice for the third person. Hence, by the product rule, the number of
possible orders is
3 × 2 × 1 = 6.
Each order is a permutation. We are asking for the number of permutations of a
3-set, the set consisting of Jordan, Harper, and Gabler.
If there are 5 people to be interviewed, counting the number of possible orders
can still be done by enumeration; however, that is rather tedious. It is easier to
observe that now there are 5 possibilities for the first person, 4 remaining possibilities
for the second person, and so on, resulting in
5 × 4 × 3 × 2 × 1 = 120

Table 2.3: Values of n! for n from 0 to 10


n 0 1 2 3 4 5 6 7 8 9 10
n! 1 1 2 6 24 120 720 5,040 40,320 362,880 3,628,800

possible orders in all. 
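For small n, the enumeration can be done by machine; this sketch (not from the book) lists the six interview orders directly and confirms the factorial count for five people:

```python
# Enumerate the interview orders of Example 2.9.
from itertools import permutations
from math import factorial

people = ["Jordan", "Harper", "Gabler"]
orders = list(permutations(people))
print(len(orders))   # 6, matching the list in the example
print(factorial(5))  # 120 orders for five interviewees
```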


The computations of Example 2.9 generalize to give us the following result: The
number of permutations of an n-set is given by
n × (n − 1) × (n − 2) × · · · × 1 = n!
In Example 1.1 we discussed the number of orders in which to take 5 different drugs.
This is the same as the number of permutations of a 5-set, so it is 5! = 120. To see
once again why counting by enumeration rapidly becomes impossible, we show in
Table 2.3 the values of n! for several values of n.
The number 25!, to give an example, is already so large that it is incomprehensible.
To see this, note that
25! ≈ 1.55 × 10^25.
A computer checking 1 billion permutations per second would require almost half
a billion years to look at 1.55 × 10^25 permutations.⁴ In spite of the result above,
there are occasions where it is useful to enumerate all permutations of an n-set. In
Section 2.16 we present an algorithm for doing so.
The number n! can be approximated by computing sn = √(2πn) (n/e)^n. The
approximation of n! by sn is called Stirling's approximation. To see how good the
approximation is, note that it approximates 5! as s5 = 118.02 and 10! as s10 =
3,598,600. (Compare these with the real values in Table 2.3.) The quality of the
approximation is evidenced by the fact that the ratio of n! to sn approaches 1 as n
approaches ∞ (grows arbitrarily large). (On the other hand, the difference n! − sn
approaches ∞ as n approaches ∞.) For a proof, see an advanced calculus text such
as Buck [1965].
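Stirling's formula is one line of code, and comparing it with the exact factorial shows how close the ratio is to 1 even for small n (a sketch, not from the book):

```python
# Stirling's approximation s_n = sqrt(2*pi*n) * (n/e)**n, compared
# with the exact factorial.
import math

def stirling(n):
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

for n in (5, 10):
    exact = math.factorial(n)
    approx = stirling(n)
    print(n, exact, round(approx, 2), exact / approx)  # ratio near 1
```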

EXERCISES FOR SECTION 2.3


1. List all permutations of
(a) {1, 2, 3} (b) {1, 2, 3, 4}
⁴ To see why, note that there are approximately 3.15 × 10^7 seconds in a year. Thus, a computer
checking 1 billion = 10^9 permutations per second can check 3.15 × 10^7 × 10^9 = 3.15 × 10^16
permutations in a year. Hence, the number of years required to check 1.55 × 10^25 permutations is
1.55 × 10^25 / (3.15 × 10^16) ≈ 4.9 × 10^8.

2. How many permutations of {1, 2, 3, 4, 5} begin with 5?
3. How many permutations of {1, 2, ..., n} begin with 1 and end with n?
4. Compute sn and compare it to n! if
(a) n = 4 (b) n = 6 (c) n = 8
5. How many permutations of {1, 2, 3, 4} begin with an odd number?
6. (a) How many permutations of {1, 2, 3, 4, 5} have 2 in the second place?
(b) How many permutations of {1, 2, ..., n}, n ≥ 3, have 2 in the second place and
3 in the third place?
7. How many ways are there to rank five potential basketball recruits of different heights
if the tallest one must be ranked first and the shortest one last?
8. (Cohen [1978])
(a) In a six-cylinder engine, the even-numbered cylinders are on the left and the
odd-numbered cylinders are on the right. A good firing order is a permutation
of the numbers 1 to 6 in which right and left sides are alternated. How many
possible good firing orders are there which start with a left cylinder?
(b) Repeat for a 2n-cylinder engine.
9. Ten job applicants have been invited for interviews, five having been told to come
in the morning and five having been told to come in the afternoon. In how many
different orders can the interviews be scheduled? Compare your answer to the
number of different orders in which the interviews can be scheduled if all 10 applicants
were told to come in the morning.

2.4 COMPLEXITY OF COMPUTATION


We have already observed that not all problems of combinatorics can be solved
on the computer, at least not by enumeration. Suppose that a computer program
implements an algorithm for solving a combinatorial problem. Before running such
a program, we would like to know if the program will run in a "reasonable" amount
of time and will use no more than a "reasonable" (or allowable) amount of storage
or memory. The time or storage a program requires depends on the input. To
measure how expensive a program is to run, we try to calculate a cost function or a
complexity function. This is a function f that measures the cost, in terms of time
required or storage required, as a function of the size n of the input problem. For
instance, we might ask how many operations are required to multiply two square
matrices of n rows and columns each. This number of operations is f(n).
Usually, the cost of running a particular computer program on a particular
machine will vary with the skill of the programmer and the characteristics of the
machine. Thus there is a big emphasis in modern computer science on comparison
of algorithms rather than programs, and on estimation of the complexity f(n) of
an algorithm, independent of the particular program or machine used to implement
the algorithm. The desire to calculate complexity of algorithms is a major stimulus
for the development of techniques of combinatorics.
Example 2.10 The Traveling Salesman Problem A salesman wishes to visit
n different cities, starting and ending his business trip at the first city. He does not
care in which order he visits the cities. What he does care about is to minimize the
total cost of his trip. Assume that the cost of traveling from city i to city j is cij.
The problem is to find an algorithm for computing the cheapest route, where the
cost of a route is the sum of the cij for links used in the route. This is a typical
combinatorial optimization problem.
For the traveling salesman problem, we shall be concerned with the enumeration
algorithm: Enumerate all possible routes and calculate the cost of each route. We
shall try to compute the complexity f(n) of this algorithm, where n is the size of
the input, that is, the number of cities. We shall assume that identifying a route
and computing its cost is comparable for each route and takes 1 unit of time.
Now any route starting and ending at city 1 corresponds to a permutation of
the remaining n − 1 cities. Hence, there are (n − 1)! such routes, so f(n) = (n − 1)!
units of time. We have already shown that this number can be extremely high.
When n is 26 and n − 1 is 25, we showed that f(n) is so high that it is infeasible to
perform this algorithm by computer. We return to the traveling salesman problem
in Section 11.5.
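For tiny instances the enumeration algorithm is perfectly workable, and writing it out makes the (n − 1)! complexity visible. This sketch (ours, with made-up illustrative costs) fixes city 0 as the start and end and tries all orders of the remaining cities:

```python
# Brute-force enumeration for a 4-city traveling salesman instance.
from itertools import permutations

cost = [[0, 3, 4, 2],
        [3, 0, 6, 5],
        [4, 6, 0, 1],
        [2, 5, 1, 0]]
n = len(cost)

best = None
for route in permutations(range(1, n)):  # (n-1)! = 3! = 6 routes
    tour = (0,) + route + (0,)
    c = sum(cost[tour[i]][tour[i + 1]] for i in range(n))
    if best is None or c < best[0]:
        best = (c, tour)
print(best)  # → (12, (0, 1, 2, 3, 0))
```

Replacing 4 cities by 26 would mean 25! ≈ 1.55 × 10^25 routes, which is exactly why the enumeration algorithm is infeasible.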
It is interesting to note that the traveling salesman problem occurs in many
guises. Examples 2.11 to 2.16 give some of the alternative forms in which this
problem has arisen in practice.
Example 2.11 The Automated Teller Machine (ATM) Problem Your
bank has many ATMs. Each day, a courier goes from machine to machine to
make collections, gather computer information, and so on. In what order should
the machines be visited in order to minimize travel time? This problem arises in
practice at many banks. One of the first banks to use a traveling salesman algorithm
to solve it, in the early days of ATMs, was Shawmut Bank in Boston.⁵
Example 2.12 The Phone Booth Problem Once a week, each phone booth
in a region must be visited, and the coins collected. In what order should that be
done in order to minimize travel time? 
Example 2.13 The Problem of Robots in an Automated Warehouse The
warehouse of the future will have orders filled by a robot. Imagine a pharmaceutical
warehouse with stacks of goods arranged in rows and columns. An order comes in
for 10 cases of aspirin, six cases of shampoo, eight cases of Band-Aids, and so on.
Each is located by row, column, and height. In what order should the robot fill the
order in order to minimize the time required? The robot needs to be programmed
to solve a traveling salesman problem. (See Elsayed [1981] and Elsayed and Stern
[1983].)
Example 2.14 A Problem of X-Ray Crystallography In x-ray crystallography,
we must move a diffractometer through a sequence of prescribed angles. There
is a cost in terms of time and setup for doing one move after another. How do we
minimize this cost? (See Bland and Shallcross [1989].)
⁵ This example is from Margaret Cozzens (personal communication).
Example 2.15 Manufacturing In many factories, there are a number of jobs
that must be performed or processes that must be run. After running process i,
there is a certain setup cost before we can run process j: a cost in terms of time
or money or labor of preparing the machinery for the next process. Sometimes
this cost is small (e.g., simply making some minor adjustments) and sometimes it
is major (e.g., requiring complete cleaning of the equipment or installation of new
equipment). In what order should the processes be run to minimize total cost? (For
more on this application, see Example 11.5 and Section 11.6.3.)
Example 2.16 Holes in Circuit Boards In 1993, Applegate, Bixby, Chvatal,
and Cook (see http://www.cs.rutgers.edu/ chvatal/pcb3038.html) found the solution
to the largest TSPLIB⁶ traveling salesman problem solved up until that time.
It had 3,038 cities and arose from a practical problem involving the most efficient
order in which to drill 3,038 holes to make a circuit board (another traveling salesman
problem application). For information about this, see Zimmer [1993].⁷
The traveling salesman problem is an example of a problem that has defied the
efforts of researchers to find a "good" algorithm. Indeed, it belongs to a class of
problems known as NP-complete or NP-hard problems, problems for which it is
unlikely there will be a good algorithm in a very precise sense of the word good.
We return to this point in Section 2.18, where we define NP-completeness briefly
and define an algorithm to be a good algorithm if its complexity function f(n) is
bounded by a polynomial in n. Such an algorithm is called a polynomial algorithm
(more precisely, a polynomial-time algorithm).
Example 2.17 Scheduling a Computer System8 A computer center has n
programs to run. Each program requires certain resources, such as a compiler, a
number of processors, and an amount of memory per processor. We shall refer
to the required resources as a configuration corresponding to the program. The
conversion of the system from the ith configuration to the jth configuration has
a cost associated with it, say cij. For instance, if two programs require a similar
configuration, it makes sense to run them consecutively. The computer center would
like to minimize the total costs associated with running the n programs. The fixed
cost of running each program does not change with different orders of running
the programs. The only things that change are the conversion costs cij. Hence,
the center wants to find an order in which to run the programs such that the total
conversion costs are minimized. Similar questions arise in many scheduling problems
in operations research. We discuss them further in Example 11.5 and Section 11.6.3.
6 The TSPLIB (http://www.iwr.uni-heidelberg.de/iwr/comopt/software/TSPLIB95/) is a library
of 110 instances of the traveling salesman problem.
7 All instances in the TSPLIB library have been solved. The largest instance of the traveling
salesman problem in TSPLIB consists of a tour through 85,900 cities in a VLSI (Very Large-
Scale Integration) application. For a survey about the computational aspects of the traveling
salesman problem, see Applegate, et al. [1998]. See also the Traveling Salesman Problem home
page (http://www.tsp.gatech.edu/index.html).
8 This example is due to Stanat and McAllister [1977].
As in the traveling salesman problem, the algorithm of enumerating all possible
orders of running the programs is infeasible, for it clearly has a computational
complexity of n!. [Why n! and not (n − 1)!?] Indeed, from a formal point of view,
this problem and the traveling salesman problem are almost equivalent: simply
replace cities by configurations. Any algorithm for solving one of these problems is
readily translatable into an algorithm for solving the other problem. It is one of the
major motivations for using mathematical techniques to solve real problems that
we can solve one problem and then immediately have techniques that are applicable
to a large number of other problems, which on the surface seem quite different. 
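The enumeration algorithm just dismissed as infeasible is nevertheless easy to state in code. The following sketch searches all n! running orders for the scheduling problem of Example 2.17; the 4-configuration cost matrix is made up purely for illustration and is not from the text.

```python
from itertools import permutations

def best_order(cost):
    """Brute-force search over all n! running orders.

    cost[i][j] is the conversion cost from configuration i to
    configuration j; the fixed cost of running each program is
    ignored, since it is the same in every order.
    """
    n = len(cost)
    best, best_cost = None, float("inf")
    for order in permutations(range(n)):
        # Sum the conversion costs along consecutive pairs in the order.
        total = sum(cost[a][b] for a, b in zip(order, order[1:]))
        if total < best_cost:
            best_cost, best = total, order
    return best, best_cost

# A hypothetical 4-configuration conversion-cost matrix (illustrative only).
cost = [
    [0, 8, 11, 4],
    [12, 0, 4, 9],
    [3, 6, 0, 5],
    [7, 2, 10, 0],
]
order, total = best_order(cost)
```

Since the loop examines every one of the n! orders, this sketch makes the n! complexity discussed above concrete: it is fine for n = 4 but hopeless for n = 25.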

Example 2.18 Searching Through a File In determining computational complexity,
we do not always know exactly how long a computation will take. For
instance, consider the problem of searching through a list of n keys (identification
numbers) and finding the key of a particular person in order to access that person's
file. Now it is possible that the key in question will be first in the list. However,
in the worst case, the key will be last on the list. The cost of handling the worst
possible case is sometimes used as a measure of computational complexity called
the worst-case complexity. Here f(n) would be proportional to n. On the other
hand, another perfectly appropriate measure of computational complexity is the
average cost of handling a case, the average-case complexity. Assuming that all
cases are equally likely, this is computed by calculating the cost of handling each
case, summing up these costs, and dividing by the number of cases. In our example,
the average-case complexity is proportional to (n + 1)/2, assuming that all keys are
equally likely to be the object of a search, for the sum of the costs of handling the
cases is given by 1 + 2 + ... + n. Hence, using a standard formula for this sum, we
have
    f(n) = (1/n)(1 + 2 + ... + n) = (1/n) · n(n + 1)/2 = (n + 1)/2. 
In Section 3.6 we discuss the use of binary search trees for storing files and argue
that the computational complexity of finding a file with a given key can be reduced
significantly by using a binary search tree.
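The two complexity measures for linear search are easy to tabulate directly. A minimal sketch, assuming the target key is equally likely to sit in each of the n positions:

```python
def linear_search_costs(n):
    """Comparisons made by linear search when the target key sits at
    position k (1-indexed): exactly k comparisons."""
    return list(range(1, n + 1))

n = 100
costs = linear_search_costs(n)
worst_case = max(costs)         # proportional to n
average_case = sum(costs) / n   # equals (n + 1)/2
```

For n = 100 this gives a worst case of 100 comparisons and an average case of 50.5 = (100 + 1)/2, matching the formula derived above.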

EXERCISES FOR SECTION 2.4
1. If a computer could consider 1 billion orders a second, how many years would it take
to solve the computer configuration problem of Example 2.17 by enumeration if n
is 25?
2. If a computer could consider 100 billion orders a second instead of just 1 billion, how
many years would it take to solve the traveling salesman problem by enumeration
if n = 26? (Does the improvement in computer speed make a serious difference in
conclusions based on footnote 4 on page 26?)
3. Consider the problem of scheduling n legislative committees in order for meetings
in n consecutive time slots. Each committee chair indicates which time slot is his or
her first choice, and we seek to schedule the meetings so that the number of chairs
receiving their first choice is as large as possible. Suppose that we solve this problem
by enumerating all possible schedules, and for each we compute the number of chairs
receiving their first choice. What is the computational complexity of this procedure?
(Make an assumption about the number of steps required to compute the number
of chairs receiving their first choice.)
4. Suppose that there are n phone booths in a region and we wish to visit each of them
twice, but not in two consecutive times. Discuss the computational complexity of a
naive algorithm for finding an order of visits that minimizes the total travel time.
5. Solve the traveling salesman problem by enumeration if n = 4 and the cost cij is
given in the following matrix:

                ( −   1   8  11 )
        (cij) = ( 16  −   3   6 )
                (  4   9  −  11 )
                (  8   3   2  −  )
6. Solve the computer system scheduling problem of Example 2.17 if n = 3 and the
cost of converting from the ith configuration to the jth is given by

                ( −   8  11 )
        (cij) = ( 12  −   4 )
                (  3   6  −  )
7. Suppose that it takes 3 × 10⁻⁹ seconds to examine each key in a list. If there are n
keys and we search through them in order until we find the right one, find
(a) the worst-case complexity (b) the average-case complexity
8. Repeat Exercise 7 if it takes 3 × 10⁻¹¹ seconds to examine each key.
9. (Hopcroft [1981]) Suppose that L is a collection of bit strings of length n. Suppose
that A is an algorithm which determines, given a bit string of length n, whether or
not it is in L. Suppose that A always takes 2ⁿ seconds to provide an answer. Then
A has the same worst-case and average-case computational complexity, 2ⁿ. Suppose
that L̂ consists of all bit strings of the form
    x1 x2 ... xn x1 x2 ... xn,
where x1 x2 ... xn is in L. For instance, if L = {00, 10}, then L̂ = {0000, 1010}. Consider
the following Algorithm B for determining, given a bit string y = y1 y2 ... y2n
of length 2n, whether or not it is in L̂. First, determine if y is of the form
x1 x2 ... xn x1 x2 ... xn. This is easy to check. Assume for the sake of discussion
that it takes essentially 0 seconds to answer this question. If y is not of the proper
form, stop and say that y is not in L̂. If y is of the proper form, check if the first n
digits of y form a bit string in L.
(a) Compute the worst-case complexity of Algorithm B.
(b) Compute the average-case complexity of Algorithm B.
(c) Do your answers suggest that average-case complexity might not be a good
measure? Why?
2.5 r-PERMUTATIONS
Given an n-set, suppose that we want to pick out r elements and arrange them
in order. Such an arrangement is called an r-permutation of the n-set. P(n, r)
will count the number of r-permutations of an n-set. For example, the number
of 3-letter words without repeated letters can be calculated by observing that we
want to choose 3 different letters out of 26 and arrange them in order; hence,
we want P(26, 3). Similarly, if a student has 4 experiments to perform and 10
periods in which to perform them (each experiment taking one period to complete),
the number of different schedules he can make for himself is P(10, 4). Note that
P(n, r) = 0 if n < r: There are no r-permutations of an n-set in this case. In what
follows, it will usually be understood that n ≥ r.
To see how to calculate P(n, r), let us note that in the case of the 3-letter words,
there are 26 choices for the first letter; for each of these there are 25 remaining
choices for the second letter; and for each of these there are 24 remaining choices
for the third letter. Hence, by the product rule,
    P(26, 3) = 26 · 25 · 24.
In the case of the experiment schedules, we have 10 choices for the first experiment,
9 for the second, 8 for the third, and 7 for the fourth, giving us
    P(10, 4) = 10 · 9 · 8 · 7.
By the same reasoning, if n ≥ r,9
    P(n, r) = n(n − 1)(n − 2) ... (n − r + 1).
If n > r, this can be simplified as follows:
    P(n, r) = [n(n − 1) ... (n − r + 1)] · [(n − r)(n − r − 1) ... 1] / [(n − r)(n − r − 1) ... 1].
Hence, we obtain the result
    P(n, r) = n!/(n − r)!.    (2.1)
We have derived (2.1) under the assumption n > r. It clearly holds for n = r as
well. (Why?)
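The product-rule computation of P(n, r) translates directly into a short function. A sketch, with spot checks against formula (2.1):

```python
from math import factorial

def P(n, r):
    """Number of r-permutations of an n-set: n(n - 1)...(n - r + 1).
    When r > n the product passes through the factor 0, so the
    function returns 0, matching the convention P(n, r) = 0."""
    result = 1
    for k in range(n, n - r, -1):
        result *= k
    return result

# Spot checks against P(n, r) = n!/(n - r)! from formula (2.1):
assert P(26, 3) == factorial(26) // factorial(23)   # 26 * 25 * 24
assert P(10, 4) == factorial(10) // factorial(6)    # 10 * 9 * 8 * 7
```

This also illustrates footnote 9: for P(1, 3) the running product multiplies 1 · 0 · (−1) = 0, so the product formula gives 0 even though 1 < 3.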
Example 2.19 CD Player We buy a brand new CD player with many nice
features. In particular, the player has slots labeled 1 through 5 for five CDs, which
it plays in that order. If we have 24 CDs in our collection, how many different ways
can we load the CD player's slots for our listening pleasure? There are 24 choices
9 This formula even holds if n < r. Why?
for the first slot, 23 choices for the second, 22 choices for the third, 21 choices for
the fourth, and 20 choices for the fifth, giving us
    P(24, 5) = 24 · 23 · 22 · 21 · 20.
Alternatively, using Equation (2.1), we see again that
    P(24, 5) = 24!/(24 − 5)! = 24!/19! = 24 · 23 · 22 · 21 · 20. 

EXERCISES FOR SECTION 2.5
1. Find:
(a) P(3, 2) (b) P(5, 3) (c) P(8, 5) (d) P(1, 3)
2. Let A = {1, 5, 9, 11, 15, 23}.
(a) Find the number of sequences of length 3 using elements of A.
(b) Repeat part (a) if no element of A is to be used twice.
(c) Repeat part (a) if the first element of the sequence is 5.
(d) Repeat part (a) if the first element of the sequence is 5 and no element of A is
used twice.
3. Let A = {a, b, c, d, e, f, g, h}.
(a) Find the number of sequences of length 4 using elements of A.
(b) Repeat part (a) if no letter is repeated.
(c) Repeat part (a) if the first letter in the sequence is b.
(d) Repeat part (a) if the first letter is b and the last is d and no letters are
repeated.
4. In how many different orders can we schedule the first five interviews if we need to
schedule interviews with 20 job candidates?
5. If a campus telephone extension has four digits, how many different extensions are
there with no repeated digits:
(a) If the first digit cannot be 0?
(b) If the first digit cannot be 0 and the second cannot be 1?
6. A typical combination10 lock or padlock has 40 numbers on its dial, ranging from 0
to 39. It opens by turning its dial clockwise, then counterclockwise, then clockwise,
stopping each time at specific numbers. How many different padlocks can a company
manufacture?
10 In Section 2.7 we will see that the term "combination" is not appropriate with regard to
padlocks; "r-permutation" would be correct.
2.6 SUBSETS
Example 2.20 The Pizza Problem A pizza shop advertises that it offers over
500 varieties of pizza. The local consumer protection bureau is suspicious. At the
pizza shop, it is possible to have on a pizza a choice of any combination of the
following toppings:
    pepperoni, mushrooms, peppers, olives, sausage,
    anchovies, salami, onions, bacon.
Is the pizza shop telling the truth in its advertisements? We shall be able to answer
this question with some simple applications of the product rule. 
To answer the question raised in Example 2.20, let us consider the set {a, b, c}.
Let us ask how many subsets there are of this set. The answer can be obtained by
enumeration, and we find that there are 8 such subsets:
    ∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}.
The answer can also be obtained using the product rule. We think of building up
a subset in steps. First, we think of either including element a or not. There are
2 choices. Then we either include element b or not. There are again 2 choices.
Finally, we either include element c or not. There are again 2 choices. The total
number of ways of building up the subset is, by the product rule,
    2 · 2 · 2 = 2³ = 8.
Similarly, the number of subsets of a 4-set is
    2 · 2 · 2 · 2 = 2⁴ = 16,
and the number of subsets of an n-set is
    2 · 2 ··· 2 (n times) = 2ⁿ.
Do these considerations help with the pizza problem? We can think of a particular
pizza as a subset of the set of toppings. Alternatively, we can think, for each
topping, of either including it or not. Either way, we see that there are 2⁹ = 512
possible pizzas. Thus, the pizza shop has not advertised falsely.
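The in-or-out argument above is easy to animate in code. A sketch: each topping doubles the number of possible pizzas, so the 9 toppings give 2⁹ = 512.

```python
def all_subsets(items):
    """Build subsets step by step, as in the product-rule argument:
    each new item either joins an existing subset or it doesn't,
    so the count doubles with every item."""
    subsets = [[]]
    for item in items:
        subsets += [s + [item] for s in subsets]
    return subsets

toppings = ["pepperoni", "mushrooms", "peppers", "olives", "sausage",
            "anchovies", "salami", "onions", "bacon"]
pizzas = all_subsets(toppings)   # 2^9 = 512 possible pizzas
```

Note that the empty list is among the 512 subsets: the "plain" pizza with no toppings counts as one of the varieties.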

EXERCISES FOR SECTION 2.6
1. Enumerate the 16 subsets of {a, b, c, d}.
2. A magazine subscription service deals with 35 magazines. A subscriber may order
any number of them. The subscription service is trying to computerize its billing
procedure and wishes to assign a different computer key (identification number) to
two different people unless they subscribe to exactly the same magazines. How much
storage is required; that is, how many different code numbers are needed?
3. If the pizza shop of Example 2.20 decides to always put onions and mushrooms on
its pizzas, how many different varieties can the shop now offer?
4. Suppose that the pizza shop of Example 2.20 adds a new possible topping, sardines,
but insists that each pizza either have sardines or have anchovies. How many possible
varieties of pizza does the shop now offer?
5. If A is a set of 10 elements, how many nonempty subsets does A have?
6. If A is a set of 8 elements, how many subsets of more than one element does A have?
7. A value function on a set A assigns 0 or 1 to each subset of A.
(a) If A has 3 elements, how many different value functions are there on A?
(b) What if A has n elements?
8. In a simple game (see Section 2.15), every subset of players is identified as either
winning or losing.
(a) If there is no restriction on this identification, how many distinct simple games
are there with 3 players?
(b) With n players?
2.7 r-COMBINATIONS
An r-combination of an n-set is a selection of r elements from the set, which means
that order does not matter. Thus, an r-combination is an r-element subset. C(n, r)
will denote the number of r-combinations of an n-set. For example, the number of
ways to choose a committee of 3 from a set of 4 people is given by C(4, 3). If the 4
people are Dewey, Evans, Grange, and Howe, the possible committees are
    {Dewey, Evans, Grange}, {Howe, Evans, Grange},
    {Dewey, Howe, Grange}, {Dewey, Evans, Howe}.
Hence, C(4, 3) = 4. We shall prove some simple theorems about C(n, r). Note
that C(n, r) is 0 if n < r: There are no r-combinations of an n-set in this case.
Henceforth, n ≥ r will usually be understood. It is assumed in all of the theorems
in this section.
Theorem 2.1
    P(n, r) = C(n, r) · P(r, r).
Proof. An ordered arrangement of r objects out of n can be obtained by first
choosing r objects [this can be done in C(n, r) ways] and then ordering them [this
can be done in P(r, r) = r! ways]. The theorem follows by the product rule. Q.E.D.
Corollary 2.1.1
    C(n, r) = n! / [r!(n − r)!].    (2.2)
Proof.
    C(n, r) = P(n, r)/P(r, r) = [n!/(n − r)!] / [r!/(r − r)!] = n! / [r!(n − r)!].    Q.E.D.
Corollary 2.1.2
    C(n, r) = C(n, n − r).
Proof.
    C(n, r) = n! / [r!(n − r)!] = n! / [(n − r)! r!] = n! / [(n − r)! [n − (n − r)]!] = C(n, n − r).
Q.E.D.
For an alternative "combinatorial" proof, see Exercise 20.
Note: The number
    n! / [r!(n − r)!]
is often denoted by
    ( n )
    ( r )
and called a binomial coefficient. This is because, as we shall see below, this number
arises in the binomial expansion (see Section 2.14). Corollary 2.1.2 states the result
that
    ( n )   (   n   )
    ( r ) = ( n − r ).
In what follows we use C(n, r) and the binomial coefficient notation interchangeably.
Theorem 2.2
    C(n, r) = C(n − 1, r − 1) + C(n − 1, r).
Proof. Mark one of the n objects with a ∗. The r objects can be selected either
to include the object ∗ or not to include it. There are C(n − 1, r − 1) ways to do
the former, since this is equivalent to choosing r − 1 objects out of the n − 1 non-∗
objects. There are C(n − 1, r) ways to do the latter, since this is equivalent to
choosing r objects out of the n − 1 non-∗ objects. Hence, the sum rule yields the
theorem. Q.E.D.
Note: This proof can be described as a "combinatorial" proof, i.e., relying on
counting arguments. This theorem can also be proved by algebraic manipulation,
using the formula (2.2). Here is such an "algebraic" proof.
Second Proof of Theorem 2.2.
    C(n − 1, r − 1) + C(n − 1, r)
        = (n − 1)! / [(r − 1)! [(n − 1) − (r − 1)]!] + (n − 1)! / [r! [(n − 1) − r]!]
        = (n − 1)! / [(r − 1)! (n − r)!] + (n − 1)! / [r! (n − r − 1)!]
        = r(n − 1)! / [r! (n − r)!] + (n − r)(n − 1)! / [r! (n − r)!]
        = [r(n − 1)! + (n − r)(n − 1)!] / [r! (n − r)!]
        = (n − 1)! [r + n − r] / [r! (n − r)!]
        = n! / [r! (n − r)!]
        = C(n, r).
Q.E.D.
Let us give some quick applications of our new formulas and our basic rules so
far.
1. In the pizza problem (Example 2.20), the number of pizzas with
exactly 3 different toppings is
    C(9, 3) = 9!/(3!6!) = 84.
2. The number of pizzas with at most 3 different toppings is, by the
sum rule,
    C(9, 0) + C(9, 1) + C(9, 2) + C(9, 3) = 130.
3. If we have 6 drugs being tested in an experiment and we want to
choose 2 of them to give to a particular subject, the number of ways
in which we can do this is
    C(6, 2) = 6!/(2!4!) = 15.
4. If there are 7 possible meeting times and a committee must meet 3
times, the number of ways we can assign the meeting times is
    C(7, 3) = 7!/(3!4!) = 35.
5. The number of 5-member committees from a group of 9 people is
    C(9, 5) = 126.
6. The number of 7-member committees from the U.S. Senate is
    C(100, 7).
7. The number of delegations to the President consisting of 2 senators
and 2 representatives is
    C(100, 2) · C(435, 2).
8. The number of 9-digit bit strings with 5 1's and 4 0's is
    C(9, 5) = C(9, 4).
To see why, think of having 9 unknown digits and choosing 5 of them
to be 1's (or 4 of them to be 0's).
A convenient method of calculating the numbers C(n, r) is to use the array
shown in Figure 2.4. The number C(n, r) appears in the nth row, rth diagonal.
Each element in a given position is obtained by summing the two elements in the
row above it which are just to the left and just to the right. For example, C(5, 2)
is given by summing up the numbers 4 and 6, which are circled in Figure 2.4. The
array of Figure 2.4 is called Pascal's triangle, after the famous French philosopher
and mathematician Blaise Pascal. Pascal was one of the inventors of probability
theory and discovered many interesting combinatorial techniques.
Why does Pascal's triangle work? The answer is that it depends on the relation
    C(n, r) = C(n − 1, r − 1) + C(n − 1, r).    (2.3)
This is exactly the relation that was proved in Theorem 2.2. The relation (2.3) is
an example of a recurrence relation. We shall see many such relations later in the
book, especially in Chapter 6, which is devoted entirely to this topic. Obtaining
such relations allows one to reduce calculations of complicated numbers to earlier
steps, and therefore allows the computation of these numbers in stages.
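The recurrence (2.3) is all that is needed to generate the triangle. A sketch, which also checks each entry against the factorial formula (2.2):

```python
from math import factorial

def pascal_rows(n_max):
    """Rows 0..n_max of Pascal's triangle, built purely from the
    recurrence C(n, r) = C(n - 1, r - 1) + C(n - 1, r)."""
    rows = [[1]]
    for n in range(1, n_max + 1):
        prev = rows[-1]
        # Interior entries sum the two neighbors in the row above;
        # the boundary entries C(n, 0) = C(n, n) = 1.
        rows.append([1] + [prev[r - 1] + prev[r] for r in range(1, n)] + [1])
    return rows

rows = pascal_rows(6)
# Every entry agrees with C(n, r) = n!/(r!(n - r)!).
for n, row in enumerate(rows):
    for r, entry in enumerate(row):
        assert entry == factorial(n) // (factorial(r) * factorial(n - r))
```

For example, rows[5][2] is C(5, 2) = 10, obtained as 4 + 6 from the row n = 4, exactly as in Figure 2.4.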

EXERCISES FOR SECTION 2.7
1. How many ways are there to choose 5 starters (independent of position) from a
basketball team of 10 players?
2. How many ways can 7 award winners be chosen from a group of 50 nominees?
3. Compute:
(a) C(6, 3) (b) C(7, 4) (c) C(5, 1) (d) C(2, 4)
4. Find C(n, 1).
5. Compute C(5, 2) and check your answer by enumeration.
6. Compute C(6, 2) and check your answer by enumeration.
        r=0  r=1  r=2  r=3  r=4  r=5  r=6
n=0      1
n=1      1    1
n=2      1    2    1
n=3      1    3    3    1
n=4      1    4    6    4    1
n=5      1    5   10   10    5    1
n=6      1    6   15   20   15    6    1

Figure 2.4: Pascal's triangle. The circled numbers (the 4 and 6 in the row n = 4) are added to give C(5, 2).

7. Check by computation that:
(a) C(7, 2) = C(7, 5) (b) C(6, 4) = C(6, 2)
8. Extend Figure 2.4 by adding one more row.
9. Compute C(5, 3), C(4, 2), and C(4, 3) and verify that formula (2.3) holds.
10. Repeat Exercise 9 for C(7, 5), C(6, 4), and C(6, 5).
11. (a) In how many ways can 8 blood samples be divided into 2 groups to be sent to
different laboratories for testing if there are 4 samples in each group? Assume
that the laboratories are distinguishable.
(b) In how many ways can 8 blood samples be divided into 2 groups to be sent to
different laboratories for testing if there are 4 samples in each group? Assume
that the laboratories are indistinguishable.
(c) In how many ways can the 8 samples be divided into 2 groups if there is at
least 1 item in each group? Assume that the laboratories are distinguishable.
12. A company is considering 6 possible new computer systems and its systems manager
would like to try out at most 3 of them. In how many ways can the systems manager
choose the systems to be tried out?
13. (a) In how many ways can 10 food items be divided into 2 groups to be sent to
different laboratories for purity testing if there are 5 items in each group?
(b) In how many ways can the 10 items be divided into 2 groups if there is at least
1 item in each group?
14. How many 8-letter words with no repeated letters can be constructed using the 26
letters of the alphabet if each word contains 3, 4, or 5 vowels?
15. How many odd numbers between 1000 and 9999 have distinct digits?
16. A fleet is to be chosen from a set of 7 different-make foreign cars and 4 different-make
domestic cars. How many ways are there to form the fleet if:
(a) The fleet has 5 cars, 3 foreign and 2 domestic?
(b) The fleet can be any size (except empty), but it must have equal numbers of
foreign and domestic cars?
(c) The fleet has 4 cars and 1 of them must be a Chevrolet?
(d) The fleet has 4 cars, 2 of each kind, and a Chevrolet and Honda cannot both
be in the fleet?
17. (a) A computer center has 9 different programs to run. Four of them use the language
C++ and 5 use the language JAVA. The C++ programs are considered
indistinguishable and so are the JAVA programs. Find the number of possible
orders for running the programs if:
i. There are no restrictions.
ii. The C++ programs must be run consecutively.
iii. The C++ programs must be run consecutively and the JAVA programs
must be run consecutively.
iv. The languages must alternate.
(b) Suppose that the cost of switching from a C++ configuration to a JAVA
configuration is 10 units, the cost of switching from a JAVA configuration to a
C++ configuration is 5 units, and there is no cost to switch from C++ to
C++ or JAVA to JAVA. What is the most efficient (least cost) ordering in
which to run the programs?
(c) Repeat part (a) if the C++ programs are all distinguishable from each other
and so are the JAVA programs.
18. A certain company has 30 female employees, including 3 in the management ranks,
and 150 male employees, including 12 in the management ranks. A committee
consisting of 3 women and 3 men is to be chosen. How many ways are there to
choose the committee if:
(a) It includes at least 1 person of management rank of each gender?
(b) It includes at least 1 person of management rank?
19. Consider the identity
    C(n, m) C(m, k) = C(n, k) C(n − k, m − k).
(a) Prove this identity using an "algebraic" proof.
(b) Prove this identity using a "combinatorial" proof.
20. Give an alternative "combinatorial" proof of Corollary 2.1.2 by using the definition
of C(n, r).
21. How would you find the sum C(n, 0) + C(n, 1) + C(n, 2) + ... + C(n, n) from Pascal's
triangle? Do so for n = 2, 3, and 4. Guess at the answer in general.
22. Show that
    C(n, 0) + C(n + 1, 1) + ... + C(n + r, r) = C(n + r + 1, r).

23. Prove the following identity (using a combinatorial proof if possible). The identity
is called Vandermonde's identity.
    C(n + m, r) = C(n, 0) C(m, r) + C(n, 1) C(m, r − 1) + C(n, 2) C(m, r − 2) + ... + C(n, r) C(m, 0).
24. Following Cohen [1978], define ⟨n, r⟩ to be C(n + r − 1, r). Show that
    ⟨n, r⟩ = ⟨n, r − 1⟩ + ⟨n − 1, r⟩
(a) using an algebraic proof (b) using a combinatorial proof
25. If ⟨n, r⟩ is defined as in Exercise 24, show that
    ⟨n, r⟩ = (n/r) ⟨n + 1, r − 1⟩ = [(n + r − 1)/r] ⟨n, r − 1⟩.
26. A sequence of numbers a0, a1, a2, ..., an is called unimodal if for some integer t, a0 ≤
a1 ≤ ... ≤ at and at ≥ at+1 ≥ ... ≥ an. (Note that the entries in any row of Pascal's
triangle increase for a while and then decrease and thus form a unimodal sequence.)
(a) Show that if a0, a1, a2, ..., an is unimodal, t is not necessarily unique.
(b) Show that if n > 0, the sequence C(n, 0), C(n, 1), C(n, 2), ..., C(n, n) is unimodal.
(c) Show that the largest entry in the sequence in part (b) is C(n, ⌊n/2⌋), where ⌊x⌋
is the greatest integer less than or equal to x.

2.8 PROBABILITY
The history of combinatorics is closely intertwined with the history of the theory of
probability. The theory of probability was developed to deal with uncertain events,
events that might or might not occur. In particular, this theory was developed
by Pascal, Fermat, Laplace, and others in connection with the outcomes of certain
gambles. In his Théorie Analytique des Probabilités, published in 1812, Laplace
defined probability as follows: The probability of an event is the number of possible
outcomes whose occurrence signals the event divided by the total number of possible
outcomes. For instance, suppose that we consider choosing a 2-digit bit string at
random. There are 4 such strings: 00, 01, 10, and 11. What is the probability that
the string chosen has a 0? The answer is 3/4, because 3 of the possible outcomes
signal the event in question, that is, have a 0, and there are 4 possible outcomes in
all. This definition of Laplace's is appropriate only if all the possible outcomes are
equally likely, as we shall quickly observe.
Let us make things a little more precise. We shall try to formalize the notion
of probability by thinking of an experiment that produces one of a number of possible
outcomes. The set of possible outcomes is called the sample space. An event
corresponds to a subset of the set of outcomes, that is, of the sample space; it
corresponds to those outcomes that signal that the event has taken place. An event's
complement corresponds to those outcomes that signal that the event has not taken
place. Laplace's definition says that if E is an event in the sample space S and E^c
is the complement of E, then
    probability of E = n(E)/n(S)  and  probability of E^c = [n(S) − n(E)]/n(S) = 1 − n(E)/n(S),
where n(E) is the number of outcomes in E and n(S) is the number of outcomes
in S. Note that it follows that the probability of E is a number between 0 and 1.
Let us apply this definition to a gambling situation. We toss a die: this is the
experiment. We wish to compute the probability that the outcome will be an even
number. The sample space is the set of possible outcomes, {1, 2, 3, 4, 5, 6}. The
event in question is the set of all outcomes which are even, that is, the set {2, 4, 6}.
Then we have
    probability of even = n({2, 4, 6}) / n({1, 2, 3, 4, 5, 6}) = 3/6 = 1/2.
Notice that this result would not hold unless all the outcomes in the sample space
were equally likely. If we have a weighted die that always comes up 1, the probability
of getting an even number is not 1/2 but 0.11
Let us consider a family with two children. What is the probability that the
family will have at least one boy? There are three possibilities for such a family: It
can have two boys, two girls, or a boy and a girl. Let us take the set of these three
possibilities as our sample space. The first and third outcomes make up the event
"having at least one boy," and hence, by Laplace's definition,
    probability of having at least one boy = 2/3.
Is this really correct? It is not. If we look at families with two children, more than
2/3 of them have at least one boy. That is because there are four ways to build up
a family of two children: we can have first a boy and then another boy, first a girl
and then another girl, first a boy and then a girl, or first a girl and then a boy.
Thus, there are more ways to have a boy and a girl than there are ways to have two
boys, and the outcomes in our sample space were not equally likely. However, the
11 It could be argued that the definition of probability we have given is "circular" because it
depends on the notion of events being "equally likely," which suggests that we already know how
to measure probability. This is a subtle point. However, we can make comparisons of things
without being able to measure them, e.g., to say that this person and that person seem equally
tall. The theory of measurement of probability, starting with comparisons of this sort, is described
in Fine [1973] and Roberts [1976, 1979].

outcomes BB, GG, BG, and GB, to use obvious abbreviations, are equally likely,12
so we can take them as our sample space. Now the event "having at least one boy"
has 3 outcomes in it out of 4, and we have
    probability of having at least one boy = 3/4.
We shall limit computations of probability in this book to situations where
the outcomes in the sample space are equally likely. Note that our definition of
probability applies only to the case where the sample space is finite. In the infinite
case, the Laplace definition obviously has to be modified. For a discussion of the
not-equally-likely case and the infinite case, the reader is referred to almost any
textbook on probability theory, for instance Feller [1968], Parzen [1992], or Ross
[1997].
Let us continue by giving several more applications of our definition. Suppose
that a family is known to have 4 children. What is the probability that half of
them are boys? The answer is not 1/2. To obtain the answer we observe that the
sample space is all sequences of B's and G's of length 4; a typical such sequence is
BGGB. How many such sequences have exactly 2 B's? There are 4 positions, and
2 of these must be chosen for B's. Hence, there are C(4, 2) such sequences. How
many sequences are there in all? By the product rule, there are 2⁴. Hence,
    probability that half are boys = C(4, 2)/2⁴ = 6/16 = 3/8.
The reader might wish to write out all 16 possible outcomes and note the 6 that
signal the event of having exactly 2 boys.
Next, suppose that a fair coin is tossed 5 times. What is the probability that
there will be at least 2 heads? The sample space consists of all possible sequences
of heads and tails of length 5; that is, it consists of sequences such as HHHTH, to
use an obvious abbreviation. How many such sequences have at least 2 heads? The
answer is that C(5, 2) sequences have exactly 2 heads, C(5, 3) have exactly 3 heads,
and so on. Thus, the number of sequences having at least 2 heads is given by
    C(5, 2) + C(5, 3) + C(5, 4) + C(5, 5) = 26.
The total number of possible sequences is 2⁵ = 32. Hence,
    probability of having at least two heads = 26/32 = 13/16.
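Laplace's definition can be checked by direct enumeration of the equally likely sample space. A sketch covering the coin example above:

```python
from fractions import Fraction
from itertools import product

def prob_at_least_k_heads(tosses, k):
    """Favorable outcomes over total outcomes, with all 2^tosses
    equally likely head/tail sequences enumerated explicitly."""
    outcomes = list(product("HT", repeat=tosses))
    favorable = sum(1 for seq in outcomes if seq.count("H") >= k)
    return Fraction(favorable, len(outcomes))

assert prob_at_least_k_heads(5, 2) == Fraction(13, 16)
# Two-children analog: "at least one boy" over the sample space BB, BG, GB, GG.
assert prob_at_least_k_heads(2, 1) == Fraction(3, 4)
```

Using exact fractions rather than floating point keeps the answers in the same form as the text.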
Example 2.21 Reliability of Systems Imagine that a system has n components,
each of which can work or fail to work. Let xi be 1 if the ith component
works and 0 if it fails. Let the bit string x1 x2 ... xn describe the system. Thus, the
bit string 0011 describes a system with four components, with the first two failing
12 Even this statement is not quite accurate, because it is slightly more likely to have a boy than
a girl (see Cummings [1997]). Thus, the four events we have chosen are not exactly equally likely.
For example, BB is more likely than GG. However, the assertion is a good working approximation.

Table 2.4: The Switching Function F That is 1 if and Only if Two or Three
Components of a System Work
x1 x2x3 111 110 101 100 011 010 001 000
F (x1x2x3 ) 1 1 1 0 1 0 0 0

and the third and fourth working. Since many systems have built-in redundancy,
the system as a whole can work even if some components fail. Let F(x1x2...xn)
be 1 if the system described by x1x2...xn works and 0 if it fails. Then F is a
function from bit strings of length n to {0, 1}, that is, an n-variable switching function
(Example 2.4). For instance, suppose that we have a highly redundant system
with three identical components, and the system works if and only if at least two
components work. Then F is given by Table 2.4. We shall study other specific
examples of functions F in Section 3.2.4 and Exercise 22, Section 13.3. Suppose that
components in a system are equally likely to work or not to work.13 Then any two
bit strings are equally likely to be the bit string x1x2...xn describing the system.
Now we may ask: What is the probability that the system works, that is, what is
the probability that F(x1x2...xn) = 1? This is a measure of the reliability of the
system. In our example, 4 of the 8 bit strings, 111, 110, 101, and 011, signal the
event that F(x1x2x3) = 1. Since all bit strings are equally likely, the probability
that the system works is 4/8 = 1/2. For more on this approach to system reliability,
see Karp and Luby [1983] and Barlow and Proschan [1975].
The theory of reliability of systems has been studied widely for networks of
all kinds: electrical networks, computer networks, communication networks, and
transportation routing networks. For a general reference on the subject of reliability
of networks, see Hwang, Monma, and Roberts [1991] or Ball, Colbourn, and Provan
[1995].
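Since the sample space here has only 2^3 = 8 bit strings, the reliability computation can be checked by enumeration. The Python sketch below is our own illustration (not from the text); the function F implements the rule of Table 2.4, that the system works if and only if at least two components work:

```python
from itertools import product
from fractions import Fraction

# F(x1, x2, x3) = 1 iff at least two of three components work (Table 2.4).
def F(bits):
    return 1 if sum(bits) >= 2 else 0

strings = list(product([0, 1], repeat=3))   # all 8 bit strings of length 3
working = sum(F(b) for b in strings)        # bit strings with F = 1
reliability = Fraction(working, len(strings))
print(reliability)  # 1/2
```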
Before closing this section, we observe that some common statements about
probabilities of events correspond to operations on the associated subsets. Thus,
we have:
Probability that event E does not occur is the probability of E^c.
Probability that event E or event F occurs is the probability of E ∪ F.
Probability that event E and event F occur is the probability of E ∩ F.
It is also easy to see from the definition of probability that

probability of E^c = 1 − probability of E.   (2.4)

If E and F are disjoint,

probability of E ∪ F = probability of E + probability of F   (2.5)
13 In a more general analysis, we would first estimate the probability p_i that the ith component
works.
[Figure 2.5: A Venn diagram related to Equation (2.6).]
and, in general,

probability of E ∪ F = probability of E + probability of F
                       − probability of E ∩ F.   (2.6)
To see why Equation (2.6) is true, consider the Venn diagram in Figure 2.5. Notice
that when adding the probability of E and the probability of F, we are adding the
probability of the intersection of E and F twice. By subtracting the probability of
the intersection of E and F from the sum of their probabilities, Equation (2.6) is
obtained.
To illustrate these observations, let us consider the die-tossing experiment. Then
the probability of not getting a 3 is 1 minus the probability of getting a 3; that is,
it is 1 − 1/6 = 5/6. What is the probability of getting a 3 or an even number? Since
E = {3} and F = {2, 4, 6} are disjoint, (2.5) implies it is probability of E plus
probability of F = 1/6 + 3/6 = 2/3. Finally, what is the probability of getting a number
larger than 4 or an even number? The event in question is the set {2, 4, 5, 6}, which
has probability 4/6 = 2/3. Note that this is not the same as the probability of a number
larger than 4 plus the probability of an even number = 2/6 + 3/6 = 5/6. This is because
E = {5, 6} and F = {2, 4, 6} are not disjoint, for E ∩ F = {6}. Applying (2.6), we
have probability of E ∪ F = 2/6 + 3/6 − 1/6 = 2/3, which agrees with our first computation.
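Rule (2.6) can be checked directly on the die example by treating events as sets. A small Python illustration (our addition, not part of the original text):

```python
from fractions import Fraction

sample = {1, 2, 3, 4, 5, 6}
E = {5, 6}       # number larger than 4
F = {2, 4, 6}    # even number

def prob(A):
    # Equally likely outcomes: probability is |A| / |sample space|.
    return Fraction(len(A), len(sample))

# Inclusion-exclusion, Equation (2.6):
lhs = prob(E | F)
rhs = prob(E) + prob(F) - prob(E & F)
print(lhs, rhs)  # 2/3 2/3
```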
Example 2.22 Food Allergies (Example 2.5 Revisited) In Example 2.5 we
studied the switching functions associated with food allergies brought on by some
combination of four foods: tomatoes, chocolate, corn, and peanuts. We saw that
there are a total of 2^4 = 16 possible food combinations. We considered the situation
where a person develops an allergic reaction any time tomatoes are in the diet or
corn and peanuts are in the diet. What is the probability of not having such an
allergic reaction?
To find the probability in question, we first calculate the probability that there
is a reaction. Note that there is a reaction if the foods present are represented by
the bit string (1, y, z, w) or the bit string (x, y, 1, 1), where x, y, z, w are (binary)
0-1 variables. Since there are three binary variables in the first type and two binary
variables in the second type, there are 2^3 = 8 different bit strings of the first type
and 2^2 = 4 of the second type. If there were no overlap between the two types, then
(2.5) would allow us to merely add probabilities. However, there is overlap when x,
z, and w are all 1. In this case y could be 0 or 1. Thus, by (2.6), the probability of
a food reaction is

8/16 + 4/16 − 2/16 = 10/16 = 5/8.
By (2.4), the probability of no reaction is 1 − 5/8 = 3/8.
In Example 2.22, enumeration of the possible combinations would be an
efficient solution technique. However, if as few as 10 foods are considered, then
enumeration would begin to get unwieldy. Thus, the techniques developed and
used in this section are essential to avoid enumeration.
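For only four foods, the enumeration just described is easy to carry out by machine. A Python sketch (our illustration; the variable names follow the bit-string convention of the example):

```python
from itertools import product
from fractions import Fraction

# Bit string (x, y, z, w) = (tomatoes, chocolate, corn, peanuts).
# Reaction iff tomatoes present (x == 1) or corn and peanuts present (z == w == 1).
diets = list(product([0, 1], repeat=4))
reactions = sum(1 for (x, y, z, w) in diets if x == 1 or (z == 1 and w == 1))

print(Fraction(reactions, 16))      # 5/8, probability of a reaction
print(1 - Fraction(reactions, 16))  # 3/8, probability of no reaction
```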
EXERCISES FOR SECTION 2.8
1. Are the outcomes in the following experiments equally likely?
(a) A citizen of California is chosen at random and his or her town of residence is
recorded.
(b) Two drug pills and three placebos (sugar pills) are placed in a container and
one pill is chosen at random and its type is recorded.
(c) A snowflake is chosen at random and its appearance is recorded.
(d) Two fair dice are tossed and the sum of the numbers appearing is recorded.
(e) A bit string of length 3 is chosen at random and the sum of its digits is observed.
2. Calculate the probability that when a die is tossed, the outcome will be:
(a) An odd number (b) A number less than or equal to 2
(c) A number divisible by 3
3. Calculate the probability that a family of 3 children has:
(a) Exactly 2 boys (b) At least 2 boys
(c) At least 1 boy and at least 1 girl
4. If black hair, brown hair, and blond hair are equally likely (and no other hair colors
can occur), what is the probability that a family of 3 children has at least two
blondes?
5. Calculate the probability that in four tosses of a fair coin, there are at most three
heads.
6. Calculate the probability that if a DNA chain of length 5 is chosen at random, it
will have at least four A's.
7. If a card is drawn at random from a deck of 52, what is the probability that it is a
king or a queen?
8. Suppose that a card is drawn at random from a deck of 52, the card is replaced,
and then another card is drawn at random. What is the probability of getting two
kings?
9. If a bit string of length 4 is chosen at random, what is the probability of having at
least three 1's?
10. What is the probability that a bit string of length 3, chosen at random, does not
have two consecutive 0's?
11. Suppose that a system has four independent components, each of which is equally
likely to work or not to work. Suppose that the system works if and only if at least
three components work. What is the probability that the system works?
12. Repeat Exercise 11 if the system works if and only if the fourth component works
and at least two of the other components work.
13. A medical lab can operate only if at least one licensed x-ray technician is present
and at least one phlebotomist. There are three licensed x-ray technicians and two
phlebotomists, and each worker is equally likely to show up for work on a given day
or to stay home. Assuming that each worker decides independently whether or not
to come to work, what is the probability that the lab can operate?
14. Suppose that we have 10 different pairs of gloves. From the 20 gloves, 4 are chosen
at random. What is the probability of getting at least one pair?
15. Use rules (2.4)–(2.6) to calculate the probability of getting, in six tosses of a fair
coin:
(a) Two heads or three heads (b) Two heads or two tails
(c) Two heads or a head on the first toss
(d) An even number of heads or at least nine heads
(e) An even number of heads and a head on the first toss
16. Use the definition of probability to verify rules:
(a) (2.4) (b) (2.5) (c) (2.6)
17. Repeat the problem in Example 2.22 when allergic reactions occur only in diets:
(a) Containing either tomatoes and corn or chocolate and peanuts
(b) Containing either tomatoes or all three other foods
2.9 SAMPLING WITH REPLACEMENT
In the National Hockey League (NHL), a team can either win (W), lose (L), or lose
in overtime (OTL) each of its games. In an 82-game schedule, how many different
seasons14 can a particular team have? By the product rule, the answer is 3^82. There
are three possibilities for each of the 82 games: namely, W, L, or OTL. We say that
we are sampling with replacement. We are choosing an 82-permutation out of a
3-set, {W, L, OTL}, but with replacement of the elements in the set after they
are drawn. Equivalently, we are allowing repetition. Let P^R(m, r) be the number
of r-permutations of an m-set, with replacement or repetition allowed. Then the
product rule gives us

P^R(m, r) = m^r.   (2.7)

The number P(m, r) counts the number of r-permutations of an m-set if we are
sampling without replacement or repetition.
14 Do not confuse "seasons" with "records." Records refer to the final total of wins, losses, and
ties, while seasons counts the number of different ways that each record could be attained.
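Equation (2.7) is simple enough to check by enumeration for small cases. A Python sketch (added here; P_R is a hypothetical helper name, not from the text):

```python
from itertools import product

# P^R(m, r) = m**r: r-permutations of an m-set with repetition allowed.
def P_R(m, r):
    return m**r

# Sanity check by direct enumeration: sequences of length 3 over {W, L, OTL}.
outcomes = list(product(["W", "L", "OTL"], repeat=3))
assert P_R(3, 3) == len(outcomes)  # 27

print(P_R(3, 82))  # number of possible 82-game seasons, 3**82
```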
We can make a similar distinction in the case of r-combinations. Let C^R(m, r)
be the number of r-combinations of an m-set if we sample with replacement or repetition.
For instance, the 4-combinations of a 2-set {a, b} if replacement is allowed
are given by

{a, a, a, a}, {a, a, a, b}, {a, a, b, b}, {a, b, b, b}, {b, b, b, b}.

Thus, C^R(2, 4) = 5. We now state a formula for C^R(m, r).
Theorem 2.3
C^R(m, r) = C(m + r − 1, r).
We shall prove Theorem 2.3 at the end of this section. Here, let us illustrate it with
some examples.
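Python's standard library enumerates combinations with replacement directly, so Theorem 2.3 can be sanity-checked on the 2-set example above (an added illustration, not from the text):

```python
from itertools import combinations_with_replacement
from math import comb

# C^R(2, 4): 4-combinations of {a, b} with repetition allowed.
samples = list(combinations_with_replacement("ab", 4))
print(len(samples))         # 5
print(comb(2 + 4 - 1, 4))   # C(m + r - 1, r) = C(5, 4) = 5
```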
Example 2.23 The Chocolate Shoppe Suppose that there are three kinds of
truffles available at a chocolate shoppe: cherry (c), orange (o), and vanilla (v). The
store allows a customer to design a box of chocolates by choosing a dozen truffles.
How many different truffle boxes are there? We can think of having a 3-set, {c, o, v},
and picking a 12-combination from it, with replacement. Thus, the number of truffle
boxes is

C^R(3, 12) = C(3 + 12 − 1, 12) = C(14, 12) = 91.
Example 2.24 DNA Strings: Gamow's Encoding In Section 2.1 we studied
DNA strings on the alphabet {A, C, G, T} and the minimum length required for
such a string to encode for an amino acid. We noted that there are 20 different
amino acids, and showed in Section 2.1 that there are only 16 different DNA strings
of length 2, so a string of length at least 3 is required. But there are 4^3 = 64
different strings of length 3. Gamow [1954a,b] suggested that there was a relationship
between amino acids and the rhombus-shaped "holes" formed by the bases in the
double helix structure of DNA. Each rhombus (see Figure 2.6) consists of 4 bases,
with one base located at each corner of the rhombus. We will identify each rhombus
with its 4-base sequence xyzw that starts at the top of the rhombus and continues
clockwise around the rhombus. For example, the rhombus in Figure 2.6 would be
written GTTA. Due to base pairing in DNA, the fourth base, w, in the sequence is
always fixed by the second base, y, in the sequence. If the second base is T, then
the fourth base is A (and vice versa), or if the second base is G, then the fourth
base is C (and vice versa).
Gamow proposed that rhombus xyzw encodes the same amino acid as (a) xwzy
and (b) zyxw. Thus, GTTA, GATT, TTGA, and TAGT would all encode the same
amino acid. If Gamow's suggestion were correct, how many amino acids could be
encoded using 4-base DNA rhombuses? In the 4-base sequence xyzw there are only
2 choices for the y-w pair: A-T (or equivalently, T-A) and G-C (or equivalently, C-G).
Picking the other two bases is an application of Theorem 2.3. We have m = 4
objects, we wish to choose r = 2 objects, with replacement, and order doesn't
matter. This can be done in

C(m + r − 1, r) = C(5, 2) = 10
[Figure 2.6: A 4-base DNA rhombus.]
Table 2.5: Choosing a Sample of r Elements from a Set of m Elements

Order    Repetition  The sample is called:            Number of ways to
counts?  allowed?                                     choose the sample:            Reference
No       No          r-combination                    C(m, r) = m!/[r!(m − r)!]     Corollary 2.1.1
Yes      No          r-permutation                    P(m, r) = m!/(m − r)!         Eq. (2.1)
No       Yes         r-combination with replacement   C^R(m, r) = C(m + r − 1, r)   Theorem 2.3
Yes      Yes         r-permutation with replacement   P^R(m, r) = m^r               Eq. (2.7)
different ways. Thus, it would be possible to encode 2 · 10 = 20 different amino
acids using Gamow's 4-base DNA rhombuses, which is precisely the correct number.
Unfortunately, it was later discovered that this is not the way the coding works.
See Golomb [1962] for a discussion. See also Griffiths, et al. [1996].
Our discussion of sampling with and without replacement is summarized in
Table 2.5.
Example 2.25 Voting Methods In most elections in the United States, a number
of candidates are running for an office and each registered voter may vote for
the candidate of his or her choice. The winner of the election is the candidate with
the highest vote total. (There could be multiple winners in case of a tie, but then
tie-breaking methods could be used.) This voting method is called plurality voting.
Suppose that 3 juniors are running for student class president of a class of 400
students. How many different results are possible if everyone votes? By "different
results" we are referring to the number of different "patterns" of vote totals obtained
by the 3 candidates. A pattern is a sequence (a1, a2, a3) where ai is the number
of votes obtained by candidate i, i = 1, 2, 3. Thus, (6, 55, 339) is different from
(55, 339, 6), and (6, 55, 338) is not possible. [We will make no distinction among
the voters (i.e., who voted for whom), only in the vote totals for each candidate.]
Again, this is an application of Theorem 2.3. We have m = 3 objects and we wish
to choose r = 400 objects, with replacement (obviously). This can be done in

C(m + r − 1, r) = C(402, 400) = 80,601
different ways. This answer assumes that each voter voted. Exercise 10 addresses
the question of vote totals when not all voters necessarily vote.
Another voting method, called cumulative voting, can be used in elections where
more than one candidate needs to be elected. This is the case in many city council,
board of directors, and school board elections. (Cumulative voting was used to elect
the Illinois state legislature from 1870 to 1980.) With cumulative voting, voters cast
as many votes as there are open seats to be filled and they are not limited to giving
all of their votes to a single candidate. Instead, they can put multiple votes on
one or more candidates. In such an election with p candidates, q open seats, and
r voters, a total of qr votes are possible. The winners, analogous to the case of
plurality voting, are the candidates with the q largest vote totals. Again consider
the school situation of 3 candidates and 400 voters. However, now suppose that
the students are not voting to elect a junior class president but two co-presidents.
Under the cumulative voting method, how many different vote totals are possible?
If, as in the plurality example above, each voter is required to vote for at least one
candidate, then at least 400 votes and at most 2 · 400 = 800 votes must be cast.
Consider the case of j votes being cast, where 400 ≤ j ≤ 800. By Theorem 2.3,
there are

C^R(3, j) = C(3 + j − 1, j) = C(2 + j, 2)

different vote totals. Since j can range from 400 to 800, using the sum rule, there
are a total of

C(2 + 400, 2) + C(2 + 401, 2) + ⋯ + C(2 + 800, 2) = 75,228,001
different vote totals. Cumulative voting with votes not required and other voting
methods are addressed in the exercises. For a general introduction to the methods
and mathematics of voting see Aumann and Hart [1998], Brams [1994], Brams and
Fishburn [1983], Farquharson [1969], or Kelly [1987].
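Both vote-total counts in this example can be confirmed with exact integer arithmetic. A Python check (our addition, not part of the text):

```python
from math import comb

# Plurality: C(m + r - 1, r) with m = 3 candidates, r = 400 voters.
plurality = comb(402, 400)
print(plurality)  # 80601

# Cumulative voting for 2 seats: for each number j of votes cast,
# 400 <= j <= 800, there are C(2 + j, 2) vote totals; sum over j.
cumulative = sum(comb(2 + j, 2) for j in range(400, 801))
print(cumulative)  # 75228001
```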
Proof of Theorem 2.3.15 Suppose that the m-set has elements a1, a2, ..., am.
Then any sample of r of these objects can be described by listing how many a1's
are in it, how many a2's, and so on. For instance, if r = 7 and m = 5, typical samples
are a1a1a2a3a4a4a5 and a1a1a1a2a4a5a5. We can also represent these samples by
putting a vertical line after the last ai, for i = 1, 2, ..., m − 1. Thus, these two
samples would be written as a1a1 | a2 | a3 | a4a4 | a5 and a1a1a1 | a2 || a4 | a5a5,
where in the second case we have two consecutive vertical lines since there is no
a3. Now if we use this notation to describe a sample of r objects, we can omit the
subscripts. For instance, aa | aa ||| aaa represents a1a1 | a2a2 ||| a5a5a5. Then the
number of samples of r objects is just the number of different arrangements of r
letters a and m − 1 vertical lines. Such an arrangement has m + r − 1 elements, and
we determine the arrangement by choosing r positions for the a's. Hence, there are
C(m + r − 1, r) such arrangements. Q.E.D.
15 The proof may be omitted.
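The stars-and-bars argument in the proof can be mimicked in code by choosing which of the m + r − 1 positions hold letters. A Python sketch (an added illustration under the proof's setup):

```python
from itertools import combinations
from math import comb

# Stars-and-bars: arrangements of r letters 'a' and m - 1 bars '|'.
# Each arrangement is fixed by choosing r of the m + r - 1 positions for the a's.
m, r = 3, 4
arrangements = list(combinations(range(m + r - 1), r))
print(len(arrangements))   # 15
print(comb(m + r - 1, r))  # C(6, 4) = 15
```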
EXERCISES FOR SECTION 2.9
1. If replacement is allowed, find all:
(a) 5-permutations of a 2-set (b) 2-permutations of a 3-set
(c) 5-combinations of a 2-set (d) 2-combinations of a 3-set
2. Check your answers in Exercise 1 by using Equation (2.7) or Theorem 2.3.
3. If replacement is allowed, compute the number of:
(a) 7-permutations of a 3-set (b) 7-combinations of a 4-set
4. In how many ways can we choose eight concert tickets if four concerts are available?
5. In how many different ways can we choose 12 microwave desserts if 5 different
varieties are available?
6. Suppose that a codeword of length 8 consists of letters A, B, or C or digits 0 or 1,
and cannot start with 1. How many such codewords are there?
7. How many DNA chains of length 6 have at least one of each base T, C, A, and G?
Answer this question under the following assumptions:
(a) Only the number of bases of a given type matter.
(b) Order matters.
8. In an 82-game NHL season, how many different final records16 are possible:
(a) If a team can either win, lose, or overtime lose each game?
(b) If overtime losses are not possible?
9. The United Soccer League in the United States has a shootout if a game is tied at
the end of regulation. So there are wins, shootout wins, losses, or shootout losses.
How many different 12-game seasons are possible?
10. Calculate the number of different vote totals, using the plurality voting method (see
Example 2.25), when there are m candidates and n voters and each voter need not
vote.
11. Calculate the number of different vote totals, using the cumulative voting method
(see Example 2.25), when there are m candidates, n voters, l open seats, and each
voter need not vote.
2.10 OCCUPANCY PROBLEMS17
2.10.1 The Types of Occupancy Problems
In the history of combinatorics and probability theory, problems of placing balls
into cells or urns have played an important role. Such problems are called occupancy
problems. Occupancy problems have numerous applications. In classifying
16 See footnote on page 47.
17 For a quick reading of this section, it suffices to read Section 2.10.1.
Table 2.6: The Distributions of Two Distinguishable Balls to Three
Distinguishable Cells

              Distribution
          1    2    3    4    5    6    7    8    9
Cell 1    ab             a    a    b    b
Cell 2         ab        b         a         a    b
Cell 3              ab        b         a    b    a
types of accidents according to the day of the week in which they occur, the balls
are the types of accidents and the cells are the days of the week. In cosmic-ray
experiments, the balls are the particles reaching a Geiger counter and the cells are
the counters. In coding theory, the possible distributions of transmission errors on
k codewords are obtained by studying the codewords as cells and the errors as balls.
In book publishing, the possible distributions of misprints on k pages are obtained
by studying the pages as cells and the balls as misprints. In the study of irradiation
in biology, the light particles hitting the retina correspond to balls, the cells
of the retina to the cells. In coupon collecting, the balls correspond to particular
coupons, the cells to the types of coupons. We shall return in various places to
these applications. See Feller [1968, pp. 10–11] for other applications.
In occupancy problems, it makes a big difference whether or not we regard two
balls as distinguishable and whether or not we regard two cells as distinguishable.
For instance, suppose that we have two distinguishable balls, a and b, and three
distinguishable cells, 1, 2, and 3. Then the possible distributions of balls to cells
are shown in Table 2.6. There are nine distinct distributions. However, suppose
that we have two indistinguishable balls. We can label them both a. Then the
possible distributions to three distinguishable cells are shown in Table 2.7. There
are just six of them. Similarly, if the cells are not distinguishable but the balls
are, distributions 1–3 of Table 2.6 are considered the same: two balls in one cell,
none in the others. Similarly, distributions 4–9 are considered the same: two cells
with one ball, one cell with no balls. There are then just two distinct distributions.
Finally, if neither the balls nor the cells are distinguishable, then distributions 1–3
of Table 2.7 are considered the same and distributions 4–6 are as well, so there are
two distinct distributions.
It is also common to distinguish between occupancy problems where the cells
are allowed to be empty and those where they are not. For instance, if we have two
distinguishable balls and two distinguishable cells, then the possible distributions
are given by Table 2.8. There are four of them. However, if no cell can be empty,
there are only two, distributions 3 and 4 of Table 2.8.
The possible cases of occupancy problems are summarized in Table 2.9. The
notation and terminology in the fourth column, which has not yet been defined,
will be defined below. We shall now discuss the different cases.
Table 2.7: The Distributions of Two Indistinguishable Balls to Three
Distinguishable Cells

              Distribution
          1    2    3    4    5    6
Cell 1    aa             a    a
Cell 2         aa        a         a
Cell 3              aa        a    a
Table 2.8: The Distributions of Two Distinguishable Balls to Two
Distinguishable Cells

              Distribution
          1    2    3    4
Cell 1    ab        a    b
Cell 2         ab   b    a
2.10.2 Case 1: Distinguishable Balls and Distinguishable Cells
Case 1a is covered by the product rule: There are k choices of cells for each ball.
If k = 3 and n = 2, we get k^n = 9, which is the number of distributions shown in
Table 2.6. Case 1b is discussed in Section 2.10.4.
2.10.3 Case 2: Indistinguishable Balls and Distinguishable Cells18
Case 2a follows from Theorem 2.3, for we have the following result.
Theorem 2.4 The number of ways to distribute n indistinguishable balls into
k distinguishable cells is C(k + n ; 1 n).
Proof. Suppose that the cells are labeled C1 C2 : : : Ck. A distribution of balls
into cells can be summarized by listing for each ball the cell into which it goes.
Then, a distribution corresponds to a collection of n cells with repetition allowed.
For instance, in Table 2.7, distribution 1 corresponds to the collection fC1  C1g
and distribution 5 to the collection fC1 C3g. If there are four balls, the collection
fC1 C2 C3 C3g corresponds to the distribution that puts one ball into cell C1, one
ball into cell C2, and two balls into cell C3. Because a distribution corresponds to
a collection Ci1  Ci2  : : : Cin , the number of ways to distribute the balls into cells
is the same as the number of n-combinations of the k-set fC1 C2 : : : Ck g in which
repetition is allowed. This is given by Theorem 2.3 to be C(k + n ; 1 n). Q.E.D.
18 The rest of Section 2.10 may be omitted.
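Theorem 2.4 can be tested by counting nonnegative integer solutions of c1 + ⋯ + ck = n directly. A Python sketch (our addition; count_distributions is a hypothetical helper, not from the text):

```python
from itertools import product
from math import comb

# Theorem 2.4: distributing n indistinguishable balls into k distinguishable
# cells is counting nonnegative integer solutions of c1 + ... + ck = n.
def count_distributions(n, k):
    return sum(1 for cells in product(range(n + 1), repeat=k) if sum(cells) == n)

assert count_distributions(2, 3) == comb(3 + 2 - 1, 2) == 6   # Table 2.7
assert count_distributions(4, 3) == comb(3 + 4 - 1, 4) == 15
print("Theorem 2.4 checked on small cases")
```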
Table 2.9: Classification of Occupancy Problems

         Distinguishable  Distinguishable  Can cells     Number of ways to
         balls?           cells?           be empty?     place n balls into k cells:
Case 1
  1a     Yes              Yes              Yes           k^n
  1b     Yes              Yes              No            k! S(n, k)
Case 2
  2a     No               Yes              Yes           C(k + n − 1, n)
  2b     No               Yes              No            C(n − 1, k − 1)
Case 3
  3a     Yes              No               Yes           S(n, 1) + S(n, 2) + ⋯ + S(n, k)
  3b     Yes              No               No            S(n, k)
Case 4
  4a     No               No               Yes           Number of partitions of n into k or fewer parts
  4b     No               No               No            Number of partitions of n into exactly k parts
Theorem 2.4 is illustrated by Table 2.7. We have k = 3, n = 2, and C(k + n −
1, n) = C(4, 2) = 6. The result in case 2b now follows from the result in case 2a.
Given n indistinguishable balls and k distinguishable cells, we first place one ball in
each cell. There is one way to do this. It leaves n − k indistinguishable balls. We
wish to place these into k distinguishable cells, with no restriction as to cells being
nonempty. By Theorem 2.4 this can be done in

C(k + (n − k) − 1, n − k) = C(n − 1, k − 1)

ways. We now use the product rule to derive the result for case 2b of Table 2.9.
Note that C(n − 1, k − 1) is 0 if n < k. There is no way to assign n balls to k cells
with at least one ball in each cell.
2.10.4 Case 3: Distinguishable Balls and Indistinguishable Cells
Let us turn next to case 3b. Let S(n, k) be defined to be the number of ways to
distribute n distinguishable balls into k indistinguishable cells with no cell empty.
The number S(n, k) is called a Stirling number of the second kind.19 In Section 5.5.3
we show that

S(n, k) = (1/k!) ∑_{i=0}^{k} (−1)^i C(k, i) (k − i)^n.   (2.8)
19 A Stirling number of the first kind exists and is found in other contexts. See Exercise 24 of
Section 3.4.
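Equation (2.8) translates directly into code. The following Python sketch (added here, not part of the text) computes small Stirling numbers of the second kind:

```python
from math import comb, factorial

# Stirling numbers of the second kind via Equation (2.8):
# S(n, k) = (1/k!) * sum_{i=0}^{k} (-1)^i C(k, i) (k - i)^n.
def S(n, k):
    total = sum((-1)**i * comb(k, i) * (k - i)**n for i in range(k + 1))
    return total // factorial(k)   # the sum is always divisible by k!

print(S(2, 2))      # 1
print(S(4, 2))      # 7
print(2 * S(2, 2))  # case 1b with n = k = 2: k! S(n, k) = 2
```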
To illustrate this result, let us consider the case n = 2, k = 2. Then

S(n, k) = S(2, 2) = (1/2!)[2^2 − 2 · 1^2 + 0] = 1.

There is only one distribution of two distinguishable balls a and b to two indistinguishable
cells such that each cell has at least one ball: one ball in each cell.
The result in case 3a now follows from the result in case 3b by the sum rule.
For to distribute n distinguishable balls into k indistinguishable cells with empty
cells allowed, either exactly one cell is nonempty or exactly two cells are nonempty
or .... The result in case 1b now follows also, since putting n distinguishable balls
into k distinguishable cells with no cells empty can be accomplished by putting n
distinguishable balls into k indistinguishable cells with no cells empty [which can
be done in S(n, k) ways] and then labeling the cells (which can be done in k! ways).
For instance, if k = n = 2, then by our previous computation, S(2, 2) = 1. Thus,
the number of ways to put two distinguishable balls into two distinguishable cells
with no cells empty is 2!S(2, 2) = 2. This is the observation we made earlier from
Table 2.8.
2.10.5 Case 4: Indistinguishable Balls and Indistinguishable Cells
To handle cases 4a and 4b, we define a partition of a positive integer n to be a
collection of positive integers that sum to n. For instance, the integer 5 has the
partitions

{1, 1, 1, 1, 1}, {1, 1, 1, 2}, {1, 2, 2}, {1, 1, 3}, {2, 3}, {1, 4}, {5}.

Note that {3, 2} is considered the same as {2, 3}. We are interested only in what
integers are in the collection, not in their order. The number of ways to distribute
n indistinguishable balls into k indistinguishable cells is clearly the same as the
number of ways to partition the integer n into at most k parts. This gives us the
result in case 4a of Table 2.9. For instance, if n = 5 and k = 3, there are five
possible partitions, all but the first two listed above. If n = 2 and k = 3, there
are two possible partitions, {1, 1} and {2}. This corresponds in Table 2.7 to the
two distinct distributions: two cells with one ball in each or one cell with two balls
in it. The result in case 4b of Table 2.9 follows similarly: The number of ways to
distribute n indistinguishable balls into k indistinguishable cells with no cell empty
is clearly the same as the number of ways to partition the integer n into exactly k
parts. To illustrate this, if n = 2 and k = 3, there is no way.
We will explore partitions of integers briefly in the exercises and return to them
in the exercises of Sections 5.3 and 5.4, where we approach them using the method
of generating functions. For a detailed discussion of partitions, see most number
theory books, for instance, Niven [1991] or Hardy and Wright [1980]. See also Berge
[1971] or Riordan [1980].
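The partition counts needed for cases 4a and 4b can be computed by a standard recursion on the largest allowed part. A Python sketch (our addition; partitions_at_most is a hypothetical helper, not from the text):

```python
def partitions_at_most(n, k, max_part=None):
    # Number of partitions of n into at most k parts, each part <= max_part.
    if max_part is None:
        max_part = n
    if n == 0:
        return 1
    if k == 0 or max_part == 0:
        return 0
    total = 0
    # Either use one part of size max_part, or use only smaller parts.
    if max_part <= n:
        total += partitions_at_most(n - max_part, k - 1, max_part)
    total += partitions_at_most(n, k, max_part - 1)
    return total

print(partitions_at_most(5, 3))  # 5: {5}, {1,4}, {2,3}, {1,1,3}, {1,2,2}
print(partitions_at_most(2, 3))  # 2: {2} and {1,1}
```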
2.10.6 Examples
We now give a number of examples, applying the results of Table 2.9. The reader
should notice that whether or not balls or cells are distinguishable is often a matter
of judgment, depending on the interpretation and in what we are interested.
Example 2.26 Hospital Deliveries Suppose that 80 babies are born in the
month of September in a hospital and we record the day each baby is born. In how
many ways can this event occur? The babies are the balls and the days are the cells.
If we do not distinguish between 2 babies but do distinguish between days, we are
in case 2, n = 80, k = 30, and the answer is given by C(109, 80). The answer is
given by C(79, 29) if we count only the number of ways this can happen with each
day having at least 1 baby. If we do not care about what day a particular number
of babies is born but only about the number of days in which 2 babies are born, the
number in which 3 are born, and so on, we are in case 4 and we need to consider
partitions of the integer 80 into 30 or fewer parts.
Example 2.27 Coding Theory In coding theory, messages are rst encoded
into coded messages and then sent through a transmission channel. The channel
may be a telephone line or radio wave. Due to noise or weak signals, errors may
occur in the received codewords. The received codewords must then be decoded
into the (hopefully) original messages. (An introduction to cryptography with an
emphasis on coding theory is contained in Chapter 10.)
In monitoring the reliability of a transmission channel, suppose that we keep a
record of errors. Suppose that 100 coded messages are sent through a transmission
channel and 30 errors are made. In how many ways could this happen? The errors
are the balls and the codewords are the cells. It seems reasonable to disregard the
distinction between errors and concentrate on whether more errors occur during
certain time periods of the transmission (because of external factors or a higher
load period). Then codewords are distinguished. Hence, we are in case 2, and the
answer is given by C(129, 30).
Example 2.28 Gender Distribution Suppose that we record the gender of
the first 1000 people to get a degree in computer science at a school. The people
correspond to the balls and the two genders are the cells. We certainly distinguish
cells. If we distinguish individuals, that is, if we distinguish between individual
1 being male and individual 2 being male, for example, then we are in case 1.
However, if we are interested only in the number of people of each gender, we are
in case 2. In the former case, the number of possible distributions is 2^1000. In the
latter case, the number of possible distributions is given by C(1001, 1000) = 1001.
Example 2.29 Auditions A director has called back 24 actors for 8 different
touring companies of a "one-man" Broadway show. (More than one actor may
be chosen for a touring company in case of the need for a stand-in.) The actors
correspond to the balls and the touring companies to the cells. If we are interested
only in the actors who are in the same touring company, we can consider the balls
distinguishable and the cells indistinguishable. Since each touring company needs
at least one actor, no cell can be empty. Thus we are in case 3. The number of
possible distributions is given by S(24, 8).
Example 2.30 Statistical Mechanics In statistical mechanics, suppose that
we have a system of t particles. Suppose that there are p different states or levels
(e.g., energy levels), in which each of the particles can be. The state of the system
is described by giving the distribution of particles to levels. In all, if the particles
are distinguishable, there are p^t possible distributions. For instance, if we have 4
particles and 3 levels, there are 3^4 = 81 different arrangements. One of these has
particle 1 at level 1, particle 2 at level 3, particle 3 at level 2, and particle 4 at level
3. Another has particle 1 at level 2, particle 2 at level 1, and particles 3 and 4 at
level 3. If we consider any distribution of particles to levels to be equally likely,
then the probability of any given arrangement is 1/p^t. In this case we say that
the particles obey the Maxwell-Boltzmann statistics. Unfortunately, apparently no
known physical particles exhibit these Maxwell-Boltzmann statistics; the p^t different
arrangements are not equally likely. It turns out that for many different particles,
in particular photons and nuclei, a relatively simple change of assumption gives rise
to an empirically accurate model. Namely, suppose that we consider the particles
as indistinguishable. Then we are in case 2: Two arrangements of particles to levels
are considered the same if the same number of particles is assigned to the same level.
Thus, the two arrangements described above are considered the same, as they each
assign one particle to level 1, one to level 2, and two to level 3. By Theorem 2.4,
the number of distinguishable ways to arrange t particles into p levels is now given
by C(p + t ; 1 t). If we consider any distribution of particles to levels to be equally
likely, the probability of any one arrangement is
1
C(p + t ; 1 t) :
In this case, we say that the particles satisfy the Bose-Einstein statistics. A third
model in statistical mechanics arises if we consider the particles indistinguishable
but add the assumption that there can be no more than two particles at a given
level. Then we get the Fermi-Dirac statistics (see Exercise 21). See Feller !1968] or
Parzen !1992] for a more detailed discussion of all the cases we have described. 
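The three counting models in this example can be compared side by side. The following Python sketch (function names are ours) computes the number of arrangements of t particles into p levels under each set of assumptions, including the Fermi-Dirac count C(p, t) of Exercise 21:

```python
from math import comb

def maxwell_boltzmann(p, t):
    # Distinguishable particles: p independent choices of level per particle.
    return p ** t

def bose_einstein(p, t):
    # Indistinguishable particles: multisets of t levels chosen from p.
    return comb(p + t - 1, t)

def fermi_dirac(p, t):
    # Indistinguishable particles, at most one per level (Exercise 21).
    return comb(p, t)

p, t = 3, 4
print(maxwell_boltzmann(p, t))  # 81 arrangements
print(bose_einstein(p, t))      # 15, each with probability 1/15
print(fermi_dirac(3, 2))        # 3 (requires t <= p)
```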

EXERCISES FOR SECTION 2.10


Note to the reader: When it is unclear whether balls or cells are distinguishable, you should
state your interpretation, give a reason for it, and then proceed.
1. Write down all the distributions of:
(a) 3 distinguishable balls a, b, c into 2 distinguishable cells 1, 2
(b) 4 distinguishable balls a, b, c, d into 2 distinguishable cells 1, 2
(c) 2 distinguishable balls a, b into 4 distinguishable cells 1, 2, 3, 4
(d) 3 indistinguishable balls a, a, a into 2 distinguishable cells 1, 2
(e) 4 indistinguishable balls a, a, a, a into 2 distinguishable cells 1, 2
(f) 2 indistinguishable balls a, a into 4 distinguishable cells 1, 2, 3, 4
2. In Exercise 1, which of the distributions are distinct if the cells are indistinguishable?
3. Use the results of Table 2.9 to compute the number of distributions in each case
in Exercise 1 and check the result by comparing the distributions you have written
down.
4. Repeat Exercise 3 if the cells are indistinguishable.
5. Use the results of Table 2.9 to compute the number of distributions with no empty
cell in each case in Exercise 1. Check the result by comparing the distributions you
have written down.
6. Repeat Exercise 5 if the cells are indistinguishable.
7. Find all partitions of:
(a) 4 (b) 7 (c) 8
8. Find all partitions of:
(a) 9 into four or fewer parts (b) 11 into three or fewer parts
9. Compute:
(a) S(n, 0) (b) S(n, 1) (c) S(n, 2)
(d) S(n, n - 1) (e) S(n, n)
10. In checking the work of a proofreader, we look for 5 kinds of misprints in a textbook.
In how many ways can we find 12 misprints?
11. In Exercise 10, suppose that we do not distinguish the types of misprints but we do
keep a record of the page on which a misprint occurred. In how many different ways
can we find 25 misprints in 75 pages?
12. In Example 2.27, suppose that we pinpoint 30 kinds of errors and we want to find
out whether these errors tend to appear together, not caring in which codeword they
appear together. In how many ways can we find 30 kinds of errors in 100 codewords
if each kind of error is known to appear exactly once in some codeword?
13. An elevator with 9 passengers stops at 5 different floors. If we are interested only
in the passengers who get off together, how many possible distributions are there?
14. If lasers are aimed at 5 tumors, how many ways are there for 10 lasers to hit? (You
do not have to assume that each laser hits a tumor.)
15. A Geiger counter records the impact of 6 different kinds of radioactive particles over
a period of time. How many ways are there to obtain a count of 30?
16. Find the number of ways to distribute 10 customers to 7 salesmen so that each
salesman gets at least 1 customer.
17. Find the number of ways to pair off 10 students into lab partners.
18. Find the number of ways to assign 6 jobs to 4 workers so that each job gets a worker
and each worker gets at least 1 job.

19. Find the number of ways to partition a set of 20 elements into exactly 4 subsets.
20. In Example 2.30, suppose that there are 8 photons and 4 energy levels, with 2
photons at each energy level. What is the probability of this occurrence under the
assumption that the particles are indistinguishable (the Bose-Einstein case)?
21. Show that in Example 2.30, if particles are indistinguishable but no two particles
can be at the same level, then there are C(p, t) possible arrangements of t particles
into p levels. (Assume that t ≤ p.)
22. (a) Show by a combinatorial argument that
        S(n, k) = kS(n - 1, k) + S(n - 1, k - 1).
    (b) Use the result in part (a) to describe how to compute Stirling numbers of the
        second kind by a method similar to Pascal's triangle.
    (c) Apply your result in part (b) to compute S(6, 3).
23. Show by a combinatorial argument that
        S(n + 1, k) = C(n, 0)S(0, k - 1) + C(n, 1)S(1, k - 1) + ... + C(n, n)S(n, k - 1).
24. (a) If order counts in a partition, then {3, 2} is different from {2, 3}. Find the
        number of partitions of 5 if order matters.
    (b) Find the number of partitions of 5 into exactly 2 parts where order matters.
    (c) Show that the number of partitions of n into exactly k parts where order
        matters is given by C(n - 1, k - 1).
25. The Bell number B_n is the number of partitions of a set of n elements into nonempty,
indistinguishable cells. Note that
        B_n = S(n, 0) + S(n, 1) + ... + S(n, n).
Show that
        B_n = C(n - 1, 0)B_0 + C(n - 1, 1)B_1 + ... + C(n - 1, n - 1)B_{n-1}.

2.11 MULTINOMIAL COEFFICIENTS


2.11.1 Occupancy Problems with a Specified Distribution
In this section we consider the occupancy problem of distributing n distinguishable
balls into k distinguishable cells. In particular, we consider the situation where we
distribute n1 balls into the first cell, n2 into the second cell, ..., nk into the kth
cell. Let
    C(n; n1, n2, ..., nk)
denote the number of ways this can be done. This section is devoted to the study
of the number C(n; n1, n2, ..., nk), which is sometimes also written as

    (      n       )
    (n1, n2, ..., nk)

and called the multinomial coefficient.
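A multinomial coefficient can be computed by the product rule: choose n1 of the n balls for the first cell, then n2 of the remaining balls for the second cell, and so on. A minimal Python sketch (the function name is our own):

```python
from math import comb

def multinomial(n, parts):
    """C(n; n1, ..., nk): choose n1 balls for cell 1, then n2 of the
    remaining balls for cell 2, and so on (product rule)."""
    assert sum(parts) == n
    total, remaining = 1, n
    for ni in parts:
        total *= comb(remaining, ni)
        remaining -= ni
    return total

# The registration problem of Example 2.31: 11 students into
# sections of sizes 3, 4, 4, and 0.
print(multinomial(11, [3, 4, 4, 0]))  # 11550
```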
Example 2.31 Campus Registration The university registrar's office is having
a problem. It has 11 new students to squeeze into 4 sections of an introductory
course: 3 in the first, 4 each in the second and third, and 0 in the fourth (that section
is already full). In how many ways can this be done? The answer is C(11; 3, 4, 4, 0).
Now there are C(11, 3) choices for the first section; for each of these there are C(8, 4)
choices for the second section; for each of these there are C(4, 4) choices for the third
section; for each of these there are C(0, 0) choices for the fourth section. Hence, by
the product rule, the number of ways to assign sections is

    C(11; 3, 4, 4, 0) = C(11, 3) × C(8, 4) × C(4, 4) × C(0, 0)
                      = (11!/3!8!)(8!/4!4!)(4!/4!0!)(0!/0!0!) = 11!/3!4!4!,
since 0! = 1. Of course, C(0, 0) always equals 1, so the answer is equivalent to
C(11; 3, 4, 4). Additionally, C(4, 4) = 1, so the answer is also equivalent to C(11, 3)
× C(8, 4). The reason for this is that once the 3 students for the first section and 4
students for the second section have been chosen, there is only one way to choose
the remaining 4 for the third section.
Note that if section assignments for 11 students are made at random, there are
4^11 possible assignments: For each student, there are 4 choices of section. Hence,
the probability of having 3 students in the first section, 4 each in the second and
third sections, and 0 in the fourth is given by

    C(11; 3, 4, 4, 0)/4^11.
In general, suppose that Pr(n; n1, n2, ..., nk) denotes the probability that if n balls
are distributed at random into k cells, there will be ni balls in cell i, i = 1, 2, ..., k.
Then

    Pr(n; n1, n2, ..., nk) = C(n; n1, n2, ..., nk)/k^n.

(Why?) Note that when calculating the multinomial coefficient, the acknowledgment
of empty cells does not affect the calculation. This is because

    C(n; n1, n2, ..., nj, 0, 0, ..., 0) = C(n; n1, n2, ..., nj).

However, the probability of a multinomial distribution is affected by empty cells, as
the denominator is based on the number of cells, both empty and nonempty.
Continuing with our example, suppose that suddenly, spaces in the fourth section
become available. The registrar's office now wishes to put 3 people each into the
first, second, and third sections, and 2 into the fourth. In how many ways can this
be done? Of the 11 students, 3 must be chosen for the first section; of the remaining
8 students, 3 must be chosen for the second section; of the remaining 5 students, 3
must be chosen for the third section; finally, the remaining 2 must be put into the
fourth section. The total number of ways of making the assignments is

    C(11; 3, 3, 3, 2) = C(11, 3) × C(8, 3) × C(5, 3) × C(2, 2)
                      = (11!/3!8!)(8!/3!5!)(5!/3!2!)(2!/2!0!) = 11!/3!3!3!2!.
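The probability formula Pr(n; n1, ..., nk) = C(n; n1, ..., nk)/k^n can be spot-checked by enumerating all k^n equally likely assignments for a small case. A Python sketch (function names are ours), using a smaller analogue of the registration problem:

```python
from itertools import product
from math import factorial

def multinomial_prob(n, parts, k):
    """Pr(n; n1, ..., nk) = C(n; n1, ..., nk) / k^n, with k cells."""
    coeff = factorial(n)
    for ni in parts:
        coeff //= factorial(ni)
    return coeff / k**n

def enumerated_prob(n, parts, k):
    """Fraction of the k^n equally likely assignments that put
    exactly parts[i] balls into cell i."""
    hits = sum(
        1
        for assignment in product(range(k), repeat=n)
        if [assignment.count(c) for c in range(k)] == list(parts)
    )
    return hits / k**n

# 5 students distributed at random among 3 sections; we want 2, 2, 1.
print(multinomial_prob(5, [2, 2, 1], 3))  # 30/243
print(enumerated_prob(5, [2, 2, 1], 3))   # agrees with the formula
```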
