QUALITY ENGINEERING
USING
ROBUST DESIGN
MADHAV S. PHADKE
AT&T Bell Laboratories
P T R Prentice Hall, Englewood Cliffs, New Jersey 07632

Library of Congress Cataloging-in-Publication Data
Phadke, Madhav Shridhar.
Quality engineering using robust design / Madhav S. Phadke.
p. cm.
Includes index.
ISBN 0-13-745167-9
1. Engineering design. 2. Computer-aided design. 3. UNIX
(Computer operating system) 4. Integrated circuits--Very large
scale integration. I. Title.
TA174.P49 1989
620'.0042'0285--dc20  89-3927
CIP
© 1989 by AT&T Bell Laboratories
Published by PT R Prentice-Hall, Inc.
A Simon & Schuster Company
Englewood Cliffs, New Jersey 07632
All rights reserved. No part of this book may be
reproduced, in any form or by any means,
without permission in writing from the publisher.
Printed in the United States of America
ISBN 0-13-745167-9
Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore
Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

To my parents, and Maneesha, Kedar, and Lata.

CONTENTS
Foreword
Preface
Acknowledgments

CHAPTER 1  INTRODUCTION
1.1  A Historical Perspective
1.2  What Is Quality?
1.3  Elements of Cost
1.4  Fundamental Principle
1.5  Tools Used in Robust Design
1.6  Applications and Benefits of Robust Design
1.7  Organization of the Book
1.8  Summary

CHAPTER 2  PRINCIPLES OF QUALITY ENGINEERING
2.1  Quality Loss Function—The Fraction Defective Fallacy
2.2  Quadratic Loss Function
2.3  Noise Factors—Causes of Variation
2.4  Average Quality Loss
2.5  Exploiting Nonlinearity
2.6  Classification of Parameters: P Diagram
2.7  Optimization of Product and Process Design
2.8  Role of Various Quality Control Activities
2.9  Summary

CHAPTER 3  MATRIX EXPERIMENTS USING ORTHOGONAL ARRAYS
3.1  Matrix Experiment for a CVD Process
3.2  Estimation of Factor Effects
3.3  Additive Model for Factor Effects
3.4  Analysis of Variance
3.5  Prediction and Diagnosis
3.6  Summary

CHAPTER 4  STEPS IN ROBUST DESIGN
4.1  The Polysilicon Deposition Process and Its Main Function
4.2  Noise Factors and Testing Conditions
4.3  Quality Characteristics and Objective Functions
4.4  Control Factors and Their Levels
4.5  Matrix Experiment and Data Analysis Plan
4.6  Conducting the Matrix Experiment
4.7  Data Analysis
4.8  Verification Experiment and Future Plan
4.9  Summary

CHAPTER 5  SIGNAL-TO-NOISE RATIOS
5.1  Optimization for Polysilicon Layer Thickness Uniformity
5.2  Evaluation of Sensitivity to Noise
5.3  S/N Ratios for Static Problems
5.4  S/N Ratios for Dynamic Problems
5.5  Analysis of Ordered Categorical Data
5.6  Summary

CHAPTER 6  ACHIEVING ADDITIVITY
6.1  Guidelines for Selecting Quality Characteristics
6.2  Examples of Quality Characteristics
6.3  Examples of S/N Ratios
6.4  Selection of Control Factors
6.5  Role of Orthogonal Arrays
6.6  Summary

CHAPTER 7  CONSTRUCTING ORTHOGONAL ARRAYS
7.1  Counting Degrees of Freedom
7.2  Selecting a Standard Orthogonal Array
7.3  Dummy Level Technique
7.4  Compound Factor Method
7.5  Linear Graphs and Interaction Assignment
7.6  Modification of Linear Graphs
7.7  Column Merging Method
7.8  Branching Design
7.9  Strategy for Constructing an Orthogonal Array
7.10  Comparison with the Classical Statistical Experiment Design
7.11  Summary

CHAPTER 8  COMPUTER AIDED ROBUST DESIGN
8.1  Differential Op-Amp Circuit
8.2  Description of Noise Factors
8.3  Methods of Simulating the Variation in Noise Factors
8.4  Orthogonal Array Based Simulation of Variation in Noise Factors
8.5  Quality Characteristic and S/N Ratio
8.6  Optimization of the Design
8.7  Tolerance Design
8.8  Reducing the Simulation Effort
8.9  Analysis of Nonlinearity
8.10  Selecting an Appropriate S/N Ratio
8.11  Summary

CHAPTER 9  DESIGN OF DYNAMIC SYSTEMS
9.1  Temperature Control Circuit and Its Function
9.2  Signal, Control, and Noise Factors
9.3  Quality Characteristics and S/N Ratios
9.4  Optimization of the Design
9.5  Iterative Optimization
9.6  Summary

CHAPTER 10  TUNING COMPUTER SYSTEMS FOR HIGH PERFORMANCE
10.1  Problem Formulation
10.2  Noise Factors and Testing Conditions
10.3  Quality Characteristic and S/N Ratio
10.4  Control Factors and Their Alternate Levels
10.5  Design of the Matrix Experiment and the Experimental Procedure
10.6  Data Analysis and Verification Experiments
10.7  Standardized S/N Ratio
10.8  Related Applications
10.9  Summary

CHAPTER 11  RELIABILITY IMPROVEMENT
11.1  Role of S/N Ratios in Reliability Improvement
11.2  The Routing Process
11.3  Noise Factors and Quality Characteristics
11.4  Control Factors and Their Levels
11.5  Design of the Matrix Experiment
11.6  Experimental Procedure
11.7  Data Analysis
11.8  Survival Probability Curves
11.9  Summary

APPENDIX A  ORTHOGONALITY OF A MATRIX EXPERIMENT
APPENDIX B  UNCONSTRAINED OPTIMIZATION
APPENDIX C  STANDARD ORTHOGONAL ARRAYS AND LINEAR GRAPHS
REFERENCES
INDEX

FOREWORD
The main task of a design engineer is to build in the function specified by the product
planning people at a competitive cost. An engineer knows that all kinds of functions
are energy transformations. Therefore, the product designer must identify what is
input, what is output, and what is ideal function while developing a new product. It is
important to make the product’s function as close to the ideal function as possible.
Therefore, it is very important to measure correctly the distance of the product's
performance from the ideal function. This is the main role of quality engineering. In
order to measure the distance, we have to consider the following problems:
1. Identify signal and noise space
2. Select several points from the space
3. Select an adequate design parameter to observe the performance
4. Consider possible calibration or adjustment method
5. Select an appropriate measurement related with the mean distance
As most of those problems require engineering knowledge, a book on quality engineer-
ing must be written by a person who has enough knowledge of engineering.
Dr. Madhav Phadke, a mechanical engineer, has worked at AT&T Bell Labora-
tories for many years and has extensive experience in applying the Robust Design
method to problems from diverse engineering fields. He has made many eminent and
pioneering contributions in quality engineering, and he is one of the best qualified per-
sons to author a book on quality engineering.
The greatest strength of this book is the case studies. Dr. Phadke presents four
real instances where the Robust Design method was used to improve the quality and
cost of products. Robust Design is universally applicable to all engineering fields.
You will be able to use these case studies to improve the quality and cost of your
products.
This is the first book on quality engineering written in English by an engineer.
The method described here has been applied successfully in many companies in Japan,
the USA, and other countries. I recommend this book for all engineers who want to apply
experimental design for actual product design.

G. Taguchi

PREFACE
Designing high-quality products and processes at low cost is an economic and technol-
ogical challenge to the engineer. A systematic and efficient way to meet this challenge
is a new method of design optimization for performance, quality, and cost. The
method, called Robust Design, consists of
1. Making product performance insensitive to raw material variation, thus allowing
the use of low grade material and components in most cases,
2. Making designs robust against manufacturing variation, thus reducing labor and
material cost for rework and scrap,
3. Making the designs least sensitive to the variation in operating environment, thus
improving reliability and reducing operating cost, and
4. Using a new structured development process so that engineering time is used
most productively.
All engineering designs involve setting values of a large number of decision vari-
ables. Technical experience together with experiments, through prototype hardware
models or computer simulations, are needed to come up with the most advantageous
decisions about these variables. Studying these variables one at a time or by trial and
error is the common approach to the decision process. This leads to either a very long
and expensive time span for completing the design or premature termination of the
design process so that the product design is nonoptimal. This can mean missing the
market window and/or delivering an inferior quality product at an inflated cost.
The Robust Design method uses a mathematical tool called orthogonal arrays to
study a large number of decision variables with a small number of experiments. It also
uses a new measure of quality, called signal-to-noise (S/N) ratio, to predict the quality
from the customer's perspective. Thus, the most economical product and process
design from both manufacturing and customers’ viewpoints can be accomplished at the
smallest, affordable development cost. Many companies, big and small, high-tech and
low-tech, have found the Robust Design method valuable in making high-quality prod-
ucts available to customers at a low competitive price while still maintaining an accept-
able profit margin.
This book will be useful to practicing engineers and engineering managers from
all disciplines. It can also be used as a text in a quality engineering course for seniors
and first-year graduate students. The method is explained through a series of real case
studies, thus making it easy for the readers to follow the method without the burden of
learning detailed theory. At AT&T, several colleagues and I have developed a two and
a half day course on this topic. My experience in teaching the course ten times has
convinced me that the case studies approach is the best one to communicate how to use
the method in practice. The particular case studies used in this book relate to the fabri-
cation of integrated circuits, circuit design, computer tuning, and mechanical routing.
Although the book is written primarily for engineers, it can also be used by stat-
isticians to study the wide range of applications of experimental design in quality
engineering. This book differs from the available books on statistical experimental
design in that it focuses on the engineering problems rather than on the statistical
theory. Only those statistical ideas that are relevant for solving the broad class of pro-
duct and process design problems are discussed in the book.
Chapters 1 through 7 describe the necessary theoretical and practical aspects of
the Robust Design method. The remaining chapters show a variety of applications
from different engineering disciplines. The best way for readers to use this book is,
after reading each section, to determine how the concepts apply to their projects. My
experience in teaching the method has revealed that many engineers like to see an
application of the method in their own field. Chapters 8 through 11 describe case stud-
ies from different engineering fields. It is hoped that these case studies will help
readers see the breadth of the applicability of the Robust Design method and assist
them in their own applications.
Madhav S. Phadke
AT&T Bell Laboratories
Holmdel, N.J.ACKNOWLEDGMENTS
I had the greatest fortune to learn the Robust Design methodology directly from its
founder, Professor Genichi Taguchi. It is with the deepest gratitude that I acknowledge
his inspiring work. My involvement in the Robust Design method began when Dr.
Roshan Chaddha asked me to host Professor Taguchi’s visit to AT&T Bell Labora-
tories in 1980. I thank Dr. Chaddha (Bellcore, formerly with AT&T Bell Labs) for the
invaluable encouragement he gave me during the early applications of the method in
AT&T and also while writing this book. I also received valuable support and
encouragement from Dr. E. W. Hinds, Dr. A. B. Godfrey, Dr. R. E. Kerwin, and Mr.
E. Fuchs in applying the Robust Design method to many different engineering fields
which led to deeper understanding and enhancement of the method.
Writing a book of this type needs a large amount of time. I am indebted to Ms.
Cathy Savolaine for funding the project. I also thank Mr. J. V. Bodycomb and Mr.
Larry Bernstein for supporting the project.
The case studies used in this book were conducted through collaboration with
many colleagues, Mr. Gary Blaine, Mr. Dave Chrisman, Mr. Joe Leanza, Dr. T. W.
Pao, Mr. C. S. Sherrerd, Dr. Peter Hey, and Mr. Paul Sherry. I am grateful to them for
allowing me to use the case studies in the book.
I also thank my colleagues, Mr. Don Speeney, Dr. Raghu Kackar, and Dr. Mike
Grieco, who worked with me on the first Robust Design case study at AT&T.
Through this case study, which resulted in huge improvements in the window photo-
lithography process used in integrated circuits fabrication, I gained much insight into
the Robust Design method.
I thank Mr. Rajiv Keny for numerous discussions on the organization of the
book. A number of my colleagues read the draft of the book and provided me with
valuable comments. Some of the people who provided the comments are: Dr. Don
Clausing (M.I.T.), Dr. A. M. Joglekar (Honeywell), Dr. C. W. Hoover, Jr. (Polytechnic
University), Dr. Jim Pennell (IDA), Dr. Steve Eick, Mr. Don Speeney, Dr. M.
Daneshmand, Dr. V. N. Nair, Dr. Mike Luvalle, Dr. Ajit S. Manocha, Dr. V. V. S.
Rana, Ms. Cathy Hudson, Dr. Miguel Perez, Mr. Chris Sherrerd, Dr. M. H. Sherif, Dr.
Helen Hwang, Dr. Vasant Prabhu, Ms. Valerie Partridge, Dr. Sachio Nakamura, Dr. K.
Dehnad, and Dr. Gary Ulrich. I thank them all for their generous help in improving
the content and readability of the book. I also thank Mr. Akira Tomishima
(Yamatake-Honeywell), Dr. Mohammed Hamami, and Mr. Bruce Linick for helpful
discussions on specific topics in the book. Thanks are also due to Mr. Yuin Wu (ASI)
for valuable general discussions.
I very much appreciate the editorial help I received from Mr. Robert Wright and
Ms. April Cormaci through the various stages of manuscript preparation. Also, I thank
Ms. Eve Engel for coordinating text processing and the artwork during manuscript
preparation.
The text of this volume was prepared using the UNIX* operating system, 5.2.6a,
and a LINOTRONIC® 300 was used to typeset the manuscript. Mr. Wright was
responsible for designing the book format and coordinating production. Mr. Don Han-
kinson, Ms. Mari-Lynn Hankinson, and Ms. Marilyn Tomaino produced the final illus-
trations and were responsible for the layout. Ms. Kathleen Attwooll, Ms. Sharon Mor-
gan, and several members of the Holmdel Text Processing Center provided electronic
text processing.

Chapter 1
INTRODUCTION
The objective of engineering design, a major part of research and development (R&D),
is to produce drawings, specifications, and other relevant information needed to
manufacture products that meet customer requirements. Knowledge of scientific
phenomena and past engineering experience with similar product designs and manufac-
turing processes form the basis of the engineering design activity (see Figure 1.1).
However, a number of new decisions related to the particular product must be made
regarding product architecture, parameters of the product design, the process architec-
ture, and parameters of the manufacturing process. A large amount of engineering
effort is consumed in conducting experiments (either with hardware or by simulation)
to generate the information needed to guide these decisions. Efficiency in generating
such information is the key to meeting market windows, keeping development and
manufacturing costs low, and having high-quality products. Robust Design is an
engineering methodology for improving productivity during research and development
so that high-quality products can be produced quickly and at low cost.
This chapter gives an overview of the basic concepts underlying the Robust
Design methodology:
* Section 1.1 gives a brief historical background of the method.
* Section 1.2 defines the term quality as it is used in this book.
* Section 1.3 enumerates the basic elements of the cost of a product.
* Section 1.4 describes the fundamental principle of the Robust Design methodol-
ogy with the help of a manufacturing example.
* Section 1.5 briefly describes the major tools used in Robust Design.
* Section 1.6 presents some representative problems and the benefits of using the
Robust Design method in addressing them.
* Section 1.7 gives a chapter-by-chapter outline of the rest of the book.
* Section 1.8 summarizes the important points of this chapter.
In the subsequent chapters, we describe Robust Design concepts in detail and, through
case studies, we show how to apply them.
[Figure 1.1 Block diagram of R&D activity, relating customer's requirements (desired function, usage environment, failure cost), scientific knowledge (understanding of natural phenomena), and engineering knowledge (experience with previous designs and manufacturing processes) to product design and manufacturing at low cost and high quality.]
1.1 A HISTORICAL PERSPECTIVE
When Japan began its reconstruction efforts after World War II, it faced an acute short-
age of good-quality raw material, high-quality manufacturing equipment and skilled
engineers. The challenge was to produce high-quality products and continue to
improve the quality under those circumstances. The task of developing a methodology
to meet the challenge was assigned to Dr. Genichi Taguchi, who at that time was a
manager in charge of developing certain telecommunications products at the Electrical
Communications Laboratories (ECL) of Nippon Telephone and Telegraph Company
(NTT). Through his research in the 1950s and the early 1960s, Dr. Taguchi developed
the foundations of Robust Design and validated its basic philosophies by applying
them in the development of many products. In recognition of this contribution,
Dr. Taguchi received the individual Deming Award in 1962, which is one of the
highest recognitions in the quality field.
The Robust Design method can be applied to a wide variety of problems. The
application of the method in electronics, automotive products, photography and many
other industries has been an important factor in the rapid industrial growth and the sub-
sequent domination of international markets in these industries by Japan.
Robust Design draws on many ideas from statistical experimental design to plan
experiments for obtaining dependable information about variables involved in making
engineering decisions. The science of statistical experimental design originated with
the work of Sir Ronald Fisher in England in the 1920s. Fisher founded the basic prin-
ciples of experimental design and the associated data-analysis technique called analysis
of variance (ANOVA) during his efforts to improve the yield of agricultural crops.
The theory and applications of experimental design and the related technique of
response surface methodology have been advanced by many statistical researchers.
Today, many excellent textbooks on this subject exist, for example, Box, Hunter and
Hunter [B3], Box and Draper [B2], Hicks [H2], John [J2], Raghavarao [R1], and
Kempthorne [K4]. Various types of matrices are used for planning experiments to
study several decision variables simultaneously. Among them, Robust Design makes
heavy use of the orthogonal arrays, whose use for planning experiments was first pro-
posed by Rao [R2].
Robust Design adds a new dimension to statistical experimental design—it
explicitly addresses the following concerns faced by all product and process designers:
* How to reduce economically the variation of a product’s function in the
customer's environment. (Note that achieving a product’s function consistently
on target maximizes customer satisfaction.)
* How to ensure that decisions found to be optimum during laboratory experiments
will prove to be so in manufacturing and in customer environments.
In addressing these concerns, Robust Design uses the mathematical formalism of sta-
tistical experimental design, but the thought process behind the mathematics is different
in many ways. The answers provided by Robust Design to the two concerns listed
above make it a valuable tool for improving the productivity of the R&D activity.
The Robust Design method is still evolving. With the active research being car-
ried out in the United States, Japan, and other countries, it is expected that the applica-
tions of the method and the method itself will grow rapidly in the coming decade.
1.2 WHAT IS QUALITY?
Because the word quality means different things to different people (see, for example,
Juran [J3], Deming [D2], Crosby [C5], Garvin [G1], and Feigenbaum [F1]), we need to
define its use in this book. First, let us define what we mean by the ideal quality,
which can serve as a reference point for measuring the quality level of a product. The
ideal quality a customer can expect is that every product delivers the target perfor-
mance each time the product is used, under all intended operating conditions, and
throughout its intended life, with no harmful side effects. Note that the traditional con-
cepts of reliability and dependability are part of this definition of quality. In specific
situations, it may be impossible to produce a product with ideal quality. Nonetheless,
ideal quality serves as a useful reference point for measuring the quality level.
The following example helps clarify the definition of ideal quality. People buy
automobiles for different purposes. Some people buy them to impress their friends
while others buy them to show off their social status. To satisfy these diverse pur-
poses, there are different types (species) of cars—sports cars, luxury cars, etc.—on the
market. For any type of car, the buyer always wants the automobile to provide reliable
transportation. Thus, for each type of car, an ideal quality automobile is one that
works perfectly each time it is used (on hot summer days and cold winter days),
throughout its intended life (not just the warranty life) and does not pollute the atmo-
sphere.
When a product’s performance deviates from the target performance, its quality
is considered inferior. The performance may differ from one unit to another or from
one environmental condition to another, or it might deteriorate before the expiration of
the intended life of the product. Such deviation in performance causes loss to the user
of the product, the manufacturer of the product, and, in varying degrees, to the rest of
the society as well. Following Taguchi, we measure the quality of a product in terms
of the total loss to society due to functional variation and harmful side effects. Under
the ideal quality, the loss would be zero; the greater the loss, the lower the quality.
In the automobile example, if a car breaks down on the road, the driver would, at
the least, be delayed in reaching his or her destination. The disabled car might be the
cause of traffic jams or accidents. The driver might have to spend money to have the
car towed. If the car were under warranty, the manufacturer would have to pay for
repairs. The concept of quality loss includes all these costs, not just the warranty cost.
Quantifying the quality loss is difficult and is discussed in Chapter 2.
Note that the definition of quality of a product can be easily extended to
processes as well as services. As a matter of fact, the entire discussion of the Robust
Design method in this book is equally applicable for processes and services, though for
simplicity, we do not state so each time.
1.3 ELEMENTS OF COST
Quality at what cost? Delivering a high-quality product at low cost is an interdisci-
plinary problem involving engineering, economics, statistics, and management. The
three main categories of cost one must consider in delivering a product are:
1. Operating Cost. Operating cost consists of the cost of energy needed to operate
the product, environmental control, maintenance, inventory of spare parts and
units, etc. Products made by different manufacturers can have different energy
costs. If a product is sensitive to temperature and humidity, then elaborate and
costly air conditioning and heating units are needed. A high failure rate of a
product causes large maintenance costs and a costly inventory of spare units.
A manufacturer can greatly reduce the operating cost by designing the product
robust—that is, minimizing the product’s sensitivity to environmental and usage
conditions, manufacturing variation, and deterioration of parts.
2. Manufacturing Cost. Important elements of manufacturing cost are equipment,
machinery, raw materials, labor, scrap, rework, etc. In a competitive environ-
ment, it is important to keep the unit manufacturing cost (umc) low by using
low-grade material, employing less-skilled workers, and using less-expensive
equipment, and at the same time maintain an appropriate level of quality. This is
possible by designing the product robust, and designing the manufacturing pro-
cess robust—that is, minimizing the process’ sensitivity to manufacturing distur-
bances.
3. R&D Cost. The time taken to develop a new product plus the amount of
engineering and laboratory resources needed are the major elements of R&D
cost. The goal of R&D activity is to keep the umc and operating cost low.
Robust Design plays an important role in achieving this goal because it improves
the efficiency of generating information needed to design products and processes,
thus reducing development time and resources needed for development.
Note that the manufacturing cost and R&D cost are incurred by the producer and
then passed on to the customer through the purchase price of the product. The operat-
ing cost, which is also called usage cost, is borne directly by the customer and it is
directly related to the product’s quality. From the customer's point of view, the pur-
chase price plus the operating cost determine the economics of satisfying the need for
which the product is bought. Higher quality means lower operating cost and vice
versa, Robust Design is a systematic method for keeping the producer's cost low
while delivering a high-quality product, that is, while keeping the operating cost low.
1.4 FUNDAMENTAL PRINCIPLE
The key idea behind Robust Design is illustrated by the experience of Ina Tile Com-
pany, described in detail in Taguchi and Wu [T7]. During the late 1950s, Ina Tile
Company in Japan faced the problem of high variability in the dimensions of the tiles
it produced [see Figure 1.2(a)]. Because screening (rejecting those tiles outside
specified dimensions) was an expensive solution, the company assigned a team of
expert engineers to investigate the cause of the problem. The team’s analysis showed
that the tiles at the center of the pile inside the kiln [see Figure 1.2(b)] experienced
lower temperature than those on the periphery. This nonuniformity of temperature dis-
tribution proved to be the cause of the nonuniform tile dimensions. The team reported
that it would cost approximately half a million dollars to redesign and build a kiln in
which all the tiles would receive uniform temperature distribution. Although this alter-
native was less expensive than screening, it was still too costly.
The team then brainstormed and defined a number of process parameters that
could be changed easily and inexpensively. After performing a small set of well-
planned experiments according to Robust Design methodology, the team concluded that
increasing the lime content of the clay from 1 percent to 5 percent would greatly
reduce the variation of the tile dimensions. Because lime was the least expensive
ingredient, the cost implication of this change was also favorable.
Thus, the problem of nonuniform tile dimensions was solved by minimizing the
effect of the cause of the variation (nonuniform temperature distribution) without con-
trolling the cause itself (the kiln design). As illustrated by this example, the fundamen-
tal principle of Robust Design is to improve the quality of a product by minimizing the
effect of the causes of variation without eliminating the causes. This is achieved by
optimizing the product and process designs to make the performance minimally sensi-
tive to the various causes of variation. This is called parameter design. However,
parameter design alone does not always lead to sufficiently high quality. Further
improvement can be obtained by controlling the causes of variation where economi-
cally justifiable, typically by using more expensive equipment, higher grade com-
ponents, better environmental controls, etc., all of which lead to higher product cost, or
operating cost, or both. The benefits of improved quality must justify the added prod-
uct cost.
1.5 TOOLS USED IN ROBUST DESIGN
A great deal of engineering time is spent generating information about how different
design parameters affect performance under different usage conditions. Robust Design
methodology serves as an "amplifier"—that is, it enables an engineer to generate infor-
mation needed for decision-making with half (or even less) the experimental effort.
There are two important tasks to be performed in Robust Design:
1. Measurement of Quality During Design/Development. We want a leading indi-
cator of quality by which we can evaluate the effect of changing a particular
design parameter on the product's performance.
2. Efficient Experimentation to Find Dependable Information about the Design
Parameters. It is critical to obtain dependable information about the design
parameters so that design changes during manufacturing and customer use can be
avoided. Also, the information should be obtained with minimum time and
resources.
The estimated effects of design parameters must be valid even when other param-
eters are changed during the subsequent design effort or when designs of related sub-
systems change. This can be achieved by employing the signal-to-noise (S/N) ratio to
measure quality and orthogonal arrays to study many design parameters simultane-
ously. These tools are described later in this book.
[Figure 1.2 Tile manufacturing example: (a) probability distribution of tile dimensions; (b) schematic diagram of the kiln.]
1.6 APPLICATIONS AND BENEFITS OF ROBUST DESIGN
The Robust Design method is in use in many areas of engineering throughout the
United States. For example, AT&T's use of Robust Design methodology has led to
improvement of several processes in very large scale integrated (VLSI) circuit fabrica-
tion used in the manufacture of 1-megabit and 256-kilobit memory chips, 32-bit pro-
cessor chips, and other products. Some of the VLSI applications are:
* The window photolithography application (documented in Phadke, Kackar,
Speeney, and Grieco [P5]) was the first application in the United States that
demonstrated the power of Taguchi’s approach to quality and cost improvement
through robust process design. In particular, the benefits of the application were:
— 4-fold reduction in process variance
— 3-fold reduction in fatal defects
— 2-fold reduction in processing time (because the process became stable,
allowing time-consuming inspection to be dropped)
— Easy transition of design from research to manufacturing
— Easy adaptation of the process to finer-line technology (adaptation from
3.5-micron to 2.5-micron technology), which is typically a very difficult
problem.
* The aluminum etching application originated from a belief that poor photoresist
print quality leads to line width loss and to undercutting during the etching pro-
cess. By making the etching process insensitive to photoresist profile variation
and other sources of variation, the visual defects were reduced from 80 percent to
15 percent. Moreover, the etching step could then tolerate the variation in the
photoresist profile.
* The reactive ion etching of tantalum silicide (described in Katz and Phadke
[K3]), used to give highly nonuniform etch quality, so only 12 out of 18 possible
wafer positions could be used for production. After optimization, 17 wafer posi-
tions became usable—a hefty 40 percent increase in machine utilization. Also,
the efficiency of the orthogonal array experimentation allowed this project to be
completed by the 20-day deadline. In this case, $1.2 million was saved in equip-
ment replacement costs not including the expense of disruption on the factory
floor.
* The polysilicon deposition process had between 10 and 5000 surface defects per
unit area. As such, it represented a serious road block in advancing to line
widths smaller than 1.75 micron. Six process parameters were investigated with
18 experiments leading to consistently less than 10 surface defects per unit area.
As a result, the scrap rate was reduced significantly and it became possible to
process smaller line widths. This case study is described in detail in Chapter 4.
Other AT&T applications include:
* The router bit life-improvement project (described in Chapter 11 and Phadke
[P3]) led to a 2-fold to 4-fold increase in the life of router bits used in cutting
printed wiring boards. The project illustrates how reliability or life improvement
projects can be organized to find the best settings of the routing process parame-
ters with a very small number of samples. The number of samples needed in
this approach is very small, yet it can give valuable information about how each
parameter changes the survival probability curve (change in survival probability
as a function of time).
* In the differential operational amplifier circuit optimization application
(described in Chapter 8 and Phadke [P3]), a 40-percent reduction in the root
mean square (rms) offset voltage was realized by simply finding new nominal
values for the circuit parameters. This was done by reducing sensitivity to all
tolerances and temperature, rather than reducing tolerances, which could have
increased manufacturing cost.
* The Robust Design method was also used to find optimum proportions of
ingredients for making water-soluble flux. By simultaneous study of the parame-
ters for the wave soldering process and the flux composition, the defect rate was
reduced by 30 to 40 percent (see Lin and Kackar [L3]).
* Orthogonal array experiments can be used to tune hardware/software systems.
By simultaneous study of three hardware and six software parameters, the
response time of the UNIX operating system was reduced 60 percent for a partic-
ular set of load conditions experienced by the machine (see Chapter 10 and Pao,
Phadke, and Sherrerd [P1]).
Under the leadership of American Supplier Institute and Ford Motor Company, a
number of automotive suppliers have achieved quality and cost improvement through
Robust Design. These applications include improvements in metal casting, injection
molding of plastic parts, wave soldering of electronic components, speedometer cable
design, integrated circuit chip bonding, and picture tube lens coating. Many of these
applications are documented in the Proceedings of Supplier Symposia on Taguchi
Methods [P9].
All these examples show that the Robust Design methodology offers simultane-
ous improvement of product quality, performance and cost, and engineering produc-
tivity. Its widespread use in industry will have a far-reaching economic impact because
this methodology can be applied profitably in all engineering activities, including prod-
uct design and manufacturing process design.
The philosophy behind Robust Design is not limited to engineering applications.
Yokoyama and Taguchi [Y1] have also shown its applications in profit planning in
business, cash-flow optimization in banking, government policymaking, and other
areas, The method can also be used for tasks such as determining optimum work force
mix for jobs where the demand is random, and improving the runway utilization at an
airport.
1.7 ORGANIZATION OF THE BOOK
This book is divided into three parts. The first part (Chapters 1 through 4) describes
the basics of the Robust Design methodology. Chapter 2 describes the quality loss
function, which gives a quantitative way of evaluating the quality level of a product
rather than just the "good-bad" characterization. After categorizing the sources of vari-
ation, the chapter further describes the steps in engineering design and the classification
of parameters affecting the product’s function. Quality control activities during dif-
ferent stages of the product realization process are also described there. Chapter 3 is
devoted to orthogonal array experiments and basic analysis of the data obtained
through such experiments. Chapter 4 illustrates the entire strategy of Robust Design
through an integrated circuit (IC) process design example. The strategy begins with
problem formulation and ends with verification experiment and implementation. This
case study could be used as a model in planning and carrying out manufacturing pro-
cess optimization for quality, cost, and manufacturability. The example also has the
basic framework for optimizing a product design.
The second part of the book (Chapters 5 through 7) describes, in detail, the tech-
niques used in Robust Design. Chapter 5 describes the concept of signal-to-noise ratio
and gives appropriate signal-to-noise ratios for a number of common engineering prob-
lems. Chapter 6 is devoted to a critical decision in Robust Design: choosing an
appropriate response variable, called quality characteristic, for measuring the quality of
a product or a process. The guidelines for choosing quality characteristics are illus-
trated with examples from many different engineering fields. A step-by-step procedure
for designing orthogonal array experiments for a large variety of industrial problems is
given in Chapter 7.
The third part of the book (Chapters 8 through 11) describes four more case
studies to illustrate the use of Robust Design in a wide variety of engineering discip-
lines. Chapter 8 shows how the Robust Design method can be used to optimize prod-
uct design when computer simulation models are available. The differential operational
amplifier case study is used to illustrate the optimization procedure. This chapter also
shows the use of orthogonal arrays to simulate the variation in component values and
environmental conditions, and thus estimate the yield of a product. Chapter 9 shows
the procedure for designing an ON-OFF control system for a temperature controller.
The use of Robust Design for improving the performance of a hardware-software sys-
tem is described in Chapter 10 with the help of the UNIX operating system tuning case
study. Chapter 11 describes the router bit life study and explains how Robust Design
can be used to improve reliability.
1.8 SUMMARY
* Robust Design is an engineering methodology for improving productivity during
research and development so that high-quality products can be produced quickly
and at low cost. Its use can greatly improve an organization’s ability to meet
market windows, keep development and manufacturing costs low, and deliver
high-quality products.
* Through his research in the 1950s and early 1960s, Dr. Genichi Taguchi
developed the foundations of Robust Design and validated the basic, underlying
philosophies by applying them in the development of many products.
* Robust Design uses many ideas from statistical experimental design and adds a
new dimension to it by explicitly addressing two major concerns faced by all
product and process designers:
a. How to reduce economically the variation of a product’s function in the
customer's environment.
b. How to ensure that decisions found optimum during laboratory experiments
will prove to be so in manufacturing and in customer environments.
* The ideal quality a customer can receive is that every product delivers the target
performance each time the product is used, under all intended operating condi-
tions, and throughout the product's intended life, with no harmful side effects.
The deviation of a product's performance from the target causes loss to the user
of the product, the manufacturer, and, in varying degrees, to the rest of society as
well. The quality level of a product is measured in terms of the total loss to the
society due to functional variation and harmful side effects.
* The three main categories of cost one must consider in delivering a product are:
(1) operating cost: the cost of energy, environmental control, maintenance, inven-
tory of spare parts, etc. (2) manufacturing cost: the cost of equipment,
machinery, raw materials, labor, scrap, rework, etc. (3) R&D cost: the time
taken to develop a new product plus the engineering and laboratory resources
needed.
* The fundamental principle of Robust Design is to improve the quality of a prod-
uct by minimizing the effect of the causes of variation without eliminating the
causes. This is done by optimizing the product and process designs to make
the performance minimally sensitive to the various causes of variation, a process
called parameter design.
* The two major tools used in Robust Design are: (1) signal-to-noise ratio, which
measures quality and (2) orthogonal arrays, which are used to study many
design parameters simultaneously.
* The Robust Design method has been found valuable in virtually all engineering
fields and business applications.

Chapter 2
PRINCIPLES OF
QUALITY ENGINEERING
A product's life cycle can be divided into two main parts: before sale to the customer
and after sale to the customer. All costs incurred prior to the sale of the product are
added to the unit manufacturing cost (umc), while all costs incurred after the sale are
lumped together as quality loss. Quality engineering is concerned with reducing both
of these costs and, thus, is an interdisciplinary science involving engineering design,
manufacturing operations, and economics.
It is often said that higher quality (lower quality loss) implies higher unit
manufacturing cost. Where does this misconception come from? It arises because
engineers and managers, unaware of the Robust Design method, tend to achieve higher
quality by using more costly parts, components, and manufacturing processes. In this
chapter we delineate the basic principles of quality engineering and put in perspective
the role of Robust Design in reducing the quality loss as well as the umc. This chapter
contains nine sections:
* Sections 2.1 and 2.2 are concerned with the quantification of quality loss. Sec-
tion 2.1 describes the shortcomings of using fraction defective as a measure of
quality loss. (This is the most commonly used measure of quality loss.) Sec-
tion 2.2 describes the quadratic loss function, which is a superior way of quanti-
fying quality loss in most situations.
* Section 2.3 describes the various causes, called noise factors, that lead to the
deviation of a product’s function from its target.
* Section 2.4 focuses on the computation of the average quality loss, its com-
ponents, and the relationship of these components to the noise factors.
* Section 2.5 describes how Robust Design exploits nonlinearity to reduce the
average quality loss without increasing umc.
* Section 2.6 describes the classification of parameters, an important activity in
quality engineering for recognizing the different roles played by the various
parameters that affect a product's performance.
* Section 2.7 discusses different ways of formulating product and process design
optimization problems and gives a heuristic solution.
* Section 2.8 addresses the various stages of the product realization process and
the role of various quality control activities in these stages.
* Section 2.9 summarizes the important points of this chapter.
Various aspects of quality engineering are described in the following references:
Taguchi [T2], Taguchi and Wu [T7], Phadke [P2], Taguchi and Phadke [T6],
Kackar [K1, K2], Taguchi [T4], Clausing [C1], and Byrne and Taguchi [B4].
2.1 QUALITY LOSS FUNCTION—THE FRACTION DEFECTIVE
FALLACY
We have defined the quality level of a product to be the total loss incurred by society
due to the failure of the product to deliver the target performance and due to harmful
side effects of the product, including its operating cost. Quantifying this loss is
difficult because the same product may be used by different customers, for different
applications, under different environmental conditions, etc. However, it is important to
quantify the loss so that the impact of alternative product designs and manufacturing
processes on customers can be evaluated and appropriate engineering decisions made.
Moreover, it is critical that the quantification of loss not become a major task that con-
sumes substantial resources at various stages of product and process design.
It is common to measure quality in terms of the fraction of the total number of
units that are defective. This is referred to as fraction defective. Although commonly
used, this measure of quality is often incomplete and misleading. It implies that all
products that meet the specifications (allowable deviations from the target response) are
equally good, while those outside the specifications are bad. The fallacy here is that
the product that barely meets the specifications is, from the customer's point of view,
as good or as bad as the product that is barely outside the specifications. In reality, the
product whose response is exactly on target gives the best performance. As the
product's response deviates from the target, the quality becomes progressively worse.
Example—Television Set Color Density:
The deficiency of fraction defective as a quality measure is well-illustrated by the Sony
television customer preference study published by the Japanese newspaper, The
Asahi [T8]. In the late 1970s, American consumers showed a preference for the televi-
sion sets made by Sony-Japan over those made by Sony-USA. The reason cited in the
study was quality. Both factories, however, made televisions using identical designs
and tolerances. What could then account for the perceived difference in quality?
In its investigative report, the newspaper showed the distribution of color density
for the sets made by the two factories (see Figure 2.1). In the figure, m is the target
color density and m ± 5 are the tolerance limits (allowable manufacturing deviations).
The distribution for the Sony-Japan factory was approximately normal with mean on
target and a standard deviation of 5/3. The distribution for Sony-USA was approxi-
mately uniform in the range of m ± 5. Among the sets shipped by Sony-Japan, about
0.3 percent were outside the tolerance limits, while Sony-USA shipped virtually no sets
outside the tolerance limits. Thus, the difference in customer preference could not be
explained in terms of the fraction defective sets.
[Figure 2.1 Distribution of color density in television sets made by Sony-USA and Sony-Japan. (Source: The Asahi, April 17, 1979.)]
The perceived difference in quality becomes clear when we look closely at the
sets that met the tolerance limits. Sets with color density very near m perform best and
can be classified grade A. As the color density deviates from m, the performance
becomes progressively worse, as indicated in Figure 2.1 by grades B and C. It is clear
that Sony-Japan produced many more grade A sets and many fewer grade C sets when
compared to Sony-USA. Thus, the average grade of sets produced by Sony-Japan was
better, hence the customer's preference for the sets made by Sony-Japan.
In short, the difference in the customer's perception of quality was a result of
Sony-USA paying attention only to meeting the tolerances, whereas in Sony-Japan the
attention was focused on meeting the target.
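A minimal calculation makes the comparison concrete. The following Python sketch is not from the book; it simply plugs in the distributions quoted above (approximately normal with standard deviation 5/3 for Sony-Japan, approximately uniform over m ± 5 for Sony-USA) and compares the fraction defective with the mean squared deviation from target, a stand-in for the average quadratic loss formalized in Section 2.2:

import math

# Assumed distributions, as quoted in the text (hypothetical illustration):
#   Sony-Japan: approximately normal, mean on target, standard deviation 5/3
#   Sony-USA:   approximately uniform over the tolerance range m - 5 to m + 5
# Work with the deviation d = y - m, so the target corresponds to d = 0.
tol = 5.0                    # tolerance half-width (the 5 in m +/- 5)
sigma_japan = 5.0 / 3.0      # standard deviation of the Sony-Japan distribution

# Fraction outside the tolerance limits.
# Japan: P(|d| > 5) for a normal with sd 5/3, i.e., beyond 3 standard deviations.
frac_def_japan = math.erfc((tol / sigma_japan) / math.sqrt(2.0))
frac_def_usa = 0.0           # the uniform distribution lies entirely within m +/- 5

# Mean squared deviation from target, proportional to the average quadratic
# quality loss k * E[(y - m)^2] introduced in Section 2.2.
msd_japan = sigma_japan ** 2           # variance of a normal centered on target
msd_usa = (2.0 * tol) ** 2 / 12.0      # variance of a uniform of width 10

print(f"fraction defective: Japan {frac_def_japan:.4f}, USA {frac_def_usa:.4f}")
print(f"mean squared deviation: Japan {msd_japan:.2f}, USA {msd_usa:.2f}")

Sony-Japan ships roughly 0.3 percent out-of-tolerance sets yet has only about one-third of the mean squared deviation of Sony-USA, which ships none; ranking the two factories by fraction defective alone reverses the conclusion.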
Example—Telephone Cable Resistance:
Using a wrong measurement system can, and often does, drive the behavior of people
in wrong directions. The telephone cable example described here illustrates how using
fraction defective as a measure of quality loss can permit suboptimization by the
manufacturer, leading to an increase in the total cost, which is the sum of quality loss
and umc.
A certain gauge of copper wires used in telephone cables had a nominal resis-
tance value of m ohms/mile and the maximum allowed resistance was (m + Δ₀)
ohms/mile. This upper limit was determined by taking into consideration the manufac-
turing capability, represented by the distribution (a) in Figure 2.2, at the time the
specifications were written. Consequently, the upper limit (m + Δ₀) was an adequate
way to ensure that the drawing process used to form the copper wire was kept in con-
trol with the mean on target.
[Figure 2.2 Distribution of telephone cable resistance (ohms/mile): (a) initial distribution; (b) after process improvement and shifting the mean.]
By improving the wire drawing process through the application of new technol-
ogy, the manufacturer was able to reduce substantially the process variance. This per-
mitted the manufacturer to move the mean close to the upper limit and still meet the
fraction defective criterion for quality [see distribution (b) in Figure 2.2]. At the same
time, the manufacturer saved on the cost of copper since larger resistance implies a
smaller cross section of the wire. However, from the network point of view, the larger
average resistance resulted in high electrical loss, causing complaints from the tele-
phone users. Solving the problem in the field meant spending a lot more money for
installing additional repeaters and for other corrective actions than the money saved in
manufacturing—that is, the increase in the quality loss far exceeded the saving in the
umc. Thus, there was a net loss to the society consisting of both the manufacturer and
the telephone company who offered the service. Therefore, a quality loss metric that
permits such local optimization leading to higher total cost should be avoided. Section
2.2 discusses a better way to measure the quality loss.
Interpretation of Engineering Tolerances
The examples above bring out an important point regarding quantification of quality
loss. Products that do not meet tolerances inflict a quality loss on the manufacturer, a
loss visible in the form of scrap or rework in the factory, which the manufacturer adds
to the cost of the product. However, products that meet tolerance also inflict a quality
loss, a loss that is visible to the customer and that can adversely affect the sales of the
product and the reputation of the manufacturer. Therefore, the quality loss function
must also be capable of measuring the loss due to products that meet the tolerances.
Engineering specifications are invariably written as m ± Δ₀. These specifications
should not be interpreted to mean that any value in the range (m − Δ₀) to (m + Δ₀) is
equally good for the customer and that as soon as the range is exceeded the product is
bad. In other words, the step function shown below and in Figure 2.3(a) is an inade-
quate way to quantify the quality loss:
L(y) = 0     if |y − m| ≤ Δ₀
L(y) = A₀    otherwise                                        (2.1)

Here, A₀ is the cost of replacement or repair. Use of such a loss function is apt to lead
to the problems that Sony-USA and the cable manufacturer faced and, hence, should be
avoided.
[Figure 2.3 Quality loss function: (a) step function; (b) quadratic loss function.]
2.2 QUADRATIC LOSS FUNCTION
The quadratic loss function can meaningfully approximate the quality loss in most
situations. Let y be the quality characteristic of a product and m be the target value for
y. (Note that the quality characteristic is a product's response that is observed for
quantifying quality level and for optimization in a Robust Design project.) According
to the quadratic loss function, the quality loss is given by
L(y) = k (y − m)²                                             (2.2)
where k is a constant called quality loss coefficient. Equation (2.2) is plotted in Figure
2.3(b). Notice that at y = m the loss is zero and so is the slope of the loss function.
This is quite appropriate because m is the best value for y. The loss L(y) increases
slowly when we are near m; but as we go farther from m the loss increases more
rapidly. Qualitatively, this is exactly the kind of behavior we would like the quality
loss function to have. The quadratic loss function given by Equation (2.2) is the sim-
plest mathematical function that has the desired qualitative behavior.
Note that Equation (2.2) does not imply that every customer who receives a prod-
uct with y as the value of the quality characteristic will incur a precise quality loss equal
to L(y). Rather, it implies that the average quality loss incurred by those customers is
L(y). The quality loss incurred by a particular customer will obviously depend on that
customer’s operating environment.
It is important to determine the constant k so that Equation (2.2) can best approxi-
mate the actual loss within the region of interest. This is a rather difficult, though impor-
tant, task. A convenient way to determine k is to determine first the functional limits for
the value of y. Functional limit is the value of y at which the product would fail in half of
the applications. Let m ± Δ₀ be the functional limits. Suppose the loss at m ± Δ₀ is
A₀. Then, by substitution in Equation (2.2), we obtain

k = A₀ / Δ₀²                                                  (2.3)
Note that A₀ is the cost of repair or replacement of the product. It includes the loss
due to the unavailability of the product during the repair period, the cost of transporting
the product by the customer to and from the repair center, etc. If a product fails in an
unsafe mode, such as an automobile breaking down in the middle of a road, then the
losses from the resulting consequences should also be included in A₀. Regardless of who
pays for them—the customer, the manufacturer, or a third party—all these losses should
be included in A₀.
Substituting Equation (2.3) in Equation (2.2), we obtain

L(y) = (A₀ / Δ₀²) (y − m)²                                    (2.4)
We will now consider two numerical examples.
Example—Television Set Color Density:
Suppose the functional limits for the color density are m ± 7. This means about half the
customers, taking into account the diversity of their environment and taste, would find the
television set to be defective if the color density is m ± 7. Let the repair of a television set
in the field cost on average A₀ = $98. By substituting in Equation (2.4), the quadratic
loss function can be written as

L(y) = (98 / 7²) (y − m)² = 2 (y − m)².
Thus, the average quality loss incurred by the customers receiving sets with color den-
sity m + 4 is L(m +4) = $32, while customers receiving sets with color density m +2
incur an average quality loss of only L(m + 2) = $8.
Example—Power Supply Circuit:
Consider a power supply circuit used in a stereo system for which the target output
voltage is 110 volts. If the output voltage falls outside 110 ± 20 volts, then the stereo
fails in half the situations and must be repaired. Suppose it costs $100 to repair the
stereo. Then the average loss associated with a particular value y of output voltage is
given by

L(y) = (100 / 20²) (y − 110)² = 0.25 (y − 110)².
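Both numerical examples are instances of the same formula. The following Python sketch (a hypothetical illustration, not code from the book) implements the nominal-the-best loss of Equation (2.4) and reproduces the figures quoted above:

def quadratic_loss(y, m, delta0, a0):
    """Nominal-the-best quality loss, Equation (2.4): L(y) = (A0 / Delta0**2) * (y - m)**2.

    m      -- target value of the quality characteristic
    delta0 -- functional limit: the product fails in half the applications at y = m +/- delta0
    a0     -- average cost of repair or replacement at the functional limit
    """
    k = a0 / delta0 ** 2        # quality loss coefficient, Equation (2.3)
    return k * (y - m) ** 2

# Television color density: functional limits m +/- 7, A0 = $98, hence k = 2.
m = 0.0                         # measure color density as deviation from the target
print(quadratic_loss(m + 4, m, 7, 98))    # 32.0 -> $32 average loss at m + 4
print(quadratic_loss(m + 2, m, 7, 98))    # 8.0  -> $8 average loss at m + 2

# Power supply: target 110 V, functional limits 110 +/- 20 V, A0 = $100, hence k = 0.25.
# The 115 V input below is just an illustrative operating point.
print(quadratic_loss(115.0, 110.0, 20.0, 100.0))   # 6.25 -> $6.25 average loss at 115 V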
Variations of the Quadratic Loss Function
The quadratic loss function given by Equation (2.2) is applicable whenever the quality
characteristic y has a finite target value, usually nonzero, and the quality loss is sym-
metric on either side of the target. Such quality characteristics are called nominal-the-
best type quality characteristics and Equation (2.2) is called the nominal-the-best type
quality loss function. The color density of a television set and the output voltage of a
power supply circuit are examples of the nominal-the-best type quality characteristic.
Some variations of the quadratic loss function in Equation (2.2) are needed to
cover adequately certain commonly occurring situations. Three such variations are
given below, followed by a short illustrative code sketch.
* Smaller-the-better type characteristic. Some characteristics, such as radiation
leakage from a microwave oven, can never take negative values. Also, their
ideal value is equal to zero, and as their value increases, the performance
becomes progressively worse. Such characteristics are called smaller-the-better
type quality characteristics. The response time of a computer, leakage current in
electronic circuits, and pollution from an automobile are additional examples of
this type of quality characteristic. The quality loss in such situations can be
approximated by the following function, which is obtained from Equation (2.2)
by substituting m = 0:
L(y) = k y^2 .    (2.5)

Note this is a one-sided loss function because y cannot take negative values. As described earlier, the quality loss coefficient k can be determined from the functional limit Δ_0 and the quality loss A_0 at the functional limit by using Equation (2.3).
* Larger-the-better type characteristic. Some characteristics, such as the bond
strength of adhesives, also do not take negative values. But, zero is their worst
value, and as their value becomes larger, the performance becomes progressively
better—that is, the quality loss becomes progressively smaller. Their ideal value
is infinity and at that point the loss is zero. Such characteristics are called
larger-the-better type quality characteristics. It is clear that the reciprocal of
such a characteristic has the same qualitative behavior as a smaller-the-better type
characteristic. Thus, we approximate the loss function for a larger-the-better type
characteristic by substituting 1/y for y in Equation (2.5):
L(y) = k (1 / y^2) .    (2.6)

The rationale for using Equation (2.6) as the quality loss function for larger-the-better type characteristics is discussed further in Chapter 5. To determine the constant k for this case, we find the functional limit Δ_0, below which more than half of the products fail, and the corresponding loss A_0. Substituting Δ_0 and A_0 in Equation (2.6), and solving for k, we obtain

k = A_0 Δ_0^2 .    (2.7)
* Asymmetric loss function. In certain situations, deviation of the quality characteristic in one direction is much more harmful than deviation in the other direction. In such cases, one can use a different coefficient k for the two directions. Thus, the quality loss would be approximated by the following asymmetric loss function:

L(y) = k_1 (y - m)^2  for y > m ,
L(y) = k_2 (y - m)^2  for y ≤ m .
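A similar sketch, with purely illustrative coefficients (not from the text), covers the smaller-the-better and larger-the-better forms of Equations (2.5) through (2.7):

    def smaller_the_better_loss(y, A0, delta0):
        """Equation (2.5) with k from Equation (2.3): L(y) = (A0/delta0**2) * y**2."""
        k = A0 / delta0 ** 2
        return k * y ** 2

    def larger_the_better_loss(y, A0, delta0):
        """Equations (2.6) and (2.7): L(y) = A0 * delta0**2 / y**2."""
        k = A0 * delta0 ** 2
        return k / y ** 2

    # Illustrative values only: a leakage current with functional limit 10 mA and
    # repair cost $50, and a bond strength with functional limit 5 kg and loss $20.
    print(smaller_the_better_loss(4.0, A0=50, delta0=10))   # 8.0 dollars
    print(larger_the_better_loss(8.0, A0=20, delta0=5))     # 7.8125 dollars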
or D_3, since the average η for both D_2 and D_3 is -40 dB. Based on the matrix experiment, we can conclude that the settings A_1B_1C_2D_2 and A_1B_1C_2D_3 would give the highest η, or the lowest surface defect count.
The predicted best settings need not correspond to one of the rows in the matrix experiment. In fact, often they do not correspond, as is the case in the present example. Also, typically, the value of η realized for the predicted best settings is better than the best among the rows of the matrix experiment.
3.3 ADDITIVE MODEL FOR FACTOR EFFECTS
In the preceding section, we used simple averaging to estimate factor effects. The
same nine observations (η_1, η_2, ..., η_9) are grouped differently to estimate the factor
effects. Also, the optimum combination of settings was determined by examining the
effect of each factor separately. Justification for this simple procedure comes from
* Use of the additive model as an approximation
* Use of an orthogonal array to plan the matrix experiment
We now examine the additive model. The relationship between η and the process parameters A, B, C, and D can be quite complicated. Empirical determination of this relationship can, therefore, turn out to be quite expensive. However, in most situations, when η is chosen judiciously, the relationship can be approximated adequately by the following additive model:

η(A_i, B_j, C_k, D_l) = μ + a_i + b_j + c_k + d_l + e .    (3.5)

In the above equation, μ is the overall mean, that is, the mean value of η for the experimental region; the deviation from μ caused by setting factor A at level A_i is a_i; the terms b_j, c_k, and d_l represent similar deviations from μ caused by the settings B_j, C_k, and D_l of factors B, C, and D, respectively; and e stands for the error. Note that by error we imply the error of the additive approximation plus the error in the repeatability of measuring η for a given experiment.
An additive model is also referred to as a superposition model or a variables
separable model in engineering literature. Note that superposition model implies that
the total effect of several factors (also called variables) is equal to the sum of the indi-
vidual factor effects. It is possible for the individual factor effects to be linear, qua-
dratic, or of higher order. However, in an additive model cross product terms involv-
ing two or more factors are not allowed.
By definition a_1, a_2, and a_3 are the deviations from μ caused by the three levels of factor A. Thus,

a_1 + a_2 + a_3 = 0 .    (3.6)

Similarly,

b_1 + b_2 + b_3 = 0 ,
c_1 + c_2 + c_3 = 0 ,
d_1 + d_2 + d_3 = 0 .    (3.7)
It can be shown that the averaging procedure of Section 3.2 for estimating the factor
effects is equivalent to fitting the additive model, defined by Equations (3.5), (3.6), and
(3.7), by the least squares method. This is a consequence of using an orthogonal array
to plan the matrix experiment.
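The equivalence of the averaging procedure and the additive model can be seen concretely in the following illustrative Python sketch; the S/N values are hypothetical and the array shown is the standard L_9(3^4), whose column ordering may differ from Table 3.1:

    import numpy as np

    # Standard L9(3^4) orthogonal array; each row is one experiment,
    # each column gives the level (1, 2, or 3) of factors A, B, C, D.
    L9 = np.array([
        [1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
        [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
        [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1],
    ])

    # Hypothetical S/N ratios (dB), one per row of the matrix experiment.
    eta = np.array([-20., -10., -30., -25., -45., -65., -45., -65., -70.])

    overall_mean = eta.mean()
    for col, name in enumerate("ABCD"):
        for level in (1, 2, 3):
            m = eta[L9[:, col] == level].mean()   # average eta over the 3 rows at this level
            print(f"m_{name}{level} = {m:6.2f} dB (effect {m - overall_mean:+6.2f} dB)")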
Now, consider Equation (3.2) for the estimation of the effect of setting tempera-
ture at level 3:
m_A3 = (1/3)(η_7 + η_8 + η_9)
     = (1/3)[(μ + a_3 + b_1 + c_3 + d_2 + e_7)
            + (μ + a_3 + b_2 + c_1 + d_3 + e_8)
            + (μ + a_3 + b_3 + c_2 + d_1 + e_9)]
     = μ + a_3 + (1/3)(b_1 + b_2 + b_3) + (1/3)(c_1 + c_2 + c_3)
            + (1/3)(d_1 + d_2 + d_3) + (1/3)(e_7 + e_8 + e_9)
     = μ + a_3 + (1/3)(e_7 + e_8 + e_9) .    (3.8)

Note that the terms corresponding to the effects of factors B, C, and D drop out because of Equation (3.7). Thus, m_A3 is an estimate of (μ + a_3).
Furthermore, the error term in Equation (3.8) is an average of three error terms. Suppose σ_e^2 is the average variance for the error terms e_1, e_2, ..., e_9. Then the error variance for the estimate m_A3 is approximately (1/3)σ_e^2. (Note that in computing the error variances of the estimate m_A3 and other estimates in this chapter, we treat the individual error terms as independent random variables with zero mean and variance σ_e^2. In reality, this is only an approximation because the error terms include the error of the additive approximation, so the error terms are not strictly independent random variables with zero mean. This approximation is adequate because the error variance is used for only qualitative purposes.) This represents a 3-fold reduction in error variance compared to conducting a single experiment at the setting A_3 of factor A.
Substituting Equation (3.5) in Equation (3.3) verifies that m_A2 estimates μ + a_2 with error variance (1/3)σ_e^2. Similarly, substituting Equation (3.5) in Equation (3.4) shows that m_B2 estimates μ + b_2 with error variance (1/3)σ_e^2. It can be verified that similar relationships hold for the estimation of the remaining factor effects.
The term replication number is used to refer to the number of times a particular
factor level is repeated in an orthogonal array. The error variance of the average effect
for a particular factor level is smaller than the error variance of a single experiment by
a factor equal to its replication number. To obtain the same accuracy of the factor
level averages, we would need a much larger number of experiments if we were to use
the traditional approach of studying one factor at a time. For example, we would have
to conduct 3 x 3 = 9 experiments to estimate the average 7, for three levels of tempera-
ture alone (three repetitions each for the three levels), while keeping other factors fixed
at certain levels, say, By, C1, D1.
We may then fix temperature at its best setting and experiment with levels B,
and B of pressure. This would need 3 x 2 = 6 additional experiments. Continuing in
this manner, we can study the effects of factors C and D by performing 2 x 6 = 12
additional experiments. Thus, we would need a total of 9 + 3 x 6 = 27 experiments to
study the four factors, one at a time. Compare this to only nine experiments needed
for the orthogonal array based matrix experiment to obtain the same accuracy of the
factor level averages.
Another common approach to finding the optimum combination of factor levels
is to conduct a full factorial experiment—that is, conduct experiments under all combi-
nations of factor levels. In the present example, it would mean conducting experiments
under 3^4 = 81 distinct combinations of factor levels, which is much larger than the
nine experiments needed for the matrix experiment. When the additive model [Equa-
tion (3.5)] holds, it is obviously unnecessary to experiment with all combinations of
factor levels. Fortunately, in most practical situations the additive model provides an
excellent approximation. The additivity issue is discussed in much detail in Chapters 5
and 6.
Conducting matrix experiments using orthogonal arrays has another statistical
advantage. If the errors, e_i, are independent with zero mean and equal variance, then
the estimated factor effects are mutually uncorrelated. Consequently, the best level of
each factor can be determined separately.
In order to preserve the benefits of using an orthogonal array, it is important that
all experiments in the matrix be performed. If experiments corresponding to one or
more rows are not conducted, or if their data are missing or erroneous, the balancing
property and, hence, the orthogonality is lost. In some situations, incomplete matrix
experiments can give useful results, but the analysis of such experiments is complicated. (Statistical techniques used for analyzing such data are regression analysis and linear models; see Draper and Smith [D4].) Thus, we recommend that any missing experiments be performed to complete the matrix.
3.4 ANALYSIS OF VARIANCE
Different factors affect the surface defect formation to a different degree. The relative
magnitude of the factor effects could be judged from Table 3.3, which gives the aver-
age η for each factor level. A better feel for the relative effect of the different factors
can be obtained by the decomposition of variance, which is commonly called analysis
of variance (ANOVA). ANOVA is also needed for estimating the error variance for
the factor effects and variance of the prediction error.
Analogy with Fourier Analysis
An important reason for performing Fourier analysis of an electrical signal is to deter-
mine the power in each harmonic to judge the relative importance of the various har-
monics. The larger the amplitude of a harmonic, the larger the power is in it and the
more important it is in describing the signal. Similarly, an important purpose of
ANOVA is to determine the relative importance of the various factors. In fact, there is
a strong analogy between ANOVA and the decomposition of the power of an electrical
signal into different harmonics:
* The nine observed values of η are analogous to the observed signal.
* The sum of the squared values of η is analogous to the power of the signal.
* The overall mean of η is analogous to the dc part of the signal.
* The four factors are like four harmonics.
* The columns in the matrix experiment are orthogonal, which is analogous to the orthogonality of the different harmonics.
The analogy between the Fourier analysis of the power of an electrical signal and
ANOVA is displayed in Figure 3.2. The experiments are arranged along the horizontal
axis like time. The overall mean is plotted as a straight line like a dc component. The
effect of each factor is displayed as a harmonic. The level of factor A for experiments 1, 2, and 3 is A_1. So, the height of the wave for A is plotted as m_A1 for these experiments. Similarly, the height of the wave for experiments 4, 5, and 6 is m_A2, and the height for experiments 7, 8, and 9 is m_A3. The waves for the other factors are plotted similarly. By virtue of the additive model [Equation (3.5)], the observed η for any experiment is equal to the sum of the height of the overall mean and the deviations from the mean caused by the levels of the four factors. By referring to the waves of the different factors shown in Figure 3.2, it is clear that factors A, B, C, and D are in decreasing order of importance. Further aspects of the analogy are discussed in the rest of this section.
[Figure 3.2 Orthogonal decomposition of the observed S/N ratio. The panels show the observed S/N ratio, the overall mean, and the effects of factors A, B, C, and D, plotted against the experiment number (1 through 9).]
Computation of Sum of Squares
The sum of the squared values of η is called the grand total sum of squares. Thus, we have

Grand total sum of squares = Σ_{i=1}^{9} η_i^2
                           = (-20)^2 + (-10)^2 + ... + (-70)^2
                           = 19,425 (dB)^2 .
The grand total sum of squares is analogous to the total signal power in Fourier analysis. It can be decomposed into two parts: the sum of squares due to mean and the total sum of squares, which are defined as follows:

Sum of squares due to mean = (number of experiments) x m^2
                           = 9 (-41.67)^2
                           = 15,625 (dB)^2 .

Total sum of squares = Σ_{i=1}^{9} (η_i - m)^2
                     = (-20 + 41.67)^2 + (-10 + 41.67)^2 + ...
                       + (-70 + 41.67)^2
                     = 3,800 (dB)^2 .

The sum of squares due to mean is analogous to the dc power of the signal and the total sum of squares is analogous to the ac power of the signal in Fourier analysis.
Because m is the average of the nine η_i values, we have the following algebraic identity:

Σ_{i=1}^{9} (η_i - m)^2 = Σ_{i=1}^{9} η_i^2 - 9 m^2 ,

which can also be written as

Total sum of squares = (grand total sum of squares) - (sum of squares due to mean) .

The above equation is analogous to the fact from Fourier analysis that the ac power is equal to the difference between the total power and the dc power of the signal.
The sum of squares due to factor A is equal to the total squared deviation of the wave for factor A from the line representing the overall mean. There are three experiments each at levels A_1, A_2, and A_3. Consequently,

Sum of squares due to factor A
    = 3 (m_A1 - m)^2 + 3 (m_A2 - m)^2 + 3 (m_A3 - m)^2
    = 3 (-20 + 41.67)^2 + 3 (-45 + 41.67)^2 + 3 (-60 + 41.67)^2
    = 2,450 (dB)^2 .

Proceeding along the same lines, we can show that the sums of squares due to factors B, C, and D are, respectively, 950, 350, and 50 (dB)^2. These sums of squares are tabulated in Table 3.4. The sums of squares due to the various factors are analogous to the power in the various harmonics, and they are a measure of the relative importance of the factors in changing the values of η.
Thus, factor A explains a major portion of the total variation of η. In fact, it is responsible for (2450/3800) x 100 = 64.5 percent of the variation of η. Factor B is responsible for the next largest portion, namely 25 percent; and factors C and D together are responsible for only a small portion, a total of 10.5 percent, of the variation in η.
Knowing the factor effects (that is, knowing the values of a_i, b_j, c_k, and d_l), we can use the additive model given by Equation (3.5) to calculate the error term e_i for each experiment i. The sum of squares due to error is the sum of the squares of the error terms. Thus we have

Sum of squares due to error = Σ_{i=1}^{9} e_i^2 .

In the present case study, the total number of model parameters (μ, a_1, a_2, a_3, b_1, b_2, etc.) is 13; the number of constraints, defined by Equations (3.6) and (3.7), is 4. The
number of model parameters minus the number of constraints is equal to the number of
experiments. Hence, the error term is identically zero for each experiment. Hence, the
sum of squares due to error is also zero. Note that this need not be the situation with
all matrix experiments.
TABLE 3.4 ANOVA TABLE FOR η

                          Degrees of    Sum of     Mean
  Factor/Source           Freedom       Squares    Square       F
  A. Temperature              2           2450       1225      12.25
  B. Pressure                 2            950        475       4.75
  C. Settling time            2            350*       175
  D. Cleaning method          2             50*        25
  Error                       0              0          -
  Total                       8           3800
  (Error)                    (4)          (400)      (100)

* Indicates sums of squares added together to estimate the pooled error sum of squares indicated by parentheses. The F ratio is calculated by using the pooled error mean square.
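The computations summarized in Table 3.4 can be reproduced directly from the sums of squares quoted in the text; the short sketch below (with illustrative variable names) carries out the pooling rule and the F ratios:

    # Sums of squares and degrees of freedom for the four factors (from Table 3.4).
    ss = {"A": 2450.0, "B": 950.0, "C": 350.0, "D": 50.0}
    df = {name: 2 for name in ss}

    total_ss = sum(ss.values())          # 3800 (dB)^2; the error sum of squares is zero here

    # Pool the bottom half of the factors (smallest mean squares) into the error estimate.
    mean_square = {name: ss[name] / df[name] for name in ss}
    pooled = sorted(ss, key=lambda name: mean_square[name])[: len(ss) // 2]   # ['D', 'C']

    error_ss = sum(ss[name] for name in pooled)           # 400
    error_df = sum(df[name] for name in pooled)           # 4
    error_variance = error_ss / error_df                  # 100 (dB)^2

    for name in ss:
        if name not in pooled:
            print(name, "F =", mean_square[name] / error_variance)   # A: 12.25, B: 4.75
    print("percent contribution of A:", 100 * ss["A"] / total_ss)    # about 64.5 percent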
Relationship Among the Various Sums of Squares
The orthogonality of the matrix experiment implies the following relationship among
the various sums of squares:
(Total sum of squares) = (sum of the sums of squares
due to various factors)
+ (sum of squares due to error) . (3.9)
Equation (3.9) is analogous to Parseval's equation for the decomposition of the power
of a signal into power in different harmonics. Equation (3.9) is often used for calculating the sum of squares due to error after computing the total sum of squares and the sums of squares due to the various factors. Derivation of Equation (3.9) as well as a detailed mathematical description of ANOVA can be found in many books on statistics, such as Scheffé [S1], Rao [R3], and Searle [S2].
For the matrix experiment described in this chapter, Equation (3.9) implies:
(Total sum of squares) = (sum of the sums of squares
due to factors A, B, C, and D)
+ (sum of squares due to error) .
Note that the various sums of squares tabulated in Table 3.4 do satisfy the above
equation.
Degrees of Freedom
The number of independent parameters associated with an entity like a matrix experi-
ment, or a factor, or a sum of squares is called its degrees of freedom. A matrix experi-
ment with nine rows has nine degrees of freedom and so does the grand total sum of
squares. The overall mean has one degree of freedom and so does the sum of squares due
to mean. Thus, the degrees of freedom associated with the total sum of squares is
9-1=8. (Note that total sum of squares is equal to grand total sum of squares minus the
sum of squares due to mean.)
Factor A has three levels, so its effect can be characterized by three parameters: a_1, a_2, and a_3. But these parameters must satisfy the constraint given by Equation (3.6).
Thus, effectively, factor A has only two independent parameters and, hence, two degrees
of freedom. Similarly, factors B, C, and D have two degrees of freedom each. In gen-
eral, the degrees of freedom associated with a factor is one less than the number of levels.
The orthogonality of the matrix experiment implies the following relationship
among the various degrees of freedom:
(Degrees of freedom for the total sum of squares)
= (sum of the degrees of freedom for the various factors)
+ (degrees of freedom for the error) .    (3.10)
Note the similarity between Equations (3.9) and (3.10). Equation (3.10) is useful for
computing the degrees of freedom for error. In the present case study, the degrees of
freedom for error comes out to be zero. This is consistent with the earlier observation
that the error term is identically zero for each experiment in this case study.
It is customary to write the analysis of variance in the tabular form shown in Table 3.4. The mean square for a factor is computed by dividing its sum of squares by its degrees of freedom.
Estimation of Error Variance
The error variance, which is equal to the error mean square, can then be estimated as follows:

Error variance = (sum of squares due to error) / (degrees of freedom for error) .    (3.11)

The error variance is denoted by σ_e^2.
In the interest of gaining the most information from a matrix experiment, all or
most of the columns should be used to study process or product parameters. As a
result, no degrees of freedom may be left to estimate error variance. Indeed, this is the
situation with the present example. In such situations, we cannot directly estimate the
error variance.
However, an approximate estimate of the error variance can be obtained by pool-
ing the sum of squares corresponding to the factors having the lowest mean square. As
a rule of thumb, we suggest that the sum of squares corresponding to the bottom half
of the factors (as defined by lower mean square) corresponding to about half of the
degrees of freedom be used to estimate the error mean square or error variance. This
rule is similar to considering the bottom half harmonics in a Fourier expansion as error
and using the rest to explain the function being investigated. In the present example,
we use factors C and D to estimate the error mean square. Together, they account for
four degrees of freedom and the sum of their sum of squares is 400. Hence, the error
variance is 100. Error variance computed in this manner is indicated by parentheses,
and the computation method is called pooling. (By the traditional statistical assump-
tions, pooling gives a biased estimate of error variance. To obtain a better estimate of
error variance, a significantly larger number of experiments would be needed, the cost
of which is usually not justifiable compared to the added benefit.)
In Fourier analysis of a signal, it is common to compute the power in all har-
monics and then use only those harmonics with large power to explain the signal and
treat the rest as error. Pooling of sum of squares due to bottom half factors is exactly
analogous to that practice. After evaluating the sums of squares due to all factors, we retain only the top half factors to explain the variation in the process response η and use the rest to estimate approximately the error variance.
The estimation of the error variance by pooling will be further illustrated through
the applications discussed in the subsequent chapters. As it will be apparent from
these applications, deciding which factors’ sum of squares should be included in the
error variance is usually obvious by inspecting the mean square column. The decision
process can sometimes be improved by using a graphical data analysis technique called half-normal plots (see Daniel [D1] and Box, Hunter, and Hunter [B3]).
Confidence Intervals for Factor Effects
Confidence intervals for factor effects are useful in judging the size of the change
caused by changing a factor level compared to the error standard deviation. As shown
in Section 3.3, the variance of the effect of each factor level for this example is (1/3)σ_e^2 = (1/3)(100) = 33.3 (dB)^2. Thus, the width of the two-standard-deviation confidence interval, which is approximately the 95 percent confidence interval, for each estimated effect is ±2√33.3 = ±11.5 dB. In Figure 3.1 these confidence intervals are
plotted for only the starting level to avoid crowding.
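Using the pooled error variance of 100 (dB)^2 and the replication number of 3, this interval can be computed as follows (illustrative sketch):

    import math

    error_variance = 100.0          # pooled error variance, (dB)^2
    replication_number = 3          # each factor level appears in 3 of the 9 experiments

    effect_variance = error_variance / replication_number        # 33.3 (dB)^2
    half_width = 2 * math.sqrt(effect_variance)                   # about 11.5 dB
    print(f"two-standard-deviation confidence interval: +/- {half_width:.1f} dB")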
Variance Ratio
The variance ratio, denoted by F in Table 3.4, is the ratio of the mean square due to a
factor and the error mean square. A large value of F means the effect of that factor is
large compared to the error variance. Also, the larger the value of F, the more impor-
tant that factor is in influencing the process response η. So, the values of F can be
used to rank order the factors.
In statistical literature, the F value is often compared with the quantiles of a
probability distribution called the F-distribution to determine the degree of confidence
that a particular factor effect is real and not just a random occurrence (see, for example,
Hogg and Craig [H3]). However, in Robust Design we are not concerned with such probability statements; we use the F ratio only for a qualitative understanding of the relative factor effects. A value of F less than one means the factor effect is smaller than the error of the additive model. A value of F larger than two means the factor effect is not quite small, whereas a value larger than four means the factor effect is quite large.
Interpretation of ANOVA Tables
Thus far in this section, we have described the computation involved in the ANOVA
table, as well as the inferences that can be made from the table. A variety of computer
programs can be used to perform the calculations, but the experimenter must make
appropriate inferences. Here we put together the major inferences from the ANOVA
table.
Referring to the sum of squares column in Table 3.4, notice that factor A makes
the largest contribution to the total sum of squares, namely, (2450/3800) x 100 = 64.5
percent. Factor B makes the next largest contribution, (950/3800) x 100 = 25.0 per-
Cent, to the total sum of squares. Factors C and D together make only a 10.5 percent
contribution to the total sum of squares. The larger the contribution of a particular fac-
tor to the total sum of squares, the larger the ability is of that factor to influence 7).
In this matrix experiment, we have used all the degrees of freedom for estimating
the factor effects (four factors with two degrees of freedom each make up all the eight
degrees of freedom for the total sum of squares). Thus, there are no degrees of
freedom left for estimating the error variance. Following the rule of thumb spelled out
earlier in this section, we use the bottom half factors that have the smallest mean
square to estimate the error variance. Thus, we obtain the error sum of squares, indi-
cated by parentheses in the ANOVA table, by pooling the sum of squares due to fac-
tors C and D. This gives 100 as an estimate of the error variance.
The largeness of a factor effect relative to the error variance can be judged from
the F column. The larger the F value, the larger the factor effect is compared to the
error variance.
This section points out that our purpose in conducting ANOVA is to determine
the relative magnitude of the effect of each factor on the objective function 1) and to
estimate the error variance. We do not attempt to make any probability statements
about the significance of a factor as is commonly done in statistics. In Robust Design,
ANOVA is also used to choose from among many alternatives the most appropriate
quality characteristic and S/N ratio for a specific problem. Such an application of
ANOVA is described in Chapter 8. Also, ANOVA is useful in computing the S/N
ratio for dynamic problems as described in Chapter 9.
3.5 PREDICTION AND DIAGNOSIS
Prediction of n under Optimum Conditions
As discussed earlier, a primary goal of conducting Robust Design experiments is to determine the optimum level for each factor. For the CVD project, one of the two identified optimum conditions is A_1 B_1 C_2 D_2. The additive model, Equation (3.5), can be used to predict the value of η under the optimum conditions, denoted by η_opt, as follows:

η_opt = m + (m_A1 - m) + (m_B1 - m)
      = -41.67 + (-20 + 41.67) + (-30 + 41.67)
      = -8.33 dB .    (3.12)
Note that since the sums of squares due to factors C and D are small and these terms are included as error, we do not include the corresponding improvements in the prediction of η under the optimum conditions. Why are the contributions by factors having a small sum of squares ignored? Because if we include the contribution from all factors, it can be shown that the predicted improvement in η exceeds the actual realized improvement; that is, our prediction would be biased on the higher side. By ignoring the contribution from factors with small sums of squares, we can reduce this
bias. Again, this is a rule of thumb. For more precise prediction, we need to use appropriate shrinkage coefficients described by Taguchi [T1].
Thus, by Equation (3.12) we predict that the defect count under the optimum conditions would be -8.33 dB. This is equivalent to a mean square defect count of

10^(-η_opt / 10) = 10^0.833 = 6.8 (defects/unit area)^2 .

The corresponding root-mean-square defect count is √6.8 = 2.6 defects/unit area.
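The predictions of Equations (3.12) and (3.13) involve only the factor-level averages quoted in the text (m = -41.67, m_A1 = -20, m_A2 = -45, m_B1 = -30, m_B2 = -40 dB); a short illustrative sketch:

    m = -41.67                      # overall mean of eta (dB)
    m_A1, m_A2 = -20.0, -45.0       # factor-level averages for temperature (dB)
    m_B1, m_B2 = -30.0, -40.0       # factor-level averages for pressure (dB)

    # Equation (3.12): prediction under the optimum conditions A1 B1 C2 D2,
    # ignoring the pooled factors C and D.
    eta_opt = m + (m_A1 - m) + (m_B1 - m)
    print(eta_opt)                                   # -8.33 dB

    # Convert the S/N ratio back to a mean square defect count.
    mean_square_defects = 10 ** (-eta_opt / 10)      # about 6.8 (defects/unit area)^2
    print(mean_square_defects ** 0.5)                # about 2.6 defects/unit area

    # Equation (3.13): predicted improvement over the initial settings.
    delta_eta = (m_A1 - m_A2) + (m_B1 - m_B2)        # 35 dB
    print(delta_eta)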
The purpose of taking log in constructing the S/N ratio can be explained in terms
of the additive model. If the actual defect count were used as the characteristic for
constructing the additive model, it is quite possible that the defect count predicted
under the optimum conditions would have been negative. This is highly undesirable
since negative counts are meaningless. However, in the log scale, such negative counts
cannot occur. Hence, it is preferable to take the log.
The additive model is also useful in predicting the difference in defect counts between two process conditions. The anticipated improvement in changing the process conditions from the initial settings (A_2B_2C_1D_1) to the optimum settings (A_1B_1C_2D_2) is

Δη = η_opt - η_initial = (m_A1 - m_A2) + (m_B1 - m_B2)
   = (-20 + 45) + (-30 + 40)
   = 35 dB .    (3.13)

Once again we do not include the terms corresponding to factors C and D for the reasons explained earlier.
Verification (Confirmation) Experiment
After determining the optimum conditions and predicting the response under these con-
ditions, we conduct an experiment with optimum parameter settings and compare the
observed value of η with the prediction. If the predicted and observed η are close to each other, then we may conclude that the additive model is adequate for describing the dependence of η on the various parameters. On the contrary, if the observation is drastically different from the prediction, then we say the additive model is inadequate.
This is evidence of a strong interaction among the parameters, which is described later
in this section.
Variance of Prediction Error
We need to determine the variance of the prediction error so that we can judge the
closeness of the observed η_opt to the predicted η_opt. The prediction error, which is the difference between the observed η_opt and the predicted η_opt, has two independent components. The first component is the error in the prediction of η_opt caused by the errors in the estimates of m, m_A1, and m_B1. The second component is the repetition error of an experiment. Because these two components are independent, the variance of the prediction error is the sum of their respective variances.
Consider the first component. Its variance can be shown to be equal to (1/n_0)σ_e^2, where σ_e^2 is the error variance whose estimation was discussed earlier and n_0 is the equivalent sample size for the estimation of η_opt. The equivalent sample size n_0 can be computed as follows:

1/n_0 = 1/n + (1/n_A1 - 1/n) + (1/n_B1 - 1/n) ,    (3.14)
where n is the number of rows in the matrix experiment and n_A1 is the number of times level A_1 was repeated in the matrix experiment; that is, n_A1 is the replication number for factor level A_1, and n_B1 is the replication number for factor level B_1.
Observe the correspondence between Equations (3.14) and (3.12). The term (1/n) in Equation (3.14) corresponds to the term m in the prediction Equation (3.12); and the terms (1/n_A1 - 1/n) and (1/n_B1 - 1/n) correspond, respectively, to the terms (m_A1 - m) and (m_B1 - m). This correspondence can be used to generalize Equation (3.14) to other prediction formulae.
Now, consider the second component. Suppose we repeat the verification experiment n_r times under the optimum conditions and call the average η for these experiments the observed η_opt. The repetition error is given by (1/n_r)σ_e^2. Thus, the variance of the prediction error, σ_pred^2, is

σ_pred^2 = (1/n_0) σ_e^2 + (1/n_r) σ_e^2 .    (3.15)

In the example, n = 9 and n_A1 = n_B1 = 3. Thus, (1/n_0) = (1/9) + (1/3 - 1/9) + (1/3 - 1/9) = (5/9). Suppose n_r = 4. Then

σ_pred^2 = (5/9)(100) + (1/4)(100) = 80.6 (dB)^2 .
The corresponding two-standard-deviation confidence limits for the prediction error are
±17.96 dB. If the prediction error is outside these limits, we should suspect the possi-
bility that the additive model is not adequate. Otherwise, we consider the additive
model to be adequate.
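The same check can be scripted; the sketch below uses the numbers of this example (n = 9, replication number 3, pooled error variance 100 (dB)^2, and n_r = 4) with illustrative variable names:

    import math

    n = 9                    # rows in the matrix experiment
    n_A1 = n_B1 = 3          # replication numbers of the factor levels used in the prediction
    n_r = 4                  # repetitions of the verification experiment
    error_variance = 100.0   # pooled error variance, (dB)^2

    # Equation (3.14): equivalent sample size for the prediction of eta_opt.
    inv_n0 = 1.0 / n + (1.0 / n_A1 - 1.0 / n) + (1.0 / n_B1 - 1.0 / n)   # 5/9

    # Equation (3.15): variance of the prediction error.
    var_pred = inv_n0 * error_variance + (1.0 / n_r) * error_variance    # about 80.6 (dB)^2

    two_sigma = 2 * math.sqrt(var_pred)                                   # about 17.96 dB
    print(var_pred, two_sigma)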
Uniformity of Prediction Error Variance
It is obvious from Equation (3.15) that the variance of the prediction error, σ_pred^2, is the
same for all combinations of the factor levels in the experimental region. It does not
matter whether the particular combination does or does not correspond to one of the
rows in the matrix experiment. Before conducting the matrix experiment we do not
know what would be the optimum combination. Hence, it is important to have the
property of uniform prediction error.
Interactions among Control Factors
The concept of interactions can be understood from Figure 3.3. Figure 3.3(a) shows
the case of no interaction between two factors A and B. Here, the lines of the effect of
factor A for the settings B_1, B_2, and B_3 of factor B are parallel to each other. Parallel lines imply that if we change the level of factor A from A_1 to A_2 or A_3, the corresponding change in η is the same regardless of the level of factor B. Similarly, a change in the level of B produces the same change in η regardless of the level of factor A. The additive model is perfect for this situation. Figures 3.3(b) and 3.3(c) show two examples of the presence of interaction. In Figure 3.3(b), the lines are not parallel, but the direction of improvement does not change. In this case, the optimum levels identified by the additive model are still valid. In Figure 3.3(c), however, not only are the lines not parallel, but the direction of improvement is also not consistent. In such a case,
the optimum levels identified by the additive model can be misleading. The type of
interaction in Figure 3.3(b) is sometimes called synergistic interaction while the one in
Figure 3.3(c) is called antisynergistic interaction. The concept of interaction between
two factors described above can be generalized to apply to interaction among three or
more factors.
When interactions between two or more factors are present, we need cross product terms to describe the variation of η in terms of the control factors. A model for such a situation needs more parameters than an additive model and, hence, it needs more experiments to estimate all the parameters. Further, as discussed in Chapter 6, using a model with interactions can have problems in the field. Thus, we consider the presence of interactions to be highly undesirable and try to eliminate them.
When the quality characteristic is correctly chosen, the S/N ratio is properly con-
structed, and the control factors are judiciously chosen (see Chapter 6 for guidelines),
the additive model provides an excellent approximation for the relationship between η and the control factors.

[Figure 3.3 Examples of interaction: (a) no interaction, (b) synergistic interaction, (c) antisynergistic interaction. In each panel, η is plotted against levels A_1, A_2, A_3 of factor A for the settings B_1, B_2, B_3 of factor B.]

The primary purpose of the verification experiment is to warn us
when the additive model is not adequate and, thus, prevent faulty process and product
designs from going downstream. Some applications call for a broader assurance of the
additive model. In such cases, the verification experiment consists of two or more con-
ditions rather than just the optimum conditions. For the additive model to be con-
sidered adequate, the predictions must match the observation under all conditions that
are tested. Also, in certain situations, we can judge from engineering knowledge that
particular interactions are likely to be important. Then, orthogonal arrays can be suit-
ably constructed to estimate those interactions along with the main effects, as described
in Chapter 7.
3.6 SUMMARY
* A matrix experiment consists of a set of experiments where the settings of
several product or process parameters to be studied are changed from one experi-
ment to another.
Matrix experiments are also called designed experiments, parameters are also
called factors, and parameter settings are also called levels.
Conducting matrix experiments using orthogonal arrays is an important technique
in Robust Design. It gives more reliable estimates of factor effects with fewer
experiments when compared to the traditional methods, such as one factor at a
time experiments. Consequently, more factors can be studied with given R&D resources, leading to more robust and less expensive products.
* The columns of an orthogonal array are pairwise orthogonal—that is, for every
pair of columns, all combinations of factor levels occur an equal number of
times. The columns of the orthogonal array represent factors to be studied and
the rows represent individual experiments.
Conducting a matrix experiment with an orthogonal array is analogous to finding
the frequency response function of a dynamic system by using a multifrequency
input. The analysis of data obtained from matrix experiments is analogous to Fourier analysis.
* Some important terms used in matrix experiments are: The region formed by the
factors being studied and their alternate levels is called the experimental region.
The starting levels of the factors are the levels used before conducting the matrix
experiment. The main effects of the factors are their separate effects. If the
effect of a factor depends on the level of another factor, then the two factors are
said to have an interaction. Otherwise, they are considered to have no interac-
tion. The replication number of a factor level is the number of experiments in
the matrix experiment that are conducted at that factor level. The effect of a fac-
tor level is the deviation it causes from the overall mean response. The optimum
level of a factor is the level that gives the highest S/N ratio.
* An additive model (also called a superposition model or a variables separable model) is used to approximate the relationship between the response variable and the factor levels. Interactions are considered errors in the additive model.
* Orthogonal array based matrix experiments are used for a variety of purposes in
Robust Design. ‘They are used to:
— Study the effects of control factors
— Study the effects of noise factors
— Evaluate the S/N ratio
— Determine the best quality characteristic or S/N ratio for particular applica-
tions.
+ Key steps in analyzing data obtained from a matrix experiment are:
1. Compute the appropriate summary statistics, such as the S/N ratio for each
experiment.
2. Compute the main effects of the factors.
3. Perform ANOVA to evaluate the relative importance of the factors and the
error variance.
4. Determine the optimum level for each factor and predict the S/N ratio for the optimum combination.
5. Compare the results of the verification experiment with the prediction. If
the results match the prediction, then the optimum conditions are con-
sidered confirmed; otherwise, additional analysis and experimentation are
needed.
* If one or more experiments in a matrix experiment are missing or erroneous, then
those experiments should be repeated to complete the matrix. This avoids the
need for complicated analysis.
* Matrix experiment, followed by a verification experiment, is a powerful tool for
detecting the presence of interactions among the control factors. If the predicted
response under the optimum conditions does not match the observed response,
then it implies that the interactions are important. If the predicted response
matches the observed response, then it implies that the interactions are probably
not important and that the additive model is a good approximation.

CHAPTER 4
STEPS IN ROBUST DESIGN
As explained in Chapter 2, optimizing a product or process design means determining
the best architecture, levels of control factors, and tolerances. Robust Design is a
methodology for finding the optimum settings of the control factors to make the prod-
uct or process insensitive to noise factors. It involves eight steps that can be grouped
into the three major categories of planning experiments, conducting them, and analyz-
ing and verifying the results.
* Planning the experiment
1) Identify the main function, side effects, and failure modes.
2) Identify noise factors and the testing conditions for evaluating the quality
loss.
3) Identify the quality characteristic to be observed and the objective function
to be optimized.
4) Identify the control factors and their alternate levels.
5) Design the matrix experiment and define the data analysis procedure.
+ Performing the experiment
6) Conduct the matrix experiment.
+ Analyzing and verifying the experiment results
7) Analyze the data, determine optimum levels for the control factors, and
predict performance under these levels.
8) Conduct the verification (also called confirmation) experiment and plan
future actions.
These eight steps make up a Robust Design cycle. We will illustrate them in
this chapter by using a case study of improving a polysilicon deposition process. The
case study was conducted by Peter Hey in 1984 as a class project for the first offering
of the 3-day Robust Design course developed by the author, Madhav Phadke, and Chris
Sherrerd, Paul Sherry, and Rajiv Keny of AT&T Bell Laboratories. Hey and Sherry
jointly planned the experiment and analyzed the data. The experiment yielded a 4-fold
reduction in the standard deviation of the thickness of the polysilicon layer and nearly
two orders of magnitude reduction in surface defects, a major yield-limiting problem
which was virtually eliminated. These results were achieved by studying the effects of
six control factors by conducting experiments under 18 distinct combinations of the
levels of these factors—a rather small investment for huge benefits in quality and yield.
This chapter consists of nine sections:
* Sections 4.1 through 4.8 describe in detail the polysilicon deposition process
case study in terms of the eight steps that form a Robust Design cycle.
* Section 4.9 summarizes the important points of this chapter.
4.1 THE POLYSILICON DEPOSITION PROCESS AND ITS MAIN
FUNCTION
Manufacturing very large scale integrated (VLSI) circuits involves about 150 major
steps. Deposition of polysilicon comes after about half of the steps are complete, and,
as a result, the silicon wafers (thin disks of silicon) used in the process have a
significant amount of value-added by the time they reach this step. The polysilicon
layer is very important for defining the gate electrodes for the transistors. There are over 250,000 transistors in a square centimeter of chip area for the 1.75 micron (1 micron = 1 micrometer) design rules used in the case study.
A hot-wall, reduced-pressure reactor (see Figure 4.1) is used to deposit polysili-
con on a wafer. The reactor consists of a quartz tube which is heated by a 3-zone fur-
nace. Silane and nitrogen gases are introduced at one end and pumped out the other.
The silane gas pyrolizes, and a polysilicon layer is deposited on top of the oxide layer
on the wafers. The wafers are mounted on quartz carriers. Two carriers, each carrying
25 wafers, can be placed inside the reactor at a time so that polysilicon is deposited
simultaneously on 50 wafers.
[Figure 4.1 Schematic diagram of a reduced-pressure reactor.]
The function of the polysilicon deposition process is to deposit a uniform layer of a specified thickness. In the case study, the experimenters were interested in achieving a 3600 angstrom (Å) thickness (1 Å = 10^-10 meter). Figure 4.2 shows a cross section of the wafer after the deposition of the polysilicon layer.
[Figure 4.2 Cross section of a wafer showing the polysilicon layer. Labels include the interlevel dielectric, the SiO2 gate layer (360 Å), the P-doped polysilicon layer, and the Si substrate.]
At the start of the study, two main problems occurred during the deposition pro-
cess: (1) too many surface defects (see Figure 4.3) were encountered, and (2) too large
a thickness variation existed within wafers and among wafers. In a subsequent VLSI
manufacturing step, the polysilicon layer is patterned by an etching process to form
lines of appropriate width and length. Presence of surface defects causes these lines to
have variable width, which degrades the performance of the integrated circuits. The
nonuniform thickness is detrimental to the etching process because it can lead to resid-
ual polysilicon in some areas and an etching away of the underlying oxide layer in
other areas.
Figure 4.3 Photographs of polysilicon surface showing surface defects.
Prior to the case study, Hey noted that the surface-defect problem was crucial
because a significant percentage of wafers were scrapped due to excessive defects.
Also, he observed that controlling defect formation was particularly difficult due to its
intermittent occurrence; for example, some batches of wafers (50 wafers make one
batch) had approximately ten defects per unit area, while other batches had as many as
5,000 defects per unit area. Furthermore, no theoretical models existed to predict
defect formation as a function of the various process parameters; therefore, experi-
mentation was the only way to control the surface-defect problem. However, the
intermittency of the problem had rendered the traditional method of experimentation,
where only one process parameter is changed at a time, virtually useless.
4.2 NOISE FACTORS AND TESTING CONDITIONS
To minimize sensitivity to noise factors, we must first be able to estimate the sensi-
tivity in a consistent manner for any combination of the control factor levels. This is
achieved through proper selection of testing conditions.
In a Robust Design project, we identify all noise factors (factors whose levels cannot be controlled during manufacturing, are difficult to control, or are expensive to control), and then select a few testing conditions that capture the effect of the more important noise factors. Simulating the effects of all noise factors is impractical
important noise factors. Simulating the effects of all noise factors is impractical
because the experimenter may not know all the noise sources and because total simula-
tion would require too many testing conditions and be costly. Although it is not neces-
sary to include the effect of all noise factors, the experimenter should list as many of
them as possible and, then, use engineering judgment to decide which are more impor-
tant and what testing conditions are appropriate to capture their effects.
Various noise factors exist in the deposition process. The nonuniform thickness
and the surface defects of the polysilicon layer are caused by the variations in the
parameters involved in the chemical reactions associated with the deposition process.
First, the gases are introduced at one end of the reactor (see Figure 4.1). As they
travel to the other end, the silane gas decomposes into polysilicon, which is deposited
on the wafers, and into hydrogen. This activity causes a concentration gradient along
the length of the reactor. Further, the flow pattern (direction and speed) of the gases
need not be the same as they travel from one end of the tube to the other. The flow
pattern could also vary from one part of a wafer to other parts of the same wafer.
Another important noise factor is the temperature variation along the length and cross
section of the tube. There are, of course, other sources of variation or noise factors,
such as topography of the wafer surface before polysilicon deposition, variation in
pumping speed, and variation in gas supply.
For the case study of the polysilicon deposition process, Hey and Sherry decided
to process one batch of 50 wafers to evaluate the quality associated with each combina-
tion of control factor settings suggested by the orthogonal array experiment. Of these
50 wafers, only 3 were test wafers, while the remaining 47 were dummy wafers, which
provided the needed "full load" effect while saving the cost of expensive test wafers.
To capture the variation in reactant concentration, flow pattern variation, and tempera-
ture variation along the length of the tube, the test wafers were placed in positions 3,
23, and 48 along the tube. Furthermore, to capture the effect of noise variation across
a wafer, the thickness and surface defects were measured at three points on each test
wafer: top, middle, and bottom. Other noise factors were judged to be less important.
To include their effect, the experimenters would have had to process multiple batches,
thus making the experiments very expensive. Consequently, the other noise factors
were ignored.
The testing conditions for this case study are rather simple: observe thickness
and surface defects at three positions of three wafers, which are placed in specific posi-
tions along the length of the reactor. Sometimes orthogonal arrays (called noise
orthogonal arrays) are used to determine the testing conditions that capture the effect
of many noise factors. In some other situations, the technique of compound noise fac-
tor is used. These two techniques of constructing testing conditions are described in
Chapter 8.
4.3 QUALITY CHARACTERISTICS AND OBJECTIVE FUNCTIONS
It is often tempting to observe the percentage of units that meet the specification and
use that percentage directly as an objective function to be optimized. But such temptation should be meticulously avoided. Besides being a poor measure of quality loss,
using percentage of good (or bad) wafers as an objective function leads to orders of
magnitude reduction in efficiency of experimentation. First, to observe accurately the
percentage of "good" wafers, we need a large number (much larger than three) of test
wafers for each combination of control factor settings. Secondly, when the percentage
of good wafers is used as an objective function, the interactions among control factors
often become dominant; consequently, additive models cannot be used as adequate
approximations. The appropriate quality characteristics to be measured for the polysili-
con deposition process in the case study were the polysilicon thickness and the surface
defect count. The specifications were that the thickness should be within + 8 percent
of the target thickness and that the surface defect count should not exceed 10 per
square centimeter.
As stated in Section 4.2, nine measurements (3 wafers x 3 measurements per
wafer) of thickness and surface defects were taken for each combination of control fac-
tor settings in the matrix experiment. The ideal value for surface defects is zero—the
smaller the number of surface defects per cm^2, the better the wafer. So, by adopting
the quadratic loss function, we see that the objective function to be maximized is
η = -10 log_10 (mean square surface defect count)
  = -10 log_10 [ (1/9) Σ_{i=1}^{3} Σ_{j=1}^{3} y_ij^2 ] ,    (4.1)

where y_ij is the observed surface defect count at position j on test wafer i. Note that j = 1, 2, and 3 stand for the top, center, and bottom positions, respectively, on a test wafer, and i = 1, 2, and 3 refer to position numbers 3, 23, and 48, respectively, along the length of the tube. Maximizing η leads to minimization of the quality loss due to surface defects.
The target value in the study for the thickness of the polysilicon layer was 3600 Å. Let t_ij be the observed thickness at position j on test wafer i. The mean and variance of the thickness are given by

μ = (1/9) Σ_{i=1}^{3} Σ_{j=1}^{3} t_ij ,    (4.2)

σ^2 = (1/8) Σ_{i=1}^{3} Σ_{j=1}^{3} (t_ij - μ)^2 .    (4.3)
The goal in optimization for thickness is to minimize variance while keeping the
mean on target. This is a constrained optimization problem, which can be very
difficult, especially when many control factors exist. However, as Chapter 5 shows,
when a scaling factor (a factor that increases the thickness proportionally at all points
on the wafers) exists, the problem can be simplified greatly.
In the case study, the deposition time was a clear scaling factor—that is, for
every surface area where polysilicon was deposited, (thickness) = (deposition rate) x
(deposition time). The deposition rate may vary from one wafer to the next, or from
one position on a wafer to another position, due to the various noise factors cited in the
previous section. However, the thickness at any point is proportional to the deposition
time.
Thus, the constrained optimization problem in the case study can be solved in
two steps as follows:
1. Maximize the signal-to-noise (S/N) ratio, η':

   η' = 10 log_10 (μ^2 / σ^2) .    (4.4)

2. Adjust the deposition time so that the mean thickness is on target.
In summary, the two quality characteristics to be measured were the surface
defects and the thickness. The corresponding objective functions to be maximized
were | and 1)’ defined by Equations (4.1) and (4.4), respectively. (Note that S/N ratio
is a general term used for measuring sensitivity to noise factors. It takes a different
form depending on the type of quality characteristic, as discussed in detail in
Chapter 5. Both 1) and 1/’ are different types of S/N ratios.)
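As an illustration of Equations (4.1) through (4.4), the sketch below computes η and η' from a hypothetical 3 x 3 grid of measurements (3 test wafers x 3 positions); the data are invented for illustration, and the variance uses 8 degrees of freedom as in Equation (4.3):

    import numpy as np

    # Hypothetical measurements at 3 positions (top, center, bottom) on 3 test wafers.
    defects = np.array([[12.0, 5.0, 8.0],          # surface defect counts per unit area
                        [30.0, 22.0, 15.0],
                        [7.0,  9.0,  11.0]])
    thickness = np.array([[3510., 3580., 3650.],   # polysilicon thickness in angstroms
                          [3600., 3620., 3590.],
                          [3700., 3560., 3540.]])

    # Equation (4.1): smaller-the-better S/N ratio for surface defects.
    eta = -10 * np.log10(np.mean(defects ** 2))

    # Equations (4.2)-(4.4): S/N ratio for thickness uniformity.
    mu = thickness.mean()
    sigma2 = np.sum((thickness - mu) ** 2) / 8.0   # 8 degrees of freedom
    eta_prime = 10 * np.log10(mu ** 2 / sigma2)

    print(f"eta = {eta:.2f} dB, eta' = {eta_prime:.2f} dB")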
The economics of a manufacturing process is determined by the throughput as
well as by the quality of the products produced. Therefore, along with the quality
characteristics, a throughput characteristic also must be studied. Thus, in the case
study, the experimenters also observed the deposition rate, r, measured in angstroms of
thickness growth per minute.
4.4 CONTROL FACTORS AND THEIR LEVELS
Processes, such as polysilicon deposition, typically have a large number of control fac-
tors (factors that can be freely specified by the process designer). The more complex a
process, the more control factors it has and vice versa. Typically, we choose six to
eight control factors at a time to optimize a process. For each factor we generally
select two or three levels (or settings) and take the levels sufficiently far apart so that a
wide region can be covered by the three levels. Commonly, one of these levels is
taken to be the initial operating condition. Note that we are interested in the nonlinear-
ity, so taking the levels of control factors too close together is not very fruitful. If we
take only two levels, curvature effects would be missed, whereas such effects can be
identified by selecting three levels for a factor (see Figure 4.4). Furthermore, by select-
ing three levels, we can simultaneously explore the region on either side of the initial
operating condition. Hence, we prefer three levels.
[Figure 4.4 Linear and curvature effects of a factor: (a) with two points we can only fit a straight line; (b) with three points we can identify curvature effects and, hence, peaks.]
In the case study, six control factors were selected for optimization. These fac-
tors and their alternate levels are listed in Table 4.1. The deposition temperature (A) is
the steady state temperature at which the deposition takes place. When the wafers are
placed in the reactor, they first have to be heated from room temperature to the deposi-
tion temperature and then held at that temperature. The deposition pressure (B) is the
constant pressure maintained inside the reactor through appropriate pump speed and
butterfly adjustment. The nitrogen flow (C) and the silane flow (D) are adjusted using
the corresponding flow meters on gas tanks. Settling time (B) is the time between
placing the wafer carriers in the reactors and the time at which gases flow. The set-
tling time is important for establishing thermal and pressure equilibrium inside theSec. 4.4 Control Factors and Their Levels 75
reactor before the reaction is allowed to start. Cleaning method (F) refers to cleaning
the wafers prior to the deposition step. Before undertaking the case study experiment,
the practice was to perform no cleaning. The two alternate cleaning methods the experimenters wanted to study were CM_2, performed inside the reactor, and CM_3, performed outside the reactor.
TABLE 4.1 CONTROL FACTORS AND THEIR LEVELS

  Factor                              Level 1      Level 2      Level 3
  A. Deposition temperature (°C)      T_0 - 25     T_0*         T_0 + 25
  B. Deposition pressure (mtorr)      P_0 - 200    P_0*         P_0 + 200
  C. Nitrogen flow (sccm)             N_0*         N_0 - 150    N_0 - 75
  D. Silane flow (sccm)               S_0 - 100    S_0 - 50     S_0*
  E. Settling time (min)              t_0*         t_0 + 8      t_0 + 16
  F. Cleaning method                  None*        CM_2         CM_3

* Starting level.
While deciding on the levels of control factors, a frequent tendency is to choose
the levels relatively close to the starting levels. This is due to the experimenter’s con-
cern that a large number of bad products may be produced during the matrix experi-
ment. But, producing bad products during the experiment stage may, in fact, be
beneficial because it tells us which region of control factor levels should be avoided.
Also, by choosing levels that are wide apart, we increase the chance of capturing the
nonlinearity of the relationship between the control factors and the noise factors, and,
thus, finding the levels of control factors that minimize sensitivity to noise. Further,
when the levels are wide apart, the factor effects are large when compared to the exper-
imental errors. As a result, the factor effects can be identified without too many repeti-
tions.
Thus, it is important to resist the tendency to choose control factor levels that are
rather close. Of course, during subsequent refinement experiments, levels closer to
each other could be chosen. In the polysilicon deposition case study, the ratio of the
largest to the smallest levels of factors B, C, D, and, E was between three and five
which represents a wide variation, Temperature variation from (T)-25) °C to
(To +25) °C also represents a wide range in terms of the known impact on the deposi-
tion rate.76 Steps in Robust Design Chap. 4
The initial settings of the six control factors are indicated by an underscore in Table 4.1. The objective of this project was to determine the optimum level for each factor so that η and η′ are improved, while ensuring simultaneously that the deposition rate, r, remained as high as possible. Note that the six control factors and their selected settings define the experimental region over which process optimization was done.
4.5 MATRIX EXPERIMENT AND DATA ANALYSIS PLAN
An efficient way to study the effect of several control factors simultaneously is to plan
matrix experiments using orthogonal arrays. As pointed out in Chapter 3, orthogonal
arrays offer many benefits. First, the conclusions arrived at from such experiments are
valid over the entire experimental region spanned by the control factors and their set-
tings. Second, there is a large saving in the experimental effort. Third, the data
analysis is very easy. Finally, it can detect departure from the additive model.
An orthogonal array for a particular Robust Design project can be constructed
from the knowledge of the number of control factors, their levels, and the desire to
study specific interactions. While constructing the orthogonal array, we also take into
account the difficulties in changing the levels of control factors, other physical limita-
tions in conducting experiments, and the availability of resources. In the polysilicon
deposition case study, there were six factors, each at three levels. The experimenters
found no particular reason to study specific interactions and no unusual difficulty in
changing the levels of any factor. The available resources for conducting the experi-
ments were such that about 20 batches could be processed and appropriate measure-
ments made. Using the standard methods of constructing orthogonal arrays, which are
described in Chapter 7, the standard array L18 was selected for this matrix experiment.
The L18 orthogonal array is given in Table 4.2. It has eight columns and eighteen rows. The first column is a 2-level column—that is, it has only two distinct entries, namely 1 or 2. All the chosen six control factors have three levels. So, column 1 was kept empty or unassigned. From the remaining seven 3-level columns, column 7 was arbitrarily designated as an empty column, and factors A through F were assigned, respectively, to columns 2 through 6 and 8. (Note that keeping one or more columns empty does not alter the orthogonality property of the array. Thus, the matrix formed by columns 2 through 6 and 8 is still an orthogonal array. But, if one or more rows are dropped, the orthogonality is destroyed.) The reader can verify the orthogonality by checking that for every pair of columns all combinations of levels occur, and they occur an equal number of times.
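The balance property is easy to check mechanically. The following short Python sketch (not from the book) enters the seven 3-level columns of the L18 array of Table 4.2 and confirms that every pair of columns contains each combination of levels equally often, and that dropping a row destroys this balance.

    from itertools import combinations
    from collections import Counter

    # Columns 2-8 of the L18 array (the 3-level columns); column 1 is omitted here.
    # Row i gives the levels used in experiment i for columns 2, 3, ..., 8.
    L18 = [
        [1,1,1,1,1,1,1], [1,2,2,2,2,2,2], [1,3,3,3,3,3,3],
        [2,1,1,2,2,3,3], [2,2,2,3,3,1,1], [2,3,3,1,1,2,2],
        [3,1,2,1,3,2,3], [3,2,3,2,1,3,1], [3,3,1,3,2,1,2],
        [1,1,3,3,2,2,1], [1,2,1,1,3,3,2], [1,3,2,2,1,1,3],
        [2,1,2,3,1,3,2], [2,2,3,1,2,1,3], [2,3,1,2,3,2,1],
        [3,1,3,2,3,1,2], [3,2,1,3,1,2,3], [3,3,2,1,2,3,1],
    ]

    def is_orthogonal(array):
        """Every pair of columns must contain each level combination equally often."""
        n_cols = len(array[0])
        for c1, c2 in combinations(range(n_cols), 2):
            counts = Counter((row[c1], row[c2]) for row in array)
            if len(set(counts.values())) != 1:   # unequal replication: not balanced
                return False
        return True

    print(is_orthogonal(L18))          # True
    print(is_orthogonal(L18[:-1]))     # False: dropping a row destroys the balance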
The 18 rows of the L18 array represent the 18 experiments to be conducted. Thus, experiment 1 is to be conducted at level 1 for each of the six control factors. These levels can be read from Table 4.1. However, to make it convenient for the experimenter and to prevent translation errors, the entire matrix of Table 4.2 should be translated using the level definitions in Table 4.1 to create the experimenter's log sheet shown in Table 4.3.
TABLE 4.2  L18 ORTHOGONAL ARRAY AND FACTOR ASSIGNMENT

              Column Numbers and Factor Assignment*
    Expt.    1    2    3    4    5    6    7    8
    No.      e    A    B    C    D    E    e    F
     1       1    1    1    1    1    1    1    1
     2       1    1    2    2    2    2    2    2
     3       1    1    3    3    3    3    3    3
     4       1    2    1    1    2    2    3    3
     5       1    2    2    2    3    3    1    1
     6       1    2    3    3    1    1    2    2
     7       1    3    1    2    1    3    2    3
     8       1    3    2    3    2    1    3    1
     9       1    3    3    1    3    2    1    2
    10       2    1    1    3    3    2    2    1
    11       2    1    2    1    1    3    3    2
    12       2    1    3    2    2    1    1    3
    13       2    2    1    2    3    1    3    2
    14       2    2    2    3    1    2    1    3
    15       2    2    3    1    2    3    2    1
    16       2    3    1    3    2    3    1    2
    17       2    3    2    1    3    1    2    3
    18       2    3    3    2    1    2    3    1

    * Empty columns are identified by e.
TABLE 4.3  EXPERIMENTER'S LOG

    Expt.  Temperature  Pressure  Nitrogen  Silane   Settling   Cleaning
    No.    (°C)         (mtorr)   (sccm)    (sccm)   Time (min) Method
     1     T0-25        P0-200    N0        S0-100   t0         None
     2     T0-25        P0        N0-150    S0-50    t0+8       CM2
     3     T0-25        P0+200    N0-75     S0       t0+16      CM3
     4     T0           P0-200    N0        S0-50    t0+8       CM3
     5     T0           P0        N0-150    S0       t0+16      None
     6     T0           P0+200    N0-75     S0-100   t0         CM2
     7     T0+25        P0-200    N0-150    S0-100   t0+16      CM3
     8     T0+25        P0        N0-75     S0-50    t0         None
     9     T0+25        P0+200    N0        S0       t0+8       CM2
    10     T0-25        P0-200    N0-75     S0       t0+8       None
    11     T0-25        P0        N0        S0-100   t0+16      CM2
    12     T0-25        P0+200    N0-150    S0-50    t0         CM3
    13     T0           P0-200    N0-150    S0       t0         CM2
    14     T0           P0        N0-75     S0-100   t0+8       CM3
    15     T0           P0+200    N0        S0-50    t0+16      None
    16     T0+25        P0-200    N0-75     S0-50    t0+16      CM2
    17     T0+25        P0        N0        S0       t0         CM3
    18     T0+25        P0+200    N0-150    S0-100   t0+8       None
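The translation from the coded matrix to the log sheet is purely mechanical. The sketch below (a Python illustration, not the book's procedure; only the first two experiments are entered, and the symbolic starting values T0, P0, N0, S0, t0 are kept as strings) maps the level codes of Table 4.2 to the physical settings of Table 4.1, which is how Table 4.3 is obtained.

    # Level definitions from Table 4.1, indexed 1..3.
    levels = {
        "Temperature":   {1: "T0-25",  2: "T0",     3: "T0+25"},
        "Pressure":      {1: "P0-200", 2: "P0",     3: "P0+200"},
        "Nitrogen":      {1: "N0",     2: "N0-150", 3: "N0-75"},
        "Silane":        {1: "S0-100", 2: "S0-50",  3: "S0"},
        "Settling time": {1: "t0",     2: "t0+8",   3: "t0+16"},
        "Cleaning":      {1: "None",   2: "CM2",    3: "CM3"},
    }
    factors = list(levels)

    # Coded rows: columns 2-6 and 8 of the L18 array (factors A-F); first two shown.
    coded_rows = [
        [1, 1, 1, 1, 1, 1],   # experiment 1
        [1, 2, 2, 2, 2, 2],   # experiment 2
    ]

    for i, row in enumerate(coded_rows, start=1):
        settings = {f: levels[f][code] for f, code in zip(factors, row)}
        print(f"Expt {i}:", settings)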
Now we combine the experimenter's log sheet with the testing conditions described in Section 4.2 to create the following experimental procedure:

1. Conduct 18 experiments as specified by the 18 rows of Table 4.3.

2. For each experiment, process one batch, consisting of 47 dummy wafers and three test wafers. The test wafers should be placed in positions 3, 23, and 48.

3. For each experiment, compute to your best ability the deposition time needed to achieve the target thickness of 3600 Å. Note that in the experiment the actual thickness may turn out to be much different from 3600 Å. However, such data are perfectly useful for analysis. Thus, a particular experiment need not be redone by adjusting the deposition time to obtain 3600 Å thickness.

4. For each experiment, measure the surface defects and thickness at three specific points (top, center, and bottom) on each test wafer. Follow standard laboratory practice to prepare data sheets with space for every observation to be recorded.
4.6 CONDUCTING THE MATRIX EXPERIMENT
From Table 4.3 it is apparent that, from one experiment to the next, levels of several
control factors must be changed. This poses a considerable amount of difficulty to the
experimenter, Meticulousness in correctly setting the levels of the various control fac-
tors is critical to the success of a Robust Design project. Let us clarify what we mean
by meticulousness. Going from experiment 3 to experiment 4 we must change tem-
perature from (Ty -25) °C to To °C, pressure from (Py +200) mtorr to (P)—200)
mtorr, and so on. By meticulousness we mean ensuring that the temperature, pressure,
and other dials are set to their proper levels. Failure to set the level of a factor
correctly could destroy the valuable property of orthogonality. Consequently, conclu-
sions from the experiment could be erroneous. However, if an inherent error in the
equipment leads to an actual temperature of (To- 1) °C or (Ty +2) *C when the dial is
set at To °C, we should not bother to correct for such variations. Why? Because
unless we plan to change the equipment, such variations constitute noise and will con-
tinue to be present during manufacturing. If our conclusions from the matrix experi-
ment are to be valid in actual manufacturing, our results must not be sensitive to such
inherent variations. By keeping these variations out of our experiments, we lose the
ability to test for robustness against such variations. The matrix experiment, coupled
with the verification experiment, has a built-in check for sensitivity to such inherent
variations,
A difficulty in conducting matrix experiments is their radical difference from the current practice of conducting product or process design experiments. One common
practice is to guess, using engineering judgment, the improved settings of the control
factors and then conduct a paired comparison with the starting conditions. The guess-
and-test cycle is repeated until some minimum improvement is obtained, the deadline
is reached, or the budget is exhausted. This practice relies heavily on luck, and it is
inefficient and time-consuming.
Another common practice is to optimize systematically one control factor at a
time. Suppose we wish to determine the effect of the three temperature settings while
keeping the settings of the other control factors fixed at their starting levels. To reduce
the effect of experimental error, we must process several batches at each temperature80 Steps in Robust Design Chap. 4
setting. Suppose six batches are processed at each temperature setting. (Note that in the L18 array the replication number is six; that is, there are six experiments for each
factor level.) Then, we would need 18 batches to evaluate the effect of three tempera-
ture settings. For the other factors, we need to experiment with the two alternate
levels, so that we need to process 12 batches each. Thus, for the six factors, we would
need to process 18 + 5 x 12 = 78 batches. This is a large number compared to the 18
batches needed for the matrix experiment. Further, if there are strong interactions
among the control factors, this method of experimentation cannot detect them.
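The arithmetic behind the comparison can be tallied in a few lines (a Python illustration only, with six replicate batches per factor level assumed, as in the discussion above):

    replicates = 6
    levels_per_factor = 3
    n_factors = 6

    # First factor: all three levels must be run from scratch.
    batches = replicates * levels_per_factor                                 # 18
    # Each remaining factor: only the two alternate levels need new batches,
    # since its starting level was already covered.
    batches += (n_factors - 1) * replicates * (levels_per_factor - 1)        # + 5 * 12
    print(batches)   # 78, versus the 18 batches of the L18 matrix experiment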
The matrix experiment, though somewhat tedious to conduct, is highly
efficient—that is, when compared to the practices above, we can generate more depend-
able information about more control factors with the same experimental effort. Also,
this method of experimentation allows for the detection of the interactions among the
control factors, when they are present, through the verification experiment.
In practice, many design improvement experiments, where only one factor is
studied at a time, get terminated after studying only a few control factors because both
the R&D budget and the experimenter’s patience run out. As a result, the quality
improvement turns out to be only partial, and the product cost remains somewhat high.
This danger is reduced greatly when we conduct matrix experiments using orthogonal
arrays.
In the polysilicon deposition case study, the 18 experiments were conducted according to the experimenter's log given in Table 4.3. It took only nine days (2 experiments per day) to conduct them. The observed data on surface defects are listed in Table 4.4(a), and the thickness and deposition rate data are shown in Table 4.4(b). The surface defects were measured by placing the specimen under an optical microscope and counting the defects in a field of 0.2 cm². When the count was high, the field area was divided into smaller areas, defects in one area were counted, and the count was then multiplied by an appropriate number to determine the defect count per unit area (0.2 cm²). The thickness was measured by an optical interferometer. The deposition rate was computed by dividing the average thickness by the deposition time.
4.7 DATA ANALYSIS
The first step in data analysis is to summarize the data for each experiment. For the
case study, these calculations are illustrated next.
For experiment number 1, the S/N ratio for the surface defects, given by Equation (4.1), was computed as follows:

    η = -10 log10 [ (1/9) Σ(i=1..3) Σ(j=1..3) yij² ]
      = -10 log10 [ (1/9) (1² + 0² + 1² + 2² + 0² + 0² + 1² + 1² + 0²) ]
      = 0.51 dB.

From the thickness data, the mean, variance, and S/N ratio were calculated as follows by using Equations (4.2), (4.3), and (4.4):

    μ = (1/9) Σ(i=1..3) Σ(j=1..3) yij                                       [Eq. (4.2)]
      = (1/9) [ (2029 + 1975 + 1961) + (1975 + 1934 + 1907) + (1952 + 1941 + 1949) ]
      = 1958.1 Å.

    σ² = (1/8) Σ(i=1..3) Σ(j=1..3) (yij - μ)²                               [Eq. (4.3)]
       = (1/8) [ (2029 - 1958.1)² + ··· + (1949 - 1958.1)² ]
       = 1151.36 (Å)².

    η′ = 10 log10 (μ²/σ²)                                                   [Eq. (4.4)]
       = 10 log10 (1958.1² / 1151.36)
       = 35.22 dB.
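The same calculations can be expressed compactly in code. The sketch below (Python, not the original analysis software) reproduces the experiment 1 summary from the raw observations of Tables 4.4(a) and 4.4(b):

    import math

    # Raw observations for experiment 1 (three positions on each of three test wafers).
    surface_defects = [1, 0, 1, 2, 0, 0, 1, 1, 0]                       # defects per 0.2 cm^2
    thickness = [2029, 1975, 1961, 1975, 1934, 1907, 1952, 1941, 1949]  # angstroms
    deposition_rate = 14.5                                              # angstroms per minute

    # S/N ratio for surface defects: eta = -10 log10(mean square defect count).
    mean_square = sum(y * y for y in surface_defects) / len(surface_defects)
    eta = -10.0 * math.log10(mean_square)

    # Mean, variance (n - 1 divisor), and S/N ratio for thickness uniformity.
    n = len(thickness)
    mu = sum(thickness) / n
    var = sum((y - mu) ** 2 for y in thickness) / (n - 1)
    eta_prime = 10.0 * math.log10(mu * mu / var)

    # Deposition rate expressed in decibels: eta'' = 20 log10(r).
    eta_double_prime = 20.0 * math.log10(deposition_rate)

    print(round(eta, 2), round(mu, 1), round(var, 2),
          round(eta_prime, 2), round(eta_double_prime, 2))
    # -> 0.51  1958.1  1151.36  35.22  23.23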
TABLE 4.4(a)  SURFACE DEFECT DATA (DEFECTS/UNIT AREA)

              Test Wafer 1            Test Wafer 2            Test Wafer 3
    Expt.
    No.    Top  Center  Bottom    Top  Center  Bottom    Top  Center  Bottom
     1       1      0       1       2      0       0       1      1       0
     2       1      2       8     190      5       0      26      3       1
     3       3     35     106     360     38     135     315     50     180
     4       6     15       6      17      2       1              40      18
     5    1720   1980    2000     487    810     400    2020    360      13
     6     135    360    1620    2430    207       2    2500    270      35
     7     360    810    1215    1620      7      30    1800    720     315
     8     270   2730    5000     360      1       2    9999    225       1
     9    5000   1000    1000    3000   1000    1000    3000   2800    2000
    10       3      0       0       3      0       0       1      0       1
    11       1      0       1       5      0       0       1      0       1
    12       3   1620      90     216      5       4       2      8       3
    13       1     25     270     810     16       1      25      3       0
    14       3     21     162       9      6       1              15      39
    15     450   1200    1800    2530   2080    2080    1890    180      25
    16       5      6      40      54      0       8      14      1       1
    17    1200   3500    3500    1000      3       1    9999    600       8
    18    8000   2500    3500    5000   1000    1000    5000   2000    2000
TABLE 4.4(b)  THICKNESS AND DEPOSITION RATE DATA

                               Thickness (Å)                               Deposition
              Test Wafer 1            Test Wafer 2            Test Wafer 3    Rate
    Expt.                                                                   (Å/min)
    No.    Top  Center  Bottom    Top  Center  Bottom    Top  Center  Bottom
     1    2029   1975    1961    1975   1934    1907    1952   1941    1949    14.5
     2    5375   5191    5242    5201   5254    5309    5323   5307    5091    36.6
     3           5804    5874    6152                   6077                   41.4
     4    2118   2109    2099    2140   2125    2108    2149   2130    2111    36.1
     5    4102   4152    4174    4556   4504    4560    5031   5040    5032    73.0
     6    3022   2932    2913    2833   2837    2828    2934   2875    2841    49.5
     7    3030   3042    3028    3486   3333    3389    3709   3671    3687    76.6
     8    4707   4472    4336    4407   4156    4094    5073   4898    4599   105.4
     9    3859   3822    3850    3871   3922    3904    4110   4067    4110   115.0
    10    3227   3205    3242    3468   3450    3420    3599   3591    3535    24.8
    11    2521   2499    2499    2576   2537    2512    2551   2552    2570    20.0
    12    5921   5766    5844    5780           5814    5841   5777    5743    39.0
    13    2792   2752    2716    2684   2635    2606    2765   2786    2773    53.1
    14    2863   2835    2859    2829   2864    2839    2891   2844    2841    45.7
    15    3218   3149    3124    3261   3205    3223    3241   3189    3197    54.8
    16    3020   3008    3016    3072   3151    3139    3235   3162    3140    76.8
    17    4277   4150    3992    3888   3681    3572    4593   4298    4219   105.3
    18    3125   3119    3127    3567   3563    3520    4120   4088    4138    91.4
The deposition rate in the decibel scale for experiment 1 is given by

    η″ = 10 log10 r² = 20 log10 r
       = 20 log10 (14.5)
       = 23.23 dBam

where dBam stands for decibel Å/min.
The data summary for all 18 experiments was computed in a similar fashion and
the results are tabulated in Table 4.5.
Observe that the mean thickness for the 18 experiments ranges from 1958 Å to 5965 Å. But we are least concerned about this variation in the thickness because the
average thickness can be adjusted easily by changing the deposition time. During a
Robust Design project, what we are most interested in is the S/N ratio, which in this
case is a measure of variation in thickness as a proportion of the mean thickness.
Hence, no further analysis on the mean thickness was done in the case study, but the
mean thickness, of course, was used in computing the deposition rate, which was of
interest.
After the data for each experiment are summarized, the next step in data analysis
is to estimate the effect of each control factor on each of the three characteristics of
interest and to perform analysis of variance (ANOVA) as described in Chapter 3.
The factor effects for surface defects (η), thickness (η′), and deposition rate (η″), and the respective ANOVA are given in Tables 4.6, 4.7, and 4.8, respectively. A summary of the factor effects is tabulated in Table 4.9, and the factor effects are displayed graphically in Figure 4.5, which makes it easy to visualize the relative effects of the various factors on all three characteristics.
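As an illustration of how the entries of Table 4.6 are obtained, the following Python sketch (not the book's software) averages the per-experiment S/N ratios of Table 4.5 over the six experiments run at each level of factor A:

    # Per-experiment S/N ratios for surface defects (dB), from Table 4.5.
    eta = [0.51, -37.30, -45.17, -25.76, -62.54, -62.23, -59.88, -71.69, -68.15,
           -3.47, -5.08, -54.85, -49.38, -36.54, -64.18, -27.31, -71.51, -72.00]

    # Level of factor A (deposition temperature) in each of the 18 experiments.
    level_A = [1, 1, 1, 2, 2, 2, 3, 3, 3, 1, 1, 1, 2, 2, 2, 3, 3, 3]

    for lvl in (1, 2, 3):
        group = [e for e, a in zip(eta, level_A) if a == lvl]
        print("A%d: mean eta = %.2f dB" % (lvl, sum(group) / len(group)))
    # Approximately -24.23, -50.10, and -61.76 dB, matching the temperature
    # row of Table 4.6.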
To assist the interpretation of the factor effects plotted in Figure 4.5, we note the following relationship between the decibel scale and the natural scale for the three characteristics:

* An increase in η by 6 dB is equivalent to a reduction in the root mean square surface defects by a factor of 2. An increase in η by 20 dB is equivalent to a reduction in the root mean square surface defects by a factor of 10.

* The above statements are valid if we substitute η′ or η″ for η, and standard deviation of thickness or deposition rate for root mean square surface defects. (A short numerical check follows.)
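The conversion between decibels and the natural scale used in these statements is simply 10^(d/20) for a change of d dB in a quantity of the 10 log10(·)² type, as the short calculation below illustrates (Python, for illustration only):

    for delta_db in (6.0, 20.0, 26.0):
        print(delta_db, "dB ->", round(10 ** (delta_db / 20.0), 1), "x reduction")
    # 6 dB -> 2.0x, 20 dB -> 10.0x, 26 dB -> 20.0x
    # (26 dB is the temperature effect discussed in the observations below.)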
The task of determining the best setting for each control factor can become com-
plicated when there are multiple characteristics to be optimized. This is because dif-
ferent levels of the same factor could be optimum for different characteristics. The
quality loss function could be used to make the necessary trade-offs when different
characteristics suggest different optimum levels.
TABLE 4.5  DATA SUMMARY BY EXPERIMENT

                                   Surface              Deposition
            Experiment Condition   Defects   Thickness     Rate
    Expt.                             η        μ      η′     η″
    No.     e A B C D E e F          (dB)     (Å)    (dB)  (dBam)
     1      1 1 1 1 1 1 1 1          0.51    1958   35.22   23.23
     2      1 1 2 2 2 2 2 2        -37.30    5255   35.76   31.27
     3      1 1 3 3 3 3 3 3        -45.17    5965   36.02   32.34
     4      1 2 1 1 2 2 3 3        -25.76    2121   42.28   31.15
     5      1 2 2 2 3 3 1 1        -62.54    4572   21.43   37.27
     6      1 2 3 3 1 1 2 2        -62.23    2891   32.91   33.89
     7      1 3 1 2 1 3 2 3        -59.88    3375   21.39   37.68
     8      1 3 2 3 2 1 3 1        -71.69    4527   22.84   40.46
     9      1 3 3 1 3 2 1 2        -68.15    3946   30.60   41.21
    10      2 1 1 3 3 2 2 1         -3.47    3415   26.85   27.89
    11      2 1 2 1 1 3 3 2         -5.08    2535   38.80   26.02
    12      2 1 3 2 2 1 1 3        -54.85    5781   38.06   31.82
    13      2 2 1 2 3 1 3 2        -49.38    2723   32.07   34.50
    14      2 2 2 3 1 2 1 3        -36.54    2852   43.34   33.20
    15      2 2 3 1 2 3 2 1        -64.18    3201   37.44   34.76
    16      2 3 1 3 2 3 1 2        -27.31    3105   31.86   37.71
    17      2 3 2 1 3 1 2 3        -71.51    4074   22.01   40.45
    18      2 3 3 2 1 2 3 1        -72.00    3596   18.42   39.22

    * Empty columns are denoted by e.
[Figure 4.5 plots the factor effects for all three characteristics: η = -10 log10 (mean square surface defects), η′ = 10 log10 (μ²/σ²) for thickness, and η″ = 10 log10 (deposition rate)², each shown at the three levels of the factors A (temperature, °C), B (pressure, mtorr), C (nitrogen, sccm), D (silane, sccm), E (settling time, min), and F (cleaning method).]

Figure 4.5  Plots of factor effects. Underline indicates starting level. Two-standard-deviation confidence limits are also shown for the starting level. Estimated confidence limits for η″ are too small to show.
For the polysilicon deposition case study, we can make the following observations about the optimum setting from Figure 4.5 and Table 4.9:
* Deposition temperature (factor A) has the largest effect on all three characteristics. By reducing the temperature from the starting setting of T0 °C to (T0 - 25) °C, η can be improved by [(-24.23) - (-50.10)] ≈ 26 dB. This is equivalent to a 20-fold reduction in the root mean square surface defect count. The effect of this temperature change on thickness uniformity is only (35.12 - 34.91) = 0.21 dB, which is negligible. But the same temperature change would lead to a reduction in deposition rate by (34.13 - 28.76) = 5.4 dB, which is approximately a 2-fold reduction in the deposition rate. Thus, lowering the temperature can dramatically reduce the surface defect problem, but it also would double the deposition time. Accordingly, there is a trade-off to be made between reducing the quality cost (including the scrap due to high surface defect count) and the number of wafers processed per day by the reactor.

* Deposition pressure (factor B) has the next largest effect on surface defects and deposition rate. Reducing the pressure from the starting level of P0 mtorr to (P0 - 200) mtorr can improve η by about 20 dB (a 10-fold reduction in the root mean square surface defect count) at the expense of reducing the deposition rate by 2.75 dBam (the deposition rate drops to roughly 73 percent of its value at P0). The effect of pressure on thickness uniformity is very small.

* Nitrogen flow rate (factor C) has a moderate effect on all three characteristics. The starting setting of N0 sccm gives the highest S/N ratios for surface defects and thickness uniformity. There is also a possibility of further improving these two S/N ratios by increasing the flow rate of this dilutant gas. This is an important fact to be remembered for future experiments. The effect of nitrogen flow rate on deposition rate is small compared to the effects of temperature and pressure.

* Silane flow rate (factor D) also has a moderate effect on all three characteristics. Thickness uniformity is the best when silane flow rate is set at (S0 - 50) sccm. This can also lead to a small reduction in surface defects and the deposition rate.

* Settling time (factor E) can be used to achieve about a 10 dB improvement in surface defects by increasing the time from t0 minutes to (t0 + 8) minutes. The data indicate that a further increase in the settling time to (t0 + 16) minutes could negate some of the reduction in surface defect count. However, this change is small compared to the standard deviation of the error, and it is not physically justifiable. Settling time has no effect on the deposition rate and the thickness uniformity.

* Cleaning method (factor F) has no effect on deposition rate and surface defects. But, by instituting some cleaning prior to deposition, the thickness uniformity can be improved by over 6.0 dB (a factor of 2 reduction in standard deviation of thickness). Cleaning with CM2 or CM3 could give the same improvement in thickness uniformity. However, CM2 cleaning can be performed inside the reactor, whereas CM3 cleaning must be done outside the reactor. Thus, CM2 cleaning is more convenient.
From these observations, the optimum settings of factors E and F are obvious, namely E2 and F2. However, for factors A through D, the direction in which the quality characteristics (surface defects and thickness uniformity) improve tends to reduce the deposition rate. Thus, a trade-off between quality loss and productivity must be made in choosing their optimum levels. In the case study, since surface defects were the key quality problem that caused significant scrap, the experimenters decided to take care of it by changing the temperature from A2 to A1. As discussed earlier, this also meant a substantial reduction in deposition rate. Also, they decided to hold the other three factors at their starting levels, namely B2, C1, and D3.
TABLE 4.6  ANALYSIS OF SURFACE DEFECTS DATA*

                          Average η by Factor Level (dB)   Degrees of  Sum of   Mean
    Factor                   1        2        3            Freedom    Squares  Square     F
    A. Temperature        -24.23   -50.10   -61.76              2       4427     2214      27
    B. Pressure           -27.55   -47.44   -61.10              2       3416     1708      21
    C. Nitrogen           -39.03   -55.99   -41.07              2       1030      515     6.4
    D. Silane             -39.20   -46.85   -50.04              2        372      186     2.3
    E. Settling time      -51.52   -40.54   -44.03              2        378      189     2.3
    F. Cleaning method    -45.56   -41.58   -48.95              2        164†      82
    Error                                                       5        405†      81
    Total                                                      17      10192
    (Error)                                                    (7)      (569)     (81)

    * Overall mean η = -45.36 dB. Underscore indicates starting level.
    † Indicates the sum of squares added together to form the pooled error sum of squares shown in parentheses.
The potential these factors held would have been used if the confirmation experiment indicated a need to improve the surface defect count and thickness uniformity further. Thus, the optimum conditions chosen were A1 B2 C1 D3 E2 F2.
The next step in data analysis is to predict the anticipated improvements under the chosen optimum conditions. To do so, we first predict the S/N ratios for surface defects, thickness uniformity, and deposition rate using the additive model. These computations for the case study are displayed in Table 4.10. According to the table, an improvement in surface defects equal to [-19.84 - (-56.69)] = 36.85 dB should be anticipated, which is equivalent to a reduction in the root mean square surface defect count by a factor of 69.6. The projected improvement in thickness uniformity is 36.79 - 29.95 = 6.84 dB, which implies a reduction in standard deviation by a factor of 2.2. The corresponding change in deposition rate is 29.60 - 34.97 = -5.37 dB, which amounts to a reduction in the deposition rate by a factor of 1.9.
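The additive-model arithmetic of Table 4.10 can be sketched as follows (Python, not the book's software; factor F is omitted from the surface-defect prediction because its effect was pooled with the error and therefore contributes zero):

    overall_mean = -45.36                  # dB, from Table 4.6
    level_averages = {                     # average eta (dB) by factor level, Table 4.6
        "A": {1: -24.23, 2: -50.10, 3: -61.76},
        "B": {1: -27.55, 2: -47.44, 3: -61.10},
        "C": {1: -39.03, 2: -55.99, 3: -41.07},
        "D": {1: -39.20, 2: -46.85, 3: -50.04},
        "E": {1: -51.52, 2: -40.54, 3: -44.03},
    }

    def predict(setting):
        """Additive model: overall mean plus each factor's deviation from that mean."""
        return overall_mean + sum(level_averages[f][lvl] - overall_mean
                                  for f, lvl in setting.items())

    starting = {"A": 2, "B": 2, "C": 1, "D": 3, "E": 1}
    optimum  = {"A": 1, "B": 2, "C": 1, "D": 3, "E": 2}
    print(round(predict(starting), 2), round(predict(optimum), 2))   # -56.69  -19.84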
TABLE 4.7  ANALYSIS OF THICKNESS DATA*

                          Average η′ by Level (dB)         Degrees of  Sum of   Mean
    Factor                   1        2        3            Freedom    Squares  Square     F
    A. Temperature         35.12    34.91    24.52              2        440      220      16
    B. Pressure            31.61    30.70    32.24              2          7†     3.5
    C. Nitrogen            34.39    27.86    32.30              2        134       67     5.0
    D. Silane              31.68    34.70    28.17              2        128       64     4.8
    E. Settling time       30.52    32.87    31.16              2         18†       9
    F. Cleaning method     27.04    33.67    33.85              2        181     90.5     6.8
    Error                                                       5         96†    19.2
    Total                                                      17       1004     59.1
    (Error)                                                    (9)      (121)   (13.4)

    * Overall mean η′ = 31.52 dB. Underscore indicates starting level.
    † Indicates the sum of squares added together to form the pooled error sum of squares shown in parentheses.
TABLE 4.8  ANALYSIS OF DEPOSITION RATE DATA*

                          Average η″ by Factor Level (dBam)  Degrees of  Sum of   Mean
    Factor                   1        2        3              Freedom    Squares  Square     F
    A. Temperature         28.76    34.13    39.46                2       343.1    171.5    553
    B. Pressure            32.03    34.78    35.54                2        41.0     20.5     66
    C. Nitrogen            32.81    35.29    34.25                2        18.7      9.4     30
    D. Silane              32.21    34.53    35.61                2        36.3     18.1     58
    E. Settling time       34.06    33.99    34.30                2         0.3†     0.2
    F. Cleaning method     33.81    34.10    34.44                2         1.2†     0.6
    Error                                                         5         1.3†     0.26
    Total                                                        17       441.9     25.9
    (Error)                                                      (9)       (2.8)    (0.31)

    * Overall mean η″ = 34.12 dBam. Underscore indicates starting level.
    † Indicates the sum of squares added together to form the pooled error sum of squares shown in parentheses.
4.8 VERIFICATION EXPERIMENT AND FUTURE PLAN
Conducting a verification experiment is a crucial final step of a Robust Design project.
Its purpose is to verify that the optimum conditions suggested by the matrix experiment
do indeed give the projected improvement. If the observed S/N ratios under the
optimum conditions are close to their respective predictions, then we conclude that the additive model on which the matrix experiment was based is a good approximation of reality. Then, we adopt the recommended optimum conditions for our process or product, as the case may be. However, if the observed S/N ratios under the optimum conditions differ drastically from their respective predictions, there is evidence of failure of the additive model. There can be many reasons for the failure and, thus,
there are many ways of dealing with it. The failure of the additive model generally
indicates that choice of the objective function or the S/N ratio is inappropriate, the
observed quality characteristic was chosen incorrectly, or the levels of the control fac-
tors were chosen inappropriately. The question of how to avoid serious additivity
problems by properly choosing the quality characteristic, the S/N ratio, and the control
factors and their levels is discussed in Chapter 6.
TABLE 4.9  SUMMARY OF FACTOR EFFECTS

                               Surface Defects      Thickness       Deposition Rate
                                  η (dB)     F      η′ (dB)    F     η″ (dBam)    F
    A. Temperature (°C)          -24.23             35.12            28.76
                                 -50.10     27      34.91     16     34.13       553
                                 -61.76             24.52            39.46
    B. Pressure (mtorr)          -27.55             31.61            32.03
                                 -47.44     21      30.70      -     34.78        66
                                 -61.10             32.24            35.54
    C. Nitrogen (sccm)           -39.03             34.39            32.81
                                 -55.99    6.4      27.86    5.0     35.29        30
                                 -41.07             32.30            34.25
    D. Silane (sccm)             -39.20             31.68            32.21
                                 -46.85    2.3      34.70    4.8     34.53        58
                                 -50.04             28.17            35.61
    E. Settling time (min)       -51.52             30.52            34.06
                                 -40.54    2.3      32.87      -     33.99         -
                                 -44.03             31.16            34.30
    F. Cleaning method           -45.56             27.04            33.81
                                 -41.58      -      33.67    6.8     34.10         -
                                 -48.95             33.85            34.44
    Overall mean                 -45.36             31.52            34.12
Of course, another way to handle the additivity problem is to study a few key interactions among the control factors in future experiments. Construction of orthogonal arrays that permit the estimation of a few specific interactions, along with all main effects, is discussed in Chapter 7.
The verification experiment has two aspects: the first is that the predictions must
agree under the laboratory conditions; the second aspect is that the predictions should
be valid under actual manufacturing conditions for the process design and under actual
field conditions for the product design. A judicious choice of both the noise factors to
be included in the experiment and the testing conditions is essential for the predictions
made through the laboratory experiment to be valid under both manufacturing and field
conditions.
For the polysilicon deposition case study, four batches of 50 wafers containing 3
test wafers were processed under both the optimum condition and under the starting
conditions. The results are tabulated in Table 4.11. It is clear that the data agree very
well with the predictions about the improvement in the S/N ratios and the deposition
rate. So, we could adopt the optimum settings as the new process settings and proceed
to implement these settings.92 Steps in Robust Design Chap. 4
TABLE 4.10  PREDICTION USING THE ADDITIVE MODEL

                      Starting Condition                      Optimum Condition
                          Contribution† (dB)                      Contribution† (dB)
                      Surface             Deposition          Surface             Deposition
    Factor  Setting   Defects  Thickness  Rate      Setting   Defects  Thickness  Rate
    A*        A2       -4.74     3.39       0.01      A1       21.13     3.60      -5.36
    B         B2       -2.08     0.00       0.66      B2       -2.08     0.00       0.66
    C         C1        6.33     2.87      -1.31      C1        6.33     2.87      -1.31
    D         D3       -4.68    -3.35       1.49      D3       -4.68    -3.35       1.49
    E*        E1       -6.16     0.00       0.00      E2        4.82     0.00       0.00
    F*        F1        0.00    -4.48       0.00      F2        0.00     2.15       0.00
    Overall mean      -45.36    31.52      34.12              -45.36    31.52      34.12
    Total             -56.69    29.95      34.97              -19.84    36.79      29.60

    * Indicates the factors whose levels are changed from the starting to the optimum conditions.
    † By contribution we mean the deviation from the overall mean caused by the particular factor level.
TABLE 4.11  RESULTS OF VERIFICATION EXPERIMENT

                               Starting      Optimum
                               Condition     Condition     Improvement
    Surface      rms           600/cm²       7/cm²
    Defects      η             -55.6 dB      -16.9 dB      38.7 dB
    Thickness    std. dev.*    0.028         0.013
                 η′            31.1 dB       37.7 dB       6.6 dB
    Deposition   rate          60 Å/min      35 Å/min
    Rate         η″            35.6 dBam     30.9 dBam     -4.7 dBam

    * Standard deviation of thickness is expressed as a fraction of the mean thickness.
Follow-up Experiments
Optimization of a process or a product need not be completed in a single matrix exper-
iment. Several matrix experiments may have to be conducted in sequence before com-
pleting a product or process design. The information learned in one matrix experiment
is used to plan the subsequent matrix experiments for achieving even more improve-
ment in the process or the product. The factors studied in such subsequent experi-
ments, or the levels of the factors, are typically different from those studied in the ear-
lier experiments.
From the case-study data on the polysilicon deposition process, temperature
stood out as the most important factor—both for quality and productivity. The experi-
mental data showed that high temperature leads to excessive formation of surface
defects and nonuniform thickness. This led to identifying the type of temperature con-
troller as a potentially important control factor. The controller used first was an under-
damped controller, and, consequently, during the initial period of deposition, the reac-
tor temperature rose significantly above the steady-state set-point temperature. It was
then decided to try a critically damped controller. Thus, an auxiliary experiment was
conducted with two control factors: (1) the type of controller, and (2) the temperature
setting. This experiment identified the critically damped controller as being
significantly better than the underdamped one.
The new controller allowed the temperature setting to be increased to (T0 - 10) °C while keeping the surface defect count below 1 defect/unit area. The higher temperature also led to a deposition rate of 55 Å/min rather than the 35 Å/min that was observed in the initial verification experiment. Simultaneously, a standard deviation of thickness equal to 0.007 times the mean thickness was achieved.
Range of Applicability
In any development activity, it is highly desirable that the conclusions continue to be
valid when we advance to a new generation of technology. In the case study of the
polysilicon deposition process, this means that having developed the process with 4-
inch wafers, we would want it to be valid when we advance to 5-inch wafers. The
process developed for one application should be valid for other applications. Processes
and products developed by the Robust Design method generally possess this charac-
teristic of design transferability. In the case study, going from 4-inch wafers to 5-inch
wafers was achieved by making minor changes dictated by the thermal capacity calcu-
lations. Thus, a significant amount of development effort was saved in transferring the
process to the reactor that handled 5-inch wafers.
4.9 SUMMARY
Optimizing the product or process design means determining the best architecture, lev-
els of control factors, and tolerances. Robust Design is a methodology for finding the94 Steps in Robust Design Chap. 4
optimum settings of control factors to make the product or process insensitive to noise
factors. It involves eight major steps, which can be grouped as planning a matrix experiment to determine the effects of the control factors (Steps 1 through 5), conducting the matrix experiment (Step 6), and analyzing and verifying the results (Steps 7 and 8).
* Step 1. Identify the main function, side effects, and failure modes. This step requires engineering knowledge of the product or process and the customer's environment.

* Step 2. Identify noise factors and testing conditions for evaluating the quality loss. The testing conditions are selected to capture the effect of the more impor-
tant noise factors. It is important that the testing conditions permit a consistent
estimation of the sensitivity to noise factors for any combination of control factor
levels. In the polysilicon deposition case study, the effect of noise factors was
captured by measuring the quality characteristics at three specific locations on
each of three wafers, appropriately placed along the length of the tube. Noise
orthogonal array and compound noise factor are two common techniques for con-
structing testing conditions. These techniques are discussed in Chapter 8.
* Step 3. Identify the quality characteristic to be observed and the objective func-
tion to be optimized. Guidelines for selecting the quality characteristic and the
objective function, which is generically called S/N ratio, are given in Chapters 5
and 6. The common temptation of using the percentage of products that meet
the specification as the objective function to be optimized should be avoided. It
leads to orders of magnitude reduction in efficiency of experimentation. While
optimizing manufacturing processes, an appropriate throughput characteristic
should also be studied along with the quality characteristics because the econom-
ics of the process is determined by both of them.
* Step 4. Identify the control factors and their alternate levels. The more complex a product or a process, the more control factors it has, and vice versa. Typically, six to eight control factors are chosen at a time for optimization. For each control factor two or three levels are selected, out of which one level is usually the starting level. The levels should be chosen sufficiently far apart to cover a wide
experimental region because sensitivity to noise factors does not usually change
with small changes in control factor settings. Also, by choosing a wide experi-
mental region, we can identify good regions, as well as bad regions, for control
factors. Chapter 6 gives additional guidelines for choosing control factors and
their levels. In the polysilicon deposition case study, we investigated three levels
each of six control factors. One of these factors (cleaning method) had discrete
levels. For four of the factors the ratio of the largest to the smallest levels was
between three and five.
* Step 5. Design the matrix experiment and define the data analysis procedure. Using orthogonal arrays is an efficient way to study the effect of several control factors simultaneously. The factor effects thus obtained are valid over the entire experimental region, and the matrix experiment provides a way to test for the additivity of the factor effects. The experimental effort needed is much smaller when compared to other methods of experimentation, such as guess and test (trial and error), one factor at a time, and full factorial experiments. Also, the data analysis is easy when
orthogonal arrays are used. The choice of an orthogonal array for a particular
project depends on the number of factors and their levels, the convenience of
changing the levels of a particular factor, and other practical considerations.
Methods for constructing a suitable orthogonal array are given in Chapter 7. The
orthogonal array L18, consisting of 18 experiments, was used for the polysilicon deposition study. The array L18 happens to be the most commonly used array because it can be used to study up to seven 3-level factors and one 2-level factor.
* Step 6. Conduct the matrix experiment. Levels of several control factors must
be changed when going from one experiment to the next in a matrix experiment.
Meticulousness in correctly setting the levels of the various control factors is
essential—that is, when a particular factor has to be at level 1, say, it should not
be set at level 2 or 3. However, one should not worry about small perturbations
that are inherent in the experimental equipment. Any erroneous experiments or
missing experiments must be repeated to complete the matrix. Errors can be
avoided by preparing the experimenter’s log and data sheets prior to conducting
the experiments. This also speeds up the conduct of the experiments
significantly. The 18 experiments for the polysilicon deposition case study were
completed in 9 days.
* Step 7. Analyze the data, determine optimum levels for the control factors, and predict performance under these levels. The various steps involved in analyzing the data resulting from matrix experiments are described in Chapter 3. S/N ratios and other summary statistics are first computed for each experiment. (In
Robust Design, the primary focus is on maximizing the S/N ratio.) Then, the
factor effects are computed and ANOVA performed. The factor effects, along
with their confidence intervals, are plotted to assist in the selection of their
optimum levels. When a product or a process has multiple quality characteris-
tics, it may become necessary to make some trade-offs while choosing the
optimum factor levels. The observed factor effects together with the quality loss
function can be used to make rational trade-offs. In the polysilicon case study,
the data analysis indicated that the levels of three factors—deposition temperature (A), settling time (E), and cleaning method (F)—be changed, while the levels of the other three factors be kept at their starting levels.
* Step 8. Conduct the verification (confirmation) experiment and plan future
actions. The purpose of this final and crucial step is to verify that the optimum
conditions suggested by the matrix experiments do indeed give the projected
improvement. If the observed and the projected improvements match, we adopt
the suggested optimum conditions. If not, then we conclude that the additive
model underlying the matrix experiment has failed, and we find ways to correct
that problem. The corrective actions include finding better quality characteristics,
or signal-to-noise ratios, or different control factors and levels, or studying a fewSteps in Robust Design Chap. 4
specific interactions among the control factors. Evaluating the improvement in
quality loss, defining a plan for implementing the results, and deciding whether
another cycle of experiments is needed are also a part of this final step of Robust
Design. It is quite common for a product or process design to require more than
one cycle of Steps 1 through 8 for achieving needed quality and cost improve-
ment. In the polysilicon deposition case study, the verification experiment
confirmed the optimum conditions suggested by the data analysis. In a follow-up
Robust Design cycle, two control factors were studied—deposition temperature
and type of temperature controller. The final optimum process gave nearly two
orders of magnitude reduction in surface defects and a 4-fold reduction in the
standard deviation of the thickness of the polysilicon layer.

Chapter 5
SIGNAL-TO-NOISE RATIOS
The concept of quadratic loss function introduced in Chapter 2 is ideally suited for
evaluating the quality level of a product as it is shipped by a supplier to a customer.
“As shipped” quality means that the customer would use the product without any
adjustment to it or to the way it is used. Of course, the customer and the supplier
could be two departments within the same company.
A few common variations of the quadratic loss function were given in Chapter 2.
Can we use the quadratic loss function directly for finding the best levels of the control
factors? What happens if we do so? What objective function should we use to minim-
ize the sensitivity to noise? We examine these and other related questions in this
chapter. In particular, we describe the concepts behind the signal-to-noise (S/N) ratio
and the rationale for using it as the objective function for optimizing a product or pro-
cess design. We identify a number of common types of engineering design problems
and describe the appropriate S/N ratios for these problems. We also describe a pro-
cedure that could be used to derive S/N ratios for other types of problems. This
chapter has six sections:
* Section 5.1 discusses the analysis of the polysilicon thickness uniformity.
Through this discussion, we illustrate the disadvantages of direct minimization of
the quadratic loss function and the benefits of using S/N ratio as the objective
function for optimization.
* Section 5.2 presents a general procedure for deriving the S/N ratio.
* Section 5.3 describes common static problems (where the target value for the
quality characteristic is fixed) and the corresponding S/N ratios.
* Section 5.4 discusses common dynamic problems (where the quality characteris-
tic is expected to follow the signal factor) and the corresponding S/N ratios.
* Section 5.5 describes the accumulation analysis method for analyzing ordered
categorical data.
* Section 5.6 summarizes the important points of this chapter.
5.1 OPTIMIZATION FOR POLYSILICON LAYER THICKNESS
UNIFORMITY
One of the two quality characteristics optimized in the case study of the polysilicon
deposition process in Chapter 4 was the thickness of the polysilicon layer. Recall that
one of the goals was to achieve a uniform thickness of 3600 Å. More precisely, the
experimenters were interested in minimizing the variance of thickness while keeping
the mean on target. The objective of many robust design projects is to achieve a par-
ticular target value for the quality characteristic under all noise conditions. These types
of projects were previously referred to as nominal-the-best type problems. The detailed
analysis presented in this section will be helpful in formulating such projects. This
section discusses the following issues:
* Comparison of the quality of two process conditions
* Relationship between S/N ratio and quality loss after adjustment (Qa)
* Optimization for different target thicknesses
* Interactions induced by the wrong choice of objective function
* Identification of a scaling factor
* Minimization of standard deviation and mean separately
Comparing the Quality of Two Process Conditions
Suppose we are interested in determining which is a preferred temperature setting, T0 °C or (T0 + 25) °C, for achieving uniform thickness of the polysilicon layer around the target thickness of 3600 Å. We may attempt to answer this question by running a number of batches under the two temperature settings while keeping the other control factors fixed at certain levels. Suppose the observed mean thickness and standard deviation of thickness for these two process conditions are as given in Table 5.1. Although no experiments were actually conducted under these conditions, the data in Table 5.1 are realistic based on experience with the process. This is also true for all other data used in this section. Note that under temperature T0 °C, the mean thickness is 1800 Å, which is far away from the target, but the standard deviation is small. Whereas under temperature (T0 + 25) °C, the mean thickness is 3400 Å, which is close to the
target, but the standard deviation is large. As we observe here, it is very typical for
both the mean and standard deviation to change when we change the level of a factor.
TABLE 5.1  EFFECT OF TEMPERATURE ON THICKNESS UNIFORMITY

    Expt.  Temperature  Mean             Standard           Q†
    No.    (°C)         Thickness (μ)*   Deviation (σ)      (Å)²
                        (Å)              (Å)
    1      T0           1800             32                 3.241 × 10⁶
    2      T0 + 25      3400             200                8.000 × 10⁴

    * Target mean thickness = μ0 = 3600 Å.
    † Q = (μ - μ0)² + σ².
From the data presented in Table 5.1, which temperature setting can we recom-
mend? Since both the mean and standard deviation change when we change the tem-
perature, we may decide to use the quadratic loss function to select the better tempera-
ture setting. For a given mean, μ, and standard deviation, σ, the quality loss without adjustment, denoted by Q, is given by

    Q = quality loss without adjustment = k [ (μ - 3600)² + σ² ]            (5.1)

where k is the quality loss coefficient. Note that throughout this chapter we ignore the constant k (that is, set it equal to 1) because it has no effect on the choice of optimum levels for the control factors. The quality loss under T0 °C is 3.24 × 10⁶, while under (T0 + 25) °C it is 8.0 × 10⁴. Thus, we may conclude that (T0 + 25) °C is the better temperature setting. But, is that really a correct conclusion?
Recall that the deposition time is a scaling factor for the deposition process—that is, for any fixed settings of all other control factors, the polysilicon thickness at the various points within the reactor is proportional to the deposition time. Of course, the proportionality constant, which is the same as the deposition rate, could be different at different locations within the reactor. This is what leads to the variance, σ², of the polysilicon thickness. We can use this knowledge of the scaling factor to estimate the quality loss after adjusting the mean on target.
For T0 °C temperature, we can attain the mean thickness of 3600 Å by increasing the deposition time by a factor of 3600/1800 = 2.0. Correspondingly, the standard deviation would also increase by the factor of 3600/1800 to 64 Å. Thus, the estimated quality loss after adjusting the mean is 4.1 × 10³. Similarly, for (T0 + 25) °C we can obtain 3600 Å thickness by increasing the deposition time by a factor of 3600/3400, which would result in a standard deviation of 212 Å. Thus, the estimated quality loss after adjusting the mean is 4.49 × 10⁴. From these calculations it is clear that when the mean is adjusted to be on target, the quality loss for T0 °C is an order of magnitude smaller than the quality loss for (T0 + 25) °C; that is, the sensitivity to noise is much less when the deposition temperature is T0 °C as opposed to (T0 + 25) °C. Hence, T0 °C is the preferred temperature setting.
A decision based on quality loss without adjustment (Q) is influenced not only by the sensitivity to noise, but also by the deviation from the target mean (μ - μ0). Often, such a decision is heavily influenced, if not dominated, by the deviation from the target mean. As a result, we risk the possibility of not choosing the factor level that minimizes sensitivity to noise. This, of course, is clearly undesirable. But when we compute the quality loss after adjustment, denoted by Qa, for all practical purposes we eliminate the effect of change in mean. In fact, it is a way of isolating the sensitivity to noise factors. Thus, a decision based on Qa minimizes the sensitivity to noise, which is what we are most interested in during robust design.
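The comparison can be reproduced in a few lines. The sketch below (Python, for illustration; it assumes, as the text does, that deposition time is a pure scaling factor, so both the mean and the standard deviation scale by μ0/μ when the mean is put on target) computes Q and Qa for the two conditions of Table 5.1 and shows that the two criteria rank the temperature settings differently:

    mu0 = 3600.0                            # target thickness, angstroms

    def q_without_adjustment(mu, sigma):
        return (mu - mu0) ** 2 + sigma ** 2            # k set to 1

    def q_after_adjustment(mu, sigma):
        return (mu0 / mu * sigma) ** 2                 # = mu0**2 * sigma**2 / mu**2

    settings = {"T0": (1800.0, 32.0), "T0+25": (3400.0, 200.0)}
    for name, (mu, sigma) in settings.items():
        print(name, round(q_without_adjustment(mu, sigma)),
              round(q_after_adjustment(mu, sigma)))
    # Q ranks T0+25 as better (8.0e4 versus 3.2e6), but Qa reverses the ranking
    # (about 4.1e3 for T0 versus 4.5e4 for T0+25), agreeing with the text.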
Relationship between S/N Ratio and Qa

The general formula for computing the quality loss after adjustment for the polysilicon thickness problem, which is a nominal-the-best type problem, can be derived as follows: If the observed mean thickness is μ, we have to increase the deposition time by a factor of μ0/μ to get the mean thickness on target. The predicted standard deviation after adjusting the mean on target is (μ0/μ)σ, where σ is the observed standard deviation. So, we have

    Qa = quality loss after adjustment = k (μ0/μ)² σ².                       (5.2)

We can rewrite Equation (5.2) as follows:

    Qa = k μ0² (σ²/μ²).                                                      (5.3)

Since in a given project k and μ0 are constants, we need to focus our attention only on (μ²/σ²). We call (μ²/σ²) the S/N ratio because σ² is the effect of noise factors and μ is the desirable part of the thickness data. Maximizing (μ²/σ²) is equivalent to minimizing the quality loss after adjustment, given by Equation (5.3), and also equivalent to minimizing sensitivity to noise factors.
For improved additivity of the control factor effects, it is common practice to take the log transform of (μ²/σ²) and express the S/N ratio in decibels,

    η = 10 log10 (μ²/σ²).                                                    (5.4)

Although it is customary to refer to both (μ²/σ²) and η as the S/N ratio, it is clear from the context which one we mean. The range of values of (μ²/σ²) is (0, ∞), while the range of values of η is (-∞, ∞). Thus, in the log domain, we have better additivity of the effects of two or more control factors. Since log is a monotone function, maximizing (μ²/σ²) is equivalent to maximizing η.
Optimization for Different Target Thicknesses
Using the S/N ratio rather than the mean square deviation from target as an objective
function has one additional advantage. Suppose for a different application of the polysilicon deposition process, such as manufacturing a new code of microchips, we want to have 3000 Å target thickness. Then, the optimum conditions obtained by maximizing the S/N ratio would still be valid, except for adjustment of the mean. However, the same cannot be said if we used the mean square deviation from target as the objective function; we would have to perform the optimization again.
The problem of minimizing the variance of thickness while keeping the mean on target is a problem of constrained optimization. As discussed in Appendix B, by using the S/N ratio, the problem can be converted into an unconstrained optimization problem that is much easier to solve. The property of unconstrained optimization is the
basis for our ability to separate the actions of minimizing sensitivity to noise factors by
maximizing the S/N ratio and the adjustment of mean thickness on target.
When we advance from one technology of integrated circuit manufacturing to a
newer technology, we must produce thinner layers, print and etch smaller width lines,
etc. With this in mind, it is crucial that we focus our efforts on reducing sensitivity to
noise by optimizing the S/N ratio. The mean can then be adjusted to meet the desired
target. This flexible approach to process optimization is needed not only for integrated
circuit manufacturing, but also for virtually all manufacturing processes and optimiza-
tion of all product designs.
During product development, the design of subsystems and components must
proceed in parallel. Even though the target values for various characteristics of the
subsystems and components are specified at the beginning of the development activity,
it often becomes necessary to change the target values as more is learned about the
product. Optimizing the S/N ratio gives us the flexibility to change the target later in
the development effort. Also, the reusability of the subsystem design for other applica-
tions is greatly enhanced. Thus, by using the S/N ratio we improve the overall productivity of the development activity.
Interactions Induced by Wrong Choice of Objective Function
Using the quality loss without adjustment as the objective function to be optimized can
also lead to unnecessary interactions among the control factors. To understand this
point, let us consider again the data in Table 5.1. Suppose the deposition time for the
two experiments in Table 5.1 was 36 minutes. Now suppose we conducted two more
experiments with 80 minutes of deposition time and temperatures of T0 °C and (T0 + 25) °C. Let the data for these two experiments be as given in Table 5.2. For ease of comparison, the data from Table 5.1 are also listed in Table 5.2.
TABLE 5.2  INTERACTIONS CAUSED BY THE MEAN

    Expt.  Temperature  Deposition  Mean             Standard        Q†             Qa‡
    No.    (°C)         Time (min)  Thickness (μ)*   Deviation (σ)   (Å)²           (Å)²
                                    (Å)              (Å)
    1      T0           36          1800             32              3.241 × 10⁶    4.096 × 10³
    2      T0 + 25      36          3400             200             8.000 × 10⁴    4.484 × 10⁴
    3      T0           80          4000             70              1.649 × 10⁵    3.969 × 10³
    4      T0 + 25      80          7550             440             1.5796 × 10⁷   4.402 × 10⁴

    * Target mean thickness = μ0 = 3600 Å.
    † Q = (μ - μ0)² + σ².
    ‡ Qa = σ² μ0² / μ².
The quality loss without adjustment is plotted as a function of temperature for the two values of deposition time in Figure 5.1(a). We see that for 36 minutes of deposition time, (T0 + 25) °C is the preferred temperature, whereas for 80 minutes of deposition time the preferred temperature is T0 °C. Such opposite conclusions about the optimum levels of control factors (called interactions) are a major source of confusion and inefficiency in experimentation for product or process design improvement. Not only is the estimation of interaction expensive but the estimation might not yield the true optimum settings for the control factors—that is, if there are strong antisynergistic interactions among the control factors, we risk the possibility of choosing a wrong combination of factor levels for the optimum conditions. In this example, based on Q, we would pick the combination of (T0 + 25) °C and 36 minutes as the best combination. But, if we use the S/N ratio or the Qa as the objective function, we would unambiguously conclude that T0 °C is the preferred temperature [see Figure 5.1(b)].
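A short computation over the four rows of Table 5.2 makes the same point (Python, for illustration only):

    mu0 = 3600.0

    rows = [  # (temperature label, deposition time in minutes, mean, std. dev.)
        ("T0",    36, 1800.0,  32.0),
        ("T0+25", 36, 3400.0, 200.0),
        ("T0",    80, 4000.0,  70.0),
        ("T0+25", 80, 7550.0, 440.0),
    ]

    for temp, minutes, mu, sigma in rows:
        q  = (mu - mu0) ** 2 + sigma ** 2
        qa = (mu0 / mu) ** 2 * sigma ** 2
        print(f"{temp:6s} {minutes:3d} min   Q = {q:10.3e}   Qa = {qa:10.3e}")
    # At 36 min, Q favors T0+25; at 80 min, Q favors T0.  Qa favors T0 at both
    # deposition times, so the apparent interaction disappears.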
[Figure 5.1 shows three plots of the loss, in the decibel scale (10 log10), versus temperature (T0 and T0 + 25) for the 36-minute and 80-minute deposition times. (a) When Q is the objective function, the control factors, temperature and time, have a strong antisynergistic interaction. (b) When Qa is the objective function, there is no interaction between temperature and time; here, since time is a scaling factor, the curves for 36 min and 80 min deposition time are almost overlapping. (c) From this panel we see that much of the interaction in (a) is caused by the deviation of the mean from the target.]

Figure 5.1  Interactions caused by the mean.
The squared deviation of the mean from the target thickness is a component of the objective function Q [see Equation (5.1)]. This component is plotted in Figure 5.1(c). From the figure it is obvious that the interaction revealed in Figure 5.1(a) is primarily caused by this component. The objective function Qa does not have the squared deviation of the mean from the target as a component. Consequently, the corresponding interaction, which unnecessarily complicates the decision process, is eliminated.
In general, if we observe that for a particular objective function the interactions among the control factors are strong, we should look for the possibility that the objective function may have been selected incorrectly. The possibility exists that the objective function did not properly isolate the effect of noise factors and that it still has the deviation of the product's mean function from the target as a component.
Identification of a Scaling Factor
In the polysilicon deposition case study, the deposition time is an easily identified scal-
ing factor. However, in many situations where we want to obtain mean on target, the
scaling factor cannot be identified readily. How should we determine the best settings
of the control factors in such situations?
It might, then, be tempting to use the mean squared deviation from the target as
the objective function to be minimized. However, as explained earlier, minimizing the
mean squared deviation from the target can lead to wrong conclusions about the
optimum levels for the control factors; so, the temptation should be avoided. Instead,
we should begin with an assumption that a scaling factor exists and identify such a fac-
tor through experiments.
The objective function to be maximized, namely η, can be computed from the
observed μ and σ without knowing which factor is a scaling factor. Also, the scaling
operation does not change the value of η. Thus, the process of discovering a scaling
factor and the optimum levels for the various control factors is a simple one. It con-
sists of determining the effect of every control factor on η and on μ, and then classifying
these factors as follows:
1. Factors that have a significant effect on η. For these factors, we should pick the
levels that maximize η.
2. Factors that have a significant effect on μ but practically no effect on η. Any
one of these factors can serve as a scaling factor. We use one such factor to
adjust the mean on target. We are generally successful in finding at least one
scaling factor. However, sometimes we must settle for a factor that has a small
effect on η as a scaling factor.
3. Factors that have no effect on η and no effect on μ. These are neutral factors
and we can choose their best levels from other considerations such as ease of
operation or cost; a rough classification of this kind is sketched below.
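The following sketch (in Python, with made-up factor names, effect sizes, and thresholds that are not from the case study) illustrates this classification rule in code: each control factor is assigned to one of the three groups above according to the size of its effect on η and on μ.

# Hypothetical sketch: classify control factors by their effects on the
# S/N ratio (eta, in dB) and on the mean (mu). The effect numbers below
# are illustrative only, not data from the polysilicon case study.
factor_effects = {
    # factor: (range of eta across its levels in dB, range of mu across its levels)
    "A": (6.0, 0.1),   # strong effect on eta            -> pick level maximizing eta
    "B": (0.4, 2.5),   # no eta effect, large mu effect  -> candidate scaling factor
    "C": (0.3, 0.2),   # no effect on either             -> neutral factor
}

ETA_THRESHOLD = 1.0   # dB; below this the eta effect is treated as negligible
MU_THRESHOLD = 0.5    # same units as mu; below this the mu effect is negligible

for name, (eta_range, mu_range) in factor_effects.items():
    if eta_range >= ETA_THRESHOLD:
        role = "maximize eta (choose the level with the highest S/N ratio)"
    elif mu_range >= MU_THRESHOLD:
        role = "candidate scaling factor (use it to put the mean on target)"
    else:
        role = "neutral factor (choose level by cost or convenience)"
    print(f"Factor {name}: {role}")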
Minimizing Standard Deviation and Mean Separately
Another way to approach the problem of minimizing variance with the constraint that
the mean should be on target is, first, to minimize the standard deviation while ignoring
the mean, and, then, bring the mean on target without affecting the standard deviation
by changing a suitable factor. The difficulty with this approach is that often we cannot
find a factor that can change the mean over a wide range without affecting the stan-
dard deviation. This can be understood as follows: In these problems, when the mean
is zero, the standard deviation is also zero. However, for all other mean values, the
standard deviation cannot be identically zero. Thus, whenever a factor changes the
mean, it also affects the standard deviation. Also, an attempt to minimize the standard
deviation without paying attention to the mean drives both the standard deviation and
the mean to zero, which is not a worthwhile solution. Therefore, we should not try to
minimize σ without paying attention to the mean. However, we can almost always
find a scaling factor. Thus, an approach where we maximize the S/N ratio leads to
useful solutions.
Note that the above discussion pertains to the class of problems called nominal-
the-best type problems, of which polysilicon thickness uniformity is an example. A
class of problems called signed-target type problems where it is appropriate to first
minimize variance and then bring the mean on target is described in Section 5.3.
5.2 EVALUATION OF SENSITIVITY TO NOISE
Let us now examine the general problem of evaluating sensitivity to noise for a
dynamic system. Recall that in a dynamic system the quality characteristic is expected
to follow the signal factor. The ideal function for many products can be written as

y = M    (5.5)

where y is the quality characteristic (or the observed response) and M is the signal (or
the command input). In this section we discuss the evaluation of sensitivity to noise
for such dynamic systems. For specificity, suppose we are optimizing a servomotor (a
device such as an electric motor whose movement is controlled by a signal from a
command device) and that y is the displacement of the object that is being moved by
the servomotor and M specifies the desired displacement. To determine the sensitivity
of the servomotor, suppose we use the signal values M₁, M₂, ..., Mₘ; and for each
signal value, we use the noise conditions x₁, x₂, ..., xₙ. Let y_ij denote the observed
displacement for a particular setting of the control factors, z = (z₁, z₂, ..., z_q),
when the signal is M_i and the noise condition is x_j. Representative values of y_ij and the ideal func-
tion are shown in Figure 5.2. The average quality loss, Q(z), associated with the con-
trol factor settings, z, is given by
Q(z) = (k / (mn)) Σ_{i=1}^{m} Σ_{j=1}^{n} (y_ij − M_i)².    (5.6)
As shown by Figure 5.2, Q(z) includes not only the effect of noise factors but
also the deviation of the mean function from the ideal function. In practice, Q(z) could
be dominated by the deviation of the mean function from the ideal function. Thus, the
direct minimization of Q(z) could fail to achieve truly minimum sensitivity to noise. It
could lead simply to bringing the mean function on target, which is not a difficult
problem in most situations anyway. Therefore, whenever adjustment is possible, we
should minimize the quality loss after adjustment.
Figure 5.2 Evaluation of sensitivity to noise. The observed values y_ij are plotted against the signal values M₁, M₂, ..., Mₘ, together with the ideal function y = M and the observed mean function.
For the servomotor, it is possible to adjust a gear ratio so that, referring to Figure
5.2, the slope of the observed mean function can be made equal to the slope of the
ideal function. Let the slope of the observed mean function be β. By changing the
gear ratio we can change every displacement y_ij to v_ij = (1/β) y_ij. This brings the mean
function on target.
For the servomotor, the change of gear ratio leads to a simple linear transformation
of the displacement y_ij. In some products, however, the adjustment could lead to a
more complicated function between the adjusted value v_ij and the unadjusted value y_ij.
For a general case, let the effect of the adjustment be to change each y_ij to a value
v_ij = h_R(y_ij), where the function h_R defines the adjustment that is indexed by a param-
eter R. After adjustment, we must have the mean function on target—that is, the errors
(v_ij − M_i) must be orthogonal to the signal M_i. Mathematically, the requirement of
orthogonality can be written as
Σ_{i=1}^{m} Σ_{j=1}^{n} (v_ij − M_i) M_i = 0.    (5.7)
Equation (5.7) can be solved to determine the best value of R for achieving the
mean function on target. Then the quality loss after adjustment, Q_a(z), can be
evaluated as follows:

Q_a(z) = (k / (mn)) Σ_{i=1}^{m} Σ_{j=1}^{n} (v_ij − M_i)².    (5.8)
The quantity Q_a(z) is a measure of sensitivity to noise. It does not contain any part
that can be reduced by the chosen adjustment process. However, any systematic part
of the relationship between y and M that cannot be adjusted is included in Q_a(z). [For
the servomotor, the nonlinearity (2nd, 3rd, and higher order terms) of the relationship
between y and M is contained in Q_a(z).] Minimization of Q_a(z) makes the design
robust against the noise factors and reduces the nonadjustable part of the relationship
between y and M. Any control factor that has an effect on y_ij but has no effect on
Q_a(z) can be used to adjust the mean function on target without altering the sensitivity
to noise, which has already been minimized. Such a control factor is called an adjust-
ment factor.
It is easy to verify that minimization of Q_a(z), followed by adjusting the mean
function on target using an adjustment factor, is equivalent to minimizing Q(z) subject
to the constraint that the mean function is on target. This optimization procedure is
called a two-step procedure for obvious reasons. For further discussion of the 2-step
procedure and the S/N ratios, see Taguchi and Phadke [T6], Phadke and Dehnad [P4],
Leon, Shoemaker, and Kackar [L2], Nair and Pregibon [N2], and Box [B1].
It is important to be able to predict the combined effect of several control factors
from the knowledge of the effects of the individual control factors. The natural scale
of Q_a(z) is not suitable for this purpose because it could easily give us a negative pre-
diction for Q_a(z), which is absurd. By using the familiar decibel scale, we not only
avoid the negative prediction, but also improve the additivity of the factor effects.
Thus, to minimize the sensitivity to noise factors, we maximize η, which is given by

η = −10 log₁₀ Q_a(z).    (5.9)

Note that the constant k in Q_a(z), and sometimes some other constants, are generally
ignored because they have no effect on the optimization.
Following Taguchi, we refer to η as the S/N ratio. In the polysilicon deposition
example discussed in Section 5.1, we saw that Q_a ∝ (σ²/μ²), where σ² is the effect of
the noise factors, and μ² is the desirable part of the thickness data. Thus η is the ratio
of the power of the signal (the desirable part) to the power of the noise (the undesirable
part). As will be seen through the cases discussed in the subsequent sections of this
chapter, whenever a scaling type of adjustment factor exists, η takes the form of a
ratio of the power of the signal (the desirable part of the response) to the power of the
noise (the undesirable part of the response). Therefore, Q_a and η are both referred to
as the S/N ratio. As a matter of convention, we call Q_a and η the S/N ratio, even in
other cases where the "ratio" form is not that apparent.
The general optimization strategy can be summarized as follows:
1. Evaluate the effects of the control factors under consideration on η and on the
mean function.
2. For the factors that have a significant effect on η, select levels that maximize η.
3. Select any factor that has no effect on η but a significant effect on the mean
function as an adjustment factor. In practice, we must sometimes settle for a fac-
tor that has a small effect on η but a significant effect on the mean function as
an adjustment factor. Use the adjustment factor to bring the mean function on
target. Adjusting the mean function on target is the main quality control activity
in manufacturing. It is needed because of changing raw material, varying pro-
cessing conditions, etc. Thus, finding an adjustment factor that can be changed
conveniently during manufacturing is important. However, finding the level of
the adjustment factor that brings the mean precisely on target during product or
process design is not important.
4. For factors that have no effect on η and the mean function, we can choose any
level that is most convenient from the point of view of other considerations, such
as other quality characteristics and cost.
What adjustment is meaningful in a particular engineering problem and what fac-
tor can be used to achieve the adjustment depend on the nature of the particular prob-
lem. Subsequent sections discuss several common engineering problems and derive the
appropriate S/N ratios using the results of this section.
5.3 S/N RATIOS FOR STATIC PROBLEMS
Finding a correct objective function to maximize in an engineering design problem is
very important. Failure to do so, as we saw earlier, can lead to great inefficiencies in
experimentation and even wrong conclusions about the optimum levels. The task of
finding what adjustments are meaningful in a particular problem and determining the
correct S/N ratio is not always easy. Here, we describe some common types of static
problems and the corresponding S/N ratios.
Minimizing the surface defect count and achieving target thickness in polysilicon
deposition are both examples of static problems. In each case, we are interested in a
fixed target, so that the signal factor is trivial, and for all practical purposes, we can
say it is absent. In contrast, the design of an electrical amplifier is a dynamic problem
in which the input signal is the signal factor and our requirement is to make the output
signal proportional to the input signal. The tracking of the input signal by the output
signal makes it a dynamic problem. We discuss dynamic problems in Section 5.4.
Static problems can be further characterized by the nature of the quality charac-
teristic. Recall that the response we observe for improving quality is called quality
characteristic. The classification of static problems is based on whether the quality
characteristic is:
* Continuous or discrete
* Scalar, vector, or curve (such as frequency response function)
* Positive or covers the entire real line
* Such that the target value is extreme or finite
Commonly encountered types of static problems and the corresponding S/N
ratios are described below (see also Taguchi and Phadke [T6] and Phadke and Dehnad
[P4]). In these problems, the signal factor takes only one value. Thus, we denote by
y₁, y₂, ..., yₙ the n observations of the quality characteristic under different noise
conditions.
(a) Smaller-the-better Type Problem
Here, the quality characteristic is continuous and nonnegative—that is, it can take any
value from 0 to ∞. Its most desired value is zero. Such problems are characterized by
the absence of a scaling factor or any other adjustment factor. The surface defect count
is an example of this type of problem. Note that for all practical purposes we can treat
this count as a continuous variable.
Another example of a smaller-the-better type problem is the pollution from a
power plant. One might say that we can reduce the total pollutants emitted by reduc-
ing the power output of the plant. So why not consider the power output as an adjust-
ment factor? However, reducing pollution by reducing power consumption does not
signify any quality improvement for the power plant. Hence, it is inappropriate to
think of the power output as an adjustment factor. In fact, we should consider the pol-
lution per megawatt-hour of power output as the quality characteristic to be improved
instead of the pollution itself.
Additional examples of smaller-the-better type problems are electromagnetic radi-
ation from telecommunications equipment, leakage current in integrated circuits, and
corrosion of metals and other materials.
Because there is no adjustment factor in these problems, we should simply
minimize the quality loss without adjustment—that is, we should minimize
Q = k (mean square quality characteristic)
  = k [ (1/n) Σ_{i=1}^{n} y_i² ].    (5.10)
Minimizing Q is equivalent to maximizing η defined by the following equation:

η = −10 log₁₀ (mean square quality characteristic)
  = −10 log₁₀ [ (1/n) Σ_{i=1}^{n} y_i² ].    (5.11)
Note that we have ignored the constant k and expressed the quality loss in the decibel
scale.
In this case the signal is constant, namely to make the quality characteristic equal
to zero. Therefore, the S/N ratio, η, measures merely the effect of noise.
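As a minimal illustration of Equation (5.11), the following Python sketch computes the smaller-the-better S/N ratio for a set of observations taken under different noise conditions; the defect counts used here are made up for illustration.

import math

def sn_smaller_the_better(y):
    """S/N ratio (dB) for a smaller-the-better characteristic, Eq. (5.11):
    eta = -10 log10( (1/n) * sum(y_i^2) )."""
    n = len(y)
    mean_square = sum(v * v for v in y) / n
    return -10.0 * math.log10(mean_square)

# Example with made-up surface defect counts under 5 noise conditions:
defects = [3, 7, 2, 10, 5]
print(round(sn_smaller_the_better(defects), 2), "dB")   # about -15.7 dB

A larger (less negative) value of η corresponds to a smaller mean square defect count.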
(b) Nominal-the-best Type Problem
Here, as in the smaller-the-better type problem, the quality characteristic is continuous and
nonnegative—that is, it can take any value from 0 to ∞. Its target value is nonzero and
finite. For these problems, when the mean becomes zero, the variance also becomes
zero. Also, for these problems we can find a scaling factor that can serve as an adjust-
ment factor to move the mean on target.
This type of problem occurs frequently in engineering design. We have already
discussed the problem in great detail with particular reference to achieving target thick-
ness in polysilicon deposition. The objective function to be maximized for such prob-
lems is

η = 10 log₁₀ ( μ² / σ² )    (5.12)

where

μ = (1/n) Σ_{i=1}^{n} y_i   and   σ² = (1/(n−1)) Σ_{i=1}^{n} (y_i − μ)².    (5.13)
In some situations, the scaling factor can be identified readily through engineer-
ing expertise. In other situations, we can identify a suitable scaling factor through
experimentation.
The optimization of the nominal-the-best problems can be accomplished in two
steps:
1. Maximize η, or minimize sensitivity to noise. During this step we select the lev-
els of the control factors to maximize η while we ignore the mean.
2. Adjust the mean on target. During this step we use the adjustment factor to
bring the mean on target without changing η.
Note that, as we explained in Section 5.1, we should not attempt to minimize σ
and then bring the mean on target.
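The two-step procedure can be sketched as follows in Python. The thickness readings, setting names, and target are hypothetical, and the sample variance uses the n − 1 divisor of Equation (5.13).

import math

def sn_nominal_the_best(y):
    """S/N ratio (dB) for a nominal-the-best characteristic, Eq. (5.12):
    eta = 10 log10( mu^2 / sigma^2 ), with sigma^2 the sample variance."""
    n = len(y)
    mu = sum(y) / n
    sigma2 = sum((v - mu) ** 2 for v in y) / (n - 1)
    return 10.0 * math.log10(mu * mu / sigma2), mu

# Step 1: pick the control factor setting with the highest eta (made-up data).
candidates = {
    "setting_1": [3.58, 3.61, 3.55, 3.60],   # hypothetical thickness readings (micrometers)
    "setting_2": [3.10, 3.45, 2.95, 3.30],
}
scores = {name: sn_nominal_the_best(y) for name, y in candidates.items()}
best = max(scores, key=lambda name: scores[name][0])
eta, mu = scores[best]
print(best, "eta =", round(eta, 1), "dB, mean =", round(mu, 3))

# Step 2: bring the mean on target with a scaling factor (e.g., deposition time),
# which multiplies mu and sigma by the same constant and so leaves eta unchanged.
target = 3.60
print("scale the mean by", round(target / mu, 3), "to reach the target", target)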
(c) Larger-the-better Type Problem
Here, the quality characteristic is continuous and nonnegative, and we would like it to
be as large as possible. Also, we do not have any adjustment factor. Examples of
such problems are the mechanical strength of a wire per unit cross-section area, the
miles driven per gallon of fuel for an automobile carrying a certain amount of load,
etc. This problem can be transformed into a smaller-the-better type problem by consid-
ering the reciprocal of the quality characteristic. The objective function to be maxim-
ized in this case is given by
η = −10 log₁₀ (mean square reciprocal quality characteristic)
  = −10 log₁₀ [ (1/n) Σ_{i=1}^{n} (1/y_i²) ].    (5.14)
The following questions are often asked about the larger-the-better type prob-
lems: Why do we take the reciprocal of a larger-the-better type characteristic and then
treat it as a smaller-the-better type characteristic? Why do we not maximize the mean
square quality characteristic? This can be understood from the following result from
mathematical statistics:
Mean square reciprocal quality characteristic ≈ (1/μ²) [ 1 + 3σ²/μ² ],

where μ and σ² are the mean and variance of the quality characteristic. [Note that if y
denotes the quality characteristic, then the mean square reciprocal quality characteristic
is the same as the expected value of (1/y)².] Minimizing the mean square reciprocal
quality characteristic implies maximizing μ and minimizing σ², which is the desired
thing to do. However, if we were to try to maximize the mean square quality charac-
teristic, which is equal to (μ² + σ²), we would end up maximizing both μ and σ²,
which is not a desirable thing to do.
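The following sketch uses made-up tensile strength data to compute the larger-the-better S/N ratio of Equation (5.14) and to check numerically the approximation for the mean square reciprocal quoted above.

import math

def sn_larger_the_better(y):
    """S/N ratio (dB) for a larger-the-better characteristic, Eq. (5.14):
    eta = -10 log10( (1/n) * sum(1 / y_i^2) )."""
    n = len(y)
    msr = sum(1.0 / (v * v) for v in y) / n
    return -10.0 * math.log10(msr)

# Made-up tensile strength observations under different noise conditions:
strength = [48.0, 52.0, 45.0, 55.0, 50.0]
print("eta =", round(sn_larger_the_better(strength), 2), "dB")

# Numerical check of the approximation used in the text:
# E[(1/y)^2] is approximately (1/mu^2) * (1 + 3*sigma^2/mu^2).
n = len(strength)
mu = sum(strength) / n
sigma2 = sum((v - mu) ** 2 for v in strength) / n   # plug-in variance, for the check only
exact = sum(1.0 / (v * v) for v in strength) / n
approx = (1.0 / mu ** 2) * (1.0 + 3.0 * sigma2 / mu ** 2)
print("exact:", round(exact, 6), "approx:", round(approx, 6))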
(d) Signed-target Type Problem
In this class of problems, the quality characteristic can take positive as well as negative
values. Often, the target value for the quality characteristic is zero. If not, the target
value can be made zero by appropriately selecting the reference value for the quality
characteristic. Here, we can find an adjustment factor that can move the mean without
changing the standard deviation. Note that signed-target problems are inherently dif-
ferent from smaller-the-better type problems, even though in both cases the best value
is zero. In the signed-target problems, the quality characteristic can take positive as
well as negative values, whereas in the smaller-the-better type problems the quality
characteristic cannot take negative values. The range of possible values for the quality
characteristic also distinguishes signed-target problems from nominal-the-best type
problems. There is one more distinguishing feature. In signed-target type problems
when the mean is zero, the standard deviation is not zero, but in nominal-the-best type
problems when the mean is zero, the standard deviation is also zero.
An example of signed-target problems is the dc offset voltage of a differential
operational amplifier. The offset voltage could be positive or negative. If the offset
voltage is consistently off zero, then we can easily compensate for it in the circuit that
receives the output of the differential operational amplifier without affecting the stan-
dard deviation. The design of a differential operational amplifier is discussed in
Chapter 8.
In such problems, the objective function to be maximized is given by
η = −10 log₁₀ σ²
  = −10 log₁₀ [ (1/(n−1)) Σ_{i=1}^{n} (y_i − μ)² ].    (5.15)
Note that this type of problem occurs less frequently than the nominal-the-best type
problems.
(e) Fraction Defective
This is the case when the quality characteristic, denoted by p, is a fraction taking values
between 0 and 1. Obviously, the best value for p is zero. Also, there is no adjustment
factor for these problems. When the fraction defective is p, on the average we have to
manufacture 1/(1−p) pieces to produce one good piece. Thus, for every good piece pro-
duced, there is a waste and, hence, a loss that is equivalent to the cost of processing
[1/(1−p) − 1] = p/(1−p) pieces. Thus, the quality loss, Q, is given by

Q = k [ p / (1−p) ],    (5.16)

where k is the cost of processing one piece. Ignoring k, we obtain the objective function
to be maximized in the decibel scale as

η = −10 log₁₀ [ p / (1−p) ].    (5.17)
Note that the range of possible values of Q is 0 to ∞, but the range of possible values of η
is −∞ to ∞. Therefore, the additivity of the factor effects is better for η than for Q. The S/N
ratio for the fraction-defective problems is the same as the familiar logit transform, which
is commonly used in biostatistics for studying drug response.
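A small sketch of Equation (5.17), evaluated for a few illustrative values of the fraction defective p:

import math

def sn_fraction_defective(p):
    """S/N ratio (dB) for fraction defective, Eq. (5.17): eta = -10 log10( p / (1 - p) )."""
    return -10.0 * math.log10(p / (1.0 - p))

for p in (0.5, 0.10, 0.01):          # illustrative fractions defective
    print(p, "->", round(sn_fraction_defective(p), 2), "dB")

As p decreases toward zero, η increases without bound, which is what gives this scale its better additivity.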
(f) Ordered Categorical
Here, the quality characteristic takes ordered categorical values. For example, after a
drug treatment we may observe a patient’s condition as belonging to one of the following
categories: worse, no change, good, or excellent. In this situation, the extreme category,
excellent, is the most desired category. However, in some other cases, an intermediate
category is the most desired category. For analyzing data from ordered categorical prob-
lems, we form cumulative categories and treat each category (or its complement, as the
case may be) as a fraction-defective type problem. We give an example of analysis of
ordered categorical data in Section 5.5.
(g) Curve or Vector Response
As the name suggests, in this type of problem the quality characteristic is a curve or a
vector rather than a single point. The treatment of this type of problem is described in
Chapter 6 in conjunction with the design of an electrical filter and paper transport in
copying machines. The basic strategy in these problems is to break them into several
scalar problems where each problem is of one of the previously discussed types.
5.4 S/N RATIOS FOR DYNAMIC PROBLEMS
Dynamic problems have even more variety than static problems because of the many
types of potential adjustments. Nonetheless, we use the general procedure described in
Sections 5.1 and 5.3 to derive the appropriate objective functions or the S/N ratio.
Dynamic problems can be classified according to the nature of the quality characteristic
and the signal factor, and, also, the ideal relationship between the signal factor and the
quality characteristic. Some common types of dynamic problems and the correspond-
ing S/N ratios are given below (see also Taguchi [T1], Taguchi and Phadke [T6], and
Phadke and Dehnad [P4]).
(a) Continuous-continuous (C-C)
Here, both the signal factor and the quality characteristic take positive or negative con-
tinuous values. When the signal is zero, that is, M = 0, the quality characteristic is also
zero, that is, y = 0. The ideal function for these problems is y = M, and a scaling fac-
tor exists that can be used to adjust the slope of the relationship between y and M.
This is one of the most common types of dynamic problems. The servomotor
example described in Section 5.2 is an example of this type. Some other examples are
analog telecommunication, design of test sets (such as voltmeter and flow meter), and
design of sensors (such as the crankshaft position sensor in an automobile).
We now derive the S/N ratio for the C-C type problems. As described in Sec-
tion 5.2, let y_ij be the observed quality characteristic for the signal value M_i and noise
condition x_j. The quality loss without adjustment, Q(z), is given by Equation (5.6).
The quality loss has two components. One is due to the deviation from linearity and
the other is due to the slope being other than one. Of the two components, the latter
can be eliminated by adjusting the slope. In order to find the correct adjustment for
given control factor settings, we must first estimate the slope of the best linear relation-
ship between y_ij and M_i. Consider the regression of y_ij on M_i given by

y_ij = β M_i + e_ij    (5.18)

where β is the slope and e_ij is the error. The slope β can be estimated by the least
squares criterion as follows:
∂/∂β [ Σ_{i=1}^{m} Σ_{j=1}^{n} (y_ij − β M_i)² ] = 0,    (5.19)

that is,

Σ_{i=1}^{m} Σ_{j=1}^{n} (y_ij − β M_i) M_i = 0,    (5.20)

that is,

β = [ Σ_{i=1}^{m} Σ_{j=1}^{n} M_i y_ij ] / [ n Σ_{i=1}^{m} M_i² ].    (5.21)
Note that Equation (5.20) is nothing but a special case of the general Equation
(5.7) for determining the best adjustment. Here, h_R(y_ij) = (1/β) y_ij = v_ij and β is the
same as the index R. Also note that the least squares criterion is analogous to the cri-
terion of making the error [(1/β) y_ij − M_i] orthogonal to the signal M_i.
The quality loss after adjustment is given by

Q_a(z) = (k / (β² m n)) Σ_{i=1}^{m} Σ_{j=1}^{n} (y_ij − β M_i)².
Minimizing Q_a is equivalent to maximizing η given by

η = 10 log₁₀ ( β² / σ_e² ),    (5.22)

where σ_e² = (1/(mn)) Σ_{i=1}^{m} Σ_{j=1}^{n} (y_ij − β M_i)² is the mean square error about the fitted line.
Note that β is the change in y produced by a unit change in M. Thus, β² quantifies the
effect of the signal. The denominator σ_e² is the effect of noise. Hence, η is called the S/N
ratio. Note that σ_e² includes sensitivity to noise factors as well as the nonlinearity of
the relationship between y and M. Thus, maximization of η leads to a reduction in non-
linearity along with the reduction in sensitivity to noise factors.
In summary, the C-C type problems are optimized by maximizing η given by
Equation (5.22). After maximization of η, the slope is adjusted by a suitable scaling
factor. Note that any control factor that has no effect on η but an appreciable effect on
β can serve as a scaling factor.
Although we have shown the optimization for the target function y = M, it is still
valid for all target functions that can be obtained by adjusting the slope—that is, the
optimization is valid for any target function of the form y = β₀M, where β₀ is the
desired slope.
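The computation can be sketched as follows in Python. The signal levels, noise conditions, and displacement readings are made up, and σ_e² is taken as the mean square error about the fitted line, as assumed in Equation (5.22).

import math

def sn_dynamic_cc(M, y):
    """Zero-intercept dynamic S/N ratio, Eq. (5.22): eta = 10 log10(beta^2 / sigma_e^2).
    M: list of m signal values; y: m x n table, y[i][j] = response at signal M[i], noise j."""
    m, n = len(M), len(y[0])
    beta = sum(M[i] * y[i][j] for i in range(m) for j in range(n)) / (n * sum(Mi * Mi for Mi in M))
    sigma_e2 = sum((y[i][j] - beta * M[i]) ** 2 for i in range(m) for j in range(n)) / (m * n)
    return beta, 10.0 * math.log10(beta * beta / sigma_e2)

# Made-up servomotor-style data: 3 signal levels, 2 noise conditions each.
M = [10.0, 20.0, 30.0]
y = [[10.8, 9.9],
     [21.5, 20.2],
     [31.9, 30.5]]
beta, eta = sn_dynamic_cc(M, y)
print("beta =", round(beta, 3), "eta =", round(eta, 1), "dB")
# A scaling factor (e.g., the gear ratio) is then used to bring the slope from beta to 1.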
Another variation of the C-C type target function is

y = α₀ + β₀ M.    (5.23)

In this case, we must consider two adjustments: one for the intercept and the other for
the slope. One might think of this as a vector adjustment factor. The S/N ratio to be
maximized for this problem can be shown to be η, given by Equation (5.22). The two
adjustment factors should be able to change the intercept and the slope, and should
have no effect on η.
(b) Continuous-digital (C-D)
A temperature controller where the input temperature setting is continuous, while the
output (which is the ON or OFF state of the heating unit) is discrete is an example of
the C-D type problem. Such problems can be divided into two separate problems: one
for the ON function and the other for the OFF function. Each of these problems can
be viewed as a separate continuous-continuous or nominal-the-best type problem. The
design of a temperature control circuit is discussed in detail in Chapter 9.
(c) Digital-continuous (D-C)
The familiar digital-to-analog converter is an example of the D-C case problem. Here
again, we separate the problems of converting the 0 and 1 bits into the respective
continuous values. The conversion of 0, as well as the conversion of 1, can be viewed
as a nominal-the-best type static problem.
(d) Digital-digital (D-D)
Digital communication systems, computer operations, etc., where both the signal factor
and the quality characteristic are digital, are examples of the D-D type problem. Here,
the ideal function is that whenever 0 is transmitted, it should be received as 0, and
whenever 1 is transmitted, it should be received as 1. Let us now derive an appropri-
ate objective function for minimizing sensitivity to noise.
Here, the signal values for testing are M = 0 and M = 1. Suppose under cer-
tain settings of control factors and noise conditions, the probability of receiving 1,
when 0 is transmitted, is p (see Table 5.3). Thus, the average value of the received
signal, which is the same as the quality characteristic, is p and the corresponding vari-
ance is p(1-p). Similarly, suppose the probability of receiving 0, when 1 is transmit-
ted, is q. Then, the average value of the corresponding received signal is (1 − q) and the
corresponding variance is q(1 − q). The ideal transmit-receive relationship and the
observed transmit-receive relationship are shown graphically in Figure 5.3. Although
the signal factor and the quality characteristic take only 0-1 values, for convenience we
represent the transmit-receive relationship as a straight line. Let us now examine the
possible adjustments.
TABLE 5.3 TRANSMIT-RECEIVE RELATIONSHIP FOR DIGITAL COMMUNICATION

                         Probabilities Associated       Properties of the
                         with the Received Signal       Received Signal
Transmitted Signal          0          1                Mean       Variance
        0                 1 − p        p                  p         p(1 − p)
        1                   q        1 − q              1 − q       q(1 − q)
It is well-known that a communication system is inefficient if the errors of
transmitting 0 and 1 are unequal. More efficient transmission is achieved by making
p = q. This can be accomplished by a leveling operation, an operation such as chang-
ing the threshold. The leveling operation can be conceptualized as follows: Under-
neath the transmission of a digital signal, there is a continuous signal such as voltage,
frequency, or phase. If it is at all possible and convenient to observe the underlying
continuous variable, we should prefer it. In that case, the problem can be classified as
a C-D type and dealt with by the procedure described earlier. Here, we consider the
situation when it is not possible to measure the continuous variable.
Figure 5.3 Digital communication. The received signal is plotted against the transmitted signal: line (a) is the ideal function, line (b) the observed function, and line (c) the function after leveling.
Figure 5.4 shows possible distributions of the continuous variable received at the
output terminal when 0 or 1 is transmitted. If the threshold value is R₁, the errors of
0 would be far more likely than the errors of 1. However, if the threshold is moved to
R₂, we would get approximately equal errors of 0 and 1. The effect of this adjustment
is also to reduce the total error probability (p + q).
Figure 5.4 Effect of leveling on error probabilities. (a) When the threshold is at R₁, the error probabilities p and q are not equal. (b) By adjusting the threshold to R₂, we can make the two error probabilities equal, i.e., p′ = q′.
How does one determine p′ (which is equal to q′) corresponding to the observed
error rates p and q? The relationship between p′, p, and q will obviously depend on
the continuous distribution. However, we are considering a situation where we do not
have the ability to observe the distributions. Taguchi has suggested the use of the fol-
lowing relationship for estimating p′ after equalization or leveling:

−10 log₁₀ [ p′/(1 − p′) ] = (1/2) { −10 log₁₀ [ p/(1 − p) ] − 10 log₁₀ [ q/(1 − q) ] }.    (5.24)
The two terms on the right hand side of Equation (5.24) are fraction-defective type S/N
ratios for the separate problems of the errors of 0 and errors of 1. Equation (5.24)
asserts that the effect of equalization is to make the two S/N ratios equal to the average
of the S/N ratios before equalization.
We can rewrite Equation (5.24) as follows:

p′ = 1 / [ 1 + √( (1/p − 1)(1/q − 1) ) ].    (5.25)

Equation (5.25) provides an explicit expression for computing p′ from p and q.
The transmit-receive relationship after leveling is depicted by line c in Fig-
ure 5.3. When 0 is transmitted, the mean value of the received signal, ȳ₀, is p′. When
1 is transmitted, the mean value of the received signal, ȳ₁, is (1 − q′) = (1 − p′). In both
cases the variance is p′(1 − p′).
Both the lines a and c pass through the point (0.5, 0.5), but their slopes are not
equal. The slope of line c is

β = [ (1 − p′) − 0.5 ] / (1 − 0.5) = (1 − 2p′).

Thus, the quality loss after adjusting the slope is given by

Q_a = k σ²/β² = k [ p′(1 − p′) / (1 − 2p′)² ].    (5.26)

Thus, ignoring the constant k, the S/N ratio to be optimized is given by

η = 10 log₁₀ [ (1 − 2p′)² / ( p′(1 − p′) ) ].    (5.27)
Observe that (1 − 2p′) is the difference of the averages of the received signal when 0
and 1 are transmitted. The quantity p′(1 − p′) is the variance of the received signal.
So η measures the ability of the communication system to discriminate between 0 and
1 at the receiving terminal.
The strategy to optimize a D-D system is to maximize η, and then use a control
factor which has no effect on η, but can alter the ratio p:q to equalize the two error
probabilities.
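A short sketch of Equations (5.25) and (5.27) in Python, using illustrative error rates p and q that are not from any real system:

import math

def leveled_error_rate(p, q):
    """Common error rate p' after leveling, Eq. (5.25):
    p' = 1 / (1 + sqrt((1/p - 1)(1/q - 1)))."""
    return 1.0 / (1.0 + math.sqrt((1.0 / p - 1.0) * (1.0 / q - 1.0)))

def sn_digital_digital(p, q):
    """D-D S/N ratio, Eq. (5.27): eta = 10 log10( (1 - 2p')^2 / (p'(1 - p')) )."""
    pp = leveled_error_rate(p, q)
    return pp, 10.0 * math.log10((1.0 - 2.0 * pp) ** 2 / (pp * (1.0 - pp)))

# Illustrative (made-up) error rates: 2% of 0s received as 1, 8% of 1s received as 0.
pp, eta = sn_digital_digital(p=0.02, q=0.08)
print("p' =", round(pp, 4), "eta =", round(eta, 1), "dB")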
Chemical separation processes can also be viewed as D-D type dynamic prob-
lems. For example, when making iron in a blast furnace, the goal is to separate the
iron molecules from the impurities. Molten iron stays at the bottom and the impurities
float as slag on top of the molten metal. Suppose 100p percent of the iron molecules go
into the slag and 100q percent of a particular impurity goes into the molten metal.
Minimization of p and q can be accomplished by maximizing the S/N ratio given by
Equation (5.27). In this problem, however, the losses due to iron molecules going into
the slag and impurities going into the iron are not equal. We may wish to make the
p:q ratio different from 1:1. The desired p:q ratio can be accomplished during adjust-
ment without altering η.
5.5 ANALYSIS OF ORDERED CATEGORICAL DATA
Data observed in many matrix experiments is ordered categorical because of the
inherent nature of the quality characteristic or because of the convenience of the meas-
urement technique. Let us consider the surface defect data from the polysilicon deposi-
tion case study of Chapter 4, which we will use to illustrate the analysis of ordered
categorical data. Suppose, because of the inconvenience of counting the number of
surface defects, the experimenters had decided to record the data in the following sub-
jective categories, listed in progressively undesirable order: practically no surface
defects, very few defects, some defects, many defects, and too many defects. For our
illustration, we will take the observations listed in Table 4.4 (a) and associate with
them categories I through V as follows:
I   : 0 — 3 defects
II  : 4 — 30 defects
III : 31 — 300 defects
IV  : 301 — 1000 defects
V   : 1001 and more defects
Thus, among the nine observations of experiment 2, five belong to category I, two to
category II, two to category III, and none to categories IV and V. The categorical
data for the 18 experiments are listed in Table 5.4.
We will now describe Taguchi’s accumulation analysis method [T7, T1], which
is an effective method for determining optimum control factor settings in the case of
ordered categorical data. (See Nair [N1] for an alternate method of analyzing ordered
categorical data.) The first step is to define cumulative categories as follows:
(I)   = I               : 0 — 3 defects
(II)  = I+II            : 0 — 30 defects
(III) = I+II+III        : 0 — 300 defects
(IV)  = I+II+III+IV     : 0 — 1000 defects
(V)   = I+II+III+IV+V   : 0 — ∞ defects
The number of observations in the cumulative categories for the eighteen experiments
are listed in Table 5.4. For example, the number of observations in the five cumulative
categories for experiment 2 are 5, 7, 9, 9, and 9, respectively.
The second step is to determine the effects of the factor levels on the probability
distribution by the defect categories. This is accomplished analogous to the determina-
tion of the factor effects described in Chapter 3. To determine the effect of temperature
at level A₁, we identify the six experiments conducted at that level and sum the
observations in each cumulative category as follows:

                    Cumulative Categories
                 (I)   (II)  (III)  (IV)  (V)
Experiment 1      9     9     9      9     9
Experiment 2      5     7     9      9     9
Experiment 3      1     1     7      9     9
Experiment 10     9     9     9      9     9
Experiment 11     8     9     9      9     9
Experiment 12     2     5     8      8     9
Total            34    40    51     53    54
The number of observations in the five cumulative categories for every factor
level are listed in Table 5.5. Note that the entry for the cumulative category (V) is
equal to the total number of observations for the particular factor level and that entry is
uniformly 54 in this case study. If we had used the 2-level column, namely column 1,
or if we had used the dummy level technique (described in Chapter 7), the entry in
category (V) would not be 54. The probabilities for the cumulative categories shown
in Table 5.5 are obtained by dividing the number of observations in each cumulative
category by the entry in the last cumulative category for that factor level, which is 54
for the present case.Sec. 5.5 Analysis of Ordered Categorical Data
TABLE 5.4 CATEGORIZED DATA FOR SURFACE DEFECTS

          Number of Observations        Number of Observations
          by Categories                 by Cumulative Categories
Expt.
No.        I   II  III  IV   V          (I)  (II) (III) (IV)  (V)
 1         9    0   0    0   0            9    9    9     9    9
 2         5    2   2    0   0            5    7    9     9    9
 3         1    0   6    2   0            1    1    7     9    9
 4         0    8   1    0   0            0    8    9     9    9
 5         0    1   0    4   4            0    1    1     5    9
 6         1    0   4    1   3            1    1    5     6    9
 7         0    1   1    4   3            0    1    2     6    9
 8         3    0   2    1   3            3    3    5     6    9
 9         0    0   0    4   5            0    0    0     4    9
10         9    0   0    0   0            9    9    9     9    9
11         8    1   0    0   0            8    9    9     9    9
12         2    3   3    0   1            2    5    8     8    9
13         4    2   2    1   0            4    6    8     9    9
14         2    3   4    0   0            2    5    9     9    9
15         0    1   1    1   6            0    1    2     3    9
16         3    4   2    0   0            3    7    9     9    9
17         2    1   0    2   4            2    3    3     5    9
18         0    0   0    2   7            0    0    0     2    9
Total     49   27  28   22  36           49   76  104   126  162
The third step in data analysis is to plot the cumulative probabilities. Two useful
plotting methods are the line plots shown in Figure 5.5 and the bar plots shown in Fig-
ure 5.6. From both figures, it is apparent that temperature (factor A) and pressure (fac-
tor B) have the largest impact on the cumulative distribution function for the surface
defects. The effects of the remaining four factors are small compared to temperature
and pressure. Among the factors C, D, E, and F, factor F has a somewhat larger effect.
In the line plots of Figure 5.5, for each control factor we look for a level for
which the curve is uniformly higher than the curves for the other levels of that factor.
TABLE 5.5 FACTOR EFFECTS FOR THE CATEGORIZED SURFACE DEFECT DATA

                                        Number of Observations by      Probabilities for the
                                        Cumulative Categories          Cumulative Categories
Factor                   Level          (I)  (II) (III) (IV)  (V)      (I)    (II)   (III)  (IV)   (V)
A. Temperature (°C)      A₁: T₀−25       34   40   51    53   54       0.63   0.74   0.94   0.98   1.00
                         A₂: T₀           7   22   34    41   54       0.13   0.41   0.63   0.76   1.00
                         A₃: T₀+25        8   14   19    32   54       0.15   0.26   0.35   0.59   1.00
B. Pressure (mtorr)      B₁: P₀−200      25   40   46    51   54       0.46   0.74   0.85   0.94   1.00
                         B₂: P₀          20   28   36    43   54       0.37   0.52   0.67   0.80   1.00
                         B₃: P₀+200       4    8   22    32   54       0.07   0.15   0.41   0.59   1.00
C. Nitrogen (sccm)       C₁              19   30   32    39   54       0.35   0.56   0.59   0.72   1.00
                         C₂              11   20   28    39   54       0.20   0.37   0.52   0.72   1.00
                         C₃              19   26   44    48   54       0.35   0.48   0.81   0.89   1.00
D. Silane (sccm)         D₁              20   25   34    41   54       0.37   0.46   0.63   0.76   1.00
                         D₂              13   31   42    44   54       0.24   0.57   0.78   0.81   1.00
                         D₃              16   20   28    41   54       0.30   0.37   0.52   0.75   1.00
E. Settling time (min)   E₁: t₀          21   27   38    43   54       0.39   0.50   0.70   0.80   1.00
                         E₂: t₀+8        16   29   36    42   54       0.30   0.54   0.67   0.78   1.00
                         E₃: t₀+16       12   20   30    41   54       0.22   0.37   0.56   0.76   1.00
F. Cleaning method       F₁: None        21   23   26    34   54       0.39   0.43   0.48   0.63   1.00
                         F₂: CM₂         21   30   40    46   54       0.39   0.56   0.74   0.85   1.00
                         F₃: CM₃          7   23   38    46   54       0.13   0.43   0.70   0.85   1.00
A uniformly higher curve implies that the particular factor level produces more obser-
vations with lower defect counts; hence, it is the best level. In Figure 5.6, we look for
a larger height of category I and a smaller height of category V. From the two figures, it
is clear that A₁, B₁, and F₂ are the best levels for the respective factors. The choice
of the best level is not as clear for the remaining three factors. However, the curves
for the factor levels C₂, D₃, and E₃ lie uniformly lower among the curves for all lev-
els of the respective factors, and these levels must be avoided. Thus, the optimum set-
tings suggested by the analysis are A₁B₁(C₁/C₃)(D₁/D₂)(E₁/E₂)F₂. By comparing
Figures 5.5, 5.6, and 4.5 it is apparent that the conclusions based on the ordered
categorical data are consistent with the conclusions based on the actual counts, except
for factors C, D, and E whose effects are rather small.
Figure 5.5 Line plots of the factor effects for the categorized surface defect data. [For each control factor—temperature, pressure, nitrogen, silane, settling time, and cleaning method—the cumulative probability is plotted against the cumulative categories (I) through (V), with one curve per factor level.]
Figure 5.6 Bar plots of the factor effects for the categorized surface defect data. [For each factor level, the bar shows the share of observations in categories I: 0–3, II: 4–30, III: 31–300, IV: 301–1000, and V: 1001 or more defects.]
The next step in the analysis is to predict the distribution of defect counts under
the starting and optimum conditions. This can be achieved analogously to the procedure
described in Chapter 3, except that we must use the omega transform, also known as
the logit transform, of the probabilities for the cumulative categories (see
Taguchi [T1]). The omega transform for probability p is given by the following equation:

ω(p) = 10 log₁₀ [ p / (1 − p) ].
Note that the omega transform is the same as the S/N ratio for the fraction defective
type of static problems.
Let us take the optimum settings recommended in Chapter 4, namely
A₁B₂C₁D₃E₂F₂, to illustrate the computation of the predicted distribution. Since,
according to Figure 5.5, factors C, D, and E have a small effect, we will only use the
effects of factors A, B, and F in the prediction formula.
The average probability for category (I) taken over the 18 experiments is
p̄ = 49/162 = 0.30 (see Table 5.4). Referring to Table 5.5, the predicted omega
value for category (I), denoted by ω_{A₁B₂C₁D₃E₂F₂}(I), can be computed as follows:
ω_{A₁B₂C₁D₃E₂F₂}(I) = ω(p̄) + [ω(p_{A₁}) − ω(p̄)] + [ω(p_{B₂}) − ω(p̄)] + [ω(p_{F₂}) − ω(p̄)]
  = ω(0.30) + [ω(0.63) − ω(0.30)] + [ω(0.37) − ω(0.30)] + [ω(0.39) − ω(0.30)]
  = −3.68 + [2.31 + 3.68] + [−2.31 + 3.68] + [−1.94 + 3.68]
  = 5.42 dB.
Then, by the inverse omega transform, the predicted probability for category (I) is 0.78.
Predicted probabilities for the cumulative categories (II), (III), and (IV) can be obtained
analogously. The prediction is obviously 1.0 for category (V). The predicted probabilities
for the cumulative categories for the starting and the optimum settings are listed in
Table 5.6. These probabilities are also plotted in Figure 5.7. It is clear that the recom-
mended optimum conditions give much higher probabilities for the low defect count
categories when compared to the starting conditions. The probability of 0–3 defects
(category (I)) is predicted to increase from 0.23 to 0.78 by changing from the starting to the
optimum conditions. Likewise, the probability for the 1001-and-more category reduces
from 0.37 to 0.01.
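The prediction for cumulative category (I) can be reproduced with the omega transform as follows. The probabilities 0.30, 0.63, 0.37, and 0.39 are the values from Tables 5.4 and 5.5 used in the calculation above; the overall average 49/162 is rounded to 0.30, as in the text.

import math

def omega(p):
    """Omega (logit) transform in decibels: omega(p) = 10 log10( p / (1 - p) )."""
    return 10.0 * math.log10(p / (1.0 - p))

def inverse_omega(db):
    """Inverse omega transform: the probability corresponding to a value in dB."""
    return 1.0 / (1.0 + 10.0 ** (-db / 10.0))

# Prediction of the cumulative category (I) probability for A1 B2 C1 D3 E2 F2,
# using only the significant factors A, B, and F.
p_avg = 0.30                     # overall average probability for category (I), 49/162 rounded
p_A1, p_B2, p_F2 = 0.63, 0.37, 0.39

omega_pred = (omega(p_avg)
              + (omega(p_A1) - omega(p_avg))
              + (omega(p_B2) - omega(p_avg))
              + (omega(p_F2) - omega(p_avg)))
print("predicted omega:", round(omega_pred, 2), "dB")                   # about 5.4 dB
print("predicted probability:", round(inverse_omega(omega_pred), 2))    # about 0.78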
The predicted probabilities for the cumulative categories should be compared
with those observed under the starting and optimum conditions to verify that the addi-
tive model is appropriate for the case study. This is the same as Step 8 of the
Robust Design cycle described in Chapter 4. Selection of appropriate orthogonal
arrays for a case study, as well as the confounding of 2-factor interactions with the
main effects, is discussed in Chapter 7. However, we note here that when accumula-
tion analysis is used, 3-level orthogonal arrays should be preferred over 2-level orthog-
onal arrays for minimizing the possibility of misleading conclusions about the factor
effects. Particularly, the orthogonal arrays L₁₈ and L₃₆ are most suitable. In any case,
the verification experiment is important for ensuring that the conclusions about the fac-
tor effects are valid.
5.6 SUMMARY
* The quadratic loss function is ideally suited for evaluating the quality level of a
product as it is shipped by a supplier to a customer. It typically has two com-
ponents: one related to the deviation of the product's function from the target, and
the other related to the sensitivity to noise factors.
* The S/N ratio, developed by Genichi Taguchi, is a predictor of quality loss after making
certain simple adjustments to the product's function. It isolates the sensitivity of
the product's function to noise factors. In Robust Design we use the S/N ratio as
the objective function to be maximized.
* Benefits of using the S/N ratio for optimizing a product or process design are:
— Optimization does not depend on the target mean function. Thus, the design
can be reused in other applications where the target is different.
— Design of subsystems and components can proceed in parallel. Specifications
for the mean function of the subsystems and components can be changed
later during design integration without adversely affecting the sensitivity to
noise factors.
— Additivity of the factor effects is good when an appropriate S/N ratio is used.
Otherwise, large interactions among the control factors may occur, resulting
in high cost of experimentation and potentially unreliable results.
TABLE 5.6 PREDICTED PROBABILITIES FOR THE CUMULATIVE CATEGORIES

Control Factor          ω Values (dB) for the               Probabilities for the
Settings                Cumulative Categories               Cumulative Categories
                        (I)     (II)    (III)   (IV)  (V)   (I)    (II)   (III)  (IV)   (V)
Optimum:
A₁B₂C₁D₃E₂F₂            5.42    6.98   14.53   19.45   ∞    0.78   0.83   0.97   0.99   1.00
Starting:
A₂B₂C₁D₃E₁F₁           −3.68   −1.41    0.04    2.34   ∞    0.23   0.42   0.50   0.63   1.00
Figure 5.7 Predicted probabilities for the cumulative categories, plotted for the starting and the optimum settings against the cumulative categories (I) through (V).
— Overall productivity of the development activity is improved.
* Robust Design problems can be divided into two broad classes: static problems,
where the target value for the quality characteristic is fixed, and dynamic problems,
where the quality characteristic is expected to follow the signal factor.
* Common types of static problems and the corresponding S/N ratios are summarized
in Table 5.7.
* Common types of dynamic problems and the corresponding S/N ratios are summar-
ized in Table 5.8.
* For the problems where an adjustment factor does not exist, the optimization is
done by simply maximizing the S/N ratio.
* For the problems where an adjustment factor exists, the problem can be generically
stated as: minimize sensitivity to noise factors while keeping the mean function on
target. By using the S/N ratio, these problems can be converted into unconstrained
optimization problems and solved by the following two-step procedure:
1. Maximize the S/N ratio, without worrying about the mean function.
2. Adjust the mean on target by using the adjustment factor.
* The optimization strategy consists of the following steps:
1. Evaluate the effects of the control factors under consideration on η and on the
mean function.
2. For factors that have a significant effect on η, select levels that maximize η.
3. Select any factor that has no effect on η but a significant effect on the mean
function as an adjustment factor. Use it to bring the mean function on target.
(In practice, we must sometimes settle for a factor that has a small effect on
η but a significant effect on the mean function as an adjustment factor.)
4. For factors that have no effect on η and the mean function, we can choose
any level that is convenient from other considerations such as cost or other
quality characteristics.
TABLE 5.7 S/N RATIOS FOR STATIC PROBLEMS

Problem Type          Range for the        Ideal       Adjust-      S/N Ratio and Comments
                      Observations         Value       ment
Smaller-the-better    0 to ∞               0           None         η = −10 log₁₀ [ (1/n) Σ y_i² ]
Nominal-the-best      0 to ∞               Nonzero,    Scaling      η = 10 log₁₀ ( μ²/σ² ); bring the mean on
                                           finite      factor       target with the scaling factor
Larger-the-better     0 to ∞               ∞           None         η = −10 log₁₀ [ (1/n) Σ (1/y_i²) ]
Signed-target         −∞ to ∞              0           Mean         η = −10 log₁₀ σ²
                                                       adjustment
Fraction defective    0 to 1               0           None         η = −10 log₁₀ [ p/(1−p) ]
Ordered categorical   Ordered categories   —           —            Accumulation analysis (Section 5.5)
Curve or vector       —                    —           —            Decompose into scalar problems of the
response                                                            above types (Chapter 6)
The proper quality characteristic for photolithography is the actual line-width
measurement, for example, 2.8 or 3.1 micrometers. When the line-width distribution is
known, it is an easy task to compute the yield. Note that the line width is directly
related to the amount of energy transferred during exposure and developing. The more
the energy transferred, the larger is the line width. Also, our experience with photo-
lithography has shown that the actual line width measurement is a monotonic charac-
teristic with respect to the control factors. Further, when the target dimensions are
changed, we can use the same experimental data to determine the new optimum set-
tings of the control factors. In this case, the measurement technique is not much of an
issue, although taking categorical data (small, desired, or large) is generally a little
easier than recording actual measurements. A case study where the actual line width
was used to optimize a photolithography process used in VLSI circuit fabrication is
given by Phadke, Kackar, Speeney and Grieco [P5].
In summary, for the photolithography example, yield is the worst quality charac-
teristic, ordered categorical data are better, and actual line width measurement (continu-
ous variable) is the best.
Spray Painting Process
This example vividly illustrates the importance of energy transfer in selecting a quality
characteristic. Sagging is a common defect in spray painting. It is caused by forma-
tion of large paint drops that flow downward due to gravity. Is the distance through
which the paint drops sag a good quality characteristic? No, because this distance is
primarily controlled by gravity, and it is not related to the basic energy transfer in
spray painting. However, the size of the drops created by spray painting is directly
related to energy transfer and, thus, is a better quality characteristic. By taking the size
of the drops as the quality characteristic, we can block out the effect of gravity, an
extraneous phenomenon for the spray painting process.
Yield of a Chemical Process
There are many chemical processes that begin with a chemical A, which after reaction,
becomes chemical B and, if the reaction is allowed to continue, turns into chemical C.
If B is the desired product of the chemical process, then considering the yield of B as a
quality characteristic is a poor choice. As in the case of photolithography, the yield is
not a monotonic characteristic. A better quality characteristic for this experiment is the
concentration of each of the three chemicals. The concentration of A and the concen-
tration of A plus B possess the needed monotonicity property.
6.3 EXAMPLES OF S/N RATIOS
Basic types of Robust Design problems and the associated S/N ratios were described in
Chapter 5. A majority of Robust Design projects fall into one of these basic types of
problems. This section gives three examples to illustrate the process of classification
of Robust Design problems. Two of these examples also show how a complex prob-
lem can be broken down into a composite of several basic types of problems.
Heat Exchanger Design
Heat exchangers are used to heat or cool fluids. For example, in a refrigerator a heat
exchanger coil is used inside the refrigerator compartment to transfer the heat from the
air in the compartment to the refrigerant fluid. This leads to lowering of the tempera-
ture inside the compartment. Outside the refrigerator, the heat from the refrigerant is
transferred to the room air through another heat exchanger.
In optimizing the designs of heat exchangers and other heat-transfer equipment,
defining the reference temperature is critical so that the optimization problem can be
correctly classified.
Consider the heat exchanger shown in Figure 6.1, which is used to cool the fluid
inside the inner tube. The inlet temperature of the fluid to be cooled is T₁. As the
fluid moves through the tube, it loses heat progressively to the fluid outside the tube;
its outlet temperature is T₂. The inlet and outlet temperatures for the coolant fluid are
T₃ and T₄, respectively. Let the target outlet temperature for the fluid being cooled be
T₀. Also, suppose the customer's requirement is that |T₂ − T₀| < 10 °C. What is
the correct quality characteristic and S/N ratio for this Robust Design problem?
Figure 6.1 Schematic diagram of a heat exchanger. The fluid to be cooled flows through the inner tube, entering at temperature T₁ and leaving at T₂; the coolant flows around it, entering at T₃ and leaving at T₄.
One choice is to take the target temperature T₀ as the reference temperature and
y = |T₂ − T₀| as the quality characteristic. Minimizing y is then the goal of this
experiment; thus, at first it would appear that y should be treated as a smaller-the-better
characteristic where the mean square value of y is minimized. The difficulty with this
formulation of the problem is that by taking the square of y the positive and negative
deviations in temperature are treated similarly. Consequently, interactions become
important. This can be understood as follows: If y is too large because T₂ is too large
compared to T₀, then y can be reduced by increasing the coil length. Note that a
longer coil length leads to more cooling of the fluid and, hence, a smaller T₂. On the
contrary, if y is too large because T₂ is too small, then y can be reduced by decreasing
the coil length. Thus, there are two opposite actions that can reduce y, but they cannot
be distinguished by observing y. Therefore, y is not a good quality characteristic, and
this problem should not be treated as smaller-the-better type.
Here, the proper reference temperature is T₃ because it represents the lowest tem-
perature that could be achieved by the fluid inside the tube. Thus, the correct quality
characteristic is y′ = T₂ − T₃. Note that y′ is always positive. Also, when the mean
of y′ is zero, its variance must also be zero. Hence, the problem should be classified
as a nominal-the-best type with the target value of y′ equal to T₀ − T₃. This formula-
tion does not have the complication of interaction we described with y as the quality
characteristic. Furthermore, if the target temperature T₀ were changed, the information
obtained using y′ as the quality characteristic would still be useful. All that is neces-
sary is to adjust the mean temperature on the new target. However, if y were used as
the quality characteristic, the design would have to be reoptimized when T₀ is changed,
which is undesirable. This loss of reusability is one of the reasons for lower R&D pro-
ductivity.
Paper Handling in Copying Machines
In a copying machine, a critical customer-observable response is the number of pages
copied before a paper-handling failure. In designing the paper-handling system we
might take λ, the number of pages copied before failure, as the quality characteristic.
However, in this case, the number of pages that would have to be copied during the
copier development would be excessively large. Also, decoupling the designs of the
various modules is not possible when λ is taken as the quality characteristic.
A close look at the paper-handling equipment reveals that there are two basic
functions in paper handling: paper feeding and paper transport. Paper feeding means
picking a sheet, either the original or a blank sheet. Paper transport means moving the
sheet from one station to another.
A schematic diagram of a paper feeder is shown in Figure 6.2(a). The two main
defects that arise in paper feeding are: no sheet fed or multiple sheets fed. A funda-
mental characteristic that controls paper feeding is the normal force needed to pick up a
sheet. Thus, we can measure the threshold force, F₁, to pick up just one sheet and the
threshold force, F₂, to pick up two sheets. Note that the normal force is a control fac-
tor and that F₁ and F₂ meet the guidelines listed in Section 6.1 and are better quality
characteristics compared to λ. By making F₁ as small as possible and F₂ as large as
possible, we can improve the operating window F₂ − F₁ [see Figure 6.2(b)], reduce
both types of paper feeding defects, and thus increase λ. The idea of enlarging the
operating window as a means of improving product reliability is due to Clausing [C2].
Figure 6.2 Design of a paper feeder. (a) Schematic diagram of a paper feeder, showing the normal force applied to the sheet against the guide. (b) Threshold force and operating window: the threshold force for feeding a single sheet, the threshold force for feeding two sheets, and the regions of feeding a single sheet and feeding two sheets.
Here, the appropriate S/N ratios for F₁ and F₂ are, respectively, the smaller-the-
better type and the larger-the-better type. Note that the two threshold forces comprise
a vector quality characteristic. We must measure and optimize both of them. This is
what we mean by completeness of a quality characteristic.
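As a small illustration, the sketch below (with made-up threshold-force readings in newtons) computes the two S/N ratios just mentioned: smaller-the-better for F₁ and larger-the-better for F₂.

import math

def sn_smaller_the_better(y):
    # eta = -10 log10( mean of y^2 ), Eq. (5.11)
    return -10.0 * math.log10(sum(v * v for v in y) / len(y))

def sn_larger_the_better(y):
    # eta = -10 log10( mean of 1/y^2 ), Eq. (5.14)
    return -10.0 * math.log10(sum(1.0 / (v * v) for v in y) / len(y))

# Made-up threshold forces (newtons) measured under several noise conditions:
F1 = [1.1, 1.3, 0.9, 1.2]   # force needed to feed a single sheet  -> smaller-the-better
F2 = [3.8, 4.5, 3.2, 4.1]   # force at which two sheets are fed    -> larger-the-better
print("F1:", round(sn_smaller_the_better(F1), 2), "dB")
print("F2:", round(sn_larger_the_better(F2), 2), "dB")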
In a copying machine, paper is moved through a 3-dimensional path using
several paper transport modules. Figure 6.3 shows a planar diagram for paper move-
ment through a single module. The fundamental characteristics in transporting paper
are the (x,y) movements of the center of the leading edge of the paper, the rotation
angle θ of the leading edge, and the time of arrival of the paper at the next module.
The lateral movement (y movement) of the paper can be taken care of by registration
against a guide. The remaining characteristics can then be addressed by placing two
sensors to measure the arrival times of the left and right ends of the leading edge of the
paper. Both of these arrival times have a common nonzero target mean and can be
classified as nominal-the-best type problems.
Figure 6.3 Planar diagram for paper transport, showing the paper position at the beginning and at the end of a transport module and the beginning and end positions of the leading edge.
Here, also, the two arrival times can be viewed as a vector quality characteristic.
Both times must be measured and optimized. If we omit one or the other, we cannot
guard against failure due to the paper getting twisted during transport. Also, by opti-
mizing for both the arrival times (that is, minimizing the variability of both the arrival
times and making their averages equal to each other), the design of each paper tran-
sport module can be decoupled from other modules. Optimizing each of the paper-
feeding and paper-transport characteristics, described above, automatically optimizes λ.
Thus, the problem of life improvement is broken down into several problems of
nominal-the-best, smaller-the-better, and larger-the-better types. It is quite obvious that
optimizing these separate problems automatically improves λ, the number of pages
copied before failure.
Electrical Filter Design
Electrical filters are used widely in many electronic products, including telecommunica-
tions and audio/video equipment. These circuits amplify (or attenuate) the components
of the input voltage signal at different frequencies according to the specified frequency
response function [see Figure 6.4(a)]. An equalizer, used in high-fidelity audio equip-
ment, is an example of a filter familiar to many people. It is used to amplify or attenu-
ate the sounds of different musical instruments in a symphony orchestra.
Figure 6.4 Design of an electrical filter: (a) block diagram of an electrical filter; (b) frequency response function, showing the desired frequency response function and the specified upper and lower limits on the gain.
Figure 6.4(b) shows an example of a desired frequency response function and the
customer-specified upper and lower limits for the gain. If the customer-specified gain
limits are violated at any frequency, the filter is considered defective. From the
preceding discussion in this chapter, it should be apparent that counting the percentage
of defective filters, though easiest to measure, is not a good quality characteristic.
This problem can be solved more efficiently by dividing the frequencies into
several bands, say five bands as shown in Figure 6.4(b). For each of the middle three
bands, we must achieve gain equal to the gain specified by the frequency response
function. Therefore, we treat these as three separate nominal-the-best type problems.
For each band, we must identify a separate adjustment factor that can be used to set the
mean gain at the right level. Note that a resistor, capacitor, or some other component
in the circuit can serve as an adjustment factor. For any one of these bands, the adjust-
ment factors for the other two bands should be included as noise factors, along with
other noise factors, such as component tolerances and temperature. Then, adjusting the
gain in one band would have a minimal effect on the mean gain in the other bands.
For each of the two end bands, we must make the gain as small as possible.
Accordingly, these two bands belong to the class of smaller-the-better type problems.
Thus, we have divided a problem where we had to achieve a desired curvilinear
response into several familiar problems.
6.4 SELECTION OF CONTROL FACTORS
Additivity of the effects of the control factors is also influenced by the selection of the
control factors and their levels. By definition, the control factors are factors whose
levels can be selected by the designer. Next, it is important that each control factor
influence a distinct aspect of the basic phenomenon affecting the quality characteristic.
If two or more control factors affect the same aspect of the basic phenomenon, then the
possibility of interaction among these factors becomes high. When such a situation is
recognized, we can reduce or even eliminate the interaction through proper transforma-
tion of the control factor levels. We refer to this transformation as sliding levels. The
following examples illustrate some of the important considerations in the selection of
control factors. A qualitative understanding of how control factors affect a product is
very important in their selection.
Treatment of Asthmatic Patients
This rather simplified example brings out an important consideration in the selection of
control factors. Consider three drugs (A, B, and C) proposed by three scientists for
treating wheezing in asthmatic patients. Suppose the drug test results indicate that if
no drug is given, the patient's condition is bad. If only drug A is given, the patients
get somewhat better; if only drug B is given, the patients feel well; and if only drug C
is given, the patients feel moderately good. Can the three drugs be considered as three
separate control factors? If so, then a natural expectation is that by giving all three
drugs simultaneously, we can make the patients very well.
Suppose we take a close look at these drugs and find out that all three drugs con-
tain theophylline as an active ingredient, which helps dilate the bronchial tubes. Drug A
has 70 percent of full dose, drug B has 100 percent of full dose, and drug C has 150
percent of full dose. Administering all three drugs simultaneously implies giving 320
percent of full dose of theophylline. This could significantly worsen the patient's condi-
tion. Therefore, the three drugs interact. The proper way to approach this problem is
to think of the theophylline concentration as a single control factor with four levels: 0
percent (no drug), 70 percent (drug A), 100 percent (drug B), and 150 percent (drug
C). Here the other ingredients of the three drugs should be examined as additional
potential control factors.
Photolithography Process
Aperture and exposure time are two of the important control factors in the photolithog-
raphy process used in VLSI fabrication (see Phadke, Kackar, Speeney, and Grieco
[P5]). The width of the lines printed by photolithography depends on the depth of field
and the total light energy falling on the photoresist. The aperture alone determines the
depth of field. However, both aperture and exposure time influence the total light
energy. In fact, the total light energy for fixed light intensity is proportional to the
product of the aperture and exposure time. Thus, if we chose aperture and exposure
time as control factors, we would expect to see strong interaction between these two
factors. The appropriate control factors for this situation are aperture and total light
energy.
Suppose 1.2N, N, and 0.8N are used as three levels for light energy, where N
stands for the nominal level or the middle level. We can achieve these levels of light
energy for various apertures through the sliding levels of exposure as indicated in
Table 6.3. The level N of total light energy can be achieved by setting exposure at
120 when aperture is 1, exposure at 90 when aperture is 2, and exposure at 50 when
aperture is 3.
TABLE 6.3  SLIDING LEVELS FOR EXPOSURE IN PHOTOLITHOGRAPHY PROCESS

                            Exposure (PEP-Setting)
                           0.8N       N       1.2N

          Aperture   1      96       120       144
                     2      72        90       108
                     3      40        50        60
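The sliding levels can also be generated programmatically. The sketch below assumes, consistent with Table 6.3, that each aperture's three exposure settings are simply 0.8, 1.0, and 1.2 times that aperture's nominal exposure; the nominal exposures 120, 90, and 50 are the values given in the text.

```python
# Sliding levels for exposure: the same relative light-energy levels
# (0.8N, N, 1.2N) are realized at each aperture by scaling that aperture's
# nominal exposure.  The proportional scaling is an assumption consistent
# with Table 6.3; the nominal exposures are from the text.
nominal_exposure = {1: 120, 2: 90, 3: 50}   # PEP setting giving energy N
scale = {"0.8N": 0.8, "N": 1.0, "1.2N": 1.2}

for aperture, nominal in nominal_exposure.items():
    levels = {name: round(nominal * s) for name, s in scale.items()}
    print(f"aperture {aperture}: {levels}")
# aperture 1: {'0.8N': 96, 'N': 120, '1.2N': 144}, and so on.
```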
The thickness of the photoresist layer is another fundamental characteristic that
influences the line width. The thickness can be changed by controlling the photoresist
viscosity and the spin speed. Here, too, sliding levels of spin speed should be con-
sidered to minimize interactions (see Phadke, Kackar, Speeney, and Grieco [P5]).
6.5 ROLE OF ORTHOGONAL ARRAYS
Matrix experiments using orthogonal arrays play a crucial role in achieving
additivity—they provide a test to see whether interactions are large compared to the
main effects.
Consider a matrix experiment where we assign only main effects to the columns
of an orthogonal array so that the interactions (2-factor, 3-factor, etc.) are confounded
with the main effects (see Chapter 7). There are two possibilities for the relative mag-
nitudes of the interactions:
1. If one or more of these interactions are large compared to the main effects, then
the main effects with which these interactions are confounded will be estimated
with large bias or error. Consequently, the observed response under the
predicted optimum conditions will not match the prediction based on the additive
model. Thus, in this case the verification experiment will point out that large
interactions are present.
2. On the contrary, if the interactions are small compared to the main effects, then
the observed response under the predicted optimum conditions will match the
prediction based on the additive model. Thus, in this case the verification exper-
iment will confirm that the main effects dominate the interactions.
Optimization studies where only one factor is studied at a time are not capable of
determining if interactions are or are not large compared to the main effects. Thus, it
is important to conduct multifactor experiments using orthogonal arrays. Dr. Taguchi
considers the ability to detect the presence of interactions to be the primary reason for
using orthogonal arrays to conduct matrix experiments.
Sections 6.2, 6.3, and 6.4 described the engineering considerations in selecting
the quality characteristics, S/N ratios, and control factors and their levels. Matrix
experiments using orthogonal arrays provide a test to see whether the above selections
can successfully achieve additivity. If additivity is indeed achieved, the matrix experi-
ment provides simultaneously the optimum values for the control factors. If additivity
is not achieved, the matrix experiment points it out so that one can re-examine the
selection of the quality characteristics, S/N ratios, and control factors and their levels.
6.6 SUMMARY
* Ability to predict the robustness (sensitivity to noise factors) of a product for any
combination of control factor settings is needed so that the best control factor levels
can be selected. The prediction must be valid, not only under the laboratory condi-
tions, but also under manufacturing and customer usage conditions.
It is important to have additivity of the effects of the control factors on the sensi-
tivity to noise factors (robustness) for the following reasons:
— Only main effects need to be estimated, which takes only a small number of
experiments. However, if the interactions among the control factors are
strong, experiments must be conducted under all combinations of control fac-
tor settings, which is clearly expensive, if not impractical.
— Conditions under which experiments are conducted can also be considered as
a control factor. The conditions consist of three types: laboratory, manufac-
turing, and customer usage. Presence of strong interactions among the con-
trol factors studied in a laboratory is an indication that the experimental con-
ditions are likely to interact with the control factors that have been studied.
This interaction, if present, can make the laboratory results invalid, which
leads to product failure during manufacturing and customer usage.
The additivity is influenced greatly by the choice of the quality characteristic, the
S/N ratio, and control factors and their levels.
The following guidelines should be used in selecting quality characteristics:
1. The quality characteristic should be directly related to the energy transfer
associated with the basic mechanism or the ideal function of the product.
2. As far as possible, choose continuous variables as quality characteristics.
3. The quality characteristics should be monotonic. Also, the related S/N ratio
should possess additivity.
4. Quality characteristics should be easy to measure.
5. Quality characteristics should be complete—that is, they should cover all
dimensions of the ideal function.
6. For products having feedback mechanisms, the open-loop, sensor, and com-
pensation modules should be optimized separately, and the modules should
then be integrated. Similarly, complex products should be divided into suit-
able modules for optimization purposes.
Although the final success of a product or a process may depend on the reliability
or the yield, such responses often do not make good quality characteristics. They
tend to cause strong interactions among the control factors as illustrated by the pho-
tolithography example.
Different types of variables can be used as quality characteristics: the output or the
response variable, and threshold values of suitable control factors or noise factors
for achieving a certain value of the output. When the output is discrete, such as
ON-OFF states, it becomes necessary to use the threshold values.
Additivity of the effects of the control factors is also influenced by the selection of
control factors and their levels. If two or more control factors affect the same
aspect of the basic phenomenon, then the possibility of interaction among these fac-
tors becomes high. When such a situation is recognized, the interaction can be
reduced or even eliminated through proper transformation of the control factor
levels (sliding levels). A qualitative understanding of how control factors affect a
product is important in their selection.
Selecting a good quality characteristic, S/N ratio, and control factors and their
levels is essential in improving the efficiency of development activities. The selec-
tion process is not always easy. However, when experiments are conducted using
orthogonal arrays, a verification experiment can be used to judge whether the
interactions are severe. When interactions are found to be severe, it is possible to
look for an improved quality characteristic, S/N ratio, and control factor levels, and,
thus, mitigate potential manufacturing problems and field failures.
A matrix experiment based on an orthogonal array followed by a verification experi-
ment is a powerful tool for detecting lack of additivity. Optimizing a product
design one factor at a time does not provide the needed test for additivity.

Chapter 7

CONSTRUCTING ORTHOGONAL ARRAYS
The benefits of using an orthogonal array to conduct matrix experiments as well as the
analysis of data from such experiments are discussed in Chapter 3. The role of orthog-
onal arrays in a Robust Design experiment cycle is delineated in Chapter 4 with the
help of the case study of improving the polysilicon deposition process. This chapter
describes techniques for constructing orthogonal arrays that suit a particular case study
at hand.
Construction of orthogonal arrays has been investigated by many researchers
including Kempthorne [K4], Plackett and Burman [P8], Addelman [A1], Raghavarao
[R1], Seiden [S3], and Taguchi [T1]. The process of fitting an orthogonal array to a
specific project has been made particularly easy by a graphical tool, called linear
graphs, developed by Taguchi to represent interactions between pairs of columns in an
orthogonal array. This chapter shows the use of linear graphs and a set of standard
orthogonal arrays for constructing orthogonal arrays to fit a specific project.
Before constructing an orthogonal array, the following requirements must be
defined:
1. Number of factors to be studied
2. Number of levels for each factor
3. Specific 2-factor interactions to be estimated
4. Special difficulties that would be encountered in running the experiments
This chapter describes how to construct an orthogonal array to meet these requirements
and consists of the following eleven sections:
Section 7.1 describes how to determine the minimum number of rows for the
matrix experiment by counting the degrees of freedom.
Section 7.2 lists a number of standard orthogonal arrays and a procedure for
selecting one in a specific case study. A novice to Robust Design may wish to
use a standard array that is closest to the needs of the case study, and if neces-
sary, slightly modify the case study to fit a standard array. The remaining sec-
tions in this chapter describe various techniques of modifying the standard
orthogonal arrays to construct an array to fit the case study.
Section 7.3 describes the dummy level method which is useful for assigning a
factor with number of levels less than the number of levels in a column of the
chosen orthogonal array.
Section 7.4 discusses the compound factor method which can be used to assign
two factors to a single column in the array.
Section 7.5 describes Taguchi's linear graphs and how to use them to assign
interactions to columns of the orthogonal array.
Section 7.6 presents a set of rules for modifying a linear graph to fit the needs of
a case study.
Section 7.7 describes the column merging method, which is useful for merging
columns in a standard orthogonal array to create columns with larger number of
levels.
Section 7.8 describes process branching and shows how to use the linear graphs
to construct an appropriate orthogonal array for case studies involving process
branching.
Section 7.9 presents three step-by-step strategies (beginner, intermediate, and
advanced) for constructing an orthogonal array.
Section 7.10 describes the differences between Robust Design and classical sta-
tistical experiment design.
Section 7.11 summarizes the important points of this chapter.
7.1 COUNTING DEGREES OF FREEDOM
The first step in constructing an orthogonal array to fit a specific case study is to count
the total degrees of freedom, which tells the minimum number of experiments that must
be performed to study all the chosen control factors. To begin with, one degree of
freedom is associated with the overall mean regardless of the number of control factors
to be studied. A 3-level control factor counts for two degrees of freedom because for a
3-level factor, A, we are interested in two comparisons. Taking any one level, A1, as
the base level, we want to know how the response changes when we change the level
to A2 or A3. In general, the number of degrees of freedom associated with a factor is
equal to one less than the number of levels for that factor.
The degrees of freedom associated with interaction between two factors, called A
and B, are given by the product of the degrees of freedom for each of the two factors.
This can be seen as follows. Let n_A and n_B be the number of levels for factors A and
B. Then, there are n_A n_B total combinations of the levels of these two factors. From
that we subtract one degree of freedom for the overall mean, (n_A − 1) for the degrees of
freedom of A, and (n_B − 1) for the degrees of freedom of B. Thus,

    Degrees of freedom for interaction A × B
        = n_A n_B − 1 − (n_A − 1) − (n_B − 1)
        = (n_A − 1)(n_B − 1)
        = (degrees of freedom for A) × (degrees of freedom for B).
Example 1:
Let us illustrate the computation of the degrees of freedom. Suppose a case study has
one 2-level factor (A), five 3-level factors (B, C, D, E, F), and we are interested in
estimating the interaction A × B. The degrees of freedom for this experiment are then
computed as follows:
    Factor/Interaction        Degrees of Freedom

    Overall mean              1
    A                         2 − 1 = 1
    B, C, D, E, F             5 × (3 − 1) = 10
    A × B                     (2 − 1) × (3 − 1) = 2

    Total                     14
So, we must conduct at least 14 experiments to be able to estimate the effect of
each factor and the desired interaction.
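For readers who prefer to automate the counting rule, the short sketch below (an illustration, not part of the original text) reproduces the computation of Example 1.

```python
def degrees_of_freedom(factor_levels, interactions=()):
    """Minimum number of experiments: 1 for the overall mean, (levels - 1)
    per factor, and the product of the factor dofs for each 2-factor
    interaction to be estimated."""
    dof = 1  # overall mean
    dof += sum(levels - 1 for levels in factor_levels.values())
    dof += sum((factor_levels[a] - 1) * (factor_levels[b] - 1)
               for a, b in interactions)
    return dof

# Example 1: one 2-level factor A, five 3-level factors B..F, interaction A x B.
levels = {"A": 2, "B": 3, "C": 3, "D": 3, "E": 3, "F": 3}
print(degrees_of_freedom(levels, interactions=[("A", "B")]))  # -> 14
```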
7.2 SELECTING A STANDARD ORTHOGONAL ARRAY
Taguchi [T1] has tabulated 18 basic orthogonal arrays that we call standard orthogonal
arrays (see Appendix C). Most of these arrays can also be found in somewhat dif-
ferent forms in one or more of the following references: Addelman [A1], Box, Hunter,
and Hunter [B3], Cochran and Cox [C3], John [J2], Kempthorne [K4], Plackett and
Burman [P8], Raghavarao [R1], Seiden [S3], and Diamond [D3]. In many case stu-
dies, one of the arrays from Appendix C can be used directly to plan a matrix experi-
ment. An array’s name indicates the number of rows and columns it has, and also the
number of levels in each of the columns. Thus, the array L4(2³) has four rows and
three 2-level columns. The array L18(2¹3⁷) has 18 rows, one 2-level column, and
seven 3-level columns. Thus, there are eight columns in the array L18(2¹3⁷). For
brevity, we generally refer to an array only by the number of rows. When there are
two arrays with the same number of rows, we refer to the second array by a prime.
Thus, the two arrays with 36 rows are referred to as L36 and L36′.
orthogonal arrays along with the number of columns at different levels for these arrays
are listed in Table 7.1.
TABLE 7.1 STANDARD ORTHOGONAL ARRAYS

                               Maximum      Maximum Number of Columns
    Orthogonal    Number of    Number of         at These Levels
    Array*        Rows         Factors        2      3      4      5

    L4             4             3            3      -      -      -
    L8             8             7            7      -      -      -
    L9             9             4            -      4      -      -
    L12           12            11           11      -      -      -
    L16           16            15           15      -      -      -
    L16′          16             5            -      -      5      -
    L18           18             8            1      7      -      -
    L25           25             6            -      -      -      6
    L27           27            13            -     13      -      -
    L32           32            31           31      -      -      -
    L32′          32            10            1      -      9      -
    L36           36            23           11     12      -      -
    L36′          36            16            3     13      -      -
    L50           50            12            1      -      -     11
    L54           54            26            1     25      -      -
    L64           64            63           63      -      -      -
    L64′          64            21            -      -     21      -
    L81           81            40            -     40      -      -

    * 2-level arrays: L4, L8, L12, L16, L32, L64.
      3-level arrays: L9, L27, L81.
      Mixed 2- and 3-level arrays: L18, L36, L36′, L54.
The number of rows of an orthogonal array represents the number of experiments. In
order for an array to be a viable choice, the number of rows must be at least equal to
the degrees of freedom required for the case study. The number of columns of an
array represents the maximum number of factors that can be studied using that array.
Further, in order to use a standard orthogonal array directly, we must be able to match
the number of levels of the factors with the numbers of levels of the columns in the
array. Usually, it is expensive to conduct experiments. Therefore, we use the smallest
possible orthogonal array that meets the requirements of the case study. However, in
some situations we allow a somewhat larger array so that the additivity of the factor
effects can be tested adequately, as discussed in Chapter 8 in conjunction with the dif-
ferential operational amplifier case study.
Let us consider some examples to illustrate the choice of standard orthogonal
arrays.
Example 2:
A case study has seven 2-level factors, and we are only interested in main effects.
Here, there are a total of eight degrees of freedom—one for overall mean and seven for
the seven 2-level factors. Thus, the smallest array that can be used must have eight or
more rows. The array L8 has seven 2-level columns and, hence, fits this case study
perfectly—each column of the array will have one factor assigned to it.
Example 3:
A case study has one 2-level factor and six 3-level factors. This case study has 14
degrees of freedom—one for overall mean, one for the 2-level factor and twelve for the
six 3-level factors. Looking at Table 7.1, we see that the smallest array with at least
14 rows is L16. But this array has fifteen 2-level columns. We cannot directly assign
these columns to the 3-level factors. The next larger array is L18, which has one 2-
level and seven 3-level columns. Here, we can assign the 2-level factor to the 2-level
column and the six 3-level factors to six of the seven 3-level columns, keeping one 3-
level column empty. Orthogonality of a matrix experiment is not lost by keeping one
or more columns of an array empty. So, L18 is a good choice for this experiment. In
a situation like this, we should take another look at the control factors to see if there is
an additional control factor to be studied, which we may have ignored as less impor-
tant. If one exists, it should be assigned to the empty column. Doing this allows us a
chance to gain information about this additional factor without spending any more
resources.
Example 4:
Suppose a case study has two 2-level and three 3-level factors. The degrees of free-
dom for this case study are nine. However, L9 cannot be used directly because it has
no 2-level columns. Similarly, the next larger array L12 cannot be used directly
because it has no 3-level columns. This line of thinking can be extended all the way
through the array L32′. The smallest array that has at least two 2-level columns and
three 3-level columns is L36. However, if we selected L36, we would be effectively
wasting 36 − 9 = 27 degrees of freedom, which would be very inefficient experimenta-
tion. This raises the question of whether these standard orthogonal arrays are flexible
enough to be modified to accommodate various situations. The answer is yes, and the
subsequent sections of this chapter describe the different techniques of modifying
orthogonal arrays.
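The selection rule of this section can be expressed as a small search over Table 7.1. The sketch below is illustrative only; it lists just a subset of the standard arrays and reproduces the conclusions of Examples 2, 3, and 4 for the case where a standard array is to be used directly.

```python
# A sketch of the selection rule: the array must have at least as many rows
# as the total degrees of freedom and enough columns of each level.  Only a
# few of the standard arrays from Table 7.1 are listed here for illustration.
STANDARD_ARRAYS = [
    # (name, rows, {level: number_of_columns})
    ("L4",  4,  {2: 3}),
    ("L8",  8,  {2: 7}),
    ("L9",  9,  {3: 4}),
    ("L12", 12, {2: 11}),
    ("L16", 16, {2: 15}),
    ("L18", 18, {2: 1, 3: 7}),
    ("L27", 27, {3: 13}),
    ("L36", 36, {2: 11, 3: 12}),
]

def smallest_standard_array(factor_levels):
    """Return the first (smallest) listed array that accommodates the factors
    directly, i.e., without dummy levels, compound factors, or merging."""
    dof = 1 + sum(n - 1 for n in factor_levels)
    needed = {lv: factor_levels.count(lv) for lv in set(factor_levels)}
    for name, rows, columns in STANDARD_ARRAYS:
        if rows >= dof and all(columns.get(lv, 0) >= k for lv, k in needed.items()):
            return name
    return None

print(smallest_standard_array([2] * 7))          # Example 2 -> L8
print(smallest_standard_array([2] + [3] * 6))    # Example 3 -> L18
print(smallest_standard_array([2, 2, 3, 3, 3]))  # Example 4 -> L36
```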
Difficulty in Changing the Levels of a Factor
The columns of the standard orthogonal arrays given in Appendix C are arranged in
increasing order of the number of changes; that is, the number of times the level of a
factor has to be changed in running the experiments in the numerical order is less for
the columns on the left when compared to the columns on the right. Consequently, we
should assign a factor whose levels are difficult to change to columns on the left and
vice versa.
7.3 DUMMY LEVEL TECHNIQUE
The dummy level technique allows us to assign a factor with m levels to a column that
has n levels, where n is greater than m. Suppose a factor A has two levels, A1 and A2.
We can assign it to a 3-level column by creating a dummy level A3, which could be
taken to be the same as A1 or A2.
Example 5:
Let us consider a case study that has one 2-level factor (A) and three 3-level factors (B,
C, and D) to illustrate the dummy level technique. Here we have eight degrees of free-
dom. Table 7.2 (a) shows the L9 array and Table 7.2 (b) shows the experiment layout
generated by assigning the factors A, B, C, and D to columns 1, 2, 3, and 4, respec-
tively, and by using the dummy level technique. Here we have taken A3 = A1 and
called it A1′ to emphasize that this is a dummy level.
Note that after we apply the dummy level technique, the resulting array is still
proportionally balanced and, hence, orthogonal (see Appendix A and Chapter 3).
Also, note that in Example 5, we could just as well have taken A3 = A2. But to ensure
orthogonality, we must consistently take A3 = A1 or A3 = A2 within the matrix experi-
ment. The choice between taking A3 = A1 and A3 = A2 depends on many issues. Some
of the key issues are as follows:

1. If we take A3 = A2, then the effect of A2 will be estimated with two times more
   precision than the effect of A1. Thus, the dummy level should be taken to be the
   one about which we want more precise information. Thus, if A1 is the starting
   condition about which we have a fair amount of experience and A2 is the new
   alternative, then we should choose A3 = A2.
TABLE 7.2 DUMMY LEVEL AND COMPOUND FACTOR TECHNIQUES

(a) L9 array

    Expt.           Column Number
    No.          1     2     3     4
     1           1     1     1     1
     2           1     2     2     2
     3           1     3     3     3
     4           2     1     2     3
     5           2     2     3     1
     6           2     3     1     2
     7           3     1     3     2
     8           3     2     1     3
     9           3     3     2     1
                 A     B     C     D     (Factor Assignment)

(b) Experiment layout for dummy level technique (Example 5)

    Expt.           Column Number
    No.          1     2     3     4
     1          A1    B1    C1    D1
     2          A1    B2    C2    D2
     3          A1    B3    C3    D3
     4          A2    B1    C2    D3
     5          A2    B2    C3    D1
     6          A2    B3    C1    D2
     7          A1′   B1    C3    D2
     8          A1′   B2    C1    D3
     9          A1′   B3    C2    D1
                 A     B     C     D     (Factor Assignment)

(c) Experiment layout for compound factor technique (Example 6)

    Expt.           Column Number
    No.          1       2     3     4
     1          A1E1    B1    C1    D1
     2          A1E1    B2    C2    D2
     3          A1E1    B3    C3    D3
     4          A1E2    B1    C2    D3
     5          A1E2    B2    C3    D1
     6          A1E2    B3    C1    D2
     7          A2E1    B1    C3    D2
     8          A2E1    B2    C1    D3
     9          A2E1    B3    C2    D1
                 AE      B     C     D    (Factor Assignment)
2. Availability of experimental resources and ease of experimentation also play a
   role here. Thus, if A1 and A2 are two different raw materials and A1 is very
   scarce, then we may choose A3 = A2 so that the matrix experiment can be
   finished in a reasonable time.
One can apply the dummy level technique to more than one factor in a given
case study. Suppose in Example 5 there were two 2-level factors (A and B) and two
3-level factors (C and D). We can assign the four factors to the columns of the orthog-
onal array L9 by taking dummy levels A3 = A1′ (or A3 = A2′) and B3 = B1′ (or B3 = B2′).
Note that the orthogonality is preserved even when the dummy level technique is
applied to two or more factors.
The dummy level technique can be further generalized, without losing ortho-
gonality, to assign an m-level factor to an n-level column where m is less than n. For
example, for studying the effect of clearance defects and other manufacturing parame-
ters on the reliability of printed wiring boards (described by Phadke, Swann, and Hill
[P6], and Mitchell [M1]), a 6-level factor (A) was assigned to a 9-level column by tak-
ing A7 = A1′, A8 = A2′, and A9 = A3′.
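The dummy level assignment of Example 5 can be generated mechanically from the L9 array. The sketch below is illustrative only; the array and the assignment A3 = A1′ are those of Table 7.2.

```python
# A sketch of the dummy level technique of Example 5: factor A (2 levels)
# is assigned to the 3-level column 1 of L9 by treating level 3 as A1'
# (a repeat of A1).  The L9 array below is the standard one of Table 7.2(a).
L9 = [
    [1, 1, 1, 1], [1, 2, 2, 2], [1, 3, 3, 3],
    [2, 1, 2, 3], [2, 2, 3, 1], [2, 3, 1, 2],
    [3, 1, 3, 2], [3, 2, 1, 3], [3, 3, 2, 1],
]

A_levels = {1: "A1", 2: "A2", 3: "A1'"}   # dummy level: A3 = A1
other = {name: col for col, name in enumerate("BCD", start=1)}

layout = []
for row in L9:
    setting = [A_levels[row[0]]]
    setting += [f"{name}{row[col]}" for name, col in other.items()]
    layout.append(setting)

for i, setting in enumerate(layout, start=1):
    print(i, setting)
# Levels A1/A1' appear in six rows and A2 in three, so the column remains
# proportionally balanced against every other column (orthogonality holds).
```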
7.4 COMPOUND FACTOR METHOD
The compound factor method allows us to study more factors with an orthogonal array
than the number of columns in the array. It can be used to assign two 2-level factors
to a 3-level column as follows. Let A and B be two 2-level factors. There are four
total combinations of the levels of these factors: A1B1, A2B1, A1B2, and A2B2. We
pick the three most important combinations and call them the three levels of the compound
factor AB. Suppose we choose the three levels as follows: (AB)1 = A1B1,
(AB)2 = A1B2, and (AB)3 = A2B1. Factor AB can be assigned to a 3-level column
and the effects of A and B can be studied along with the effects of the other factors in
the experiment.
For computing the effects of the factors A and B, we can proceed as follows:
the difference between the level means for (AB)1 and (AB)2 tells us the effect of
changing from B1 to B2. Similarly, the difference between the level means for (AB)1
and (AB)3 tells us the effect of changing from A1 to A2.
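A brief sketch of this computation follows; the level means are hypothetical numbers used only to show how the effects of A and B are recovered from the compound factor AB.

```python
# Recovering the effects of A and B from the level means of the compound
# factor AB, with (AB)1 = A1B1, (AB)2 = A1B2, (AB)3 = A2B1.
# The level means below are hypothetical.
m = {"A1B1": 42.0, "A1B2": 47.5, "A2B1": 39.0}   # mean response (or S/N) by level

effect_of_B = m["A1B2"] - m["A1B1"]   # change B1 -> B2 at fixed A1
effect_of_A = m["A2B1"] - m["A1B1"]   # change A1 -> A2 at fixed B1

print("B1 -> B2:", effect_of_B)   # +5.5
print("A1 -> A2:", effect_of_A)   # -3.0
```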
In the compound factor method, however, there is a partial loss of orthogonality.
The two compounded factors are not orthogonal to each other. But each of them is
orthogonal to every other factor in the experiment. This complicates the computation
of the sum of squares for the compounded factors in constructing the ANOVA table.
The following examples help illustrate the use of the compound factor method.
Example 6:
Let us go back to Example 4 in Section 7.2 where the case study has two 2-level fac-
tors (A and E) and three 3-level factors (B, C, and D). We can form a compound fac-
tor AE with three levels (AE)1 = A1E1, (AE)2 = A1E2, and (AE)3 = A2E1. This leads
us to four 3-level factors that can be assigned to the L9 orthogonal array. See Table
7.2(c) for the experiment layout obtained by assigning factors AE, B, C, and D to
columns 1, 2, 3, and 4, respectively.
Example 7:
The window photolithography case study described by Phadke, Kackar, Speeney and
Grieco [P5] had three 2-level factors (A, B, and D) and six 3-level factors (C, E, F, G,
H, and I). The total degrees of freedom for the case study are sixteen. The next larger
standard orthogonal array that has several 3-level columns is L18 (2¹ × 3⁷). The experi-
menters formed a compound factor BD with three levels (BD)1 = B1D1, (BD)2 = B2D1,
and (BD)3 = B1D2. This gave them one 2-level and seven 3-level factors that match
perfectly with the columns of the L18 array. Reference [P5] also describes the compu-
tation of ANOVA for the compound factor method.
As a matter of fact, the experimenters had started the case study with two 2-level
factors (A and B) and seven 3-level factors (C through I). However, observing that by
dropping one level of one of the 3-level factors, the L18 orthogonal array would be
suitable, they dropped the least important level of the least important factor, namely
factor D. Had they not made this modification to the requirements of the case study,
they would have needed to use the L27 orthogonal array, which would have amounted
to 50 percent more experiments! As illustrated by this example, the experimenter
should always consider the possibility of making small modifications in the require-
ments for saving the experimental effort.
7.5 LINEAR GRAPHS AND INTERACTION ASSIGNMENT
Sections 7.2 through 7.4 considered the situations where we are not interested in
estimating any interaction effects. Although in most Robust Design experiments we
choose not to estimate any interactions among the control factors, there are situations
where we wish to estimate a few selected interactions. The linear graph technique,
invented by Taguchi, makes it easy to plan orthogonal array experiments involving
interactions.
Confounding of Interactions with Factor Effects
Let us consider the orthogonal array L8 [Table 7.3 (a)] and suppose we assigned fac-
tors A, B, C, D, E, F, and G to the columns 1 through 7, respectively. Suppose we
believe that factors A and B are likely to have strong interaction. What effect would
the interaction have on the estimates of the effects of the seven factors obtained from
this matrix experiment?
The interaction effect is depicted in Figure 7.1. We can measure the magnitude
of interaction by the extent of nonparallelism of the effects shown in Figure 7.1. Thus,
    A × B interaction = (y_A2B2 − y_A1B2) − (y_A2B1 − y_A1B1)
                      = (y_A1B1 + y_A2B2) − (y_A1B2 + y_A2B1).

From Table 7.3 (a) we see that experiments under level C1 of factor C (experiments 1,
2, 7, and 8) have combinations A1B1 and A2B2 of factors A and B; and experiments
under level C2 of factor C (experiments 3, 4, 5, and 6) have combinations A1B2 and
A2B1 of factors A and B. Thus, we will not be able to distinguish the effect of factor
C from the A × B interaction. Inability to distinguish effects of factors and interac-
tions is called confounding. Here we say that factor C is confounded with the interaction
A × B. We can avoid the confounding by not assigning any factor to column 3 of the
array L8.
Figure 7.1 2-factor interaction. Interaction between factors A and B shows as nonparallelism
of the effects of factor A under levels B1 and B2 of factor B.
Interaction Table
The interaction table, shown in Table 7.3 (b), shows for every pair of columns of the
L8 array the column with which their interaction is confounded (or in which it is
contained). Thus, it can be used to determine which column of the L8 array should be kept empty (that is,
not be assigned to a factor) in order to estimate a particular interaction. From the
table, we see that the interaction of columns 1 and 2 is confounded with column 3, the
interaction of columns 3 and 5 is confounded with column 6, and so on. Note that the
interaction between columns a and b is the same as that between columns b and a.
That is, the interaction table is a symmetric matrix. Hence, only the upper triangle is
given in the table, and the lower triangle is kept blank. Also, the diagonal terms are
indicated in parentheses as there is no real meaning to interaction between columns a
and a.
The interaction table contains all the relevant information needed for assigning
factors to columns of the orthogonal array so that all main effects and desired interac-
tions can be estimated without confounding. The interaction tables for all standard
orthogonal arrays prepared by Taguchi [T1] are given in Appendix C, except for the
arrays where the interaction tables do not exist, and for the arrays L64, L64′, and L81,
because they are used rather infrequently. The interaction tables are generated directly
from the linear algebraic relations that were used in creating the orthogonal arrays
themselves.
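As an aside (not from the text), for 2-level arrays of the L8 family one common construction labels the columns 1 through 7 by their binary representations, in which case the interaction of two columns is confounded with the column whose label is the bitwise XOR of the two labels. The sketch below generates an interaction table under that assumption; it reproduces the entries quoted above (columns 1 and 2 give column 3, columns 3 and 5 give column 6).

```python
# A sketch (under an assumed GF(2) labeling, not stated in the text) of how
# an interaction table can be generated for a 2-level array of the L8 family:
# the interaction of columns a and b is then confounded with column a XOR b.
def interaction_column(a, b):
    return a ^ b

n_columns = 7   # L8
for a in range(1, n_columns + 1):
    row = []
    for b in range(1, n_columns + 1):
        row.append(f"({a})" if a == b else str(interaction_column(a, b)))
    print(row)
# For example, interaction_column(1, 2) == 3 and interaction_column(3, 5) == 6,
# matching the entries of Table 7.3(b) cited in the text.
```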
TABLE 7.3 L8 ORTHOGONAL ARRAY AND ITS INTERACTION TABLE

(a) L8(2⁷) orthogonal array

    Expt.            Column
    No.       1   2   3   4   5   6   7
     1        1   1   1   1   1   1   1
     2        1   1   1   2   2   2   2
     3        1   2   2   1   1   2   2
     4        1   2   2   2   2   1   1
     5        2   1   2   1   2   1   2
     6        2   1   2   2   1   2   1
     7        2   2   1   1   2   2   1
     8        2   2   1   2   1   1   2
              A   B   C   D   E   F   G     (Factor Assignment)

(b) Interaction table for L8

    Column    1     2     3     4     5     6     7
      1      (1)    3     2     5     4     7     6
      2            (2)    1     6     7     4     5
      3                  (3)    7     6     5     4
      4                        (4)    1     2     3
      5                              (5)    3     2
      6                                    (6)    1
      7                                          (7)

    Note: Entries in this table show the column with which the interaction
    between every pair of columns is confounded.
Linear Graphs
Using the interaction tables, however, is not very convenient. Linear graphs represent
the interaction information graphically and make it easy to assign factors and interac-
tions to the various columns of an orthogonal array. In a linear graph, the columns of
an orthogonal array are represented by dots and lines. When two dots are connected by
a line, it means that the interaction of the two columns represented by the dots is con-
tained in (or confounded with) the column represented by the line. In a linear graph,
each dot and each line has a distinct column number(s) associated with it. Further,
every column of the array is represented in its linear graph once and only once.
One standard linear graph for the array L8 is given in Figure 7.2 (a). It has four
dots (or nodes) corresponding to columns 1, 2, 4, and 7. Also, it has three lines (or
edges) representing columns 3, 6, and 5. These lines correspond to the interactions
between columns 1 and 2, between columns 2 and 4, and between columns 1 and 4,
respectively. From the interaction table, Table 7.3 (b), we can verify that columns 3,
6, and 5 indeed correspond to the interactions mentioned above.
In general, a linear graph does not show the interaction between every pair of
columns of the orthogonal array. It is not intended to do so; that information is con-
tained in the interaction table. Thus, the interactions between columns 1 and 3, between
columns 2 and 7, etc., are not shown in the linear graph of L8 in Figure 7.2 (a).
Figure 7.2 Two standard linear graphs of L8.
The other standard linear graph for L8 is given in Figure 7.2 (b). It, too, has
four dots corresponding to columns 1, 2, 4, and 7. Also, it has three lines representing
columns 3, 5, and 6. Here, these lines correspond to the interactions between columns
1 and 2, between columns 1 and 4, and between columns 1 and 7, respectively. Let us
see some examples of how these linear graphs can be used.
In general, an orthogonal array can have many linear graphs. Each linear graph,
however, must be consistent with the interaction table of the orthogonal array. The dif-
ferent linear graphs are useful for planning case studies having different requirements.
Taguchi [T1] has prepared many linear graphs, called standard linear graphs, for each
orthogonal array. Some of the important standard linear graphs are given in
Appendix C. Note that the linear graphs for the orthogonal arrays L64 and L81 are not
given in Appendix C because they are needed rather infrequently. However, they can
be found in Taguchi [T1]. Section 7.6 describes the rules for modifying linear graphs
to fit them to the needs of a given case study.
Example 8:
Suppose in a case study there are four 2-level factors A, B, C, and D. We want to
estimate their main effects and also the interactions A × B, B × C, and B × D. Here,
the total degrees of freedom are eight, so L8 is a candidate array. The linear graph in
Figure 7.2 (b) can be used directly here. The obvious column assignment is: factor B
should be assigned to column 1. Factors A, C, and D can be assigned in an arbitrary
order to columns 2, 4, and 7. Suppose we assign factors A, C, and D to columns 2, 4,
and 7, respectively. Then the interactions A × B, B × C, and B × D can be obtained
from columns 3, 5, and 6, respectively. These columns must be kept empty. Table 7.4
shows the corresponding experiment layout.
TABLE 7.4 ASSIGNMENT OF FACTORS AND INTERACTIONS: EXPERIMENT LAYOUT
USING ARRAY L8

    Expt.                    Column*
    No.       1     2     3     4     5     6     7
     1       B1    A1          C1                D1
     2       B1    A1          C2                D2
     3       B1    A2          C1                D2
     4       B1    A2          C2                D1
     5       B2    A1          C1                D2
     6       B2    A1          C2                D1
     7       B2    A2          C1                D1
     8       B2    A2          C2                D2
              B     A   A×B     C   B×C   B×D     D    (Factor Assignment)

    * Note that columns 3, 5, and 6 are left empty (no factors are assigned)
      so that interactions A × B, B × C, and B × D can be estimated.
Estimating an interaction means determining the nonparallelism of the factor
effects. To estimate an interaction, we prepare a 2-way table from the observed data. For
example, to estimate the A × B interaction in Example 8 we prepare the following table
whose rows correspond to the levels of factor A, columns correspond to the levels of fac-
tor B, and entries correspond to the average response for the particular combination of the
levels of factors A and B:
                                Level of Factor B

                               B1                B2

    Level of        A1    (y1 + y2)/2       (y5 + y6)/2
    Factor A
                    A2    (y3 + y4)/2       (y7 + y8)/2
In the above table y_i stands for the response for experiment i. Experiments 1 and
2 are conducted at levels A1 and B1 of factors A and B (see Table 7.4). Accordingly,
the entry in the A1B1 position is (y1 + y2)/2. The entries in the other positions of
the table are determined similarly. The data of the above 2-way table can be plotted to
display the A × B interaction. The interactions B × C and B × D can be estimated in the
same manner. In fact, this estimation procedure can be used regardless of the number
of levels of a factor.
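The 2-way table can be computed directly from the experiment layout. The sketch below uses the column assignment of Table 7.4 (B in column 1, A in column 2 of L8); the responses y1 through y8 are hypothetical.

```python
# Building the 2-way table for the A x B interaction from the layout of
# Table 7.4.  The responses are hypothetical.
from collections import defaultdict

L8_cols_1_2 = [(1, 1), (1, 1), (1, 2), (1, 2), (2, 1), (2, 1), (2, 2), (2, 2)]
y = [12.1, 11.8, 14.0, 13.6, 12.9, 13.1, 16.2, 15.8]   # hypothetical data

cells = defaultdict(list)
for (b_level, a_level), yi in zip(L8_cols_1_2, y):
    cells[(a_level, b_level)].append(yi)

two_way = {key: sum(vals) / len(vals) for key, vals in cells.items()}
for (a_level, b_level), mean in sorted(two_way.items()):
    print(f"A{a_level} B{b_level}: {mean:.2f}")
# A1B1 is the average of y1 and y2, as in the text; nonparallelism of the
# A1 and A2 rows across B1 and B2 measures the interaction.
```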
Example 9:
Suppose there are five 2-level factors A, B, C, D, and E. We want to estimate their
main effects and also the interactions A × B and B × C. Here, also, the needed degrees
of freedom are eight, making L8 a candidate array. However, neither of the two standard
linear graphs of L8 can be used directly. Section 7.6 shows how the linear graphs can
be modified so that a wide variety of experiment designs can be constructed con-
veniently.
Linear Graphs for 3-level Factors
So far in this section we have discussed the interaction between two 2-level factors.
The concept can be extended to situations involving factors with a higher number of lev-
els. Figure 7.3 (a) shows an example of no interaction between two 3-level factors,
whereas Figures 7.3 (b) and (c) show examples where interaction exists between two
3-level factors.
Figure 7.3 Examples of interaction: (a) no interaction; (b) synergistic interaction; (c) antisynergistic interaction.
Linear graphs and interaction tables for the arrays L9, L27, etc., which have
3-level columns, are slightly more complicated than those for arrays with 2-level
columns. Each column of a 3-level factor has two degrees of freedom associated with
it. The interaction between two 3-level columns has four degrees of freedom. Hence,
to estimate the interaction between two 3-level factors, we must keep two 3-level
columns empty, in contrast to only one column needed to be kept empty for 2-level
orthogonal arrays. This fact is reflected in the interaction tables and linear graphs
shown in Appendix C.
As discussed repeatedly in earlier chapters, we generally do not study interac-
tions in Robust Design. Then why study linear graphs? The answer is because linear
graphs are useful in modifying orthogonal arrays to fit specific case studies. The fol-
lowing three sections describe the rules for modifying linear graphs and their use in
modifying orthogonal arrays.
7.6 MODIFICATION OF LINEAR GRAPHS
The previous section showed how linear graphs can be used to assign main effects and
interactions to the columns of standard orthogonal arrays. However, the principal util-
ity of linear graphs is for creating a variety of different orthogonal arrays from the
standard ones to fit real problems. The linear graphs are useful for creating 4-level
columns in 2-level orthogonal arrays, 9-level columns in 3-level orthogonal arrays and
6-level columns in mixed 2- and 3-level orthogonal arrays. They are also useful for
constructing orthogonal arrays for process branching. Sections 7.7 and 7.8 describe
these techniques. Common to all these applications of linear graphs is the need to
modify a standard linear graph of an orthogonal array so that it matches the linear
graph required by a particular problem.
A linear graph for an orthogonal array must be consistent with the interaction
table associated with that array; that is, every line in a linear graph must represent the
interaction between the two columns represented by the dots it connects. In the fol-
lowing discussion we assume that for 2-level orthogonal arrays, the interaction between
columns a and b is contained in column c. Also, the interaction between columns f
and g is contained in column c. If it is a 3-level orthogonal array, we assume that the
interaction between columns a and b is contained in columns c and d. Also, the
interaction between columns f and g is contained in columns c and d. The following
three rules can be used for modifying a linear graph to suit the needs of a specific case
study.
1. Breaking a line. In the case of a 2-level orthogonal array, a line connecting two
dots, a and b, can be removed and replaced by a dot. The column associated
with this dot is the same as the column associated with the line it was created from.
In the case of linear graphs for 3-level orthogonal arrays, a line has two columns
associated with it and it maps into two dots. Figures 7.4 (a) and (b) show this
rule diagrammatically.

Figure 7.4 Modification of linear graphs. The figure illustrates the three modification rules (breaking a line, forming a line, and moving a line) for both 2-level and 3-level orthogonal arrays.
2. Forming a line. A line can be added in the linear graph of a 2-level orthogonal
array to connect two dots, a and b, provided we remove the dot c associated with
the interaction between a and b. In the case of the linear graphs for a 3-level
orthogonal array, two dots c and d, which contain the interaction of a and b,
must be removed. The particular dot or dots to be removed can be determined
from the interaction table for the orthogonal array. Figures 7.4 (c) and (d) show
this rule diagrammatically.
3. Moving a line. This rule is really a combination of the preceding two rules. A
line connecting two dots a and b can be removed and replaced by a line joining
another set of two dots, say f and g, provided the interactions a × b and f × g
are contained in the same column or columns. This rule is diagrammatically
shown in Figures 7.4 (e) and (f).
The following examples illustrate the modification of linear graphs.
Example 10:
Consider Example 9 in Section 7.5. The standard linear graph of L8, Figure 7.5 (a),
can be changed into the linear graph shown in Figure 7.5 (b) by breaking the line joining
dots 1 and 7 (that is, line 6). This modified linear graph matches the problem perfectly. The fac-
tors A, B, C, D, and E should be assigned, respectively, to columns 2, 1, 4, 6, and 7.
The A × B and B × C interactions can be estimated by keeping columns 3 and 5 empty.
Figure 7.5 Standard and modified linear graph of L8: (a) a standard linear graph of L8; (b) modified linear graph obtained by breaking line 6.
Example 11:
The purpose of this example is to illustrate rule 3, namely moving a line. Figure
7.6 (a) shows one of the standard linear graphs of the orthogonal array L16. It can be
changed into Figure 7.6 (b) by breaking the line connecting columns 6 and 11, and
adding an isolated dot for column 13. This can be further turned into Figure 7.6 (c) by
adding a line to connect columns 7 and 10, and simultaneously removing the isolated
dot 13.
Figure 7.6 An example of linear graph modification: (a) a standard linear graph of L16; (b) modified linear graph obtained by breaking line 13 in (a); (c) modified linear graph obtained by forming a line between dots 7 and 10 in (b); the interaction of columns 7 and 10 is column 13.
7.7 COLUMN MERGING METHOD
The column merging method can be used to create a 4-level column in a standard
orthogonal array with all 2-level columns, a 9-level column in a standard orthogonal
array with all 3-level columns, and a 6-level column in a standard orthogonal array
with some 2-level and some 3-level columns.
To create a 4-level column in a standard orthogonal array with 2-level columns,
we merge any two columns and their interaction column. For example, in the L8 array
the interaction of columns 1 and 2 lies in column 3. Thus, these three columns can be
merged to form a 4-level column. Note that the three columns that are merged have
one degree of freedom each, thus together they have the three degrees of freedom
needed for the 4-level column.
Suppose columns a, b, and c (the column containing the interaction of a and b) are
designated to form a 4-level column. The steps in forming the 4-level column are:

1. Create a new column called abc as follows:

   For the combination (1,1) in columns a and b write 1 in column abc
   For the combination (1,2) in columns a and b write 2 in column abc
   For the combination (2,1) in columns a and b write 3 in column abc
   For the combination (2,2) in columns a and b write 4 in column abc
2. Remove columns a, b, and c from the array. These columns cannot be used to
   study any other factors or interactions.

The creation of a 4-level column using columns 1, 2, and 3 of L8 is shown in
Table 7.5. It can be checked that the resulting array is still balanced and, hence,
orthogonal. It can be used to study one 4-level factor and up to four 2-level factors.
TABLE 7.5 COLUMN MERGING METHOD: CREATION OF A 4-LEVEL COLUMN IN L8

(a) L8 array: We designate a = column 1, b = column 2; the a × b interaction is in column 3.

    Expt.             Column
    No.       1   2   3   4   5   6   7
     1        1   1   1   1   1   1   1
     2        1   1   1   2   2   2   2
     3        1   2   2   1   1   2   2
     4        1   2   2   2   2   1   1
     5        2   1   2   1   2   1   2
     6        2   1   2   2   1   2   1
     7        2   2   1   1   2   2   1
     8        2   2   1   2   1   1   2
              a   b   c = a×b

(b) Modified L8 array: Columns 1, 2, and 3 are merged to form a 4-level column.

    Expt.           Column
    No.      (1,2,3)    4   5   6   7
     1          1       1   1   1   1
     2          1       2   2   2   2
     3          2       1   1   2   2
     4          2       2   2   1   1
     5          3       1   2   1   2
     6          3       2   1   2   1
     7          4       1   2   2   1
     8          4       2   1   1   2
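The merging step is easy to automate. The sketch below reproduces Table 7.5(b) from the L8 array; the dictionary mapping the (column 1, column 2) combinations to levels 1 through 4 follows step 1 above.

```python
# Column merging as in Table 7.5: columns 1, 2, and 3 of L8 (column 3 carries
# the 1 x 2 interaction) are replaced by one 4-level column whose level is
# determined by the (column 1, column 2) combination.
L8 = [
    [1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2], [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2], [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1], [2, 2, 1, 2, 1, 1, 2],
]

four_level = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 4}

merged = []
for row in L8:
    new_col = four_level[(row[0], row[1])]
    merged.append([new_col] + row[3:])   # drop columns 1, 2, and 3

for i, row in enumerate(merged, start=1):
    print(i, row)
# The result matches Table 7.5(b): one 4-level column plus columns 4-7 of L8.
```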
In the linear graph of a 2-level orthogonal array we represent a 4-level factor by
two dots and a line connecting them.
The column merging procedure above generalizes to orthogonal arrays with
columns other than only 2-level columns. Thus, to form a 9-level column in a stan-
dard orthogonal array with 3-level columns, we follow the same procedure as above
except we must merge four columns: two columns from which we form the 9-level
column and the two columns containing their interactions.
7.8 BRANCHING DESIGN
A process for applying covercoat on printed wiring boards consists of (1) spreading the
covercoat material (a viscous liquid) on a board, and (2) baking the board to form a
hard covercoat layer. Suppose, to optimize this process, we wish to study two types of
material (factor A), two methods of spreading (factor B) and two methods of baking
(factor C). The two methods of baking are a conventional oven (C;) and an infrared
oven (C2). For the conventional oven there are two additional control factors, bake
temperature (factor D, two levels) and bake time (factor E, two levels), whereas for
the infrared oven, there are two different control factors: infrared light intensity (factor
F, two levels) and conveyor belt speed (factor G, two levels).
The factors for the covercoat process are diagrammed in Figure 7.7. Factor C is
called a branching factor because, depending on its level, we have different control
factors for further processing steps. Branching design is a method of constructing
orthogonal arrays to suit such case studies.
Linear graphs are extremely useful in constructing orthogonal arrays when there
is process branching. The linear graph required for the covercoat process is given in
Figure 7.8 (a). We need a dot for the branching factor C, and two dots connected with
lines to that dot. These two dots correspond to the factors D and E for the conven-
tional oven branch, and F and G for the infrared oven branch. The columns associated
with the two interaction lines connected to the branching dot must be kept empty. In
the linear graph we also show two isolated dots corresponding to factors A and B.
The standard linear graph for L8 in Figure 7.8(b) can be modified easily to
match the linear graph in Figure 7.8(a). We break the bottom line to form two isolated
dots corresponding to columns 6 and 7. Thus, by matching the modified linear graph
with the required linear graph, we obtain the column assignment for the control factors
as follows:
    Factor       Column

    A            6
    B            7
    C            1
    D, F         2 (3)
    E, G         4 (5)
Figure 7.7 Process branching in covercoat process. A: covercoat material; B: method of spreading; C: method of baking (conventional oven C1 or infrared oven C2). For the conventional oven the additional control factors are D (bake temperature) and E (bake time); for the infrared oven they are F (light intensity) and G (conveyor belt speed).
Columns 3 and 5, shown in parentheses, must be kept empty. The factors D and
F are assigned to the same column, namely column 2. Whether a particular experiment
is conducted by using factor D or F depends on the level of factor C, which is deter-
mined by column 1. Thus, the levels of factors D and F are determined jointly by the
columns 1 and 2 as follows:
    For the combination (1,1) in columns 1 and 2 write D1 in column 2
    For the combination (1,2) in columns 1 and 2 write D2 in column 2
    For the combination (2,1) in columns 1 and 2 write F1 in column 2
    For the combination (2,2) in columns 1 and 2 write F2 in column 2
Factors D and F can have quite different effects; that is, m_D2 − m_D1 need not be equal
to m_F2 − m_F1. This difference shows up as interaction between columns 1 and 2,
which is contained in column 3. Hence, column 3 must be kept empty. The factors E
and G are assigned to column 4 in a similar way, and column 5 is kept empty.
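The branching layout of Table 7.6 can likewise be generated from L8. The sketch below is illustrative only; it encodes the column assignment derived above (C in column 1, D or F in column 2, E or G in column 4, A and B in columns 6 and 7).

```python
# Generating the covercoat layout of Table 7.6 from L8: the factor carried by
# columns 2 and 4 depends on the level of the branching factor C in column 1.
L8 = [
    [1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2], [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2], [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1], [2, 2, 1, 2, 1, 1, 2],
]

layout = []
for row in L8:
    c = row[0]
    branch_1 = "D" if c == 1 else "F"     # factor in column 2 depends on C
    branch_2 = "E" if c == 1 else "G"     # factor in column 4 depends on C
    layout.append([f"C{c}", f"{branch_1}{row[1]}", f"{branch_2}{row[3]}",
                   f"A{row[5]}", f"B{row[6]}"])   # columns 3 and 5 stay empty

for i, setting in enumerate(layout, start=1):
    print(i, setting)
# Experiments 1-4 use the conventional oven (C1) and vary D, E;
# experiments 5-8 use the infrared oven (C2) and vary F, G.
```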
Figure 7.8 Covercoat process column assignment through linear graph: (a) required linear graph; (b) a standard linear graph of L8; (c) modified standard linear graph and assignment of factors to columns.
The experiment layout for the covercoat process is given in Table 7.6. Note that
experiments 1 through 4 are conducted using the conventional oven, while experiments
5 through 8 are conducted using the infrared oven.
It is possible that after branching, the process can reunite in subsequent steps.
Thus, in the printed wiring board application, after the covercoat is applied, we may go
through common printing and etching steps that all have a common set of control fac-
tors. Branching can also occur in product design; for example, we may select different
mechanisms to achieve a part of the function. Here, associated with each mechanism,
there would be different control factors.
TABLE 7.6 EXPERIMENT LAYOUT FOR THE COVERCOAT PROCESS

    Expt.                       Column
    No.       1      2       3      4       5      6     7
     1       C1     D1             E1              A1    B1
     2       C1     D1             E2              A2    B2
     3       C1     D2             E1              A2    B2
     4       C1     D2             E2              A1    B1
     5       C2     F1             G1              A1    B2
     6       C2     F1             G2              A2    B1
     7       C2     F2             G1              A2    B1
     8       C2     F2             G2              A1    B2
              C    D,F   Empty    E,G   Empty      A     B    (Factor Assignment)
7.9 STRATEGY FOR CONSTRUCTING AN ORTHOGONAL ARRAY
Up to this point, this chapter discussed many techniques for constructing orthogonal
arrays needed by the matrix experiments. This section focuses on showing how to
orchestrate the techniques for constructing an orthogonal array to suit a particular case
study. The skill needed to apply these techniques varies widely. Accordingly, we
describe three strategies—beginner, intermediate, and advanced—requiring progres-
sively higher levels of skill with the techniques described earlier in this chapter. A
vast majority of case studies can be taken care of by the beginner and intermediate
strategies, whereas a small fraction of the case studies requires the advanced strategy.
The router bit life improvement case study in Chapter 11 is one such case study.
Beginner Strategy
A beginner should stick to the direct use of one of the standard orthogonal arrays.
Table 7.7 is helpful in selecting a standard orthogonal array to fit a given case study.
Because it gets difficult to keep track of data from a larger number of experiments, the
beginner is advised not to exceed 18 experiments, which makes the possible choices of
orthogonal arrays L4, L8, L9, L12, L16, L16′, and L18.
TABLE 7.7 BEGINNER STRATEGY FOR SELECTING AN ORTHOGONAL ARRAY

(a) All 2-level factors

    No. of 2-level     Recommended
    Factors            Orthogonal Array
    2-3                L4
    4-7                L8
    8-11               L12
    12-15              L16

(b) All 3-level factors

    No. of 3-level     Recommended
    Factors            Orthogonal Array
    2-4                L9
    5-7                L18*

    * When L18 is used, one 2-level factor can be used in addition
      to seven 3-level factors.
A beginner should consider either all 2-level factors or all 3-level factors (prefer-
ably 3-level factors) and not attempt to estimate any interactions. This may require
him or her to modify slightly the case-study requirements. The rules given in Table
7.7 can then be used to select the orthogonal array.
The assignment of factors to the columns is straightforward in the cases dis-
cussed above. Any column can be assigned to any factor, except for factors that are
difficult to change, which should be assigned to the columns toward the left.
Among all the arrays discussed above, the array L18 is the most commonly used
array because it can be used to study up to seven 3-level factors and one 2-level factor,
which is the situation with many case studies.
Intermediate Strategy
Experimenters with modest experience in using matrix experiments should use the
dummy level, compound factor, and column merging techniques in conjunction with
the standard orthogonal arrays to broaden the possible combinations of the factor lev-
els. The factors should have preferably two or three levels and the estimation of
interactions should be avoided. Also, as far as possible, arrays larger than L36 should
be avoided. Table 7.8 can be used to select an appropriate standard orthogonal array
depending on the number of 2- and 3-level factors in the case study. The following
rules can then be used to modify the chosen standard orthogonal array to fit the case
study:
1. To create a 3-level column in the array L8 or L16, merge three columns in the
array (two columns and the column containing their interaction) to form a 4-level
column. Then use the dummy level technique to convert the 4-level column into
a 3-level column.Sec. 7.9 Strategy for Constructing an Orthogonal Array 173
2. To create two 3-level columns in the array L jg, merge two distinct sets of three
columns in the array (two columns and the column containing their interaction)
to form two 4-level columns. Then use the dummy level technique to convert
the 4-level columns into 3-level columns.
3. When the array Lg is suggested by the Table 7.8 and the total number of factors
ig less than or equal to four, use the dummy level technique to assign a 2-level
factor to a 3-level column.
4. When the array Lg is suggested by the Table 7.8 and the total number of factors
exceeds four, use the compound factor technique to create a 3-level factor from
two 2-level factors until the total number of factors becomes 4.
5. When the array Lyg is suggested by the Table 7.8 and the number of 2-level
columns exceeds one, use the dummy level and compound factor techniques in
the manner similar to rules 3 and 4 above.
TABLE 7.8 INTERMEDIATE STRATEGY FOR SELECTING AN ORTHOGONAL ARRAY*
Number of 3-level factors
Number of
2-level factors 0 1 2 3 4 5 6 7
0 Ly by Lp Lis La Lew
1 ty ky Lie Lig Lie Lae
2 Ly Ly Ly ky Lig Lig Lig
3 te Lay Ly we ig ig Lew
4 Ly Ly kg hg hg Lew
s Ly be big hig Lig Les
6 Ly Lie hig ig Law
7 a
8 Ln bw Lis Law
9 in Le bi Li
10 Ln Lie
avy be Lig
2 Ly bie
13 Lie
14 Ly
1s Lis
* Combinations of 2- and 3-level factors not covered by the intermediate strategy
are indicated by a blank.
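
A minimal Python sketch of rule 1 is shown below (it is not part of the original text). It uses the standard L8 tabulation, in which column 3 carries the interaction of columns 1 and 2, merges those three columns into a 4-level column, and then applies the dummy level technique to obtain a 3-level column.

# Standard L8 orthogonal array (8 runs x 7 two-level columns); in this
# tabulation column 3 carries the interaction of columns 1 and 2.
L8 = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 2, 2, 2, 2],
    [1, 2, 2, 1, 1, 2, 2],
    [1, 2, 2, 2, 2, 1, 1],
    [2, 1, 2, 1, 2, 1, 2],
    [2, 1, 2, 2, 1, 2, 1],
    [2, 2, 1, 1, 2, 2, 1],
    [2, 2, 1, 2, 1, 1, 2],
]

def merge_to_4_level(array, col_a, col_b):
    """Merge two columns (and implicitly the column carrying their interaction)
    into a single 4-level column: (1,1)->1, (1,2)->2, (2,1)->3, (2,2)->4."""
    mapping = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 4}
    return [mapping[(row[col_a - 1], row[col_b - 1])] for row in array]

def dummy_level(column, from_level, to_level):
    """Dummy level technique: reuse an existing level in place of a level
    that is not needed, e.g., convert a 4-level column into a 3-level column."""
    return [to_level if x == from_level else x for x in column]

four_level = merge_to_4_level(L8, 1, 2)      # columns 1, 2 (and 3) -> one 4-level column
three_level = dummy_level(four_level, 4, 1)  # level 4 replaced by level 1 (written 1')
print(four_level)    # [1, 1, 2, 2, 3, 3, 4, 4]
print(three_level)   # [1, 1, 2, 2, 3, 3, 1, 1]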
Advanced Strategy
An experimenter who has a fair amount of experience in conducting matrix experiments
and wishes to have wider freedom in the number of factors and their
levels, or who wants to estimate interactions, must use linear graphs and the rules for their
modification. The advanced strategy consists of the following steps:
1. Use the beginner or intermediate strategy to obtain a simple solution. If that is
not possible, proceed with the following steps.
2. Count the degrees of freedom to determine the minimum size of the orthogonal
array (a small counting sketch follows these steps).
3. Select an appropriate standard orthogonal array from among those listed in Table
7.1. If most of the factors are 2- or 4-level factors, then a 2-level array should
be selected. If most of the factors are 3-level factors, then a 3-level array should
be selected.
4. Construct the linear graph required for the case study. The linear graph should
contain the interactions to be estimated and also the appropriate patterns for
column merging and process branching.
5. Select a standard linear graph for the chosen array that is closest to the required
linear graph.
6. Modify the standard linear graph to match the required linear graph by using the
rules in Section 7.6. The column assignment is obvious when the two linear
graphs match. If we do not succeed in matching the linear graphs, we must
repeat the procedure above with a different standard linear graph for the chosen
orthogonal array, choose a larger standard orthogonal array, or
modify the requirements for the case study.
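
The degree-of-freedom count in step 2 can be written down mechanically: one degree of freedom for the overall mean, (number of levels - 1) for each factor, and the product of the factors' (levels - 1) values for each interaction to be estimated. The following Python sketch, with hypothetical inputs, is an illustration and not part of the original text.

def degrees_of_freedom(factor_levels, interactions=()):
    """Minimum number of experiments needed (Section 7.1).

    factor_levels: dict mapping factor name -> number of levels.
    interactions:  iterable of (factor, factor) pairs whose interaction
                   is to be estimated.
    """
    dof = 1  # one degree of freedom for the overall mean
    dof += sum(levels - 1 for levels in factor_levels.values())
    dof += sum((factor_levels[a] - 1) * (factor_levels[b] - 1)
               for a, b in interactions)
    return dof

# Example: the computer tuning study of Chapter 10 has seven 3-level factors
# and one 2-level factor, and no interactions are estimated.
factors = {name: 3 for name in "ABCDEGH"}   # seven 3-level factors
factors["F"] = 2                            # one 2-level factor
print(degrees_of_freedom(factors))          # -> 16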
The advanced strategy needs some skill in using the linear graph modification
rules. The router bit life improvement case study of Chapter 11 illustrates the use of
the advanced strategy. Artificial intelligence programs can be used to carry out the
modifications efficiently as described by Lee, Phadke, and Keny [L1].
7.10 COMPARISON WITH THE CLASSICAL STATISTICAL
EXPERIMENT DESIGN
As mentioned in Chapter 1, both classical statistical experiment design and Robust
Design use the basic principles of planning experiments and data analysis developed by
R. A. Fisher in the 1920s. Thus, there are many common ideas in the two methods.
The differences in the methods come about because the two methods were developed
by people who were concerned with different problems. This section describes the
differences primarily for the benefit of readers familiar with classical statistical experiment
design, which is described in many books, such as Box, Hunter, and Hunter [B3],
John [J2], Cochran and Cox [C3], Daniel [D1], and Hicks [H2]. It is hoped that this
section will help such readers understand and apply the Robust Design Method. This
section may be skipped without affecting the readability of the rest of the book.
Any method which was developed over several decades is likely to have varia-
tions in the way it is applied. Here, the term classical statistical experiment design
refers to the way the method is practiced by the majority of its users. Exceptions to
the majority practice are not discussed here. The term Robust Design, of course,
means the way it is described in this book.
The comparison is made in three areas: problem formulation, experiment layout,
and data analysis. The differences in the areas of experiment layout and data analysis
are primarily a result of the fact that the two methods address different problems.
Differences in Problem Formulation
Emphasis on Variance Reduction
The primary problem addressed in classical statistical experiment design is to model
the response of a product or process as a function of many factors called model factors.
Factors, called nuisance factors, which are not included in the model, can also
influence the response. Various techniques are employed to minimize the effects of the
nuisance factors on the estimates of model parameters. These techniques include hold-
ing the nuisance factors at constant values during the experiments when possible, as
well as techniques called blocking and randomization. The effects of the nuisance
factors not held constant show up as variance of the response. Classical statistical
experiment design theory is aimed at deriving a mathematical equation relating the mean
response to the levels of the model factors. As a general rule, it assumes that the vari-
ance of the response remains constant for all levels of the model factors. Thus, it
ignores the problem of reducing variability which is critical for quality improvement.
The primary problem addressed in Robust Design is how to reduce the variance
of a product’s function in the customer's environment. Recall that the variance is
caused by the noise factors and the fundamental idea of Robust Design is to find levels
of the control factors which minimize sensitivity of the product’s function to the noise
factors. Consequently, Robust Design is focused on determining the effects of the con-
trol factors on the robustness of the product's function. Instead of assuming that the
variance of the response remains constant, it capitalizes on the change in variance and
looks for opportunities to reduce the variance by changing the levels of the control fac-
tors.
In Robust Design, accurate modeling of the mean response is not as important as
finding the control factor levels that optimize robustness. This is so because, after the
variance has been reduced, the mean response can be easily adjusted with the help of
only one control factor. Finding a suitable control factor, known as the adjustment factor,
which can be used for adjusting the mean response is one of the concerns of the
Robust Design method.
Selection of Response/Quality Characteristic
Classical statistical experiment design considers the selection of the response to be out-
side the scope of its activities, but Robust Design requires a thorough analysis of the
engineering scenario in selecting the quality characteristic and the S/N ratio. Guide-
lines for the selection of the quality characteristic and S/N ratio are given in Chapters 5
and 6.
Frequently, the final goal of a project is to maximize the yield or the percent of
products meeting specifications. Accordingly, in classical statistical experiment design
yield is often used as a response to be modeled in terms of the model factors. As dis-
cussed in Chapters 5 and 6, use of such response variables could lead to unnecessary
interactions and it may not lead to a robust product design.
Systematic Sampling of Noise
The two methods also differ in the treatment of noise during problem formulation.
Since the classical statistical experiment design method is not concerned with minimizing
sensitivity to noise factors, the evaluation of the sensitivity is not considered in the
method. Instead, noise factors are considered nuisance factors. They are either kept at
constant values during the experiments, or techniques called blocking and randomiza-
tion are used to block them from having an effect on the estimation of the mathemati-
cal model describing the relationship between the response and the model factors.
On the contrary, minimizing sensitivity to noise factors (factors whose levels
cannot be controlled during manufacturing or product usage, which are difficult to con-
trol, or expensive to control) is a key idea in Robust Design. Therefore, noise factors
are systematically sampled for a consistent evaluation of the variance of the quality
characteristic and the S/N ratio. Thus, in the polysilicon deposition case study of
Chapter 4, the test wafers were placed in specific positions along the length of the reac-
tor and the quality characteristics were measured at specific points on these wafers.
This ensures that the effect of noise factors is equitable in all experiments. When there
exist many noise factors whose levels can be set in the laboratory, an orthogonal array
is used to select a systematic sample, as discussed in Chapter 8, in conjunction with
the design of a differential operational amplifier. Use of an orthogonal array for sam-
pling noise is a novel idea introduced by Robust Design and it is absent in classical
statistical experiment design.
Transferability of Product Design
Another important consideration in Robust Design is that a design found optimum dur-
ing laboratory experiments should also be optimum under manufacturing and customer
environments. Further, since products are frequently divided into subsystems for
design purposes, it is critically important that the robustness of a subsystem be
minimally affected by changes in the other subsystems. Therefore, in Robust Design
interactions among control factors, especially the antisynergistic interactions, are con-
sidered highly undesirable. Every effort is made during problem formulation to select
the quality characteristic, S/N ratio, and control factor levels to minimize the interac-
tions. If antisynergistic interactions are discovered during data analysis or verification
experiments, the experimenter is advised to go back to Step 1 of the Robust Design
cycle and re-examine the choice of quality characteristics, S/N ratios, control factors,
noise factors, etc.
On the other hand, classical statistical experiment design has not been concerned
with transferability of product design. Therefore, presence of interactions among the
model factors is not viewed as highly undesirable, and information gained from even
antisynergistic interactions is utilized in finding the factor levels that predict the best
average response.
Differences in Experiment Layout
Testing for Additivity
Additivity means absence of all interactions—2-factor, 3-factor, etc. Achieving addi-
tivity, though considered desirable, is usually not emphasized in classical statistical
experiment design; and orthogonal arrays are not used to test for additivity. Interac-
tions are allowed to be present; they are appropriately included in the model; and
experiments are planned so that they can be estimated.
Achieving additivity is very critical in Robust Design, because presence of large
interactions is viewed as an indication that the optimum conditions obtained through a
matrix experiment may prove to be non-optimum when levels of other control factors
(other than those included in the matrix experiment at hand) are changed in subsequent
Robust Design experiments. Additivity is considered to be a property that a given
quality characteristic and S/N ratio possess or do not possess. A matrix experiment based
on an orthogonal array, followed by a verification experiment, is used as a tool to test
whether the chosen quality characteristic and S/N ratio possess the additivity property.
Efficiency Resulting from Ignoring Interactions among Control Factors
Let us define some terms commonly used in classical statistical experiment design.
Resolution V designs are matrix experiments where all 2-factor interactions can be
estimated along with the main effects. Resolution IV designs are matrix experiments
where no 2-factor interaction is confounded with the main effects, and no two main
effects are confounded with each other. Resolution III designs (also called saturated
designs) are matrix experiments where no two main effects are confounded with each
other. In a Resolution III design, 2-factor interactions are confounded with main
effects. If we allow a factor to be assigned to every column of an orthogonal array,
the design becomes a Resolution III design. It is possible to construct a Resolution IV design
from an orthogonal array by allowing only specific columns to be used for assigning
factors.
It is obvious from the above definitions that for a given number of factors, Reso-
lution III design would need the smallest number of experiments, Resolution IV would
need somewhat more experiments and Resolution V would need the largest number of
experiments. Although heavy emphasis is placed in classical statistical experiment
design on the ability to estimate 2-factor interactions, Resolution V designs are used only
very selectively because of the associated large experimental cost. Resolution IV
designs are very popular in classical statistical experiment design. Robust Design
almost exclusively uses Resolution III designs, except in some situations where estimation
of a few specific 2-factor interactions is allowed.
The relative economics of Resolution III and Resolution IV designs can be
understood as follows. By using the interaction tables in Appendix C one can see that
Resolution IV designs can be realized in 2-level standard orthogonal arrays by assign-
ing factors to selected columns as shown in Table 7.9.
TABLE 7.9 RESOLUTION III AND IV DESIGNS

                    Resolution III Design              Resolution IV Design
Orthogonal    Maximum No.    Columns to          Maximum No.    Columns to
Array         of Factors     be Used             of Factors     be Used
L4            3              1-3                 2              1, 2
L8            7              1-7                 4              1, 2, 4, 7
L16           15             1-15                8              1, 2, 4, 7, 8, 11, 13, 14
L32           31             1-31                16             1, 2, 4, 7, 8, 11, 13, 14,
                                                                16, 19, 21, 22, 25, 26, 28, 31
From the above table it is apparent that for a given orthogonal array roughly twice as
many factors can be studied with a Resolution III design as with a Resolution IV
design.
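
The column sets of Table 7.9 can be checked mechanically. The Python sketch below (not part of the original text) uses a property of the standard column numbering of the 2-level arrays, which is how the interaction tables for these arrays are usually arranged: the interaction of columns a and b falls in the column whose number is the bitwise XOR of a and b. A set of columns therefore gives (at least) a Resolution IV design when no such XOR value is itself one of the assigned columns.

from itertools import combinations

def is_resolution_iv(columns):
    """In the standard 2-level arrays (L4, L8, L16, L32) the interaction of
    columns a and b lies in column a XOR b.  A design is (at least)
    Resolution IV if no such interaction column is itself used for a factor."""
    cols = set(columns)
    return all((a ^ b) not in cols for a, b in combinations(cols, 2))

# Resolution IV column sets from Table 7.9
print(is_resolution_iv([1, 2]))                          # L4  -> True
print(is_resolution_iv([1, 2, 4, 7]))                    # L8  -> True
print(is_resolution_iv([1, 2, 4, 7, 8, 11, 13, 14]))     # L16 -> True

# Using every column of L8 (a saturated, Resolution III design) fails the test:
print(is_resolution_iv(range(1, 8)))                     # -> False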
Screening Experiments
Classical statistical experiment design frequently uses the following strategy for build-
ing a mathematical model for the response:
1. Screening. Use Resolution III designs to conduct experiments with a large
number of model factors for determining whether each of these factors should be
included in the mathematical model.
2. Modeling. Use Resolution IV (and occasionally, Resolution V) designs to con-
duct experiments with the factors found important during screening to build the
mathematical model.
Robust Design considers screening to be an unnecessary step. Therefore it does
not have separate screening and modeling experiments. At the end of every matrix
experiment the factor effects are estimated and their optimum levels identified. Robust
Design advocates the use of Resolution III designs for all matrix experiments with the
exception that sometimes estimation of a few specific 2-factor interactions is allowed.
Flexibility in Experimental Conditions
Because of the heavy emphasis on the ability to estimate interactions and the complex-
ity of the interactions between 3-level factors, classical statistical experiment design is
frequently restricted to the use of 2-level fractional factorial designs. Consequently, the
number of possible types of experiment conditions is limited. For example, it is not
possible to compare three or four different types of materials with a single 2-level frac-
tional factorial experiment. Also, the curvature effect of a factor (see Figure 4.4) can-
not be determined with only two levels. However, as discussed earlier in this chapter,
the standard orthogonal arrays and the linear graphs used in Robust Design provide
excellent flexibility and simplicity in planning multifactor experiments.
Central Composite Designs
Central composite designs are commonly used in classical experiment design, especially
in conjunction with the response surface methodology (see Myers [M2]) for estimating the
curvature effects of the factors. Although some research is needed to compare the central
composite designs with 3-level orthogonal arrays used in Robust Design, the following
main differences between them are obvious: the central composite design is useful for
only continuous factors, whereas the orthogonal arrays can be used with continuous as
well as discrete factors. As discussed in Chapter 3, the predicted response under any
combination of the control factor levels has the same variance when an orthogonal array is
used. However, this is not true with central composite designs.
Randomization
Running the experiments in a random order is emphasized in classical statistical experi-
ment design to minimize the effect of the nuisance factors on the estimated model fac-
tor effects. Running the experiments in an order that minimizes changes in the levels
of factors that are difficult to change is considered more important in Robust Design.
Randomization is advised to the extent it can be convenient for the experimenter.
In Robust Design, we typically assign control factors to all or most of the
columns of an orthogonal array. Consequently, running the experiments in a random
order does not scramble the order for all factors effectively. That is, even after arrang-
ing the experiments in a random order, it looks as though the experiments are in a
nearly systematic order for one or more of the factors.
Nuisance factors are analogous to noise factors in Robust Design terminology.
Since robustness against the noise factors is the primary goal of Robust Design, we
introduce the noise factors in a systematic sampling manner to permit equitable evalua-
tion of sensitivity to them.
Before we describe the differences in data analysis, we note that many of the lay-
out techniques described in this book can be used beneficially for modeling the mean
response also.
Differences in Data Analysis
Two Step Optimization
As mentioned earlier, the differences in data analysis arise from the fact that the two
methods were developed to address different problems. One of the common problems
in Robust Design is to find control factor settings that minimize variance while attain-
ing the mean on target. In solving this problem, provision must be made to ensure that
the solution can be adapted easily in case the target is changed. This is a difficult,
multidimensional, constrained optimization problem. The Robust Design method
solves it in two steps. First, we maximize the S/N ratio and, then, use a control factor
that has no effect on the S/N ratio to adjust the mean function on target. This is an
unconstrained optimization problem, much simpler than the original constrained optimi-
zation problem. Robust Design addresses many engineering design optimization prob-
lems as described in Chapter 5.
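
As an illustration only, the following Python sketch carries out the two steps on made-up factor-effect data; all numbers and factor names are invented, and the additive prediction follows the form used in Chapter 3. Step 1 maximizes the S/N ratio factor by factor; step 2 uses a factor with a nearly flat S/N effect (the adjustment factor) to bring the predicted mean to the target.

eta_effects = {              # hypothetical average S/N ratio (dB) by level
    "A": [23.5, 23.0, 23.2],     # nearly flat: a candidate adjustment factor
    "B": [24.7, 23.6, 21.4],
    "C": [21.3, 23.4, 25.0],
}
mean_effects = {             # hypothetical average response by level
    "A": [4.2, 5.0, 6.1],
    "B": [5.1, 5.0, 4.9],
    "C": [4.8, 5.0, 5.3],
}
overall_mean = 5.0
target = 6.0

# Step 1: maximize the S/N ratio for each factor other than the adjustment factor A.
best = {f: max(range(3), key=lambda i: eta_effects[f][i]) for f in ("B", "C")}

# Step 2: choose the level of A that brings the predicted mean closest to the target,
# using the additive (main-effect) prediction model.
def predict(a):
    return overall_mean + (mean_effects["A"][a] - overall_mean) + sum(
        mean_effects[f][best[f]] - overall_mean for f in ("B", "C"))

best["A"] = min(range(3), key=lambda a: abs(predict(a) - target))
print({f: lvl + 1 for f, lvl in best.items()})   # chosen levels, 1-indexed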
Classical statistical experiment design has been traditionally concerned only with
modeling the mean response. Some of the recent attempts to solve the engineering
design optimization problems in the classical statistical experiment design literature are
discussed in Box [BI], Leon, Shoemaker, and Kackar [L2], and Nair and Pregibon
[N2].
Significance Tests
In classical statistical experiment design, significance tests, such as the F test, play an
important role. They are used to determine if a particular factor should be included in
the model. In Robust Design, F ratios are calculated to determine the relative impor-
tance of the various control factors in relation to the error variance. Statistical
significance tests are not used because a level must be chosen for every control factor
regardless of whether that factor is significant or not. Thus, for each factor the best
level is chosen depending upon the associated cost and benefit.Sec. 7.11 Summary 181
7.11 SUMMARY
* The process of fitting an orthogonal array to a specific project has been made
particularly easy by the standard orthogonal arrays and the graphical tool, called
linear graphs, developed by Taguchi to represent interactions between pairs of
columns in an orthogonal array. Before constructing an orthogonal array, one
must define the requirements which consist of:
1. Number of factors to be studied
2. Number of levels for each factor
3. Specific 2-factor interactions to be estimated
4. Special difficulties that would be faced in running the experiments
* The first step in constructing an orthogonal array to fit a specific case study is to
count the total degrees of freedom, which tells the minimum number of experiments
that must be performed to study the main effects of all control factors and the
chosen interactions.
* Genichi Taguchi has tabulated 18 standard orthogonal arrays. In many problems,
one of these arrays can be directly used to plan a matrix experiment. The arrays
are presented in Appendix C.
* Orthogonality of a matrix experiment is not lost by keeping one or more
columns of the array empty.
* The columns of the standard orthogonal arrays are arranged in the increasing
order of number of changes; that is, the number of times the level of a factor
must be changed in running the experiments in the numerical order is smaller for
the columns on the left than those on the right. Consequently, factors whose lev-
els are difficult to change should be assigned to columns on the left.
* Although in most Robust Design experiments one chooses not to estimate any
interactions among the control factors, there are situations where it is desirable to
estimate a few selected interactions. The linear graph technique makes it easy to
plan orthogonal array experiments that involve interactions.
* Linear graphs represent interaction information graphically and make it easy to
assign factors and interactions to the various columns of an orthogonal array. In
a linear graph, the columns of an orthogonal array are represented by dots and
lines. When two dots are connected by a line, it means that the interaction of the
two columns represented by the dots is contained in (or confounded with) the
column(s) represented by the line. In a linear graph, each dot and each line has a
distinct column number(s) associated with it. Furthermore, every column of the
array is represented in its linear graph once and only once.
* The principal utility of linear graphs is for creating a variety of different orthogo-
nal arrays from the standard orthogonal arrays to fit real problems.
* Techniques described in this chapter for modifying orthogonal arrays are sum-
marized in Table 7.10.
* Depending on the needs of the case study and experience with matrix experi-
ments, the experimenter should use the beginner, intermediate, or advanced stra-
tegy to plan experiments. The beginner strategy (see Table 7.7) involves the use
of a standard orthogonal array. The intermediate strategy (see Table 7.8)
involves minor but simple modifications of the standard orthogonal arrays using
the dummy level, compound factor, and column merging techniques. A vast
majority of case studies can be handled by the beginner or the intermediate stra-
tegies. The advanced strategy requires the use of the linear graph modification
rules and is needed relatively infrequently. In complicated case studies, the
advanced strategy can greatly simplify the task of constructing orthogonal arrays.
* Although Robust Design draws on many ideas from classical statistical experi-
ment design, the two methods differ because they address different problems.
Classical statistical experiment design is used for modeling the mean response,
whereas Robust Design is used to minimize the variability of a product's func-
tion.
TABLE 7.10 TECHNIQUES FOR MODIFYING ORTHOGONAL ARRAYS

                                                                 Needed Linear
Technique      Application                                       Graph Pattern
Dummy level    Assign an m-level factor to an n-level            NA
               column (m < n)

... or R3, can be used to
adjust the value of R_T at which the bridge balances. We decide to use R3, and, thus, it
is our signal factor for deciding the temperature setting. The resistance R_T by itself is
the signal factor for the ON-OFF operations.

The purpose of the Zener diode (nominal voltage = E_z) in the circuit is to regulate
the voltage across the terminals a and b (see Figure 9.2). That is, when the Zener
diode is used, the voltage across the terminals a and b remains constant even if the
power supply voltage, E_0, drifts or fluctuates. Thus it reduces the dependence of the
threshold values R_T-ON and R_T-OFF on the power supply voltage E_0.

As a general rule, the nominal values of the various circuit parameters are potential
control factors, except for those completely defined by the tracking rules. In the
temperature control circuit, the control factors are R1, R2, R4, and E_z. Note that we do
not take E_0 as a control factor. As a rule, the design engineer has to make the decision
about which parameters should be considered control factors and which should
not. The main function of E_0 is to provide power for the operation of the relay, and
its nominal value is not as important for the ON-OFF operation. Hence, we do not
include E_0 as a control factor. The tolerances on R1, R2, R4, E_z, and E_0 are the noise
factors.

For proper operation of the circuit, we must have E_z < E_0. Also, R4 must be
much bigger than R1 or R2. These are the tracking relationships among the circuit
parameters.
9.3 QUALITY CHARACTERISTICS AND S/N RATIOS
Selection of Quality Characteristics
The resistances R_T-ON and R_T-OFF are continuous variables that are obviously directly
related to the ON-OFF operations; together, they completely characterize the circuit
function. Through standard techniques of circuit analysis, one can express the values
of R_T-ON and R_T-OFF as the following simple mathematical functions of the other circuit
parameters:

    R_T-ON = [R3 R2 (E_z R4 + E_0 R1)] / [R1 (E_z R2 + E_z R4 - E_0 R2)]    (9.1)

    R_T-OFF = [R3 R2 R4] / [R1 (R2 + R4)]    (9.2)
Thus, by the criteria defined in Chapter 6, R_T-ON and R_T-OFF are appropriate choices
for the quality characteristics.

Suppose Equations (9.1) and (9.2) for the evaluation of R_T-ON and R_T-OFF were
not available and that hardware experiments were needed to determine their values.
Measuring R_T-ON and R_T-OFF would still be easy. It could be accomplished by incrementing
the value of R_T through small steps until the heater turns ON and decrementing
the value of R_T until the heater turns OFF.
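
Since Equations (9.1) and (9.2) are explicit, the thresholds can also be evaluated directly. The following Python sketch (not part of the original text) does so for the starting design and reproduces the nominal R_T-ON value of about 2.69 kΩ that appears later in Table 9.2.

def r_t_on(r1, r2, r3, r4, e_z, e_0):
    """Thermistor resistance at which the heater turns ON, Equation (9.1)."""
    return (r3 * r2 * (e_z * r4 + e_0 * r1)) / (r1 * (e_z * r2 + e_z * r4 - e_0 * r2))

def r_t_off(r1, r2, r3, r4):
    """Thermistor resistance at which the heater turns OFF, Equation (9.2)."""
    return (r3 * r2 * r4) / (r1 * (r2 + r4))

# Starting design (resistances in kilo-ohms, voltages in volts)
print(r_t_on(4.0, 8.0, 1.0, 40.0, 6.0, 10.0))   # about 2.69 kilo-ohms
print(r_t_off(4.0, 8.0, 1.0, 40.0))             # about 1.67 kilo-ohms

The gap between the two printed values is the hysteresis of the ON-OFF operation discussed earlier.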
Selection of S/N Ratio
The ideal relationship of R_T-ON and R_T-OFF with R3 (the signal factor) is linear, passing
through the origin, as shown in Figure 9.5. So for both quality characteristics, the
appropriate S/N ratio is the C-C type S/N ratio, described in Chapter 5.

Figure 9.5 Plot of R_T-ON and R_T-OFF vs. R3 for the starting design.

Suppose for some particular levels of the control factors and particular tolerances associated with
the noise factors, we express the dependence of R_T-ON on R3 by the following equation
obtained by a least squares fit:

    R_T-ON = β R3 + e    (9.3)
where β is the slope and e is the error. Note that any nonlinear terms in R3 (such as R3^2 or
R3^3) are included in the error e. The S/N ratio for R_T-ON is given by

    η = 10 log10 (β^2 / σ_e^2).    (9.4)
Similarly, let the dependence of R_T-OFF on R3 be given by

    R_T-OFF = β' R3 + e'    (9.5)

where β' is the slope and e' is the error. Then, the corresponding S/N ratio for
R_T-OFF is given by

    η' = 10 log10 (β'^2 / σ_e'^2).    (9.6)
Evaluation of the S/N Ratio
Let us first see the computation of the S/N ratio for R_T-ON. The nominal values of the
circuit parameters under the starting conditions, their tolerances (three-standard-deviation
limits), and the three levels for testing are shown in Table 9.1. These levels
were computed by the procedure described in Chapter 8; that is, for each noise factor,
the levels 1 and 3 are displaced from level 2, which is equal to its mean value, on
either side by √(3/2) σ, where σ is one-third the tolerance.
TABLE 9.1 NOISE AND SIGNAL FACTORS FOR TEMPERATURE CONTROL CIRCUIT

                                            Levels (multiply by the mean
                                            for the noise factors)
Factor            Mean*      Tolerance (%)      1         2         3
A. R1             4.0 kΩ     5                  1.0204    1.0       0.9796
B. R2             8.0 kΩ     5                  1.0204    1.0       0.9796
C. R4             40.0 kΩ    5                  1.0204    1.0       0.9796
D. E_0            10.0 V     5                  1.0204    1.0       0.9796
E. E_z            6.0 V      5                  1.0204    1.0       0.9796
M. R3 (signal)    -          -                  0.5 kΩ    1.0 kΩ    1.5 kΩ

* Mean values listed here correspond to the nominal values for the
starting design.
The ideal relationship between R3 and R_T-ON is a straight line through the origin
with the desired slope. Second- and higher-order terms in the relationship between R3
and R_T-ON should, therefore, be minimized. Thus, we take three levels for the signal
factor (R3): 0.5 kΩ, 1.0 kΩ, and 1.5 kΩ. Here R_T-ON must be zero when R3 is zero.
So, with three levels of R3, we can estimate the first-, second-, and third-order terms in
the dependence of R_T-ON on R3. The first-order, or linear, effect constitutes the
desired signal factor effect. We include the higher-order effects in the noise variance
so they are reduced with the maximization of η. [It is obvious from Equations (9.1)
and (9.2) that the second- and higher-order terms in R3 do not appear in this circuit.
Thus, taking only one level of R3 would have been sufficient. However, we take three
levels to illustrate the general procedure for computing the S/N ratio.]
As discussed in Chapter 8, an orthogonal array (called the noise orthogonal array)
can be used to simulate the variation in the noise factors. In addition to assigning
noise factors to the columns of an orthogonal array, we can also assign one of the
columns to the signal factor. From the values of R_T-ON corresponding to each row of
the noise orthogonal array, we can perform least squares regression (see Section 5.4, or
Hogg and Craig [H3], or Draper and Smith [D4]) to estimate β and σ_e^2, and then the
S/N ratio, η.
Chapter 8 pointed out that the computational effort can be reduced greatly by
forming a compound noise factor. For that purpose, we must first find the directionality of
the changes in R_T-ON caused by the various noise factors. By studying the derivatives
of R_T-ON with respect to the various circuit parameters, we observed the following
relationships: R_T-ON increases whenever R1 decreases, R2 increases, R4 decreases, E_0
increases, or E_z decreases. (If the formula for R_T-ON were complicated, we could have
used the noise orthogonal array to determine the directionalities of the effects.) Thus,
we form the three levels of the compound noise factor as follows:
    (CN)1: (R1)1, (R2)3, (R4)1, (E_0)3, (E_z)1
    (CN)2: (R1)2, (R2)2, (R4)2, (E_0)2, (E_z)2
    (CN)3: (R1)3, (R2)1, (R4)3, (E_0)1, (E_z)3 .
For every level of the signal factor, we calculate R_T-ON with the noise factor levels
set at the levels (CN)1, (CN)2, and (CN)3. Thus, we have nine testing conditions
for the computation of the S/N ratio. The nine values of R_T-ON corresponding to the
starting values of the control factors (R1 = 4.0 kΩ, R2 = 8.0 kΩ, R4 = 40.0 kΩ, and
E_z = 6.0 V) are tabulated in Table 9.2. Let y_i denote the value of R_T-ON for the i-th
testing condition, and let R3(i) be the corresponding value of R3. Then, from standard
least squares regression analysis (see Section 5.4), we obtain

    β = [R3(1) y_1 + R3(2) y_2 + ... + R3(9) y_9] / [R3(1)^2 + R3(2)^2 + ... + R3(9)^2]    (9.7)

The error variance is given by

    σ_e^2 = (1/8) Σ_{i=1}^{9} [y_i - β R3(i)]^2    (9.8)
Substituting the appropriate values from Table 9.2 in Equations (9.7) and (9.8), we
obtain β = 2.6991 and σ_e^2 = 0.030107. Thus, the S/N ratio for R_T-ON corresponding
to the starting levels of the control factors is

    η = 10 log10 (β^2 / σ_e^2) = 23.84 dB.

The S/N ratio, η', for R_T-OFF can be computed in exactly the same manner as
we computed the S/N ratio for R_T-ON.

Note that for dynamic systems, one must identify the signal factor and define the
S/N ratio before making a proper choice of testing conditions. This is the case with
the temperature control circuit.
TABLE 9.2 TESTING CONDITIONS FOR EVALUATING η

              R3                   CN
              (Signal Factor)      (Compound Noise     y = R_T-ON*
Test No.      (kΩ)                 Factor)             (kΩ)
1             0.5                  (CN)1               1.2586
2             0.5                  (CN)2               1.3462
3             0.5                  (CN)3               1.4440
4             1.0                  (CN)1               2.5171
5             1.0                  (CN)2               2.6923
6             1.0                  (CN)3               2.8880
7             1.5                  (CN)1               3.7757
8             1.5                  (CN)2               4.0385
9             1.5                  (CN)3               4.3319

* These R_T-ON values correspond to the starting levels for the
control factors.
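
For readers who want to reproduce these numbers, the following Python sketch (not part of the book) regenerates the nine testing conditions of Table 9.2 from Equation (9.1) and the noise levels of Table 9.1, and then applies Equations (9.7) and (9.8); it gives β ≈ 2.70, σ_e^2 ≈ 0.030, and η ≈ 23.84 dB.

import math

def r_t_on(r1, r2, r3, r4, e_z, e_0):
    """Equation (9.1)."""
    return (r3 * r2 * (e_z * r4 + e_0 * r1)) / (r1 * (e_z * r2 + e_z * r4 - e_0 * r2))

# Starting design and the three-level noise multipliers of Table 9.1
nominal = dict(r1=4.0, r2=8.0, r4=40.0, e_0=10.0, e_z=6.0)
up, mid, dn = 1.0204, 1.0, 0.9796

# Compound noise factor: (CN)1 pushes R_T-ON down, (CN)3 pushes it up
compound = {
    1: dict(r1=up,  r2=dn,  r4=up,  e_0=dn,  e_z=up),
    2: dict(r1=mid, r2=mid, r4=mid, e_0=mid, e_z=mid),
    3: dict(r1=dn,  r2=up,  r4=dn,  e_0=up,  e_z=dn),
}

signal_levels = [0.5, 1.0, 1.5]   # levels of R3 (kilo-ohms)

pairs = []                         # (R3, R_T-ON) for the nine testing conditions
for r3 in signal_levels:
    for cn in (1, 2, 3):
        scaled = {k: nominal[k] * compound[cn][k] for k in nominal}
        pairs.append((r3, r_t_on(r3=r3, **scaled)))

# Least squares fit through the origin, Equations (9.7) and (9.8)
beta = sum(r3 * y for r3, y in pairs) / sum(r3 ** 2 for r3, _ in pairs)
sigma2 = sum((y - beta * r3) ** 2 for r3, y in pairs) / (len(pairs) - 1)
eta = 10 * math.log10(beta ** 2 / sigma2)

print(round(beta, 4), round(sigma2, 6), round(eta, 2))   # about 2.6991, 0.0301, 23.84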
9.4 OPTIMIZATION OF THE DESIGN
Control Factors and Their Levels
The four control factors, their starting levels, and the alternate levels are listed in Table
9.3. For the three resistances (R1, R2, and R4), level 2 is the starting level, level 3 is
1.5 times the starting level, and level 1 is 1/1.5 = 0.667 times the starting level. Thus,
we include a fairly wide range of values with the three levels for each control factor.
Since the available range for E_z is restricted, we take its levels somewhat closer. Level
3 of E_z is 1.2 times level 2, while level 1 is 0.8 times level 2.
TABLE 9.3 CONTROL FACTORS FOR THE TEMPERATURE CONTROL CIRCUIT*

Factor            Level 1    Level 2    Level 3
A. R1 (kΩ)        2.668      4.0        6.0
B. R2 (kΩ)        5.336      8.0        12.0
C. R4 (kΩ)        26.68      40.0       60.0
D. E_z (V)        4.8        6.0        7.2

* Starting levels (level 2 of each factor) are indicated by an
underscore.
Control Orthogonal Array
The orthogonal array L9, which has four 3-level columns, is just right for studying the
effects of the four control factors. However, by taking a larger array, we can also get a
better indication of the additivity of the control factor effects. Further, computation is
very inexpensive for this circuit. So, we use the L18 array to construct the control
orthogonal array. The L18 array and the assignment of the control factors to the
columns are given in Table 9.4. The control orthogonal array for this study is the submatrix
of L18 formed by the columns assigned to the four control factors.
Data Analysis and Optimum Conditions
For each row of the control orthogonal array, we computed β and η for R_T-ON. The
values of η and β^2 are shown in Table 9.4 along with the control orthogonal array.
The possible range for the values of β is 0 to ∞, and we are able to get a better additive
model for β in the log transform. Therefore, we study the values of β in the decibel
scale, namely 20 log10 β. The results of performing the analysis of variance on η and
20 log10 β are tabulated in Tables 9.5 and 9.6. The control factor effects on η and
20 log10 β are plotted in Figure 9.6(a) and (b). Also shown in the figure are the control
factor effects on η' and 20 log10 β' corresponding to the quality characteristic R_T-OFF.
The following observations can be made from Figure 9.6(a):

* For the ranges of control factor values listed in Table 9.3, the overall S/N ratio
for the OFF function is higher than that for the ON function. This implies that
the spread of R_T-ON values caused by the noise factors is wider than the spread
of R_T-OFF values.

* The effects of the control factors on η' are much smaller than the effects on η.

* R1 has negligible effect on η or η'.

* η can be increased by decreasing R2; however, it leads to a small reduction in
η'.

* η can be increased by increasing R4; however, it too leads to a small reduction
in η'.

* η can be increased by increasing E_z, with no adverse effect on η'.
TABLE 9.4 CONTROL ORTHOGONAL ARRAY AND DATA FOR R_T-ON

         Column Numbers and
         Factor Assignment*
Expt.    1  2  3  4  5  6  7  8        η
No.      e  e  A  B  C  e  D  e       (dB)      β^2
 1       1  1  1  1  1  1  1  1       22.41      9.59
 2       1  1  2  2  2  2  2  2       23.84      7.29
 3       1  1  3  3  3  3  3  3       24.79      6.12
 4       1  2  1  1  2  2  3  3       25.85      5.33
 5       1  2  2  2  3  3  1  1       24.19      7.12
 6       1  2  3  3  1  1  2  2       19.47     15.66
 7       1  3  1  2  1  3  2  3       22.25     19.27
 8       1  3  2  3  2  1  3  1       23.61     15.04
 9       1  3  3  1  3  2  1  2       24.93      1.42
10       2  1  1  3  3  2  2  1       24.23     31.22
11       2  1  2  1  1  3  3  2       24.50      3.07
12       2  1  3  2  2  1  1  3       22.13      5.03
13       2  2  1  2  3  1  3  2       26.02     11.2
14       2  2  2  3  1  2  1  3       16.19     61.02
15       2  2  3  1  2  3  2  1       24.60      1.49
16       2  3  1  3  2  3  1  2       20.26     58.36
17       2  3  2  1  3  1  2  3       25.94      2.49
18       2  3  3  2  1  2  3  1       23.05      3.95

* Empty columns are indicated by e.
TABLE 9.5 ANALYSIS OF S/N RATIO DATA FOR R_T-ON*

            Average η by Level†        Degrees of    Sum of     Mean
Factor      1        2        3        Freedom       Squares    Square    F
A. R1       23.50    23.04    23.16    2             0.7        0.4       -
B. R2       24.70    23.58    21.42    2             33.33      16.67     22
C. R4       21.31    23.38    25.02    2             41.40      20.70     27
D. E_z      21.68    23.39    24.64    2             26.37      13.19     17
Error                                  9             6.87       0.76
Total                                  17            108.67

* Overall mean η = 23.23 dB.
† Starting levels are indicated by an underscore.
TABLE 9.6 ANALYSIS OF 20 log10 β FOR R_T-ON*

            Average 20 log10 β by Level†   Degrees of    Sum of     Mean
Factor      1        2        3            Freedom       Squares    Square    F
A. R1       12.18    9.27     6.01         2             114.2      57.1      94
B. R2       4.86     8.92     13.68        2             233.5      116.8     191
C. R4       10.55    9.01     7.89         2             21.4       10.7      18
D. E_z      10.40    9.01     8.05         2             16.8       8.4       14
Error                                      9             5.5        0.61
Total                                      17            391.4

* Overall mean value of 20 log10 β = 9.15 dB.
† Starting levels are indicated by an underscore.
Figure 9.6 Plots of factor effects on the S/N ratios for the ON and OFF functions and on
20 log10 β, with the levels of R1 (kΩ), R2 (kΩ), R4 (kΩ), and E_z (V) on the horizontal axis.
Underscore indicates starting level. Two-standard-deviation confidence limits are also shown
for the starting level. Estimated confidence limits for 20 log10 β are too small to show.
Thus, the optimum settings of the control factors suggested by Figure 9.6(a) are
R1 = 4.0 kΩ, R2 = 5.336 kΩ, R4 = 60.0 kΩ, and E_z = 7.2 V. The verification experiment
under the optimum conditions gave η = 26.43 dB, compared to 23.84 dB under
the starting design. Similarly, under the optimum conditions we obtained η' = 29.10
dB, compared to 29.94 dB under the starting design. Note that the increase in η is
much larger than the reduction in η'.

From Figure 9.6(b), it is clear that both R1 and R2 have a large effect on β and
β'. The control factors R4 and E_z have a somewhat smaller effect on β and β'. Since R1
has no effect on the S/N ratios, it is an ideal choice for adjusting the slopes β and β'.
This ability to adjust β and β' gives the needed freedom to: (1) match the values of
R_T-ON and R_T-OFF with the chosen thermistor for serving the desired temperature
range, and (2) obtain the desired hysteresis. As discussed earlier, the needed separation
(hysteresis) between R_T-ON and R_T-OFF is determined by the thermal analysis of the
heating system, which is not discussed here.
9.5 ITERATIVE OPTIMIZATION
The preceding section showed one cycle of Robust Design. It is clear from Figure 9.6
that the potential exists for further improvement. By taking the optimum point as a
starting design, one can repeat the optimization procedure to achieve this potential;
that was indeed done for the temperature control circuit. For each iteration, we took
the middle level for each control factor as the optimum level from the previous iteration,
and then took levels 1 and 3 to have the same relationship with level 2 for
that factor as in the first iteration. However, during these iterations, we did not let the
value of E_z exceed 7.2 V, so that adequate separation between E_z and E_0 was maintained.
The results obtained through three iterations are shown in Table 9.7. Of course, some additional
improvement is possible, but by the third iteration the rate of improvement has clearly slowed
down, so one need not proceed further.
The foregoing discussion points to the potential of using orthogonal arrays to
optimize nonlinear functions iteratively. Although we would expect this to be a topic
of active research, the following advantages of using orthogonal arrays over many com-
monly used nonlinear programming methods are apparent:
* No derivatives have to be computed
* Hessian does not have to be computed
* Algorithm is less sensitive to starting conditions
* Large number of variables can be handled easily
* Combinations of continuous and discrete variables can be handled easily
While most standard nonlinear programming methods are based on the first- and
second-order derivatives of the objective function at a point, the orthogonal array
method looks at a wide region in each iteration. That is, while the nonlinear program-
ming methods constitute point approaches, the orthogonal array method is a region
approach. Experience in using the orthogonal array method with a variety of problems
indicates that, because of the region approach, it works particularly well in the early
stages, that is, when the starting point is far from the optimum. Once we get near the
optimum point, some of the standard nonlinear programming methods, such as the
Newton-Raphson method, work very well. Thus one may wish to use the orthogonal
array method in the beginning and then switch to a standard nonlinear programming
method.
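
A skeleton of such an iterative, region-based search is sketched below in Python (this is an illustration, not a procedure taken from the book). Each iteration runs an L9 matrix experiment over three candidate levels per factor, picks the best level of each factor from the marginal means, and then re-centers and shrinks the search region. The objective function is an arbitrary stand-in; in the circuit study it would be the S/N ratio η.

# L9 orthogonal array: 9 runs, four 3-level columns (levels coded 0, 1, 2)
L9 = [
    (0, 0, 0, 0), (0, 1, 1, 1), (0, 2, 2, 2),
    (1, 0, 1, 2), (1, 1, 2, 0), (1, 2, 0, 1),
    (2, 0, 2, 1), (2, 1, 0, 2), (2, 2, 1, 0),
]

def objective(x):
    # Stand-in objective to be maximized; a real study would evaluate eta here.
    return -sum((xi - 3.0) ** 2 for xi in x)

def iterate(center, spread, n_iter=3, shrink=0.6):
    """Shrinking-region search driven by L9 matrix experiments."""
    for _ in range(n_iter):
        # three candidate levels per factor around the current center
        levels = [[c - s, c, c + s] for c, s in zip(center, spread)]
        runs = [[levels[f][lvl] for f, lvl in enumerate(row)] for row in L9]
        results = [objective(x) for x in runs]
        # marginal mean of the objective for each level of each factor
        best_levels = []
        for f in range(4):
            means = [sum(r for row, r in zip(L9, results) if row[f] == lvl) / 3.0
                     for lvl in range(3)]
            best_levels.append(max(range(3), key=lambda lvl: means[lvl]))
        center = [levels[f][best_levels[f]] for f in range(4)]
        spread = [s * shrink for s in spread]
    return center

print(iterate(center=[1.0, 1.0, 1.0, 1.0], spread=[1.0, 1.0, 1.0, 1.0]))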
TABLE 9.7 ITERATIVE OPTIMIZATION

                        η        η'       η + η'
Iteration Number        (dB)     (dB)     (dB)
Starting condition      23.84    29.94    53.78
Iteration 1             26.43    29.10    55.53
Iteration 2             27.30    28.70    56.00
Iteration 3             27.77    28.51    56.28
9.6 SUMMARY
* Dynamic systems are those in which we want the system's response to follow
the levels of the signal factor in a prescribed manner. The changing nature of
the levels of the signal factor and the response make the design of a dynamic
system more complicated than designing a static system. Nevertheless, the eight
steps of Robust Design (described in Chapter 4) still apply.
* A temperature controller is a feedback system and can be divided into three main
modules: (1) temperature sensor, (2) temperature control circuit, and (3) a heat-
ing (or cooling) element. For designing a robust temperature controller, the three
modules must be made robust separately and then integrated together.
* The temperature control circuit is a doubly dynamic system. First, for a particular
target temperature of the bath, the circuit must turn a heater ON or OFF at
specific threshold temperature values. Second, the target temperature may be
changed by the user.
* Four circuit parameters (R1, R2, R4, and E_z) were selected as control factors.
The resistance R3 was chosen as the signal factor. The tolerances in the control
factors were the noise factors.Sec. 9.6 Summary 229
* The threshold resistance, R_T-ON, at which the heater turns ON and the threshold
resistance, R_T-OFF, at which the heater turns OFF were selected as the quality
characteristics. The variation of R_T-ON and R_T-OFF as a function of R3 formed
two C-C type dynamic problems.
* To evaluate the S/N ratio (η) for the ON function, a compound noise factor, CN,
was formed. Three levels were chosen for the signal factor and the compound
noise factor. R_T-ON was computed at the resulting nine combinations of the signal
and noise factor levels, and the S/N ratio for the ON function was then computed.
(An orthogonal array can be used for computing the S/N ratio when
engineering judgment dictates that multiple noise factors be used.) The S/N ratio
for the OFF function (η') was evaluated in the same manner.
* The L18 array was used as the control orthogonal array. Through one cycle of
Robust Design, the sum η + η' was improved by 1.75 dB. Iterating the Robust
Design cycle three times led to a 2.50 dB improvement in η + η'.
* Orthogonal arrays can be used to optimize a nonlinear function iteratively. They
provide a region approach and perform especially well when the starting point is
far from the optimum and when some of the parameters are discrete.

Chapter 10
TUNING COMPUTER SYSTEMS FOR HIGH PERFORMANCE
A computer manufacturer invariably specifies the "best" operating conditions for a
computer. Why then is tuning a computer necessary? The answer lies in the fact that
manufacturers specify conditions based on assumptions about applications and loads.
However, actual applications and load conditions might be different or might change
over a period of time, which can lead to inferior computer performance. Assuming a
system has tunable parameters, two options are available to improve performance: (1)
buying more hardware or (2) finding optimum settings of tunable parameters. One
should always consider the option of improving the performance of a computer system
through better administration before investing in additional hardware. In fact, this
option should be considered for any system that has tunable parameters that can be set
by the system administrator. Determining the best settings of tunable parameters by
the prevailing trial-and-error method may, however, prove to be excessively time con-
suming and expensive. Robust Design provides a systematic and cost-efficient method
of experimenting with a large number of tunable parameters, thus yielding greater
improvements in performance more quickly.
This chapter presents a case study to illustrate the use of the Robust Design
method in tuning computer performance. A few details have been modified for
pedagogic purposes. The case study was performed by T. W. Pao, C. S. Sherrerd, and
M. S. Phadke [P1], who are considered to be the first to conduct such a study to optimize
a hardware-software system using the Robust Design method.
There are nine sections in this chapter:
Section 10.1 describes the problem formulation of the case study (Step 1 of the
Robust Design steps described in Chapter 4).
Section 10.2 discusses the noise factors and testing conditions (Step 2).
Section 10.3 describes the quality characteristic and the signal-to-noise (S/N)
ratio (Step 3).
Section 10.4 discusses control factors and their alternate levels (Step 4).
Section 10.5 describes the design of the matrix experiment and the experimental
procedure used by the research team (Steps 5 and 6).
Section 10.6 gives the data analysis and verification experiments (Steps 7 and 8).
Section 10.7 describes the standardized S/N ratio that is useful in compensating
for variation in load during the experiment.
Section 10.8 discusses some related applications.
Section 10.9 summarizes the important points of this chapter.
10.1 PROBLEM FORMULATION
The case study concerns the performance of a VAX* 11-780 machine running the
UNIX operating system, Release 5.0. The machine had 48 user terminal ports, two
remote job entry links, four megabytes of memory, and five disk drives. The average
number of users logged on at a time was between 20 and 30.
Before the start of the project, the users’ perceptions were that system perfor-
mance was very poor, especially in the afternoon. For an objective measurement of the
response time, the experimenters used two specific, representative commands called
standard and trivial. The standard command consisted of creating, editing, and remov-
ing a file; the trivial command was the UNIX system date command, which does not
involve input/output (I/O). Response times were measured by submitting these com-
mands via the UNIX system crontab facility and clocking the time taken for the com-
puter to respond using the UNIX system timex command, both of which are automatic
system processes.
For the particular users of this machine, the response time for the standard and
trivial commands could be considered representative of the response time for other
* VAX is a registered trademark of Digital Equipment Corporation.
various commands for that computer. In some other computer installations, response
time for the compilation of a C program or the time taken for the troff command (a
text processing command) may be more representative.
Figure 10.1(a)—(b) shows the variation of the response times as functions of
time of day for the standard and trivial commands. Note that at the start of the study,
the average response time increased as the afternoon progressed (see the curves marked
"Initial" in the figure), The increase in response time correlated well with the increase
in the work load during the afternoon. The objective in the experiment was to make
the response time uniformly small throughout the day, even when the load increased as,
usual.
There are two broad approaches for optimizing a complex system such as a com-
puter: (1) micro-modeling and (2) macro-modeling. They are explained next.
Figure 10.1 Comparison of response times: (a) for the standard command; (b) for the
trivial command. Response time (seconds) is plotted against time of day (9:00 to 16:00)
for the initial and optimized conditions, with ±1/2 standard deviation bars.
Micro-Modeling
As the name suggests, micro-modeling is based on an in-depth understanding of the
system. It begins by developing a mathematical model of the system, which, in this
case, would be of the complex internal queuing of the UNIX operating system. When
systems are complex, as in the case study, we must make assumptions that simplify the
operation, as well as put forth considerable effort to develop the model. Furthermore,
the more we simplify, the less realistic the model will be, and, hence, the less
adequate it will be for precise optimization. But once an adequate model is con-
structed, a number of well-known optimization methods, including Robust Design, can
be used to find the best system configuration.
Macro-Modeling
In macro-modeling, we bypass the step of building a mathematical model of the
system. Our concern is primarily with obtaining the optimum system configuration, not
with obtaining a detailed understanding of the system itself. As such, macro-modeling
gives faster and more efficient results. It gives the specific information needed for
optimization with a minimum expenditure of experimental resources.
The UNIX system is viewed as a "black box," as illustrated in Figure 10.2. The
parameters that influence the response time are identified and divided into two classes:
noise factors and control factors. The best settings of the control factors are deter-
mined through experiments. Thus, the Robust Design method lends itself well for
optimization through the macro-modeling approach.
Figure 10.2 Block diagram for the UNIX system: noise factors (system load) and
control factors (system configuration) are the inputs, and the response time is the output.
10.2 NOISE FACTORS AND TESTING CONDITIONS
Load variation during use of the machine, from day-to-day and as a function of the
time of day, constitutes the main noise factor for the computer system under study.
The number of users logged on, central processor unit (CPU) demand, I/O demand, and
memory demand are some of the more important load measures. Temperature and
humidity variations in the computer room, as well as fluctuations in the power supply
voltage, are also noise factors but are normally of minor consequence.
The case study was conducted live on the computer. As a result, the normal
variation of load during the day provided the various testing conditions for evaluating
the S/N ratio.
At the beginning of the study, the researchers examined the operating logs for
the previous few weeks to evaluate the day-to-day fluctuation in response time and
load. The examination revealed that the response time and the load were roughly similar
for all five weekdays. This meant that Mondays could be treated the same as
Tuesdays, etc. If the five days of the week had turned out to be markedly different from
each other, then those differences would have had to be taken into account in planning
the experiment.
10.3 QUALITY CHARACTERISTIC AND S/N RATIO
Let us first consider the standard command used in the study. Suppose it takes t_0
seconds to execute that command under the best circumstances, that is, when the load
is zero; t_0 is the minimum possible time for the command. Then, it becomes obvious
that the actual response time for the standard command minus t_0 is a quality characteristic
that is always nonnegative and has a target value of zero; that is, the actual
response time minus t_0 belongs to the smaller-the-better type problems. In the case
study, the various measurements of response time showed that t_0 was much smaller
than the observed response time. Hence, t_0 was ignored and the measured response
time was treated as a smaller-the-better type characteristic. The corresponding S/N
ratio to be maximized is

    η = -10 log10 (mean square response time for the standard command)    (10.1)
Referring to Figure 10.1, it is clear that at the beginning the standard deviation
of the response time was large, so much so that it is shown by bars of length ±1/2
standard deviation, as opposed to the standard practice of showing ±2 standard deviations.
From the quadratic loss function considerations, reducing both the mean and
variance is important. It is clear that the S/N ratio in Equation (10.1) accomplishes
this goal because the mean square response time is equal to the sum of the square of the mean
and the variance.
For the response time for the trivial command, the same formulation was used.
That is, the S/N ratio was defined as follows:
    η' = -10 log10 (mean square response time for the trivial command)    (10.2)
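
Equations (10.1) and (10.2) are ordinary smaller-the-better S/N ratios. A minimal Python sketch, with invented response-time readings, is:

import math

def sn_smaller_the_better(times):
    """S/N ratio of Equations (10.1)/(10.2): -10 log10 of the mean square response."""
    mean_square = sum(t ** 2 for t in times) / len(times)
    return -10 * math.log10(mean_square)

# Hypothetical response times (seconds) for the standard command over one day
samples = [4.2, 5.1, 6.8, 9.5, 7.3, 5.0]
print(round(sn_smaller_the_better(samples), 2))   # dB; larger (less negative) is better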
10.4 CONTROL FACTORS AND THEIR ALTERNATE LEVELS
The UNIX operating system provides a number of tunable parameters, some of which
relate to the hardware and others to the software. Through discussions with a group of
system administrators and computer scientists, the experiment team decided to include
the eight control factors listed in Table 10.1 for the tuning study. Among them, factors
A, C, and F are hardware related, and the others are software related. A description of
these parameters and their alternate levels is given next. The discussion about the
selection of levels is particularly noteworthy because it reveals some of the practical
difficulties faced in planning and carrying out Robust Design experiments.
TABLE 10.1 CONTROL FACTORS AND LEVELS

                                           Levels*
Factor                              1        2        3
A. Disk drives (RM05 & RP06)        4 & 1    4 & 2    4 & 2
B. File distribution                a        b        c
C. Memory size (MB)                 4        3        3.5
D. System buffers                   1/5      1/4      1/3
E. Sticky bits                      0        3        8
F. KMCs used                        2        0        -
G. INODE table entries              400      500      600
H. Other system tables              a        b        c

* The starting levels are indicated by an underscore.
The number and type of disk drives (factor A) is an important parameter that
determines the I/O access time. At the start, there were four RM05 disks and one
RP06 disk. The experimenters wanted to see the effect of adding one more RP06
disk (level A2), as well as the effect of adding one RP07 disk and a faster memory
controller (level A3). However, the RP07 disk did not arrive in time for the experiments.
So, level A3 was defined to be the same as level A2 for factor A.
section discusses the care taken in planning the matrix experiment, which allowed the
experimenters to change the plan in the middle of the experiments.
The file system distributions (factor B) a, b, and c refer to three specific algorithms
used for distributing the user and system files among the disk drives. Obviously,
the actual distribution depends on the number of disk drives used in a particular
system configuration. Since the internal entropy (a measure of the lack of order in
storing the files) could have a significant impact on response time, the team took care
to preserve the internal entropy while changing from one file system to another during
the experiments.
One system administrator suggested increasing the memory size (factor C) to
improve the response time. However, another expert opinion stated that
additional memory would not improve the response for the particular computer system
being studied. Therefore, the team decided not to purchase more memory until they
were reasonably sure its cost would be justified. They took level C1 as the existing
memory size, namely 4 MB, and disabled some of the existing memory to form levels
C2 and C3. They decided to purchase more memory only if the experimental data
showed that disabling a part of the memory leads to a significant reduction in perfor-
mance.
Total memory is divided into two parts: system buffers (factor D) and user
memory. The system buffers are used by the operating system to store recently used
data in the hope that the data might be needed again soon. Increasing the size of the
system buffers improves the probability (technically called hit ratio) of finding the
needed data in the memory. This can contribute to improved performance, but it also
reduces the memory available to the users, which can lead to progressively worse sys-
tem performance. Thus, the optimum size of system buffers can depend on the particu-
lar load pattern. We refer to the size of the system buffers as a fraction of the total
memory size. Thus, the levels of the system buffers are sliding with respect to the
memory size.
Sticky bit (factor E) is a way of telling an operating system to treat a command
in a special way. When the sticky bit for a command such as rm or ed is set, the exe-
cutable module for that command is copied contiguously in the swap area of a disk
during system initialization. Every time that command is needed but not found in the
memory, it is brought back from the swap area expeditiously in a single operation.
However, if the sticky bit is not set and the command is not found in the memory, it
must be brought back block by block from the file system. This adds to the execution
time.
In this case study, factor E specifies how many and which commands had their
sticky bits set. For level E1, no command had its sticky bit set. For level E2, the
three commands that had their sticky bits set were sh, ksh, and rm. These were the
three most frequently used commands during the month before the case study,
according to a 5-day accounting command summary report. For level E3, the eight
commands that had their sticky bits set were the three commands mentioned above,
plus the next five most commonly used commands, namely, the commands ls, cp, expr,
chmod, and sade (a local library command).
KMCs (factor F) are special devices used to assist the main CPU in handling the
terminal and remote job entry traffic. They attempt to reduce the number of interrupts
faced by the main CPU. In this case study, only the KMCs used for terminal traffic
were changed. Those used for the remote job entry links were left alone. For level
F1, two KMCs were used to handle the terminal traffic, whereas for level F2 the two
KMCs were disabled.
The number of entries in the INODE table (factor G) determines the number of
user files that can be handled simultaneously by the system. The three levels for the
factor G are 400, 500, and 600. The three levels of the eighth factor, namely, the other
system tables (factor H), are coded as a, b, and c.
Note that the software factors (B, D, E, G, and H) can affect only response time.
However, the three hardware factors (A, C, and F) can affect both the response time
and the computer cost. Therefore, this optimization problem is not a pure parameter
design problem, but, rather, a hybrid of parameter design and tolerance design.
10.5 DESIGN OF THE MATRIX EXPERIMENT AND THE
EXPERIMENTAL PROCEDURE
This case study has seven 3-level factors and one 2-level factor. There are
7 × (3-1) + 1 × (2-1) + 1 = 16 degrees of freedom associated with these factors. The
orthogonal array L18 is just right for this project because it has seven 3-level columns
and one 2-level column to match the needs of the matrix experiment. The L18 array
and the assignment of columns to factors are shown in Table 10.2. Aside from assign-
ing the 2-level factor to the 2-level column, there is really no other reason for assigning
a particular factor to a particular column. The factors were assigned to the columns in
the order in which they were listed at the time the experiment was planned. Some
aspects of the assignment of factors to columns are discussed next.
The experiment team found that changing the level of disk drives (factor A) was
the most difficult among all the factors because it required an outside technician and
took three to four hours. Consequently, in conducting these experiments, the team first
conducted all experiments with level A1 of the disk drives, then those with level A2 of
the disk drives, and finally those with level A3 of the disk drives. The experiments
with level A3 of the disk drives were kept for last to allow time for the RP07 disk to
arrive. However, because the RP07 disk did not arrive in time, the experimenters
redefined level A3 of the disk drives to be the same as level A2 and continued with the
rest of the plan. According to the dummy level technique discussed in Chapter 7, this
redefinition of a level does not destroy the orthogonality of the matrix experiment. This
arrangement, however, gives 12 experiments with level A2 of the disk drives; hence,
more accurate information about that level is obtained when compared to level A1.
This is exactly what we should look for because level A2 is the new level about which
we have less prior information.
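The degree-of-freedom count that led to the choice of the L18 array can be checked with a short sketch like the following (the function name and the Python form are ours, not part of the original study):

    def degrees_of_freedom(levels_per_factor):
        """Degrees of freedom for the factor effects plus one for the overall mean."""
        return sum(k - 1 for k in levels_per_factor) + 1

    # Seven 3-level factors and one 2-level factor, as in the tuning study:
    print(degrees_of_freedom([3] * 7 + [2]))   # prints 16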
TABLE 10.2 L18 ORTHOGONAL ARRAY AND FACTOR ASSIGNMENT

                Column Number and Factor Assignment
    Expt.    1    2    3    4    5    6    7    8
    No.      F    B    C    D    E    A    G    H
     1       1    1    1    1    1    1    1    1
     2       1    1    2    2    2    2    2    2
     3       1    1    3    3    3    3    3    3
     4       1    2    1    1    2    2    3    3
     5       1    2    2    2    3    3    1    1
     6       1    2    3    3    1    1    2    2
     7       1    3    1    2    1    3    2    3
     8       1    3    2    3    2    1    3    1
     9       1    3    3    1    3    2    1    2
    10       2    1    1    3    3    2    2    1
    11       2    1    2    1    1    3    3    2
    12       2    1    3    2    2    1    1    3
    13       2    2    1    2    3    1    3    2
    14       2    2    2    3    1    2    1    3
    15       2    2    3    1    2    3    2    1
    16       2    3    1    3    2    3    1    2
    17       2    3    2    1    3    1    2    3
    18       2    3    3    2    1    2    3    1
File distribution (factor B) was the second most difficult factor to change.
Therefore, among the six experiments with level A1 of the disk drives, the experiment
team first conducted the two experiments with level B1, then those with level B2, and
finally those with level B3. The same pattern was repeated for level A2 of the disk
drives and then level A3 of the disk drives. The examination of the L18 array given in
Table 10.2 indicates that some of the bookkeeping of the experimental conditions could
have been simplified if the team had assigned factor A to Column 2 and factor B to
Column 3. However, the inconvenience of the particular column assignment was
unimportant.
For each experiment corresponding to each row of the L18 array, the team ran
the system for two days and collected data on standard response time and trivial
response time once every 10 minutes from 9:00 a.m. to 5:00 p.m. While measuring
the response times, they made sure that the UNIX system cron facility did not schedule
some routine data collection operations at the same exact moments because this would
affect the measurement of response times.
Running experiments on a live system can invite a number of practical problems.
Therefore, the first thing the experimenters did was to seek the cooperation of users.
One problem of great concern to the users was that for a particular combination of con-
trol factor settings, the system performance could become bad enough to cause a major
disruption of their activities. To avoid this, the system administrator was instructed to
make note of such an event and go back to the pre-experiment settings of the various
factors. This would minimize the inconvenience to the users. Fortunately, such an
event did not occur, but had it happened, the experiment team would still have been
able to analyze the data and determine optimum conditions using the accumulation
analysis method described in Chapter 5 (see also Taguchi [T1], and Taguchi and Wu
[T7]).
Under the best circumstances, the team could finish two experiments per week.
For 18 experiments, it would then take nine weeks. However, during the experiments,
a snowstorm arrived and the Easter holiday was observed, both events causing the sys
tem load to drop to an exceptionally low level. Those days were eliminated from the
data, and the team repeated the corresponding combinations of control factor settings to
have data for every row of the matrix experiment.
10.6 DATA ANALYSIS AND VERIFICATION EXPERIMENTS
From the 96 measurements of standard response time for each experiment, the team
computed the mean response time and the S/N ratio. The results are shown in Table
10.3. Similar computations were made for the trivial response time, but they are not
shown here. The effects of the various factors on the S/N ratio for the standard
response time are shown, along with the corresponding ANOVA in Table 10.4. The
factor effects are plotted in Figure 10.3. Note that the levels C1, C2, and C3 of
memory size are 4.0 MB, 3.0 MB, and 3.5 MB, respectively, which are not in a mono-
tonic order. While plotting the data in Figure 10.3, the experimenters considered the
correct order.
It is apparent from Table 10.4 that the factor effects are rather small, especially
when compared to the error variance. The problem of getting large error variance is
more likely with live experiments, such as this computer system optimization experi-
ment, because the different rows of the matrix experiment are apt to see quite different
noise conditions, that is, quite different load conditions. Also, while running live
[Figure 10.3 consists of plots of the factor effects, η in dB, for each control factor: disk
drives, file distribution, memory size, system buffers, sticky bits, KMCs used, INODE
table entries, and other system tables.]

Figure 10.3 Factor effects for S/N ratio for standard response. Underscore indicates start-
ing level. One-standard-deviation limits are also shown.
experiments, the tendency is to choose the levels of control factors that are not far
apart. However, we can still make valuable conclusions about optimum settings of
control factors and then see if the improvements observed during the verification exper-
iment are significant or not.
TABLE 10.3 DATA SUMMARY FOR STANDARD RESPONSE TIME

    Expt.    Mean       η
    No.      (sec)     (dB)
     1       4.65    -14.66
     2       5.28    -16.37
     3       3.06    -10.49
     4       4.53    -14.85
     5       3.26    -10.94
     6       4.55    -14.96
     7       3.37    -11.77
     8       5.62    -16.72
     9       4.87    -14.67
    10       4.13    -13.52
    11       4.08    -13.75
    12       4.45    -14.19
    13       3.81    -12.89
    14       5.87    -16.75
    15       3.42    -11.65
    16       3.66    -12.23
    17       3.92    -12.81
    18       4.42    -13.71
The following observations can be made from the plots in Figure 10.3 and Table
10.4 (note that these conclusions are valid only for the particular load characteristics of
the computer being tuned):
1. Going from not setting any sticky bits to setting sticky bits on the three most
used commands does not improve the response time. This is probably because
these three commands tend to stay in the memory as a result of their very fre-
quent use, regardless of setting sticky bits. However, when sticky bits are set on
the five next most used commands, the response time improves by 1.69 dB.
This suggests that we should set sticky bits on the eight commands, and in future
experiments, we should consider even more commands for setting sticky bits.

2. KMCs do not help in improving response time for this type of computer environ-
ment. Therefore, they may be dropped as far as terminal handling is concerned,
thus reducing the cost of the hardware.
TABLE 10.4 ANALYSIS OF S/N RATIOS FOR STANDARD RESPONSE*

                              Average η by Factor Level    Degrees of   Sum of    Mean
    Factor                      1         2         3      Freedom      Squares   Square    F
    A. Disk drives            _-14.37_  -13.40                 1          3.76      3.76    1.0
    B. File distribution       -13.84   -13.67    -13.65       2          0.12†     0.06
    C. Memory size            _-13.32_  -14.56    -13.28       2          6.40      3.20    0.8
    D. System buffers          -13.74   -13.31   _-14.11_      2          1.92      0.96    0.3
    E. Sticky bits            _-14.27_  -14.34                 2         12.27      6.14    1.6
    F. KMCs used              _-13.94_  -13.50                 1          0.84†     0.84
    G. INODE table entries     -13.91   -13.51   _-13.74_      2          0.47†     0.24
    H. Other system tables     -13.53   -14.15   _-13.48_      2          1.68      0.84    0.2
    Error                                                      3         32.60     10.87
    Total                                                     17         60.06      3.53
    (Error)                                                   (9)       (34.03)    (3.78)

    * Overall mean η = -13.72 dB; underscore indicates starting conditions.
    † Indicates the sums of squares that were added to form the pooled error sum of squares
      shown in parentheses.
3. Adding one more disk drive leads to better response time. Perhaps even more
disks should be considered for improving the response time. Of course, this
would mean more cost, so proper trade-offs would have to be made.
4. The S/N ratio is virtually the same for 4 MB and 3.5 MB memory. It is
significantly lower for 3 MB memory. Thus, 4 MB seems to be an optimum
value—that is, buying more memory would probably not help much in improv-
ing response time.

5. There is some potential advantage (0.8 dB) in changing the fraction of system
buffers from 1/3 to 1/4.

6. The effects of the remaining three control factors are very small and there is no
advantage in changing their levels.
Optimum Conditions and Verification Experiment
The optimum system configuration inferred from the data analysis above is shown in
Table 10.5 along with the starting configuration. Changes were recommended in the
settings of sticky bits, disk drives, and system buffers because they lead to faster
response. KMCs were dropped because they did not help improve response, and drop-
ping them meant saving hardware. The prediction of the S/N ratio for the standard
response time under the starting and optimum conditions is also shown in Table 10.5.
Note that the contributions of the factors, whose sums of squares were among the small-
est and were pooled, are ignored in predicting the S/N ratio. Thus, the S/N ratio
predicted by the data analysis under the starting condition is -14.67 dB, and under the
optimum conditions it is -11.22 dB. The corresponding predicted rms response times
under the starting and optimum conditions are 5.41 seconds and 3.63 seconds, respec-
tively.
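The arithmetic behind these predictions can be sketched as follows; the prediction adds the retained contributions from Table 10.5 to the overall mean, and the rms response time is recovered by inverting the S/N definition. The function names are ours, and the code is only an illustration of the calculation, not part of the original study.

    import math

    def predict_eta(overall_mean_db, contributions_db):
        """Additive-model prediction: overall mean plus the retained contributions."""
        return overall_mean_db + sum(contributions_db)

    def rms_from_eta(eta_db):
        """Invert eta = -10 log10(mean square response time) to get the rms time (sec)."""
        return 10.0 ** (-eta_db / 20.0)

    # Contributions (dB) retained for the starting and optimum conditions (Table 10.5)
    eta_start = predict_eta(-13.72, [-0.65, 0.40, -0.39, -0.55, 0.24])   # about -14.67 dB
    eta_opt   = predict_eta(-13.72, [0.32, 0.40, 0.41, 1.14, 0.24])      # about -11.21 dB
    print(rms_from_eta(eta_start), rms_from_eta(eta_opt))                # roughly 5.4 and 3.6 sec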
TABLE 10.5 OPTIMUM SETTINGS AND PREDICTION OF STANDARD RESPONSE TIME

                                  Starting Condition        Optimum Condition
                                         Contribution†             Contribution†
    Factor                      Setting      (dB)          Setting     (dB)
    A. Disk drives*               A1         -0.65           A2         0.32
    B. File distribution          B3           -             B3          -
    C. Memory size                C1          0.40           C1         0.40
    D. System buffers*            D3         -0.39           D2         0.41
    E. Sticky bits*               E1         -0.55           E3         1.14
    F. KMCs used*                 F1           -             F2          -
    G. INODE table entries        G3           -             G3          -
    H. Other system tables        H3          0.24           H3         0.24
    Overall mean                            -13.72                    -13.72
    Total                                   -14.67                    -11.22

    * Indicates the factors whose levels are changed from the starting to the
      optimum conditions.
    † By contribution we mean the deviation from the overall mean caused by the
      particular factor level.
As noted earlier, here the error variance is large. We would also expect the vari-
ance of the prediction error to be large. The variance of the prediction error can be
computed by the procedure given in Chapter 3 [see Equation (3.14)]. The equivalent
sample size for the starting condition, n_s, is given by

    1/n_s = 1/n + (1/n_A1 - 1/n) + (1/n_C1 - 1/n) + (1/n_D3 - 1/n)
                + (1/n_E1 - 1/n) + (1/n_H3 - 1/n)

          = 1/18 + (1/6 - 1/18) + (1/6 - 1/18) + (1/6 - 1/18) + (1/6 - 1/18) + (1/6 - 1/18)

          = 0.61.

Thus, n_s = 1.64. Correspondingly, the two-standard-deviation confidence limits
for the predicted S/N ratio under the starting conditions are -14.67 ± 2√(3.78/1.64),
which simplify to -14.67 ± 3.04 dB. Similarly, the two-standard-deviation confidence
limits for the predicted S/N ratio under the optimum conditions are
-11.22 ± 2√(3.78/1.89), which simplify to -11.22 ± 2.83 dB. Note that there is a slight
difference between the widths of the two confidence intervals. This is because there
are 12 experiments with level A2 and only six with A1.
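A small sketch of this calculation, with function names of our own choosing (the equivalent-sample-size formula is the one from Chapter 3 used above):

    import math

    def equivalent_sample_size(n_total, level_counts):
        """1/n_eff = 1/n + sum over prediction terms of (1/n_level - 1/n)."""
        inv = 1.0 / n_total + sum(1.0 / m - 1.0 / n_total for m in level_counts)
        return 1.0 / inv

    def two_sigma_limits(eta_db, error_variance, n_eff):
        half_width = 2.0 * math.sqrt(error_variance / n_eff)
        return eta_db - half_width, eta_db + half_width

    # Starting condition: the five retained factor levels each occur in 6 of the 18 runs.
    n_start = equivalent_sample_size(18, [6, 6, 6, 6, 6])     # about 1.64
    # Optimum condition: level A2 occurs in 12 runs, the other four levels in 6 runs each.
    n_opt = equivalent_sample_size(18, [12, 6, 6, 6, 6])      # about 1.89
    print(two_sigma_limits(-14.67, 3.78, n_start))            # about (-17.71, -11.63), i.e. +/- 3.04 dB
    print(two_sigma_limits(-11.22, 3.78, n_opt))              # about (-14.05, -8.39),  i.e. +/- 2.83 dB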
Subsequently, the optimum configuration was implemented. The average
response times for the standard and trivial commands for over two weeks of operation
under the optimum conditions are plotted in Figure 10.1(a)—(b). Comparing the
response time under the starting (initial) configuration with that under the optimum
configuration, we see that under the optimum conditions the response time is small,
even in the afternoon when the load is high. In fact, the response time is uniformly
low throughout the day.
The data from the confirmation experiment are summarized in Table 10.6. For
standard response time we see that the mean response time was reduced from 6.15 sec.
to 2.37 sec., which amounts to a 61-percent improvement. Similar improvement was
seen in the rms response time. On the S/N ratio scale, the improvement was 8.39 dB.
Similarly, for the trivial response time, the improvement was seen to be between 70
percent and 80 percent of the mean value, or 13.64 dB, as indicated in Table 10.6.
TABLE 10.6 RESULTS OF VERIFICATION EXPERIMENT

                         Standard Response Time               Trivial Response Time
                    Starting  Optimum                    Starting  Optimum
    Measure          Levels    Levels    Improvement      Levels    Levels    Improvement
    Mean (sec)        6.15      2.37        61%            0.521     0.148       71%
    rms (sec)         7.59      2.98        61%            0.962     0.200       79%
    η (dB)          -17.60     -9.21      8.39 dB          +0.34    +13.98     13.64 dB
    Standardized    -15.88     -8.64      7.24 dB          +3.26    +15.71     12.45 dB
      η (dB)
Note that the observed S/N ratios for standard response time under the starting
and optimum conditions are within their respective two-standard-deviation confidence
limits. However, they are rather close to the limits. Also, the observed improvement
(8.39 dB) in the S/N ratio is quite large compared to the improvement predicted by the
data (3.47 dB). However, the difference is well within the confidence limits. Also,
observe that the S/N ratio under the optimum conditions is better than the best among
the 18 experiments.
Thus, here we achieved a 60- to 70-percent improvement in response time by
improved system administration. Following this experiment, two similar experiments
were performed by Klingler and Nazaret [K5], who took extra care to ensure that pub-
lished UNIX system tuning guides were used to establish the starting conditions. Their
experiments still led to a 20- to 40-percent improvement in response time. One extra
factor they considered was the use of PDQs, which are special auxiliary processors for
handling text processing jobs (troff). For their systems, it turned out that the use of
PDQs could hurt the response time rather than help.
10.7 STANDARDIZED S/N RATIO
When running Robust Design experiments with live systems, we face the methodologi-
cal problem that the noise conditions, which are the load conditions for our computer
system optimization, are not the same for every row of the control orthogonal array.
This can lead to inaccuracies in the conclusions. One way to minimize the impact of
changing noise conditions is to construct a standardized S/N ratio, which we describe
next.
As noted earlier, some of the more important load measures for the computer
system optimization experiment are: number of users, CPU demand, I/O demand, and
memory demand. After studying the load pattern over the days when the case study
was conducted, we can define low and high levels for each of these load measures, as
shown in Figure 10.4. These levels should be defined so that for every experiment we
have a reasonable number of observations at each level. The 16 different possible
combinations of the levels of these four load measures are listed in Table 10.7 and are
nothing more than 16 different noise conditions.
[Figure 10.4 shows the average value and range of each of the four load measures
(number of users, CPU demand, I/O demand, and memory demand), together with the
low and high levels chosen for defining the noise conditions.]

Figure 10.4 Selection of levels for load measures.
In a live experiment, the number of observations for each noise condition can
change from experiment to experiment. For instance, one day the load might be heavy
while another day it might be light. Although we cannot dictate the load condition for
the different experiments, we can observe the offered load. The impact of load varia-
tion can be minimized as follows: We first compute the average response time for
each experiment in each of the 16 load conditions. We then treat these 16 averages as
raw data to compute the S/N ratio for each experiment. The S/N ratio computed in
this manner is called standardized S/N ratio because it effectively standardizes the load
conditions for each experiment. The system can then be optimized using this standard-
ized S/N ratio.
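A sketch of this computation is given below; the function name and the data layout (a list of pairs of noise condition and observed response time for one row of the matrix experiment) are ours, chosen only to illustrate the standardization step.

    import math
    from collections import defaultdict

    def standardized_sn_ratio(observations):
        """observations: (noise_condition, response_time) pairs for one experiment,
        where noise_condition is a tuple such as (users, cpu, io, memory), each coded 1 or 2."""
        by_condition = defaultdict(list)
        for condition, y in observations:
            by_condition[condition].append(y)
        # Average the response times within each observed noise condition, then treat
        # the averages as the raw data for a smaller-the-better S/N ratio.
        averages = [sum(v) / len(v) for v in by_condition.values()]
        mean_square = sum(a * a for a in averages) / len(averages)
        return -10.0 * math.log10(mean_square)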
Note that for the standardized S/N ratio to work, we must have good definitions
of noise factors and ways of measuring them. Also, in each experiment every noise
condition must occur at least once to be able to compute the average. In practice,
TABLE 10.7 COMPUTATION OF STANDARDIZED S/N RATIO*

    Noise       No. of    CPU       I/O       Memory    Average Response
    Condition   Users     Demand    Demand    Demand    Time for Experiment i
      1           1         1         1         1            y_i1
      2           1         1         1         2            y_i2
      3           1         1         2         1            y_i3
      4           1         1         2         2            y_i4
      5           1         2         1         1            y_i5
      6           1         2         1         2            y_i6
      7           1         2         2         1            y_i7
      8           1         2         2         2            y_i8
      9           2         1         1         1            y_i9
     10           2         1         1         2            y_i10
     11           2         1         2         1            y_i11
     12           2         1         2         2            y_i12
     13           2         2         1         1            y_i13
     14           2         2         1         2            y_i14
     15           2         2         2         1            y_i15
     16           2         2         2         2            y_i16

    * Standardized S/N ratio for experiment i:
          η_i = -10 log10 [ (1/16) Σ_{j=1}^{16} y_ij² ]
however, if one or two conditions are missing, we may compute the S/N ratio with the
available noise conditions without much harm.
In the computer system optimization case study discussed in this chapter, the
experimenters used the concept of standardized S/N ratio to obtain better comparison of
the starting and optimum conditions. Some researchers expressed a concern that part
of the improvement observed in this case study might have been due to changed load
conditions over the period of three months that it took the team to conduct the matrix
experiment. Accordingly, the experimenters computed the standardized S/N ratio for
the two conditions and the results are shown in Table 10.6 along with the other results
of the verification experiment. Since the improvement in the standardized S/N ratio is
quite close to that in the original S/N ratio, the experiment team concluded that the
load change had a minimal impact on the improvement.Sec. 10.9 Summary 249
10.8 RELATED APPLICATIONS
Computer System Parameter Space Mapping
In practice, running live experiments to optimize every computer installation is obvi-
ously not practicable. Instead, more benefits can be achieved by performing off-line
matrix experiments to optimize for several different load conditions. Such load condi-
tions can be simulated using facilities like the UNIX system benchmarking facility.
The information gained from these off-line experiments can be used to map the operat-
ing system parameter space. This catalog of experimental results can then be used to
improve the performance of different machines.
Optimizing Operations Support Systems
Large systems, such as the traffic in a telecommunications network, are often managed
by software systems that are commonly known as operations support systems. The
load offered to a telephone network varies widely from weekday to weekend, from
morning to evening, from working days to holidays, etc. Managing the network
includes defining strategies for providing the needed transmission and switching facili-
ties, routing a call end to end on the network, and many other tasks. Steve Eick and
Madhav Phadke applied the Robust Design method in AT&T to improve the strategy
for adapting the routing algorithm to handle unusually high demands. Studying eight
parameters of the strategy using the L18 orthogonal array, they were able to increase
the excess load carried during unusually high demands by as much as 70 percent. This
not only improved the revenue, but also improved network performance by reducing
the number of blocked calls. These experiments were done on a simulator because of
the high risks of the live experiments.
International telephone traffic is carried over several different types of transmis-
sion facilities. Seshadri [S4] used the Robust Design method to determine the pre-
ferred facility for transmitting facsimile data by conducting live experiments on the
network.
The success in improving the telephone network traffic management suggests that
the Robust Design approach could be used successfully to improve other networks, for
example, air traffic management. In fact, it is reported that the method has been used
in Japan to optimize runway and air-space usage at major airports.
10.9 SUMMARY
• The performance of complex systems (such as a computer), which have tunable
parameters, can often be improved through better system administration. Tuning
is necessary because the parameter settings specified by the system manufacturer
pertain to specific assumptions regarding the applications and load. However,
the actual applications and load could be different from those envisioned by the
manufacturer.
• There are two systematic approaches for optimizing a complex system: (1) micro-modeling and (2) macro-modeling. The macro-modeling approach can utilize the Robust Design method to achieve more rapid and efficient results.

• Load variation during use of the computer, from day to day and as a function of the time of day, constitutes the main noise factor for the computer system. The number of users logged on, CPU demand, I/O demand, and memory demand are some of the more important load measures.

• The response time for the standard (or the trivial) command minus the minimum possible time for that command was the quality characteristic for the case study. It is a smaller-the-better type characteristic. The minimum possible response time was ignored in the analysis as it was very small compared to the average response time.

• Eight control factors were chosen for the case study: disk drives (A), file distribution (B), memory size (C), system buffers (D), sticky bits (E), KMCs used (F), INODE table entries (G), and other system tables (H). Factors A, C, and F are hardware related, whereas the others are software related. Factor F had two levels while the others had three levels.

• The L18 orthogonal array was used for the matrix experiment. Disk drives was the most difficult factor to change, and file distribution was the next most difficult factor to change. Therefore, the 18 experiments were conducted in an order that minimized changes in these two factors. Also, the experiments were ordered in a way that allowed redefining level 3 of disk drives, as anticipated during the planning stage.

• The experiments were conducted on a live system. Each experiment lasted for two days, with eight hours per day. Response time for the standard and the trivial commands was observed once every 10 minutes by using an automatic measurement facility.

• Running Robust Design experiments on live systems involves a methodological problem that the noise conditions (which are the load conditions in the computer response time case study) are not the same for every row of the matrix experiment. Consequently, the estimated error variance can be large compared to the estimated factor effects. Also, the predicted response under the optimum conditions can have large variance. Indeed, this was observed in the present case study.

• Standardized S/N ratios can be used to reduce the adverse effect of changing noise conditions, which is encountered in running Robust Design experiments on live systems.

• Data analysis indicated that levels of four control factors (A, D, E, and F) should be changed and that the levels of the other four factors should be kept the same. It also indicated that the next round of experiments should consider setting sticky bits on more than eight commands.

• The verification experiment showed a 61-percent improvement in both the mean and rms response time for the standard command. This corresponds to an 83-percent reduction in variance of the response time. It also showed a 71-percent improvement in mean response time and a 79-percent improvement in rms response time for the trivial command.

• Computer response time optimization experiments can be conducted off-line using automatic load generation facilities such as the UNIX system benchmarking facility. Off-line experiments are less disruptive for the user community. Orthogonal arrays can be used to determine the load conditions to be simulated.

Chapter 11
RELIABILITY
IMPROVEMENT
Increasing the longevity of a product, the time between maintenance of a manufactur-
ing process, or the time between two tool changes is always an important consideration
in engineering design. These considerations are often lumped under the term reliability
improvement. There are three fundamental ways of improving the reliability of a pro-
duct during the design stage: (1) reduce the sensitivity of the product’s function to the
variation in product parameters, (2) reduce the rate of change of the product parame-
ters, and (3) include redundancy. The first approach is nothing more than the parame-
ter design described earlier in this book. The second approach is analogous to toler-
ance design and it typically involves more expensive components or manufacturing
processes. Thus, this approach should be considered only after sensitivity has been
minimized. The third approach is used when the cost of failure of the product is high
compared to the cost of providing redundant components or even the whole product.
The most cost-effective approach for reliability improvement is to find appropri-
ate continuous quality characteristics and reduce their sensitivity to all noise factors.
Guidelines for selecting quality characteristics related with product reliability were dis-
cussed in Chapter 6 in connection with the design of the paper handling system in
copying machines. As noted in Chapter 6, determining such quality characteristics is
not always easy with existing engineering know-how. Life tests, therefore, must be
performed to identify settings of control factors that lead to longer product life. This
chapter shows the use of the Robust Design methodology in reliability improvement
through a case study of router bit life improvement conducted by Dave Chrisman and
Madhav Phadke, documented in Reference [P3].
This chapter has nine sections:
Section 11.1 describes the role of signal-to-noise (S/N) ratios in reliability
improvement.
Section 11.2 describes the routing process and the goal for the case study (Step 1
of the eight Robust Design steps described in Chapter 4).
Section 11.3 describes the noise factors and quality characteristic (Steps 2 and 3).
Section 11.4 gives the control factors and their alternate levels (Step 4).
Section 11.5 describes the construction of the control orthogonal array (Step 5).
The requirements for the project were such that the construction of the control
orthogonal array was a complicated combinatoric problem.
Section 11.6 describes the experimental procedure (Step 6).
Section 11.7 gives the data analysis, selection of optimum conditions, and the
verification experiment (Steps 7 and 8).
Section 11.8 discusses estimation of the factor effects on the survival probability
curves.
Section 11.9 summarizes the important points of this chapter.
11.1 ROLE OF S/N RATIOS IN RELIABILITY IMPROVEMENT
First, let us note the difference between reliability characterization and reliability
improvement. Reliability characterization refers to building a statistical model for the
failure times of the product. Log-normal and Weibull distributions are commonly used
for modeling the failure times. Such models are most useful for predicting warranty
cost. Reliability improvement means changing the product design, including the set-
tings of the control factors, so that time to failure increases.
Invariably, it is expensive to conduct life tests so that an adequate failure-time
model can be estimated. Consequently, building adequate failure-time models under
various settings of control parameters, as in an orthogonal array experiment, becomes
impractical and, hence, is hardly ever done. In fact, it is recommended that conducting
life tests should be reserved as far as possible only for a final check on a product.
Accelerated life tests are well-suited for this purpose.
For improving a product’s reliability, we should find appropriate quality charac-
teristics for the product and minimize its sensitivity to all noise factors. This automati-
cally increases the product's life. The following example clarifies the relationship
between the life of a product and sensitivity to noise factors.
Consider an electrical circuit whose output voltage, y, is a critical characteristic.
If it deviates too far from the target, the circuit’s function fails. Suppose the variation
in a resistor, R, plays a key role in the variation of y. Also, suppose, the resistance R
is sensitive to environmental temperature and that the resistance increases at a certainSec. 11.1 Role of S/N Ratios in Reliability Improvement 255
rate with aging. During the use of the circuit, the ambient temperature may go too
high or too low, or sufficient time may pass leading to a large deviation in R. Conse-
quently, the characteristic y would go outside the limits and the product would fail.
Now, if we change the nominal values of appropriate control factors, so that y is much
less sensitive to variation in R, then for the same ambient temperatures faced by the
circuit and the same rate of change of R due to aging, we would get longer life out of
that circuit.
Sensitivity of the voltage y to the noise factors is measured by the S/N ratio.
Note that in experiments for improving the S/N ratio, we may use only temperature as
the noise factor. Reducing sensitivity to temperature means reducing sensitivity to
variation in R and, hence, reducing sensitivity to the aging of R also. Thus, by
appropriate choice of testing conditions (noise factor settings) during Robust Design
experiments, we can improve the product life as well.
Estimation of Life Using a Benchmark Product
Estimating the life of a newly designed product under customer-usage conditions is
always a concern for the product designer. It is particularly important to estimate the
life without actually conducting field studies because of the high expense and the long
delay in getting results. Benchmark products and the S/N ratio can prove to be very
useful in this regard. A benchmark product could be the earlier version of that product
with which we have a fair amount of field experience.
Suppose we test the current product and the benchmark product under the same
set of testing conditions. That is, we measure the quality characteristic under different
levels of the noise factors. For example, in the circuit example above, we may meas-
ure the resistance R at the nominal temperature and one or more elevated temperatures.
The test may also include holding the circuit at an elevated temperature for a short
period of time and then measuring R. By analyzing the test data, suppose we find the
S/N ratio for the current product as η1 and for the benchmark product as η2. Then,
the sensitivity of the quality characteristic for the current product is η1 - η2 dB lower
than that for the benchmark product—that is, the variance for the current product is
smaller by a factor of r, where r is given by

    r = 10^[(η1 - η2)/10].
It is often the case that the rate of drift of a product’s quality characteristic is
proportional to the sensitivity of the quality characteristic to noise factors. Also, the
drift in the quality characteristic as a function of time can be approximated reasonably
well by the Wiener process. Then, through standard theory of level crossing, we can
infer that the average life of the current product would be r times longer than the life256 Reliability Improvement Chap. 11
of the benchmark product whose average life is known through past experience. Thus,
the S/N ratio permits us to estimate the life of a new product in a simple way without
conducting expensive and time-consuming life tests.
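A minimal sketch of this estimate, under the assumptions just stated (the S/N ratios and the benchmark life below are made-up illustrative numbers, and the function name is ours):

    def life_ratio(eta_current_db, eta_benchmark_db):
        """Variance ratio (and, under the stated assumptions, average-life ratio)
        implied by an S/N improvement of (eta_current - eta_benchmark) dB."""
        return 10.0 ** ((eta_current_db - eta_benchmark_db) / 10.0)

    # A new design that is 3 dB better than a benchmark whose average life is known:
    r = life_ratio(-11.0, -14.0)            # about 2.0
    estimated_life = r * 4000.0             # hypothetical benchmark life of 4,000 hours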
This section described the role of S/N ratios in reliability improvement. This is
a more cost-effective and, hence, preferred way to improve the reliability of a product
or a process. However, for a variety of reasons (including lack of adequate engineering
know-how about the product) we are forced, in some situations, to conduct life tests to
find a way of improving reliability. In the remaining sections of this chapter, we
describe a case study of improving the life of router bits by conducting life studies.
11.2 THE ROUTING PROCESS
Typically, printed wiring boards are made in panels of 18 × 24 in. size. Appropriate
size boards, say 8 × 4 in., are cut from the panels by stamping or by the routing pro-
cess. A benefit of the routing process is that it gives good dimensional control and
smooth edges, thus reducing friction and abrasion during the circuit pack insertion pro-
cess. When the router bit gets dull, it produces excessive dust which then cakes on the
edges of the boards and makes them rough. In such cases, a costly cleaning operation
is necessary to smooth the edges. However, changing the router bits frequently is also
expensive. In the case study, the objective was to increase the life of the router bits,
primarily with regard to the beginning of excessive dust formation.
The routing machine used had four spindles, all of which were synchronized in
their rotational speed, horizontal feed (X-Y feed), and vertical feed (in-feed). Each
spindle did the routing operation on a separate stack of panels. Typically, two to four
panels are stacked together to be cut by each spindle. The cutting process consists of
lowering the spindle to an edge of a board, cutting the board all around using the X-Y
feed of the spindle, and then lifting the spindle. This is repeated for each board on a
panel.
11.3 NOISE FACTORS AND QUALITY CHARACTERISTICS
Some of the important noise factors for the routing process are the out-of-center rota-
tion of the spindle, the variation from one router bit to another, the variation in the
material properties within a panel and from panel to panel, and the variation in the
speed of the drive motor.
Ideally, we should look for a quality characteristic that is a continuous variable
related to the energy transfer in the routing process. Such a variable could be the wear
of the cutting edge or the change in the cutting edge geometry. However, these vari-
ables are difficult to measure, and the researchers wanted to keep the experiment sim-
ple. Therefore, the amount of cut before a bit starts to produce an appreciable amount
of dust was used as the quality characteristic. This is the useful life of the bit.Sec. 11.4 Control Factors and Their Levels 257
11.4 CONTROL FACTORS AND THEIR LEVELS
The control factors selected for this project are listed in Table 11.1. Also listed in the
table are the control factors’ starting and alternate levels. The rationale behind the
selection of some of these factors and their levels is given next.
TABLE 11.1 CONTROL FACTORS AND THEIR LEVELS

                                        Levels*
    Factors                       1       2       3       4
    A. Suction (in. of Hg)        1      _2_
    B. X-Y feed (in./min)       _60_      80
    C. In-feed (in./min)          10      50
    D. Type of bit                1       2       3      _4_
    E. Spindle position†          1       2       3       4
    F. Suction foot               SR     _BB_
    G. Stacking height (in.)     3/16   _1/4_
    H. Depth of slot (thou)       60      100
    I. Speed (rpm)               30K    _40K_

    * Starting levels are indicated by an underscore.
    † Spindle position is not a control factor. It is a
      noise factor.
Suction (factor A) is used around the router bit to remove the dust as it is gen-
erated. Obviously, higher suction could reduce the amount of dust retained on the
boards. The starting suction was two inches of mercury (Hg). However, the pump
used in the experiment could not produce more suction. So, one inch of Hg was
chosen as the alternate level, with the plan that if the experiments showed a significant
difference in the dust, a more powerful pump would be obtained.
Related to the suction are suction foot and the depth of the backup slot. The suc-
tion foot determines how the suction is localized near the cutting point. Two types of
suction foot (factor F) were chosen: solid ring (SR) and bristle brush (BB). A backup
panel is located underneath the panels being routed. Slots are precut in this backup
panel to provide air passage and a place for dust to accumulate temporarily. The depth
of these slots was a control factor (factor H) in the case study.258 Reliability Improvement Chap. 11
Stack height (factor G) and X-Y feed (factor B) are control factors related to the
productivity of the process—that is, they determine how many boards are cut per hour.
The 3/16-in. stack height meant three panels were stacked together while 1/4-in. stack
height meant four panels were stacked together. The in-feed (factor C) determines the
impact force during the lowering of the spindle for starting to cut a new board. Thus,
it could influence the life of the bit regarding breakage or damage to its point. Four
different types of router bits (factor D) made by different manufacturers were investi-
gated in this study. The router bits varied in cutting geometry in terms of the helix
angle, the number of flutes, and the type of point.
Spindle position (factor E) is not a control factor. The variation in the state of
adjustment of the four spindles is indeed a noise factor for the routing process. All
spindle positions must be used in actual production; otherwise, the productivity would
suffer. The reason it was included in the study is that in such situations one must
choose the settings of control factors that work well with all four spindles. The
rationale for including the spindle position along with the control factors is given in
Section 11.5.
11.5 DESIGN OF THE MATRIX EXPERIMENT
For this case study, the goal was to not only estimate the main effects of the nine fac-
tors listed in the previous section, but also to estimate the four key 2-factor interac-
tions. Note that there are 36 distinct ways of choosing two factors from among nine
factors. Thus, the number of two-factor interactions associated with nine factors is 36.
An attempt to estimate them all would take excessive experimentation, which is also
unnecessary anyway. The four interactions chosen for the case study were the ones
judged to be more important based on the knowledge of the cutting process:

1. (X-Y feed) × (speed), that is, B × I
2. (in-feed) × (speed), that is, C × I
3. (stack height) × (speed), that is, G × I
4. (X-Y feed) × (stack height), that is, B × G
The primary purpose of studying interactions in a matrix experiment is to see if
any of those interactions are strongly antisynergistic. Lack of strong antisynergistic
interactions is the ideal outcome. However, if a strong antisynergistic interaction is
found, we should look for a better quality characteristic. If we cannot find such a
characteristic, we should look for finding levels of some other control factors that cause
the antisynergistic interaction to disappear. Such an approach leads to a stable design
compared to an approach where one picks the best combination when a strong antisyn-
ergistic interaction is found.
In addition to the requirements listed thus far, the experimenters had to consider
the following aspects from a practical viewpoint:
* Suction (factor A) was difficult to change due to difficult access to the pump.
* All four spindles on a machine move in identical ways—that is, they have the
same X-Y feed, in-feed, and speed. So, the columns assigned to these factors
should be such that groups of four rows can be made, where each group has a
common X-Y feed, in-feed, and speed. This allows all four spindles to be used
effectively in the matrix experiment.
Construction of the Orthogonal Array
These requirements for constructing the control orthogonal array are fairly complicated.
Let us now see how we can apply the advanced strategy described in Chapter 7 to con-
struct an appropriate orthogonal array for this project.
First, the degrees of freedom for this project can be calculated as follows:

    Source                                      Degrees of Freedom
    Two 4-level factors                         2 × (4 - 1) = 6
    Seven 2-level factors                       7 × (2 - 1) = 7
    Four 2-factor interactions                  4 × (2 - 1) × (2 - 1) = 4
      between 2-level columns
    Overall mean                                1
    Total                                       18

Since there are 2-level and 4-level factors in this project, it is preferable to use
an array from the 2-level series. Because there are 18 degrees of freedom, the array
must have 18 or more rows. The smallest suitable 2-level array is L32.
The linear graph needed for this case study, called the required linear graph, is
shown in Figure 11.1(a). Note that each 2-level factor is represented by a dot, and
interaction is represented by a line connecting the corresponding dots. Each 4-level
factor is represented by two dots, connected by a line according to the column merging
method in Chapter 7.
The next step in the advanced strategy for constructing orthogonal arrays is to
select a suitable linear graph of the orthogonal array L32 and modify it to fit the
required linear graph. Here we take a slightly different approach. We first simplify the
required linear graph by taking advantage of the special circumstances and then
proceed to fit a standard linear graph.260 Reliability Improvement Chap. 11
[Figure 11.1 contains four panels: (a) Required Linear Graph; (b) Modified Required
Linear Graph, obtained after dropping factor I and its interactions; (c) Selected Standard
Linear Graph of L16, with five lines each connecting two dots; and (d) Modified Standard
Linear Graph showing the assignment of factors to columns.]

Figure 11.1 Linear graph for the routing project.
We notice that we must estimate the interactions of the factor I with three other
factors. One way to simplify the required linear graph is to treat I as an outer
factor—that is, to first construct an orthogonal array by ignoring I and its interactions.
Then, conduct each row of the orthogonal array with the two levels of I. By so doing,
we can estimate the main effect of I and also the interactions of I with all other factors.
The modified required linear graph, after dropping factor I and its interactions, is
shown in Figure 11.1(b). This is a much simpler linear graph and, hence, easier to fit
to a standard linear graph. Dropping the 2-level factor I and its interactions with three
2-level factors is equivalent to reducing the degrees of freedom by four. Thus, there
are 14 degrees of freedom associated with the linear graph of Figure 11.1(b). There-
fore, the orthogonal array L16 can be used to fit the linear graph of Figure 11.1(b).
This represents a substantial simplification compared with having to use the array L32
for the original required linear graph. The linear graph of Figure 11.1(b) has three
lines, connecting two dots each, and four isolated dots. Thus, a standard linear graph
that has a number of lines that connect pairs of dots seems most appropriate. Such a
linear graph was selected from the standard linear graphs of L16 given in Appendix C
and it is shown in Figure 11.1(c). It has five lines, each connecting two distinct dots.
The step-by-step modification of this linear graph to make it fit the one in Figure
11.1(b) is discussed next.
The requirement that there should be as few changes as possible in factor A
implies that factor A should be assigned to column 1. Therefore, we break the line
connecting dots 1 and 2, giving us three dots 1, 2, and 3. Now, we want to make
groups of four rows which have common X-Y feed (factor B) and in-feed (factor C).
Therefore, these two factors should be assigned to columns that have fewer changes.
Hence, we assign columns 2 and 3 to factors B and C, respectively. Referring to the
columns 2 and 3 of the array L16 in Table 11.2, it is clear that each of the four groups
of four rows (1-4, 5-8, 9-12, and 13-16) has a common X-Y feed and in-feed. Now we
should construct a 4-level column for spindle position (factor E) so that all four spindle
positions will be present in each group of the four rows mentioned above. We observe
that this can be achieved by merging columns 4 and 8 according to Section 7.7. Of
course, column 12, which represents the interaction between columns 4 and 8, must be
kept empty. Note that we could have used any of the other three lines in Figure
11.1(c) for this purpose.
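The column merging step used here for the 4-level factors can be sketched as follows; the function name is ours, and the code simply records the rule that the level of the 4-level factor is determined by the pair of levels in the two merged 2-level columns (whose interaction column must be left empty):

    def merge_two_level_columns(col_a, col_b):
        """Column merging: build a 4-level column from two 2-level columns of an array."""
        mapping = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 4}
        return [mapping[(a, b)] for a, b in zip(col_a, col_b)]

    # For example, columns 4 and 8 of the L16 array would be merged this way for factor E.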
Next, from among the remaining three lines in the standard linear graph, we arbi-
trarily chose columns 7 and 9 to form a 4-level column for factor D. Of course, the
interaction column 14 must be kept empty.
The two remaining lines are then broken to form six isolated dots corresponding
to columns 5, 10, 15, 6, 11, and 13. The next priority is to pick a column for factor G
so that the interaction B x G would be contained in one of the remaining five columns.
For this purpose, we refer to the interaction table for the L16 array given in Appendix
C. We picked column 15 for factor G. Column 13 contains interaction between
columns 2 and 15, so it can be used to estimate the interaction B x G. We indicate
this in the linear graph by a line joining the dots for the columns 2 and 15.
From the remaining four columns, we arbitrarily assign columns 10 and 5 to fac-
tors F and H. Columns 6 and 11 are kept empty.
TABLE 11.2 L16 ORTHOGONAL ARRAY

    Expt.                          Column Number
    No.     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
     1      1   1   1   1   1   1   1   1   1   1   1   1   1   1   1
     2      1   1   1   1   1   1   1   2   2   2   2   2   2   2   2
     3      1   1   1   2   2   2   2   1   1   1   1   2   2   2   2
     4      1   1   1   2   2   2   2   2   2   2   2   1   1   1   1
     5      1   2   2   1   1   2   2   1   1   2   2   1   1   2   2
     6      1   2   2   1   1   2   2   2   2   1   1   2   2   1   1
     7      1   2   2   2   2   1   1   1   1   2   2   2   2   1   1
     8      1   2   2   2   2   1   1   2   2   1   1   1   1   2   2
     9      2   1   2   1   2   1   2   1   2   1   2   1   2   1   2
    10      2   1   2   1   2   1   2   2   1   2   1   2   1   2   1
    11      2   1   2   2   1   2   1   1   2   1   2   2   1   2   1
    12      2   1   2   2   1   2   1   2   1   2   1   1   2   1   2
    13      2   2   1   1   2   2   1   1   2   2   1   1   2   2   1
    14      2   2   1   1   2   2   1   2   1   1   2   2   1   1   2
    15      2   2   1   2   1   1   2   1   2   2   1   2   1   1   2
    16      2   2   1   2   1   1   2   2   1   1   2   1   2   2   1
The final, modified standard linear graph along with the assignment of factors to
columns is shown in Figure 11.1(d). The assignment of factors to the columns of the
L16 array is as follows:

    Column(s)      Factor
    1              A
    2              B
    3              C
    5              H
    10             F
    15             G
    13             B × G
    4, 8, 12       E
    7, 9, 14       D
    6, 11          (empty)
The 4-level columns for factors D and E were formed in the array L16 by the column
merging method of Section 7.7. The resulting 16-row orthogonal array is the same as
the first 16 rows of Table 11.3, except for the column for factor I. Because I is an
outer factor, we obtain the entire matrix experiment as follows: make rows 17-32 the
same as rows 1-16. Add a column for factor I that has 1 in the rows 1-16 and 2 in the
rows 17-32. Note that the final matrix experiment shown in Table 11.3 is indeed an
orthogonal array—that is, in every pair of columns, all combinations occur and they
occur an equal number of times. We ask the reader to verify this claim for a few pairs
of columns. Note that the matrix experiment of Table 11.3 satisfies all the require-
ments set forth earlier in this section.
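This construction, and the balance property it preserves, can be checked with a short sketch like the one below (the names are ours; inner_rows would hold the 16 rows of the inner array with the factor columns already assigned):

    from itertools import combinations
    from collections import Counter

    def cross_with_outer_factor(inner_rows):
        """Repeat the 16 inner-array rows once for each level of the outer factor I."""
        return [list(row) + [i_level] for i_level in (1, 2) for row in inner_rows]

    def is_balanced(matrix):
        """For every pair of columns, check that all level combinations occur equally often."""
        n_cols = len(matrix[0])
        for c1, c2 in combinations(range(n_cols), 2):
            counts = Counter((row[c1], row[c2]) for row in matrix)
            levels1 = {row[c1] for row in matrix}
            levels2 = {row[c2] for row in matrix}
            if len(counts) != len(levels1) * len(levels2):
                return False        # some level combination never occurs
            if len(set(counts.values())) != 1:
                return False        # level combinations occur unequally often
        return True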
The 32 experiments in the control orthogonal array of Table 11.3 are arranged in
eight groups of four experiments such that:
a. For each group there is a common speed, X-Y feed, and in-feed
b. The four experiments in each group correspond to four different spindles
Thus, each group constitutes a machine run using all four spindles, and the 32 experi-
ments in the control orthogonal array can be conducted in eight runs of the routing
machine.
Observe the ease with which we were able to construct an orthogonal array for a
very complicated combinatoric problem using the standard orthogonal arrays and linear
graphs prepared by Taguchi.
Inclusion of a Noise Factor in a Matrix Experiment
As a rule, noise factors should not be mixed in with the control factors in a matrix
experiment (orthogonal array experiment). Instead, noise factors should be used to
form different testing conditions so that the S/N ratio can accurately measure sensi-
tivity to noise factors. According to this rule, we should have dropped the spindle
position column in Table 11.3 and considered the four spindle positions as four testing
conditions for each row of the orthogonal array, which would amount to a four times
larger experimental effort.
To save the experimental effort, in some situations we assign control factors as
well as noise factors to the columns of the matrix experiment. In the router bit life
improvement case study, we included the spindle position in the matrix experiment for
the same reason. Note that in such matrix experiments noise is introduced systemati-
cally and in a balanced manner. Also, results from such matrix experiments are more
dependable than the results from a matrix experiment where noise is fixed at one level
to save experimental effort. Control factor levels found optimum through such experi-
ments are preferred levels, on average, for all noise factor levels in the experiment. In
the router bit example, results imply that the control factor levels found optimum are,
on average, preferred levels for all four spindle positions.Reliability Improvement Chap. 11
TABLE 11.3 MATRIX EXPERIMENT AND OBSERVED LIFE*

[The body of this table lists the 32 rows of the matrix experiment, giving the level of
each factor (A through I, with I as the outer factor) and the observed router bit life for
each row.]

* Life was measured in hundreds of inches of movement in the X-Y plane. Tests were
  terminated at 1,700 inches.
11.6 EXPERIMENTAL PROCEDURE
In order to economize on the size of the experiment, the experimenters took only one
observation of router bit life per row of the control orthogonal array. Of course, they
realized that taking two or three noise conditions per row of the control orthogonal
array would give them more accurate conclusions. However, doing this would mean
exceeding the allowed time and budget. Thus, a total of only 32 bits were used in this
project to determine the optimum settings of the control factors.
During each machine run, the machine was stopped after every 100 in. of cut
(that is, 100 in. of router bit movement in the X-Y plane) to inspect the amount of
dust. If the dust was beyond a certain minimum predetermined level, the bit was
recorded as failed. Also, if a bit broke, it was obviously considered to have failed.
Otherwise, it was considered to have survived.
Before the experiment was started, the average bit life was around 850 in. Thus,
each experiment was stopped at 1,700 in. of cut, which is twice the original average
life, and the survival or failure of the bit was recorded.
Usually, measuring the exact failure time of a product is very difficult. There-
fore, in practice, it is preferable to make periodic checks for survival as was done every
100 in. of cut in this case study. Also, running a life test beyond a certain point costs
a lot, and it does not add substantially to the information about the preferred level of a
control factor. Therefore, truncating the life test at an appropriate point is recom-
mended. In the reliability analysis or reliability engineering literature, determining an
interval in which a product fails is called interval censoring, whereas terminating a life
test at a certain point is called right censoring.
11.7 DATA ANALYSIS
Table 11.3 gives the experimental data in hundreds of inches. A reading of 0.5 means
that the bit failed prior to the first inspection at 100 in. A reading of 3.5 means that
the bit failed between 300 and 400 in. Other readings have similar interpretation,
except the reading of 17.5 which means survival beyond 1,700 in., the point where the
experiment was terminated. Notice that for 14 experiments, the life is 0.5 (50 in.),
meaning that those conditions are extremely unfavorable. Also, there are eight cases of
life equal to 17.5, which are very favorable conditions. During experimentation, it is
important to take a broad range for each control factor so that a substantial number
of favorable and unfavorable conditions are created. Much can be learned about the
optimum settings of control factors when there is such diversity of data.
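The coding of the life readings just described can be written down as a small helper (a sketch with names of our own choosing, not code from the original study):

    def coded_life(inspections_survived, test_limit_inches=1700):
        """Code the observed life in hundreds of inches, as in Table 11.3.
        A bit that failed between inspections k and k+1 (inspections every 100 in.)
        is recorded as k + 0.5; a bit still cutting at the end of the test is
        recorded as 17.5 (right censored)."""
        if inspections_survived is None:               # survived the whole test
            return test_limit_inches / 100.0 + 0.5
        return inspections_survived + 0.5

    print(coded_life(0))      # 0.5  -> failed before the first inspection at 100 in.
    print(coded_life(3))      # 3.5  -> failed between 300 and 400 in.
    print(coded_life(None))   # 17.5 -> still cutting at 1,700 in.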
Now we will show two simple and separate analyses of the life data for deter-
mining the best levels for the control factors. The first analysis is aimed at determin-
ing the effect of each control factor on the mean failure time. The second analysis,
described in the next section, is useful for determining the effect of changing the level
of each factor on the survival probability curve.
The life data was analyzed by the standard procedures described in Chapter 3 to
determine the effects of the control factors on the mean life. The mean life for each
factor level and the results of analysis of variance are given in Table 11.4. These
results are plotted in Figure 11.2. Note that in this analysis we have ignored the effect
of both types of censoring. The following conclusions are apparent from the plots in
Figure 11.2:
• 1-in. suction is as good as 2-in. suction. Therefore, it is unnecessary to increase
  suction beyond 2 in.

• Slower X-Y feed gives longer life.

• The effect of in-feed is small.
TABLE 11.4 FACTOR EFFECTS AND ANALYSIS OF VARIANCE FOR ROUTER BIT LIFE*

                               Level Means†                Sum of    Degrees of   Mean
    Factor                  1       2       3       4      Squares   Freedom      Square     F
    A. Suction             5.94   _5.94_                      0.00       1          0.00     0.0
    B. X-Y feed           _7.75_   4.13                     105.13       1        105.13     4.1
    C. In-feed             5.44    6.44                       8.00       1          8.00     0.3
    D. Type of bit         6.03    2.63    3.63   _11.38_   367.38       3        122.46     4.8
    E. Spindle position    7.25    4.13    8.00     4.38     93.63       3         31.21     1.2
    F. Suction foot        7.69   _4.19_                     98.00       1         98.00     3.9
    G. Stack height        8.56   _3.31_                    220.50       1        220.50     8.7
    H. Depth of slot       5.63    6.25                       3.13       1          3.13     0.1
    I. Speed               3.56   _8.31_                    180.50       1        180.50     7.1
    I × B                                                    10.50       1         10.50     0.4
    I × C                                                    10.13       1         10.13     0.4
    I × G                                                   171.13       1        171.13     6.7
    B × G                                                     4.50       1          4.50     0.2
    Error                                                   355.37      14         25.38
    Total                                                  1627.90      31

    * Life in hundreds of inches.
    † Overall mean life = 5.94 hundreds of inches; starting conditions are identified by an
      underscore.
• The starting bit is the best of the four types.
• The difference among the spindles is small. However, we should check the
  centering of spindles 2 and 4.
• Solid ring suction foot is better than the existing bristle brush type.
• Lowering the stack height makes a large improvement. This change, however,
  raises a machine productivity issue.
• The depth of the slot in the backup material has a negligible effect.
• Higher rotational speed gives improved life. If the machine stability permits,
  even higher speed should be tried in the next cycle of experiments.
• The only 2-factor interaction that is large is the stack height versus speed interac-
  tion. However, this interaction is of the synergistic type. Therefore, the
  optimum settings of these factors suggested by the main effects are consistent
  with those suggested by the interaction.
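The conclusions above rest on the standard level-mean and sum-of-squares computations of Chapter 3. The following is a minimal Python sketch of those computations, not the book's own code; the data values shown are placeholders rather than the readings of Table 11.3.

# Illustrative sketch: factor level means and ANOVA sums of squares for one
# factor, following the standard procedure of Chapter 3.  `lives` would hold
# the coded readings of Table 11.3 and `levels` the corresponding column of
# the orthogonal array; the numbers below are made-up placeholders.

def level_means(lives, levels):
    """Mean coded life for each level of a factor."""
    out = {}
    for lvl in sorted(set(levels)):
        vals = [y for y, l in zip(lives, levels) if l == lvl]
        out[lvl] = sum(vals) / len(vals)
    return out

def sum_of_squares(lives, levels):
    """Sum of squares due to a factor: n_l * (m_l - m)^2 summed over its levels."""
    m = sum(lives) / len(lives)
    ss = 0.0
    for lvl, m_l in level_means(lives, levels).items():
        n_l = levels.count(lvl)
        ss += n_l * (m_l - m) ** 2
    return ss

# Hypothetical usage with made-up data (8 observations, one 2-level factor):
lives  = [0.5, 3.5, 17.5, 2.5, 0.5, 9.5, 17.5, 5.5]
levels = [1, 1, 1, 1, 2, 2, 2, 2]
print(level_means(lives, levels))
print(sum_of_squares(lives, levels))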
Optimum Control Factor Settings
The best settings of the control factors, called optimum 1, suggested by the results
above, along with their starting levels, are displayed side by side in Table 11.5.
Using the linear model and taking into consideration only the terms for which
the variance ratio is large (that is, the factors B, D, F, G, I and interaction I x G), we
can predict the router bit life under the starting, optimum, or any other combination of
control factor settings. The predicted lives under the starting and optimum conditions
are 888 in. and 2,225 in., respectively. The computations involved in the prediction
are displayed in Table 11.5. Note that the contribution of the I x G interaction under
starting conditions was computed as follows:
(Contribution at $I_2$, $G_2$ due to $I \times G$ interaction)
$= (m_{I_2 G_2} - m) - (m_{I_2} - m) - (m_{G_2} - m)$
$= (3.38 - 5.94) - (8.31 - 5.94) - (3.31 - 5.94)$
$= -2.30$
Note that $m_{I_2 G_2}$ is the mean life for the experiments with speed $I_2$ and stack height
$G_2$. The contribution of the $I \times G$ interaction under the optimum conditions was com-
puted in a similar manner. Because of the censoring during the experiment at 1,700 in.,
these predictions, obviously, are likely to be on the low side, especially the prediction
under optimum conditions, which is likely to be much less than the realized value.
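As a worked illustration of these prediction computations, the following Python sketch (not the book's code) reproduces the interaction contribution and the additive-model totals using the numbers quoted in the text and tables above (life in hundreds of inches).

# Illustrative sketch of the Table 11.5 computations.  Only the terms with
# large variance ratios (factors B, D, F, G, I and the I x G interaction) are
# included in the additive-model prediction.

OVERALL_MEAN = 5.94

def interaction_contribution(m_cell, m_i, m_g, m=OVERALL_MEAN):
    """(m_IG - m) - (m_I - m) - (m_G - m) for the I x G interaction."""
    return (m_cell - m) - (m_i - m) - (m_g - m)

# Starting condition: cell mean 3.38, speed-level mean 8.31, and
# stack-height-level mean 3.31, as quoted in the text.
ixg_start = interaction_contribution(3.38, 8.31, 3.31)
print(round(ixg_start, 2))   # about -2.30

def predicted_life(contributions, m=OVERALL_MEAN):
    """Additive-model prediction: overall mean plus the factor contributions."""
    return m + sum(contributions)

# Contributions under the starting condition (B, D, F, G, I, I x G):
starting = [1.81, 5.44, -1.75, -2.63, 2.37, ixg_start]
print(round(predicted_life(starting), 2))   # 8.88, i.e., 888 in.

# Contributions under optimum 1 (same factors at their optimum levels):
optimum_1 = [1.81, 5.44, 1.75, 2.63, 2.37, 2.31]
print(round(predicted_life(optimum_1), 2))  # 22.25, i.e., 2,225 in.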
[Figure: mean router bit life (hundreds of inches) plotted against the control factor levels — A. Suction (1, 2 in.); B. X-Y Feed (60, 80 in./min); C. In-Feed (10, 50 in./min); D. Type of Bit (1-4); E. Spindle Position (1-4); F. Suction Foot (solid ring, bristle brush); G. Stack Height (3/16, 1/4 in.); H. Slot Depth (60, 100 thou); I. Speed (30,000, 40,000 rpm)]

Figure 11.2  Main effects of control factors on router bit life and some 2-factor interactions. Two-standard-deviation confidence limits on the main effect for the starting level are also shown.
[Figure (continued): plots of the I x B, I x C, and I x G interactions against I. Speed (rpm), and of the G x B interaction against G. Stack Height (in.)]

Figure 11.2 (Continued)  Main effects of control factors on router bit life and some 2-factor interactions.
TABLE 11.5  PREDICTION OF LIFE USING ADDITIVE MODEL*

                        Starting Condition       Optimum 1                Optimum 2
Factor                  Setting  Contribution    Setting  Contribution    Setting  Contribution
A. Suction              A2       -               A1       -               A1       -
B. X-Y feed             B1        1.81           B1        1.81           B2       -1.81
C. In-feed              C1       -               C1       -               C1       -
D. Type of bit          D4        5.44           D4        5.44           D4        5.44
E. Spindle position     E1-E4    -               E1-E4    -               E1-E4    -
F. Suction foot         F2       -1.75           F1        1.75           F1        1.75
G. Stack height         G2       -2.63           G1        2.63           G1        2.63
H. Depth of slot        H1       -               H1       -               H1       -
I. Speed                I2        2.37           I2        2.37           I2        2.37
I x G interaction       I2 G2    -2.30           I2 G1     2.31           I2 G1     2.31
Overall mean                      5.94                     5.94                     5.94
Total                             8.88                    22.25                    18.63

* Life in hundreds of inches.
From the machine logs, the router bit life under starting conditions was found to
be 900 in., while the verification (confirmatory) experiment under optimum conditions
yielded an average life in excess of 4,150 in.
In selecting the best operating conditions for the routing process, one must con-
sider the overall cost, which includes not only the cost of router bits but also the cost
of machine productivity, the cost of cleaning the boards if needed, etc. Under the
optimum conditions shown in Table 11.5, the stack height is 3/16 in. as opposed to 1/4
in. under the starting conditions. This means three panels are cut simultaneously
instead of four panels. However, the lost machine productivity caused by this change
can be made up by increasing the X-Y feed. If the X-Y feed is increased to 80 in. per
minute, the productivity of the machine would get back approximately to the starting
level. The predicted router bit life under these alternate optimum conditions, called
optimum 2, is 1,863 in., which is about twice the predicted life for starting conditions.
Thus, a 50-percent reduction in router bit cost can be achieved while still maintaining
machine productivity. An auxiliary experiment typically would be needed to estimate
precisely the effect of X-Y feed under the new settings of all other factors. This would
enable us to make an accurate economic analysis.
In summary, orthogonal array based matrix experiments are useful for finding
optimum control factor settings with regard to product life. In the router bit example,
the experimenters were able to improve the router bit life by a factor of 2 to 4.
Accelerated Life Tests
Sometimes, in order to see any failures in a reasonable time, life tests must be con-
ducted under stressed conditions, such as higher than normal temperature or humidity.
Such life tests are called accelerated life tests. An important concern in using
accelerated life tests is how to ensure that the control factor levels found optimum dur-
ing the accelerated tests will also be optimum under normal conditions. This can be
achieved by including several stress levels in the matrix experiment and demonstrating
additivity. For an application of the Robust Design method for accelerated life tests,
see Phadke, Swann, and Hill [P6] and Mitchell [M1].
11.8 SURVIVAL PROBABILITY CURVES
The life data can also be analyzed in a different way (refer to the minute analysis
method described in Taguchi [T1] and Taguchi and Wu [T7]) to construct the survival
probability curves for the levels of each factor. To do so, we look at every 100 in. of
cut and note which router bits failed and which survived. Table 11.6 shows the sur-
vival data displayed in this manner. Note that a 1 means survival and a 0 means
failure. The survival data at every time point can be analyzed by the standard method
described in Chapter 3 to determine the effects of various factors. Thus, for suction
levels A1 and A2, the level means at 100 in. of cut are 0.4375 and 0.6875, respec-
tively. These are nothing but the fraction of router bits surviving at 100 in. of cut for
the two levels of suction. The survival probabilities can be estimated in a similar
manner for each factor and each time period—100 in., 200 in., etc. These data are
plotted in Figure 11.3. These plots graphically display the effects of factor level
changes on the entire life curve and can be used to decide the optimum settings of the
control factors. In this case, the conclusions from these plots are consistent with those
from the analysis described in Section 11.7.
Plots similar to those in Figure 11.3 can be used to predict the entire survival
probability curve under a new set of factor level combinations such as the optimum
combination. The prediction method is described in Chapter 5, Section 5.5, in conjunc-
tion with the analysis of ordered categorical data (see also Taguchi and Wu [T7]).
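As an illustration of how such survival probabilities can be computed from the coded life readings, here is a minimal Python sketch (not from the book); the data shown are placeholders and the function name is an illustrative assumption.

# Illustrative sketch of the minute analysis idea: at each 100-in. checkpoint,
# the survival probability for a factor level is the fraction of the bits at
# that level still surviving (the level mean of the 0/1 indicators of Table 11.6).

def survival_curve(coded_lives, levels, level, checkpoints=range(1, 18)):
    """Fraction of bits at `level` surviving each checkpoint (100s of inches)."""
    lives = [y for y, l in zip(coded_lives, levels) if l == level]
    return [sum(1 for y in lives if y > t) / len(lives) for t in checkpoints]

# Hypothetical usage with made-up data for a two-level factor:
coded_lives = [0.5, 3.5, 17.5, 2.5, 0.5, 9.5, 17.5, 5.5]
levels      = [1, 1, 1, 1, 2, 2, 2, 2]
print(survival_curve(coded_lives, levels, 1)[:5])  # first five checkpoints
print(survival_curve(coded_lives, levels, 2)[:5])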
[Figure: survival probability plotted against inches of router bit movement (0 to 2,000 in.) for each level of the control factors]

Figure 11.3  Effects of control factors on survival probability curves.
[Figure (continued): survival probability curves for the levels of F. Suction Foot, G. Stack Height, H. Depth of Slot, and I. Speed]

Figure 11.3 (Continued)  Effects of control factors on survival probability curves.
TABLE 11.6  SURVIVAL DATA AT VARIOUS TIME POINTS*

[Table: one row per experiment (1-32) and one column per inspection point (100 to 1,700 in. of cut, in 100-in. steps); each entry is 1 or 0]

* Entry 1 means survival of the bit; entry 0 means failure.
Note that in this method of determining life curves, no assumption was made
regarding the shape of the curve—such as Weibull or log-normal distribution. Also,
the total amount of data needed to come up with the life curves is small. In this exam-
ple, it took only 32 samples to determine the effects of eight control factors. For a sin-
gle good fit of a Weibull distribution, one typically needs several tens of observations.
So, the approach used here can be very beneficial for reliability improvement projects.
The approach of determining survival probability curves described here is similar
to the Eulerian method of analyzing fluid flows. In the Eulerian method, one looks at
fixed boundaries in space and examines the fluid masses crossing those boundaries. In
the life study example, we look at fixed time points (here, measured in inches of cut)
and observe which samples survive past those time points. The survival probability
curves are constructed from these observations. In the Lagrangian method of analyzing
fluid flows, one tracks a fluid particle and examines the changes in velocities, pressure,
etc., experienced by the particle. Fluid flow equations are derived from this examina-
tion. In the reliability study, the analogous way would be to observe the actual failure
times of each sample and then analyze the failure data. The analysis of Section 11.7
was based on this approach. However, when there are many censored observations (in
this example eight router bits did not fail by the time the experiment was stopped and
readings were taken only at 100-in. intervals), the approach of this section, where sur-
vival probability curves are estimated, gives more comprehensive information.
11.9 SUMMARY
• Reliability improvement can be accomplished during the design stage in one of
three ways: (1) reduce sensitivity of the product’s function to the variation in
product parameters, (2) reduce the rate of change of the product parameters, and
(3) include redundancy. The first way is most cost-effective and it is the same as
parameter design. The second way is analogous to tolerance design and it
involves using more expensive components or manufacturing processes. The
third way is used when the cost of failure is very large compared to the cost of
providing spare components or the whole product.
• Finding an appropriate continuous quality characteristic and reducing its sensi-
tivity to noise factors is the best way of improving reliability. When an
appropriate quality characteristic cannot be found, then only life tests should be
considered for life improvement. Matrix experiments using orthogonal arrays
can be used to conduct life tests efficiently with several control factors.
• The S/N ratio calculated from a continuous quality characteristic can be used to
estimate the average life of a new product. Let $\eta_1$ be the S/N ratio for the new
product and $\eta_2$ be the S/N ratio for a benchmark product whose average life is
known. Then the average life of the new product is $r$ times the average life of
the benchmark product, where

  $r = 10^{(\eta_1 - \eta_2)/10}.$
• The goal of the router bit case study was to reduce dust formation. Since there
existed no continuous quality characteristic that could be observed conveniently,
the life test was conducted to improve the router bit life.
• Effects of nine factors, eight control factors and spindle position, were studied
using an orthogonal array with 32 experiments. Out of the nine factors, two fac-
tors had four levels, and the remaining seven had two levels. Four specific
2-factor interactions were also studied. In addition, there were several physical
restrictions regarding the factor levels. Use of Taguchi's linear graphs made it
easy to construct the orthogonal array, which allowed the estimation of desired
factor main effects and interactions while satisfying the physical restrictions.
• Only one router bit was used per experiment. Dust formation was observed
every 100 in. of cut in order to judge the failure of the bit. The length of cut
prior to formation of appreciable dust or breakage of the bit was called the bit
life and it was used as the quality characteristic. Each experiment was ter-
minated at 1,700 in. of cut regardless of failure or survival of the bit. Thus the
life data were censored.
• Effects of the nine factors on router bit life were computed and optimum levels
for the control factors were identified. Under a set of optimum conditions, called
optimum 1, a 4-fold increase in router bit life was observed, but with a
12.5-percent reduction in throughput. Under another set of optimum conditions,
called optimum 2, a 2-fold increase in router bit life was observed, with no drop
in throughput.
• The life data from a matrix experiment can also be analyzed by the minute analysis
method to determine the effects of the control factors on the survival probability
curves. This method of analysis does not presume any failure time distribution,
such as the log-normal or Weibull distribution. Also, the total amount of data
needed to determine the survival probability curves is small.

Appendix A
ORTHOGONALITY OF A
MATRIX EXPERIMENT
Before defining the orthogonality of a matrix experiment we need to review the follow-
ing definitions from linear algebra and statistics. Recall that $\eta_1, \eta_2, \ldots, \eta_9$ are the
observations for the nine rows of the matrix experiment given by Table 3.2. Consider
the linear form, $L_1$, given by

  $L_1 = w_{11}\eta_1 + w_{12}\eta_2 + \cdots + w_{19}\eta_9$   (A.1)
which is a weighted sum of the nine observations. The linear form $L_1$ is called a con-
trast if the weights add up to zero—that is, if

  $w_{11} + w_{12} + \cdots + w_{19} = 0.$   (A.2)

Two contrasts, $L_1$ and $L_2$, are said to be orthogonal if the inner product of the
vectors corresponding to their weights is zero—that is, if

  $w_{11}w_{21} + w_{12}w_{22} + \cdots + w_{19}w_{29} = 0.$   (A.3)
Let us consider three weights $w_{11}$, $w_{12}$, and $w_{13}$ corresponding to the three lev-
els in column 1 of the matrix experiment given by Table 3.2 (Chapter 3). Then we
call the following linear form, $L_1$, the contrast corresponding to column 1:
  $L_1 = w_{11}\eta_1 + w_{11}\eta_2 + w_{11}\eta_3 + w_{12}\eta_4 + w_{12}\eta_5 + w_{12}\eta_6 + w_{13}\eta_7 + w_{13}\eta_8 + w_{13}\eta_9$   (A.4)

provided all weights add up to zero. In this case, it implies

  $w_{11} + w_{12} + w_{13} = 0.$   (A.5)

Note that in Equation (A.4) we use the weight $w_{11}$ whenever the level is 1, weight $w_{12}$
whenever the level is 2, and weight $w_{13}$ whenever the level is 3.
An array used in a matrix experiment is called an orthogonal array if the con-
trasts corresponding to all its columns are mutually orthogonal. Let us consider
columns 1 and 2 of the matrix experiment given by Table 3.2. Equation (A.4) is the
contrast corresponding to column 1. Let $w_{21}$, $w_{22}$, and $w_{23}$ be the weights
corresponding to the three levels in column 2. Then, the contrast corresponding to
column 2 is $L_2$, where

  $L_2 = w_{21}\eta_1 + w_{22}\eta_2 + w_{23}\eta_3 + w_{21}\eta_4 + w_{22}\eta_5 + w_{23}\eta_6 + w_{21}\eta_7 + w_{22}\eta_8 + w_{23}\eta_9.$   (A.6)

Of course, the weights in Equation (A.6) must add up to zero—that is,

  $w_{21} + w_{22} + w_{23} = 0.$   (A.7)
The inner product of the vectors corresponding to the weights in the two contrasts $L_1$
and $L_2$ is given by

  $w_{11}w_{21} + w_{11}w_{22} + w_{11}w_{23} + w_{12}w_{21} + w_{12}w_{22} + w_{12}w_{23} + w_{13}w_{21} + w_{13}w_{22} + w_{13}w_{23}$
  $= (w_{11} + w_{12} + w_{13})(w_{21} + w_{22} + w_{23})$
  $= 0.$

Hence, columns 1 and 2 are mutually orthogonal. The orthogonality of all pairs of
columns in the matrix experiment given by Table 3.2 can be verified in a similar
manner. In general, it can be shown that the balancing property is a sufficient condition
for a matrix experiment to be orthogonal.
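As a numerical illustration of these definitions, the following Python sketch (not part of the original appendix) builds the contrast vectors of Equations (A.4) and (A.6) for columns 1 and 2 of the L9 array and verifies that their inner product is zero; the particular weights are arbitrary choices satisfying Equations (A.5) and (A.7).

# Illustrative sketch: numerical check of the orthogonality of columns 1 and 2
# of the L9 array.  For any per-level weights that sum to zero, the expanded
# contrast vectors have zero inner product.

from collections import Counter

L9_COL1 = [1, 1, 1, 2, 2, 2, 3, 3, 3]   # column 1 of the L9 array
L9_COL2 = [1, 2, 3, 1, 2, 3, 1, 2, 3]   # column 2 of the L9 array

def contrast_vector(column, weights):
    """Expand per-level weights into a per-row weight vector, as in (A.4)."""
    return [weights[level - 1] for level in column]

w1 = [1.0, 0.0, -1.0]    # w11 + w12 + w13 = 0
w2 = [1.0, -2.0, 1.0]    # w21 + w22 + w23 = 0

v1 = contrast_vector(L9_COL1, w1)
v2 = contrast_vector(L9_COL2, w2)
print(sum(a * b for a, b in zip(v1, v2)))   # 0.0 -- the contrasts are orthogonal

# The balancing property behind this result: every (level, level) pair occurs
# equally often in the two columns.
print(Counter(zip(L9_COL1, L9_COL2)))        # each of the 9 pairs occurs once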
In column 2 of the matrix experiment given by Table 3.2, suppose we replace
level 3 by level 2. Then, looking at columns 1 and 2, it is clear that the two columns
do not have the balancing property. However, for level 1 in column 1 there is one row
with level 1 in column 2 and two rows with level 2 in column 2. Similarly, for level 2
in column 1, there is one row with level 1 in column 2 and two rows with level 2 in
column 2. The same can be said about level 3 in column 1. This is called propor-
tional balancing. It can be shown that proportional balancing is also a sufficient condi-
tion for a matrix experiment to be orthogonal.
Among the three weights corresponding to column 1, we can independently
choose the values of any two of them. The third weight is then determined by Equa-
tion (A.5). Hence, we say that column 1 has two degrees of freedom. In general, a
column with $n$ levels has $n - 1$ degrees of freedom.

Appendix B
UNCONSTRAINED
OPTIMIZATION
Here we define in precise mathematical terms the problem of minimizing the variance
of thickness while keeping the mean on target, and derive its solution.
Let $z = (z_1, z_2, \ldots, z_q)^T$ be the vector formed by the control factors; let $x$ be the
vector formed by the noise factors; and let $y(x; z)$ denote the observed quality characteris-
tic, namely the polysilicon layer thickness, for particular values of the noise and control
factors. Note that $y$ is nonnegative. Let $\mu(z)$ and $\sigma^2(z)$ denote the mean and variance
of the response. Obviously, $\mu$ and $\sigma^2$ are obtained by integrating with respect to the
probability density of $x$, and hence are functions of only $z$.
Problem Statement:
The optimization problem can be stated as follows:

  Minimize (over $z$)  $\sigma^2(z)$
  subject to  $\mu(z) = \mu_0$.   (B.1)

This is a constrained optimization problem and is very difficult to solve experimentally.
Solution:
We postulate that one of the control factors is a scaling factor, say, $z_1$. It, then,
implies that

  $y(x; z) = z_1 h(x; z')$   (B.2)

for all $x$ and $z$, where $z' = (z_2, z_3, \ldots, z_q)^T$, and $h(x; z')$ does not depend on $z_1$.
It follows that

  $\mu(z) = z_1 \mu_h(z')$   (B.3)

and

  $\sigma^2(z) = z_1^2 \sigma_h^2(z')$   (B.4)

where $\mu_h$ and $\sigma_h^2$ are, respectively, the mean and variance of $h(x; z')$.
Suppose $z^* = (z_1^*, z'^*)$ is chosen by the following procedure:

  (a) $z'^*$ is an argument that minimizes $\dfrac{\sigma_h^2(z')}{\mu_h^2(z')}$;

  (b) $z_1^*$ is chosen such that $\mu(z_1^*, z'^*) = \mu_0$.

We will now show that $z^*$ is an optimum solution to the problem defined by Equation
(B.1).
First, note that $z^*$ is a feasible solution since $\mu(z_1^*, z'^*) = \mu_0$. Next, consider
any feasible solution $z = (z_1, z')$. We have

  $\sigma^2(z_1, z') = \mu^2(z_1, z') \dfrac{\sigma_h^2(z')}{\mu_h^2(z')} = \mu_0^2 \dfrac{\sigma_h^2(z')}{\mu_h^2(z')}.$   (B.5)
Combining the definition of $z'^*$ in (a) above and Equation (B.5), we have, for all feasi-
ble solutions,

  $\sigma^2(z_1, z') \geq \sigma^2(z_1^*, z'^*).$

Thus, $z^* = (z_1^*, z'^*)$ is an optimal solution for the problem defined by Equation (B.1).
Referring to Equations (B.3) and (B.4), it is clear that

  $\dfrac{\sigma^2(z_1, z')}{\mu^2(z_1, z')} = \dfrac{\sigma_h^2(z')}{\mu_h^2(z')}$
for any value of $z_1$. Therefore, step (a) can be solved as follows:

  (a') choose any value for $z_1$; then

  (a'') find $z'$ that minimizes $\dfrac{\sigma^2(z_1, z')}{\mu^2(z_1, z')}$.

In fact, it is not even necessary to know which control factor is a scaling factor. We
can discover the scaling factor by examining the effects of all control factors on the
signal-to-noise (S/N) ratio and the mean. Any factor that has no effect on the S/N
ratio, but a significant effect on the mean, can be used as a scaling factor.
In summary, the original constrained optimization problem can be solved as an
unconstrained optimization problem, step (a), followed by adjusting the mean on target,
step (b). For obvious reasons, this procedure is called a 2-step procedure. For further
discussions on the 2-step procedure, see Taguchi and Phadke [T6]; Phadke and Dehnad
[P4]; Leon, Shoemaker, and Kackar [L2]; Nair and Pregibon [N2]; and Box [B1]. The
particular derivation given in this appendix was suggested by M. Hamami.
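As an illustration of the 2-step procedure, here is a minimal Python sketch; the toy model, candidate settings, and target value are assumptions made up for the example and are not from the book.

# Illustrative sketch of the 2-step procedure on a toy model obeying Equation
# (B.2): y(x; z) = z1 * h(x; z'), where z' is a single factor z2 taking one of
# a few candidate settings and x is simulated noise.

import random

random.seed(0)
TARGET_MU0 = 100.0
CANDIDATE_Z2 = [0.5, 1.0, 2.0]

def h(z2):
    """Toy h(x; z'): mean grows with z2, noise standard deviation is fixed."""
    return 10.0 * z2 + random.gauss(0.0, 1.0)

def mean_and_var(samples):
    m = sum(samples) / len(samples)
    v = sum((s - m) ** 2 for s in samples) / (len(samples) - 1)
    return m, v

# Step (a): pick z2 minimizing sigma_h^2 / mu_h^2 (equivalently, maximizing the S/N ratio).
best_z2, best_ratio, best_mu_h = None, float("inf"), None
for z2 in CANDIDATE_Z2:
    samples = [h(z2) for _ in range(1000)]
    mu_h, var_h = mean_and_var(samples)
    ratio = var_h / mu_h ** 2
    if ratio < best_ratio:
        best_z2, best_ratio, best_mu_h = z2, ratio, mu_h

# Step (b): choose the scaling factor z1 so that the mean is on target.
z1_star = TARGET_MU0 / best_mu_h

print("step (a): z2* =", best_z2)
print("step (b): z1* =", round(z1_star, 3), "so that mu(z1*, z2*) is close to", TARGET_MU0)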
Note that the derivation above is perfectly valid if we replace $z_1$ in Equation
(B.2) by an arbitrary function $g(z_1)$ of $z_1$. This represents a generalization of the com-
mon concept of linear scaling.

Appendix C
STANDARD ORTHOGONAL
ARRAYS
AND
LINEAR GRAPHS*
* The orthogonal arrays and linear graphs are reproduced with permission from Dr. Genichi Taguchi
and with help from Mr. John Kennedy of American Supplier Institute, Inc. For more details of the
orthogonal arrays and linear graphs, see Taguchi [T1] and Taguchi and Konishi [T5].
L4 (2^3) Orthogonal Array

Expt.        Column
No.        1    2    3
1          1    1    1
2          1    2    2
3          2    1    2
4          2    2    1

Linear Graph for L4
[Figure: dots 1 and 2 joined by a line representing column 3]

L8 (2^7) Orthogonal Array

Expt.                  Column
No.        1    2    3    4    5    6    7
1          1    1    1    1    1    1    1
2          1    1    1    2    2    2    2
3          1    2    2    1    1    2    2
4          1    2    2    2    2    1    1
5          2    1    2    1    2    1    2
6          2    1    2    2    1    2    1
7          2    2    1    1    2    2    1
8          2    2    1    2    1    1    2

Interaction Table for L8 (2^7)

Column     1     2     3     4     5     6     7
1         (1)    3     2     5     4     7     6
2               (2)    1     6     7     4     5
3                     (3)    7     6     5     4
4                           (4)    1     2     3
5                                 (5)    3     2
6                                       (6)    1
7                                             (7)

Linear Graphs for L8
[Figures (1) and (2)]
L9 (3^4) Orthogonal Array

Expt.             Column
No.        1    2    3    4
1          1    1    1    1
2          1    2    2    2
3          1    3    3    3
4          2    1    2    3
5          2    2    3    1
6          2    3    1    2
7          3    1    3    2
8          3    2    1    3
9          3    3    2    1

Linear Graph for L9
[Figure: dots 1 and 2 joined by a line representing columns 3 and 4]
L12 (2^11) Orthogonal Array

Expt.                        Column
No.        1   2   3   4   5   6   7   8   9   10   11
1          1   1   1   1   1   1   1   1   1    1    1
2          1   1   1   1   1   2   2   2   2    2    2
3          1   1   2   2   2   1   1   1   2    2    2
4          1   2   1   2   2   1   2   2   1    1    2
5          1   2   2   1   2   2   1   2   1    2    1
6          1   2   2   2   1   2   2   1   2    1    1
7          2   1   2   2   1   1   2   2   1    2    1
8          2   1   2   1   2   2   2   1   1    1    2
9          2   1   1   2   2   2   1   2   2    1    1
10         2   2   2   1   1   1   1   2   2    1    2
11         2   2   1   2   1   2   1   1   1    2    2
12         2   2   1   1   2   1   2   1   2    2    1

Note: The interaction between any two columns is confounded partially with the remaining nine columns.
Do not use this array if the interactions must be estimated.
Ly (2'°) Orthogonal Array
Expt, Columa
No |/1 23456789 nee wos
ajriag paodorarada.
afrrrrrtr222 2 22 22
gf. t1222 213 1 122 22
ajiar2222222 21 1149
s[i22112 2141 2 rol 2 2
6 j1 22 122223 12 2 15
pj) 222210412 2 2 214
a f12222 122 1 1 11 2 2
9 f21212121212 1279 2
wl2r2i2r2212 1 2 1 2 1
muf212212112 12 2 1 24
mBi2122121212 3 12 12
wf221 122 122 1 12 2 4
wf2 2.12232 1 1 2 2 § 9 2
wf2 212 1212 2 12 41 2
wy2 21241 22112 12 24
Interaction Table for ., (2'5)
Column
Coumnjt 23456789 D1 2B Os
1yy3 2547698 1 0 2 5 4
2 M1674 500 8 9 41S 2 3
3 O@7654u09 § is 4B 2
4 i232 4 Is 8 9 oH
5 93 2B 49 8 1 wo
6 @1Wis 2 3 0 un 8 9
7 OW 2 Uo 9 8
8 M123 4567
9 @3 25 476
10 dol 67 45
u ay7 65 4
2 ay 23
B ay 32
4 ay
[3s as,
Appendix CAppendix C m8
Linear Graphs for Lig
@
1 13 WW
co 12
3 9 1/ \a
0
© ©290 Appendix C
L'16 (4^5) Orthogonal Array

Expt.                Column
No.        1    2    3    4    5
1          1    1    1    1    1
2          1    2    2    2    2
3          1    3    3    3    3
4          1    4    4    4    4
5          2    1    2    3    4
6          2    2    1    4    3
7          2    3    4    1    2
8          2    4    3    2    1
9          3    1    3    4    2
10         3    2    4    3    1
11         3    3    1    2    4
12         3    4    2    1    3
13         4    1    4    2    3
14         4    2    3    1    4
15         4    3    2    4    1
16         4    4    1    3    2

Note: To estimate the interaction between columns 1 and 2, all other columns must be kept empty.

Linear Graph for L'16
[Figure]
L18 (2^1 x 3^7) Orthogonal Array

Expt.                      Column
No.        1    2    3    4    5    6    7    8
1          1    1    1    1    1    1    1    1
2          1    1    2    2    2    2    2    2
3          1    1    3    3    3    3    3    3
4          1    2    1    1    2    2    3    3
5          1    2    2    2    3    3    1    1
6          1    2    3    3    1    1    2    2
7          1    3    1    2    1    3    2    3
8          1    3    2    3    2    1    3    1
9          1    3    3    1    3    2    1    2
10         2    1    1    3    3    2    2    1
11         2    1    2    1    1    3    3    2
12         2    1    3    2    2    1    1    3
13         2    2    1    2    3    1    3    2
14         2    2    2    3    1    2    1    3
15         2    2    3    1    2    3    2    1
16         2    3    1    3    2    3    1    2
17         2    3    2    1    3    1    2    3
18         2    3    3    2    1    2    3    1

Note: Interaction between columns 1 and 2 is orthogonal to all columns and hence can be estimated without sacrificing any column. The interaction can be estimated from the 2-way table of columns 1 and 2. Columns 1 and 2 can be combined to form a 6-level column. Interactions between any other pair of columns are confounded partially with the remaining columns.

Linear Graph for L18
[Figure]
L25 (5^6) Orthogonal Array
[Table: 25 experiments x 6 five-level columns]

Note: To estimate the interaction between columns 1 and 2, all other columns must be kept empty.

Linear Graph for L25
[Figure: dots 1 and 2 joined by a line representing columns 3, 4, 5, 6]
L27 (3^13) Orthogonal Array
[Table: 27 experiments x 13 three-level columns]

Interaction Table for L27 (3^13)
[Table: the pair of columns containing the interaction of each pair of the 13 columns]

Linear Graphs for L27
[Figures]
L32 (2^31) Orthogonal Array
[Table: 32 experiments x 31 two-level columns]

Interaction Table for L32 (2^31)
[Table: the interacting column for each pair of the 31 columns]

Linear Graphs for L32
[Figures]
L'32 (2^1 x 4^9) Orthogonal Array
[Table: 32 experiments; column 1 is a two-level column and columns 2 through 10 are four-level columns]

Note: Interaction between columns 1 and 2 is orthogonal to all columns and hence can be estimated without sacrificing any column. It can be estimated from the 2-way table of these columns. Columns 1 and 2 can be combined to form an 8-level column. Interactions between any two 4-level columns are confounded partially with each of the remaining 4-level columns.

Linear Graph for L'32
[Figure]

L36 (2^11 x 3^12) Orthogonal Array
[Table: 36 experiments; 11 two-level columns (1-11) and 12 three-level columns (12-23)]

Note: Interaction between any two columns is partially confounded with the remaining columns.
L'36 (2^3 x 3^13) Orthogonal Array
[Table: 36 experiments; 3 two-level columns (1-3) and 13 three-level columns (4-16)]

Notes: (i) The interactions 1 x 4, 2 x 4, and 3 x 4 are orthogonal to all columns and hence can be obtained without sacrificing any column. (ii) The 3-factor interaction between columns 1, 2, and 4 can be obtained by keeping column 3 empty. Thus, a 12-level factor can be formed by combining columns 1, 2, 3, and 4 and keeping only column 3 empty. (iii) Columns 5 through 16 in the array L'36 (2^3 x 3^13) are the same as columns 12 through 23 in the array L36 (2^11 x 3^12).

Linear Graphs for L'36
[Figures (1) and (2)]
L50 (2^1 x 5^11) Orthogonal Array
[Table: 50 experiments; column 1 is a two-level column and columns 2 through 12 are five-level columns]

Note: Interaction between columns 1 and 2 is orthogonal to all columns and hence can be estimated without sacrificing any column. It can be estimated from the 2-way table of these two columns. Columns 1 and 2 can be combined to form a 10-level column.

Linear Graphs for L50
[Figure]
L54 (2^1 x 3^25) Orthogonal Array
[Table: 54 experiments; column 1 is a two-level column and columns 2 through 26 are three-level columns]

Notes: (i) Interaction between columns 1 and 2 is orthogonal to all columns and hence can be estimated without sacrificing any column. Also, these columns can be combined to form a 6-level column. (ii) The interactions 1 x 9, 2 x 9, and 1 x 2 x 9 appear comprehensively in columns 10, 11, 12, 13, and 14. Hence, the aforementioned interactions can be obtained by keeping columns 10 through 14 empty. Also, columns 1, 2, and 9 can be combined to form an 18-level column by keeping columns 10 through 14 empty.

Linear Graph for L54
[Figure]
L64 (2^63) Orthogonal Array
[Table: 64 experiments x 63 two-level columns]

L'64 (4^21) Orthogonal Array
[Table: 64 experiments x 21 four-level columns]

L81 (3^40) Orthogonal Array
[Table: 81 experiments x 40 three-level columns]
REFERENCES

A1. Addelman, S. "Orthogonal Main Effect Plans for Asymmetrical Factorial Experiments." Technometrics (1962) vol. 4: pp. 21-46.
A2. Anderson, B. "Parameter Design Solution for Analog In-Circuit Testing Problems." Proceedings of IEEE International Communications Conference, Philadelphia, Pennsylvania (June 1988) pp. 0836-0840.
B1. Box, G. E. P. "Signal to Noise Ratios, Performance Criteria and Transformations." Technometrics (February 1988) vol. 30, no. 1, pp. 1-31.
B2. Box, G. E. P. and Draper, N. R. Evolutionary Operation: A Statistical Method for Process Improvement. New York: John Wiley and Sons, 1969.
B3. Box, G. E. P., Hunter, W. G., and Hunter, J. S. Statistics for Experimenters—An Introduction to Design, Data Analysis and Model Building. New York: John Wiley and Sons, 1978.
B4. Byrne, D. M. and Taguchi, S. "The Taguchi Approach to Parameter Design." ASQC Transactions of Annual Quality Congress, Anaheim, CA, May 1986.
C1. Clausing, D. P. "Taguchi Methods Integrated into the Improved Total Development." Proceedings of IEEE International Conference on Communications, Philadelphia, Pennsylvania (June 1988) pp. 0826-0832.
C2. Clausing, D. P. "Design for Latitude." Internal Memorandum, Xerox Corp., 1980.
C3. Cochran, W. G. and Cox, G. M. Experimental Designs. New York: John Wiley and Sons, 1957.
C4. Cohen, L. "Quality Function Deployment and Application Perspective from Digital Equipment Corporation." National Productivity Review, vol. 7, no. 3 (Summer 1988), pp. 197-208.
C5. Crosby, P. Quality is Free. New York: McGraw-Hill Book Co., 1979.
D1. Daniel, C. Applications of Statistics to Industrial Experimentation. New York: John Wiley and Sons, 1976.
D2. Deming, W. E. Quality, Productivity, and Competitive Position. Cambridge: Massachusetts Institute of Technology, Center for Advanced Engineering Study, 1982.
D3. Diamond, W. J. Practical Experiment Design for Engineers and Scientists. Lifetime Learning Publications, 1981.
D4. Draper, N. and Smith, H. Applied Regression Analysis.
D5. Duncan, A. J. Quality Control and Industrial Statistics, 4th Edition. Homewood, Illinois: Richard D. Irwin, Inc., 1974.
F1. Feigenbaum, A. V. Total Quality Control, 3rd Edition. New York: McGraw-Hill Book Company, 1983.
G1. Garvin, D. A. "What Does Product Quality Really Mean?" Sloan Management Review, Fall 1984, pp. 25-43.
G2. Grant, E. L. Statistical Quality Control, 2nd Edition. New York: McGraw-Hill Book Co., 1952.
H1. Hauser, J. R. and Clausing, D. "The House of Quality." Harvard Business Review (May-June 1988) vol. 66, no. 3, pp. 63-73.
H2. Hicks, C. R. Fundamental Concepts in the Design of Experiments. New York: Holt, Rinehart and Winston, 1973.
H3. Hogg, R. V. and Craig, A. T. Introduction to Mathematical Statistics, 3rd Edition. New York: Macmillan Publishing Company, 1970.
J1. Jessup, P. "The Value of Continuing Improvement." Proceedings of the IEEE International Communications Conference, ICC-85, Chicago, Illinois (June 1985).
J2. John, P. W. M. Statistical Design and Analysis of Experiments. New York: Macmillan Publishing Company, 1971.
J3. Juran, J. M. Quality Control Handbook. New York: McGraw-Hill Book Co., 1979.
K1. Kackar, R. N. "Off-line Quality Control, Parameter Design and the Taguchi Method." Journal of Quality Technology (Oct. 1985) vol. 17, no. 4, pp. 176-209.
K2. Kackar, R. N. "Taguchi's Quality Philosophy: Analysis and Commentary." Quality Progress (Dec. 1986) pp. 21-29.
K3. Katz, L. E. and Phadke, M. S. "Macro-quality with Micro-money." AT&T Bell Labs Record (Nov. 1985) pp. 22-28.
K4. Kempthorne, O. The Design and Analysis of Experiments. New York: Robert E. Krieger Publishing Co., 1979.
K5. Klingler, W. J. and Nazaret, W. A. "Tuning Computer Systems for Maximum Performance: A Statistical Approach." Computer Science and Statistics: Proceedings of the 18th Symposium on the Interface, Fort Collins, Colorado (March 1985) pp. 390-396.
L1. Lee, N. S., Phadke, M. S., and Keny, R. S. "An Expert System for Experimental Design: Automating the Design of Orthogonal Array Experiments." ASQC Transactions of Annual Quality Congress, Minneapolis, Minnesota (May 1987) pp. 270-277.
L2. Leon, R. V., Shoemaker, A. C., and Kackar, R. N. "Performance Measures Independent of Adjustments." Technometrics (August 1987) vol. 29, no. 3, pp. 253-265.
L3. Lin, K. M. and Kackar, R. N. "Wave Soldering Process Optimization by Orthogonal Array Design Method." Electronic Packaging and Production (Feb. 1985) pp. 108-115.
M1. Mitchell, J. P. "Reliability of Isolated Clearance Defects on Printed Circuit Boards." Proceedings of Printed Circuit World Convention IV (June 1987) Tokyo, Japan, pp. 50.1-50.16.
M2. Myers, R. H. Response Surface Methodology. Blacksburg, Virginia: R. H. Myers, Virginia Polytechnic Institute and State University, 1976.
N1. Nair, V. N. "Testing in Industrial Experiments with Ordered Categorical Data." Technometrics (November 1986) vol. 28, no. 4, pp. 283-291.
N2. Nair, V. N. and Pregibon, D. "A Data Analysis Strategy for Quality Engineering Experiments." AT&T Technical Journal (May/June 1986) vol. 65, no. 3, pp. 3-84.
P1. Pao, T. W., Phadke, M. S., and Sherrerd, C. S. "Computer Response Time Optimization Using Orthogonal Array Experiments." IEEE International Communications Conference, Chicago, IL (June 23-26, 1985) Conference Record, vol. 2, pp. 890-895.
P2. Phadke, M. S. "Quality Engineering Using Design of Experiments." Proceedings of the American Statistical Association, Section on Statistical Education (August 1982) Cincinnati, OH, pp. 11-20.
P3. Phadke, M. S. "Design Optimization Case Studies." AT&T Technical Journal (March/April 1986) vol. 65, no. 2, pp. 51-68.
P4. Phadke, M. S. and Dehnad, K. "Optimization of Product and Process Design for Quality and Cost." Quality and Reliability Engineering International (April-June 1988) vol. 4, no. 2, pp. 105-112.
P5. Phadke, M. S., Kackar, R. N., Speeney, D. V., and Grieco, M. J. "Off-Line Quality Control in Integrated Circuit Fabrication Using Experimental Design." The Bell System Technical Journal (May-June 1983) vol. 62, no. 5, pp. 1273-1309.
P6. Phadke, M. S., Swann, D. W., and Hill, D. A. "Design and Analysis of an Accelerated Life Test Using Orthogonal Arrays." Paper presented at the 1983 Annual Meeting of the American Statistical Association, Toronto, Canada.
P7. Phadke, M. S. and Taguchi, G. "Selection of Quality Characteristics and S/N Ratios for Robust Design." Conference Record, GLOBECOM 87 Meeting, IEEE Communications Society, Tokyo, Japan (November 1987) pp. 1002-1007.
P8. Plackett, R. L. and Burman, J. P. "The Design of Optimal Multifactorial Experiments." Biometrika (1946) vol. 33, pp. 305-325.
P9. Proceedings of Supplier Symposia on Taguchi Methods, April 1984, November 1984, October 1985, October 1986, October 1987, and October 1988, American Supplier Institute, Inc., 6 Parklane Blvd., Suite 411, Dearborn, MI 48126.
R1. Raghavarao, D. Constructions and Combinatorial Problems in Design of Experiments. New York: John Wiley and Sons, 1971.
R2. Rao, C. R. "Factorial Experiments Derivable from Combinatorial Arrangements of Arrays." Journal of the Royal Statistical Society (1947) Series B, vol. 9, pp. 128-139.
R3. Rao, C. R. Linear Statistical Inference and Its Applications, 2nd Edition. New York: John Wiley and Sons, Inc., 1973.
S1. Scheffé, H. Analysis of Variance. New York: John Wiley and Sons, Inc., 1959.
S2. Searle, S. R. Linear Models. New York: John Wiley and Sons, 1971.
S3. Seiden, E. "On the Problem of Construction of Orthogonal Arrays." Annals of Mathematical Statistics (1954) vol. 25, pp. 151-156.
S4. Seshadri, V. "Application of the Taguchi Method for Facsimile Performance Characterization on AT&T International Network." Proceedings of IEEE International Conference on Communications, Philadelphia, Pennsylvania (June 1988) pp. 0833-0835.
S5. Sullivan, L. P. "Reducing Variability: A New Approach to Quality." Quality Progress (July 1984) pp. 15-21.
S6. Sullivan, L. P. "Quality Function Deployment." Quality Progress (June 1986) pp. 39-50.
T1. Taguchi, G. Jikken Keikakuho, 3rd Edition. Tokyo, Japan: Maruzen, vol. 1 and 2, 1977 and 1978 (in Japanese). English translation: Taguchi, G. System of Experimental Design, Edited by Don Clausing. New York: UNIPUB/Kraus International Publications, vol. 1 and 2, 1987.
T2. Taguchi, G. "Off-line and On-Line Quality Control System." International Conference on Quality Control, Tokyo, Japan, 1978.
T3. Taguchi, G. On-line Quality Control During Production. Tokyo, Japan: Japanese Standards Association, 1981. (Available from the American Supplier Institute, Inc., Dearborn, MI).
T4. Taguchi, G. Introduction to Quality Engineering. Asian Productivity Organization, 1986. (Distributed by American Supplier Institute, Inc., Dearborn, MI).
T5. Taguchi, G. and Konishi, S. Orthogonal Arrays and Linear Graphs. Dearborn, MI: ASI Press, 1987.
T6. Taguchi, G. and Phadke, M. S. "Quality Engineering through Design Optimization." Conference Record, GLOBECOM 84 Meeting, IEEE Communications Society, Atlanta, GA (November 1984) pp. 1106-1113.
T7. Taguchi, G. and Wu, Yu-In. Introduction to Off-Line Quality Control. Central Japan Quality Control Association, Meieki Nakamura-Ku, Nagoya, Japan, 1979. (Available from American Supplier Institute, Inc., Dearborn, MI).
T8. The Asahi, Japanese language newspaper, April 15, 1979. Reported by Genichi Taguchi during lectures at AT&T Bell Laboratories in 1980.
T9. Tomishima, A. "Tolerance Design by Performance Analysis Design—An Example of Temperature Control Device." Reliability Design Case Studies for New Product Development, Edited by G. Taguchi, Japanese Standards Assoc., 1984, pp. 213-220 (in Japanese).
Y1. Yokoyama, Y. and Taguchi, G. Business Data Analysis: Experimental Regression Analysis. Tokyo: Maruzen, 1975.

INDEX
A
Accelerated life tests, 271
Accumulation analysis method, for
determining optimum control
factor settings, 122-128
Additive model, 42, 48-50
failure of, 90
Additivity, 133-134
role of orthogonal arrays in ensuring additivity, 146
Adjustment factor, 107
Advanced strategy, orthogonal array
construction, 174, 182
Aluminum etching application, 8
American Supplier Institute, 9
Analysis of Means (ANOM), 46
Analysis of Variance (ANOVA), 3, 42, 51-59
ANOVA tables, interpretation of, 58-59
confidence intervals for factor effects, 58
degrees of freedom, 56
error variance, estimation of, 57
Fourier analysis, analogy with, 51-52
sum of squares, computation of, 53-55
variance ratios, 58
The Asahi (newspaper), 15
Asthmatic patient treatment, control factors, selection of, 144-145
Asymmetric loss function, 21
Automobiles, noise factors, 23
Average quality loss, 25-26, 29
reducing variance, methods of, 26
B
Balancing property, 44
proportional, 154, 279
Beginner strategy, orthogonal array construction,
171-172, 182
Blaine, Gary, 183
Branching design
linear graphs, 168-171
branching factor, 168
Breaking a line, linear graphs, 163
C
Central composite designs, Robust Design versus classical statistical experiment design, 179
Chemical vapor deposition (CVD) process, 41
matrix experiment, 42-44
balancing property, 44
finding frequency response function, 44
starting levels, 42
Classical statistical experiment design
compared to Robust Design, 174-175
data analysis differences, 180
experiment layout differences, 177-180
problem formulation differences, 175-177
Column merging method, 150, 168
router bit life improvement
application, 261
Compound factor method, 150, 156-157
Compound noise factor, 72, 206-207
Computer-aided Robust Design, 183-211
Computer system parameter space mapping, 249
Computer systems, tuning of, 231-251
control factors/levels, 236-238
data analysis, 240-243,
experimental procedure, 238-240
macro modeling, 234
matrix experiment, 238-240
micro modeling, 233-234
need for, 231
noise factors/testing conditions, 234-235
problem formulation, 232-234
quality characteristic, 235-236
related applications, 249
signal-to-noise ratio, 235-236
verification experiments, 240-246
Concept design, 33, 39
Confidence intervals, for factor effects, 58
Confirmation experiment, 60
Confounding, definition of, 157
Construction of orthogonal arrays. See Orthogonal arrays
Continuous-continuous (C-C) problems, signal-to-noise ratios for, 114-116
Continuous-digital (C-D) problems, signal-to-noise ratios for, 116
Continuous variables, as quality characteristics, 135
Control factors, 27, 31, 39
adjustment factor, 107
differential operational amplifier (op-amp), 194
dynamic systems, 216-218
design, 222-223
ignoring interactions among, 176-177
polysilicon deposition process, 74-76
selecting levels of, 75
selection of, 144-146
asthmatic patient treatment, 144-145
Index
computer system tuning, 236-238
photolithography process, 145-146
Control orthogonal array, 195
dynamic systems design, 222-224
Cost, elements of, 4, 11
manufacturing cost, 5, 11
operating cost, 5, 11
R&D cost, 5, 11
of Robust Design followed by
tolerance design, 205
Covercoat process, branching design and,
168-170
Curve/vector response, quality characteristic, 113,
Customer usage, as life cycle stage, 38
D
Data analysis
computer system tuning, 240-248
differential operational amplifier (op-amp), 198
dynamic systems design, 223-227
polysilicon deposition process, 80
reliability improvement, 265-271
Data analysis difference, Robust Design versus
classical statistical experiment
design, 180
significance tests, 180
two-step optimization, 180
Data analysis plan, 76-79, 80-90
Degrees of freedom, 56
counting of, 150-151
Designed experiments, 42. See also Matrix experiments
Design optimization, 194-202, 222-227
differential operational amplifier (op-amp),
194-202
control factors/levels 194-195
control orthogonal array. 195
data analysis, 198
multiple quality characteristics, 202
optimum settings, 198-202
simulation algorithm, 196
dynamic systems, 222-227
control factors/levels, 222-223,
control orthogonal array, 223
data analysis/optimum condition, 223-227
Design parameters, estimated effects of, 6
Deterioration, noise factors, 23-24
Developing process, photographs, noise factors, 24
Differential operational amplifier (op-amp), 9
analysis of nonlinearity, 207
compound noise factor, 206-208
control factors, 194
control orthogonal array, 195
design optimization, 194-202
deviations in circuit parameter values, 186-188
correlation, 186
shape of distribution, 186
noise factors, 186-188
offset voltage, 184
orthogonal array based simulation, 190-193
quality characteristic, 194
selecting major noise factors, 206
signal-to-noise ratio, 194, 208-209
simulation effort, reduction of, 205-207
tolerance design, 202-205
Digital-continuous (D-C) problems, signal-to-noise ratios for, 116-117
Digital-digital (D-D) problems, signal-to-noise
ratios for, 117-121
Dots, linear graphs, 159
Dummy level technique, 150, 154-155
computer system tuning
application, 238
Dynamic problems, 32, 114-121
Robust Design, 32
signal-to-noise ratios for, 114-121
continuous-digital (C-D) problems, 116
continuous-continuous (C-C) problems, 114-116
digital-continuous (D-C) problems, 116-117