3/3/2024
Reliability Engineering
Introduction (Chapter 1)
Things Fail!
• 1940 - The Tacoma Narrows Bridge, five months old, collapsed from
vibrations caused by high winds
• Metal fatigue induced by several months of oscillations led to the
failure
Things Fail!
• 1978 - The Ford Pinto was recalled for modifications to the fuel tank to
reduce fuel leakage and fires resulting from rear-end collisions
• Numerous reported deaths, lawsuits, and negative publicity
eventually contributed to Ford discontinuing production of the Pinto
More Things Fail!
• 1979 - The left engine of a DC-10 broke away from the aircraft during
take-off, killing 271 people
• Poor maintenance procedures and a bad design led to the crash
• Engine removal procedures introduced unacceptable stresses
Things Keep Failing!
• 1986 - The explosion of the Space Shuttle Challenger resulted from the
failure of the rubber O-rings used to seal the four sections
of the booster rockets
• The below-freezing temperatures prior to launch contributed to the
failure by making the rubber brittle
Things Keep Failing!
• 2003 - Space Shuttle Columbia disintegrated over Texas during
re-entry into the Earth's atmosphere, with the loss of all seven crew
members
• The loss was a result of damage sustained during launch, when a piece of
foam insulation the size of a small briefcase broke off the Space Shuttle
external tank under the aerodynamic forces of launch
What is Failure?
• Failure: an item is unable to perform its predefined functions
• A system has an inherent capacity to withstand challenges. A
failure may occur when the challenges surpass the capacity of the
system
• Challenges include both internal and external ones
• Failure is a random event!
Objective of Reliability Engineering?
Reliability engineering attempts to study, characterize,
measure, and analyze the failure of systems in order to:
• Improve their operational use by increasing their design
life
• Eliminate or reduce the likelihood of failures and safety
risks
• Reduce downtime, thereby increasing available
operating time
Deterministic vs. Random Failures
• The traditional approach to safety in engineering is to design into
a product a high safety margin or safety factor
• This is a deterministic method in which a safety factor of perhaps 4 to
10 times the expected load or stress would be allowed for in the
design
• Safety factors often result in overdesign, thus increasing costs
Deterministic vs. Random Failures
• The approach taken in reliability is to treat failures as random or
probabilistic occurrences
• In theory, if we were able to comprehend the exact physics and
chemistry of a failure process, many internal failures of a component
could be predicted with certainty
• With limited data on the physical state of a component, and an
incomplete knowledge of the physical, chemical (and perhaps
biological) processes which cause failures, failures will appear to occur
at random over time
• This random process may exhibit a pattern which can be modeled by
some probability distribution
Random Phenomena
• Observed in practice when dealing with large numbers of
components
• The failure of these components can be predicted statistically
• Failures caused by events external to the component, such as
environmental conditions like excessive heat or vibration, hurricanes
or earthquakes, will appear to be random
• With sufficient understanding of the conditions resulting in the event,
as well as the effect such an event would have on the component,
we should also be able to predict these failures deterministically
Random Phenomena
• The uncertainty, or incomplete information, about a failure process is
a result of
• Its complexity
• Imprecise measurements of the relevant physical constants and
variables
• The indeterminable nature of certain future events
Some Definitions
• Reliability is the probability that a component or system will
perform a required function for a given period of time when
used under stated operating conditions
• (It is the probability of non-failure)
R(t)
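In symbols, the definition can be sketched as follows, assuming T denotes the random time to failure of the item (T is a symbol introduced here for illustration; it does not appear on the slide) and F(t) the probability of failure by time t:

```latex
R(t) = \Pr\{T \ge t\}, \qquad F(t) = \Pr\{T < t\} = 1 - R(t)
```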
Some Definitions
• Availability is the probability that a component or system is
performing its required function at a given point in time
when used under stated operating conditions
A(t)
Some Definitions
• Maintainability is the probability that a failed component or
system will be restored or repaired to a specified condition
within a period of time when maintenance is performed in
accordance with prescribed procedures
M(t)
Why Study Reliability?
• The increased complexity and sophistication of systems
• Public awareness and insistence on product quality
• New laws and regulations concerning product liability
• Government requirements to meet reliability and maintainability
performance specifications
• Profit considerations resulting from the high cost of failures, their
repairs, and warranty programs
Complexity and Reliability
[Figure: system reliability (0 to 1) plotted against component reliability (1.00 down to 0.90), with one curve each for N = 2, 5, 10, 25, and 50]
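The chart's point can be reproduced numerically. The sketch below assumes the curves describe a series system of N independent, identical components, so that system reliability is the component reliability raised to the power N; the slide does not state this model explicitly, and the function name is hypothetical.

```python
def series_reliability(component_reliability: float, n_components: int) -> float:
    """Reliability of a series system of identical, independent components.

    Assumption (not stated on the slide): the system works only if every
    one of its N components works, so R_system = R_component ** N.
    """
    return component_reliability ** n_components


component_values = [1.00, 0.99, 0.95, 0.90]
system_sizes = [2, 5, 10, 25, 50]

# Print one row per system size, one column per component reliability.
for n in system_sizes:
    row = "  ".join(f"{series_reliability(r, n):.3f}" for r in component_values)
    print(f"N={n:>2}: {row}")
```

Under this assumption, even components that are 99 percent reliable give a noticeably less reliable system as N grows, which is the sense in which complexity works against reliability.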
Government Regulations
Food and Drug Act
Flammable Fabrics Act
Federal Hazardous Substance Act
National Traffic and Motor Vehicle Safety Act
Fire Research and Safety Act
Child Protection and Toy Safety Act
Occupational Safety and Health Act
Federal Boat Safety Act
Consumer Product Safety Act
Most Important Product Attributes (Survey)
Attribute                            Average Score
Performance                          9.5
Lasts long time (reliability)        9.0
Service                              8.9
Easily repaired (maintainability)    8.8
Warranty                             8.4
Easy to use                          8.3
Appearance                           7.7
Brand name                           6.3
Packaging/display                    5.8
Latest model                         5.4
Reliability vs Quality
• Quality is the amount by which a product satisfies the users’
(customers’) requirements. Product quality is in part a function of
design and conformance to design specifications during manufacture
• Reliability is concerned with how long the product continues to
function once it becomes operational. Therefore reliability can be
viewed as the quality of the product’s operational performance over
time, and as such it extends quality into the time domain
Reasons for Evaluating Reliability
• Assessing characteristics of materials
• Predicting product reliability in the design stage
• Assessing the effect of a proposed design change
• Comparing two or more different vendors
• Assessing product reliability in the field
• Predicting product warranty costs
Example 1.1
• A company manufactures small motors for use in household appliances
• It designed a new motor which has experienced an abnormally high
failure rate, with 43 failures reported from among the first 1000
motors produced
• Possible causes of these failures included faulty design, defective
material, or a manufacturing (tolerance) problem
• The company initiated an aggressive accelerated life testing program,
in which it observed that motors produced near the end of a
production run were failing at a higher rate than those at the start of
the run
Example 1.1
• Table 1.1 summarizes the results of the testing program
• The failure rate is computed by dividing the number of failures by the
total number of hours on test
• It was assumed that the production process was going “out of
control” and design tolerances were not being met
• Additional emphasis was placed on quality control
Table 1.1
Motor #          1-100   101-200   201-300   301-400   401-500   Total
Number tested    12      11        12        12        15        62
Hours on test    2540    2714      2291      1890      2438      11873
Number failed    1       0         1         5         7         14
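The failure-rate calculation described above (failures divided by hours on test) can be sketched for each group of motors as follows. The data are taken from Table 1.1; the variable names are illustrative only.

```python
# Data from Table 1.1, one entry per group of motors by production order.
batches = ["1-100", "101-200", "201-300", "301-400", "401-500"]
hours_on_test = [2540, 2714, 2291, 1890, 2438]
number_failed = [1, 0, 1, 5, 7]

# Failure rate = number of failures / total hours on test (failures per hour).
failure_rates = [f / h for f, h in zip(number_failed, hours_on_test)]
overall_rate = sum(number_failed) / sum(hours_on_test)

for batch, rate in zip(batches, failure_rates):
    print(f"motors {batch:>7}: {rate:.6f} failures per hour")
print(f"all motors    : {overall_rate:.6f} failures per hour")
```

The rates for motors 301-400 and 401-500 come out several times higher than those for the earlier groups, which is the pattern that pointed toward the production process drifting out of control.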
Example 1.2
• For a new VCR unit produced by the XYZ Company, the following
distribution of the time to failure was obtained from a reliability
testing program
[Figure: histogram of fraction failed (0 to 0.12) versus operating hours (1000 to 9000)]
Example 1.2
• From this data, F(t) was derived where F(t) is the probability of a VCR
failure occurring by time t (in operating hours):
F(t) = 1 - e^{-t/8750}
• For warranty analysis, what is the probability of failure during the first
year, first 2 years, first 3 years? (survey showed that the average use
of the VCR is 3 hours a day)
Example 1.2
Considering 1 year warranty:
• Assuming the typical consumer will use the VCR an average of 3 hours
a day, then for the first year 1095 operating hours (3 x 365) will be
observed. Therefore, the probability of a unit failing is
F(1095) = 1 - e^{-1095/8750} = 1 - 0.8824 = 0.1176
• With over ten percent of the units sold expected to fail during the
first year, the company decided to initiate a reliability growth program to
improve product reliability, reduce warranty costs, and increase
customer satisfaction
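The same calculation extends to the 2-year and 3-year warranty periods asked about earlier. This is a minimal sketch using the failure distribution from this example with a mean of 8750 operating hours and usage of 3 hours a day; it assumes 365 days per year, and the function and constant names are hypothetical.

```python
import math

MEAN_HOURS = 8750.0   # parameter of F(t) in Example 1.2
HOURS_PER_DAY = 3     # average use reported by the survey
DAYS_PER_YEAR = 365   # assumption: leap days ignored


def prob_failure_by(t_hours: float) -> float:
    """F(t) = 1 - exp(-t / 8750): probability of failure by t operating hours."""
    return 1.0 - math.exp(-t_hours / MEAN_HOURS)


# Probability of failure within a warranty period of 1, 2, or 3 years.
warranty = {
    years: prob_failure_by(years * DAYS_PER_YEAR * HOURS_PER_DAY)
    for years in (1, 2, 3)
}

for years, prob in warranty.items():
    hours = years * DAYS_PER_YEAR * HOURS_PER_DAY
    print(f"{years} year(s) = {hours} hours: F = {prob:.4f}")
```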
Example 1.3
• A continuous flow production line requires a product to be processed
sequentially on 10 different machines
• When a machine breaks down, the entire line must be stopped until
the failure is repaired – an average downtime of 12 hours
• Machine specs require a 0.99 reliability for each machine over an 8
hour production run
• Assuming a constant failure rate (exponential failure distribution)
• In order to meet production quotas, the line must maintain at least a
0.92 availability
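One way to check whether the line meets the availability requirement is sketched below. It rests on assumptions the slide does not spell out: the machines fail independently, the line is down whenever any one machine is down (so failure rates add), the 12-hour average downtime is the MTTR, and availability is the steady-state ratio MTBF / (MTBF + MTTR). All names are hypothetical.

```python
import math

N_MACHINES = 10
MACHINE_RELIABILITY = 0.99    # required over one production run
RUN_HOURS = 8.0
MTTR_HOURS = 12.0             # average downtime per failure
REQUIRED_AVAILABILITY = 0.92

# Constant failure rate: R(t) = exp(-rate * t), so rate = -ln(R) / t.
machine_rate = -math.log(MACHINE_RELIABILITY) / RUN_HOURS

# Assumed series line of independent machines: rates add.
line_rate = N_MACHINES * machine_rate
line_mtbf = 1.0 / line_rate

# Assumed steady-state availability.
availability = line_mtbf / (line_mtbf + MTTR_HOURS)

# Largest MTTR that still meets the required availability.
max_mttr = line_mtbf * (1.0 - REQUIRED_AVAILABILITY) / REQUIRED_AVAILABILITY

print(f"line MTBF     : {line_mtbf:.1f} hours")
print(f"availability  : {availability:.3f}")
print(f"MTTR for 0.92 : {max_mttr:.2f} hours or less")
```

Under these assumptions the line falls short of 0.92 with a 12-hour repair time, which motivates looking for ways to shorten repairs.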
Example 1.3
Ways to decrease the MTTR (mean time to repair):
• Hiring an additional maintenance person
• Increasing machine spare parts inventory
• Relocating the inventory closer to the production line
• Improving diagnostic procedures
• Simplifying the removal and replacement of high-failure
components
• etc.
Reliability Specification
• Define failure
• What function is performed?
• Identify failure modes
• Time to failure
• Calendar time
• Operating hours
• Number of cycles (on/off, load reversals, missions)
• Vehicle miles - incidents per 1000 vehicles (IPTV)
• State normal conditions
• Design loads (weight, voltage, pressure, etc.)
• Environment (temp., humidity, vibration, contaminants, etc.)
• Operating (usage, storage, maintenance, shipment, etc.)
Reliability Specification
• Avoid vagueness
• e.g. “as reliable as possible”
• Be realistic
• e.g. “will not fail under any operating conditions”
• Avoid using only the MTTF or MTBF unless failure rate is constant
• Frame in terms of reliability or design life
• a 95 percent reliability at 10,000 operating hours
• a design life of 10,000 operating hours with a 95 percent reliability
Reliability Specification – Example
Is this a reliability specification?
Which average? - mean, median, mode?
Operating hours or clock time?
What about on/off cycles?
What are the operating conditions?
Time to failure: Cycles versus Time
The Failure Distribution and the MTTF
• MTTF: Mean Time To Failure
[Figure: three failure distributions, each with MTTF = 10, labeled Pr{fails} = .3, .5, and .7]
The Failure Distribution and the MTTF
• Exponential (constant failure rate) Distribution: R(MTTF) = 0.3679
• Normal Distribution: R(MTTF) = 0.5
• Weibull with a shape parameter of 0.5: R(MTTF) = 0.24
• Weibull with a shape parameter of 2: R(MTTF) = 0.456
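These values can be checked numerically. The sketch below assumes the standard two-parameter Weibull form R(t) = exp(-(t/θ)^β) with MTTF = θ·Γ(1 + 1/β), in which the scale θ cancels when R is evaluated at the MTTF; the normal-distribution mean and standard deviation are arbitrary illustrative choices, since the result is 0.5 for any of them.

```python
import math
from statistics import NormalDist


def weibull_reliability_at_mttf(shape: float) -> float:
    """R(MTTF) for a two-parameter Weibull distribution.

    Assumed form: R(t) = exp(-(t / scale) ** shape) and
    MTTF = scale * gamma(1 + 1 / shape), so the scale cancels out.
    """
    return math.exp(-math.gamma(1.0 + 1.0 / shape) ** shape)


# Exponential: R(t) = exp(-t / MTTF), so R(MTTF) = exp(-1).
r_exponential = math.exp(-1.0)

# Normal: the MTTF is the mean, and half the probability lies above it.
# mu=10 and sigma=2 are arbitrary illustrative values.
r_normal = 1.0 - NormalDist(mu=10, sigma=2).cdf(10)

print(f"exponential      : {r_exponential:.4f}")
print(f"normal           : {r_normal:.4f}")
print(f"Weibull, beta=0.5: {weibull_reliability_at_mttf(0.5):.4f}")
print(f"Weibull, beta=2  : {weibull_reliability_at_mttf(2.0):.4f}")
```

The spread of results shows why a specification stated only as an MTTF says little about the reliability at that time unless the failure distribution is also known.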
• Reading: Chapter 1 (pages 1-11); solve the examples.
• Pages 12-32: Review of probability and other information that we
might use later. Do not read it now.