8/15/2021
Basic Reliability Concepts
Nagi Gebraeel,
Professor of Industrial & Systems Engineering
Georgia Tech
Reliability: the probability that a product will operate for a specified period of
time under the intended operating conditions without failure.
Reliability Engineering: characterizes, measures, and analyzes system failures
to improve theoretical operations and reduce the likelihood of unexpected failures.
Basic Reliability Concepts
Reliability Function
Relationship to CDF & PDF
Mean/Residual Time‐To‐Failure
Failure & Hazard Rate
Bathtub Curve
Decreasing Failure Rate
Constant Failure Rate
Increasing Failure Rate
Comprehensive Example
Reliability Function
• Suppose $n_0$ identical components are tested under their designed operating
conditions. Assume that by some time $t$ there are $n_f(t)$ failed components and
$n_s(t)$ surviving components, such that $n_f(t) + n_s(t) = n_0$.
• Reliability at time $t$, $R(t)$, is defined as follows:
$$R(t) = \frac{n_s(t)}{n_s(t) + n_f(t)} = \frac{n_s(t)}{n_0}$$
Reliability function: the fraction of all test components that survive past time t.
n_0 = number of identical test components
n_f(t) = number of components that have failed by time t
n_s(t) = number of components still surviving at time t
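The ratio above can be estimated directly from test data. The following is a minimal sketch, assuming every tested unit is run to failure; the failure times and the function name `empirical_reliability` are hypothetical and chosen only for illustration.

```python
def empirical_reliability(failure_times, t, n0=None):
    """Estimate R(t) = n_s(t) / n_0 from observed failure times."""
    if n0 is None:
        n0 = len(failure_times)  # assumption: every tested unit eventually failed
    n_failed = sum(1 for ft in failure_times if ft <= t)  # n_f(t)
    n_surviving = n0 - n_failed                           # n_s(t)
    return n_surviving / n0


# Hypothetical failure times (hours) for n_0 = 10 test units.
times = [120, 340, 410, 560, 700, 820, 950, 1100, 1300, 1500]
print(empirical_reliability(times, 500))  # 0.7 (3 failed, 7 surviving)
```

By construction the estimate starts at 1 for t = 0 and falls to 0 once the last unit has failed.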
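Reliability Function
F(t) = probability of failure by time t; R(t) = probability of survival past time t
• If $T$ is a random variable denoting the time to failure, then the reliability
function at time $t$ can be expressed as
$$R(t) = \mathbb{P}(T > t)$$
• Reliability is related to the cumulative distribution function (CDF) by $R(t) = 1 - F(t)$:
$$F(t) = 1 - R(t) = \mathbb{P}(T \le t)$$
• In fact, $R(0) = 1$ and $\lim_{t \to \infty} R(t) = 0$; equivalently, $F(0) = 0$ and $F(\infty) = 1$.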
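Reliability, CDF, & PDF
If $T$ has a probability density function $f(t)$, then
$$R(t) = 1 - F(t) = 1 - \int_0^t f(\tau)\,d\tau$$
$$f(t) = \frac{dF(t)}{dt} = -\frac{dR(t)}{dt}$$
Also, since $R(t) = 1 - F(t)$, when only the PDF is given:
$$R(t) = \int_t^\infty f(\tau)\,d\tau, \qquad F(t) = \int_0^t f(\tau)\,d\tau$$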
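Example: Reliability Function
• Given the following PDF of the time‐to‐failure of a compressor, which we will
denote as $T$ (in hours), what is the reliability for 100 hr of operating life?
$$f(t) = \begin{cases} \dfrac{0.001}{(0.001t + 1)^2} & t \ge 0 \\ 0 & \text{otherwise} \end{cases}$$
$$R(t) = \int_t^\infty f(\tau)\,d\tau = \int_t^\infty \frac{0.001}{(0.001\tau + 1)^2}\,d\tau = \left[-\frac{1}{0.001\tau + 1}\right]_t^\infty = \frac{1}{0.001t + 1}$$
$$R(100) = \frac{1}{0.001(100) + 1} = 0.909$$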
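The closed-form result can be cross-checked numerically. This is a sketch only: it integrates the compressor PDF from 0 to t with a composite Simpson's rule (a choice made here, not part of the slides) and uses R(t) = 1 - F(t); the helper names are illustrative.

```python
def f(t):
    """PDF of the compressor time to failure (hours), as given in the example."""
    return 0.001 / (0.001 * t + 1) ** 2 if t >= 0 else 0.0


def cdf_simpson(pdf, t, n=1000):
    """Approximate F(t) = integral of pdf over [0, t] with composite Simpson's rule (n even)."""
    h = t / n
    total = pdf(0.0) + pdf(t)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return total * h / 3


def reliability_numeric(t):
    """R(t) = 1 - F(t), using the numerical CDF."""
    return 1.0 - cdf_simpson(f, t)


def reliability_closed(t):
    """Closed form derived in the example: R(t) = 1 / (0.001 t + 1)."""
    return 1.0 / (0.001 * t + 1)


print(round(reliability_numeric(100), 3), round(reliability_closed(100), 3))
```

Both routes should agree with the 0.909 obtained by hand.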
Example: Reliability Function
• A design life is defined to be the time to failure $t_R$ that corresponds to a specified
reliability $R$. That is, $R(t_R) = R$.
• Continuing with the previous example, if a reliability of 0.95 is desired, we set
$$R(t_R) = \frac{1}{0.001\,t_R + 1} = 0.95$$
Solving for $t_R$ we get:
$$t_R = 1000\left(\frac{1}{0.95} - 1\right) = 52.6\ \text{hr}$$
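When R(t) cannot be inverted by hand, the design life can be found numerically. The sketch below uses bisection, which is one possible root-finding choice and assumes R(t) is decreasing on the search interval; `design_life` and its bounds are hypothetical names and defaults.

```python
def reliability(t):
    """Compressor reliability from the previous example: R(t) = 1 / (0.001 t + 1)."""
    return 1.0 / (0.001 * t + 1)


def design_life(rel_fn, target, lo=0.0, hi=1e6, tol=1e-9):
    """Find t with rel_fn(t) = target by bisection, assuming rel_fn decreases on [lo, hi]."""
    if not (rel_fn(hi) <= target <= rel_fn(lo)):
        raise ValueError("target reliability is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if rel_fn(mid) > target:
            lo = mid  # still too reliable: the design life lies later
        else:
            hi = mid
    return (lo + hi) / 2


print(round(design_life(reliability, 0.95), 1))  # 52.6
```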
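Finding the Probability of Failure
• The probability of failure occurring within some interval of time $[a, b]$ may be found
using any of the three probability functions:
$$P(a \le T \le b) = \int_a^b f(t)\,dt = F(b) - F(a) = R(a) - R(b)$$
(Figure: $f(t)$ plotted against $t$, with the interval from $a$ to $b$ marked.)
• From the previous example:
$$\Pr(10 \le T \le 100) = R(10) - R(100) = 0.990 - 0.909 = 0.081$$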
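As a quick check of the interval calculation, here is a small sketch that evaluates R(a) - R(b) for the compressor example; the function name `prob_failure_between` is illustrative.

```python
def reliability(t):
    """Compressor reliability from the example: R(t) = 1 / (0.001 t + 1)."""
    return 1.0 / (0.001 * t + 1)


def prob_failure_between(rel_fn, a, b):
    """P(a <= T <= b) = R(a) - R(b) for a <= b."""
    if a > b:
        raise ValueError("expected a <= b")
    return rel_fn(a) - rel_fn(b)


p = prob_failure_between(reliability, 10, 100)
print(round(p, 3))  # 0.081
```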
Mean Time to Failure
• One of the key measures of a system's reliability is the MTTF.
MTTF is usually used when the system is nonrepairable.
For repairable systems, the mean time between two successive
failures is usually referred to as the MTBF (mean time between failures).
• Consider $n$ identical nonrepairable systems (assumed i.i.d.) whose times to failure are
given by $t_1, t_2, \ldots, t_n$. Then the mean time to failure is given as
$$\text{MTTF} = \frac{1}{n}\sum_{i=1}^{n} t_i$$
This is simply the sample average, used when the exact failure times are given.
n = number of identical nonrepairable systems
t_1, …, t_n = their times to failure
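With observed failure times, the MTTF is just the sample mean. A minimal sketch follows; the five failure times are hypothetical values made up for illustration.

```python
from statistics import mean


def sample_mttf(failure_times):
    """MTTF estimate: the average of the observed times to failure."""
    return mean(failure_times)


# Hypothetical times to failure (hours) of n = 5 identical nonrepairable units.
failure_times_hr = [410.0, 520.0, 485.0, 610.0, 475.0]
print(sample_mttf(failure_times_hr))  # 500.0
```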
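Mean Time to Failure
When the distribution of the time to failure is given:
• If $T$ is a random variable representing time to failure, then the
Mean‐Time‐To‐Failure (MTTF) can be defined as follows:
$$\text{MTTF} = \mathbb{E}[T] = \int_0^\infty t\,f(t)\,dt$$
This is the usual mean of a random variable $X$ with density $f(x)$: $\mu = \mathbb{E}[X] = \int_{-\infty}^{\infty} x\,f(x)\,dx$.
• Another measure that is often used to describe the distribution of the time to
failure is its variance $\sigma^2$:
$$\sigma^2 = \int_0^\infty (t - \text{MTTF})^2 f(t)\,dt = \int_0^\infty t^2 f(t)\,dt - \text{MTTF}^2$$
which parallels the general definition $\sigma^2 = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$.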
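The two integrals can be evaluated numerically for any PDF with bounded support. The sketch below assumes a hypothetical time to failure that is uniform on [0, 1000] hours (not from the slides), for which the exact answers are MTTF = 500 and variance = 1000²/12; Simpson's rule is an arbitrary choice of quadrature.

```python
def simpson(g, a, b, n=1000):
    """Composite Simpson's rule for the integral of g over [a, b] (n even)."""
    h = (b - a) / n
    total = g(a) + g(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3


LIFE_LIMIT = 1000.0  # hypothetical maximum life in hours


def pdf(t):
    """Assumed PDF: time to failure uniform on [0, LIFE_LIMIT]."""
    return 1.0 / LIFE_LIMIT if 0.0 <= t <= LIFE_LIMIT else 0.0


def mttf_and_variance(density, upper):
    """Return (MTTF, variance) using the first and second moments on [0, upper]."""
    mttf = simpson(lambda t: t * density(t), 0.0, upper)
    second_moment = simpson(lambda t: t * t * density(t), 0.0, upper)
    return mttf, second_moment - mttf ** 2


mttf, variance = mttf_and_variance(pdf, LIFE_LIMIT)
print(round(mttf, 3), round(variance, 3))
```

For a PDF with an infinite tail, the upper limit would have to be truncated where the remaining probability mass is negligible.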
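Mean Time to Failure
• MTTF can also be expressed in terms of the integral of the reliability function.
• Recall that $f(t) = dF(t)/dt = -dR(t)/dt$, thus
$$\text{MTTF} = -\int_0^\infty t\,\frac{dR(t)}{dt}\,dt$$
Using integration by parts, we see that
$$\text{MTTF} = -\Big[t\,R(t)\Big]_0^\infty + \int_0^\infty R(t)\,dt$$
Since $R(\infty) = 0$ and $R(0) = 1$ (and $t\,R(t) \to 0$ as $t \to \infty$ whenever the MTTF is finite), the bracketed term vanishes and we have
$$\text{MTTF} = \int_0^\infty R(t)\,dt$$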
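Residual MTTF
$T_0$ = time already survived; residual MTTF = expected remaining lifetime
• If a piece of equipment has been operating for some time $T_0$, we can still calculate its
residual MTTF using the conditional reliability function $R(t \mid T_0)$:
$$\text{MTTF}(T_0) = \int_0^\infty R(t \mid T_0)\,dt = \int_0^\infty \frac{R(t + T_0)}{R(T_0)}\,dt = \frac{1}{R(T_0)} \int_{T_0}^\infty R(t')\,dt'$$
where $t' = t + T_0$ is the total operating time until failure.
• Remark: Conditional reliability given that a component has operated for time $T_0$:
$$R(t \mid T_0) = \Pr(T > T_0 + t \mid T > T_0) = \frac{\Pr(\{T > T_0 + t\} \cap \{T > T_0\})}{\Pr(T > T_0)} = \frac{\Pr(T > T_0 + t)}{\Pr(T > T_0)} = \frac{R(T_0 + t)}{R(T_0)}$$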
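To make the residual MTTF formula concrete, the sketch below evaluates (1 / R(T0)) times the integral of R(t') from T0 onward. It assumes a hypothetical reliability function R(t) = 1 - t/1000 on [0, 1000] hours (not from the slides), for which the exact residual MTTF is (1000 - T0)/2; the trapezoid rule and all names are illustrative choices.

```python
def trapezoid(g, a, b, n=1000):
    """Composite trapezoid rule for the integral of g over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (g(a) + g(b))
    for i in range(1, n):
        total += g(a + i * h)
    return total * h


LIFE_LIMIT = 1000.0  # hypothetical maximum life in hours


def reliability(t):
    """Assumed R(t) = 1 - t / LIFE_LIMIT on [0, LIFE_LIMIT], zero afterwards."""
    return max(0.0, 1.0 - t / LIFE_LIMIT)


def residual_mttf(rel_fn, t0, upper):
    """(1 / R(T0)) * integral of R(t') over [T0, upper]; upper is where R has reached zero."""
    r0 = rel_fn(t0)
    if r0 <= 0.0:
        raise ValueError("the unit cannot have survived to t0")
    return trapezoid(rel_fn, t0, upper) / r0


print(round(residual_mttf(reliability, 0.0, LIFE_LIMIT), 3))    # 500.0
print(round(residual_mttf(reliability, 400.0, LIFE_LIMIT), 3))  # 300.0
```

With T0 = 0 the residual MTTF reduces to the ordinary MTTF.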
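Example: Residual MTTF
Consider the following reliability function (with $t$ in hours):
$$R(t) = e^{-0.002t}$$
Calculate the MTTF.
$$\text{MTTF} = \int_0^\infty R(t)\,dt = \int_0^\infty e^{-0.002t}\,dt = \left[-\frac{e^{-0.002t}}{0.002}\right]_0^\infty = -\frac{e^{-\infty}}{0.002} + \frac{1}{0.002} = \frac{1}{0.002} = 500\ \text{hr}$$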
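The 500 hr result can be confirmed by integrating R(t) numerically. This sketch assumes R(t) = exp(-0.002 t) as in the example and truncates the infinite upper limit at 20000 hr, where R is negligibly small; the truncation point and step count are choices made here.

```python
import math

RATE = 0.002  # failures per hour, from R(t) = exp(-0.002 t)


def reliability(t):
    """Reliability function of the example."""
    return math.exp(-RATE * t)


def mttf_from_reliability(rel_fn, upper, n=100000):
    """Trapezoid approximation of the integral of R(t) over [0, upper]."""
    h = upper / n
    total = 0.5 * (rel_fn(0.0) + rel_fn(upper))
    for i in range(1, n):
        total += rel_fn(i * h)
    return total * h


mttf = mttf_from_reliability(reliability, upper=20000.0)
print(round(mttf, 2))  # 500.0
```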
Other Measures of Central Tendency
• The mean of the failure distribution is only one of several measures of
central tendency of the failure distribution.
• Another is the median time to failure, $t_{\text{med}}$. The median divides the distribution
into two halves, with 50 percent of the failures occurring before the median and
50 percent after it. It is defined as:
$$R(t_{\text{med}}) = 0.5 = P(T \ge t_{\text{med}})$$
• A less frequently used measure is the mode, $t_{\text{mode}}$, or the most likely observed
failure time, which is defined by:
$$f(t_{\text{mode}}) = \max_{0 \le t < \infty} f(t)$$
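Both measures can be located numerically. The sketch below uses the exponential example R(t) = exp(-0.002 t), solves R(t) = 0.5 by bisection, and finds the mode by a coarse grid search over the PDF; the search limits and function names are assumptions made for illustration.

```python
import math

RATE = 0.002  # failures per hour, from R(t) = exp(-0.002 t)


def reliability(t):
    return math.exp(-RATE * t)


def pdf(t):
    """f(t) = -dR/dt for the exponential example."""
    return RATE * math.exp(-RATE * t)


def median_life(rel_fn, hi, tol=1e-9):
    """Solve rel_fn(t) = 0.5 on [0, hi] by bisection (rel_fn assumed decreasing)."""
    lo = 0.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if rel_fn(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mode_on_grid(density, upper, n=10000):
    """Return the grid point in [0, upper] where the density is largest."""
    grid = [upper * i / n for i in range(n + 1)]
    return max(grid, key=density)


print(round(median_life(reliability, hi=10000.0), 1))  # about 346.6 hr
print(mode_on_grid(pdf, upper=10000.0))                # 0.0
```

For this distribution the median (about 346.6 hr) is well below the MTTF of 500 hr, and the mode sits at t = 0 because the exponential PDF is largest there.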
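Failure/Hazard Rate
• The probability of failure of a component in a given time interval $[t_1, t_2]$ can
be expressed as follows:
$$\int_{t_1}^{t_2} f(t)\,dt = R(t_1) - R(t_2)$$
• To see this:
$$\int_{t_1}^{t_2} f(t)\,dt = F(t_2) - F(t_1) = [1 - R(t_2)] - [1 - R(t_1)] = R(t_1) - R(t_2)$$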
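Failure/Hazard Rate
• The probability of failure of a component in a time interval $[t_1, t_2]$ can be expressed
as
$$R(t_1) - R(t_2)$$
• The Failure Rate is defined as the probability per unit time that a failure occurs within the
interval $[t_1, t_2]$, given that no failure occurred prior to $t_1$:
$$\frac{R(t_1) - R(t_2)}{(t_2 - t_1)\,R(t_1)}$$
• By replacing $t_1$ and $t_2$ with $t$ and $t + \Delta t$:
$$\frac{R(t) - R(t + \Delta t)}{\Delta t\,R(t)}$$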
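Failure/Hazard Rate
Note: the hazard rate looks good in theory but means little in reality.
The Hazard Rate Function is defined as the limit of the failure rate as $\Delta t$
approaches zero, i.e., it is the instantaneous failure rate:
$$\lambda(t) = \lim_{\Delta t \to 0} \frac{R(t) - R(t + \Delta t)}{\Delta t\,R(t)} = -\frac{1}{R(t)}\frac{dR(t)}{dt} = \frac{f(t)}{R(t)}$$
Important remark: since $f(t) = -\dfrac{dR(t)}{dt}$ and $\int \dfrac{1}{x}\,dx = \ln x$, integrating $\lambda(t) = -\dfrac{1}{R(t)}\dfrac{dR(t)}{dt}$ from 0 to $t$ (with $R(0) = 1$) gives
$$\ln R(t) = -\int_0^t \lambda(\tau)\,d\tau \quad\Longrightarrow\quad R(t) = \exp\left(-\int_0^t \lambda(\tau)\,d\tau\right)$$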
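The limit can be illustrated numerically. The sketch below takes the failure rate over [t, t + Δt] to be [R(t) - R(t + Δt)] / (Δt R(t)), the standard form consistent with the definition above, and compares it with f(t)/R(t) for the compressor example; the function names are illustrative.

```python
def reliability(t):
    """Compressor example: R(t) = 1 / (0.001 t + 1)."""
    return 1.0 / (0.001 * t + 1)


def pdf(t):
    """Compressor example: f(t) = 0.001 / (0.001 t + 1)^2."""
    return 0.001 / (0.001 * t + 1) ** 2


def hazard_exact(t):
    """Instantaneous hazard rate f(t) / R(t)."""
    return pdf(t) / reliability(t)


def failure_rate(rel_fn, t, dt):
    """Failure rate over [t, t + dt], conditional on survival to t."""
    return (rel_fn(t) - rel_fn(t + dt)) / (dt * rel_fn(t))


for dt in (10.0, 1.0, 1e-3):
    print(dt, failure_rate(reliability, 100.0, dt))
print("limit:", hazard_exact(100.0))  # about 0.000909 per hour
```

As Δt shrinks, the interval failure rate approaches the hazard rate; here the hazard also falls with t, so this example is a decreasing failure rate.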
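Cumulative and Average Failure Rate
• The cumulative failure rate over a period of time $t$ is:
$$L(t) = \int_0^t \lambda(\tau)\,d\tau$$
• Another useful function is the Average Hazard (Failure) Rate, denoted as $\text{AFR}(t_1, t_2)$
and defined as
$$\text{AFR}(t_1, t_2) = \frac{1}{t_2 - t_1}\int_{t_1}^{t_2} \lambda(t)\,dt = \frac{\ln R(t_1) - \ln R(t_2)}{t_2 - t_1}$$
• If $t_1 = 0$ and $t_2 = t$, then the AFR can be written as follows:
$$\text{AFR}(t) = \frac{-\ln R(t)}{t} = \frac{L(t)}{t}$$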
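Because both quantities depend only on ln R(t), they are easy to compute once R(t) is known. A minimal sketch for the compressor example follows, using L(t) = -ln R(t), which follows from R(t) = exp(-L(t)); the function names are illustrative.

```python
import math


def reliability(t):
    """Compressor example: R(t) = 1 / (0.001 t + 1)."""
    return 1.0 / (0.001 * t + 1)


def cumulative_hazard(rel_fn, t):
    """L(t) = -ln R(t)."""
    return -math.log(rel_fn(t))


def average_failure_rate(rel_fn, t1, t2):
    """AFR(t1, t2) = [ln R(t1) - ln R(t2)] / (t2 - t1)."""
    if t2 <= t1:
        raise ValueError("expected t1 < t2")
    return (math.log(rel_fn(t1)) - math.log(rel_fn(t2))) / (t2 - t1)


afr = average_failure_rate(reliability, 0.0, 100.0)
print(round(afr, 6))  # 0.000953 per hour
```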
Bathtub Curve
• The bathtub curve provides a general description of the hazard function across
the life cycle of a product. It comprises 3 main regions.
(Figure: hazard rate λ(t) versus time, in three regions: Infant Mortality (Burn‐in), Decreasing Failure Rate (DFR); Random Failures (Useful Life), Constant Failure Rate (CFR); Wear‐out Failures, Increasing Failure Rate (IFR).)
Bathtub Curve ‐ DFR
• New units experience a high failure rate at the beginning of their use,
which then decreases over time, hence the term decreasing failure rate
(DFR). This phase is known as infant mortality.
Typically results from manufacturing defects, cracks, poor
workmanship, poor quality control, defective parts, or contamination.
Can be reduced through burn‐in testing, where units are subjected to
slightly more severe conditions than those encountered under
normal operation.
Bathtub Curve ‐ CFR
• The failure rate levels off for a period of time, which is characterized
by a constant failure rate (CFR). In this region, failures are random and do
not follow a predictable pattern.
This phase is typically referred to as the “useful life”.
Events are often “acts of God”.
Failures may be caused by random loads, human error, or chance.
Failures in this phase can be reduced by redundancy or excess strength.
Bathtub Curve ‐ IFR
• The third region, also known as the wear‐out phase, is characterized by an
increasing failure rate (IFR). Failures in this phase are no longer
random and are mostly due to aging and wear.
Typical causes of failure in this phase are fatigue due to cyclic loading,
wear, and corrosion.
Can be reduced through derating, preventive maintenance, parts
replacement, and condition monitoring using sensor technology.
Bathtub Curve (Summary)
Phase       | Characterized By | Caused By | Reduced By
Burn‐in     | DFR | Manufacturing defects: welding flaws, defective parts, poor quality/workmanship, contamination | Burn‐in testing, screening, acceptance testing, quality control
Useful Life | CFR | Environment, random loads, human error, chance events | Redundancy, excess strength
Wear‐out    | IFR | Fatigue, corrosion, aging, cyclic loading | Derating, part replacement, preventive maintenance
Knowledge Checks
Consider the following reliability function
$$R(t) = e^{-0.002t}$$
Calculate the hazard rate
a) 500
b) 0.2
c) 0.002
Is the hazard rate function
a) DFR
b) CFR
c) IFR
Problem
A company manufactures widgets. The time to failure in years of these widgets
has the following PDF.
$$f(t) = \frac{200}{(t + 10)^3} \quad \text{for } t \ge 0$$
a) Derive the reliability function and determine the reliability for the first
year of operation.
b) Compute the MTTF.
c) What is the design life for a reliability of 0.95?
d) Is the failure rate DFR, CFR, or IFR?
e) Will a one‐year burn‐in period improve the reliability in part (a)? Calculate
the new reliability.
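One way to check answers to parts (a) through (e) is sketched below. It assumes the PDF is f(t) = 200 / (t + 10)^3 for t ≥ 0 (the exponent 3 is the only one for which this density integrates to 1), and the closed forms R(t) = 100 / (t + 10)^2 and MTTF = 10 years are derived here from that PDF rather than taken from the slides.

```python
import math


def pdf(t):
    """Assumed widget PDF (t in years): f(t) = 200 / (t + 10)^3, t >= 0."""
    return 200.0 / (t + 10.0) ** 3


def reliability(t):
    """(a) R(t) = integral of f from t to infinity = 100 / (t + 10)^2."""
    return 100.0 / (t + 10.0) ** 2


# (b) MTTF = integral of R(t) from 0 to infinity = 100 / 10 = 10 years.
MTTF_YEARS = 10.0


def design_life(target):
    """(c) Solve 100 / (t + 10)^2 = target for t."""
    return 10.0 / math.sqrt(target) - 10.0


def hazard(t):
    """(d) lambda(t) = f(t) / R(t) = 2 / (t + 10), which decreases with t."""
    return pdf(t) / reliability(t)


def conditional_reliability(t, t0):
    """(e) R(t | T0) = R(t + T0) / R(T0) after a burn-in of length T0."""
    return reliability(t + t0) / reliability(t0)


print(round(reliability(1.0), 3))                   # (a) 0.826
print(MTTF_YEARS)                                   # (b) 10.0
print(round(design_life(0.95), 2))                  # (c) 0.26
print(hazard(0.0) > hazard(5.0))                    # (d) True, i.e. DFR
print(round(conditional_reliability(1.0, 1.0), 3))  # (e) 0.84
```

Under these assumptions the hazard rate is decreasing, so a one-year burn-in raises the first-year reliability from about 0.826 to about 0.840.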
Module Summary
Review of what was covered in this module
Formalized the concept of reliability using
rules of probability.
Defined the MTTF and residual MTTF
Introduced the concept of hazard rate
Explained the bathtub curve and the
different types of hazard rates.
Comprehensive example tied all these
concepts together