International Journal of Automotive Technology, Vol. 10, No. 6, pp.
753−759 (2009) Copyright © 2009 KSAE
DOI 10.1007/s12239−009−0088−z 1229−9138/2009/049−13
DEVELOPMENT OF THE FMECA PROCESS AND ANALYSIS
METHODOLOGY FOR RAILROAD SYSTEMS
J. H. KIM 1,2), H. Y. JEONG 1)* and J. S. PARK 2)
1) Department of Mechanical Engineering, Sogang University, Seoul 121-742, Korea
2) Korea Railroad Research Institute, 360-1 Woram-dong, Uiwang-si, Gyeonggi 437-757, Korea
(Received 16 October 2007; Revised 20 March 2009)
ABSTRACT−FMECA (Failure Modes, Effects and Criticality Analysis) is a procedure used to identify potential failure
modes, determine the causes and effects of those failure modes, and mitigate or remove their effects on system functional performance. For
the last several decades, FMECA has been widely used in industry, and specialized versions of FMEA have been developed
for several industrial sectors. For instance, MIL-1629a, SAE-J1739 and IEC-60812 have been mainly used in the military,
automotive and electronics industries, respectively. However, there is no specialized FMECA method for the railroad industry
yet, despite a need for highly reliable systems. Thus, in this study three specifications, MIL-1629a, SAE-J1739 and IEC-
60812, were analyzed and compared with one another, and characteristics and requirements of railroad systems were
summarized. Then a specialized FMECA procedure for railroad systems was proposed based on the processes documented in
the specifications, characteristics and requirements of railroad systems. Finally, the procedure was applied to a railroad system
in order to validate its applicability.
KEY WORDS : Failure modes and effects analysis, Criticality analysis, Railroad
*Corresponding author. e-mail: [email protected]
1. INTRODUCTION
As technology evolves, consumer demand for high quality products increases (Ramakumar, 1993). Consequently, manufacturers make more of an effort to ensure the quality of their products; at the same time, they require methodologies to assess the quality or reliability of their products. In the past, manufacturers focused on quality control, to reduce the number of inferior quality products, but recently the focus began to shift to reliability analysis, to improve quality and manufacturing efficiency from the beginning of product development (SAE, 2000). As early as 1949, the US defense sector developed FMECA as a means of reliability analysis. FMECA is an analysis methodology by which all potential failure modes are found, the causes and effects of failure modes are analyzed, critical failure modes are selected and methods to mitigate or remove the effects of critical failure modes are provided (IEC, 2001a; MIL, 1980; SAE, 2000; SAE, 2001). This analysis methodology was standardized and specified in MIL-1629a by the US defense sector, and was later modified and specified into SAE-J1739 and SAE-ARP5580 by the automotive industry (SAE, 2000, 2001; MIL, 1980). In addition, these specifications were adapted for other industry sectors, resulting in specifications such as IEC-60812 and STUK-YTO-TR190 (IEC, 2001a; STUK, 2002). The main concept and basic procedure are the same in all FMECA specifications, but a detailed procedure must be adapted to a specific application for each industry. Railroad systems have to provide a reliable and safe means of transportation because people and cargo are transported by them. Thus, they also require methods to control design, manufacturing and maintenance processes in order to reduce costs while guaranteeing reliability, because the systems are expensive and their maintenance costs are high. However, there has been little research conducted on the FMECA process for railroad systems, and no specialized FMECA specification has thus far been proposed. Therefore, in this study, a specialized FMECA procedure for railroad systems was proposed by analyzing several FMECA specifications used in other industry sectors and considering the characteristics and requirements of railroad systems. Finally, the proposed procedure was applied to a railroad system in order to validate its applicability.
2. CHARACTERISTICS OF RAILROAD SYSTEMS
In order to propose a specialized FMECA method for railroad systems, it is first necessary to determine the characteristics of railroad systems. The main objective of railroad systems is to transport people or freight safely and quickly (Seo, 2000). In other words, the systems should be reliable (as few breakdowns, especially during transportation, as possible) and their safety should be guaranteed. Railroad systems function along with other systems, such
as traffic signal and control systems, so the system boundary of a train is very wide (Kim, 2009; Seo, 2000). In addition, since railroad systems are used for a long time (e.g., over 25 years), their maintenance costs are several times their development and manufacturing costs. Thus, it is necessary to maintain the systems efficiently. Railroad systems travel long distances, and they may experience different environmental conditions, such as temperature and humidity. Moreover, since railroad authorities ask railroad system manufacturers for different performance requirements, the manufacturers should make the systems accordingly (IEC, 2001b). These characteristics of railroad systems are summarized in Table 1.
Table 1. Characteristics of railroad systems.
1 High reliability requirements for punctuality and fast service
2 High safety requirements
3 Many interfaces between railroad systems and other control systems and between many subsystems
4 Huge system size
5 High maintenance cost
6 Various service environments
7 Different manufacturing requirements for different service providers
3. COMPARISON OF FMECA PROCEDURES
FMECA procedures used in other industrial sectors are almost the same in terms of basic concepts and preparation. An FMECA procedure consists of a collection of information, creation of documents and preparation of reports (IEC, 2001a; MIL, 1980). In this study, it was believed that the characteristics of each FMECA procedure could be determined by analyzing the FMECA worksheets. Thus, mainly the worksheets of SAE-J1739, used in the automotive industry; MIL-1629a, used in the military industry; and IEC-60812, used in the electronics industry, were analyzed.
Figure 1. Worksheet specified in SAE-J1739.
Figure 2. Worksheet specified in MIL-1629a.
Figure 3. Criticality analysis worksheet.
Figure 4. Maintenance analysis worksheet.
Table 2. Limitations of RPN.
1 The same weight is given to the severity, the occurrence rate and the detection rate.
2 Only 120 different RPN’s actually exist even though it looks like 1000 RPN’s exist.
3 The same change in one parameter affects RPN more when the other two parameters are bigger (e.g., the same change in one parameter can change the RPN by only 4 in one case but by 25 in another).
Table 5. Frequency classification in IEC-62278.
Frequency     Description
Frequent      Likely to occur frequently. The hazard will be continually experienced.
Probable      Will occur several times. The hazard can be expected to occur often.
Occasional    Likely to occur several times. The hazard can be expected to occur several times.
Remote        Likely to occur sometime in the system life cycle. The hazard can reasonably be expected to occur.
Improbable    Unlikely to occur but possible. It can be assumed that the hazard may exceptionally occur.
Incredible    Extremely unlikely to occur. It can be assumed that the hazard may not occur.
3.1. SAE-J1739
A worksheet specified for SAE-J1739 is shown in Figure 1. A major characteristic of SAE-J1739 is that FMECA consists of two analyses; one is FMEA, and the other is CA (Criticality Analysis). As Figure 1 shows, both the failure mode and criticality can be analyzed on one sheet. The criticality level is an outcome of CA based on RPN (Risk Priority Number), which is the product of the severity, occurrence rate and detection rate. That is, when the worksheet is used, both FMEA and CA can be conducted simultaneously. However, it is more likely for subjective opinions of the people involved in the analyses to play a role because an RPN is determined for every cause of a failure mode, and there are problems in determining RPN’s. An RPN is determined from the severity (S), occurrence rate (O) and detection rate (D), as shown in Equation (1).
$\mathrm{RPN} = S \times O \times D$ (1)
The problems of using RPN’s as a means to represent the criticality level have been pointed out (IEC, 2001a, 2001b; Seo, 2000; Stamatics, 2003; STUK, 2002), and research has been conducted to develop other analysis methods without such problems (Kmenta, 2004; Pillay, 2003; Puente, 2002; Rhee, 2003). The limitations or problems of using RPN’s are summarized in Table 2.
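As a rough illustration of Equation (1) and of the second and third limitations in Table 2, the following Python sketch enumerates every rating combination and counts the distinct RPN values. It assumes that S, O and D are each rated on an integer scale from 1 to 10, which is the scale implied by the 1000 combinations mentioned in Table 2. The function name rpn and the sample ratings used for the sensitivity check are illustrative choices, not values taken from the worksheets.

```python
from itertools import product

SCALE = range(1, 11)  # assumed 1-10 integer rating scale for S, O and D


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Equation (1): RPN = S x O x D."""
    return severity * occurrence * detection


# Every (S, O, D) combination on the assumed scale: 10 * 10 * 10 of them.
all_ratings = list(product(SCALE, repeat=3))

# Limitation 2 in Table 2: many combinations collapse onto the same product,
# so there are far fewer distinct RPN values than combinations.
distinct_rpns = {rpn(s, o, d) for s, o, d in all_ratings}

# Limitation 3 in Table 2: the same one-step change in D moves the RPN
# further when S and O are larger (hypothetical sample ratings).
small_step = rpn(2, 2, 2) - rpn(2, 2, 1)  # S = O = 2
large_step = rpn(5, 5, 2) - rpn(5, 5, 1)  # S = O = 5

if __name__ == "__main__":
    print(len(all_ratings), len(distinct_rpns))
    print(small_step, large_step)
```

Under these assumptions the script reports 1000 combinations but only 120 distinct RPN values, and the same one-step change in D shifts the RPN by 4 in the first case and by 25 in the second, which matches the figures quoted in Table 2.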
In SAE-J1739, FMECA is classified into Design FMECA and Process FMECA, and explanations are given separately for each (SAE, 2000; Stamatics, 2003). However, the principles in the FMECA’s are the same, and the FMECA’s are differentiated simply based on which stage in the whole engineering process the FMECA is applied to. Therefore, in this study, FMECA was not classified as in SAE-J1739.
3.2. MIL-1629a
A worksheet specified for MIL-1629a is shown in Figure 2. The first characteristic of this specification is that FMEA should be conducted first, followed by CA, unlike SAE-J1739. In other words, serious failure modes are selected first in FMEA, and CA is conducted only for the serious failure modes. Since CA is not conducted for all failure modes, it can be conducted efficiently for a huge system with many subsystems, like railroad systems. In addition, since the railroad service should be reliable and punctual, minor failure modes may be allowed to occur although serious failures, which may stop railroad service, should not be allowed. Thus, it may be better to conduct FMEA and CA separately in railroad systems, as specified in MIL-1629a.
The second characteristic of MIL-1629a is that MA (Maintainability Analysis) is also specified in a way such that MA can be conducted based on the outcomes of FMEA and CA. System designers should mention failure modes, failure indicators, failure effects, severity classes, detection methods and basic maintenance means for maintenance personnel to use. Thus, by conducting MA, the basic maintenance means that designers can think of can be delivered to maintenance personnel, and information gathered by maintenance personnel can be delivered to designers. Since the maintenance cost for railroad systems is huge, it is necessary to conduct MA, as specified in MIL-1629a. CA and MA worksheets are shown in Figures 3 and 4, respectively.
Figure 5. Criticality matrix in MIL-1629a.
Table 4. Criticality class classification in IEC-60812.
Criticality class    Probability of occurrence
1 or E               0 ≤ Pi < 0.001
2 or D               0.001 ≤ Pi < 0.01
3 or C               0.01 ≤ Pi < 0.1
4 or B               0.1 ≤ Pi < 0.2
5 or A               Pi ≥ 0.2
The third characteristic of MIL-1629a is that, unlike SAE-J1739, the criticality number (or criticality class) is assigned not to a failure cause, but to a failure mode. The criticality number for a failure mode can be calculated by using Equation (2).
$C_m = \lambda \times \alpha \times \beta \times t$ (2)
Here, λ is the failure rate for a failure mode, α is the rate of failure due to a cause if there are many causes for the same failure mode (for some systems, the probability of failure of causes is available in a database (RIAC, 2007)), β is the conditional probability and t is the duration of the applicable mission phase, usually expressed in hours or number of operating cycles. Note that the criticality number is a quantitative
measure of the occurrence frequency of a failure mode. However, the severity level can be qualitatively determined as Class I (catastrophic), Class II (critical), Class III (marginal) or Class IV (minor), as shown in Table 3. Then, the criticality matrix, as shown in Figure 5, can be constructed by assigning the criticality number on the ordinate and severity class on the abscissa. High criticality levels are assigned for the failure modes close to the upper right-hand corner, requiring high corrective action priorities.
Table 3. Severity classification in MIL-1629a.
Category I - Catastrophic    A failure which may cause death or weapon system loss (i.e., aircraft, tank, missile, ship, etc.).
Category II - Critical       A failure which may cause severe injury, major property damage, or major system damage which will result in mission loss.
Category III - Marginal      A failure which may cause minor injury, minor property damage, or minor system damage which will result in delay or loss of availability or mission degradation.
Category IV - Minor          A failure not serious enough to cause injury, property damage, or system damage, but which will result in unscheduled maintenance or repair.
In MIL-1629a, the failure effect is classified into local, next level and end effects. The local effect only affects items at the same level; the next level effect affects items at the next level in the system hierarchy, and the end effect affects the system. By analyzing the three effects, it is possible to understand the flow of the effect that a failure causes on system performance. Since there are many subsystems and interfaces in railroad systems, it is better to understand the flow of effect, and the FMECA procedure specified in MIL-1629a is more applicable to railroad systems from this point of view as well.
Finally, there is a column for mission phase/operational mode in MIL-1629a, as shown in Figure 2. Since a system may have different functionalities under different conditions, the column can be used to mention these differences.
3.3. IEC-60812
The process specified in IEC-60812 is similar to the one specified in MIL-1629a, and the worksheet for IEC-60812 is shown in Figure 6. However, unlike in MIL-1629a, the criticality class must only be determined qualitatively in IEC-60812. A criticality number calculated by Equation (2) should be plugged into Equation (3) to be converted to the occurrence probability, Pi. Then, according to Table 4, a criticality class can be determined. That is, a criticality number represented quantitatively is converted to a qualitative measure of a criticality class. When the criticality class is shown along with the severity class, as shown in Figure 7, the risk level (or criticality level) can be assessed. A case with high criticality and severity classes is at high risk, but a case with low criticality and severity classes is at low risk.
$P_i = 1 - e^{-C_m}$ (3)
Figure 6. Worksheet specified in IEC-60812.
Figure 7. Criticality matrix in IEC-60812.
Figure 8. FMECA worksheet specialized for railroad systems.
4. FMECA PROCEDURE SPECIALIZED FOR RAILROAD SYSTEMS
Based on the characteristics of MIL-1629a, IEC-60812 and SAE-J1739, analyzed in the previous sections, a specialized FMECA procedure for railroad systems was proposed in this study, and the proposed FMECA worksheet is shown in Figure 8. It calls for performing the FMEA and CA on one worksheet, as in SAE-J1739. Since the major objective of railroad systems is to transport passengers or freight safely to the destination in a punctual manner, safety and reliability are very important. Thus, it is proposed that failure
modes be analyzed through FMEA and the criticality level of failure modes be indicated with respect to safety and reliability. In addition, in FMEA, it is proposed that the local, next level and end effects be specified, as in MIL-1629a and IEC-60812. Since there are many interfaces between trains and signal and control systems, it is necessary to clarify the effect of failure between interfaces as well.
In addition, in CA, it is proposed that a criticality class be determined for each failure mode. A criticality class can be determined by the method specified in IEC-60812, or it can be determined from the occurrence frequency, as specified in IEC-62278 (also shown in Table 5), which is widely used as the international standard. In addition, it is proposed that the risk matrix (or criticality matrix) be used, as specified in IEC-62278 (also shown in Figure 9), which is the railroad system RAMS (Reliability, Availability, Maintainability and Safety) standard.
Figure 9. Risk matrix for railroad systems in IEC-62278.
Finally, maintenance tasks are added to perform the additional maintenance analysis for serious failure modes, as in MIL-1629a. If MA is performed by using FMECA results, designer opinions can be delivered to maintenance personnel directly so that maintenance can be performed efficiently and accurately.
5. CASE STUDY OF FMECA SPECIALIZED FOR RAILROAD SYSTEM
FMECA was performed for the EMU (Electric Multiple Unit) subsystems to validate the applicability of the proposed FMECA process to railroad systems. First, failure records for the A and B lines of a domestic railroad authority, from January 2000 to June 2006, were collected to select feasible EMU subsystems to which the FMECA process could be applied. All failure records exceeding a 10 minute delay were collected in the case of the A line, but all kinds of failures, including minor failures which drivers could repair temporarily, were collected in the case of the B line. Failures exceeding a 10 minute delay occurred only in the traction control (23 cases) and brake systems (11 cases) in the A line, and most of the reported 221 failures, including minor failures, occurred in the brake system in the B line. Thus, the brake system was chosen as an EMU subsystem for FMECA processing.
The brake system comprises 30 sub-devices, such as the ECU, BOU (pressure sensor unit, EP valve, variable load valve, pressure regulating valve) and basic brake devices (air compressor, air dryer, disc and lining). Therefore, FMECA was performed for all 30 sub-devices, and FMECA samples are shown in Figure 10. For each sub-device, there exist several failure modes and causes. For example, there are three failure modes and causes in the emergency valve. Coil failure was the cause of the ‘bad de-energized position’ failure mode, spool valve failure was the cause of the ‘bad energized position’ failure mode and a bad seal in the ‘emergency valve’ was the cause of the ‘air leakage’ failure mode. After FMECA was performed for the 30 sub-devices of the brake system, a total of 62 failure modes and causes were found.
Next, the effect for each failure mode was classified as the local, next and end effect, and the criticality level was determined at the final stage of FMECA. The local effect was defined as an effect that affects its own sub-device, the next effect was defined as an effect that affects the brake system, and the end effect was defined as an effect that affects the EMU. As shown in Figure 10, the effect of failure modes in the ‘service analog converter’ of the EP valve was classified. The ‘bad de-energized position’ failure mode, due to coil failure, has the local effect of ‘failure to create service brake pressure,’ next effect of ‘failure to operate service brake’ and end effect of ‘fail.’ The ‘bad energized position’ failure mode, due to spool valve failure, has the local effect of ‘failure to release pressure,’ next effect of ‘failure to release brake’ and end effect of ‘fail.’ In addition, as for the ‘air leakage’ failure mode due to bad seal failure, ‘air leakage,’ ‘increase in valve operation time’ and ‘malfunction’ are the local, next and end effect, respectively.
Each failure mode may have a different criticality level, even in the same device. The criticality level of ‘bad energized position’ due to spool valve failure among the three failure modes of the ‘service analog converter’ is very low and considered ‘Negligible,’ according to the railroad system risk matrix shown in Figure 9, but the criticality levels of ‘bad de-energized position’ due to coil failure and ‘air leakage’ due to a bad seal are slightly higher and considered ‘Tolerable.’ The difference in the criticality level stemmed from the difference in frequency and severity classes of each failure mode. The three failure modes have the same frequency, described as ‘Unlikely to occur but possible’ or ‘It can be assumed that the hazard may exceptionally occur.’ However, the ‘bad energized position’ failure mode due to spool valve failure may cause severe system damage, and its severity is ‘Marginal,’ but the ‘bad de-energized position’ failure mode due to coil failure and the ‘air leakage’ failure mode due to a bad seal may lead to loss of a major system, so their severity is ‘Critical.’ Thus, it is necessary to set a higher maintenance priority for coil failure and bad seal over spool valve failure to enhance the reliability and safety of the ‘service analog converter,’ which is a brake system sub-device. Note that the order of the task priority in maintenance can be set to be the same because the criticality
levels of the failure modes are still at an allowable level. However, if the criticality level is ‘Undesirable’ or ‘Intolerable,’ a higher maintenance priority should be set and an effort for improvement, such as design change and supplementary device installation, should be made.
Figure 10. FMECA samples of EMU brake system.
An additional column regarding the maintenance task is added to each analyzed failure mode. In railroad systems, by conducting MA with consideration of FMECA results, the maintenance task necessary for failure modes can be delivered to maintenance personnel accurately, and failures can be repaired appropriately with less expense. As shown in Figure 10, maintenance tasks were actually evaluated based on the FMECA results with brake system designers and maintenance personnel of the A railroad authority, and the maintenance task and schedule could be appropriately proposed. Therefore, it is confirmed through this case study that the proposed FMECA process is appropriately and effectively applicable to railroad systems.
6. CONCLUSIONS
In this paper, the FMEA standards used in other industries, especially MIL-1629a from the military industry, SAE-J1739 from the automotive industry and IEC-60812 from the electronics industry, were analyzed, and the characteristics, strengths and weaknesses of each standard were defined. In addition, the characteristics and requirements of railroad systems were analyzed. Based on the results of the analyses, an FMECA process and MA applicable to railroad systems were proposed. Moreover, FMECA and MA were performed for the EMU subsystems, and the applicability of the proposed FMECA process in railroad systems was validated.
REFERENCES
IEC (2001a). Analysis Techniques for System Reliability-Procedure for Failure Mode and Effects Analysis (FMEA). IEC Standard. IEC-60812. Int. Electrotechnical Commission.
IEC (2001b). Railway Applications-Specification and Demonstration of Reliability, Availability, Maintainability and Safety (RAMS). IEC Standard. IEC-62278. Int. Electrotechnical Commission.
Kim, H. J., Bae, C. H., Kim, S. H., Lee, H. Y., Park, K. J. and Suh, M. W. (2009). Development of a knowledge-based hybrid failure diagnosis system for urban transit. Int. J. Automotive Technology 10, 1, 123−129.
Kmenta, S. and Ishii, K. (2004). Scenario-based failure modes and effects analysis using expected cost. J. Mechanical Design, 126, 1027−1035.
MIL (1980). Procedures for Performing a Failure Mode, Effects and Criticality Analysis. Military Standard. MIL-1629a. US Department of Defense.
Pillay, A. and Wang, J. (2003). Modified failure mode and effects analysis using approximate reasoning. Reliability Engineering and System Safety, 79, 69−85.
Puente, J., Pino, R., Priore, P. and Fuente, D. (2002). A decision support system for applying failure mode and effects analysis. Int. J. Quality & Reliability Management, 19, 137−150.
Ramakumar, R. (1993). Engineering Reliability: Fundamentals and Applications. Prentice-Hall. New Jersey.
Rhee, S. J. and Ishii, K. (2003). Using cost based FMEA to enhance reliability and serviceability. Advanced Engineering Informatics, 17, 179−188.
RIAC (2007). Failure Mode/Mechanism Distribution-FMD97. Reliability Information Analysis Center, New York.
SAE (2000). Potential Failure Mode and Effects Analysis in Design (Design FMEA), Potential Failure Mode and Effects Analysis in Manufacturing and Assembly Processes (Process FMEA). SAE Standard. SAE-J1739. SAE.
SAE (2001). Failure Modes, Effects, and Criticality Analysis Procedures. SAE Standard. SAE-ARP5580. SAE.
Seo, S.-B. (2000). Understanding of Railroad Engineering. Eol and Al (in Korean). Seoul.
Stamatics, D. H. (2003). Failure Mode Effect Analysis: FMEA from Theory to Execution. American Society for Quality. Milwaukee.
STUK (2002). Failure Mode and Effects Analysis of Software-Based Automation Systems. STUK Standard. STUK-YTO-TR 190. The Radiation and Nuclear Safety Authority, Finland.