The System Safety Program
In the practice of occupational safety and health in industry today, the primary concern
of any responsible organization is the identification and elimination of hazards that
threaten the life and/or health of employees, as well as those which could cause damage
to facilities, property, equipment, products, and/or the environment. When such risk of
hazard cannot be totally eliminated, as is often the case, it becomes a fundamental
function of the safety professional to provide recommendations to control those hazards
in an effort to reduce the associated risk to the lowest acceptable levels.
It is the intention of this Basic Guide to System Safety to demonstrate the effectiveness
of the system safety process in identifying and eliminating hazards and in recommending
risk reduction techniques and methods for controlling residual hazard risk.
Part I will introduce the reader to the system safety process, how it evolved, how it can
be managed, and how it relates to the current practice of the industrial safety and health
professional. In fact, upon completion of Part I, the reader should have developed a clear
understanding of this relationship and, quite possibly, have developed an interest in the
further pursuit of the system safety profession. As noted in the Preface, the information
provided here is introductory in scope, intended to merely acquaint the reader with the
system safety approach to hazard analysis and hazard risk reduction.
As a separate discipline, system safety had its origins in the aviation and aerospace
industries. System safety has proven its worth in the dramatic improvements in
aviation safety over the past 60 years. It is not by chance that flying is demonstrably the
safest mode of travel, and this accomplishment has led to an undeniable understanding
that all modern systems require a more logical, focused approach to identifying and
controlling hazards. System safety is no longer a discipline reserved for the aerospace
designer and nuclear engineer; it is the most effective method of improving the safety of
any modern operation. As it has developed and matured, system safety has moved away
from being the exclusive domain of design engineers and has become less mathematical
or abstract and is now more practical and realistic. Modern concepts of system safety
can be used by any organization or person who wants a logical, visible, and traceable
method of identifying and controlling safety hazards and this is the objective of the
Basic Guide to System Safety.
System Safety: An Overview
BACKGROUND
The idea or concept of system safety can be traced to the missile production industry of
the late 1940s. It was further defined as a separate discipline by the late 1950s and
early 1960s (Roland and Moriarty 1983), used primarily by the missile, aviation,
and aerospace communities. Prior to the 1940s, system designers and engineers relied
predominantly on a trial-and-error method of achieving safe design. This approach was
somewhat successful in an era when systems were relatively simple compared
with those of subsequent development. For example, in the early days of the aviation
industry, this process was often referred to as the “fly-fix-fly” approach to design
problems (Roland and Moriarty 1983; Stephenson 1991) or, more accurately, “safety-
by-accident.” Simply stated, an aircraft was designed based upon existing or known
technology. It was then flown until problems developed or, in the worst case, it crashed
(Figure 1.1). If design errors were determined as the cause (as opposed to human, or
“pilot” error), then the design problems would be fixed and the aircraft would fly again.
Obviously, this method of after-the-fact design safety worked well when aircraft flew
low and slow and were constructed of wood, wire, and cloth. However, as systems grew
more complex and aircraft capabilities such as airspeed and maneuverability increased,
so did the likelihood of devastating results from a failure of the system or one of its
many subtle interfaces. This is clearly demonstrated in the early days of the aerospace
era (the 1950s and 1960s). As the industry began to develop jet powered aircraft and
space and missile systems, it quickly became clear that engineers could no longer wait
for problems to develop; they had to anticipate them and “fix” them before they
occurred. To put it another way: the “fly-fix-fly” philosophy was no longer feasible.
Elements such as these became the catalyst for the development of systems engineering,
out of which eventually grew the concept of system safety. The need to anticipate and
fix problems before they occurred led to a new approach: a consideration of the design
as a “system.” This means that all aspects of the design or operation (e.g., machine,
operator, and environment) must be considered in identifying potential hazards and
establishing appropriate controls. Another important part of this “systems” approach to
safety is the realization that resources for safety are limited and there must be some
logical, reasoned way to apply resources to the most serious potential problems.
Systems safety provides this capability. Figure 1.2 shows a simplification of the basic
elements of the systems engineering process. It is noted that safety comprises only one
part of this integrated engineering design approach (Larson and Hann 1990). Taken one
step further, Figure 1.3 demonstrates how the systems approach associated with the
initial element of the systems safety engineering process—the design aspect—can
support the identification of hazards in the earliest phases of a project life cycle. Only
after the accurate identification of hazards can proper elimination or control measures
be determined.
FUNDAMENTALS
Since its initial development a half-century ago, the system safety discipline has
experienced dramatic change and growth. Some analysts have compared this
rapid development to the humorous analogy of a man who walked into a doctor’s office
with a frog growing from his forehead. When the doctor asked, “How did it happen?”
the frog replied, “It started as a pimple on my rear end!” (Olson, undated). Although, as
defined in Chapter 1, system safety has emerged as a subdiscipline within systems
engineering, it has quickly become an essential element of the safety planning process
in many industries including nuclear, aerospace/aviation, refining, healthcare, and so on.
In order to properly understand system safety as utilized in this text, a fundamental
understanding of some basic safety concepts, principles, and terms must first be
examined. The following definitions, from the Glossary of Terms, are therefore
provided here for discussion purposes:
System: A combination of people, procedures, facility, and/or equipment all functioning
within a given or specified working environment to accomplish a specific task or set of
tasks (Stephenson 1991).
Safety: A measure of the degree of freedom from risk or conditions that can cause
death, physical harm, or equipment/property damage (Leveson 1986). Note: assumption
of risk is an essential ingredient of system safety philosophy.
System Safety Precedence: An ordered listing of preferred methods of eliminating or
controlling hazards (MIL-STD-882).
Hazard: A condition or situation which exists within the working environment capable
of causing harm, injury, and/or damage.
Hazard Severity: A categorical description of hazard level based upon real or perceived
potential for causing harm, injury, and/or damage.
Hazard Probability: The likelihood that a condition or set of conditions will exist in a
given situation or operating environment.
Mishap: An occurrence which results in injury, damage, or both.
Near-miss: An occurrence which could have resulted in injury, damage, or both, but did
not.
Risk: The likelihood or possibility of hazard consequences in terms of severity and
probability (Stephenson 1991).
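To make the relationship between the last definitions concrete, the following Python sketch combines a hazard severity category and a hazard probability level into a single risk ranking. It is a minimal illustration only: the category names, the numeric weights, the multiplication rule, and the thresholds are assumptions made for this example and are not taken from MIL-STD-882 or from the sources cited above. The example hazards are hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Illustrative severity categories (assumed weights, not from a standard)
    NEGLIGIBLE = 1
    MARGINAL = 2
    CRITICAL = 3
    CATASTROPHIC = 4


class Probability(IntEnum):
    # Illustrative probability levels (assumed weights, not from a standard)
    IMPROBABLE = 1
    REMOTE = 2
    OCCASIONAL = 3
    PROBABLE = 4
    FREQUENT = 5


def risk_score(severity: Severity, probability: Probability) -> int:
    """Express risk as a function of severity and probability (assumed: product)."""
    return int(severity) * int(probability)


def risk_level(score: int) -> str:
    """Map a score to a coarse level; the thresholds are arbitrary assumptions."""
    if score >= 12:
        return "high"
    if score >= 6:
        return "medium"
    return "low"


def rank_hazards(hazards: dict) -> list:
    """Return (name, score) pairs sorted from highest to lowest risk score."""
    scored = [(name, risk_score(sev, prob)) for name, (sev, prob) in hazards.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical hazards, used only to exercise the functions above
HAZARDS = {
    "unguarded press": (Severity.CRITICAL, Probability.PROBABLE),
    "wet floor": (Severity.MARGINAL, Probability.OCCASIONAL),
    "frayed cable": (Severity.CATASTROPHIC, Probability.REMOTE),
}

if __name__ == "__main__":
    for name, score in rank_hazards(HAZARDS):
        print(f"{name}: score={score}, level={risk_level(score)}")
```

Sorting by score reflects the point made earlier that safety resources are limited and should be applied first to the most serious potential problems. A real program would use the categories and acceptance criteria of its governing standard instead of these placeholder values.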
Introduction
‘a short, sudden and unexpected event or occurrence that results in an unwanted and undesirable
outcome … and must directly or indirectly be the result of human activity rather than a natural
event’. (Hollnagel, 2004, p. 5)
Accident prevention is the most basic of all safety management paradigms. If safety
management is effective, then there should be an absence of accidents. Conversely, if
accidents are occurring then effective safety management must be absent. Therefore,
understanding how accidents occur is fundamental to establishing interventions to
prevent their occurrence. A simple nexus it would seem, yet the reality is accidents are
complex events, seldom the result of a single failure, and that complexity has made
understanding how accidents occur problematic since the dawn of the industrial
revolution.
In an attempt to unravel the accident causation mystery, over the years authors have
developed a plethora of conceptual models. At first glance they appear to be as diverse
and disparate as the accident problem they purport to help solve, yet closer scrutiny
reveals there are some common themes. There are linear models, which suggest that one
factor leads to the next and to the next, leading up to the accident, and complex
non-linear models, which hypothesise that multiple factors act concurrently and, by their
combined influence, lead to accident occurrence. Some models have strengths in aiding
understanding how accidents occur in theory. Others are useful for supporting accident
investigations, to systematically analyse an accident in order to gain understanding of
the causal factors so that effective corrective actions can be determined and applied.
Accident models affect the way people think about safety, how they identify and analyse risk
factors and how they measure performance … they can be used in both reactive and proactive
safety management … and many models are based on an idea of causality ... accidents are thus
the result of technical failures, human errors or organisational problems. (Hovden, Albrechtsen
and Herrera, 2010, p. 855).
This chapter builds on the discussion of hazard as a concept to trace the evolution of
thinking about accident causation through the models developed over time; it thus forms
a vital foundation for developing the conceptual framework identified as an essential
component of professional OHS practice. The importance of models of causation to
OHS professional practice is highlighted by Kletz:
To an outsider it might appear that industrial accidents occur because we do not know
how to prevent them. In fact, they occur because we do not use the knowledge that is
available. Organisations do not learn from the past … and the organisation as a whole
forgets. (1993)
Historical context
Perhaps the earliest well documented application of accident causation knowledge is
that of the Du Pont company which was founded in 1802 with a strong emphasis on
accident prevention and mitigation. Klein (2009), in a paper entitled “Two Centuries of
Process Safety at DuPont” reported that the company founder E.I. Du Pont (1772 –
1834) had once noted “we must seek to understand the hazards we live with”. The
design and operation of Du Pont explosives factories, over the next 120 years, were
gradually improved as a result of a consistent effort to understand how catastrophic
explosions were caused and prevented. In that period many of the principles of modern
accident prevention theory were formulated. By 1891 management accountability for
safe operations was identified as a necessary precept to such an extent that the original
Du Pont plant design included a requirement for the Director’s house, in which Du Pont
himself, his wife and seven children lived, to be constructed within the plant precinct, a
powerful incentive indeed to gain an understanding of accident causation. As described
by DeBlois (1915), the first head of DuPont’s Safety Division, elimination of hazards
was recognised as the priority in 1915 and a goal of zero injuries was also established at
that time. Amongst a list of other safety management initiatives which would still be
considered appropriate in today’s companies’ safety programs, the Du Pont Safety
Division was established in their Engineering Department in 1915 and carried out plant
inspections, conducted special investigations and analysed accidents.
Accident research was also reported as being part of the work of the British Industrial
Health Board between the two World Wars (Surry, 1969). Surry cited Greenwood and
Woods’ (1919) statistical analysis of injuries in a munitions factory and Newbold’s
(1926) study of thirteen factories, which also reviewed injuries, as purportedly the first
research work into industrial accidents. Various other studies around the time (Osborne,
Vernon & Muscio 1922; Vernon 1919; 1920; Vernon, Bedford & Warner 1928) examined
previously unresearched areas of working conditions such as humidity, work hours,
workers’ age, experience and absenteeism rates. Surry also reported that the appearance
of applied psychologists influenced research studies to focus on ‘human output’ and
during the 1930s attention was directed towards the study of individual accident
proneness. Surry noted that “pure accident research declined after 1940 while the study
of performance influencing factors has flourished” (p. 17).
The history of accident modelling itself can be traced back to the original work by
Herbert W. Heinrich, whose 1931 book Industrial Accident Prevention became the
first major work on understanding accidents. Heinrich stated that his fundamental
principles for applying science to accident prevention were that it should be: “(1) through
the creation and maintenance of an active interest in safety; (2) be fact finding; and (3)
lead to corrective action based on the facts” (Heinrich, 1931, p. 6). Heinrich’s book,
now in its 5th edition, attempted to understand the sequential factors leading to an
accident and ushered in what can be termed a period of simple sequential linear
accident modelling. While sequential linear models offered an easy visual representation
of the ‘path’ of causal factor development leading to an accident they did not escape the
widely accepted linear time aspect of events which is tied into the “Western cultural
world-view of past, present and future as being part of everyday logic, prediction and
linear causation” (Buzsáki, 2006, p. 8).
Evolution of models of accident causation
The history of accident models to date can be traced from the 1920s through three
distinct phases (Figure 1):
Simple linear models
Complex linear models
Complex non-linear models (Hollnagel, 2010).
Each type of model is underpinned by specific assumptions:
The simple linear models assume that accidents are the culmination of a series of events
or circumstances which interact sequentially with each other in a linear fashion and thus
accidents are preventable by eliminating one of the causes in the linear sequence.
Complex linear models are based on the presumption that accidents are a result of a
combination of unsafe acts and latent hazard conditions within the system which follow
a linear path. The factors furthest from the accident are attributed to actions of the
organisation or environment, while the factors at the sharp end are those where humans
interact most closely with the accident. The resultant assumption is that accidents can be
prevented by focusing on strengthening barriers and defences.
The new generation of thinking about accident modelling has moved towards
recognising that accident models need to be non-linear; that accidents can be thought of
as resulting from combinations of mutually interacting variables which occur in real
world environments and it is only through understanding the combination and
interaction of these multiple factors that accidents can truly be understood and
prevented (Hollnagel, 2010).
Figure 1 portrays the temporal development of the three types of model and their
underpinning principle. The types of model, their evolution, together with representative
examples are described in the following sections.
Heinrich’s Domino Theory
The first sequential accident model was the ‘Domino effect’ or ‘Domino theory’
(Heinrich, 1931). The model is based on the assumption that:
the occurrence of a preventable injury is the natural culmination of a series of events or
circumstances, which invariably occur in a fixed or logical order … an accident is merely a link
in the chain. (p. 14).
This model proposed that certain accident factors could be thought of as being lined up
sequentially like dominos. Heinrich proposed that an:
… accident is one of five factors in a sequence that results in an injury … an injury is invariably
caused by an accident and the accident in turn is always the result of the factor that immediately
precedes it. In accident prevention the bull’s eye of the target is in the middle of the sequence –
an unsafe act of a person or a mechanical or physical hazard (p. 13).
Heinrich’s five factors were:
Social environment/ancestry
Fault of the person
Unsafe acts, mechanical and physical hazards
Accident
Injury
Extending the domino metaphor, an accident was considered to occur when one of the
dominos or accident factors falls and has an ongoing knock-down effect ultimately
resulting in an accident (Figure 2).
Based on the domino model, accidents could be prevented by removing one of the
factors and so interrupting the knockdown effect. Heinrich proposed that unsafe acts
and mechanical hazards constituted the central factor in the accident sequence and that
removal of this central factor made the preceding factors ineffective. He focused on the
human factor, which he termed “Man Failure”, as the cause of most accidents. Giving
credence to this proposal, an actuarial analysis of 75,000 insurance claims attributed some
88% of preventable accidents to unsafe acts of persons and 10% to unsafe mechanical or
physical conditions, with the last 2% acknowledged as unpreventable. This analysis
gave rise to Heinrich’s chart of direct and proximate causes (Heinrich, 1931, p. 19)
(Figure 3).
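The knock-down logic of the domino model can be sketched in a few lines of Python. The sketch below is only an illustration of the sequence as described in this section: the five factor names come from the list above, while the function names and the idea of representing a removed factor as a break in the chain are assumptions made for the example.

```python
# Heinrich's five factors, in the order given in the text
DOMINOES = [
    "Social environment/ancestry",
    "Fault of the person",
    "Unsafe acts, mechanical and physical hazards",
    "Accident",
    "Injury",
]


def fallen_dominoes(removed=frozenset()):
    """Return the factors that fall, in order, once the first domino is tipped.

    A removed factor leaves a gap in the line, so the knock-down effect
    stops there and no later factor falls (hypothetical helper).
    """
    fallen = []
    for factor in DOMINOES:
        if factor in removed:
            break  # the chain is interrupted at the removed factor
        fallen.append(factor)
    return fallen


def injury_occurs(removed=frozenset()):
    """True only if the sequence runs unbroken through to the final factor."""
    return fallen_dominoes(removed)[-1:] == ["Injury"]


if __name__ == "__main__":
    central = "Unsafe acts, mechanical and physical hazards"
    print(injury_occurs())                # full sequence
    print(injury_occurs({central}))       # central factor removed
    print(fallen_dominoes({central}))     # factors that still fall
```

Removing the central factor stops the sequence even though the two preceding factors still fall, which is the sense in which the model treats those preceding factors as ineffective. The sketch also shows the model's main limitation: it admits only one path and one cause at each step.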
Bird and Germain’s Loss Causation model
The sequential domino representation was continued by Bird and Germain (1985) who
acknowledged that Heinrich’s domino sequence had underpinned safety thinking for
over 30 years. They recognised the need for management to prevent and control
accidents in what were fast becoming highly complex situations due to the advances in
technology. They developed an updated domino model which they considered reflected
the direct management relationship with the causes and effects of accident loss and
incorporated arrows to show the multi-linear interactions of the cause and effect
sequence. This model became known as the Loss Causation Model and was again
represented by a line of five dominos, linked to each other in a linear sequence (Figure
4).
Complex linear models
Sequential models were attractive as they encouraged thinking around causal series.
They focus on the view that accidents happen in a linear way where A leads to B which
leads to C and examine the chain of events between multiple causal factors displayed in
a sequence usually from left to right. Accident prevention methods developed from
these sequential models focus on finding the root causes and eliminating them, or
putting in place barriers to encapsulate the causes. Sequential accident models were still
being developed in the 1970s but had begun to incorporate multiple events in the
sequential path. Key models developed in this evolutionary period include energy
damage models, time sequence models, epidemiological models and systemic models.
Energy-damage models
The initial statement of the concept of energy damage in the literature is often attributed
to Gibson (1961), but Viner (1991, p. 36) understands it to be a result of discussions
between Gibson, Haddon and others. The energy damage model (Figure 5) is based on
the supposition that “Damage (injury) is a result of an incident energy whose intensity at
the point of contact with the recipient exceeds the damage threshold of the recipient”
(Viner, 1991, p. 42).
In the Energy Damage Model the hazard is a source of potentially damaging energy and
an accident, injury or damage may result from the loss of control of the energy when
there is a failure of the hazard control mechanism. These mechanisms may include
physical or structural containment, barriers, processes and procedures. The space
transfer mechanism is the means by which the energy and the recipient are brought
together assuming that they are initially remote from each other. The recipient boundary
is the surface that is exposed and susceptible to the energy (Viner, 1991).
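The supposition behind the energy damage model can be expressed as a simple comparison between the energy intensity at the point of contact and the recipient's damage threshold. The Python sketch below is a minimal illustration under stated assumptions: the field names, the use of a single attenuation fraction to stand in for the space transfer mechanism, and the numeric values are all hypothetical and are not drawn from Viner (1991).

```python
from dataclasses import dataclass


@dataclass
class Exposure:
    source_intensity: float   # intensity of the energy source (arbitrary units)
    control_holds: bool       # True if the hazard control mechanism does not fail
    transfer_fraction: float  # assumed share of intensity reaching the recipient (0 to 1)
    damage_threshold: float   # intensity the recipient boundary can withstand


def intensity_at_contact(exposure: Exposure) -> float:
    """Intensity arriving at the recipient boundary.

    If the hazard control mechanism holds, no energy is released. Otherwise
    the released energy is scaled by the assumed transfer fraction.
    """
    if exposure.control_holds:
        return 0.0
    return exposure.source_intensity * exposure.transfer_fraction


def damage_occurs(exposure: Exposure) -> bool:
    """Damage results when contact intensity exceeds the damage threshold."""
    return intensity_at_contact(exposure) > exposure.damage_threshold


if __name__ == "__main__":
    released = Exposure(100.0, control_holds=False, transfer_fraction=0.5, damage_threshold=40.0)
    contained = Exposure(100.0, control_holds=True, transfer_fraction=0.5, damage_threshold=40.0)
    distant = Exposure(100.0, control_holds=False, transfer_fraction=0.25, damage_threshold=40.0)
    for case in (released, contained, distant):
        print(intensity_at_contact(case), damage_occurs(case))
```

Each field corresponds to a point of intervention named in the model: strengthening the control mechanism, reducing what the space transfer mechanism delivers, or raising the threshold at the recipient boundary.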