An Overview of Human Error
Drawn from J. Reason, Human Error, Cambridge, 1990
Aaron Brown CS 294-4 ROC Seminar
Outline
Human error and computer system failures
A theory of human error
Human error and accident theory
Addressing human error
Slide 2
Dependability and human error
Industry data shows that human error is the largest contributor to reduced dependability
HP HA labs: human error is #1 cause of failures (2001)
Oracle: half of DB failures due to human error (1999)
Gray/Tandem: 42% of failures from human administrator errors (1986)
Murphy/Gent study of VAX systems (1993):
[Chart: causes of VAX system crashes (% of crashes), 1985-1993: system management 53%, software failure 18%, hardware failure 10%, other 18%]
Slide 3
Learning from other fields: PSTN
FCC-collected data on outages in the US public-switched telephone network
metric: breakdown of customer calls blocked by system outages (excluding natural disasters), Jan-June 2001
[Pie chart: blocked calls by cause: human error (company), human error (external), hardware failure, software failure, overload, vandalism]
Human error accounts for 56% of all blocked calls
comparison with 1992-94 data shows that human error is the only factor that is not improving over time
Slide 4
Learning from other fields: PSTN
PSTN trends: 1992-94 vs. 2001 (millions of blocked customer minutes/month)

Cause                    1992-94    2001
Human error: company        98       176
Human error: external      100        75
Hardware                    49        49
Software                    15        12
Overload                   314        60
Vandalism                    5         3
Slide 5
Learning from experiments
Human error rates during maintenance of software RAID system
participants attempt to repair RAID disk failures
by replacing broken disk and reconstructing data
each participant repeated the task several times
data aggregated across 5 participants
[Table: errors observed in RAID repair trials, by error type and OS (each M denotes one error)
error types: Fatal Data Loss; Unsuccessful Repair; System ignored fatal input; User Error, Intervention Required; User Error, User Recovered
total number of trials: Windows 35, Solaris 33, Linux 31]
Slide 6
Learning from experiments
Errors occur despite experience:
[Graph: number of errors vs. iteration (0-9) for Windows, Solaris, and Linux]
Training and familiarity don't eliminate errors
  types of errors change: mistakes vs. slips/lapses
System design affects error-susceptibility
Slide 7
Outline
Human error and computer system failures
A theory of human error
Human error and accident theory
Addressing human error
Slide 8
A theory of human error
(distilled from J. Reason, Human Error, 1990)
Preliminaries: the three stages of cognitive processing for tasks
1) planning
a goal is identified and a sequence of actions is selected to reach the goal
2) storage
the selected plan is stored in memory until it is appropriate to carry it out
3) execution
the plan is implemented by the process of carrying out the actions specified by the plan
Slide 9
A theory of human error (2)
Each cognitive stage has an associated form of error
slips: execution stage
  incorrect execution of a planned action
  example: miskeyed command
lapses: storage stage
  incorrect omission of a stored, planned action
  examples: skipping a step on a checklist, forgetting to restore normal valve settings after maintenance
mistakes: planning stage
  the plan is not suitable for achieving the desired goal
  example: Three Mile Island (TMI) operators prematurely disabling HPI pumps
Slide 10
Origins of error: the GEMS model
GEMS: Generic Error-Modeling System
  an attempt to understand the origins of human error
GEMS identifies three levels of cognitive task processing
  skill-based: familiar, automatic procedural tasks
    usually low-level, like knowing to type ls to list files
  rule-based: tasks approached by pattern-matching from a set of internal problem-solving rules
    observed symptoms X mean system is in state Y; if system state is Y, I should probably do Z to fix it
  knowledge-based: tasks approached by reasoning from first principles
    when rules and experience don't apply
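The rule-based level can be caricatured as if-then pattern matching, falling through to knowledge-based reasoning when no rule fires. A toy sketch (the rules and symptoms here are invented for illustration, not from the source):

```python
# A caricature of rule-based (RB) processing: match a symptom against a
# fixed rule set; when no rule applies, escalate to knowledge-based (KB)
# first-principles reasoning. All rules below are hypothetical examples.
RULES = {
    "disk LED red":      ("disk failed", "replace disk"),
    "high ping latency": ("network congested", "throttle traffic"),
}

def rb_diagnose(symptom):
    """Pattern-match a symptom to a (state, action) pair."""
    if symptom in RULES:
        return RULES[symptom]
    # no rule matched: humans are reluctant to make this transition
    return ("unknown", "escalate to knowledge-based reasoning")

print(rb_diagnose("disk LED red"))   # ('disk failed', 'replace disk')
```

The fallback branch is where GEMS locates much of the trouble: operators tend to keep trying rules rather than take it.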
Slide 11
GEMS and errors
Errors can occur at each level
skill-based: slips and lapses
usually errors of inattention or misplaced attention
rule-based: mistakes
usually a result of picking an inappropriate rule
  causes: misconstrued view of state, over-zealous pattern matching, frequency gambling, deficient rules
knowledge-based: mistakes
due to incomplete/inaccurate understanding of system, confirmation bias, overconfidence, cognitive strain, ...
Errors can result from operating at wrong level
humans are reluctant to move from RB to KB level even if rules aren't working
Slide 12
Error frequencies
In raw frequencies, SB >> RB > KB
61% of errors are at skill-based level
27% of errors are at rule-based level
11% of errors are at knowledge-based level
But if we look at opportunities for error, the order reverses
humans perform vastly more SB tasks than RB, and vastly more RB than KB
so a given KB task is more likely to result in error than a given RB or SB task
Slide 13
Error detection and correction
Basic detection mechanism is self-monitoring
periodic attentional checks, measurement of progress toward goal, discovery of surprise inconsistencies, ...
Effectiveness of self-detection of errors
SB errors: 75-95% detected, avg 86%
  but some lapse-type errors were resistant to detection
RB errors: 50-90% detected, avg 73%
KB errors: 50-80% detected, avg 70%
Including correction tells a different story:
SB: ~70% of all errors detected and corrected
RB: ~50% detected and corrected
KB: ~25% detected and corrected
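Dividing the combined detect-and-correct rates by the average detection rates (both taken from this slide) shows that what degrades at the KB level is correction, not detection:

```python
# Combining the slide's figures: of the errors that ARE self-detected,
# what fraction also get corrected? The gap widens sharply with level.
detected  = {"SB": 0.86, "RB": 0.73, "KB": 0.70}   # avg self-detection rate
corrected = {"SB": 0.70, "RB": 0.50, "KB": 0.25}   # detected AND corrected

for level in ("SB", "RB", "KB"):
    frac = corrected[level] / detected[level]
    print(f"{level}: {frac:.0%} of detected errors get corrected")
# SB ~81%, RB ~68%, KB ~36%: detection is not the bottleneck, correction is
```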
Slide 14
Outline
Human error and computer system failures
A theory of human error
Human error and accident theory
Addressing human error
Slide 15
Human error and accident theory
Major systems accidents (normal accidents) start with an accumulation of latent errors
most of those latent errors are human errors
latent slips/lapses, particularly in maintenance
  example: misconfigured valves in TMI
latent mistakes in system design, organization, and planning, particularly of emergency procedures
  example: flowcharts that omit unforeseen paths
invisible latent errors change system reality without altering operators models
seemingly-correct actions can then trigger accidents
Slide 16
Accident theory (2)
Accidents are exacerbated by human errors made during operator response
RB errors made due to lack of experience with system in failure states
training is rarely sufficient to develop a rule base that captures system response outside of normal bounds
KB reasoning is hindered by system complexity and cognitive strain
system complexity prohibits mental modeling
stress of an emergency encourages RB approaches and diminishes KB effectiveness
system visibility limited by automation and defense in depth
results in improper rule choices and KB reasoning
Slide 17
Outline
Human error and computer system failures
A theory of human error
Human error and accident theory
Addressing human error
  general guidelines
  the ROC approach: system-level undo
Slide 18
Addressing human error
Challenges
humans are inherently fallible and errors are inevitable
hard-to-detect latent errors can be more troublesome than front-line errors
human psychology must not be ignored
especially the SB/RB/KB distinction and human behavior at each level
General approach: error-tolerance rather than error-avoidance
"It is now widely held among human reliability specialists that the most productive strategy for dealing with active errors is to focus upon controlling their consequences rather than upon striving for their elimination." (Reason, p. 246)
Slide 19
The Automation Irony
Automation is not the cure for human error
automation addresses the easy SB/RB tasks, leaving the complex KB tasks for the human
humans are ill-suited to KB tasks, especially under stress
automation hinders understanding and mental modeling
decreases system visibility and increases complexity
operators don't get hands-on control experience
rule-set for RB tasks and models for KB tasks are weak
automation shifts the error source from operator errors to design errors
harder to detect/tolerate/fix design errors
Slide 20
Building robustness to human error
Discover and correct latent errors
must overcome human nature to wait until emergency to respond
Increase system visibility
don't hide complexity behind automated mechanisms
Take errors into account in operator training
include error scenarios
promote exploratory trial & error approaches
emphasize the positive side of errors: learning from mistakes
Slide 21
Building robustness to human error
Reduce opportunities for error (Don Norman):
get a good conceptual model to the user through consistent design
design tasks to match human limits: working memory, problem-solving abilities
make visible what the options are, and what the consequences of actions will be
exploit natural mappings: between intentions and possible actions, between actual state and what is perceived
use constraints to guide the user to the next action/decision
design for errors: assume their occurrence and plan for error recovery; make it easy to reverse actions and hard to perform irreversible ones
when all else fails, standardize: ease of use is more important, so standardize only as a last resort
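The "design for errors" guideline can be sketched in code: reversible actions run freely and get an undo entry, while irreversible ones demand explicit confirmation. This is a hypothetical interface for illustration, not anything from the source:

```python
# Sketch of Norman's "design for errors" guideline: make it easy to
# reverse actions and hard to perform irreversible ones. All names here
# are invented for illustration.
class SafeConsole:
    def __init__(self):
        self.undo_stack = []

    def do(self, action, undo_action=None, confirm=False):
        """Run an action. Reversible actions record how to undo them;
        irreversible ones (no undo_action) require confirm=True."""
        if undo_action is None and not confirm:
            raise PermissionError("irreversible action needs confirm=True")
        result = action()
        if undo_action is not None:
            self.undo_stack.append(undo_action)
        return result

    def undo(self):
        """Reverse the most recent reversible action."""
        self.undo_stack.pop()()

log = []
c = SafeConsole()
c.do(lambda: log.append("created file"), undo_action=lambda: log.pop())
c.undo()                                   # easy to reverse: log is empty again
try:
    c.do(lambda: log.append("wiped disk"))  # irreversible and unconfirmed
except PermissionError:
    pass                                   # hard to perform by accident
```

The asymmetry is the point: the interface makes the safe path the path of least resistance.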
Slide 22
Building robustness to human error
Acknowledge human behavior in system design:
interfaces should allow the user to explore via experimentation
to help at the KB level, provide tools to run experiments and test hypotheses without having to do them on the high-risk, irreversible plant; or make system state always reversible
provide feedback to increase error observability (RB level)
at the RB level, provide symbolic cues and confidence measures
for RB, try to give more elaborate, integrated cues to avoid strong-but-wrong RB errors
provide overview displays at the edge of the periphery to avoid attentional capture at the SB level
simultaneously present data in forms useful for SB/RB/KB
provide external memory aids to help at the KB level, including externalized representations of different options/schemas
Slide 23
Human error: the ROC approach
ROC is focusing on system-level techniques for human error tolerance
complementary to UI innovations
Goal: provide forgiving operator environment
expect human error and tolerate it
allow the operator to experiment safely and test hypotheses
make it possible to detect and fix latent errors
Approach: undo for system administration
Slide 24
Repairing the Past with Undo
The Three Rs: undo meets time travel
Rewind: roll system state backwards in time
Repair: fix latent or active error
automatically or via human intervention
Redo: roll system state forward, replaying user interactions lost during rewind
This is not your ordinary word-processor undo!
allows the sysadmin to go back in time to fix latent errors even after they've manifested
Slide 25
Undo details
Examples where Undo would help:
reverse the effects of a mistyped command (rm -rf *)
roll back a software upgrade without losing user data
retroactively install a virus filter on an email server; effects of the virus are squashed on redo
The 3 Rs vs. checkpointing, reboot, logging
checkpointing gives Rewind only
reboot may give Repair, but only for Heisenbugs
logging can give all 3 Rs
but need more than RDBMS logging, since system state changes are interdependent and non-transactional
3R-logging requires careful dependency tracking, and attention to state granularity and externalized events
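A toy sketch of the Three Rs on a logging key-value store. This is not the actual ROC implementation: it ignores exactly the dependency tracking and externalized events the slide warns about.

```python
# Toy "Three Rs" undo for a key-value store: writes are logged; Rewind
# restores a checkpoint, Repair edits the logged history, Redo replays it
# so legitimate work done after the error is not lost.
class ThreeRStore:
    def __init__(self):
        self.state = {}
        self.checkpoint = {}   # snapshot taken at time zero
        self.log = []          # history of (key, value) writes

    def write(self, key, value):
        self.log.append((key, value))
        self.state[key] = value

    def rewind(self):
        """1) Rewind: roll state back to the checkpoint."""
        self.state = dict(self.checkpoint)

    def repair(self, keep):
        """2) Repair: filter the history, e.g. drop an erroneous write."""
        self.log = [op for op in self.log if keep(op)]

    def redo(self):
        """3) Redo: replay the repaired history forward."""
        for key, value in self.log:
            self.state[key] = value

# A mistyped write corrupts config; legitimate work follows it.
s = ThreeRStore()
s.write("quota", 100)
s.write("quota", -1)        # latent operator error
s.write("homepage", "v2")   # later, legitimate work
s.rewind()
s.repair(lambda op: op != ("quota", -1))
s.redo()
print(s.state)              # {'quota': 100, 'homepage': 'v2'}
```

The error is gone but the later write survives, which is what separates the 3 Rs from a plain checkpoint restore.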
Slide 26
Summary
Humans are critical to system dependability
human error is the single largest cause of failures
Human error is inescapable: to err is human
yet we blame the operator instead of fixing systems
Human error comes in many forms
mistakes, slips, and lapses at the KB/RB/SB levels of operation
but error is nearly always detectable
Best way to address human error is tolerance
through mechanisms like undo
human-aware UI design can help too
Slide 27