A story is an accounting of an event as experienced through the eyes, ears, cognitive biases, and paradigms of one person. This is my story about attending the ‘Day with Sidney Dekker’ at the Vancouver Convention Centre on Friday, September 19, 2014. The seminar was sponsored by the Lower Mainland chapter of CSSE (Canadian Society of Safety Engineering). I initially heard about the seminar through my associations with RHLN and the HFCoP.
I was aware that Sidney Dekker (SD) uses very few visual slides and provides no handouts. So I came fully prepared to take copious notes with my trusty iPad Air. This is not a play-by-play (or a blow-by-blow if you take in SD’s strong opinions on HR, smart managers bent on controlling dumb workers, etc.). I’ve shifted content around to align with my thinking and the work I’ve done co-developing our Resilient Safety Culture course with Cognitive-Edge. My comments are in square brackets and italics.
SD: Goal today is to teach you to think about intractable issues in safety
SD: Don’t believe a word I say; indulge me today then go find out for yourself
SD: We care when bad people make mistakes but we should care more when good people make mistakes and why they do
Where are we today in safety thinking?
Here is a recent article announcing a new safety breakthrough [http://bit.ly/1mVg19a]
LIJ medical center has implemented a safety solution that will be the be-all and end-all
A remote video auditing (RVA) system in a surgical room developed by Arrowsight [http://bit.ly/1mVh2yf]
RVA monitors status every 2 minutes for tools left in patients, OR team mistakes
Patient Safety improved to a near perfect score
Culture of safety and trust is palpable among the surgical team
Real-time feedback on a smartphone
RVA is based on the “bad apple” theory and model and an underlying assumption that there is a general lack of vigilance
Question: Who looks at the video?
Ans: Independent auditor who will cost money. Trade-off tension created between improving safety or keeping costs down
Assumption: He who watches knows best so OR team members are the losers
Audience question: What if the RVA devices weren’t physically installed but just announced; strategy is to put in people’s minds that someone is watching to avoid complacency
SD: have not found any empirical evidence that being watched improves safety. But it does change behaviour to look good for the camera
Audience question: Could the real purpose of the RVA be to protect the hospital’s ass during litigation cases?
SD: Very good point! [safety, cost, litigation form a SenseMaker™ triad to attach meaning to a story]
One possible RVA benefit: Coaching & Learning
If the video watchers are the performers, then feedback is useful for learning purposes
Airline pilots can ask to replay the data of a landing but only do so on the understanding there are serious protections in place – no punitive action can be a consequence of reviewing data
Conclusion: Solutions like RVA give the illusion of perfect resolution
How did we historically arrive at the way we look at safety and risk today?
[Reference SD’s latest book released June 2014: “Safety Differently” which is an update of “Ten Questions About Human Error: A New View of Human Factors and System Safety”]
Late Victorian Era
Beginning of measurement (Germany, UK) to make things visible
Discover industrial revolution kills a lot of people, including children
Growing concern with enormous injury and fatality problem
Scholars begin to look at models
1905 Rockwell: pure accidents (events that cannot be anticipated) seldom happen; someone has blundered or reversed a law of nature
Eric Farmer: carelessness or lack of attention of the worker
Oxford Human Factor definition: physical, mental, or moral shortcoming of the individual that predisposes the person
We still promote this archaic view today in programs like Hearts & Minds [how Shell and the Energy Institute promote world class HSE]
campaigns with posters, banners, slogans
FAITH-BASED safety approach vs. science-based
In 2014, can’t talk about physical handicaps but are allowed to for mental and moral (Hearts and Minds) human deficiencies
SD: I find it offensive to be treated as infantile
1911 Frederick Taylor introduced Scientific Management to balance the production of pigs, cattle
Frank Gilbreth conducted time and motion studies
Problem isn’t the individual but planning, organizing, and managing
Scientific method is to decompose into parts and find 1 best solution [also known as Linear Reductionism]
Need to stay with 1 best method (LIJ’s RVA follows this 1911 edict)
Focus on the non-compliant individual using line supervision to manage dumb workers
Do not let people work heuristically [rule of thumb] but adamantly adhere to the 1 best method
We are still following the Tayloristic approach
Example: Safety culture quote in 2000: “It is generally acknowledged that human frailty lies behind the majority of our accidents. Although many of these have been anticipated by rules, procedures, some people don’t do what they are supposed to do. They are circumventing the multiple defences that management has created.”
It’s no longer just a Newton-Cartesian world
Closed system, no external forces that impinge on the unit
Linear cause & effect relationships exist
Predictable, stable, repeatable work environment
Checklists, procedures are okay
Compliance with 1 best method is acceptable
Now we know the world is complex, full of perturbations, and not a closed system
[Science-based thinking has led to complex adaptive systems (CAS) http://gswong.com/?wpfb_dl=20]
SD’s story as an airline pilot
Place a paper cup on the flaps (resilience vs. non-compliance) because resilience is needed to finish the design of the aircraft by the operators
Always a gap between Work-as-imagined vs Work-as-done [connects with Erik Hollnagel’s Safety-II]
James Reason calls the gap a non-compliance violation; we can also call that gap Resilience – people have to adapt to the local conditions using their experience, knowledge, judgement
SD: We pay people more money who have experience. Why? Because the 1 best method may not work
There is no checklist to follow
Taylorism is limited and can’t go beyond standardization
Audience question: Bathtub curve model for accidents – more accidents involving younger and older workers. Why does this occur?
SD: Younger workers are beaten to comply but often are not told why so lack understanding
Gen Y doesn’t believe in authority and sources of knowledge (prefer to ask a crowd, not an individual)
SD: Older worker research suggests expertise doesn’t create safety awareness. They know how close they can come to the margin but if they go over the line, slower to act. [links with Richard Cook’s Going Solid / Margin of Manoeuvre concept http://gswong.com/?wpfb_dl=18]
This is not complacency (a motivational issue) but an attenuation towards risk. Also may not be aware the margins have moved (example: in electric utility work, wood cross-arm materials have changed). Unlearning, teaching the old dog new tricks is difficult. [Master builder/Apprenticeship model: While effective for passing on tacit knowledge, danger lies in old guys becoming stale and passing on myths and old paradigms]
1920s & 1930s – advent of Technology & animation of Taylorism
World is fixed, technology will solve the problems of the world
Focus on the person using rewards and punishment, little understanding of deep ethical implications
People just need to conform to technology, machines, devices [think of Charlie Chaplin’s Modern Times movie]
Today: Behaviour-based Safety (BBS) programs still follow this paradigm re controlling human behaviour
Example: mandatory drug testing policy. What does this do to an organization?
In a warehouse, worker is made to wear a different coloured vest (a dunce cap)
“You are the sucker who lost this month’s safety bonus!” What happens to trust, bonding?
Accident Proneness theory (UK, Germany 1925)
Thesis is based on data and similar to Bad Apple theory
Data showed some people more involved in accidents than others (e.g., 25% cause 55%)
Idea was to target these individuals
Aligned with the eugenic thinking in the 1920s (Ghost of the time/spirit/zeitgeist)
Identify who is fit and weed out (exterminate) the unfit [think Nazism]
Theory development carried on up to WWII
Question: what is the fundamental statistical flaw with this theory?
Answer: The theory assumes we all do the same kind of work and therefore all have the same probability of incurring an accident
Essentially comparing apples with oranges
We know better – individual differences exist in risk tolerance
SD: current debate in medical journal: data shows 3% of surgeons causing majority of deaths
Similar article in UK 20% causing 80%
So, should we get rid of these accident-prone surgeons?
No, because the 3% may include the docs who are willing to take the risk to try something new to save a life
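[The statistical flaw in Accident Proneness theory can be sketched with a small simulation. This is my own illustration, not something SD presented. It assumes 1,000 hypothetical workers with no personal "proneness" at all; the only difference between them is the hazard level of the job they are assigned. The exposure values, shift count, and function name are made up for the example.]

```python
import random


def simulate_accident_shares(n_workers=1000, top_fraction=0.25, shifts=250, seed=42):
    """Return the share of all accidents that falls on the top `top_fraction` of workers.

    Assumption: workers are identical as individuals. Only the hazard of the
    job they are assigned (a hypothetical per-shift accident probability) differs.
    """
    rng = random.Random(seed)

    # Hypothetical job mix: roughly three quarters low-hazard, one quarter high-hazard.
    exposures = [rng.choice([0.02, 0.02, 0.02, 0.10]) for _ in range(n_workers)]

    # Count accidents per worker over the given number of shifts.
    counts = [
        sum(1 for _ in range(shifts) if rng.random() < p)
        for p in exposures
    ]

    # Rank workers by accident count and total up the top fraction.
    counts.sort(reverse=True)
    top_n = int(n_workers * top_fraction)
    total = sum(counts)
    return sum(counts[:top_n]) / total if total else 0.0


share = simulate_accident_shares()
print(f"Top 25% of workers account for {share:.0%} of accidents")
```

[Under these assumptions a minority of workers ends up with well over a quarter of the accidents even though nobody is "accident prone". The skew comes from exposure, which is the apples-and-oranges problem.]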
Nuclear, radar, rocketry, computers
Created a host of new complexities, new usability issues
Example: Improvements to the B17 bomber
Hydraulic gear and flap technology introduced
However, belly-flop landings happened
Presumed cause was dumb pilots who required more training, checklists, and punishment
Would like to remove these reckless accident-prone pilots damaging the planes
However, pilots are in short supply plus give them a break – they have been shot at by the enemy trying to kill them
Shifted focus from human failure to design flaws. Why do 2 switches in dashboard look the same?
In 1943 redesigned switch to prevent bellyflopping
Message: Human error is systematically and predictably connected to the features of tools and products that people use. Bad design induces errors. Better to intervene in the context of people’s work.
Safety thinking begins to change: What happens in the head is acutely important.
Now interested in cognitive psychology [intuition, reasoning, decision-making] not just behavioural psychology [what can be observed]
Today: Just Culture policy (human error, at-risk behaviour, reckless behaviour)
After lunch exercise: Greek airport 1770m long
Perceived problem: breaking EU rules by taxiing too close to the road
White line – displaced threshold – don’t land before this line
Need to rapidly taxi back to the terminal to unload for productivity reasons (plane on-the-ground costs money)
Vehicular traffic light is not synced with plane landing (i.e., random event)
Question: How do you stop non-compliant behaviour if you are the regulator? How might you mitigate the risk?
SD: Select a solution approach with choices including Taylorism, Just Culture, Safety by Design
Several solutions heard from the audience but no one-best
SD: Conformity and compliance rules are not the answer, human judgment required
Situation is constantly changing – Tarmac gets hot in afternoon; air rises so may need to come in at a lower angle. In the evening when cooler, approach angle will change
[Reinforces the nature of a CAS where agents like weather can impact solutions and create emergent, unexpected consequences]
SD concern: Normalization of deviance – continual squeezing of the boundaries and gradual erosion of safety margins
They’re getting away with it but eventually there will be a fatal crash
[reminds me of the frog that’s content to sit in the pot of water as the temperature is slowly increased. The frog doesn’t realize it’s slowly cooking to death until it’s too late]
Back to the historical timeline…
1980s Systems Thinking
James Reason’s Swiss Cheese Model undermines our safety efforts
Put in layers of defence which reinforces the 1940s thinking
Smarter managers to protect the dumb workers
Cause and effect linear model of safety
Example: 2003 Columbia space shuttle re-entry
Normal work was done, not people screwing up (foam maintenance)
There were no holes according to the Swiss Cheese Model
Emergence: Piece of insulation foam broke off damaging the wing
Example: 1988 Piper Alpha oil rig
Prior to accident, recognized as the most outstanding safe and productive oil rig
Explosion due to leaking gas killing 167
“I knew everything was right because I never got a report anything was wrong”
Looking for the holes in the Swiss Cheese Model again
Delusion of being safe due to accident-free record
Many people carry an idealistic image of safety: a world without harm, pain, suffering
Setting a Zero Harm goal is counter-productive as it suppresses reporting and incents manipulation of the numbers to look good
Abraham Wald example
Question: Where should we put the armour on a WWII bomber?
Wrong analysis: Let’s focus on the holes and put armour there to cover them up
Right analysis: Since the plane made it back, there’s no need for armour on the holes!
Safety implication: Holes represent near-miss incidents (bullets that fortunately didn’t down the plane). We shouldn’t be covering the holes but learning from them
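[Wald's reasoning is a case of survivorship bias and can be sketched in a few lines. This is my own illustration under made-up assumptions: four hypothetical plane sections, hits landing at random, and an invented chance that a hit in each section downs the plane. Only planes that make it back get their holes counted.]

```python
import random

SECTIONS = ["fuselage", "wings", "tail", "engine"]

# Hypothetical chance that a single hit in a section downs the plane.
LOSS_PROB = {"fuselage": 0.05, "wings": 0.05, "tail": 0.05, "engine": 0.60}


def observed_holes(n_planes=10000, hits_per_plane=3, seed=1):
    """Count holes per section, but only on planes that returned."""
    rng = random.Random(seed)
    holes = {section: 0 for section in SECTIONS}

    for _ in range(n_planes):
        # Hits land uniformly at random across the sections.
        hits = [rng.choice(SECTIONS) for _ in range(hits_per_plane)]

        # The plane returns only if it survives every hit.
        survived = all(rng.random() >= LOSS_PROB[s] for s in hits)

        if survived:
            for s in hits:
                holes[s] += 1

    return holes


for section, count in observed_holes().items():
    print(f"{section:9s} holes seen on returning planes: {count}")
```

[Every section is hit equally often, yet the returning planes show the fewest holes in the section where a hit is most dangerous. The holes we can see mark the hits a plane can take and still come home.]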
Safety management system (SMS)
Don’t rest on your laurels thinking you finally figured it out with a comprehensive SMS
Australian tunnelling example:
Young guy dies working near an airport
There were previous incidents with the contractor but no connection was made
Was doing normal work but decapitated finishing the design
An SMS will never pick this up
Don’t be led astray by the Decoy phenomenon
Only look at what we can count in normal work and ignore other signals
Example: Heinrich triangle – if we place our attention on the little incidents, then we will avoid the big ones (LTA, fatality) [now viewed as a myth like Accident Prone theory]
Some accidents are unavoidable – Barry Turner 1978 [Man-made Disasters]
Example: Lexington accident [2006 Comair Flight 5191] when both technology and organization failed
Complexity has created huge, intractable problems
In a world of complexity, we can kill people without precursory events
[If we stay with the Swiss Cheese Model idea, then Complexity would see the holes on a layer dynamically moving, appearing, disappearing and layers spinning randomly and melting together to form new holes that were unknowable and unimaginable]
Safety has become a bureaucratic accountability rather than an ethical responsibility
Amount of fraud is mounting as we continue measuring and rewarding the absence of negative incidents
Example: workers killed onsite are flown back home in a private jet to cover up and hide accidents
If we can be innovative and creative to hide injuries and fatalities, why can’t we use novel ways to think about safety differently?
Sense of injustice on the head of the little guy
Advances in Safety by Design
“You’re not lifting properly” compared with “the job isn’t designed properly”
An accident is a free lesson, learning opportunity, not a HR performance problem
Singapore example: Green city which to grow must go vertically up. Plants grow on all floors of a tall building. How to maintain?
One approach is to punish the worker if accident occurs
Safety by Design solution is to design wall panels that rotate to maintain plants; no fall equipment needed
You can swat the mosquito but better to drain the swamp
Why can’t we solve today’s problems the same way we solved them back in the early 1900s?
What was valued in the Victorian Era
- People are a problem to control
- We control through intervention at the level of their behaviour
- We define safety as an absence of the Negative
Complexity requires a shift in what we value today
- People are a solution, a resource
- Intervene in the context and condition of their work
- Instead of measuring and counting negative events, think in terms of the presence of positive things – opportunities, new discoveries, challenges of old ideas
What are the deliverables we should aim for today?
Stop doing work inspections that treat workers like children
It’s arrogant believing that an inspector knows better
Better onsite visit: Tell me about your work. What’s dodgy about your job?
Intervene in the job, not the individual’s behaviour.
Collect authentic stories.
[reinforces the practice of Narrative research http://gswong.com/?page_id=319]
Regulators need to shift their deliverables from engaging reactively (getting involved after the accident has occurred), looking for root causes, and formulating policy constraints
Causes are not things found objectively; causes are constructed by the human mind [and therefore subject to cognitive bias]
Regulators should be proactively co-evolving the system [CAS]
Stop producing accident investigation reports closing with useless recommendations to coach and gain commitment
Reference SD’s book: Field Guide to Investigating accidents – what you look for you will find
Question: Where do we place armour on a WWII bomber if we don’t patch the holes?
Answer: where we can build resilience by enabling the plane to take a few hits and still make it back home
[relates to the perspective of resilience in terms of the Cynefin Framework http://gswong.com/?page_id=21]
Resilience Engineering deliverables
- Do we keep risk awareness alive? Debrief and more debrief on the mental model? Even if things seem to be under control? Who leads the debriefing? Did the supervisor or foreman do a recon before the job starts to lessen surprise? [assessing the situation in the Cynefin Framework Disorder domain]
- Count the amount of rework done – can be perceived as a leading indicator although it really lags since initial work had been performed
- Create ways for bad news to be communicated without penalty. Stat: 83% of plane accidents occur when pilots are flying and 17% when co-pilots are. Institute the courage to speak up and say no. Stop bullying to maintain silence. It is a measure of Trust and empowers our people. Develop other ways such as role playing simulations, rotation of managers which identify normalization of deviance (“We may do that here but we don’t do that over there”)
- Count the number of fresh perspectives and opinions that are allowed to be aired. Count the number of so-called best practice rules that are intelligently challenged. [purpose of gathering stories in a Human Sensor Network http://gswong.com/?page_id=19]
- Count the number or % of time spent on human-human relationships: not formal inspections, but honest and open conversations that are free of org hierarchy.
Spend less time and effort on things that go wrong [Safety-I]
Invest more effort on things that go right which is most of the time [Safety-II]
Don’t do safety to satisfy Bureaucratic accountability
Do safety for Ethical responsibility reasons
There were over 100 in attendance so theoretically there are over 100 stories that could be told about the day. Some will be similar to mine and my mind is open to accepting some will be quite different (what the heck was Gary smoking?). But as we know, the key to understanding complexity is Diversity – the more stories we seek and allow to be heard, the better the representation of the real world we have.