Monday, 10 October 2016

Exploring the Feynman Gap

Wolswijk (23 October 2014)
Engineers tend to know about the "Feynman Gap" (and the related Feynman Technique), largely because of the NASA Challenger disaster.

But it is not well known in the management field, and I think it should be. 

NASA's space shuttle Challenger broke apart shortly after launch on 28 January 1986. William Graham, then acting head of NASA, asked Caltech professor Richard Feynman to join the investigation committee, known as the Rogers Commission (Feynman & Leighton, 1988; Rae, 28 January 2014).

Professor Feynman joined the commission despite his misgivings about the way the investigation was being run, the people the commissioners were expected to talk to (i.e., managers rather than the technicians and engineers who had actually been doing the work), and the highly bureaucratic and formal way the investigation processes were structured (Feynman & Leighton, 1988).

What Professor Feynman discovered - through his informal and direct approach to determining the process failure in the NASA system - was that, at the pre-launch teleconference, engineers were worried that the cold temperatures at the time of launch would cause the O-rings to fail; those O-rings did, in fact, fail (Feynman & Leighton, 1988; ESReDA, 2015).

Professor Feynman tried to find out what the thinking was at that teleconference. During the teleconference, a senior manager had asked "one of the engineering managers [worried about the risk of failure] to put on his 'management hat' instead of his 'engineering hat'". That manager then changed his risk assessment, saying that risk was low and the launch needn't be delayed.  

When he later talked to the engineers and their manager about how such decisions were made, Professor Feynman asked them to write down their evaluation of the likelihood of failure, and relates:
"All right," I said. "Here's a piece of paper each. Please write on your paper the answer to this question: what do you think is the probability that a flight would be uncompleted due to a failure in this engine?" They write down their answers and hand in their papers. One guy wrote "99-44/100% pure" (copying the Ivory soap slogan), meaning about 1 in 200. Another guy wrote something very technical and highly quantitative in the standard statistical way, carefully defining everything, that I had to translate—which also meant about 1 in 200. The third guy wrote, simply, "1 in 300." Mr. Lovingood's paper, however, said, Cannot quantify. Reliability is judged from: • past experience • quality control in manufacturing • engineering judgment. “Well”, I said, “I’ve got four answers, and one of them weaseled.” I turn to Mr Lovingood, “I think you weaseled.” “I don’t think I weaseled.” “You didn’t tell me what your confidence was, sir; you told me how you determined it. What I want to know is: after you determined it, what was it?” He says, “100 percent” — the engineers’ jaws drop, my jaw drops; I look at him, everybody looks at him — “uh, ugh, minus epsilon!” So I say “well yes, that’s fine. Now the only problem is, WHAT IS EPSILON?” He says “10 to the minus 5. (Feynman & Leighton, 1988, p. 183; ESReDA, 2015, p.31)
10 to the minus 5 is a failure probability of 0.00001, or one flight lost in 100,000. In other words, a claimed reliability of 99.999%: pretty much 100%.

But none of them communicated their evaluations of risk, or how they had calculated them, to each other on 27 January 1986, and they allowed the launch to proceed despite their misgivings on 28 January.

The most surprising thing for me is the size of the gap: the engineers put the likelihood of failure at roughly 0.33% to 0.5% (1 in 300 to 1 in 200), while the manager, when pressed, put it at 0.001% (1 in 100,000). That is a shift of hundreds of times in perceived risk.
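The arithmetic behind that gap can be checked directly. A minimal sketch (the figures come from Feynman's anecdote above; the labels for each respondent are my own shorthand):

```python
# Risk-of-failure estimates from Feynman's informal poll, as probabilities.
estimates = {
    "engineer A ('99-44/100% pure')": 1 / 200,
    "engineer B (statistical answer)": 1 / 200,
    "engineer C": 1 / 300,
    "manager ('100% minus epsilon')": 1e-5,
}

# Express each as a percentage and as '1 in N' odds.
for who, p in estimates.items():
    print(f"{who}: {p:.4%} chance of failure, i.e. 1 in {round(1 / p):,}")

# How many times more optimistic the manager's figure is than the
# engineers' most pessimistic estimate (1 in 200).
ratio = (1 / 200) / 1e-5
print(f"The manager's estimate is {ratio:.0f} times more optimistic.")
```

Running this shows the manager's figure is 500 times more optimistic than the engineers' worst case, which is why the "change of hats" matters so much.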

A key outcome of Professor Feynman's investigation was the finding that managers underestimate risk. We self-censor, and do not clearly communicate risk or its magnitude. This is an inherent hazard of management, and it forms what has become known as the Feynman Gap.

We can guard against that by using the Feynman Technique: explain things as if you were explaining them to a five-year-old. Simplify. Use an analogy. Ensure understanding, not just agreement.

Point out the consequences if things go wrong.

That keeps us honest.


Sam
