What are the root and contributing causes of the accident in the Gulf? What could have and should have been done to prevent the accident? If the potential for the accident could not have been absolutely prevented, what could have and should have been done to mitigate the consequences of such an accident? And if things were done, why were they apparently not done effectively? The nation deserves and awaits answers to these questions.
Any enterprise that is engaged in activities with the potential for public and employee harm should be required to develop and implement quality and risk management systems for the prevention of events with intolerable effects. Such management systems would focus on various types of analyses of the quality of hardware designs and process designs.
One such analytical system is Failure Mode and Effects Analysis, which is particularly useful in analyzing the quality of the design of a hardware item, for example a blowout preventer. A short and simplistic description of the analytical method is as follows. Each characteristic of the component is identified. For each characteristic, each mode of potential, credible failure is identified. For each credible mode of failure, the adverse effects of such failure are assessed. If any effect is intolerable, the design of the characteristic must be changed to eliminate the credible failure mode. If the design can’t be changed to eliminate the credible failure mode, something must be established to mitigate the effect of the failure – preferably something in the design, rather than in an operational procedure. (Care must be taken to identify credible failure modes that can exist due to the interaction of two or more characteristics in given states.)
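The decision logic just described can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not a standard FMEA worksheet: the `FailureMode` record, its field names, the `required_action` function and the sample blowout preventer entries are all hypothetical, and the sketch does not cover failure modes that arise from interacting characteristics.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FailureMode:
    """One row of a simplified, hypothetical FMEA worksheet."""
    characteristic: str                  # design characteristic under analysis
    mode: str                            # credible way the characteristic can fail
    effect: str                          # adverse effect of that failure
    intolerable: bool                    # analyst's judgment of the effect
    eliminated_by_design: bool = False   # has a design change removed the mode?
    mitigation: Optional[str] = None     # mitigation of the effect, if any


def required_action(fm: FailureMode) -> str:
    """Apply the rule: eliminate intolerable modes, otherwise mitigate."""
    if not fm.intolerable:
        return "accept"
    if fm.eliminated_by_design:
        return "resolved by design change"
    if fm.mitigation:
        return "mitigated: " + fm.mitigation
    return "OPEN: change design or establish mitigation"


# Hypothetical entries, for illustration only.
worksheet = [
    FailureMode("control signal path", "signal lost", "preventer not actuated",
                intolerable=True, mitigation="redundant control path"),
    FailureMode("hydraulic seal", "slow leak", "reduced closing pressure",
                intolerable=True),
    FailureMode("indicator lamp", "burns out", "no local status display",
                intolerable=False),
]

for row in worksheet:
    print(row.characteristic, "/", row.mode, "->", required_action(row))
```

Any row that comes back as "OPEN" marks an intolerable effect with neither a design change nor a mitigation, which is exactly the condition the method exists to surface.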
Another such analytical system is Hazard-Barrier-Effects Analysis, which is particularly useful in analyzing the quality of the design of a process, for example a process for positioning and installing a blowout preventer. Again, a brief and simplistic description of the analytical method is as follows. Each task in the process is identified in sequence. For each task, each of the six “M”s that may be operative in the task is identified. (The six “M”s are man, machine, material, method, measurement, and mother-nature or man-made environment.) For each “M”, any hazard related to the “M” is identified and its potential adverse effect assessed. For intolerable effects, the process design must be changed to eliminate the potential hazard. If the hazard cannot be eliminated, multiple barriers must be established for the prevention of human error that can activate the hazard, as well as multiple barriers for the mitigation of the intolerable adverse effects of the hazard. (Not to get too technical but, again, care must be taken to identify hazards that can arise from the interaction of “M”s.)
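The barrier check at the heart of this method can be sketched as follows. This is a rough illustration rather than an established tool: the `Hazard` record, the `barrier_gaps` function and the sample tasks are hypothetical, "multiple" is assumed here to mean at least two barriers, and hazards arising from the interaction of “M”s are not modeled.

```python
from dataclasses import dataclass, field
from typing import List

# The six "M"s named in the text; "environment" stands in for
# mother-nature or man-made environment.
SIX_MS = ("man", "machine", "material", "method", "measurement", "environment")


@dataclass
class Hazard:
    """One hazard identified for one "M" in one task (hypothetical record)."""
    task: str
    m: str
    description: str
    intolerable: bool
    prevention_barriers: List[str] = field(default_factory=list)
    mitigation_barriers: List[str] = field(default_factory=list)


def barrier_gaps(hazards: List[Hazard], minimum: int = 2) -> List[Hazard]:
    """Return intolerable hazards lacking multiple barriers of either kind."""
    gaps = []
    for h in hazards:
        if h.m not in SIX_MS:
            raise ValueError(f"unknown M: {h.m!r}")
        too_few = (len(h.prevention_barriers) < minimum
                   or len(h.mitigation_barriers) < minimum)
        if h.intolerable and too_few:
            gaps.append(h)
    return gaps


# Hypothetical entries, for illustration only.
hazards = [
    Hazard("lower equipment", "environment", "high sea state", True,
           ["weather window rule", "independent go/no-go check"],
           ["emergency stop", "standby vessel"]),
    Hazard("verify seating", "measurement", "misread gauge", True,
           ["second-person verification"], []),
    Hazard("log completion", "method", "late paperwork", False),
]

for h in barrier_gaps(hazards):
    print("Insufficient barriers:", h.task, "/", h.m, "/", h.description)
```

Hazards whose effects are tolerable pass through untouched; only intolerable hazards with fewer than the assumed minimum of prevention or mitigation barriers are reported.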
A very powerful analytical system is Probabilistic Risk Analysis or Probabilistic Safety Analysis. This analytical system is used to determine the ultimate effects or outcomes, called “end states”, and the probability of each, given some undesired initiating occurrence. For example, given the loss of the primary power source on a drilling rig, what are the possible outcomes or end states and what is the probability of each? To answer such questions, event trees and fault trees are used. The event tree shows the hardware systems that would come into play to respond to the undesired initiating occurrence and, based on the success or failure of each responding hardware system, an end state is arrived at. The path from the initiating occurrence through each responding hardware system, with either its successful or failed response, to the end state is called a “sequence”. Each sequence leads to an end state. Then a fault tree can be used to determine the probability of success or failure of each responding hardware system. Given the probability of success or failure of each responding hardware system, the probability of each end state can be determined. If an undesired end state has an unacceptably high probability, the design must be changed to lower the probability of that end state to an acceptable level. Are death and oil spill end states with probabilities of greater than one in a million per year acceptable?
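To make the arithmetic concrete, here is a small sketch of an event tree fed by two fault trees. Everything in it is an assumption made for illustration: the system names, the end-state labels, the initiating frequency and every probability are invented, basic events are treated as independent, and every responding system is evaluated in every sequence, whereas a real event tree would prune branches. The one-in-a-million threshold is simply the figure from the question above.

```python
from itertools import product


def or_gate(*probs):
    """System fails if any one of its independent basic events occurs."""
    none_occur = 1.0
    for p in probs:
        none_occur *= 1.0 - p
    return 1.0 - none_occur


def and_gate(*probs):
    """System fails only if all of its independent basic events occur."""
    all_occur = 1.0
    for p in probs:
        all_occur *= p
    return all_occur


def end_state_probabilities(initiator_freq, fail_probs, classify):
    """Enumerate every success/failure sequence and total each end state."""
    names = list(fail_probs)
    totals = {}
    for outcome in product((True, False), repeat=len(names)):
        p = initiator_freq
        for name, ok in zip(names, outcome):
            p *= (1.0 - fail_probs[name]) if ok else fail_probs[name]
        state = classify(dict(zip(names, outcome)))
        totals[state] = totals.get(state, 0.0) + p
    return totals


# Fault trees (invented numbers): per-demand failure probability of each system.
fail_probs = {
    "backup_power": or_gate(0.01, 0.02),          # fails to start OR fails to run
    "emergency_disconnect": and_gate(0.1, 0.1),   # both redundant trains fail
}


def classify(ok):
    """Map one sequence (system name -> succeeded?) to a hypothetical end state."""
    if ok["backup_power"]:
        return "safe shutdown"
    if ok["emergency_disconnect"]:
        return "controlled disconnect"
    return "loss of well control"


INITIATOR_FREQ = 0.1   # assumed losses of primary power per year
THRESHOLD = 1e-6       # "one in a million per year", from the text

totals = end_state_probabilities(INITIATOR_FREQ, fail_probs, classify)
for state, freq in totals.items():
    print(f"{state}: {freq:.3e} per year")
print("Redesign needed:", totals["loss of well control"] > THRESHOLD)
```

Because the sequences are mutually exclusive and exhaustive, the end-state frequencies add up to the frequency of the initiating occurrence, which is a useful sanity check on any event tree.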
Of course, in addition to management systems to assure quality of design, there must be management systems to assure quality of conformance to design.
Do oil companies have persons qualified to establish and implement such quality and risk management systems, and are they voluntarily implementing such systems? If so, one must ask if they are being implemented with logic, rigor and consistency. Res ipsa loquitur. Does the event speak for itself? Do repeated events speak for themselves? When decision makers fail to recognize the need for such management systems (knowledge-based error), or when they implement faulty management systems (cognition-based error), or when they recognize the need but choose not to satisfy the need (value-based error), they’re making human error. We must recognize that human error extends upstream of the point at which the process was last touched, that is, upstream of the point of the initiating error that occurs on the shop floor or in the field (reflexive-based error, error-induced condition-based error, skill-based error and lapse-based error).
As certainly as it is not government’s role to engage in the oil business, it is government’s role to establish the rules of engagement. The government has failed to establish adequate rules of engagement in the oil sector – particularly with regard to rules for the implementation of quality and risk management systems with an emphasis on hardware and process design analyses.
The Congressional committee addressing the oil industry regulatory issues would do well to invite expert testimony on this subject. To listen to an audiocast that further explores this article's topic, visit ASQ Weekly Audiocast - Ben Marguglio on BP Oil Spill.
About the Author:
Ben Marguglio is president of BW (Ben) Marguglio, LLC and Bookinars, Inc. He consults and presents seminars on quality and risk management systems, human error prevention and root cause analysis. He is a Fellow (since 1974) of the American Society for Quality (ASQ) and is certified by ASQ as a Quality Engineer, Reliability Engineer, Manager of Quality/Organizational Excellence and Quality Auditor. He is the author of over 100 management and technical papers and presentations and three books, the most recent being the Human Error Prevention Bookinar.