process safety management (psm) books

Human Reliability Analysis




Human Reliability

Reliability, Maintainability and Availability (RAM) programs generally require reliability information and data not only for equipment and instrumentation, but also for human performance.

Human Reliability Analysis (HRA) is used to determine the probability that a task or activity will be completed successfully within a required time period, and that no other human action that could be detrimental to system performance will take place. An HRA can also help identify areas where improvements can be made.

Human Error

Most accidents and operating upsets involve some sort of human error. For example, Geyer et al. (1990) state that operator error is a direct cause of nearly a third of all pipework failures. (These errors consisted mostly of inadequately cleaning lines and incorrectly setting valves.) Indeed, it is almost certain that some type of human error will be involved in an incident because the operator, being in Trevor Kletz's phrase ‘the last man on the bus’, usually has a chance to stop the chain of events. If he or she fails to do so, he or she is not to blame for the event; after all, there were probably many other mistakes made by supervisors, managers, engineers and designers prior to the final operator error. Looked at in this manner, all failures can be attributed to errors made by human beings somewhere in the management chain.

Errors can either be of commission or omission. Errors of commission typically involve failure to follow procedures, taking a short cut or making an (incorrect) assumption about the validity of an instrument reading. Errors of omission often occur during the response phase of an incident. For example, an operator may fail to isolate a tank that has already started to overflow.

A common human error occurs when an operator or supervisor does not realize that he or she has exceeded a safe operating limit. Not realizing how far out of control the operations have become, they decide to fight the problem rather than shut down and bring the facility operations to a safe state.

Various types of human error are discussed by Mostia (2003). They can also be categorized as shown in the figure below.

[Figure: Human-Reliability]

Errors of Intent

Errors of intent are a special type of error that occurs when supervision or management knowingly decides to override the normal operating or safety procedures. They may either break a rule, such as deliberately not following a procedure, or violate the intent of a standard policy. For example, an operations supervisor may choose to ignore a lab result or an instrument reading, either because they do not believe it or because they are prepared to override the implications of that undesirable piece of information.

Errors of Action

Errors of action can be placed into one of the following four categories:

  1. Mistakes;
  2. Slips;
  3. Fixation; and
  4. Error in an Emergency

Mistakes

A mistake (sometimes referred to as a cognitive error) occurs when a person acts on an incorrect train of reasoning, often because he or she was not properly informed as to what to do or how to do it. A mistake can be defined as follows:

A mistake is a human error that is a failure in diagnosis, decision-making, or planning.

Mistakes can be further divided into those that are ‘procedural’ and those that are ‘creative’. A procedural mistake occurs when, for example, there is a lack of clarity in the operating instructions, thus causing an operator to misinterpret them. A creative mistake occurs when a brand-new situation develops, often during an emergency, and the operator has to develop a response on the spot, often in a very short period of time.

Slips

A slip occurs when a person makes an error, even though that person knew what to do and how to carry out a task. It is defined here as:

A slip is a human error resulting from failure to carry out an intention, even though the person concerned had the capability, time, and equipment to successfully carry out that intention.

Slips usually occur during normal, routine, non-stress situations. For example, an operator may routinely take two samples from a certain section of the plant every shift, and he may have successfully performed this action hundreds of times. Then, on one occasion, he slips up and inadvertently switches the samples. Mistakes imply thinking; slips imply routine.

Worker fatigue is a common reason for the occurrence of slips.

Fixation

Most people have trouble grasping and understanding complexity, so they tend to fixate on the one or two solutions that they believe can resolve problems — even if those solutions are not correct. Examples of fixation include:

  • A plant experiences operating problems over a period of days. Different shifts witness different aspects of the problem, and so come up with different causes and proposed solutions. The people on each shift tend to discount the opinions of the other shifts because ‘seeing is believing’; people place more credence on their own experience than on the un-witnessed experience of others.


  • During an emergency, an operator is typically swamped with a large amount of information from the control panel; much of the information is confusing or apparently self-contradictory, particularly if one or two instruments are in error. In such situations, most people tend to fixate on one or two instrument readings, and then exclude all other information, regardless of its relevance. (Fixation was an important part of the Three Mile Island nuclear power plant incident, where operations personnel chose to believe a faulty instrument, even though many other instruments were indicating that the signal from the first instrument was incorrect.)

Error in an Emergency

A rule of thumb is that human error rates rise to 50% during an emergency; i.e., there is a 1 in 2 chance that a person will do the wrong thing during the high-stress conditions of a plant emergency. Therefore, if an operator is called upon to perform, say, six tasks during an emergency, the chance of getting them all right (treating the tasks as independent) is 0.5^6, or about 1.6%. In other words, he or she will almost certainly fail to implement the full sequence of tasks correctly. Consequently, operators should not be expected to control a facility during an emergency. At most they should carry out a few automatic actions in which they have been thoroughly trained, and then turn over control of the plant to the instrumentation and to the trained emergency response team.
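The arithmetic behind this rule of thumb can be checked with a short Python sketch. It assumes, as the calculation above implicitly does, that each task succeeds or fails independently of the others and that every task has the same error rate; the function name is chosen here purely for illustration.

```python
def all_tasks_success_probability(per_task_error: float, n_tasks: int) -> float:
    """Probability that every one of n_tasks is carried out correctly.

    Assumes the tasks are independent and share the same error rate,
    which is a simplification made for this illustration.
    """
    if not 0.0 <= per_task_error <= 1.0:
        raise ValueError("per_task_error must lie between 0 and 1")
    if n_tasks < 0:
        raise ValueError("n_tasks must not be negative")
    return (1.0 - per_task_error) ** n_tasks


# Six tasks at a 50% error rate, as in the example above.
p = all_tasks_success_probability(0.5, 6)
print(f"{p:.6f} ({p:.1%})")  # 0.015625 (1.6%)
```

Under the same assumptions, cutting the sequence to two well-drilled actions raises the chance of complete success to 25%, which is one way of seeing why only a few automatic actions should be asked of the operator.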

THERP

One method for analyzing human reliability is a straightforward extension of probabilistic risk assessment (PRA): in the same way that equipment can fail, so can a human make mistakes and slips. One such technique for predicting human error rates is the Technique for Human Error Rate Prediction (THERP), which was developed in the 1950s. As with other PRA techniques, THERP models can use either point.

A THERP analysis considers different types of error, such as not following an instruction, choosing a wrong switch or skipping a step in a sequence of activities, and forecasts the error rate for each of these tasks. If a person can make more than one type of error when carrying out a task, then the probabilities are added to one another. For example, when opening a valve an operator may:

  • Open the wrong valve;
  • Skip the step altogether; or
  • Open it only part way.

If the respective probabilities for these errors are 0.01, 0.03 and 0.03, then the overall error rate is 0.07 (excluding second-order terms). It is also possible to factor in recovery rates. For example, if the wrong valve is selected, then there may be a 40% probability that the operator will recognize and correct the error while there is still time, thus reducing the probability of that error to 0.6 x 0.01, or 0.006, and the overall error rate to 0.066.
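The valve example can be reproduced with a minimal Python sketch. This is not a THERP implementation; it only illustrates the addition of error probabilities and the recovery adjustment described above. The function names are hypothetical, and the "exact" figure assumes the three error modes are independent, which the text does not state.

```python
from math import prod


def combined_error_rate(probabilities, exact=False):
    """Combine the error probabilities for one task.

    exact=False: simple sum, i.e. second-order terms are ignored.
    exact=True:  1 - product(1 - p), which assumes independent error modes
                 (an assumption made for this illustration).
    """
    probabilities = list(probabilities)
    if exact:
        return 1.0 - prod(1.0 - p for p in probabilities)
    return sum(probabilities)


def with_recovery(p_error, p_recovery):
    """Error probability remaining after a chance of timely recovery."""
    return p_error * (1.0 - p_recovery)


errors = {"wrong valve": 0.01, "step skipped": 0.03, "opened part way": 0.03}

first_order = combined_error_rate(errors.values())
exact = combined_error_rate(errors.values(), exact=True)
print(f"first-order: {first_order:.3f}, exact: {exact:.4f}")

# 40% chance that the operator notices and corrects the wrong-valve error.
recovered = dict(errors)
recovered["wrong valve"] = with_recovery(errors["wrong valve"], 0.40)
print(f"wrong valve after recovery: {recovered['wrong valve']:.3f}")
print(f"first-order total after recovery: {combined_error_rate(recovered.values()):.3f}")
```

With these inputs the simple sum gives 0.07 and the independence-based figure is roughly 0.0685, so the second-order terms are small. After recovery the wrong-valve term falls to 0.006, and the first-order total becomes 0.066.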

A THERP analysis is most effective when the tasks are routine and when there is little stress.



Copyright © Sutton Technical Books 2007-2012. All rights reserved

6340 N. Eldridge Parkway, Ste-I #206
Houston, TX  77041