Dr. Gopal Jayaraman
Mr. Kaushik Jayaraman
When dealing with Process Safety, it is quite easy to envision a strong path to protection through better systems and maintaining their integrity, while overlooking the human factors that are involved in the continuous operation and maintenance of those systems. In an age of computerized digital information overload, it becomes prudent to cater to the personnel who actually handle the information. Alarms are a step up towards informing these priority stakeholders when they need to sit up and take notice.
In our quest to improve systems and add the bulk of technological advancements, we have come across a new labyrinth of hardware and related process information. The flood of information coming in to the operator leaves quite a few baffled. It is easy to understand how someone may miss a couple of signals in the mayhem. This makes it necessary to control the flow of process data to the operator so that actions can be taken where required, when required. This, in essence, is the crux of the Alarm Management credo. Alarms should function to alert the operators to a change where an action by the operator is required at that point in time.
The implementation of a Process Safety Management (PSM) System revolves around the premise of preventing the Loss of Primary Containment (LOPC) of Highly Hazardous Chemicals (HHC) in a process. One of the key factors governing this is maintenance of the process within the intended operating envelope. The focus of alarms thus should be built around the identified safe operating process envelope. This paper showcases the issues faced by Essar Oil Limited at its Vadinar Refinery while moving towards better process control and safety through Process Alarm Management.
HISTORICAL RELEVANCE
Alarm systems were offshoots of the control systems developed to keep processes operating successfully to create desired products. This started with "panel boards" in control rooms, loaded with a slew of control instruments and indicators. These were in turn connected to sensors in the field, which relayed information through 4-20 mA current loops. Although the boards were initially designed only for relaying information, the need to control process parameters shifted the focus onto designing the panel boards on the basis of human factors and limitations. Alarms took the form of a displayed light beacon, usually coupled with an audible horn (i.e. audio and visual signals), to notify the panel operator of a deviation from the desired operational envelope, either after the deviation or just on the verge of it. These were laid out in bunches reflecting the plant layout, making it easier to recognize deviations and initiate corrective actions. It was deemed a simple matter to look at the entire panel board and decide where action was required.
But as plant complexities grew, these became more difficult to keep track of, and more operators and controllers were required. This increased the threat of human errors and the chances of failure. This was evidenced in high-profile incidents such as the Three-Mile Island Nuclear Meltdown (1979), Bhopal Gas Disaster (1984), Chernobyl Disaster (1986), etc. As part of increasing process integrity and improving process and equipment monitoring, the advent of Distributed Control Systems (DCS) both helped and complicated the alarm scenario.
FIGURE 1: CONTROL ROOM (GULF OIL CO., PORT ARTHUR)
FIGURE 2: MODERN DCS-EQUIPPED CONTROL ROOM WITH CRT
FIGURE 3: THREE-MILE ISLAND, 1979
FIGURE 4: UNION CARBIDE, BHOPAL, 1984
FIGURE 5: CHERNOBYL, 1986
With the DCS, the engineer no longer needed to spend time, money and space on the alarm setup; all that needed to be done was to type in a command location and a set value for the parameter to be monitored. This led to everyone alarming everything they could get their hands on. Instead of increasing process integrity, this led to operators being overburdened and ignoring relevant alarms due to redundancy or flood situations. It was not until after several high-consequence process events that people stopped to take notice. Milford Haven (1994) and BP Texas (2005) are but some in the long line of incidents attributed to improper Alarm Management, where alarms were either acknowledged and forgotten about, or just observed as the usual effect. Another consequence that was noted was the restriction on display size: multiple screens represented a single unit, and no complete picture emerged from a single display (BP Texas). Alarms had to become reliable and fast, but on average every parameter had alarms at 80% and 20% of the high and low limits, which was of hardly any use to the stressed operators.
ALARM MANAGEMENT PRINCIPLES
Alarm Management is essentially the scientific application of knowledge of human factors in the engineering of plant instrumentation and system information, for designing alarms to increase their usability for managing abnormal situations.
In most processes, maintaining an asset parameter at a set value is not completely possible all the time. This is why identifying and establishing a Process Envelope becomes very important. Most often, abnormal situations are caused by the process parameters running out of control, i.e. disturbances beyond the "Process / Operating Envelope" (normal operating range), which may be of minimal or catastrophic consequence.
It is the responsibility of the Operations team to identify the cause of the situation quickly, and to execute corrective actions in a timely and efficient manner. For this to be possible, they should have an idea of what could go wrong, and an indication of when it does go wrong. This is the fundamental principle behind an alarm. The ultimate objective is to prevent, or at least minimize, physical and economic loss through operator intervention in response to the condition that was alarmed. Alarms should be set up if and only if there are relevant operator actions connected to them, but ultimate plant safety should not depend on operator response to an alarm.
Many process plants devote considerable resources to rationalizing their alarm systems, which allows the operators to effectively manage the process instead of merely responding to alarms throughout the shift. A well-designed and adequately functioning alarm system is crucial to plant process safety. But simply staying within the limits is neither sufficient nor useful without knowledge of the critical operating limits. This can be better explained by the two pictures shown here. The limits around the normal operating parameter are shown in Figure 6, while Figure 7 showcases the actions induced by the same.
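The relationship between the normal operating range, the alarm set points and the critical operating limits can also be sketched in code. The following is a minimal Python illustration, assuming a single parameter with nested low and high limits; the class name `Envelope`, its field names and all numerical values are hypothetical and are not taken from the Refinery's configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """Nested limits around one process parameter (all values hypothetical)."""
    critical_low: float   # at or below this, the critical limit is breached
    alarm_low: float      # low alarm set point: operator action expected
    alarm_high: float     # high alarm set point: operator action expected
    critical_high: float  # at or above this, the critical limit is breached

    def classify(self, value: float) -> str:
        """Return the zone a reading falls in: normal, alarm or critical."""
        if value <= self.critical_low or value >= self.critical_high:
            return "critical"
        if value <= self.alarm_low or value >= self.alarm_high:
            return "alarm"
        return "normal"


# Hypothetical temperature envelope, in deg C.
envelope = Envelope(critical_low=100.0, alarm_low=120.0,
                    alarm_high=180.0, critical_high=200.0)

for reading in (150.0, 185.0, 205.0):
    print(reading, envelope.classify(reading))
```

The point of the sketch is that the alarm set points sit inside the critical limits, so that the operator still has room to act when the alarm is raised.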
In essence, the Alarm indicating an "abnormal situation" should be easy to understand and presented at a rate that the operator will be able to deal with to initiate a corrective action. Every alarm should have a related corrective action.
Studies by the Abnormal Situation Management (ASM) Consortium have shown that worker actions cause 42% of the abnormal situations or upsets in process operations. A prime example would be the Three Mile Island Incident (1979), when operators could not understand the exact fault due to a "lit-up" panel and took corrective actions that actually led to the incident. The ASM Consortium also notes that 36% of the upsets can be related to Equipment problems, with half of these attributable to the equipment or process units functioning outside of their "operating envelope".
ISSUES & CHALLENGES
Alarm Management should start from the design phase itself, in order to help in the plant console design, layout, training, manual preparation, etc. But in most cases, the plants considered for alarm management are older, and the whole study should be started afresh. Besides, before the plant starts operation, it is difficult to design or configure optimal alarm settings. There may be too many alarms where a single parameter may reliably govern the process (consequential), or there may be too few to give the operator an accurate view of the actual process variance.
FIGURE 6: ILLUSTRATION OF THE LIMITS SURROUNDING A PARAMETER (OPERATING ENVELOPE)
FIGURE 7: OPERATION LIMITS AND ALARM SET POINTS
FIGURE 9: DECISION ON ALARM SETTINGS OFTEN IS A DELICATE BALANCING JOB
Most standards and user documents, including those from the American National Standards Institute (ANSI)/ the Instrumentation, Systems, and Automation Society (ISA) [ISA 18.2 Management of Alarm Systems for the Process Industries], detail the "What" and not the "How" of Alarm Management. Nevertheless, all of the standards and Engineering Practices (including the Engineering Equipment and Materials Users Association (EEMUA) 191, and International Electrotechnical Commission (IEC) 61508 and 61511) provide the basic framework for the implementation of the alarm management system at the facility. Most Distributed Control Systems (DCS) come with an abundance of unstructured alarms.
The Refinery (or any other Hydrocarbon Processing Industry (HPI)) is among the facilities that undergo constant changes in an effort to improve their productivity and market share. This, in turn, initiates a change in the operating envelope. A process/ operating envelope (Figure 6) is a collection of boundary limits that, when exceeded, put the integrity of assets at risk. It becomes a challenge to monitor the change in the envelope and maintain the Alarm levels at the point where operator intervention is meaningful.
There is a lack of Process Safety Information (PSI) systems that define the operating parameters, their limits, the alarm set points, the effects of variance beyond the Maxima and Minima values, the reaction time or priority, etc. Building up such a database demands a large investment of time and of working personnel who may otherwise be busy. Often, it is a question of quantity vs. quality when deciding whether to use all your manpower for improving production. Is there a chance that your fine tuning may result in better control and response in the future? The outcome of such an analysis will often favour production over the intangible, invisible safety benefits.
Alarm Flood Situations arise where the operators get so many alarms coming in that they don't know which actions to take first (over-alarming). This is where Alarm Prioritization and Rationalization come in. Yet, current systems may not reach effective levels for quite some time and need constant monitoring and corrections.
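A flood situation can be flagged from the alarm log with a simple sliding-window count. The sketch below is a minimal Python illustration; the function name `flood_times` is hypothetical, and the default threshold of more than 10 alarms in 10 minutes is an assumed, commonly quoted rule of thumb rather than a figure from this study.

```python
from collections import deque


def flood_times(alarm_times, window_s=600.0, threshold=10):
    """Return the alarm timestamps at which the number of alarms in the
    trailing window exceeds the threshold.

    alarm_times: alarm annunciation times in seconds, sorted ascending.
    window_s and threshold are assumed values, to be set per site.
    """
    window = deque()
    flooded = []
    for t in alarm_times:
        window.append(t)
        # Drop alarms that have fallen out of the trailing window.
        while window and window[0] <= t - window_s:
            window.popleft()
        if len(window) > threshold:
            flooded.append(t)
    return flooded


# Hypothetical log: one alarm every 10 seconds for two minutes.
burst = [i * 10.0 for i in range(12)]
print(flood_times(burst))
```

In this hypothetical burst, the flood condition is first met at the eleventh alarm, since that is when the trailing window holds more than ten alarms.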
Relevance of Alarms also plays a major role in keeping Operators focussed. For example, an alarm with a lower set-point may give an early indication of the abnormal situation developing, but a very low value may add to the Nuisance value. The abnormal situation could take hours to develop, and Operators may feel free to ignore, or become desensitized to, the alarms until values reach their self-managed set-points (stale alarms).
Alarms have been observed to chatter when the process operates too close to the envelope limits. This may also increase the number of alarms/ deviations from the operating envelope. It is essential for control room operating personnel to distinguish between false and genuine boundary excursions, and to ensure that the deviations are relevant. This makes it possible to minimize process upsets or loss of containment.
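Chattering alarms can be picked out of an alarm log by counting how often each tag re-annunciates within a short period. The following Python sketch is illustrative only: the function name, the tag names and the rule of three or more activations within 60 seconds are assumptions, not values taken from the Refinery study.

```python
from collections import defaultdict


def chattering_tags(events, window_s=60.0, min_repeats=3):
    """Return the set of tags that annunciate at least `min_repeats`
    times within any span of `window_s` seconds.

    events: iterable of (timestamp_seconds, tag) pairs, one per activation.
    """
    times_by_tag = defaultdict(list)
    for timestamp, tag in events:
        times_by_tag[tag].append(timestamp)

    chattering = set()
    for tag, times in times_by_tag.items():
        times.sort()
        # Compare each activation with the one (min_repeats - 1) places later.
        for i in range(len(times) - min_repeats + 1):
            if times[i + min_repeats - 1] - times[i] <= window_s:
                chattering.add(tag)
                break
    return chattering


# Hypothetical activations: FI-101 repeats quickly, TI-205 does not.
log = [(0.0, "FI-101"), (0.0, "TI-205"), (20.0, "FI-101"),
       (45.0, "FI-101"), (300.0, "TI-205"), (900.0, "TI-205")]
print(chattering_tags(log))
```

Tags returned by such a check are candidates for a dead band or delay review rather than for outright suppression.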
Operators must sometimes bypass or temporarily render alarms and emergency shut-down devices inoperative so they can either be tested to ensure dependable operation or repaired. Because the process unit typically remains in operation while these alarms or emergency shut-down devices are temporarily out of service, the ability to monitor the process units during this period for possible process upsets or possible need for shutdown of the process is diminished. As a result, it is important for a refinery to minimize the bypass time, communicate awareness of the degraded operational safety condition to all refinery personnel who need to know, and keep records documenting the rationale for, and confirming the restoration of, the bypassed components.
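The record-keeping described above can be illustrated with a small data structure. This is a hedged Python sketch, not the Refinery's actual bypass register; the class `BypassRecord`, its fields, the tag name and the helper `open_bypasses` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class BypassRecord:
    """One bypass of an alarm or shut-down device (hypothetical fields)."""
    tag: str
    rationale: str
    authorised_by: str
    bypassed_at: datetime
    restored_at: Optional[datetime] = None  # None while still bypassed

    def restore(self, when: datetime) -> None:
        """Confirm that the bypassed component is back in service."""
        self.restored_at = when

    def duration(self, now: datetime) -> timedelta:
        """Time spent in bypass, up to restoration or up to `now`."""
        end = self.restored_at if self.restored_at is not None else now
        return end - self.bypassed_at


def open_bypasses(records):
    """Return the records that have not yet been restored."""
    return [r for r in records if r.restored_at is None]


# Hypothetical example: a level alarm bypassed for testing.
record = BypassRecord(tag="LAH-301", rationale="Proof test of level switch",
                      authorised_by="Shift In-charge",
                      bypassed_at=datetime(2012, 6, 1, 9, 0))
print(len(open_bypasses([record])))
record.restore(datetime(2012, 6, 1, 11, 30))
print(record.duration(datetime(2012, 6, 1, 12, 0)))
```

Listing the open bypasses at every shift handover is one simple way to communicate the degraded condition and to keep the bypass time short.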
STEPS IN ALARM MANAGEMENT
Alarm Management may start at any stage of a Processing Plant's lifecycle. For Essar Oil, the journey started in 2009, with the establishment of a Process Safety Management (PSM) System, and the creation of a PSI database. As this neared completion in 2010-2011, the need for a proper Alarm Management system was felt, and acknowledged through various high-level committee meetings, and Internal and External Audits. Any facility hoping to move along this path must first acknowledge the problems that exist. At the Refinery, Alarm Management was done in three phases, viz.
PHASE 1:
ALARM SYSTEM PERFORMANCE STUDY
ALARMS RATE (FOR) | ACCEPTABLE | MANAGEABLE |
10 minutes | 1 | 2 |
1 hour | 6 | 12 |
1 day | 150 | 300 |
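The benchmark table above lends itself to a simple check of measured alarm counts. The Python sketch below is a minimal illustration; the function name `rate_band` and the choice to treat each benchmark as an inclusive upper bound are assumptions.

```python
# Benchmark alarm counts from the table above:
# period -> (acceptable, manageable)
BENCHMARKS = {
    "10 minutes": (1, 2),
    "1 hour": (6, 12),
    "1 day": (150, 300),
}


def rate_band(period: str, alarm_count: int) -> str:
    """Classify a measured alarm count for the given period."""
    acceptable, manageable = BENCHMARKS[period]
    if alarm_count <= acceptable:
        return "acceptable"
    if alarm_count <= manageable:
        return "manageable"
    return "exceeds benchmark"


# Hypothetical measurements.
print(rate_band("1 day", 140))
print(rate_band("1 hour", 10))
print(rate_band("10 minutes", 5))
```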
PHASE 2:
PERFORMANCE IMPROVEMENT PROJECT
SIGNAL TYPE | FLOW | LEVEL | PRESSURE | TEMPERATURE |
Dead Band % | 5 | 5 | 2 | 1 |
All of the results were tabulated for records.
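The effect of a dead band on a chattering alarm can be sketched as follows. This minimal Python illustration assumes that the dead band percentages in the table above are expressed as a percentage of the instrument span, and that a high alarm clears only once the value has dropped below the set point by the full dead band; the class name `HighAlarm` and the numerical values are hypothetical.

```python
# Dead band by signal type, in percent (from the table above).
DEAD_BAND_PERCENT = {"flow": 5.0, "level": 5.0,
                     "pressure": 2.0, "temperature": 1.0}


class HighAlarm:
    """High alarm with a dead band (hysteresis) below the set point."""

    def __init__(self, set_point, span, signal_type):
        self.set_point = set_point
        # Assumption: the percentage applies to the instrument span.
        self.dead_band = span * DEAD_BAND_PERCENT[signal_type] / 100.0
        self.active = False

    def update(self, value):
        """Feed one reading and return whether the alarm is active."""
        if not self.active and value >= self.set_point:
            self.active = True
        elif self.active and value < self.set_point - self.dead_band:
            self.active = False
        return self.active


# Hypothetical pressure alarm: set point 10.0 bar, span 20.0 bar,
# so the dead band is 0.4 bar and the alarm clears below 9.6 bar.
alarm = HighAlarm(set_point=10.0, span=20.0, signal_type="pressure")
readings = [9.9, 10.1, 9.9, 10.05, 9.7, 9.5]
print([alarm.update(v) for v in readings])
```

Without the dead band, the readings hovering around 10.0 bar would raise and clear the alarm repeatedly; with it, the alarm is raised once and clears only when the pressure has clearly recovered.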
PHASE 3:
LIFECYCLE MAINTENANCE & PERFORMANCE MONITORING
FIGURE 10: ALARM PRIORITY THROUGH CONSEQUENCE MATRIX
ANNEXURE-3: CONSEQUENCE MATRIX
FIGURE 11: DEGREE OF OPERATOR RESPONSE
The EEMUA Reference Standard also suggests an additional performance benchmark of:
EFFECTS/ BENEFITS
The major actions taken at our site to achieve adequate alarm rates (in addition to the above) have been to disable/ suppress alarms (where no operator action was required), set Alarm On-Off delay times (to take care of chatter), or, in some cases, change the logic. A drastic reduction in the number of alarms was noticed. At the Refinery site, the numbers reduced from an initial 2456 alarms per hour across 19 Units to 455 alarms (June to August 2012). At the Power plant, there has been a more drastic change, from an initial number of around 9000 to around 500 alarms per day. The process is still in progress and under review, keeping in view the almost constant modifications and expansion activities that are in progress at Vadinar.
It should be noted that the whole process has lent a better understanding of the system among the Operations and Maintenance personnel. It has improved operator response time and complemented the Process Safety Information. There is better monitoring of process excursions, which is being developed further to monitor Critical Operating Parameters (COP). The reduction in alarms also improves operational situation response, affecting production and emergency response activities positively, as has been noted in Process Mock Drills conducted in the plants. Another benefit has been the increased reliability of equipment and the process. The alarm studies led to the identification and correction of a few logic changes, consistent with design and operational safety.
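The On-Off delay mentioned above can be sketched as a small state machine. This is a minimal Python illustration under stated assumptions: the class name `DelayedAlarm`, the delay values and the sample data are hypothetical, and actual DCS implementations of on-delay and off-delay timers may differ.

```python
class DelayedAlarm:
    """Alarm that is raised only after the condition has persisted for
    `on_delay_s`, and cleared only after it has been absent for
    `off_delay_s`."""

    def __init__(self, on_delay_s, off_delay_s):
        self.on_delay_s = on_delay_s
        self.off_delay_s = off_delay_s
        self.active = False
        self._pending_since = None  # when the raw condition began to differ

    def update(self, now_s, condition):
        """Feed one sample (time in seconds, raw alarm condition)."""
        if condition == self.active:
            # Raw condition agrees with the annunciated state: reset timer.
            self._pending_since = None
            return self.active
        if self._pending_since is None:
            self._pending_since = now_s
        delay = self.on_delay_s if condition else self.off_delay_s
        if now_s - self._pending_since >= delay:
            self.active = condition
            self._pending_since = None
        return self.active


# Hypothetical samples: short spikes are ignored, sustained ones are not.
alarm = DelayedAlarm(on_delay_s=5.0, off_delay_s=5.0)
samples = [(0, True), (2, False), (4, True), (6, True), (9, True),
           (10, False), (12, True), (16, False), (21, False)]
print([alarm.update(t, c) for t, c in samples])
```

In this sketch, the two-second spike at the start never reaches the operator, and the brief dip at 10 seconds does not clear an alarm that is still genuinely present.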
CONCLUSION
The demands on Operations are increasing due to a variety of factors, such as: (A) the need for process operation close to maximum efficiency; (B) higher costs of process interruptions; (C) more complex processes; (D) lower safety margins; (E) environmental regulations; (F) fewer operators; or (G) higher staff movement (less experienced operators).
Alarms and instruments form a vital link in communications between important parts of the process and the operator. Without properly functioning alarms and instruments, it is difficult to know the operating status of the process and safety equipment. It is essential that dedicated programs be present for the care and attention of these alarms and instruments. The Mechanical Integrity program should encompass these, and any bypasses should be done through Risk Assessment after authorization at the appropriate level. Any alarm changes should go through a proper Management of Change process.
While alarm rationalization and prioritization studies do often end up reducing the alarm rate, the essential thing to remember is that the Alarm Management process is not aimed only at reducing the number of alarms; it is aimed at optimizing them so that operators are able to react properly to avoid any loss of containment. Resorting to suppression is not suggested. Nuisance alarms can be significant early warning signs of maintenance issues on critical plant process and safety equipment.
Monitoring the Alarm Rate as part of the Process Safety Performance Indicators (PSPI) can go a long way towards helping maintain and fine tune existing systems. In the newer systems, Alarm Management is becoming an integral component of the initial design itself, and is being incorporated in the Vendor Packages, such as advertised by ABB, Honeywell, etc. It should also be noted that the establishment of Critical Operating Parameters (COP) and the Process Envelope will also be beneficial in understanding the process better, leading to better monitoring of the process excursions beyond stated envelope, and ultimately reducing/ stopping Loss of Primary Containment. It should be always remembered that alarms are a part of the overall plant Layers of Protection, and as such should be maintained reliably.
REFERENCES
[*Gist of the Technical Paper presented on ALARM MANAGEMENT and the Experience in ESSAR OIL REFINERY during a PSM SEMINAR at Kuala Lumpur, Malaysia in April 2014.]
AUTHORS:
DR. GOPAL JAYARAMAN, Head (HSE), Energy Business, Member: ASSE
The author has helped develop the HSEF Department together with the Essar Refinery Integrated Management System for the Refinery and is currently working on establishing a World Class Safety Performance for Essar Energy Business (Essar Refinery, E&P and Power). He has published Papers and received numerous awards for the Essar Group.
Dr. Jayaraman graduated in Chemical Engineering (1971) and obtained a PhD in Environment Science & Ecology with a special focus on the Oil & Gas Sector. He has over 40 years of Industrial Experience in Operation, Project Execution and HSE in Oil, Gas & Petrochemicals (Upstream & Downstream).
He can be reached by mailing to Jayaraman.Gopal@essar.com/ jayagopal6@gmail.com (LinkedIn/ Facebook Profiles)
MR. KAUSHIK JAYARAMAN
Dy. Manager (Process Safety Management), Essar Refinery
Member: ASSE, IChemE, IIChE
The author has been a part of the Process Safety Management division since 2011 and has worked in development and implementation of the Process Safety Management System at the Refinery.
Mr. Kaushik completed his graduation (B. Tech, Petrochemical Technology) and post-graduation (M. Tech, Gas Engineering) studies with Gold Medals and has Diplomas in Fire Safety, Industrial Safety, Project Planning Management, and Work Place Safety. He has Industrial Experience in Pipelines, Refining, and Safety (Operational, Process, and Occupational) in Oil & Gas.
He can be reached at Kaushik.Jayaraman@essar.com / jkeins@gmail.com (LinkedIn/ Facebook Profiles)
Other Contributors
1. Mr. Rajesh Shah (Head-Safety, Essar Oil Ltd),
2. Mr. Prakash Pathak (JGM, Instrumentation, Essar Oil Ltd),
3. Mr. Rajesh Mandaliya (Sr. Manager, Instrumentation, Essar Oil Ltd),
4. Mr. M. K. Sharma (Head-Operations, Essar Power – Vadinar Power Company Ltd)