Fundamentals of Operational Reliability


Operational reliability is a priority for any process unit facility. One definition of Reliability is: “The ability of an apparatus, machine, or system to consistently perform its intended or required function or mission, on demand and without degradation or failure.” Owners and operators consider Reliability a priority, but achieving stable, reliable operations is usually easier said than done. A reliable operation typically delivers results that are safe and stable and that maximize profitability.

This blog will address, at a high level, the fundamentals of what is required to implement and achieve a reliable operation. These fundamentals include the principles of “Design it Right; Operate it Right; Maintain it Right”. Reliable operations require competent leadership and an employee/contractor culture of accountability. It has been demonstrated that an operation with excellent safety performance will likely have solid reliability performance. There are exceptions to this “rule of thumb”, but in most instances it holds true because the fundamentals of good safety practice used by employees and contractors also translate into good work practices relating to equipment performance.

Figure 1 below illustrates the functional relationships and intersections needed to achieve operational reliability.

Figure 1: Intersection of reliability functions

Design It Right

Process equipment that is “fit for service” is generally the outcome of any grassroots (new construction) facility design. The initial design conditions are integrated across all operating units, allowing them to function reliably and produce products as intended. Design should incorporate Recognized And Generally Accepted Good Engineering Practices (RAGAGEP). However, initial design conditions invariably change as the operation adapts to changing environmental conditions, market demands, customer demands, equipment efficiency upgrades, etc. When the equipment was initially designed, reliability objectives should have been broadly defined in terms of:

    • Reliability philosophy including benchmarks, availability, affordability of redundancy, etc.
    • Operating conditions including process conditions, environmental factors, unit run lengths, etc.
    • Unit operating modes and performance expectations
    • Equipment performance including benchmarks, failure modes, life expectancy, etc.
    • Equipment design including simplest design for rotating equipment, instrumentation philosophy, etc.
    • Specialized operator and maintenance skills


However, changes to the objectives outlined above may require equipment alterations, including new process units, and these new conditions must then be integrated into the operation. Changes to operating limits and procedures typically impact downstream operations and other equipment. Any change to equipment requires a rigorous management of change (MOC) process to understand the technical, operating, and maintenance impacts on the operation. All too often, process equipment changes are designed with an ineffective MOC process that fails to assess risks and implement steps to mitigate them.

Operate it Right

Operations is the “owner” of Reliability. This translates into direct responsibility to ensure that operating procedures, technical input/changes and maintenance work are all carried out in accordance with best practices. Key aspects of Operator Reliability:

    • There should be an annual Reliability Improvement Plan that is driven by and supported by Operations
    • Operations should expect accountability for any aspect of change that might affect its operating windows/limits and operating modes.
    • Ensure that operating windows are in place at all times and that changes to equipment, feedstock, process rates, etc. have been properly vetted and steps taken to mitigate risks including necessary operator training.
    • Establish a weekly/monthly scorecard to track Reliability performance, including environmental and personnel safety, incidents and follow-up, and equipment performance for rotating equipment, I/E/A, and bad actors. Each of these metrics should have benchmark-driven targets and goals embedded in the scorecard.
    • Operators must be “checked out” on the equipment, complete appropriate operator rounds, and record the results; early identification of unstable or substandard equipment performance is critical to assessing risks and follow-up actions.
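
As a rough illustration of the scorecard idea above, the following Python sketch compares each metric against its benchmark-driven target and lists the ones that are off target. The metric names, values, and targets are hypothetical placeholders chosen for the example, not figures from any real facility.

```python
from dataclasses import dataclass


@dataclass
class Metric:
    """One scorecard line: a measured value and its benchmark-driven target."""
    name: str
    actual: float
    target: float
    higher_is_better: bool = True  # False for counts such as incidents or leaks

    def on_target(self) -> bool:
        # A metric meets its target when it is at or beyond the target
        # in the "good" direction.
        if self.higher_is_better:
            return self.actual >= self.target
        return self.actual <= self.target


def off_target(metrics):
    """Return the names of metrics that miss their targets, in scorecard order."""
    return [m.name for m in metrics if not m.on_target()]


# Hypothetical monthly scorecard (all numbers are illustrative assumptions).
scorecard = [
    Metric("pump_mtbf_months", actual=48.0, target=60.0),
    Metric("compressor_availability_pct", actual=98.5, target=97.0),
    Metric("recordable_incidents", actual=1, target=0, higher_is_better=False),
]

print(off_target(scorecard))
```

In practice the off-target list would feed the incident follow-up and bad actor reviews that Operations drives; the point of the sketch is only that every metric carries its own target and direction.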


Operations “owns” the equipment and is responsible for its care and performance; Operations holds others accountable for supporting the overall Reliability goals.

Maintain it Right

For many years the notions of reliability and maintenance have been linked together, which can create the belief that Maintenance is responsible for Reliability. This, of course, is not true. Maintenance is an important contributor to Reliability, but there is much more to consider. It is the fundamental role of Maintenance to focus on:

    • Executing work in the most efficient way possible to support the operation and technical priorities. It is not the responsibility of Maintenance to decide what these priorities are, rather, these priorities are decided by Operations with support from both Maintenance and Technical
    • Leadership and support in developing Reliability improvement plans and initiatives that support Operations and Technical goals
    • Well documented work processes detailing how work is identified, planned, scheduled and executed including a continuous improvement loop.
    • Preventative and predictive maintenance programs in place for all equipment classes
    • A reliability scorecard with KPIs for maintenance efficiency (e.g. work order management, backlog) and equipment reliability (e.g. pump mean time between failures [MTBF], compressor availability, piping leaks). These key performance indicators (KPIs) are critical to Operations’ understanding of Reliability performance and should be of joint interest to Maintenance, Operations and Technical leadership
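
Two of the equipment KPIs named above, pump MTBF and compressor availability, can be sketched with commonly used textbook formulas. This is a minimal sketch under stated assumptions: the function names are hypothetical, the fleet MTBF uses the simple "machine-months in service divided by failures" convention, and a given site may define these KPIs differently.

```python
import math


def fleet_mtbf_months(pump_count: int, period_months: float, failures: int) -> float:
    """Fleet MTBF = total machine-months in service / number of failures.

    Assumes every pump was in service for the whole period (a simplification).
    """
    if pump_count <= 0 or period_months <= 0 or failures < 0:
        raise ValueError("pump_count and period_months must be positive, failures non-negative")
    if failures == 0:
        return math.inf  # no failures observed in the period
    return pump_count * period_months / failures


def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Availability = uptime / (uptime + downtime), as a fraction between 0 and 1."""
    total = uptime_hours + downtime_hours
    if uptime_hours < 0 or downtime_hours < 0 or total <= 0:
        raise ValueError("hours must be non-negative and total time must be positive")
    return uptime_hours / total


# Illustrative numbers only: 100 pumps, 12 months, 20 failures.
print(fleet_mtbf_months(100, 12, 20))          # 60.0 months
# A compressor that ran 8,500 hours and was down 260 hours in a year.
print(round(availability(8500, 260) * 100, 2))  # availability in percent
```

Whatever definitions are chosen, they should be written down and applied consistently so that Maintenance, Operations and Technical are reading the same numbers.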


It is worth noting that many excellent practices exist to improve Maintenance focus, productivity, and support to Operations/Technical. Although there is insufficient space in this blog to detail these practices, a sample is outlined below:

    • Reliability centered maintenance to identify, categorize, and risk-assess equipment
    • Risk based work selections (RBWS)
    • Equipment strategies for key equipment to develop operating and repair procedures for critical equipment
    • Bad actor identification and elimination


The above is a high level overview of Reliability fundamentals and touches on only a few of the important aspects of process equipment Reliability. Key principles of Design it Right, Operate it Right and Maintain it Right create a foundation to build Reliability improvements. Aspects including Turnaround planning and execution, Performance benchmarking, Project Total Quality Assurance (TQA), Project development, Operations Autonomous Maintenance, etc. are some of the many best practices available to support Reliability.

The above fundamentals, although they may well ring true with most readers, are in many cases improperly implemented or missing altogether in process-based operations. This tends to be the case for smaller, stand-alone Refining or Chemical plants where insufficient resources or leadership exist to develop and implement reliability systems.

Becht has both the capability to assess Reliability performance (including cold eyes reviews with protocols) and subject matter experts (SME) to support all aspects of the above fundamentals.  Operational Reliability cannot be “bought” by simply rebuilding or building new equipment.  Operational Reliability can only be achieved with the proper intersection of designing, operating and maintaining equipment right.


About The Author

Bruce Scott began his professional career with Shell Canada, where he worked for 10 years, then moved to Imperial Oil for 25 years and retired in 2010. He holds a degree in mechanical engineering and is a member of the Professional Engineers of Ontario. His experience is broad based, including pipelines, marketing facilities, crude oil supply, facilities planning, numerous refining operating assignments in maintenance, projects, reliability and technical leadership, and oil sands operations. During his refinery operating assignments in both Europe and Canada, he worked to develop and implement industry best practices for equipment reliability. Since retirement, he has provided a broad scope of consulting services through Becht to the petrochemical industry in North America and the Far East.
