Top 5 Data Center Operations Tips to Reduce Human Error in Critical Environments

Salute Mission Critical shares data center operations tips for reducing human error and downtime.

The Uptime Institute claims that approximately 70% of data center outages are caused by human error, making it the largest cause of downtime in our industry and the most preventable. We can trace human errors back to three interrelated aspects: the job, the individual and the organization. The job may be complex in nature; at times, demands can be overwhelming and procedures may not be as clear as they should be. For individuals, we consider confidence and competence as reflected in their attitude, personality and skill level. Organizationally, resources, communications and culture can impact behavior and lead to errors in performing work.

These factors can limit an individual’s ability to focus on managing risk and avoiding errors. The following tips and tricks are used throughout a successful data center operating model to sharpen technicians’ focus and reduce the likelihood of an outage.

Locating Precise Locations with a Data Center Grid System

A grid system for floor and cabinet locations helps technicians navigate the white space and pinpoint facility or IT infrastructure equipment. Start with letters in sequence on one wall, aligned with floor tiles or spaced two feet apart for solid floor environments.

Use double letters once A-Z has been exhausted (for example, AA, AB, AC and so on). The adjacent walls will be labeled with numbers in sequence using the same spacing. This creates specific locations for moves, adds, changes or maintenance and repair activities. For example, a location in the room called F10 sits where the F and 10 labels, posted near the ceiling on two adjacent walls, line up. Continue the grid system for labeling assets in cabinets that normally don’t get relocated, such as power strips, patch panels and cables.
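
For teams that also track locations in software, the labeling scheme above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed tool: the function names are hypothetical, the numbering is assumed to start at 1, and the letter sequence is assumed to continue past AZ with BA, BB and so on, which the scheme above does not spell out.

```python
def letter_label(index: int) -> str:
    """Letter label for a zero-based position: 0 -> A, 25 -> Z, 26 -> AA, 27 -> AB."""
    if index < 0:
        raise ValueError("index must be non-negative")
    label = ""
    n = index + 1  # work one-based so that Z rolls over to AA
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label


def grid_location(letter_index: int, number_index: int) -> str:
    """Combine zero-based wall positions into a label such as F10.

    Assumption: numbers on the adjacent wall start at 1.
    """
    if number_index < 0:
        raise ValueError("number_index must be non-negative")
    return f"{letter_label(letter_index)}{number_index + 1}"


def locate(x_feet: float, y_feet: float, spacing_feet: float = 2.0) -> str:
    """Grid label for a point measured in feet from the corner where the walls meet.

    x_feet runs along the lettered wall, y_feet along the numbered wall.
    The two-foot default matches the tile spacing described above.
    """
    if x_feet < 0 or y_feet < 0:
        raise ValueError("offsets must be non-negative")
    return grid_location(int(x_feet // spacing_feet), int(y_feet // spacing_feet))


if __name__ == "__main__":
    print(grid_location(5, 9))   # F10
    print(locate(11.0, 19.0))    # F10
    print(letter_label(26))      # AA
```

With two-foot spacing, a point 11 feet along the lettered wall and 19 feet along the numbered wall falls in the sixth letter column and the tenth number row, which is the F10 location used in the example.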

A Reference Guide to Verify Normal Data Center Operating Conditions

Placing small circular magnets on facility infrastructure equipment can provide a quick reference guide to verify normal operating conditions. It’s one more aid to help technicians who perform rounds and readings confirm what’s expected and easily identify potential issues. Put the magnets near a breaker or toggle position and, when applicable, use colored magnets that correspond to the expected conditions. For example, a green magnet by an open breaker and a red one by a closed breaker make it easy for a technician to match what they see against what they expect.
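
The same expected-versus-observed comparison can back up a digital rounds checklist. The Python sketch below is only an illustration under stated assumptions: the breaker identifiers, the `BreakerCheck` record and the function names are hypothetical, and the color mapping simply mirrors the green-for-open, red-for-closed example above.

```python
from dataclasses import dataclass

# Color convention taken from the example above: green = open, red = closed.
MAGNET_COLOR = {"open": "green", "closed": "red"}


@dataclass(frozen=True)
class BreakerCheck:
    breaker_id: str  # hypothetical identifier, e.g. a panel and breaker number
    expected: str    # normal operating position: "open" or "closed"
    observed: str    # position the technician recorded on the round


def magnet_color(expected_position: str) -> str:
    """Magnet color to place next to a breaker with this normal position."""
    return MAGNET_COLOR[expected_position]


def find_mismatches(checks: list[BreakerCheck]) -> list[str]:
    """Return the IDs of breakers whose observed position differs from the expected one."""
    return [c.breaker_id for c in checks if c.observed != c.expected]


if __name__ == "__main__":
    rounds = [
        BreakerCheck("PDU-A/CB1", expected="closed", observed="closed"),
        BreakerCheck("PDU-A/CB2", expected="open", observed="closed"),
    ]
    for breaker_id in find_mismatches(rounds):
        print(f"Check {breaker_id}: position does not match its magnet")
```

A list like this does not replace the magnet on the equipment; it only records the same expectation so that a mismatch noted during rounds can be flagged for follow-up.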

Warning: Hazardous Equipment or Materials

A tactile warning on door handles, such as knurling or abrasive tape applied to the contact surface, provides an additional reminder that a technician is about to enter a room that contains potentially hazardous equipment or materials. Some jurisdictions may require tactile warnings for doors to hazardous areas. Whether during scheduled maintenance or an evacuation, the sense of touch can trigger a reminder that what’s behind the door is an area that calls for extra caution.

Wayfinding: Enhancing the Experience of the Facility

Most data center operators use color coding to reduce human error; for example, distribution path A is blue and path B is green. Some data centers have taken this a step further and applied the colors to front and side panels on facility infrastructure equipment. Colored labels on this equipment should be used at a minimum; having panel covers painted to match takes risk mitigation to the next level beyond the standard black, white and gray.

Cover Up: Mitigate the Risk of EPO Activation

Emergency Power Off (EPO) activation is almost always accidental, so EPO buttons should be protected by a hinged plastic cover. The cover adds one more step to activation, as it must be lifted before the button can be accessed. To further mitigate the risk of an unintended EPO event, wire the cover with a pressure-sensitive contact switch that will sound an alarm the moment the cover is lifted.

Human error is inevitable, but these tips and tricks can help technicians focus on the work they are about to perform and minimize human errors in their data center.

Mike Jones, SVP Facility Operations

To learn more about the best practices and programs we use to manage risk in data center operations at Salute Mission Critical, contact us.
