Save money, time, and frustration with early operations intervention
The pace of growth and change in our industry has made for thrilling careers, and those who have thrived are those who learned and adapted the fastest. Those who embrace bringing operations in at the early stages of designing a data center benefit from cost savings and speed to market, because early involvement eliminates rework and increases reliability from day one. About ten years ago, Lee Kirby wrote a paper for the Uptime Institute Journal titled “Start with the End in Mind.” It was a position paper that distilled his lessons learned from working with the major players as the industry expanded. With key leaders sharing the message, the article drove a healthy discussion within the industry.
However, the adage that if you keep doing the same thing, you will get the same result holds. Given that the same legacy mindset still hampers many data center build projects, I wanted to share the most common opportunities to bring operations in during the design of a build or retrofit project. The two important categories to focus on are Maintenance and Serviceability and Operational Risk.
Maintenance and Serviceability
Training for operations staff
Operators require training from equipment and software vendors on all in-house systems, including but not limited to the Building Management System (BMS), access control, closed-circuit television (CCTV), heating, ventilation, and air conditioning (HVAC), the uninterruptible power supply (UPS), generators, and any other equipment or systems the Operator needs to manage site operations effectively.
Data center space planning involves the careful consideration and management of physical resources such as floor space, power, cooling, and networking infrastructure to ensure optimal utilization and availability of IT equipment. Space to work safely and allow for the replacement of significant components within an asset is essential. Rear access to equipment is often required for maintenance and repairs.
Decisions about which equipment can be installed above the floor or on rooftops should be made through a space-planning lens. Operations staff need safe access to maintain, collect data from, or repair that equipment; work that requires lifts is time-consuming, expensive, and creates additional safety risks.
Asset lifecycle strategies
This involves the planning, management, and optimization of IT assets throughout their entire lifespan, from acquisition to disposal. Installing larger assets like transformers, generators, chillers, and Remote Condensing Units (RCUs) during building construction is easy before doors and walls go up. But 10 to 15 years later, when it is time to physically replace those assets in a live data center, the work can create significant challenges, especially when it requires shutting down critical load or removing walls.
Open grate flooring
A raised flooring system that allows for better airflow and cooling in data centers, as well as easier access to underfloor cabling and infrastructure. This is common in the multi-story gantry approach but creates safety concerns: any dropped tool, nut, or bolt may be lost or may strike infrastructure on the floor below. It is also very challenging to maneuver heavy equipment and material across and around open grate flooring. In addition, heat rejection from one asset can directly impact another (e.g., heat from ground-level generators rising to chillers above).
Domestic water services above critical plant or data rooms
The installation of plumbing systems that deliver potable water to a building’s critical infrastructure, such as plant rooms or data centers, from above rather than below. This is a simple one to prevent, but you still see it today. Typically, it is addressed by installing bunds, drip trays, and leak detection, but water always seems to find a way around protective measures. Chilled water is considered an exception, as automatic measures can detect and isolate leaks.
The Security Operations Center (SOC) and Facility Operations Center (FOC) should be powered by a UPS
All power outlets in the SOC and FOC shall be on a UPS so that both rooms remain powered during a loss of power to the site, or parts of the site, until primary power is restored or the emergency power systems become available, whichever comes first. When an incident affects critical load, it is valuable to retain segregated monitoring, troubleshooting, and access control capabilities.
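As a rough illustration of why UPS sizing for the SOC/FOC matters, the sketch below estimates battery runtime from usable battery energy and load. The function name, inputs, and default efficiency are illustrative assumptions, not a sizing method from this article; real sizing must follow the manufacturer's discharge curves and derating data.

```python
def estimated_runtime_minutes(battery_wh, load_w, inverter_efficiency=0.9):
    """Back-of-envelope UPS runtime: usable battery energy divided by load.

    battery_wh, load_w, and inverter_efficiency are illustrative inputs;
    real designs must account for battery aging, discharge curves, and
    manufacturer derating.
    """
    if load_w <= 0:
        raise ValueError("load must be positive")
    return battery_wh * inverter_efficiency / load_w * 60

# e.g., 5 kWh of battery feeding a 2 kW SOC/FOC load:
runtime = estimated_runtime_minutes(5000, 2000)  # 135 minutes
```

The point of the estimate is to confirm the SOC/FOC stays up long enough to bridge the gap until generators carry the site.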
Asset identification and labeling
Clearly identifying and labeling IT assets, equipment, and infrastructure within a data center environment makes management, troubleshooting, and maintenance easier. Labeling must be complete, unique, correct, and up to date; Operators depend on it. All equipment shall be uniquely and visibly labeled, and labeling shall be consistent and accurate throughout all design and as-built documentation so that operations and security staff can execute their duties confidently and accurately.
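Uniqueness and completeness of labels can be checked mechanically. The sketch below assumes a hypothetical inventory format (a list of dicts with a "label" field); the function and field names are illustrative, not from any particular DCIM tool.

```python
def validate_labels(assets):
    """Check an asset inventory for missing or duplicate labels.

    `assets` is a list of dicts with a 'label' key (a hypothetical
    inventory format). Returns (unlabeled_assets, duplicate_labels).
    """
    seen = {}
    unlabeled = []
    for asset in assets:
        label = (asset.get("label") or "").strip()
        if not label:
            unlabeled.append(asset)
        else:
            seen[label] = seen.get(label, 0) + 1
    duplicates = sorted(label for label, count in seen.items() if count > 1)
    return unlabeled, duplicates

inventory = [
    {"id": 1, "label": "UPS-A-01"},
    {"id": 2, "label": "UPS-A-01"},  # duplicate label
    {"id": 3, "label": ""},          # unlabeled asset
    {"id": 4, "label": "GEN-B-02"},
]
unlabeled, duplicates = validate_labels(inventory)
```

Running a check like this against as-built documentation before handover catches labeling gaps while they are still cheap to fix.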
Automatically synchronize system clocks
Owners and Operators should agree that all in-house systems shall have automated synchronization of clocks, managed by the owner and implemented during equipment and system installation to ensure all timestamps are identical. Building a timeline of historical events while troubleshooting an incident is a lot easier when all system clocks are in sync.
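The value of synchronized clocks shows up when merging per-system logs into one incident timeline. The sketch below uses hypothetical log entries from separate systems; the ordering it produces is only trustworthy if every source clock is synchronized (e.g., via NTP).

```python
from datetime import datetime

# Hypothetical event logs exported from separate systems (BMS, UPS, CCTV).
bms_log = [("2024-03-01T02:14:05", "BMS: chiller 2 alarm")]
ups_log = [("2024-03-01T02:14:03", "UPS: transfer to battery")]
cctv_log = [("2024-03-01T02:14:08", "CCTV: motion in plant room")]

def unified_timeline(*logs):
    """Merge per-system logs into one chronological incident timeline.

    A skewed clock on any source silently reorders events, which is why
    automated clock synchronization matters.
    """
    events = [(datetime.fromisoformat(ts), msg)
              for log in logs for ts, msg in log]
    return sorted(events)

timeline = unified_timeline(bms_log, ups_log, cctv_log)
```

With clocks in sync, the UPS transfer correctly appears first, showing the sequence of cause and effect across systems.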
These are the lessons we have learned, and each is an opportunity to bring operations into the design phase of a build or retrofit project. A project that considers them from the onset will deliver long-term operational effectiveness and reduce risk to tenants, while avoiding the rework and delays these issues can cause. The return on these investments is measured by how quickly you can populate the data center with the revenue-generating businesses that drove the demand in the first place. Now is the time to change the way we think about operations.