
Key aspects of designing a data center for maintainability

Designing your data center for maintainability is key to successful operations and maintenance. Learn best practices to implement during the data center design and construction phases that will pay dividends on your facility's uptime for years.

Successful organizations recognize that maintaining continuous operations requires giving the operations and maintenance (O&M) staff all the necessary tools and resources. Obviously, this means designing and building a concurrently maintainable and fault-tolerant facility. But this is just the start. The O&M staff also requires the tools and resources to properly operate and maintain the facility over its life of 15 to 20 years or longer.

Owners should not view these aspects independently (design and build first, then provision staff and resources). Instead, both aspects should be considered holistically, beginning in the programming stage and continuing throughout the life of the facility. In effect, develop a culture and thought process where all of the interdependencies and requirements between design and construction and O&M are considered continuously.

This approach begins with what can be called "designing for maintainability." By establishing the O&M requirements as part of the original programming process, you can identify and take advantage of significant opportunities. The designed and constructed facility must provide the physical capabilities to meet the O&M staff's needs, and the O&M organization must have the capabilities to meet the facility's needs. This starts with the Owner's Project Requirements (OPR) document.

Data center owner's project requirements document
The OPR document captures and details the functional requirements of a project. This document typically includes high-level requirements such as system redundancy, ability to operate without off-site utilities (and for how long this is possible), space planning and use, occupancy levels, etc. This document should also include O&M considerations like system/equipment naming conventions, valve and switch tagging requirements, storage and spare parts requirements, and as-built and closeout documentation requirements. By including these elements during the programming phase, the design and construction project team will by necessity have to ensure the project delivers both the physical infrastructure and the O&M foundations for continuous operations.

Once it is acknowledged that fundamental O&M requirements are necessary, it becomes obvious that early planning can save significant resources. Combining the O&M requirements definition with the design and construction phase, and developing O&M processes and deliverables during construction, not only ensures these are in place on Day One, but also saves time, labor and money. Examples include developing the maintenance procedures and program, training O&M staff, provisioning spare parts and establishing service-level agreements (SLAs).

Organize the Computerized Maintenance Management System in the design phase
The documented history of implementing Computerized Maintenance Management Systems (CMMS) is filled with gallant attempts that resulted in compromised success, blown budgets and sometimes outright failure. In many cases the fundamental problem is lack of proper planning and forethought. The easiest time to organize a CMMS program and develop naming conventions is during the project design phase. This allows the construction documents and installed equipment to be labeled consistent with the naming convention employed in the CMMS. The best time to collect requisite maintenance procedures, tool requirements and spare parts data is during the project equipment submittal process and when O&M manuals and other closeout documents are assembled. Likewise, by requiring the project team to formally consider the O&M needs during the project programming and design phase, these O&M information and documentation requirements can be embedded in the contract documents to ensure delivery by the contractors.
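As a minimal sketch of how a naming convention fixed during design might be checked against tags that appear in equipment submittals, the Python below validates tags against one pattern. The `<SYSTEM>-<TYPE>-<NUMBER>` format, the sample tags and the function names are illustrative assumptions, not a convention taken from any particular CMMS product or standard.

```python
import re

# Hypothetical convention: <SYSTEM>-<TYPE>-<NUMBER>, for example "CHW-P-001".
# The pattern is an illustrative assumption, not an industry standard.
TAG_PATTERN = re.compile(
    r"(?P<system>[A-Z]{2,4})-(?P<type>[A-Z]{1,4})-(?P<number>\d{3})"
)


def parse_tag(tag):
    """Return the tag's parts as a dict, or None if it breaks the convention."""
    match = TAG_PATTERN.fullmatch(tag)
    return match.groupdict() if match else None


def find_invalid_tags(tags):
    """Return the tags that do not follow the convention, in input order."""
    return [tag for tag in tags if parse_tag(tag) is None]


# Made-up tags as they might appear in a submittal log.
submittal_tags = ["CHW-P-001", "UPS-A-002", "chw-p-3", "GEN-01"]
invalid_tags = find_invalid_tags(submittal_tags)
print(invalid_tags)
```

Running a check like this at each submittal review would flag nonconforming tags before they reach construction documents or field labels, which is when they are cheapest to correct.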

Another example is the negotiation of service-level agreements (SLAs) during the contract bid phase. Owners can negotiate the lowest total cost of ownership (TCO) by requiring vendors to include long-term maintenance proposals as part of their proposals for upfront construction. Aspects that can be disclosed and considered as part of the overall contract award include fixed labor rates; unit costs for parts; minimum response times; required critical spares to be kept on-site, on the technician's vehicle or in close proximity; and escalation rates over time. Vendors are typically much more competitive when SLAs are negotiated as part of the construction contract than when they're established after the fact.
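To illustrate why maintenance terms belong in the bid comparison, here is a rough sketch that adds escalating annual maintenance costs to the construction price. The vendors, dollar figures, escalation rates and 15-year horizon are all made up for illustration, and the calculation deliberately ignores discounting, parts and response-time penalties.

```python
def total_cost_of_ownership(construction_cost, first_year_maintenance,
                            escalation_rate, years):
    """Sum the upfront price and each year's maintenance cost, escalated annually."""
    maintenance_total = sum(
        first_year_maintenance * (1 + escalation_rate) ** year
        for year in range(years)
    )
    return construction_cost + maintenance_total


# Hypothetical bids: (construction cost, first-year maintenance, annual escalation).
bids = {
    "Vendor A": (1_000_000, 50_000, 0.03),
    "Vendor B": (950_000, 60_000, 0.05),
}

tco_by_vendor = {
    name: total_cost_of_ownership(cost, maintenance, rate, years=15)
    for name, (cost, maintenance, rate) in bids.items()
}
lowest_tco_vendor = min(tco_by_vendor, key=tco_by_vendor.get)

for name, tco in tco_by_vendor.items():
    print(f"{name}: {tco:,.0f}")
print("Lowest TCO:", lowest_tco_vendor)
```

With these invented numbers, the bid with the lower construction price carries the higher total cost once maintenance and escalation are included, which is the kind of difference that only surfaces when the SLA terms are disclosed alongside the construction proposal.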

Consistency from construction documents to O&M
Quality O&M procedures are built on clear and accurate documentation that reflects the physical environment. In other words, as-built drawings should include equipment IDs and valve and switch numbers that are consistent with those cited in O&M procedures and that match the actual tags and labels found in the field. When consistency is not considered during the design and construction phase, the effort to reconstruct it afterward is not only significant but also increases the risk of errors. By requiring these conventions to be used throughout design and construction, the installed equipment, valves and switches can be permanently labeled during installation, concurrently with the development of O&M procedures and the CMMS. O&M manuals and Systems Operations and Maintenance Manuals (SOMMs) can be developed and assembled to reflect these consistencies.
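The consistency described above can be checked mechanically once tag lists exist for each source. The sketch below assumes the tags have already been extracted into three sets; the source names and tag values are hypothetical.

```python
def find_tag_mismatches(sources):
    """Map each tag to the sources missing it, for tags not present everywhere."""
    all_tags = set().union(*sources.values())
    mismatches = {}
    for tag in sorted(all_tags):
        missing = sorted(name for name, tags in sources.items() if tag not in tags)
        if missing:
            mismatches[tag] = missing
    return mismatches


# Hypothetical tag sets pulled from drawings, procedures and a field walkdown.
sources = {
    "as_built_drawings": {"CHW-V-101", "CHW-V-102", "UPS-SW-001"},
    "om_procedures": {"CHW-V-101", "CHW-V-102", "UPS-SW-001"},
    "field_labels": {"CHW-V-101", "CHW-V-120", "UPS-SW-001"},
}

mismatches = find_tag_mismatches(sources)
for tag, missing_from in mismatches.items():
    print(f"{tag} is missing from: {', '.join(missing_from)}")
```

In this made-up data, a valve labeled CHW-V-120 in the field appears where the drawings and procedures expect CHW-V-102, the sort of discrepancy that is far easier to resolve during installation than after turnover.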

Critical operations require detailed procedures that include all expected modes of operations. The modes and configurations are usually categorized as "normal," "maintenance" and "emergency" (and sometimes "recovery"). These procedures should reflect the design intent and optimize the installed redundancies to reduce risk and optimize plant performance. This starts with the design engineer's written sequences of operations, proceeds to the commissioning agent's Functional Test and Integrated Systems Test scripts, and culminates in final O&M procedures. Basically, the commissioning scripts validate the engineer's sequences of operations, and the O&M procedures are validated by the successful commissioning scripts.

By combining these development processes, the resulting O&M procedures will provide clear and concise direction to O&M staff on how to operate in all modes. The O&M procedure can include the sequence of operation, single-line and flow diagrams, step-by-step procedures (including valve and switch numbers), descriptions of expected responses and outcomes, and reference manuals, drawings, etc. Again, including the detailed requirements for these deliverables as part of the overall project requirements, rather than addressing them separately, can save time and effort as well as reduce errors.
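One way to track whether procedures exist for every operating mode is a simple record per procedure plus a coverage check. This is only a sketch: the `Procedure` fields, the mode names as an enum and the sample steps are assumptions made for illustration, and a real procedure would also carry the diagrams, expected responses and references listed above.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    NORMAL = "normal"
    MAINTENANCE = "maintenance"
    EMERGENCY = "emergency"


@dataclass
class Procedure:
    system: str
    mode: Mode
    steps: list = field(default_factory=list)  # step text citing valve/switch tags


def missing_modes(procedures, system):
    """Return the modes for which the system has no procedure with written steps."""
    covered = {p.mode for p in procedures if p.system == system and p.steps}
    return [mode for mode in Mode if mode not in covered]


# Hypothetical procedures for one system.
procedures = [
    Procedure("chilled water", Mode.NORMAL, ["Verify CHW-V-101 is open"]),
    Procedure("chilled water", Mode.MAINTENANCE,
              ["Close CHW-V-101", "Open CHW-V-102"]),
]

gaps = missing_modes(procedures, "chilled water")
print([mode.value for mode in gaps])
```

A coverage list like this could be reviewed alongside the commissioning scripts so that gaps are closed before the site goes live.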

Bring staff onboard for training during data center design phase
Properly trained O&M staff is a prerequisite to continuous operations. It has been documented that the majority of impact events at critical facilities are due in some part to human activity. It is unreasonable to design and construct a facility and then bring in a new O&M staff, provide a few hours or days of generic training, and expect them to have the necessary skills and knowledge to reliably operate these complex infrastructures. Instead, the training and site-specific education of the O&M staff should begin during the design and construction phase, with the objective of having a fully trained staff with all required information, documentation and resources available on Day One when the site commences critical operations.

The first step should be to identify and assign O&M staffing to the site and get them involved in the project as early as possible. Staff members should have input into the OPR document to ensure the O&M needs are considered, as discussed earlier. They should then manage the development of the O&M deliverables concurrently with design and construction to take advantage of the opportunities also mentioned earlier. They should participate in site tours and inspections throughout the construction process, support development of the commissioning scripts, and actively witness and participate in the startup and testing of the facility, including Factory Witness Tests, progress inspections, Pre-functional Tests, Functional Tests and Integrated Systems Tests. At the completion of these activities, the O&M staff is ready to fully absorb formal, site-specific O&M training that should occur at substantial completion, just prior to the site going live.

This strategy lays out a progressive training process in which the O&M staff gains overall familiarity with the design intent, constructed facility, operating modes, and maintenance requirements, and is fully trained and organized on Day One. The formal training should be video recorded and can include written materials that are saved to provide refresher training, remedial training and of course new hire training over the life of the facility.

Begin with the end in mind
According to Stephen Covey's book The 7 Habits of Highly Effective People, it's important to "begin with the end in mind." The ultimate goal is not to design, build, commission and deliver a critical facility. It is to provide continuous operations for the life of the facility in support of critical missions. So it makes sense to take a holistic approach to the design, construction and delivery of a critical facility, one that provides the best overall value and the highest prospect that continuous operations can be achieved. By designing for maintainability and including the requisite O&M needs with the physical facility's design and construction, you can save time and money, provide the capabilities and functionalities expected, deliver the highest quality products, and begin Day One operations with a facility and staff capable of optimal performance and maximum uptime.

ABOUT THE AUTHOR: Terry L. Rodgers, CPE and Associate Partner at Syska Hennessy Group, has over 25 years of experience in critical facilities operations and management. Terry earned a BSME from Virginia Tech in 1981. Currently at Syska Hennessy, Terry works in the Critical Facilities and National Commissioning groups and has been an active member of ASHRAE Technical Committee 9.9 since 2004. He is a member of Syska's Critical Facilities Technical Leadership Committee and chair of the Syska Green Critical Facilities Committee. Terry has co-authored various books, whitepapers and presentations on critical facilities.

