Resilient data centers withstand damaged network cabling, power outages, unexpected server downtime, attacks, surges in user demand and other events by avoiding single points of failure and using adaptable IT systems.
In terms of physical infrastructure, data center resiliency increases by installing multiples. If one power distribution unit (PDU) fails, another is at the ready to take over its workload. If one carrier goes down, the business can run all its workloads over a second carrier's connections. N+1 redundancy means that there's one extra piece of equipment beyond the N pieces needed to carry the load. N+M means M extra pieces, where M could be two, three or any number. 2N designates an infrastructure that's completely duplicated for protection.
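The arithmetic behind these labels can be sketched in a few lines of Python. This is only an illustration: the function names `installed_units` and `tolerated_failures` and the `spares` parameter are hypothetical, and the model assumes every unit is identical and that N units are exactly enough to carry the load.

```python
def installed_units(required, scheme, spares=1):
    """Return how many units are installed when `required` (N) units carry the load.

    `scheme` is one of "N", "N+1", "N+M" or "2N"; `spares` is the M in N+M.
    All names here are illustrative assumptions, not an industry API.
    """
    if required < 1:
        raise ValueError("required must be at least 1")
    if scheme == "N":
        return required              # no redundancy at all
    if scheme == "N+1":
        return required + 1          # one extra unit
    if scheme == "N+M":
        if spares < 1:
            raise ValueError("spares must be at least 1")
        return required + spares     # M extra units
    if scheme == "2N":
        return 2 * required          # fully duplicated infrastructure
    raise ValueError("unknown scheme: " + scheme)


def tolerated_failures(required, scheme, spares=1):
    """Return how many units can fail while N units remain in service."""
    return installed_units(required, scheme, spares) - required


# Example: a room that needs four PDUs to carry its load.
for scheme in ("N", "N+1", "N+M", "2N"):
    total = installed_units(4, scheme, spares=2)
    print(scheme, total, tolerated_failures(4, scheme, spares=2))
```

In this simplified model, a room that needs four PDUs installs five under N+1 and eight under 2N, so it can lose one unit or four units respectively before the load is affected.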
The definition of data center resiliency has evolved as technologies such as virtualization and microservices joined traditional uptime boosters like N+1 redundant power supplies and equipment. Be prepared to discuss distributed applications and server clusters in a job interview. Distributed applications run on multiple servers simultaneously, and may be architected as independently deployed microservices. Server clusters prevent outages with high-availability or fault-tolerant virtualization, allowing VMs to jump from one server to another in the event of a problem.
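As a rough illustration of VMs jumping from one server to another, the toy Python model below reassigns the VMs of a failed host to the surviving hosts. It is a sketch under stated assumptions, not how any particular hypervisor works: the function `fail_over`, the host and VM names, and the single per-host `capacity` limit are all hypothetical, and real clusters also weigh CPU, memory, storage and affinity rules.

```python
def fail_over(cluster, capacity, failed):
    """Return a new host -> VM list mapping with the failed host's VMs moved.

    `cluster` maps host names to lists of VM names; `capacity` is the
    maximum number of VMs any one host may run (a simplifying assumption).
    """
    # Copy the surviving hosts so the caller's data is left untouched.
    survivors = {host: list(vms) for host, vms in cluster.items() if host != failed}
    if not survivors:
        raise RuntimeError("no surviving hosts")

    for vm in cluster.get(failed, []):
        # Pick the surviving host that currently runs the fewest VMs.
        target = min(survivors, key=lambda host: len(survivors[host]))
        if len(survivors[target]) >= capacity:
            raise RuntimeError("no spare capacity for " + vm)
        survivors[target].append(vm)
    return survivors


# Hypothetical three-host cluster in which host-a fails.
cluster = {"host-a": ["vm1", "vm2"], "host-b": ["vm3"], "host-c": []}
print(fail_over(cluster, capacity=2, failed="host-a"))
```

The capacity check is where redundancy and clustering meet: failover only works if the surviving hosts were sized with enough headroom to absorb the displaced VMs.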
Some data centers today are designed to fail, meaning that component failures are expected and workloads easily move off problem servers and onto new ones, across a server cluster and even across multiple data centers. This architecture is moving from the domain of Web-scale content providers into enterprise IT.
When you talk about data center resiliency to prospective employers, relate the mechanics -- diverse network lines, fault-tolerant clusters -- to IT service and business continuity. No matter how beaten up your physical systems are, or how many outages are rippling through the IT systems, the data center continues to deliver mission-critical services.