IT leaders constantly dream up ways to meet data center requirements for performance and efficiency, but time and money always seem to quash grand plans.
Not every IT infrastructure project needs to be a time-consuming, capital-intensive, paradigm-shifting corporate initiative. Quick and easy updates significantly benefit data center facilities and IT performance, and act as a training ground for new employees.
1. Upgrade server hardware
Strategic memory and local disk upgrades give servers quick and easy performance or capacity boosts.
Memory is a limiting resource in virtualization, and servers rarely come with a full complement onboard. Inventory unused slots and add memory to assist existing VMs or accommodate future server consolidation.
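One quick way to inventory DIMM slots on a Linux host is to parse `dmidecode -t memory` output. The sketch below runs against a captured sample (the here-doc) so it works anywhere; on a live server you would pipe the real command, run as root, in its place.

```shell
# Count populated vs. empty DIMM slots. The here-doc stands in for real
# `sudo dmidecode -t memory` output so the sketch runs without root access.
sample=$(cat <<'EOF'
	Size: 16 GB
	Size: 16 GB
	Size: No Module Installed
	Size: No Module Installed
EOF
)
populated=$(printf '%s\n' "$sample" | grep -c 'Size: [0-9]')
empty=$(printf '%s\n' "$sample" | grep -c 'No Module Installed')
echo "Populated DIMM slots: $populated, empty slots: $empty"
```

Empty slots found this way are candidates for the incremental memory additions described above.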
Solid-state drives (SSDs) offer a local disk storage upgrade for strategic servers. SSDs improve I/O throughput and lower latency, which makes them ideal for workloads that are sensitive to storage bandwidth or rely on disk caching. Rather than rip and replace all the disk drives, add an SSD to a server's local storage to clear bottlenecks and stop errors.
Server firmware upgrades are fast and free, but also disruptive. Perform them only to fix specific problems, such as adding hardware or operating system support. Check your asset management inventory for a list of the current server models and firmware versions, then check the server vendors' download sites for updates. Use the details or release notes to ascertain whether an update actually solves a problem for you. Peripheral interface and adapter devices also have firmware that may need updates.
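Comparing installed firmware against a vendor's download site starts with each server's model and BIOS version. On Linux, `dmidecode` exposes both (root is required); the sketch below falls back to "unknown" where the tool is unavailable, so it degrades gracefully.

```shell
# Record model and BIOS version for the firmware inventory. Falls back to
# "unknown" when dmidecode is missing or cannot run (e.g., without root).
model=unknown
bios=unknown
if command -v dmidecode >/dev/null 2>&1; then
  model=$(dmidecode -s system-product-name 2>/dev/null || echo unknown)
  bios=$(dmidecode -s bios-version 2>/dev/null || echo unknown)
fi
echo "model=$model bios=$bios"
```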
Memory and disk upgrades involve downtime (unless the components are hot-pluggable) and can pose re-racking issues. "RAM upgrades are cheap and effective, but ... it's not exactly an 'in place' upgrade," said Pete Sclafani, COO and co-founder of 6connect, a network automation solutions provider in San Francisco. Perform memory and SSD upgrades during scheduled server downtime.
Disk capacity is expensive, and you can forestall major capacity additions by removing unnecessary content or migrating data to lower storage tiers. For example, temporary directories flood with unneeded data, so clear out /tmp and C:\temp directories on servers and storage subsystems.
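A `find` one-liner handles that cleanup; the path and seven-day age threshold below are illustrative, and the sketch creates and backdates its own scratch file so it is safe to trial anywhere before pointing it at a real temp directory.

```shell
# Delete files older than 7 days from a temp directory. The demo seeds its own
# scratch file; drop -delete first to preview what would go on a real system.
TARGET=${TARGET:-/tmp/tmp-cleanup-demo}
mkdir -p "$TARGET"
touch "$TARGET/stale.log"
touch -d '8 days ago' "$TARGET/stale.log"   # backdate so it qualifies as old
find "$TARGET" -type f -mtime +7 -delete
```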
Try a zero byte reclaim for thin storage deployments. "Write zeros to all allocated but unused space," said Tim Noble, director of IT operations at ReachIPS, a cloud platform provider in Anaheim Hills, Calif. Reclaiming the server's allocated but never-used storage frees up space on the array.
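The mechanics are simple: fill free space with a file of zeros, sync, then delete it so the array's thin-provisioning engine can reclaim the zeroed blocks. The sketch below writes a deliberately tiny file to a scratch path; on a real volume you would let dd run until the filesystem fills.

```shell
# Zero-fill-and-delete sketch for thin-provisioned storage. The size and path
# are demo values; in production, write zeros until the volume is nearly full.
ZEROFILE=${ZEROFILE:-/tmp/reclaim-demo/zerofill.bin}
mkdir -p "$(dirname "$ZEROFILE")"
dd if=/dev/zero of="$ZEROFILE" bs=1M count=4 2>/dev/null
sync
rm -f "$ZEROFILE"   # the freed blocks are now all zeros and reclaimable
```

On filesystems and arrays that support TRIM/UNMAP, running `fstrim` against the mount point accomplishes the same reclaim without writing zeros.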
2. Redo cables
As network bandwidth reaches 10 Gigabit Ethernet (GigE), 25 GigE and faster, aging Category (Cat) 5 and 5e copper cabling infrastructure for 1 GigE is unable to cope with the new data center requirements.
In some cases, the right hardware is in place for higher bandwidth networks, but the cabling is not. "People tend to forget that when the physical network gear is upgraded, your cabling may not be taking full advantage," Sclafani said.
Don't rip out aging cabling all at once; Ethernet cabling is fully backward-compatible. Make relatively small, incremental investments in faster cables as time and money allow. Servers will remain on 10 GigE for the foreseeable future, so focus on network backbones, especially Ethernet-based iSCSI and Fibre Channel over Ethernet storage arrays. For example, Cat 6 cables can support 10 GigE to 55 meters while Cat 6a and Cat 7 cables can handle 10 GigE to 100 meters, without requiring new network adapters, switches or other components.
Long distances -- and bandwidths of 40 GigE and beyond -- need expensive optical fiber media and specialized skills to splice and integrate, which call for a formal capital upgrade project.
Differentiate the new cables from older twisted-pair lines with colored jackets or another labeling scheme, and duplicate the markings or labels clearly on patch panels.
3. Add sensors
If you can't measure it, you can't manage it. Data center infrastructure management (DCIM) tools monitor the electrical and environmental behaviors of complex facilities.
DCIM requires a proliferation of sensors placed strategically around the data center. These tools may trigger automated responses to situational events, such as migrating workloads when a server becomes too hot, or sounding an alert when moisture suggests a cooling loop leak. Missing or inadequate sensors can leave input gaps.
What are you missing?
- Temperature sensors locate hot spots within racks and rows.
- Humidity sensors warn of excessively dry air or damaging condensation levels.
- Moisture (liquid) sensors are essential when chilled water circulates in heat exchangers or rack doors.
- Power monitors track energy use in real time.
- Air flow sensors ensure that fans are running and filters are unclogged.
- Motion detectors spot unauthorized intruders and trigger security alerts and cameras.
- Smoke/fire sensors protect valuable assets and lives.
- RFID tags help automate hardware inventory control.
"Data center monitoring tends to be the last addition to the budget and the first to get axed when project timelines go sideways," Sclafani said. "Your sensors and instrumentation probably have room for improvement."
New sensors are quick, non-invasive installs and can be added in small increments to keep cost and time commitments minimal.
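The automated responses DCIM tools trigger ultimately reduce to threshold checks on sensor readings. A minimal sketch, with a hard-coded sample reading and limit standing in for a real SNMP or IPMI poll:

```shell
# Flag a hot spot when a temperature reading crosses a limit. The reading and
# limit are sample values; a real deployment polls the sensor via SNMP or IPMI.
LIMIT_C=27
reading_c=31
if [ "$reading_c" -gt "$LIMIT_C" ]; then
  status="ALERT: rack sensor reads ${reading_c}C, limit ${LIMIT_C}C"
else
  status="OK: ${reading_c}C"
fi
echo "$status"
```

In practice the alert branch would feed the DCIM tool's action engine -- migrating workloads, paging staff or both.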
4. Boost data security
OS and application security updates might seem obvious, but these low-level tasks get postponed by day-to-day firefighting and complex data center projects.
Check system inventory reports and patch each server with the latest available security updates, Noble said. "This will be easier if you have automation tools like Puppet," he added. "But even a large number of servers can be patched pretty quickly if there is a concerted effort."
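Even without Puppet, a plain loop over a host list covers a lot of ground. The sketch below assumes ssh key access to apt-based hosts; the hostnames are placeholders, and the loop only echoes the commands it would run, so it is safe to execute as-is.

```shell
# Dry-run patch loop: print the update command for each host in the list.
# Replace the echo with the real ssh invocation to patch live systems.
hosts="web01 web02 db01"    # placeholder inventory
for h in $hosts; do
  echo "would run: ssh $h \"sudo apt-get update && sudo apt-get -y upgrade\""
done
```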
Hypervisor updates, such as moving to VMware vSphere 6, are rarer and might be delayed by testing. Check the hardware and software inventory of your virtualized servers to verify that they support the new requirements, and finish lab testing so the new features can move to production. "You might also simply update the VMware Tools on all of your hosts to [your current ESXi version]," Noble said.
Look for other security enhancements: Check and fix file permissions, scour Active Directory user accounts for old or inaccurate entries and so on. These activities pose little risk to operating services.
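World-writable files are a common, easy win in a permissions sweep. The sketch below audits a scratch directory it seeds itself, so it runs anywhere; point AUDIT_DIR at a real tree for actual use.

```shell
# Find world-writable files under a directory. The demo seeds its own files:
# the 666 file should be flagged, the 644 file should not.
AUDIT_DIR=${AUDIT_DIR:-/tmp/perm-audit-demo}
mkdir -p "$AUDIT_DIR"
touch "$AUDIT_DIR/safe.txt" "$AUDIT_DIR/risky.txt"
chmod 644 "$AUDIT_DIR/safe.txt"
chmod 666 "$AUDIT_DIR/risky.txt"
findings=$(find "$AUDIT_DIR" -type f -perm -002)
printf '%s\n' "$findings"
```

Each hit is a candidate for a `chmod o-w`, after confirming nothing legitimately depends on the loose permission.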
5. Check and improve processes
Modern data centers are process-driven -- policies and procedures reduce errors and ensure consistent results regardless of who performs the work. As more IT departments move beyond script-based automation (such as PowerShell) to embrace sophisticated workflow automation tools, it's easy to forget the actual steps and why they're there. Roles and priorities change, opening strategic opportunities to review, streamline and optimize workflows.
"Find an operational task, map it out and see how you can make it more efficient," Sclafani said. "You get extra points if you also ask your internal or external customers for input on processes [to] optimize."
Perform a fire drill to verify that existing infrastructure works as expected. This is particularly important with disaster recovery (DR) and resilient systems such as server clusters. Test server failover in active/passive clusters or simulate the loss of a server in active/active configurations.
"If you have a DR site, a weekend of maintenance in one data center might be a good time to test operations in your alternate data center," Noble said. Unacceptable service disruptions indicate additional remediation work to meet the data center's requirements before real trouble strikes.
About the author:
Stephen J. Bigelow is a senior technology editor at TechTarget, covering data center and virtualization technologies. He acquired many CompTIA certifications in his more than two decades writing about the IT industry.
Don't waste your work. Always take the time to label and update documentation to reflect changes. Verify that any automated inventory or change management tools see the hardware, firmware, operating system, hypervisor or other changes.
On any server upgrade, benchmark the system before and after, and measure the performance difference to gauge effectiveness. Monitoring tools like Nagios and SolarWinds measure system and workload performance for pre- and post-optimization tracking.
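A before/after comparison needs only a repeatable workload and a timer. The checksum pipeline below is a stand-in workload; substitute the I/O or application test that actually exercises the upgraded component.

```shell
# Time a repeatable workload; run once before the upgrade and once after,
# then compare. The dd/cksum pipeline is a placeholder workload.
workload() {
  dd if=/dev/zero bs=1M count=8 2>/dev/null | cksum >/dev/null
}
start_ns=$(date +%s%N)
workload
end_ns=$(date +%s%N)
elapsed_ms=$(( (end_ns - start_ns) / 1000000 ))
echo "workload completed in ${elapsed_ms} ms"
```

Record the figure with the change ticket so the pre- and post-upgrade numbers stay attached to the hardware history.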