

Anticipate data center challenges with an IT monitoring strategy

Data center challenges come in all shapes and sizes, but by keeping close tabs on systems and data, admins can be ready to react.

As a data center administrator, it's easy to get into a rut of addressing problems as they arise without looking at the bigger picture. However, data center challenges -- ranging from operational hurdles to macro-economic issues -- can arise from all angles. Instead of waiting for inevitable problems to occur, admins should actively collect data, analyze trends and be ready to react.

Here's a look at three common data center challenges that administrators face along with the IT monitoring practices they can use to anticipate and address them.

Operational issues

If your cycle time for bug fixes runs 12 months or more, that's a good sign that legacy systems -- such as COBOL applications -- and the operational practices causing those long cycle times need to be replaced. Data center admins could consider replacing those legacy systems with a software as a service offering, or rewriting them in SQL and C. If you replace systems without a willingness to also change business processes, you'll spend a fortune -- and a good percentage of the software will run badly. A rapidly growing list of change requests from particular departments is a warning sign that some staffers are resistant to changing business processes.


Operationally, the most important thing for admins is to collect trend data on what's happening in the data center. If a job takes twice as long to run as it used to, it's imperative to know why. Monitor operations in storage, networking and servers, and use the results to pinpoint bottlenecks and failures. A good IT monitoring system will cost money and staff time, but brute-force approaches gather extraneous data, which is likely to cause information overload rather than reveal root causes.
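As a rough sketch of that kind of trend check, the snippet below flags jobs whose recent average runtime has roughly doubled against their historical baseline. The job names, window size and threshold here are hypothetical examples, not part of any particular monitoring product.

```python
from statistics import mean

def flag_slow_jobs(history, recent_window=5, factor=2.0):
    """Flag jobs whose recent average runtime is `factor` times the baseline.

    `history` maps a job name to a chronological list of runtimes in
    seconds; the baseline is the average of all runs before the window.
    """
    flagged = {}
    for job, runtimes in history.items():
        if len(runtimes) <= recent_window:
            continue  # not enough data to establish a baseline
        baseline = mean(runtimes[:-recent_window])
        recent = mean(runtimes[-recent_window:])
        if baseline > 0 and recent >= factor * baseline:
            flagged[job] = recent / baseline  # slowdown ratio
    return flagged

# Hypothetical data: a nightly batch job that used to take ~60s now takes ~130s.
history = {
    "nightly-etl": [60, 62, 58, 61, 59, 125, 130, 128, 131, 127],
    "report-gen": [30, 31, 29, 30, 32, 30, 31, 29, 30, 31],
}
print(flag_slow_jobs(history))  # only "nightly-etl" is flagged
```

A real deployment would pull runtimes from a job scheduler's logs instead of a hardcoded dictionary, but the principle is the same: the trend, not any single run, is what tells you something changed.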

Upgrading storage

Buying more drives when the storage farm fills up seems like a simple solution, but it's important to choose the correct type of storage, such as fast solid-state drives (SSDs), slow Serial Advanced Technology Attachment (SATA) bulk drives or networked storage. You'll need usage monitoring for each tier, covering capacity usage and IOPS trends. Since the current best practice is to push cold data off the primary storage tier, a great way to justify the purchase of more drives is to have computer-generated trend data. If you have a lot of gear, trend analysis will also help migrate drives to where they are needed most.
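To illustrate what computer-generated trend data for a storage tier might look like, here is a minimal sketch that fits a straight line to capacity-usage samples and projects when the tier will fill. The sample figures are invented for the example; the least-squares slope is computed by hand to keep the sketch dependency-free.

```python
def days_until_full(samples, capacity_gb):
    """Linearly project when a storage tier fills, from (day, used_gb) samples.

    Returns the estimated number of days from the last sample until the
    tier reaches capacity, or None if usage is flat or shrinking.
    """
    n = len(samples)
    xs = [d for d, _ in samples]
    ys = [u for _, u in samples]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    denom = sum((x - x_mean) ** 2 for x in xs)
    # Least-squares slope in GB per day.
    slope = sum((x - x_mean) * (y - y_mean) for x, y in samples) / denom
    if slope <= 0:
        return None  # no growth trend to project
    return (capacity_gb - ys[-1]) / slope

# Hypothetical weekly samples: a 10 TB tier growing about 50 GB per day.
samples = [(0, 7000), (7, 7350), (14, 7700), (21, 8050)]
print(days_until_full(samples, capacity_gb=10000))  # about 39 days of headroom
```

A projection like this, generated automatically from monitoring data, is far more persuasive in a purchase request than "the array feels full."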

Storage in the enterprise will become more complex. It has evolved from a simple model of primary and secondary hard disk drive (HDD) storage to one based on SSDs and bulk SATA HDDs. The next two years will see nonvolatile DIMM (NVDIMM) storage, 3D XPoint nonvolatile memory express (NVMe) SSDs, high-capacity SATA SSDs and more network and clustering options, such as virtual storage area networks (SANs), hyper-converged systems and Remote Direct Memory Access connections. Automated IT monitoring will be the only way to optimize operational tuning when these technologies hit the mainstream.

Advancing networks

Networking poses a number of data center challenges, too. Templates and policies to control virtual local area network setup and teardown will become more popular, as will delegating their use to departmental users of cloud services. However, these users aren't incentivized to tune networks, and they may leave loose ends when a new cloud service is deployed. An automated tool that spots bottlenecks is a useful way to improve the user experience.

It's also crucial to look at trends around latency and the percentage of carrying capacity on a link. This can reveal a need to restructure some workloads and then demonstrate that the changes worked. As cloud and cluster orchestration technologies advance, intelligent load balancing approaches will become more prominent, where resource-heavy instances are intermixed with light loads.
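A bare-bones version of the link-utilization check described above might look like the following. The link names, capacities and an 80% alert threshold are assumptions for the example, not recommendations.

```python
def overloaded_links(link_stats, threshold_pct=80.0):
    """Return links whose average utilization exceeds a threshold percentage.

    `link_stats` maps a link name to (capacity_mbps, [throughput samples in Mbps]).
    """
    hot = {}
    for link, (capacity, samples) in link_stats.items():
        avg = sum(samples) / len(samples)
        util_pct = 100.0 * avg / capacity  # percentage of carrying capacity
        if util_pct >= threshold_pct:
            hot[link] = round(util_pct, 1)
    return hot

# Hypothetical links: a 10 Gbps uplink running hot, a 1 Gbps rack link idling.
links = {
    "core-uplink": (10000, [8600, 8900, 8700]),
    "rack-42": (1000, [220, 310, 250]),
}
print(overloaded_links(links))  # {'core-uplink': 87.3}
```

Running the same check before and after restructuring a workload gives you exactly the before-and-after evidence the article mentions: the numbers either drop below the threshold or they don't.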

Servers add a few more components to monitor, including dynamic RAM (DRAM) and CPU utilization. We'll soon have cloud orchestration tools that identify hot spots, followed by automated balancing. This is still an evolving area, and organizations need to use historical results to guide decisions on a per-app basis.

Part of any good IT monitoring package is the ability to receive alerts when something crosses a threshold. Look for a software package that drills down into issues rapidly. There are tools -- such as eG Innovations Enterprise 6.1 and PrinterLogic's Printer Installer -- that can turn a report of a slow job at a terminal into a flag on a stalled app process in seconds, rather than requiring manual drill-down through a pictorial system tree.
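At its core, threshold alerting is just a comparison of current readings against agreed limits. The sketch below shows that core loop; the metric names and limits are hypothetical and would be tuned per environment.

```python
THRESHOLDS = {  # hypothetical limits; tune these for your environment
    "cpu_pct": 90.0,
    "dram_pct": 85.0,
    "disk_latency_ms": 20.0,
}

def check_thresholds(metrics, thresholds=THRESHOLDS):
    """Compare current metric readings against alert thresholds.

    Returns a list of (metric, value, limit) tuples for breached metrics.
    """
    return [(name, value, thresholds[name])
            for name, value in metrics.items()
            if name in thresholds and value > thresholds[name]]

# Hypothetical reading from one server: CPU and disk latency are both over limit.
reading = {"cpu_pct": 97.5, "dram_pct": 60.0, "disk_latency_ms": 35.0}
for metric, value, limit in check_thresholds(reading):
    print(f"ALERT: {metric}={value} exceeds threshold {limit}")
```

Commercial tools wrap this loop in dashboards, escalation rules and the rapid drill-down the article describes, but evaluating any of them starts with asking how easily you can define and adjust these thresholds.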

Overall, intelligent use of IT monitoring software and trend analysis will make IT much more responsive to data center challenges and tamp down real crises.

Next Steps

How strong are your data collection and analysis skills?

Get to know your data center monitoring system

Let DCIM help in data center management

Experts weigh in on 2016's data center challenges 

