Today’s data center must stay robust, efficient and – most of all – properly utilized. Idle resources within an environment waste money. But heavily used data centers with improper resource configurations create dangerous scenarios where a hardware failure could spell trouble for other physical hosts. The challenge for IT administrators is to manage and to utilize computing resources that span the entire environment, often including physical, virtual and cloud resources. In this tip, we’ll discuss resource planning and issue mitigation as means of optimizing resource use. We’ll also address how to quell issues before they become serious problems.
Best practices for resource planning
Almost all of today’s data center environments have or will have some form of virtualization
deployed. This demands additional considerations when deploying a virtualized physical platform. We
now have multiple virtual machines (VMs) relying on a single hardware platform for computing
resources such as CPU, memory, and I/O for network – and sometimes disk – traffic. When it comes to
resource management and spotting problems, proactive planning can help with resource utilization
issues.
Resource load balancing
Whether working in a virtual or physical environment, it’s crucial to know what resources are being
used and to which VMs those resources are being allocated.
Figure 1 — Citrix XenServer 6.0 Enterprise Hypervisor is configured as a single host with local storage only. For small environments this works well; however, more hosts are required as businesses change.
As you see in Figure 1, a single XenServer host delivers several highly utilized workloads to the end user. Although the solution will work, it leaves little room to adjust for usage spikes or for the potential need for additional virtual servers. In the example above, engineers would have to remove virtual resources (RAM in this case) from other virtual machines to allow for growth on a single physical box. Physical resources must be made available to all VMs as required. Leaving room for emergencies or expansion is a must in any environment. This is where load balancing and VM management come into play.
Both virtual and physical servers must have the right type of resources assigned to them. When heavily used workloads are deployed, resources must be planned and delivered to each VM without affecting other virtual or physical workloads. Using the same example scenario, we can introduce a secondary physical host in Figure 2 with similar hardware specifications and begin to load-balance both the virtual and physical resource elements.
Figure 2 — Citrix XenServer 6.0 Enterprise Hypervisor illustrates multiple hosts configured within a pool. This pool shares resources because VMs are capable of moving live between the hosts across a storage area network (SAN) backbone.
In the new scenario, we have two physical hosts joined to a pool where resources are shared between VMs. Since every environment is unique, the process of load-balancing physical resources will require individual planning. In this example, the two physical hosts have extra resources available as required by the existing virtual machines. Extra CPU, RAM and storage requirements have been set for this specific environment so that VM agility can be guaranteed.
Also, many environments will want to load-balance for high availability (HA). Using XenServer 6.0 as the sample hypervisor, built-in tools will help with this process. By going into the Pooled Server HA feature, administrators can see which machines can safely fail over to the other host. From there, engineers can determine how many resources are required to handle the load per physical box. The most important point to remember is that these two servers are load balanced, with resources available for new VM creation, failover and workflow automation. In terms of HA, if one of the above physical servers fails, the other will be able to handle the critical virtual machines of the failed host.
When resources are balanced between machines, VMs have the ability to move as needed between physical hosts without affecting the current resource state. Consider disaster recovery (DR) as one possible example. If a physical host fails in this type of load-balanced scenario, VMs will migrate to the next available host where resources can be found. If either of these machines were completely utilized, DR and failover would be impossible because the other server simply would not have the resources available to support the influx of additional VMs.
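The failover reasoning above can be checked with simple arithmetic. The following is a minimal Python sketch, not a XenServer feature: the `Host` class, the host names and the RAM figures are all hypothetical, and it looks only at RAM. It asks whether the surviving hosts in a pool have enough free memory to take every VM from a failed host, placing the largest VMs first.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Host:
    """Hypothetical host record: total RAM and the RAM assigned to each VM."""
    name: str
    ram_gb: int
    vm_ram_gb: List[int] = field(default_factory=list)

    @property
    def free_gb(self) -> int:
        return self.ram_gb - sum(self.vm_ram_gb)


def can_survive_failure(hosts: List[Host], failed: str) -> bool:
    """Return True if the surviving hosts can absorb every VM of the failed host.

    Uses a first-fit-decreasing heuristic: the largest displaced VMs are
    placed first on the first surviving host with enough free RAM.
    """
    displaced = next(h for h in hosts if h.name == failed).vm_ram_gb
    free = [h.free_gb for h in hosts if h.name != failed]
    for vm in sorted(displaced, reverse=True):
        for i, available in enumerate(free):
            if vm <= available:
                free[i] -= vm
                break
        else:
            # No surviving host has room for this VM.
            return False
    return True


# Illustrative numbers only: two 64 GB hosts with spare capacity.
balanced = [Host("host-a", 64, [16, 16]), Host("host-b", 64, [16, 8])]
# Two 64 GB hosts that are almost fully allocated.
saturated = [Host("host-a", 64, [32, 24]), Host("host-b", 64, [32, 24])]

print(can_survive_failure(balanced, "host-a"))
print(can_survive_failure(saturated, "host-a"))
```

Because the placement is a heuristic, a negative result means only that this simple pass found no placement. A real plan would also account for CPU, storage and which VMs are truly critical.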
Workflow automation
Many environments using both virtual and physical servers may require an element of workflow
automation. For example, if one particular server is heavily utilized, new VMs will be spun up to
help cover the existing server. One example is the Citrix Workflow Studio, a member of the Citrix
Delivery Center product family. This is an IT process automation application that enables
administrators to create, schedule, run and manage workflows. The workflows tie technology
components together, mechanize repetitive configuration processes and coordinate condition-based
triggers for administrative tasks. Built on the Microsoft .NET Framework, Windows Workflow
Foundation and Windows PowerShell, Workflow Studio allows engineers to dynamically create new
virtual resources to respond to capacity needs, on-premises or off-premises. In this scenario, it’s
important to have the proper resources aligned to the spare physical machines so new virtual
machines will have RAM and CPUs available to them.
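A condition-based trigger of this kind can be sketched in a few lines of Python. This is an illustration of the decision logic only and does not use the Workflow Studio API. The function names, the 85 percent CPU threshold, the VM size and the inventory format are all assumptions made for the example.

```python
from typing import Dict, Optional

# Hypothetical inventory shape: host name -> free resources on that host.
Inventory = Dict[str, Dict[str, int]]


def pick_host(spare_hosts: Inventory, vm_ram_gb: int, vm_vcpus: int) -> Optional[str]:
    """Return the first spare host with enough free RAM and vCPUs, or None."""
    for name, free in spare_hosts.items():
        if free["ram_gb"] >= vm_ram_gb and free["vcpus"] >= vm_vcpus:
            return name
    return None


def plan_scale_out(
    cpu_percent: float,
    spare_hosts: Inventory,
    cpu_threshold: float = 85.0,  # assumed trigger level
    vm_ram_gb: int = 8,           # assumed size of the new VM
    vm_vcpus: int = 2,
) -> str:
    """Decide what the workflow should do for one busy server."""
    if cpu_percent < cpu_threshold:
        return "no action"
    host = pick_host(spare_hosts, vm_ram_gb, vm_vcpus)
    if host is None:
        # The trigger fired, but no spare machine has resources aligned to it.
        return "alert: no spare capacity"
    return f"provision VM on {host}"


spare = {
    "host-b": {"ram_gb": 4, "vcpus": 8},
    "host-c": {"ram_gb": 32, "vcpus": 8},
}
print(plan_scale_out(92.0, spare))
```

The "no spare capacity" branch is the point of the paragraph above: automation helps only when the spare machines actually have RAM and CPUs to give.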
There will always be a need for new VMs to be created in an environment, and it will be up to administrators to apply the correct amount of resources to each new VM. Over- and under-allocating resources can waste both time and money. This is why having a strategic plan for VM management around existing resources is important. By knowing what is currently available within the data center, engineers can deliver workloads more efficiently.
This means that administrators will need to keep a keen eye on their physical and virtual environments and know how many users or machines can live on each host safely and efficiently. For example, consider a virtual desktop infrastructure. As users log in, they will begin to consume resources on the machine monitored in Figure 3.
Figure 3 — Citrix XenServer 6.0 Enterprise Hypervisor shows an isolated XenServer host utilized only for virtual desktop infrastructure. VMs are stored either locally or can be stored on a backbone SAN.
Currently, the machine shown in Figure 3 is not being heavily utilized. However, with an influx of users, utilization can increase quickly and create problems for poorly balanced environments. Use this data to size virtual workloads accordingly. For example, setting a cap on resource use will allow a safe number of workloads to be launched on a physical box within a well-managed data center.
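One way to arrive at such a cap is to work backward from the host's resources. The Python sketch below is a rough sizing aid under stated assumptions, not a vendor formula: the per-desktop RAM and vCPU figures, the RAM reserved for the hypervisor, the vCPU overcommit ratio and the 25 percent headroom are all hypothetical values that each environment would replace with its own measurements.

```python
def max_desktops(
    host_ram_gb: float,
    host_cores: int,
    ram_per_desktop_gb: float,
    vcpus_per_desktop: int,
    headroom: float = 0.25,          # assumed share kept free for spikes
    hypervisor_ram_gb: float = 4.0,  # assumed RAM reserved for the hypervisor
    vcpu_overcommit: float = 4.0,    # assumed vCPUs scheduled per physical core
) -> int:
    """Return the number of desktops that fit within both RAM and CPU limits."""
    usable_ram = (host_ram_gb - hypervisor_ram_gb) * (1 - headroom)
    usable_vcpus = host_cores * vcpu_overcommit * (1 - headroom)
    by_ram = usable_ram // ram_per_desktop_gb
    by_cpu = usable_vcpus // vcpus_per_desktop
    # The tighter of the two limits sets the cap.
    return int(min(by_ram, by_cpu))


# Illustrative host: 128 GB RAM, 16 cores, 2 GB and 1 vCPU per desktop.
print(max_desktops(128, 16, ram_per_desktop_gb=2, vcpus_per_desktop=1))  # prints 46
```

In this example RAM is the limiting resource, which is why the result is 46 rather than the 48 that CPU alone would allow.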
Working with resource alerts and alarms
Creating alerts and notifications within a data center will help to maintain a healthy
environment and improve VM management. Catching an issue before it becomes noticeable to users or
jeopardizes a service-level agreement (SLA) will help keep both physical and virtual machines
within a data center running longer and more efficiently. From a resource perspective, leading
hypervisors offer alerts and notifications that can be set up, as shown in Figure 4.
Figure 4 — Citrix XenServer 6.0 Enterprise Hypervisor provides alerts that can be configured per VM and per physical host. In this case, alerts are set up for a Windows Server 2008 R2 Enterprise Licensing Server.
With alert monitoring, engineers can set up CPU, network and disk alarms to warn of encroaching trouble, allowing technicians to mitigate resource issues before they affect the end user. Setting up resource alerts is crucial in the planning and deployment process. Many environments leave this for the last step, only to run into resource-based issues quickly within their data center.
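The logic behind a typical resource alarm is simple enough to sketch. The Python example below is a generic illustration rather than XenServer's alert implementation. It raises an alarm only when a metric stays above a threshold for several consecutive samples, which avoids paging someone over a momentary spike. The function name, the 90 percent threshold and the sample values are assumptions.

```python
from typing import Iterable


def sustained_breach(samples: Iterable[float], threshold: float, min_consecutive: int) -> bool:
    """Return True if `min_consecutive` samples in a row exceed `threshold`."""
    run = 0
    for value in samples:
        # Extend the current run on a breach, otherwise reset it.
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical CPU readings (percent), one per polling interval.
cpu_steady_high = [50, 92, 95, 60, 91]
cpu_brief_spikes = [50, 92, 60, 95, 70]

print(sustained_breach(cpu_steady_high, threshold=90, min_consecutive=2))   # True
print(sustained_breach(cpu_brief_spikes, threshold=90, min_consecutive=2))  # False
```

The same check applies to network and disk metrics; only the threshold and the number of consecutive samples need tuning per workload.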
Using existing and third-party resource monitoring tools
Administrators often need to check on resources that directly affect a specific physical or
virtual server. In these cases, there are great third-party granular tools capable of reporting on
special database servers, cloud-based machines and other heavily utilized workloads. One such tool,
up.time, by uptime software Inc., helps
administrators monitor servers, virtual machines, the cloud, a colocation and more. Using up.time’s
graphical server monitoring software, an administrator can graph and analyze all critical server
resources running inside the data center independent of the operating system being used.
In-depth, granular monitoring of resources such as CPU, memory, disk, processes, workload, network,
user, service status and configuration data can help engineers properly allocate and plan out their
data center resources.
Another solid network monitoring tool comes from SolarWinds. The tool is called Orion Network Performance Monitor (NPM) and provides granular network traffic and performance monitoring. To assist engineers with their daily tasks, NPM tracks up/down status and analyzes real-time, in-depth network performance statistics for routers, switches, wireless access points, servers and any other SNMP-enabled device. For large data center environments, NPM allows engineers to quickly view the status of core IT services and the data center through refined alerting that dynamically groups related systems and devices.
In addition, resource issues can usually be diagnosed and answered using onboard tools. For example, Resource Monitor, offered by the Windows OS platform, graphs resource utilization on a machine and shows an administrator how those resources are used.
Figure 5 — Resource Monitor is showing memory utilization on a Windows Server 2008 R2 Enterprise Exchange Server.
In the scenario of Figure 5, this server is having RAM issues with store.exe. Engineers should be aware that Exchange can be RAM intensive, so seeing this type of utilization is common. However, with this information, engineers can either add more resources to this box or offload some of its workload to other machines.
Natively, Resource Monitor has several tabs that help engineers probe their machines and see where resources are being used. Another example is network throughput. In Figure 6, we see a normally operating server. However, if there were a network spike, we would be able to see the source and then decide how best to mitigate the issue. To get a granular look, engineers can dive into the environment and create their own monitors to see where the data center is lacking or how well it’s performing.
Figure 6 — Resource Monitor is showing network traffic and utilization on a Windows Server 2008 R2 Enterprise Exchange Server.
Data center storage resource considerations
Storage resources can be very limited and expensive. Improper storage utilization can lead to
performance problems and very costly resolutions. It’s always important to monitor how storage is
being used in both virtual and physical environments. Intelligent storage tools can help ease
workload pains by consolidating data and delivering it efficiently. A major vulnerability of a data
center SAN environment is usage spikes, such as when a large number of users access the system at
the same time.
In these situations, disks become heavily utilized and performance can slow to a near halt. To combat this, SAN manufacturers look to solid-state technologies and intelligent deduplication-aware caching mechanisms to reduce performance bottlenecks.
Figure 7 — Graph shows how Flash Cache affects a NetApp 3000 series controller and its disk aggregate. This capture shows a total of 80 minutes of activity – with the first 20 being without caching.
In Figure 7, we see a heavily utilized workload being accessed by a number of users. The device in this example is a NetApp controller. You can see the difference in disk performance without cache and with onboard caching enabled. These types of data center efficiency solutions help keep an environment running longer and smoother. In this case, engineers will not have to buy a larger aggregate of disks for resource distribution. Rather, they are able to use their existing storage to more efficiently deliver large workloads to the end user.
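The effect of a read cache on the disks behind it can be approximated with simple arithmetic: every read served from cache is a read the disk aggregate does not have to perform. The Python sketch below shows that relationship with hypothetical numbers. It is not taken from the NetApp capture in Figure 7, and it ignores writes and cache warm-up.

```python
def disk_read_iops(total_read_iops: float, cache_hit_ratio: float) -> float:
    """Return the read operations per second that still reach the disks."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    # Only cache misses are passed through to the disk aggregate.
    return total_read_iops * (1.0 - cache_hit_ratio)


# Hypothetical workload: 8,000 read IOPS requested by users.
for hit_ratio in (0.0, 0.5, 0.75):
    print(hit_ratio, disk_read_iops(8000, hit_ratio))
```

With these illustrative figures, a 75 percent hit ratio leaves the disks serving a quarter of the original read load, which is the kind of relief that lets an existing aggregate carry a heavier workload.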
Planning and attention pay dividends
Always remember that computing resources are finite. It can be extremely expensive to add
resources when reacting to an unexpected event or shortage. This means administrators must keep a
proactive eye on the entire data center environment and catch resource utilization problems before
they begin to affect the workload or the end user.
About the author: Bill Kleyman, MBA, MISM, is an avid technologist with experience in network infrastructure management. His engineering work includes large virtualization deployments as well as business network design and implementation. Currently, he is the Virtualization Architect at MTM Technologies Inc. He previously worked as Director of Technology at World Wide Fittings Inc.
This was first published in February 2012