Taking the time to investigate the methods available for monitoring network I/O and identifying possible causes of slow networking is well worth the effort.
If an application owner reports slow networking, then it is essential to make sure that the cause is not a bottleneck in the wide-area network. Poor network performance can often be attributed to sources outside of virtualization. There may be an outage or routing problem that has yet to be reported or discovered.
Another area to check is the IP configuration. Simple tools like ping, pathping, tracert and nslookup can still be useful in diagnosing network problems.
One of the most common problems is a poor or incorrect Domain Name System (DNS) configuration. Another place to check is the configuration of the application within the virtual machine (VM). A setting or option that makes the application poll the network for the availability of external components, for example, can generate unnecessary traffic and significantly degrade network performance.
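The DNS check described above can be made measurable by timing a name lookup from inside the VM. The following is a minimal Python sketch, not a replacement for nslookup. The function name time_dns_lookup and its resolver parameter are illustrative assumptions. By default it calls the standard library's socket.getaddrinfo, which uses whatever resolver the guest operating system is configured with.

```python
import socket
import time


def time_dns_lookup(hostname, resolver=socket.getaddrinfo):
    """Resolve hostname and return (elapsed_seconds, addresses).

    addresses is a sorted list of unique IP strings, or an empty list
    if the lookup failed. The resolver argument is a hypothetical hook
    so the timing logic can be exercised without a live DNS server.
    """
    start = time.perf_counter()
    try:
        # getaddrinfo returns tuples whose last item is the socket address;
        # the first element of that address is the IP string.
        results = resolver(hostname, None)
        addresses = sorted({entry[4][0] for entry in results})
    except socket.gaierror:
        # Name could not be resolved: report no addresses, keep the timing.
        addresses = []
    elapsed = time.perf_counter() - start
    return elapsed, addresses
```

Calling time_dns_lookup with a host name the application depends on returns the lookup time in seconds and the addresses found. A lookup that takes seconds rather than milliseconds, or that returns an empty list, suggests the DNS configuration deserves a closer look before the hypervisor does.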
Once you have excluded these as potential problems, it’s worth confirming whether the optimized components have been configured correctly. Next, check whether the network problems affect just the VM in question or all the VMs on the same host. This comparison shows whether the problem is specific to the application owner’s VM or is systemic. Most hypervisor vendors offer network tools that allow you to monitor traffic coming in and out of the VM.
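The single-VM versus host-wide comparison can be sketched in a few lines of Python. Everything here is an assumption for illustration: the function name problem_scope, the idea of one numeric metric per VM (for example a dropped-packet percentage gathered with the vendor's tool), and the threshold, which would have to be chosen for the environment.

```python
def problem_scope(metric_by_vm, reported_vm, threshold):
    """Classify how widely a network symptom is spread across a host.

    metric_by_vm maps a VM name to one measured value (hypothetical,
    e.g. percent of packets dropped). A VM counts as affected when its
    value is above threshold.
    """
    affected = sorted(
        vm for vm, value in metric_by_vm.items() if value > threshold
    )
    if not affected:
        scope = "no VM exceeds the threshold"
    elif affected == [reported_vm]:
        scope = "isolated to the reported VM"
    elif len(affected) == len(metric_by_vm):
        scope = "all VMs on the host"
    else:
        scope = "several VMs on the host"
    return scope, affected
```

If only the reported VM is affected, the guest's own configuration is the likelier suspect. If every VM on the host is affected, the host's networking or the physical network is the better place to look.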
VMware has a utility called esxtop that displays network statistics (Figure 1) and helps troubleshoot network performance problems. Pressing n switches esxtop to its network view, and pressing f lets the administrator add fields to the display.
These utilities allow you to see how much bandwidth is actually being used by the VM and whether the physical system is seeing a significant number of dropped network packets. They also show the transmit and receive rate of the system.
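The figures these tools report (transmit rate, receive rate and dropped packets) are typically derived from cumulative counters sampled at intervals. The Python sketch below shows that arithmetic under stated assumptions. The NicSample layout and the summarize function are hypothetical and do not reflect the output format of esxtop or any other tool.

```python
from dataclasses import dataclass


@dataclass
class NicSample:
    """Cumulative counters read at one moment (hypothetical layout)."""
    timestamp: float       # seconds
    bytes_sent: int
    bytes_received: int
    packets_received: int  # packets delivered, excluding drops (assumed)
    packets_dropped: int   # receive-side drops


def summarize(first: NicSample, second: NicSample) -> dict:
    """Turn two counter samples into rates over the interval between them."""
    interval = second.timestamp - first.timestamp
    if interval <= 0:
        raise ValueError("second sample must be taken after the first")

    sent = second.bytes_sent - first.bytes_sent
    received = second.bytes_received - first.bytes_received
    delivered = second.packets_received - first.packets_received
    dropped = second.packets_dropped - first.packets_dropped
    total = delivered + dropped

    return {
        # bytes -> bits, per second, expressed in megabits
        "tx_mbps": sent * 8 / interval / 1_000_000,
        "rx_mbps": received * 8 / interval / 1_000_000,
        # share of arriving packets that were dropped
        "drop_percent": 100.0 * dropped / total if total else 0.0,
    }
```

Comparing the resulting rates with the capacity of the physical NIC shows how much bandwidth the VM is really using, while a drop percentage that stays above zero across several intervals is the kind of signal the monitoring tools are meant to surface.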
When a machine sends out packets but does not receive an acknowledgement, it can indicate a problem with network interface card (NIC) teaming algorithms, referred to as the reverse NIC team problem. In this scenario, advanced NIC teaming has been enabled, and although packets leave the physical host via one network layer, they arrive back at the host via the wrong physical switch and at the wrong NIC. Serious problems such as these may need wider investigation. In some cases, that investigation leads to abandoning a particular NIC teaming policy that has been deemed unreliable for the wider network.
Plenty can be done to improve and monitor network performance for VMs as your consolidation ratios grow. The key to the best optimization is following your virtualization vendor’s best practices, while modifying them to suit the unique traffic characteristics of your network. The most critical part is to understand the relationships between your VMs and the wider physical world.
Mike Laverick is a professional instructor with 17 years of experience in technologies such as Novell, Windows and Citrix. Involved with the VMware community since 2003, Laverick is a VMware forum moderator and member of the London VMware User Group Steering Committee. He is also the owner and author of the virtualization website and blog RTFM Education, where he publishes free guides and utilities aimed at VMware ESX/Virtual Center users.