Network switches are critical elements of any network infrastructure. They must check each packet arriving at every port, determine the appropriate port for the intended recipient, and then send every packet on its way to the correct destination – and all of this happens in real time at roughly line speed.
Traditional switch designs are more than adequate to support current network bandwidth, but bandwidth isn’t the only consideration for today’s workloads. Organizations are running more network-based applications and putting more traffic on the network. In addition, increasing reliance on time-sensitive applications has placed a new focus on switching latency. In this tip, we take a look at high-speed, low-latency network switches, help you decide if they’re right for your data center and offer some advice for planning and deploying those devices.
Getting a handle on high-speed network switches
Although most IT professionals understand the concepts of network switching and the effects of switching latency, the question of latency has not been a significant problem for typical networks and workloads – a switch has rarely been the gating item for network performance. However, network environments are getting more congested, and mainstream switch designs are being challenged by the changing composition of network traffic.
For example, IT professionals see smaller packets in network-centric applications like Voice over Internet Protocol (VoIP). The packets may be smaller, but there are significantly more of them, and the overhead associated with switching each packet stays the same. This means more work for the switch and increased potential for switch saturation and performance problems.
“When you start blending [VoIP] with your data, you end up with just a tremendous amount of packets,” said Todd Erickson, president of Technology Navigator in Cary, N.C. “The VoIP issue absolutely brought it to the surface, especially in high-density employee phone areas.”
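To put rough numbers on the small-packet effect, the sketch below estimates the maximum number of frames per second that one direction of an Ethernet link can carry at a given frame size. It assumes the standard 20 bytes of per-frame wire overhead (preamble plus interframe gap). The frame sizes and the function name are illustrative choices, not figures from the experts quoted here.

```python
# Per-frame overhead on the wire that is not part of the frame itself:
# 8 bytes of preamble/start-of-frame delimiter plus a 12-byte interframe gap.
WIRE_OVERHEAD_BYTES = 20


def max_frames_per_second(link_bps: int, frame_bytes: int) -> float:
    """Upper bound on frames per second for one direction of an Ethernet link."""
    bits_per_frame = (frame_bytes + WIRE_OVERHEAD_BYTES) * 8
    return link_bps / bits_per_frame


if __name__ == "__main__":
    link_bps = 1_000_000_000  # 1 GbE
    # Illustrative frame sizes: minimum, mid-range and maximum standard frames.
    for frame_bytes in (64, 512, 1518):
        rate = max_frames_per_second(link_bps, frame_bytes)
        print(f"{frame_bytes:>5}-byte frames: {rate:,.0f} frames/s")
```

On a 1 GbE link, minimum-size frames work out to roughly 18 times as many forwarding decisions per second as full-size 1,518-byte frames, even though the bandwidth consumed is the same.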
Effects of storage and virtualization on the network
Storage traffic can also flood a traditional network with additional data that a switch must accommodate. The use of iSCSI can burden a network at 1 Gigabit Ethernet (GbE) speeds, and emerging storage network technologies, such as Fibre Channel over Ethernet (FCoE), can potentially challenge 10 GbE infrastructures. This was never a problem with conventional Fibre Channel, because that technology is dedicated to storage traffic. It's the addition of storage traffic on Ethernet LANs that can burden a switch. Applications that access page files (aka “swap files”) across the network can produce significant traffic that might drive a switch backplane into saturation.
Some organizations might choose to boost network application response times by intentionally reducing the packet size. While this can enhance some time-sensitive applications, such as point-of-sale and other real-time products, a network that's already busy may see exactly the opposite effect on performance because of the correspondingly higher number of packets. Experts caution that trading packet size for traffic volume only helps when network utilization is relatively low.
Virtualization is another technology that adds traffic to the LAN and stresses conventional network switches. For example, a single server with multiple workloads may generate a tremendous volume of network traffic. When multiple virtualized servers are connected to the same switch, backplane saturation can result unless adequate traffic analysis and architectural planning are done in advance. Activities within the virtualization cluster, such as virtual local area network (VLAN) tagging, can also generate significant additional activity.
“If you're serving VHD files across the network, you can very quickly max out a network because you have such large files,” said Chris Steffen, principal technical architect at Kroll Factual Data, based in Loveland, Colo.
Even desktop virtualization deployments can exacerbate traffic congestion at the switch ports. “When you get into 10 GbE unified [FCoE] adapters and you're going to put 100 or 150 users on a single connection, that's where we're seeing [high-speed switch use],” Erickson said.
Heeding the red flags
Symptoms of switching stress can be hard to spot unless you're looking for them. One of the most common symptoms is high CPU utilization on the switch. In many cases, this symptom translates into dropped packets and performance problems with the applications that generate or depend on time-sensitive network traffic, such as VoIP, digital voice and video conferencing systems. If the condition persists for even a few minutes, users will notice problems, applications may produce errors, VoIP phones may reset and other behaviors may generate alerts or tickets for action.
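One way to watch for this symptom is to look for sustained, rather than momentary, high CPU readings. The sketch below is a minimal illustration only: it assumes CPU utilization samples have already been collected from the switch at regular intervals by whatever monitoring tool is in use, and the 85% threshold, the three-sample window and the function name are hypothetical values chosen for the example.

```python
from typing import Sequence


def sustained_high_cpu(samples: Sequence[float], threshold: float, min_run: int) -> bool:
    """Return True if at least `min_run` consecutive samples exceed `threshold`."""
    run = 0
    for value in samples:
        # Extend the current run on a high reading, otherwise start over.
        run = run + 1 if value > threshold else 0
        if run >= min_run:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical readings, one per minute, in percent.
    readings = [35, 42, 91, 93, 95, 60, 40]
    print(sustained_high_cpu(readings, threshold=85.0, min_run=3))
```

Requiring several consecutive high samples avoids raising an alert on a single brief spike, which matches the observation that problems become visible to users once the condition persists for a few minutes.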
Given the significant cost of high-speed, low-latency network switches, it might be possible to avoid that investment by redesigning the existing network infrastructure to address traffic problems. For example, it may be possible to ease switch stress by segmenting the switches, watching CPU utilization on the switches and adjusting the traffic distribution to minimize the path for the most demanding traffic sources.
“In one facility, we've taken the voice-specific areas with conference bridge and video, and we put that onto a separate standalone switch purposely so that it would not interfere with our other core network applications,” Erickson said. He also noted that a VLAN isolation strategy would have worked, but that would have added CPU overhead.
Selecting and deploying high-speed network switches
Justifying a high-speed, low-latency switch is perhaps the most challenging aspect of the technology: these switches are costly and won't necessarily provide the same benefit in every network situation. For example, many instances of switch congestion and performance problems can be resolved by adding more everyday switch ports and making network changes that better distribute the traffic. However, be aware that additional switches need to be monitored and managed, and that can make the prospect of “just adding more ports” less attractive. Experts agree that high-speed, low-latency network switches really shine when the network must support a variety of time-sensitive applications. This is especially true when working with 10 GbE infrastructures that are intended to handle demanding workloads, such as voice, video and high-end storage, like FCoE, on the same network.
When selecting a high-speed switch, it's important to consider attributes and features beyond switching speed. For example, look for FCoE compatibility in the switch. Erickson noted that FCoE has not yet been fully standardized, so using network switches that do not share the same approach may result in performance degradation or outright incompatibility.
Also, pay attention to 10 GbE switch implementations. Although 10 GbE is an established standard, some vendors may tweak the standard a bit and make their implementation proprietary, requiring special network adapters and/or cables to run 10 GbE properly. Clarify the need for proprietary gear early in the switch selection process.
Consider the effect of overcommitted backplanes on overall switch performance in the field. As one simple example, 20 full duplex Ethernet ports running at 100 Mbps would require 4 Gbps of bandwidth on the switch's backplane. If the backplane only has 2 Gbps of bandwidth, you would only be able to run at 50% utilization before the switch starts dropping frames. Vendors often get away with overcommitting their switches because networks rarely run at 100% utilization. As a result, switch performance “looks good” under light traffic loads, but backplane overcommitment should weigh heavily on any high-performance switch acquisition.
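The backplane arithmetic in this example can be sketched in a few lines. This is a simplified model that assumes every port sends and receives at full rate and that all traffic crosses the backplane. The function names are illustrative, not taken from any vendor tool.

```python
def required_backplane_bps(ports: int, port_speed_bps: int, full_duplex: bool = True) -> int:
    """Backplane bandwidth needed for every port to run at line rate."""
    # A full duplex port can send and receive at the same time,
    # so it counts twice against the backplane.
    directions = 2 if full_duplex else 1
    return ports * port_speed_bps * directions


def max_utilization_without_loss(backplane_bps: int, required_bps: int) -> float:
    """Fraction of full port load the backplane can carry, capped at 1.0."""
    return min(1.0, backplane_bps / required_bps)


if __name__ == "__main__":
    required = required_backplane_bps(ports=20, port_speed_bps=100_000_000)
    ceiling = max_utilization_without_loss(2_000_000_000, required)
    print(f"Required backplane: {required / 1e9:.1f} Gbps")
    print(f"Utilization ceiling on a 2 Gbps backplane: {ceiling:.0%}")
```

Running the same calculation against a vendor's published backplane figure and port count gives a quick first check on how heavily a candidate switch is overcommitted.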
Finally, evaluate the scalability of the high-speed switch and determine how you can add more ports in the future. This may simply be a matter of daisy chaining one switch to another, adding blade switches or other proprietary modules, or other approaches. Also, consider the availability of configuration and management tools. It's important that the new switch integrates with existing tool sets and provides the metrics that allow IT staff to monitor and make good decisions for the new switch throughout its operating lifetime.
Testing advice for network switches
Most technology deployments are preceded by a period of testing and evaluation. The goal is first to ensure that the new product, system or tool can actually deliver the expected results in a controlled environment before rolling it out to production. A testing period also allows IT staff to develop familiarity and expertise – a level of comfort with operating and managing the technology. The lab is an appropriate place to test features such as failover or fault tolerance, not the production data center where service-level agreements may be in place.
High-speed, low-latency network switches also require due diligence, though experts note that the evaluation period is far shorter than for other data center products. It rarely takes more than several weeks to learn management tools, verify feature compatibility and optimize the switch configuration. The main issue to consider when testing is how the switch performs under load, particularly in scenarios that involve load balancing. It's not just a matter of ensuring the switch's electrical integrity, but of actually verifying that the switch will perform the way its manufacturer advertises.
“There should be very little in the way of proof of concept,” Steffen said. “I would want to know how the monitoring tools work, and frankly, I would expect a demo of how the monitoring tools integrate within the System Center environment that I have now.”