Network monitoring and analysis tools have become a necessity. They scan and troubleshoot in real time to pinpoint network glitches, flaws and vulnerabilities. They alert managers and administrators to discrepancies. And they provide an overall view of network performance.
But it seems some networking pros have become annoyed with the almost constant bells and whistles caused by false positives or other non-critical foul-ups. So much so, a recent study revealed, that networking folks are ignoring alarms or setting thresholds so high that some critical alerts may be slipping through the cracks.
A telephone survey of 195 IT professionals, performed by consulting firm Channel Source Direct and commissioned by Reston, Va.-based network analytics vendor Netuitive, found that while most users rely heavily on network performance monitoring and analysis systems, several hobble or deactivate the alerting functions to avoid information overload. One technique network managers use to reduce false alerts is to intentionally set thresholds higher than optimum levels, or simply to turn off alarms.
"This is especially true among respondents who are getting 100 or more alerts per day," the study indicates. "Forty-seven percent of these respondents say they have already set thresholds high to reduce their alert volumes, thus limiting the system's alarm functionality."
In the survey, 30% of respondents said they receive 100 or more alerts per day. Some respondents reported up to 5,000 alerts. In corporations with fewer than 100 servers, 15% said they receive 100 or more alerts per day, while 41% of companies with 100 or more servers said the same.
The research also found that "false alerts, non-critical alerts that do not indicate an immediate or impending service problem, are a common problem." Roughly 30% of respondents said more than half of the alerts they receive are false and 16% said that 70% or more of their alerts are non-critical.
"Unfortunately, this means that the users are getting less benefit than the performance monitoring software is designed to provide, and they risk missing critical early notifications of impending problems," the study continued.
For example, picture a multinational bank that raises the alert threshold to cut down on the typical influx of alarms generated by the Monday morning return-to-work e-mail crunch. The bank has grown accustomed to the alerts from increased use of the e-mail server, so even the alerts that do slip through are ignored.
But one Monday, while hundreds of employees sip coffee and sort through weekend e-mails, an e-mailed worm is released. Because the threshold is turned up, or the alerts are shut off to compensate for the increased use of the e-mail server, no one realizes. The worm proliferates, crippling the network and costing hours in lost productivity and downtime, not to mention headaches for the IT department, which is charged with fixing the damage done, then explaining how the worm got through in the first place and why the thresholds were set so high.
The whole mess could have been avoided had the alert threshold been set at the appropriate level.
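What an "appropriate level" might look like in the bank scenario can be sketched in a few lines of Python. This is only an illustration, not how any vendor named in this article sets thresholds: the function names, the message counts and the three-standard-deviation rule are all assumptions made for the example. The idea is to compare a reading against what is normal for that time slot, so the usual Monday crunch stays quiet while a worm-sized spike still stands out.

```python
from statistics import mean, stdev

def baseline_threshold(history, k=3.0):
    """Threshold for one time slot: mean of past readings plus k standard deviations."""
    return mean(history) + k * stdev(history)

def should_alert(reading, history, k=3.0):
    """Alert only when a reading is well above what is normal for this slot."""
    return reading > baseline_threshold(history, k)

# Hypothetical e-mail messages per minute on past Monday mornings (assumed data).
monday_9am = [950, 1010, 980, 1040, 1000, 990]

static_threshold = 500    # assumed fixed setting: fires every Monday morning
raised_threshold = 5000   # assumed "turned up" setting: quiet, but blind

normal_monday = 1020      # typical Monday crunch
worm_monday = 2400        # hypothetical worm-driven surge

print("static fires on a normal Monday:", normal_monday > static_threshold)
print("raised threshold catches the worm:", worm_monday > raised_threshold)
print("baseline fires on a normal Monday:", should_alert(normal_monday, monday_9am))
print("baseline catches the worm:", should_alert(worm_monday, monday_9am))
```

With these made-up numbers, the fixed threshold cries wolf every Monday, the raised one misses the surge, and the per-slot baseline does neither. Real traffic is messier, so the multiplier `k` would still need tuning.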
Some players in the analysis and monitoring space include Netuitive, which commissioned the study; WildPackets, which offers analysis tools such as OmniPeek; and other vendors like Hewlett-Packard OpenView, IBM Tivoli and BMC Software PATROL. Netuitive's software, the company said, automatically analyzes and correlates a company's performance monitoring data in real-time, requires no manual configuration or specialized programming and offers Trusted Alarms, an indicator of impending service problems.
Scott Haugdahl, CTO of Walnut Creek, Calif.-based WildPackets and a searchnetworking.com expert, agreed with the study that some managers frequently turn off and tune out alert functions, but said not all are guilty of ignoring the din. Instead, he said, many just have to prioritize their needs and determine what alerts they want to receive.
"The out-of-the-box threshold for any monitoring tool is set to the defaults of a typical network," he said. "The problem is, what is a typical network? Managers need to take the time to fine-tune thresholds. We have to educate the users of these tools."
But therein lies a Catch-22, because more than half of the survey respondents said they spend roughly 10 hours per quarter toying with threshold settings to cut down on false alerts. Thirty percent said they spend 25 hours per quarter, which adds up to 100 hours a year, or about 12.5 eight-hour workdays.
"However, even aggressive threshold administration fails to completely eliminate the problem of false alerts," according to the study. "Among respondents who devote 50 or more hours per quarter to thresholding, 43% say half or more of their alerts are false and 20% say more than 70% are false."
One problem with the increasing volume of alerts is that they often arrive faster than a problem can be diagnosed. Half of the respondents said it usually takes less than 15 minutes to diagnose a problem. A mere 19% said it takes less than five minutes. The scope of the problem intensifies in organizations with extremely high numbers of alerts -- 100 or more a day. Among that group, about 25% of respondents said it usually takes less than 10 minutes to diagnose a problem, while 13% reported it takes an average of 30 minutes or more.
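A rough back-of-the-envelope calculation shows why that volume is hard to keep up with. The sketch below is Python, with function names invented for the example, and it rests on simplifying assumptions the survey does not make: that alerts are spread evenly across a 24-hour day, that one person investigates every alert, and that each diagnosis takes the full 15 or five minutes the survey gives as an upper bound.

```python
MINUTES_PER_DAY = 24 * 60

def minutes_between_alerts(alerts_per_day):
    """Average gap between alerts if they were spread evenly over 24 hours."""
    return MINUTES_PER_DAY / alerts_per_day

def daily_triage_hours(alerts_per_day, minutes_per_alert):
    """Hours one person would need to investigate every alert, one at a time."""
    return alerts_per_day * minutes_per_alert / 60

gap = minutes_between_alerts(100)          # 14.4 minutes between alerts
slow_case = daily_triage_hours(100, 15)    # 25.0 hours of diagnosis per day
fast_case = daily_triage_hours(100, 5)     # about 8.3 hours per day

print(f"At 100 alerts a day, one arrives every {gap:.1f} minutes")
print(f"At 15 minutes each: {slow_case:.1f} hours of diagnosis per day")
print(f"At 5 minutes each: {fast_case:.1f} hours of diagnosis per day")
```

Under those assumptions, 100 alerts a day at 15 minutes apiece adds up to more hours than the day contains, and even the five-minute case fills a full shift.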
Still, Haugdahl said, the wasted time and headaches can be averted with fine-tuning and education.
"[Alerts] can't be totally ignored," Haugdahl said. "The issue is not so much setting thresholds too high, it's really turning it on and off and missing [alerts] altogether. You have to know more than red-light, green-light."
Companies need to determine what they consider critical and set up their monitoring and analysis tools so that major alerts won't go unnoticed. But what's critical to one company may not be critical to another.
"There's more than one way to look at an alert," he said, adding corporations need to decide "what's important to you, what's not?"
Overall, Haugdahl said, the benefits of a network analysis or monitoring tool outweigh the pitfalls. However, network managers must determine which alerts they want to receive and prioritize them, instead of turning off the systems' alert function or setting thresholds so high that some critical alerts could sneak by.
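One way to picture that kind of prioritizing is a simple routing rule that sends each alert somewhere according to its severity rather than silencing everything. The Python sketch below is hypothetical: the severity levels, the `Alert` fields and the `route` function are invented for illustration and do not reflect any product mentioned here.

```python
from dataclasses import dataclass
from enum import IntEnum

class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3

@dataclass
class Alert:
    source: str
    message: str
    severity: Severity

def route(alert, page_at=Severity.CRITICAL):
    """Choose a destination for an alert instead of switching alerting off."""
    if alert.severity >= page_at:
        return "page"     # interrupt someone right away
    if alert.severity == Severity.WARNING:
        return "digest"   # batch into a periodic summary
    return "log"          # record for later analysis, notify no one

# Hypothetical alerts; each company would decide its own severities.
alerts = [
    Alert("mail-server", "Monday morning traffic above average", Severity.INFO),
    Alert("core-switch", "Port utilization at 85%", Severity.WARNING),
    Alert("mail-server", "Outbound mail volume far above baseline", Severity.CRITICAL),
]

for alert in alerts:
    print(f"{route(alert):6} {alert.source}: {alert.message}")
```

Nothing is discarded in this arrangement. Low-priority alerts are still recorded, so what counts as "critical" can be revisited later, and `page_at` can be lowered for a company that wants to be interrupted more often.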
"The more you want to optimize your network performance, the more alerts you'll receive," he said. "It would be nice to reduce the alerts, but at what cost? There has to be some level of alerts that come through."