Large IT shops using HP OpenView or BMC Patrol may now have an open source alternative. Nagios is a Linux-based host, service and network monitoring program that is starting to attract attention because of its quick configuration and easy maintenance.
But wouldn't it be tough for IT managers to sell higher-ups on the virtues of an open source monitoring tool? It might be worth the effort, said James Turnbull, author of Pro Nagios 2.0. Turnbull spoke recently with SearchOpenSource.com Assistant Editor MiMi Yeh about how Nagios is different from its counterparts in the commercial world and why IT shops should give it a chance.
What sets Nagios apart from other open source network monitoring tools like Big Brother, OpenNMS, OpenView and SysMon?
James Turnbull: I think there are three key reasons why Nagios is superior to many other products in this area -- ease of use, extensibility and community. Getting a Nagios server up and running generally only takes a few minutes. Nagios is also easily integrated and extended either by being able to receive data from other applications or sending data to reporting engines or other tools. Lastly, Nagios has excellent documentation backed up with a great community of users who are helpful, friendly and knowledgeable. All these factors make Nagios a good choice for enterprise management in small, medium and even large enterprises.
Why shouldn't you run Nagios as the root user?
Turnbull: Running any process as the root user when it doesn't need root privileges is poor security. It's better to create a dedicated user and group to run Nagios, which reduces the chance that a compromise of Nagios would also compromise the root user.
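The idea of starting as root only when necessary and then switching to a dedicated account can be sketched in Python. This is a generic illustration and not Nagios's own code; the account and group names ("nagios") are assumptions and must already exist on the system.

```python
import grp
import os
import pwd


def running_as_root(euid=None):
    """Return True if the given (or current) effective UID is root's (0)."""
    if euid is None:
        euid = os.geteuid()
    return euid == 0


def drop_privileges(user="nagios", group="nagios"):
    """Switch from root to an unprivileged user and group.

    The default names are assumptions for illustration only.
    """
    if not running_as_root():
        return  # already unprivileged, nothing to drop

    uid = pwd.getpwnam(user).pw_uid
    gid = grp.getgrnam(group).gr_gid

    os.setgroups([])  # clear supplementary groups inherited from root
    os.setgid(gid)    # change the group first, while we still have the right to
    os.setuid(uid)    # after this call the process can no longer regain root
```

A daemon written this way would call drop_privileges() once, right after any setup that truly needs root, so that a later compromise is limited to what the unprivileged account can reach.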
Why is it important to have Nagios check the external command file every 30 seconds? Isn't that excessive?
Turnbull: Checking the external command file can take place as frequently as required. The external command file holds commands and check data [often from remote hosts or other tools that you have integrated with Nagios]. So the sooner Nagios retrieves and processes this information, the sooner it is displayed and potentially alerted on in Nagios.
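To make the external command file more concrete, here is a minimal Python sketch of how an integrated tool might submit a passive service check result. The line format follows Nagios's PROCESS_SERVICE_CHECK_RESULT external command; the command file path, host name and service name are assumptions and will differ between installations.

```python
import time

# Assumed location of the external command file; check your own installation.
COMMAND_FILE = "/usr/local/nagios/var/rw/nagios.cmd"


def format_passive_result(host, service, return_code, output, now=None):
    """Build one external command line reporting a passive service check.

    return_code uses the usual plugin convention:
    0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN.
    """
    timestamp = int(time.time() if now is None else now)
    return (
        f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}\n"
    )


def submit(line, command_file=COMMAND_FILE):
    """Write one command line to the command file.

    The write blocks until Nagios has the file open for reading.
    """
    with open(command_file, "w") as fh:
        fh.write(line)
```

For example, submit(format_passive_result("web01", "HTTP", 2, "Connection refused")) would queue a CRITICAL result for a hypothetical service "HTTP" on host "web01". Nagios only acts on it the next time it reads the command file, which is why the check interval matters.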
What tips, best practices and gotchas can you offer to sys admins working with Nagios?
Turnbull: I guess the best recommendation I can give is to read the documentation. The other thing is to ask for help from the community -- don't be afraid to ask what you think are dumb questions on wikis, Web sites, forums or mailing lists. Just remember the golden rule of asking questions on the Internet -- provide all the information you can and carefully explain what you want to know.
Are there workarounds to address the complaint that Nagios has no automated discovery, so that each host and service must be defined individually?
Turnbull: I think a lot of the 'automated' discovery tools are actually more of a hindrance than a help. One of the big flaws of enterprise monitoring is monitoring without context. It's all well and good to go out across the network and detect all your hosts and add them to the monitoring environment, but what do all these devices do?
You need to understand exactly what you are monitoring and why. When something you are monitoring fails, you need to know not only what that device is but also what the implications of that failure are. Nagios is not a business context/business process tool. The fact that you have to think clearly about what you want to monitor, and how, means that you are more aware of your environment and the components that make up that environment.
Is there any advice you would give to users?
Turnbull: The key thing to say to new users is to try it out. All you need is a spare server and a few hours and you can configure and experiment with Nagios. Take a few problem areas you've had with monitoring and see if you can solve them with Nagios. I think you'll be pleasantly surprised.