There's a good deal of spelunking going on these days at the Oregon State University Open Source Labs, but there's nary a rocky underground cave in sight.
The lab is using San Francisco-based Splunk Inc.'s troubleshooting application to manage systems that support thousands of open source developers worldwide. So far, OSU's Open Source Labs are spelunking with Splunk on open source projects from the Mozilla Foundation, the Apache Software Foundation and OpenOffice.org.
The gains were apparent almost immediately after Splunk was deployed, said Corey Shields, infrastructure manager at the OSU labs.
While using Splunk to search across his infrastructure, Shields said he noticed a cron job (a task that the Unix cron daemon runs automatically on a schedule) that was running every minute. It had been set up by a developer who had left six months earlier. "Splunk showed its worth almost immediately by helping to find problems I didn't even notice the symptoms of," he said.
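A once-a-minute job like the one Shields describes can also be spotted by checking the schedule fields of each crontab entry. The following is only a minimal sketch of that idea, not how Splunk or the lab did it; the function names and the sample crontab text are invented for illustration, and the check covers only the plain `*` and `*/1` forms of the minute field.

```python
from typing import List


def runs_every_minute(crontab_line: str) -> bool:
    """Return True if a crontab entry is scheduled to run every minute.

    A standard entry has five schedule fields (minute, hour, day of month,
    month, day of week) followed by the command.
    """
    stripped = crontab_line.strip()
    # Skip blank lines and comments.
    if not stripped or stripped.startswith("#"):
        return False
    fields = stripped.split()
    # Lines with fewer than six fields (for example MAILTO=root) are not jobs.
    if len(fields) < 6:
        return False
    minute, rest = fields[0], fields[1:5]
    return minute in ("*", "*/1") and all(f == "*" for f in rest)


def find_every_minute_jobs(crontab_text: str) -> List[str]:
    """Return the crontab entries that fire once a minute."""
    return [
        line.strip()
        for line in crontab_text.splitlines()
        if runs_every_minute(line)
    ]


if __name__ == "__main__":
    # Hypothetical crontab contents, made up for this example.
    example = (
        "# nightly backup\n"
        "0 3 * * * /usr/local/bin/backup.sh\n"
        "* * * * * /home/olddev/bin/poll.sh\n"
    )
    for job in find_every_minute_jobs(example):
        print(job)
```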
Finds such as the cron job have allowed the lab to nearly triple in size while maintaining the same staffing levels. That matters because 70 of the lab's 130 servers are fully managed by the OSL staff, a group of Oregon State University interns that changes with each passing semester.
"Having this application running definitely cut our troubleshooting time down. We were doing three gigabytes of log file data a day, and that gets difficult with the staff size we have today," Shields said.
The task became even more difficult when something went wrong at one of the lab's three mail relays, such as a lost or misdirected email. In a case like that, Shields and his staff would have had to go through three different boxes to track down the problem.
With Splunk spanning multiple hosts, however, Shields and his team can search for a user's email address to find out where an email ended up, or whether it was ever sent at all.
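The idea of one search covering every relay can be illustrated with a short script. This is a rough sketch under stated assumptions and does not use or represent Splunk's own interface: the host names, log lines and the `search_relays` function are all hypothetical, and the logs are assumed to have already been collected in one place.

```python
from typing import Dict, Iterable, List, Tuple


def search_relays(
    logs_by_host: Dict[str, Iterable[str]], address: str
) -> List[Tuple[str, str]]:
    """Return (host, log line) pairs that mention the given email address.

    The match is a case-insensitive substring test, which is enough to show
    which relay handled a message.
    """
    needle = address.lower()
    hits = []
    for host, lines in logs_by_host.items():
        for line in lines:
            if needle in line.lower():
                hits.append((host, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    # Made-up log lines for three hypothetical mail relays.
    logs = {
        "relay1": ["Jan 10 09:00:01 to=<dev@example.org> status=sent\n"],
        "relay2": ["Jan 10 09:00:05 to=<other@example.org> status=deferred\n"],
        "relay3": [],
    }
    for host, line in search_relays(logs, "dev@example.org"):
        print(host, line)
```

An empty result would suggest the message never reached any of the relays.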
Splunk also eliminated the need to rely solely on the grep and awk commands, Unix tools for searching and processing text line by line that administrators commonly use to comb through log files.
"Grep and awk give you a 'fire and hope you aimed right' interface to find what you're looking for. Splunk's interface gives you a way to 'aim that bullet' once it leaves the chamber," Shields said. The preciseness of the tool has saved anywhere from hours to days of time that can now be better used maintaining the servers that host important open source projects.
Shields said he is looking forward to getting more value from Splunk's reporting features in the future, especially in the area of metrics.
"Metrics are a big thing to management and to the higher ups," Shields said. "In previous jobs when a boss would ask me for metrics on the drop of a dime, it was not always easy. But with something like Splunk -- with the tons of data that's stored there and with it all indexed, you could potentially come up with a massive amount of metrics fairly easily."