I found out something interesting today. The monitoring system we use (Nagios) is not monitoring everything that it is supposed to on any of my servers. When my networking/Linux team set the system up, they asked for a list of the servers that needed to be monitored, along with their IP addresses. I provided the list and off they went to add my 130+ systems. I asked if they needed a list of drives, etc., to monitor, and they said no, they should be able to scan the systems and add all available drives. I would need to tweak things a little, in case there were temporary USB drives attached, but that was all.
So off we went along our merry way.
Yesterday, I found out the not-so-fun way that apparently they didn't have the ability to scan the systems for all drive letters to monitor, and were only monitoring the C drives. I found this out because a user at one of my remote offices said he couldn't save something to their network drive. I logged on and the 1.5TB drive had 15MB free. Yes, 15MB free!!!!
I proceeded to get screamed at: why aren't we monitoring these systems, this is a critical system, what would have happened if it crashed and we lost all the data, etc., etc., etc. I was still under the impression that it was being monitored, but when I went to look, only the C drive was being monitored, and none of the other 4 drives were.
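For what it's worth, here is a minimal sketch of the kind of check that would have caught this: list every drive on the box instead of assuming C is the only one, then flag anything that is nearly full. It assumes Python is available on the server. It is not a Nagios plugin and not how my team set things up; the function names and the 10% threshold are made up for illustration.

```python
import os
import shutil
import string


def list_drive_roots():
    """Return existing drive roots on Windows (e.g. 'C:\\'), or ['/'] elsewhere."""
    if os.name == "nt":
        return [f"{letter}:\\" for letter in string.ascii_uppercase
                if os.path.exists(f"{letter}:\\")]
    return ["/"]


def collect_usage(roots):
    """Map each readable root to (total_bytes, free_bytes)."""
    usage = {}
    for root in roots:
        try:
            du = shutil.disk_usage(root)
        except OSError:
            continue  # skip drives that cannot be read, e.g. an empty card reader
        usage[root] = (du.total, du.free)
    return usage


def low_space(usage, min_free_fraction=0.10):
    """Return the roots whose free space is below the given fraction (hypothetical threshold)."""
    return sorted(root for root, (total, free) in usage.items()
                  if total > 0 and free / total < min_free_fraction)


if __name__ == "__main__":
    for root in low_space(collect_usage(list_drive_roots())):
        print(f"WARNING: {root} is low on free space")
```

Fed the numbers from my remote office (a 1.5TB drive with 15MB free), `low_space` would flag that drive no matter which letter it lives on.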
So my question: if a monitoring system is not monitoring what it is supposed to, is it still working?
ARGH!!!!!!!!!!!!!!