One of my first jobs in IT was for a smallish company of around 200 employees. My office was the server room, and we had two racks of servers and networking equipment and one rack of battery backups. One day Hurricane Isabel came through the area, which should give you a time reference.

We prepared days in advance, knowing Isabel’s path was heading right for us. Our generators were full and ready to go, and we had extra manpower standing by in case anything horrible happened. I spent the majority of the night walking the property looking for something to do. It was, to say the least, pretty uneventful.

After a few hours, once Isabel had pretty much passed, I decided to call it a night and stopped by my office before heading home. What did I find when I reached my office? About 3 to 4 inches of water on the floor. Now don’t forget, my office was the server room, and we did not have a raised floor.

Needless to say, I spent the next 4 hours, with three shop vacs, trying to get all of the water out of the data center. Luckily we didn’t have any equipment at the bottom of the racks, so nothing other than a couple of desktop computers was destroyed.

So why am I telling you this story? If I had had monitoring in place to detect water, I would have been alerted the second that floor became wet, not hours later when real damage could have occurred.

What can monitoring do for us?

I know that for the longest time, when I thought of monitoring from an IT perspective, I thought about monitoring whether a server was up or not. But monitoring is so much more than that! Monitoring can provide us with some incredibly valuable information, not only about our servers but about our environment as well.

With proper monitoring in place, we can feel confident that if there is a problem we will be notified. At my current job, we have monitoring in place on all facets of our data center. Anything deemed urgent will alert me right away, whether during or after working hours. Anything not deemed urgent will go straight to my email to be taken care of the next business day.
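The urgent versus non-urgent split described above can be sketched in a few lines. This is only an illustration, not how any particular monitoring product works: the `Severity`, `Alert`, and `route` names and the "page" and "email" channel labels are made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    URGENT = "urgent"
    ROUTINE = "routine"


@dataclass
class Alert:
    source: str        # hypothetical sensor or server name
    message: str
    severity: Severity


def route(alert: Alert) -> str:
    """Pick a delivery channel for an alert.

    Urgent alerts page the admin at any hour; everything else goes to
    email so it can be handled the next business day.
    """
    if alert.severity is Severity.URGENT:
        return "page"
    return "email"


if __name__ == "__main__":
    print(route(Alert("water-sensor-1", "Water detected on floor", Severity.URGENT)))
    print(route(Alert("file-server", "Disk at 75 percent", Severity.ROUTINE)))
```

In a real setup the returned label would be handed to whatever paging or mail system you already use.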

Monitoring in all of its forms can give us as Systems Administrators peace of mind. It allows us to relax on our days off or after business hours, knowing that should something happen, our monitoring will let us know.

I spoke about this briefly in another post, but monitoring helps us Systems Administrators be a little more proactive about fixing problems. With the correct monitoring in place, we can correct problems before any users discover them. Because I always keep redundant Hyper-V clusters, if a Hyper-V server went down, none of my users would notice because everything would fail over to the other one. But I would know about it because of my alerting. I can then take care of that problem and get the server back up with ZERO downtime or interruption to my users.

Different Types of Monitoring

There are so many different types of monitoring that it could be a post all in itself. But I have already mentioned a very critical one that all data centers should have: environmental monitoring. Most Systems Administrators think of that as a temperature sensor. It is that, but so much more, because you can also tie in water sensors, decibel sensors, smoke sensors, rate-of-rise heat sensors, etc.

A rate-of-rise heat sensor is a temperature sensor that will alert when the temperature rises or falls a certain number of degrees in a specified period.
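To make that definition concrete, here is a minimal sketch of the rate-of-rise idea in software. It assumes readings arrive as a timestamp in seconds plus a temperature; the `RateOfRiseDetector` class and its parameter names are invented for this example and do not come from any sensor vendor.

```python
from collections import deque


class RateOfRiseDetector:
    """Alert when temperature changes too much within a time window."""

    def __init__(self, max_change_degrees: float, window_seconds: float):
        self.max_change = max_change_degrees
        self.window = window_seconds
        self.readings = deque()  # (timestamp, temperature), oldest first

    def add_reading(self, timestamp: float, temperature: float) -> bool:
        """Record a reading and return True if an alert should fire."""
        self.readings.append((timestamp, temperature))
        # Forget readings that are older than the window.
        while timestamp - self.readings[0][0] > self.window:
            self.readings.popleft()
        temps = [temp for _, temp in self.readings]
        # The spread between the highest and lowest reading in the window
        # covers both a rise and a fall.
        return max(temps) - min(temps) >= self.max_change


if __name__ == "__main__":
    detector = RateOfRiseDetector(max_change_degrees=5.0, window_seconds=600)
    for ts, temp in [(0, 70.0), (300, 72.0), (500, 76.0)]:
        print(ts, temp, detector.add_reading(ts, temp))
```

A slow drift of the same size spread over a longer time than the window would not trigger, which is the point of this kind of sensor.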

The next type of monitoring is server monitoring, and this is where things can get pretty deep. There are many third-party vendors out there, like SolarWinds and Microsoft, that offer monitoring software for servers. Both vendors’ products are incredibly customizable and allow you to alert on pretty much anything. We can monitor items such as:

  • Server Uptime
  • Network performance
  • Memory Usage
  • Processor Usage
  • Disk Usage
  • Application availability

I could honestly go on forever, and those are just the standard monitors for servers that you can set up. Software like SolarWinds SAM has an entire module just for monitoring Exchange or IIS on your Windows Servers. It can also monitor Linux servers if you happen to have any in your environment.

[Image: a SolarWinds SAM dashboard]
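For a feel of what one of the standard monitors from the list (disk usage) boils down to, here is a rough sketch using only the Python standard library. The 80 and 90 percent thresholds and the `classify` and `check_disk` names are assumptions for illustration; products like the ones mentioned above let you tune all of this through their own interfaces.

```python
import shutil


def classify(percent_used: float, warn_at: float = 80.0, crit_at: float = 90.0) -> str:
    """Map a usage percentage to a simple status label."""
    if percent_used >= crit_at:
        return "critical"
    if percent_used >= warn_at:
        return "warning"
    return "ok"


def check_disk(path: str = ".") -> str:
    """Check how full the disk holding `path` is."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    percent_used = 100.0 * usage.used / usage.total
    return classify(percent_used)


if __name__ == "__main__":
    print("Disk status:", check_disk())
```

Memory and processor checks follow the same pattern: take a measurement, compare it to thresholds, and only alert on the levels you would act on.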

Can there be too much monitoring?

Is there such a thing as too much monitoring? Personally, I do not think so, as long as the amount of monitoring you are doing is not affecting performance on your servers. I have certainly seen servers in some environments that had 15 different agents on them for different things, and each one had a monitoring component to it. Luckily for those environments, it did not affect performance.

But, and there is a big but here, even though you have all of this monitoring, be very specific about what actually alerts you or someone else. Make sure those alerts are actionable. Otherwise, you and others will start to get alert fatigue, where you become so used to ignoring certain alerts that you miss a real one.

If you find yourself constantly ignoring the same alert (and I am guilty of this myself), you may want to ask yourself: do we really need this alert, and do I really need to be on it? I get hundreds of emails a day. Many of those are alerts, and many of those alerts are probably ones I don’t need to be on. If you find yourself in this situation, either remove yourself from the alert or find the owner of the alert and ask them to remove you.
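One way to find candidates for that cleanup is to tally which alerts fire often but almost never lead to any action. The sketch below assumes you can export your alert history as (alert name, was it acted on) pairs; the `noisy_alerts` function and its cutoffs are hypothetical and only meant to show the idea.

```python
from collections import Counter


def noisy_alerts(history, min_count=10, max_action_rate=0.1):
    """Return alert names that fire often but are rarely acted on.

    `history` is an iterable of (alert_name, was_acted_on) pairs.
    """
    totals = Counter()
    acted = Counter()
    for name, was_acted_on in history:
        totals[name] += 1
        if was_acted_on:
            acted[name] += 1
    return sorted(
        name
        for name, count in totals.items()
        if count >= min_count and acted[name] / count <= max_action_rate
    )


if __name__ == "__main__":
    history = (
        [("disk-above-70-percent", False)] * 20
        + [("water-detected", True)] * 2
        + [("cpu-high", True)] * 12
    )
    print(noisy_alerts(history))
```

Anything this turns up is a prompt for a conversation with the alert's owner, not an automatic reason to delete it.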

Conclusion

I hope I have shown you how important monitoring is, not only for your servers but for their environment as well. Keeping a close eye on your data center with the available tools will help keep you one step ahead of problems as they come up.

I would love to hear any thoughts you have on this article, so please leave a comment or follow me on Twitter @MikeWalton1984.