Saturday, December 01, 2012

Brief list of port monitoring tools

A while back we deployed some special purpose server systems. They are pretty simple in terms of function but do provide some critical infrastructure support.

So when they go off-line for whatever reason (power failure, unplugged network cable, etc.) we need to respond to get them back online.

Proactive monitoring is pretty thin, and currently we have more of a reactive monitoring solution: someone needs one of these systems, finds it is down, and calls us to fix it fast.

Nice.

So one solution developed was a simple batch file that ran a ping against the server IPs. If a ping failed, then a notice would be auto-emailed to selected staff to check it out.

That seems ok, but what happens when the NIC is up and responding to ping, but the core application or OS has actually hung, so the box really isn’t “operationally” on-line even though the NIC is? Kinda gives the impression you don’t have a problem when you really do.

These systems are very simple and we can’t run any additional “client” software on them to “phone home” for service health and availability…something like Paessler's PRTG Network Monitor.

I did identify a few critical network services running on the systems and found that they communicated out on specific ports.

If we could run port-scans against those ports and find them open/listening, then that might provide a more accurate assessment of the servers’ health than the basic ping reply/no-reply feedback.

So here are a few of the tools and utilities I considered in that approach.

If you have any additional utilities or tricks for remotely monitoring server/service availability please drop a tip into the comment jar!

Cheers,

--Claus V.

2 comments:

Anonymous said...

What you need is to set up Nagios (an open source monitoring solution) to monitor the state of your applications, not just the state of the servers they run on. We run Nagios on a spare Linux VM at work and use it to monitor all kinds of infrastructure services (e.g., LDAP, OCSP, DNS, SSH, HTTP, etc.). It's also very flexible in the types of alerts it can generate in response to an event: email, pager, SMS, etc.

http://www.nagios.org/

Claus said...

@ Anonymous - That would be a simple and nice solution. Unfortunately, as I understand it, deployment of that solution would require installation of an "active" client component on the target box.

As I said, we cannot do that in this model, so we are restricted to ONLY passive remote monitoring of the system. I know it isn't logical but that decision is above our heads. :p

So though far from perfect, this is somewhat better than simply relying on a ping routine to see if the box is up or not.

Thanks for the solution you posted; it might be a perfect option for others who have more flexibility.

Cheers!

--Claus V.