MS Cluster Servers
How do I monitor MS Cluster server applications?
I really don't know. There does not appear to be a good, thorough method, only partial workarounds:
- Can watch to see if the cluster service fails. That would indicate a problem, but the cluster service does not fail when an application fails over, so this check misses failovers.
- Can poll for events in the event log. There aren't any good, specific event IDs to watch for. We could monitor for several events that might indicate some kind of problem, but they would not show that a specific problem occurred. This could work if we identified the pattern of events logged when a node brings a service online.
- Can monitor the existence/free space of the Q: drive on the node that is supposed to be the primary machine in the cluster. When drive Q: goes away on that node, we know the cluster group has failed over. However, in a 2-node active-active configuration this would not tell us when the application running on node 2 failed over to node 1 (if node 1 owns the cluster group).
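The Q: drive check could be sketched roughly as follows. This is only an illustration, not a tested monitoring tool: it assumes the script runs on the node that normally owns the cluster group, that Q: is visible only to the owning node, and that a 30-second polling interval is acceptable. The function names (drive_present, ownership_changes, poll) are made up for this example.

```python
import os
import time
from typing import Callable, Iterable, Iterator, Tuple


def drive_present(root: str = "Q:\\") -> bool:
    """Return True if the drive root is currently visible on this node."""
    return os.path.exists(root)


def ownership_changes(samples: Iterable[bool]) -> Iterator[Tuple[int, str]]:
    """Yield (sample_index, event) each time drive visibility flips.

    event is "lost" when the drive disappears (the cluster group has
    probably moved off this node) and "regained" when it comes back.
    The first sample only sets the baseline and never produces an event.
    """
    previous = None
    for index, present in enumerate(samples):
        if previous is not None and present != previous:
            yield index, "regained" if present else "lost"
        previous = present


def poll(check: Callable[[], bool] = drive_present,
         interval_seconds: float = 30.0) -> Iterator[bool]:
    """Sample the check forever, sleeping between samples."""
    while True:
        yield check()
        time.sleep(interval_seconds)


# Example wiring (runs until interrupted):
#
#     for index, event in ownership_changes(poll()):
#         print(f"sample {index}: drive Q: {event}")
```

Replacing the print with a call into whatever alerting system is in use would give a basic failover alarm. It carries the limitation already noted: it only sees the group that owns Q:, so in an active-active pair it says nothing about an application moving from node 2 to node 1.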
It would be nice if there were performance monitor "counters," SNMP MIBs, WMI, or some other exposure of the status of each resource on a cluster server. This would provide alerting possibilities for "abnormal" situations. I understand there is some API that Cluster Administrator was built on, but writing my own Cluster Administrator with alerting capability is not attractive. Neither is sitting 24 hours a day watching Cluster Administrator to find out when an app fails over.