Monitoring

An MCE deployment can be monitored through a dashboard, which is enabled with the following optional configuration keys:

key                                  value
debug.mce.orleansdashboard.enabled   true
debug.mce.orleansdashboard.port      8033

This will enable a web-based monitoring dashboard that provides information on the health of the running MCE cluster.

[Figure: MCE dashboard overview]

The overview page shows the overall health of the cluster, including the error rate, the average response time, and the number of requests processed per second.

You can also monitor the health of each node (silo) in the cluster, as well as how work (activations) is distributed across those nodes.

[Figure: MCE dashboard silo]

Indications of performance issues

A healthy system should exhibit the following:

  • <1% error rate
  • <150ms average response time
  • Good distribution of activations across silos
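The criteria above can be expressed as a simple check against values read from the dashboard. The following Python sketch is illustrative only: the ClusterSnapshot type, the health_problems function, and the MAX_ACTIVATION_SKEW threshold are hypothetical names introduced here and are not part of MCE. In particular, "good distribution" is not quantified in this document, so the skew threshold is an assumption you should tune for your own cluster.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List

MAX_ERROR_RATE = 0.01         # from the list above: <1% error rate
MAX_AVG_RESPONSE_MS = 150.0   # from the list above: <150ms average response time
MAX_ACTIVATION_SKEW = 1.5     # ASSUMPTION: busiest silo holds at most 1.5x the mean


@dataclass
class ClusterSnapshot:
    """Hypothetical container for values copied from the dashboard."""
    error_rate: float                     # fraction of failed requests, 0.0 to 1.0
    avg_response_ms: float                # average response time in milliseconds
    activations_per_silo: Dict[str, int]  # silo name -> activation count


def health_problems(snapshot: ClusterSnapshot) -> List[str]:
    """Return a description of each criterion the snapshot fails."""
    problems = []
    if snapshot.error_rate >= MAX_ERROR_RATE:
        problems.append(f"error rate {snapshot.error_rate:.1%} is not below 1%")
    if snapshot.avg_response_ms >= MAX_AVG_RESPONSE_MS:
        problems.append(
            f"average response time {snapshot.avg_response_ms:.0f}ms is not below 150ms"
        )
    counts = list(snapshot.activations_per_silo.values())
    if counts:
        avg = mean(counts)
        # Compare the busiest silo against the mean activation count.
        if avg > 0 and max(counts) > MAX_ACTIVATION_SKEW * avg:
            problems.append("activations are unevenly distributed across silos")
    return problems
```

An empty list means the snapshot meets all three criteria. Any returned entries name the criteria that need attention.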

Performance issues are generally accompanied by errors, which indicate a problem processing messages in the cluster.

In some cases, performance issues result from an unexpected load imbalance and resolve themselves within a short period of time. However, if this happens consistently, it could indicate an issue with the cluster that should be investigated.
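One way to tell a transient imbalance from a consistent one is to sample the per-silo activation counts at regular intervals and raise a flag only when the most recent samples are all imbalanced. The Python sketch below assumes you record those counts yourself. The function names, the 1.5x skew threshold, and the default of five consecutive samples are hypothetical choices, not MCE settings.

```python
from statistics import mean
from typing import Dict, List


def is_imbalanced(activations: Dict[str, int], max_skew: float = 1.5) -> bool:
    """True if the busiest silo exceeds max_skew times the mean activation count.

    ASSUMPTION: max_skew=1.5 is an illustrative threshold, not an MCE default.
    """
    counts = list(activations.values())
    if not counts:
        return False
    avg = mean(counts)
    return avg > 0 and max(counts) > max_skew * avg


def imbalance_persists(
    samples: List[Dict[str, int]],
    required_consecutive: int = 5,
    max_skew: float = 1.5,
) -> bool:
    """True only if the most recent required_consecutive samples are all imbalanced."""
    if len(samples) < required_consecutive:
        return False  # not enough history to call the imbalance consistent
    recent = samples[-required_consecutive:]
    return all(is_imbalanced(sample, max_skew) for sample in recent)
```

With samples taken once a minute, for example, the defaults would ignore an imbalance that clears within five minutes and flag one that does not.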

If you suspect a performance degradation, consult:

  • The performance counters of each silo (CPU and memory footprint)
  • The logs of each silo for errors