Skip to content

Latest commit

 

History

History
66 lines (47 loc) · 3.03 KB

monitoring.adoc

File metadata and controls

66 lines (47 loc) · 3.03 KB

Monitoring Decaton

This document guides you how to monitor your Decaton processor applications. Decaton processor collects and provides its metrics through Micrometer.

You can choose any monitoring system Micrometer supports. (See Micrometer docs for the details)

Processor Metrics

Installation

To export Decaton metrics, all you have to do is to register your MeterRegistry to Decaton Metrics at application start-up. If you use PrometheusMeterRegistry, registration can be done as follows.

PrometheusMeterRegistry registry = new PrometheusMeterRegistry(PrometheusConfig.DEFAULT);
Metrics.register(registry);

Metric Definitions

See Metrics.java for the full list of metric definitions.

In this section, we pick up some important metrics. We recommend you to monitor at least these metrics in a production environment.

Name Description

decaton.tasks.processed

The total number of tasks processed. This metric shows the throughput of your application.

decaton.tasks.discarded

The number of tasks discarded. This metric shows how many tasks are dropped during task extraction. Droppings are often caused by invalid messages in a topic.

decaton.tasks.error

The number of tasks thrown exception by process. Note that only synchronous failures during executing DecatonProcessor#process() are reported by this metric.

decaton.tasks.process.duration

The quantile summary of the time of a task taken to be processed. This metric denotes the process latency summary. High latency divergence between 50th and 99th implies that few heavy tasks can block other tasks. In such situation, extending partition concurrency might help to improve the throughput.

decaton.tasks.complete.duration

The quantile summary of the time of a task taken to be completed. This metric is basically same as tasks.process.duration unless you use asynchronous processing completion.

decaton.partition.paused

Indicate whether the partition (identified by partition label) is currently paused by back pressure or not by 1 or 0 value. You can compute sum of all partitions on your monitoring system to tell how many partitions on an instance is currently paused. If your consumer experiences many pauses, it might be indicating that your processor’s throughput is not sufficient compared to tasks inflow.

Kafka Client Metrics

Since Decaton uses standard Kafka Java client (consumer/producer) internally, it’s also a good idea to monitor Kafka client metrics.

Kafka clients provides a way to customize metric-reporting configuration via metric.reporters property. If necessary, you can set metric.reporters property through SubscriptionBuilder#consumerConfig as usual Kafka consumer construction.