Releases: Morningstar/kafka-offset-monitor
v0.4.6 - Happy New Year!
New functionality:
- The topic tab now also displays partition information.
- Implemented Java-based in-memory data storage. Right now the front-end uses it only lightly, but over time it will replace the current implementation. It allowed me to implement some new functionality:
- Report if consumer-group is currently active
- This will eventually allow us to report on inactive consumer-groups
- (Experimental) Report Burrow-like consumer-group status calculation via REST endpoint (/consumergroup), while updating Burrow's rules a bit. The rules I implemented are as follows:
- Evaluate per consumer-group topic-partition:
- Rule 0: If there are no committed offsets, then there is nothing to calculate and the period is OK.
- Rule 1: If the difference between now and the last offset time-stamp is greater than the difference between the last and first offset time-stamps, the consumer has stopped committing offsets for that partition (error).
- Rule 2: If the consumer offset decreases from one interval to the next, the partition is marked as a rewind (error).
- Rule 3: If over the stored period, the lag is ever zero for the partition, the period is OK
- Rule 4: If the consumer offset does not change, and the lag is non-zero, it's an error (partition is stalled)
- Rule 5: If the consumer offsets are moving, but the lag is consistently increasing, it's a warning (consumer is slow)
- Roll-up all consumer-group topic-partitions per consumer-group and report a consumer-group status:
- Set consumer-group status to ERROR if any topic-partition status is STOP
- Set consumer-group status to ERROR if any topic-partition status is REWIND
- Set consumer-group status to ERROR if any topic-partition status is STALL
- Set consumer-group status to WARN if any topic-partition status is WARN
- Set consumer-group status to OK if none of the above rules match
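The per-partition rules and the roll-up above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class, enum, and field names are hypothetical, the rules are assumed to be applied in numeric order with the first match winning, and "consistently increasing" is read as "lag grows between every pair of consecutive samples".

```java
import java.util.List;

final class GroupStatusSketch {
    enum PartitionStatus { OK, WARN, STOP, REWIND, STALL }
    enum GroupStatus { OK, WARN, ERROR }

    /** One stored observation for a topic-partition (hypothetical shape). */
    static final class Sample {
        final long timestampMs;
        final long offset;
        final long lag;
        Sample(long timestampMs, long offset, long lag) {
            this.timestampMs = timestampMs;
            this.offset = offset;
            this.lag = lag;
        }
    }

    /** Applies Rules 0-5 in numeric order to samples sorted oldest first. */
    static PartitionStatus evaluate(List<Sample> window, long nowMs) {
        if (window.isEmpty()) return PartitionStatus.OK;            // Rule 0
        Sample first = window.get(0);
        Sample last = window.get(window.size() - 1);
        if (nowMs - last.timestampMs > last.timestampMs - first.timestampMs) {
            return PartitionStatus.STOP;                             // Rule 1
        }
        boolean offsetMoved = false;
        boolean lagEverZero = first.lag == 0;
        boolean lagAlwaysGrew = window.size() > 1;
        for (int i = 1; i < window.size(); i++) {
            Sample prev = window.get(i - 1);
            Sample cur = window.get(i);
            if (cur.offset < prev.offset) return PartitionStatus.REWIND; // Rule 2
            if (cur.offset != prev.offset) offsetMoved = true;
            if (cur.lag == 0) lagEverZero = true;
            if (cur.lag <= prev.lag) lagAlwaysGrew = false;
        }
        if (lagEverZero) return PartitionStatus.OK;                  // Rule 3
        if (!offsetMoved) return PartitionStatus.STALL;              // Rule 4 (lag is non-zero here)
        if (lagAlwaysGrew) return PartitionStatus.WARN;              // Rule 5
        return PartitionStatus.OK;
    }

    /** Rolls partition statuses up into one consumer-group status. */
    static GroupStatus rollUp(List<PartitionStatus> partitions) {
        GroupStatus result = GroupStatus.OK;
        for (PartitionStatus p : partitions) {
            if (p == PartitionStatus.STOP || p == PartitionStatus.REWIND
                    || p == PartitionStatus.STALL) {
                return GroupStatus.ERROR;                            // any error wins
            }
            if (p == PartitionStatus.WARN) result = GroupStatus.WARN;
        }
        return result;
    }
}
```

Under this reading, a window with only one sample has a first-to-last span of zero, so Rule 1 reports STOP as soon as any time has passed since that sample. A real implementation may want a minimum window size before evaluating.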
Of course some of the bugs you were seeing were fixed as well:
- Synchronizing around all SQLite DB activity. SQLite only allows one operation at a time with the DB file.
- This fixed all DB create/update/delete issues at the expense of sometimes blocking a DB operation while another one is in progress. This is unavoidable with SQLite. The long-term fix will be to replace SQLite with a more appropriate DB engine.
- Fixed an issue where LogEndOffset and Lag can display incorrect values.
- Added retry logic around building the ZkUtils object. This fixed the issue where we would not re-connect to Zookeeper if the zk service went down and then was restored.
- Updated some dependency versions.
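To illustrate the SQLite fix above, here is one way to funnel every DB operation through a single lock so that only one runs at a time. This is only a sketch of the general technique: the SerializedDb class and its run method are hypothetical names, and the project's actual synchronization may be structured differently.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

/** Hypothetical gate that lets only one DB operation run at a time. */
final class SerializedDb {
    // A single lock shared by every caller; reentrant so an operation
    // may itself call run() without deadlocking.
    private final ReentrantLock lock = new ReentrantLock();

    /**
     * Runs op while holding the lock. Other callers block until it finishes,
     * which is the cost mentioned in the release note. In real use, op would
     * wrap the JDBC calls (and handle their SQLException) for one
     * create/update/delete/query.
     */
    <T> T run(Supplier<T> op) {
        lock.lock();
        try {
            return op.get();
        } finally {
            lock.unlock(); // always release, even if op throws
        }
    }
}
```

The trade-off is the one the note describes: correctness improves because operations never overlap, but a slow operation delays every other caller.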
v0.4.1 - Stability improvements
Lots of stability improvements in this release. Here are the changes:
Functional changes:
- Graphs could sometimes be difficult to read because the y-axis legend showed non-whole numbers even though the values being drawn are always whole numbers. The y-axis now draws only whole numbers in the legend, making the graph easier to interpret.
- Respect command-line arg kafkaOffsetForceFromStart, starting consumer offset listener clients from the beginning of the log by implementing a ConsumerRebalanceListener.
Stability improvements:
- Created function tryParseOffsetMessage to attempt to parse a kafka offset message retrieved from the internal committed offset topic:
- Handles messages of other types and questionable correctness.
- Added 100% unit-test coverage for this new function.
- Add robustness to the log-end-offset getter thread:
- No longer shutting down the application on error. Instead, closing and destroying the client and re-creating it.
- Sleeping on error before re-creating client and continuing to process
- Deal with thread-safety issues on shared memory between threads that retrieve data from Kafka.
- Stopped polluting consumer groups in zookeeper by not creating a unique consumer group name for the consumer-offset and log-end-offset listener at each client instantiation.
- Improved createNewAdminClient code, simplifying the error paths and properly calling close on error.
- Re-factored some of the error handling paths, simplifying them.
- Closing all kafka clients on error so connections do not leak.
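The "close, sleep, re-create" behaviour described for the log-end-offset getter thread can be sketched as a small loop. The Client interface, the factory, and the run method below are hypothetical stand-ins for the real Kafka client and thread, and the loop is bounded by maxPolls only to keep the example finite.

```java
import java.util.function.Supplier;

final class ResilientPoller {
    /** Hypothetical minimal client; stands in for a Kafka consumer. */
    interface Client {
        void poll() throws Exception;
        void close();
    }

    /**
     * Polls up to maxPolls times. On any error the current client is closed
     * and discarded, the thread sleeps, and a fresh client is created on the
     * next iteration instead of shutting the application down.
     * Returns how many clients were created.
     */
    static int run(Supplier<Client> factory, int maxPolls, long sleepMs) {
        Client client = null;
        int created = 0;
        try {
            for (int i = 0; i < maxPolls; i++) {
                try {
                    if (client == null) {
                        client = factory.get();
                        created++;
                    }
                    client.poll();
                } catch (Exception e) {
                    // Close on error so the connection does not leak.
                    if (client != null) {
                        client.close();
                        client = null;
                    }
                    try {
                        Thread.sleep(sleepMs); // back off before re-creating
                    } catch (InterruptedException ie) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        } finally {
            if (client != null) client.close();
        }
        return created;
    }
}
```

Creating the client inside the same try block means a failure in the factory itself is handled by the same sleep-and-retry path.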
General improvements:
- Begin to reduce usage of Zookeeper when using offsetStorage = kafka:
- Override getTopics() in KafkaOffsetGetter to retrieve topics directly from the Kafka broker
- Override getClusterVis() in KafkaOffsetGetter to retrieve cluster information directly from the Kafka broker.
- Use constants for all properties in createNewKafkaConsumer().
- Fixed some bad grammar in error messages.
- Fixed silly com.twitter.util-core dependency in build.sbt.
v0.4.0: Support for Kafka brokers >= v0.9 !!!
Big news!!
kafka-offset-monitor now supports Kafka brokers >= v0.9. This includes the added support for Kerberos / SASL_PLAINTEXT!
- Upgraded Kafka bindings to 0.9.0.1 and changed kafka client to new Java-based client - supporting Kafka brokers >= v0.9 and also supporting authentication/authorization.
- This will no longer work with versions of kafka pre-0.9.0.1, but has been tested to work on 0.9.0.1 and 0.10.
- Re-factored usage of ZkUtils due to a significant API change from 0.8 to 0.9.
- Added two new command-line parameters, kafkaBrokers and kafkaSecurityProtocol, allowing users to choose SASL_PLAINTEXT for the security protocol when using kafka as the offset storage location, but defaulting to the original PLAINTEXT protocol when no option is chosen.
- Replaced copy-pasted wire-protocol code that retrieves metadata from the kafka brokers with an implementation that uses the kafka provided AdminClient.
- Replaced usage of the kafka Scala client from pre-0.9 with the new kafka java-based client.
- NOTE: While I have tried to make all of the necessary changes, zookeeper and storm offset storage paths have not been tested outside of the existing unit-tests.
- Refactored ZkUtilsWrapper due to underlying API change of ZkUtils.
- Changed the headings on the group include html to be more representative of the actual data.
- Offset becomes CommittedOffset.
- logSize becomes LogEndOffset.
- Upgraded Scala to 2.11.8
- Upgraded SBT to 0.13.13
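As an illustration of the kafkaBrokers and kafkaSecurityProtocol parameters mentioned in these notes, the sketch below shows how two such values might be turned into Kafka client properties, falling back to PLAINTEXT when no protocol is given. The keys bootstrap.servers and security.protocol are standard Kafka client settings. The class and method names, and the assumption that the command-line values map directly onto those keys, are mine.

```java
import java.util.Properties;

final class ConsumerSecurityConfig {
    static final String DEFAULT_PROTOCOL = "PLAINTEXT";

    /**
     * Builds client properties from the two command-line values.
     * kafkaSecurityProtocol may be null or blank, in which case the
     * original PLAINTEXT protocol is used.
     */
    static Properties build(String kafkaBrokers, String kafkaSecurityProtocol) {
        Properties props = new Properties();
        props.put("bootstrap.servers", kafkaBrokers);
        boolean unset = kafkaSecurityProtocol == null
                || kafkaSecurityProtocol.trim().isEmpty();
        props.put("security.protocol",
                unset ? DEFAULT_PROTOCOL : kafkaSecurityProtocol.trim());
        return props;
    }
}
```

A SASL_PLAINTEXT setup normally also needs Kerberos/JAAS configuration outside these properties, which this sketch does not cover.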