Kaiju is an artifact based on the Esper engine that implements event-driven observability for container orchestration with Kubernetes.
The content of this repository is discussed in detail in:
Mario Scrocca, Riccardo Tommasini, Alessandro Margara, Emanuele Della Valle, and Sherif Sakr. 2020. The Kaiju project: enabling event-driven observability. In Proceedings of the 14th ACM International Conference on Distributed and Event-based Systems (DEBS ’20). Association for Computing Machinery, New York, NY, USA, 85–96. DOI:https://doi.org/10.1145/3401025.3401740
Key aspects:
- Kaiju can process observations (i.e., metrics, logs, and traces) as events in real time
- Kaiju is an easily pluggable solution for companies already adopting projects of the CNCF stack
- Kaiju implements a modular solution that accounts for the different processing required by metrics, logs, and traces
- Kaiju enables the definition and processing of custom and configurable types of events exchanged between different components
The repository is organized in 5 main folders:

- `kaiju`: Contains the source files for the Kaiju modules.
- `rim`: Contains a modified version of the HotR.O.D. demo application by Uber.
- `kube`: Contains manifests to launch an environment for experiments using a Kubernetes cluster.
- `fluentd-kaiju`: Contains files to build the modified Fluentd image reporting logs to Kaiju.
- `kaiju-agent`: Contains files to build the modified agent reporting traces to Kaiju.
Kaiju is composed of four different types of modules: each one collects a different sort of data and can forward generic events modelled as POJOs (the `Event.java` class in the Kaiju project). The modularization enables horizontal scalability for each type of module and the hierarchical composition of `kaiju-hl` modules.
The Kaiju image can be built using the provided `Dockerfile` and then launched with different arguments to execute the four different modules. Available arguments are:
- `--mode` or `-m` (required): sets the type of module to be executed. Values can be `traces`, `logs`, `metrics`, `high-level`.
- `--parse` (default `true`): sets whether or not the file `stmts/statements.txt` should be parsed looking for statements to be installed.
- `--rtime` or `-rt` (default `2min`): sets the retention time substituted in statements using the `:retentionTime:` placeholder.
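For example, a minimal sketch of building and running a module (the image tag is an illustrative assumption, not a name fixed by the repository):

```sh
# Build the Kaiju image from the provided Dockerfile (tag name is illustrative)
docker build -t kaiju .

# Run the metrics module with a 5-minute retention time
docker run kaiju --mode metrics --rtime 5min
```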
Syntax of `statements.txt` files is: `key-config:value-config[,key-config:value-config]*=statement`. One statement per line. A `#` at the beginning of a line comments it out.
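As an illustration, a hypothetical `statements.txt` line (the configuration key and the EPL statement are invented for the sake of the syntax, not taken from the repository):

```
# Comment lines start with #
name:cpu-load=select * from Metric#time(:retentionTime:) where name='cpu_usage'
```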
To integrate `kaiju-metrics` with the CNCF stack, we would like to receive data from an agent that pulls Prometheus endpoints. We chose to adopt the Telegraf agent from Influx and its plugin to scrape Prometheus endpoints. This plugin can handle simple endpoints, multiple endpoints load-balanced behind a service, and resources annotated for scraping. Deploying a set of agents on the different nodes of the cluster collects metrics and forwards them to the `kaiju-metrics` component. We model each metric as a POJO taking into account the Telegraf output format: the timestamp when the metric was collected, a name, a key-value map of tags as labels, and a key-value map of fields representing the sampled values.
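A minimal sketch of such a POJO (field names and types are assumptions for illustration; see the actual classes in the `kaiju` folder):

```java
import java.util.Map;

// Illustrative metric POJO following the Telegraf output format;
// names and types are assumptions, not the repository's actual class.
public class Metric {
    private final long timestamp;              // collection time of the metric
    private final String name;                 // metric name
    private final Map<String, String> tags;    // key-value labels
    private final Map<String, Object> fields;  // sampled values

    public Metric(long timestamp, String name,
                  Map<String, String> tags, Map<String, Object> fields) {
        this.timestamp = timestamp;
        this.name = name;
        this.tags = tags;
        this.fields = fields;
    }

    public long getTimestamp() { return timestamp; }
    public String getName() { return name; }
    public Map<String, String> getTags() { return tags; }
    public Map<String, Object> getFields() { return fields; }
}
```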
To integrate `kaiju-logs` with the CNCF stack, we would like to implement a unified logging layer for all cluster logs. We chose to provide a solution based on Fluentd that requires only small modifications to configuration files. We implemented an output plugin able to forward data to the `kaiju-logs` component, and we specified the `Dockerfile`s to build a custom `fluentd-kaiju` image integrating this plugin (available in the `fluentd-kaiju` folder). We model each log as a POJO with a key-value map of fields. The main problem with logs is the multitude of different formats, often unstructured and difficult to parse. For this reason, we adopt the ingestion time as the timestamp for logs and, when processing them, try to flatten JSON-based syntax into a map of fields.
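A hedged sketch of this flattening step, assuming Jackson for JSON parsing (the repository's actual implementation may differ):

```java
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.HashMap;
import java.util.Map;

// Illustrative: recursively flatten a JSON log line into a key-value map,
// joining nested keys with dots (e.g. {"a":{"b":1}} -> {"a.b": 1}).
public class LogFlattener {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    public static Map<String, Object> flatten(String jsonLog) throws Exception {
        Map<String, Object> fields = new HashMap<>();
        flattenNode("", MAPPER.readTree(jsonLog), fields);
        return fields;
    }

    private static void flattenNode(String prefix, JsonNode node, Map<String, Object> out) {
        if (node.isObject()) {
            node.fields().forEachRemaining(e ->
                flattenNode(prefix.isEmpty() ? e.getKey() : prefix + "." + e.getKey(),
                            e.getValue(), out));
        } else {
            out.put(prefix, node.isNumber() ? node.numberValue() : node.asText());
        }
    }
}
```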
To integrate `kaiju-traces` with the CNCF stack, we would like to be able to retrieve traces from applications instrumented with the OpenTracing API. We chose to exploit the language-dependent client libraries provided by Jaeger, and we implemented a custom version of the jaeger-agent reporting spans to the `kaiju-traces` component (available in the `kaiju-agent` folder together with the `Dockerfile`). As in Jaeger, the agent receives spans pushed on a UDP port and forwards the data to the specified collector. We model each span and its contained events as a set of POJOs, as defined in the internal Jaeger model. We consider the ingestion time as the timestamp for spans.
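An illustrative sketch of such a span POJO, loosely following the Jaeger model (field names and types are assumptions, not the repository's actual classes):

```java
import java.util.List;
import java.util.Map;

// Illustrative span POJO loosely following the internal Jaeger model;
// names and types are assumptions. Constructor and getters omitted for brevity.
public class Span {
    private String traceId;
    private String spanId;
    private String operationName;
    private long ingestionTimestamp;         // used as the span timestamp
    private long durationMicros;             // span duration
    private Map<String, Object> tags;
    private List<Map<String, Object>> logs;  // events contained in the span
}
```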
`kaiju-hl` accepts socket connections from other modules to collect events. `kaiju-metrics`, `kaiju-logs`, `kaiju-traces`, or other `kaiju-hl` modules can forward data by assigning statements to a specific listener. `kaiju-hl` accepts only events parsable to the POJO described in the `Event.java` class, but offers the possibility to configure, through the file `stmts/events.txt`, a set of different events that are extracted from this incoming stream. The configuration file requires, for each event: the name of the event, key-datatype pairs for the payload, key-datatype pairs for the context, and the names of inherited events. For each configured event, the following statements are automatically created: (i) a `create schema` statement; (ii) an `insert into` statement from the `Event` stream, checking the existence of keys in the payload to identify items and selecting event properties from the payload/context.
- Syntax of `events.txt` files is: `event-name>{payload-key:type[,payload-key:type]*}>{context-key:type[,context-key:type]*}>{inherits-event-name[,inherits-event-name]*}`
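For instance, a hypothetical `ServiceError` event could be declared as follows (the event name, keys, and types are invented for illustration):

```
ServiceError>{code:int,message:string}>{serviceName:string}>{}
```

Conceptually, this would correspond to generated EPL statements along these lines (a sketch, not the exact statements Kaiju produces):

```sql
create schema ServiceError(code int, message string, serviceName string);

insert into ServiceError
select payload('code') as code,
       payload('message') as message,
       context('serviceName') as serviceName
from Event
where payload.containsKey('code') and payload.containsKey('message');
```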
The `rim` folder contains a modified version of the HotR.O.D. demo app from Jaeger:

- composed of four micro-services deployable separately;
- enabling configurable injection of sources of latency/errors (see the related ConfigMap in the `kube` folder for available modifications);
- enabling the generation of load tests through the embedded makerequests.js library;
- instrumented with OpenTracing to report also Kubernetes metadata.

For more details see the Rim repository.
An environment for experiments can be set up by downloading `kustomize` and configuring `kubectl` to communicate with a Kubernetes cluster. The `kube` folder contains the manifests:
```sh
cd kube
kustomize build | kubectl apply -f -
```
The basic deployment is based on a micro-service application running in a Kubernetes cluster and comprises:
- the Rim app with each service scaled up to 3 replicas;
- components talking with the Kubernetes API to emit additional cluster metrics ([node-exporter](https://github.com/prometheus/node_exporter), kube-state-metrics) and logs (event-router);
- Kaiju basic deployment with one module for each type;
- the Telegraf and Fluentd agents properly configured on each node;
- the custom agent to collect traces deployed as sidecar in pods of Rim.
To launch a golang-based load test created with the Rim application, build the `hotrod-load` image in the `kube/load` folder and deploy the Kubernetes Job using the provided manifest.
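A hedged sketch of these steps (the image tag and the manifest file name are assumptions; check the contents of `kube/load`):

```sh
cd kube/load
# Build the load-test image (tag name assumed from the text above)
docker build -t hotrod-load .
# Deploy the provided Job manifest (file name is an assumption)
kubectl apply -f job.yaml
```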