- Learn about the EFK Stack (Elasticsearch, FluentD, and Kibana)
- Configure and deploy our application to log prediction events to Elasticsearch
- Visualize events on a Kibana dashboard
- Learn how to close the data feedback loop
To close the data feedback loop, we can log events in production to collect data about how our model is performing against real data. This data can later be curated and labeled to improve the dataset used during training. This allows us to continuously improve our models in production!
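As a rough illustration of the feedback loop described above, the sketch below builds a prediction event, attaches a label once the real outcome is known, and turns labeled events into training rows. The field names (`features`, `prediction`, `label`, `logged_at`) and the helper functions are hypothetical and chosen only for illustration; the workshop application may log a different payload.

```python
import time


def build_prediction_event(features, prediction):
    """Describe one production prediction as a flat, loggable record."""
    return {
        "features": dict(features),  # model inputs as received in production
        "prediction": prediction,    # what the model answered
        "label": None,               # true outcome, unknown at prediction time
        "logged_at": time.time(),    # seconds since the epoch
    }


def attach_label(event, actual):
    """Return a copy of the event with the observed outcome filled in."""
    labeled = dict(event)
    labeled["label"] = actual
    return labeled


def to_training_rows(events):
    """Keep only labeled events and reshape them as rows for retraining."""
    return [
        {**event["features"], "target": event["label"]}
        for event in events
        if event["label"] is not None
    ]


if __name__ == "__main__":
    logged = build_prediction_event({"item": 42, "day_of_week": 3}, prediction=17.5)
    rows = to_training_rows([logged, attach_label(logged, actual=20.0)])
    print(rows)
```

Unlabeled events are dropped rather than guessed at, so only curated data flows back into the training set.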
In this workshop, we use the EFK stack for our monitoring and observability infrastructure. It is composed of three main components:
- Elasticsearch: an open source search engine.
- FluentD: an open source data collector for a unified logging layer.
- Kibana: an open source web UI that makes it easy to explore and visualize the data indexed by Elasticsearch.
- In GoCD, click on the little gear symbol next to ci-workshop-app-X to edit your deployment pipeline configuration.
- Open the "Environment Variables" tab and configure the FluentD host and port:
  FLUENTD_HOST = elastic-stack-fluentd-elasticsearch.elk.svc.cluster.local
  FLUENTD_PORT = 24224
- Save and return to the Dashboard page.
- Trigger a new application deployment pipeline and wait for it to succeed.
- Visit your application in production to make a few predictions.
- Visit the Kibana URL http://kibana.cd4ml.net and click on the "Discover" menu.
- In the search field, find the entries tagged with your user, with a query string tag:"userX.prediction" (substitute X with your user ID).
- Click "Refresh" and you should see your predictions logged!
- Done!
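The steps above can be sketched from the application's side: read the two environment variables set in GoCD, emit an event tagged with the user ID, and build the Kibana query that finds it. This is a minimal sketch, assuming the application uses the fluent-logger Python package (`fluent.sender.FluentSender`). The helper names, and the choice to return nothing when the variables are unset, are assumptions and not the workshop's actual code.

```python
import os


def read_fluentd_config(env=None):
    """Return (host, port) from FLUENTD_HOST / FLUENTD_PORT, or None if unset."""
    env = os.environ if env is None else env
    host = env.get("FLUENTD_HOST")
    port = env.get("FLUENTD_PORT")
    if not host or not port:
        return None  # assumption: the caller skips logging when unconfigured
    return host, int(port)


def make_sender(user_id, config):
    """Create a fluent-logger sender (requires `pip install fluent-logger`)."""
    from fluent import sender  # imported lazily so the other helpers work without it

    host, port = config
    # The first argument is the tag prefix for every event this sender emits.
    return sender.FluentSender(user_id, host=host, port=port)


def log_prediction(fluent_sender, event):
    """Emit one event; fluent-logger joins the sender tag and the label with a dot."""
    return fluent_sender.emit("prediction", event)


def kibana_query(user_id):
    """Build the query string to paste into Kibana's Discover search field."""
    return f'tag:"{user_id}.prediction"'


if __name__ == "__main__":
    print(read_fluentd_config())
    print(kibana_query("userX"))
```

With fluent-logger installed and FluentD reachable, `log_prediction(make_sender("userX", config), event)` would send the event with the tag `userX.prediction`, which is the tag the Kibana query searches for.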
NOTE: After the workshop ends, we delete all the infrastructure and GoCD pipelines for security and cost reasons.
You don’t need to use the same tools we chose to implement CD4ML. Get in touch with us if you want to learn how to run this workshop with your teams in your company!