In this tutorial, we will see how to build data pipeline versioning using Kolle without writing a single line of code. Blue-green deployment will be used for data pipeline versioning, because it makes both deploying a new version and rolling back to the previous one fast and easy.
Domain: Insurance policy
Source data: Datasets
version_0 -> version_1 -> version_2
- Build a data pipeline from semi-structured data
- Deploy pipeline as version_0
- Copy from current version_0 to new version_1
- Deploy new version_1 and switch traffic to version_1
- Remove version_0 from system
- Create new version_2 from current version_1
- Roll back version_2 and continue with version_1
- JSON file as data producer
- Kafka for event streaming to ingest and process data in real time
- Kolle for metadata repository and automation
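
Kolle performs the version steps listed earlier without any code, so the following is only a minimal sketch to make the blue-green lifecycle concrete. It is a toy in-memory model, not Kolle's API: the `PipelineRegistry` class, its method names, the `policies.json` file name and the `policy-events` topic are all hypothetical. It also assumes that version_2 had already taken traffic before it was rolled back, which the steps above do not state.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PipelineRegistry:
    """Toy in-memory model of blue-green pipeline versions (hypothetical, not Kolle's API)."""

    versions: dict = field(default_factory=dict)  # version name -> pipeline definition
    live: Optional[str] = None                    # version currently receiving traffic

    def deploy(self, name, definition):
        """Register a version; the first deployed version becomes live."""
        self.versions[name] = dict(definition)
        if self.live is None:
            self.live = name

    def copy(self, source, target):
        """Create a new version as a copy of an existing one."""
        self.versions[target] = dict(self.versions[source])

    def switch(self, name):
        """Point traffic at a deployed version (also used for rollback)."""
        if name not in self.versions:
            raise KeyError(f"unknown version: {name}")
        self.live = name

    def remove(self, name):
        """Delete a version that is no longer receiving traffic."""
        if name == self.live:
            raise ValueError("cannot remove the live version")
        del self.versions[name]


registry = PipelineRegistry()

# Build and deploy the pipeline as version_0 (file and topic names are made up).
registry.deploy("version_0", {"source": "policies.json", "topic": "policy-events"})

# version_0 -> version_1: copy, switch traffic, then remove the old version.
registry.copy("version_0", "version_1")
registry.switch("version_1")
registry.remove("version_0")

# version_1 -> version_2, then roll back: traffic returns to version_1.
registry.copy("version_1", "version_2")
registry.switch("version_2")
registry.switch("version_1")
registry.remove("version_2")

print(registry.live)  # prints: version_1
```

The point of the sketch is that the old version stays deployed until traffic has moved, so a rollback is just another traffic switch rather than a redeployment.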