diff --git a/README.md b/README.md
index 46e31fa6..e9b9719d 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,41 @@
 # Tortoise
 
-**Tortoise is under the active development and not production ready yet.**
-
 Tortoise
 
 Get cute Tortoises into your Kubernetes garden and say goodbye to the days optimizing your rigid autoscalers.
 
+## Motivation
+
+At Mercari, the responsibilities of the Platform team and the service development teams are clearly separated, and not all service owners possess expert knowledge of Kubernetes.
+
+Also, Mercari has embraced a microservices architecture, currently managing over 1000 Deployments, each with its dedicated development team.
+
+To effectively drive FinOps across such a sprawling landscape,
+it's clear that the platform team cannot individually optimize all services.
+As a result, they provide a plethora of tools and guidelines to simplify Kubernetes optimization for service owners.
+
+But even with these, manually optimizing various parameters across different resources,
+such as resource requests/limits, HPA parameters, and Golang runtime environment variables, presents a substantial challenge.
+
+Furthermore, this optimization demands constant engineering effort from each team -
+adjustments are necessary whenever a change impacts resource usage, which can happen frequently:
+changes in implementation can alter resource consumption patterns, traffic volume fluctuates, etc.
+
+Therefore, keeping our Kubernetes clusters optimized would require mandating that all teams engage in complex manual optimization processes indefinitely,
+or until Mercari goes out of business.
+
+To address these challenges, the platform team has embarked on developing Tortoise,
+an automated solution designed to meet all Kubernetes resource optimization needs.
+
+This approach shifts the optimization responsibility from service owners to the platform team (Tortoises),
+allowing for comprehensive tuning by the platform team to ensure every Tortoise in the cluster adapts to its workload.
+Service owners, on the other hand, need to configure only a minimal number of parameters
+to start autoscaling with Tortoise, significantly simplifying their involvement.
+
+See more details in the blog post (also available in Japanese):
+- [Tortoise: Outpacing the Optimization Challenges in Kubernetes at Mercari](https://engineering.mercari.com/en/blog/entry/20240206-3a12bb1288/)
+- [人間によるKubernetesリソース最適化の”諦め”とそこに見るリクガメの可能性](https://engineering.mercari.com/blog/entry/20240206-3a12bb1288/)
+
 ## Install
 
 You cannot get it from the breeder, you need to get it from GitHub instead.
@@ -19,34 +49,17 @@ make deploy
 
 You don't need a rearing cage, but need VPA in your Kubernetes cluster before installing it.
 
-## Motivation
-
-Many developers are working in Mercari, and not all of them are the experts of Kubernetes.
-The platform has many tools and guides to simplify the task of optimizing resource requests,
-but the optimization takes engineering cost in every team constantly.
-
-The optimization should be done every time the situation around the service get changed, which could happen easily and frequently.
-(e.g., the implementation change could change the way of consuming resources, the amount of traffic could be changed, etc)
-
-Also, when it comes to HorizontalPodAutoscaler(HPA), it's nearly impossible for human to optimize.
-It’s not a simple problem which we just set the target utilization as high as possible –
-there are many scenarios where the actual resource utilization doesn’t reach the target resource utilization in the first place
-(because of multiple containers, minReplicas, unbalanced container’s size etc).
+## Usage
 
-To overcome those challenges,
-the platform team start to have Tortoise, which is the automated solution for all optimization needs to be done for Kubernetes resource.
+As described in the [Motivation](#motivation) section, Tortoise exposes many global parameters to the cluster admin, while exposing only a few parameters in each Tortoise resource.
 
-It aims to move the responsibility of optimizing the workloads from the application teams to tortoises (Platform team).
-Application teams just need to set up Tortoise, and the platform team will never bother them again for the resource optimization -
-all actual optimization is done by Tortoise automatically.
+### Cluster admin
 
-See a detailed motivation in the blog post:
-- [Tortoise: Outpacing the Optimization Challenges in Kubernetes at Mercari](https://engineering.mercari.com/en/blog/entry/20240206-3a12bb1288/)
-- [人間によるKubernetesリソース最適化の”諦め”とそこに見るリクガメの可能性](https://engineering.mercari.com/blog/entry/20240206-3a12bb1288/)
+See the [Admin guide](./docs/admin-guide.md) to understand how to configure the Tortoise controller to fit the workloads in your cluster.
 
-## Usage
+### Tortoise users
 
-Tortoise has a very simple interface:
+The Tortoise CRD itself has a very simple interface:
 
 ```yaml
 apiVersion: autoscaling.mercari.com/v1beta3
@@ -62,12 +75,11 @@ spec:
       name: sample
 ```
 
-Then, Tortoise creates fully managed autoscalers (HPA and VPA).
-
-Despite its simple appearance, it contains a rich collection of historical data on resource utilization beneath its shell,
+Then, Tortoise creates an HPA and a VPA under the hood.
+Despite its simple appearance, each Tortoise stores a rich collection of historical data on resource utilization beneath its shell,
 and cleverly utilizes them to manage parameters in autoscalers.
 
-Please refer to [User guide](./docs/user-guide.md) for other parameters.
+Please refer to the [User guide](./docs/user-guide.md) to learn more about the other parameters.
 
 ## Documentations
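For readers of this diff: the hunks above show only fragments of the example manifest (the `apiVersion` line and the target `name: sample`). A complete Tortoise object might look like the sketch below; every field not visible in the diff (the `kind`, metadata, `updateMode`, and the `targetRefs.scaleTargetRef` nesting) is an assumption about the `autoscaling.mercari.com/v1beta3` schema and should be checked against the repository's User guide:

```yaml
# Hedged sketch of a full Tortoise manifest. Only the apiVersion and the
# target Deployment name appear verbatim in the diff above; the rest is assumed.
apiVersion: autoscaling.mercari.com/v1beta3
kind: Tortoise
metadata:
  name: example-tortoise      # hypothetical name
  namespace: default          # hypothetical namespace
spec:
  updateMode: Auto            # assumed field: let Tortoise apply its recommendations
  targetRefs:
    scaleTargetRef:           # assumed structure: points at the workload to manage
      kind: Deployment
      name: sample            # the target Deployment shown in the diff context
```

Per the Usage hunk, applying such a manifest would cause Tortoise to create and manage an HPA and a VPA for the `sample` Deployment under the hood.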