blue green strategy: add delay before deleting venerable app #569

Open
wants to merge 3 commits into
base: main

Conversation

Contributor

@Cocossoul Cocossoul commented Sep 11, 2024

Some apps can take a few moments to be fully up even after Cloud Foundry has declared them "started", for example while they run database schema migrations or establish connections to clients.

We had some reports of downtime because the venerable app was deleted while the new app was not yet fully up, even though Cloud Foundry showed it as started. This is not a bug in Cloud Foundry, more a practical issue on our side.

The workaround we have found is to add a preconfigured delay before deleting the venerable app, and I think it could be useful to others in the same situation.

DefaultBindTimeout = 5 * time.Minute
DefaultStageTimeout = 15 * time.Minute
DefaultAppPort = 8080
DefaultBlueGreenPostStartupWaitTime = 10
Contributor Author

Maybe we should set it to 0 by default so that it doesn't affect people who don't want it.

@Cocossoul Cocossoul changed the title from "blue green strategy: add time before deleting venerable app" to "blue green strategy: add delay before deleting venerable app" Sep 11, 2024
@loafoe
Member

loafoe commented Oct 22, 2024

@Cocossoul could you describe this flag in the documentation as well? Otherwise LGTM 👍🏻

@loafoe loafoe self-requested a review October 22, 2024 11:22
@Cocossoul
Contributor Author

Cocossoul commented Oct 25, 2024

@sleungcy does the documentation I added seem alright? I'm not really sure how to go about this, so I'm open to suggestions.

I'm planning to use the "workarounds" section for the #570 option too.

@sleungcy
Collaborator

sleungcy commented Oct 25, 2024

@Cocossoul I'd say it makes sense to have this default to 0, and to enable it only for the applications that need the extra time.

However, I have one concern: an application should not return a successful health check until it is really ready to serve traffic. If the application reports its health check status correctly, you should not end up in this scenario where the venerable app is destroyed before the green version of the app is ready.

The extra time each application needs will be hard to gauge. The setup time may vary between landscapes, locations, regions, and app versions. Tracking and maintaining this delay will add technical debt and overhead to the teams. Hardcoding it to large values and setting it once might seem like a solution, but it results in two copies of the app running, consuming costs and quotas. In large deployments, one application can have over 100 instances, each with 8GB of memory. Doubling this amount will significantly impact the quota available for other applications. It is preferable to delete the old app immediately once the new app is ready.
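
As a back-of-the-envelope check of the quota figures above, the memory held during the overlap can be computed directly. The function name `overlapMemoryGB` is hypothetical, and the sketch assumes the old and new apps run the same number of instances with the same memory limit.

```go
package main

import "fmt"

// overlapMemoryGB returns the total memory reserved while both the venerable
// app and the new app are running: one full copy for each.
func overlapMemoryGB(instances, memPerInstanceGB int) int {
	return 2 * instances * memPerInstanceGB
}

func main() {
	single := 100 * 8 // one copy of the app: 100 instances at 8 GB each
	fmt.Println(single, overlapMemoryGB(100, 8)) // prints "800 1600"
}
```

For the example in this comment, every minute of delay keeps an extra 800 GB of quota reserved.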

3 participants