Change provider balance on slow delivery every 5 minutes #4287

Merged: CrystalPea merged 2 commits into main from change_provider_weighing_more_often_when_needed on Dec 16, 2024

Conversation

@CrystalPea (Contributor) commented Dec 3, 2024

Change the provider balance on slow delivery every 5 minutes instead of every 10 minutes. This way we react faster, and fewer text messages get delayed.

This becomes more important as we now send lots of login codes for various organisations.

Also rebalance back to the middle every 15 minutes instead of every hour.

Together, these two changes make our provider balancing more dynamic. Hopefully this will help us deliver messages quickly when one of the providers has delivery issues, while also giving traffic back to that provider more quickly once it recovers.

Check commit messages for more detail.

@CrystalPea force-pushed the change_provider_weighing_more_often_when_needed branch 2 times, most recently from 305bdbc to 029a8bf on December 6, 2024 17:18
Instead of every 10 minutes - this way we react faster, and
fewer text messages get delayed.

This becomes more important as we now send lots of login codes
for various organisations.

Here are some considerations for the change:
1. If we move from 10 minutes to 5 minutes, wouldn’t each decision reuse notifications that were already counted in the previous weighting?

This is a valid question. It already happens with the current setup, though: we look at the last 15 minutes of SMS, but we change the weighting every 10 minutes.
So this wouldn’t be a new issue. We could also resolve it either by prioritising the best possible service for our users or by prioritising the fairest possible outcome for our providers. I believe getting text messages delivered promptly is more important than a provider briefly losing more traffic because they were slow within the last 15 minutes. Also, if the slowdown was just a short blip, then on a subsequent delivery speed check some of the notifications will be from after the blip passed, which should make a further reduction in their priority less likely.

2. We should learn more before we merge it.

On one hand, fair. On the other, I think this is tuning rather than a fundamental change, and we have already given it some thought.
We identified a problem (we adjust ratios too slowly when one of our providers has a period of slow delivery), we have an adjustment that should improve this, and we considered the potential downside above (a provider’s ratio will be lowered more quickly when they are slow, which means that after their delivery speed recovers we will lower their ratio one more time than we would have before).
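
To make the two considerations above concrete, here is a minimal Python sketch (not the code in this PR) of how the 15-minute lookback interacts with the check interval. The function names, the minute-based timeline and the exact window boundaries are assumptions for illustration; only the 15-minute lookback and the 5 and 10 minute intervals come from the text above.

```python
# Illustrative sketch only: names and window boundaries are hypothetical.

LOOKBACK_MINUTES = 15  # from the commit message: each check looks at the last 15 minutes of SMS


def shared_minutes(interval, lookback=LOOKBACK_MINUTES):
    """Minutes of data that two consecutive checks have in common."""
    return max(0, lookback - interval)


def checks_seeing(sent_at, interval, lookback=LOOKBACK_MINUTES, horizon=120):
    """Times (in minutes) of the checks whose lookback window contains
    a message sent at minute `sent_at`.

    Assumes checks run at minutes 0, interval, 2 * interval, ... and that
    a check at minute t covers the half-open window (t - lookback, t].
    """
    return [
        t
        for t in range(0, horizon + 1, interval)
        if t - lookback < sent_at <= t
    ]


if __name__ == "__main__":
    for interval in (10, 5):
        print(
            f"every {interval} min: consecutive checks share "
            f"{shared_minutes(interval)} min of data; a message sent at "
            f"minute 20 is seen by checks at {checks_seeing(20, interval)}"
        )
```

Under these assumptions, consecutive checks share 5 minutes of data at a 10-minute interval and 10 minutes at a 5-minute interval, and a slow message sent at minute 20 is seen by the checks at minutes 20, 25 and 30 rather than only 20 and 30. Whether each of those checks actually lowers the provider's ratio depends on the slow-delivery threshold, which this sketch does not model.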
@CrystalPea force-pushed the change_provider_weighing_more_often_when_needed branch from 029a8bf to 6a422f8 on December 6, 2024 17:23
We used to rebalance back to the middle only once an hour - let's
go down to every 15 minutes, to balance with the shorter intervals
for taking traffic away on slow delivery.
@CrystalPea force-pushed the change_provider_weighing_more_often_when_needed branch from 6a422f8 to c3822c1 on December 10, 2024 17:40
@CrystalPea merged commit f97c64f into main on Dec 16, 2024
4 checks passed
@CrystalPea deleted the change_provider_weighing_more_often_when_needed branch on December 16, 2024 11:10