Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[feature] Batch email notifications to prevent email flooding #132

Open
nemesifier opened this issue Sep 23, 2020 · 7 comments · Fixed by #276
Open

[feature] Batch email notifications to prevent email flooding #132

nemesifier opened this issue Sep 23, 2020 · 7 comments · Fixed by #276
Assignees
Labels
enhancement New feature or request important

Comments

@nemesifier
Copy link
Member

nemesifier commented Sep 23, 2020

Whenever there are infrastructure issues that cause lots of alerts, email boxes start to refuse receiving alert emails, which raise errors but is also bad because the email address sending the alerts could be flagged as spam.

The solution for this problem is to add the possibility to batch emails in a way that if there are more than X emails to the same user in less than X minutes, the subsequent emails get batched so they don't get sent singularly but rather summarized in a single email sent after the situation cools off (say X minutes after the user is not receiving emails).

A default configuration could be as follows:

  • if more than 1 email has been sent to a specific user in the last 30 minutes
  • subsequent emails to the same user get squashed in a summary, sending is paused for 30 minutes
  • if another email is sent to the user, it will also be added to the summary which is on hold for sending and the timer will be reset (eg: 30 more minutes), this repeats until the timer expires and the summary is finally sent

It's just an idea for now, we can improve it further.

A good example of an open source application which does this is sentry:

Screenshot from 2023-02-07 16-09-16

@nemesifier nemesifier added the enhancement New feature or request label Sep 23, 2020
@pandafy pandafy added this to the 0.4.0 milestone Jan 26, 2022
@nemesifier nemesifier removed this from the 0.4.0 milestone Jan 31, 2022
@nemesifier nemesifier changed the title [feature] Possibility to batch emails [feature] Batch email notifications to prevent email flooding Feb 7, 2023
@pandafy
Copy link
Member

pandafy commented May 28, 2024

In our Weekly GSoC Meeting, I and @Dhanus3133 discussed this issue. Dhanus proposed this prototype for implementing this feature.

We brainstormed the following implementation in the meeting:

  1. When send_email_notification is executed
    • store the timestamp of the notification in the cache using user_id as cache key
    • do not send the email to the user
    • do not set emailed=True on the notification
    • schedule a task batch_email_notification to execute after 30 minutes from now
  2. When the send_email_notification is executed again for the same user
    • check if the last notification time is present in the cache, if not go to step 1
    • if the last notification time is less than 30 minutes ago, then return from the function. This means the current notification will be sent in the batch notification email
    • if the last notification time is more than 30 minutes ago, overwrite the time and schedule the batch_email_notification task to execute after 30 minutes from now

batch_email_notification task:

  • this task will loop over all the unsent notification of the user in the last 30 mintues Notification.objects.filter(emailed=False, recipient_id=user_id, created_gte=now()-timedelta(minutes=30))
  • we will create a separate template for sending batch notifications
  • add all the unsent notifications to the email and the email to the user.

@pandafy
Copy link
Member

pandafy commented May 28, 2024

  • if another email is sent to the user, it will also be added to the summary which is on hold for sending and the timer will be reset (eg: 30 more minutes), this repeats until the timer expires and the summary is finally sent

@nemesifier this may lead to a user never receiving a notification, if the system manages to generate atleast one notification in the 30 minutes interval.

Maybe, we can implement some kind of backoff? E.g. first batch notification is sent at 30 minutes, second after 60 minutes from the last one and so on. I think, 30 minutes is a good cooldown time for batching. We may not even need to implement the backoff mechanism.

@pandafy
Copy link
Member

pandafy commented May 28, 2024

Thinking out loud

We should not include all the notification in the email because it may lead to a very long email. Maybe, we should should show only 10 notifications in the email like in the shared screenshot and provide a URL in the end "view all notifications". We should show the correct number of notifications in the email though.

@Dhanus3133
Copy link
Member

batch_email_notification task:
this task will loop over all the unsent notification of the user in the last 30 mintues Notification.objects.filter(emailed=False, recipient_id=user_id, created_gte=now()-timedelta(minutes=30))

By this, we will send the emails to all the messages created recently, but we should also validate the user email verified and the organization email preference for the messages.

If these must be even validated then I think of two different approaches we can use.

  1. Validate these details again in tasks before sending an email.
  2. We Can store the Notification IDs in a cache and use it in the celery task so this can avoid querying each notification everytime.

Let me know which one suits good.

@pandafy
Copy link
Member

pandafy commented Jun 6, 2024

I don't think we need to verify email preference again in the task. We are already doing it in the send_notification_email handler. That should be sufficient.

@Dhanus3133
Copy link
Member

Dhanus3133 commented Jun 6, 2024

I don't think we need to verify email preference again in the task. We are already doing it in the send_notification_email handler. That should be sufficient.

Doesn't this cause an issue when we have it for two different organizations with different email preferences as we only filter it by recipient_id and not by org.

@pandafy
Copy link
Member

pandafy commented Jun 6, 2024

Doesn't this cause an issue when we have it for two different organizations with different email preferences as we only filter it by recipient_id and not by org.

A really good catch @Dhanus3133 👏🏼

I think, we can store the notification IDs in the cache as you proposed then.

nemesifier added a commit that referenced this issue Jul 30, 2024
Implements and closes #132.

---------

Co-authored-by: Federico Capoano <[email protected]>
Co-authored-by: Gagan Deep <[email protected]>
Dhanus3133 added a commit that referenced this issue Aug 21, 2024
Implements and closes #132.

---------

Co-authored-by: Federico Capoano <[email protected]>
Co-authored-by: Gagan Deep <[email protected]>
@nemesifier nemesifier moved this to To do general in Roadmap Aug 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request important
Projects
Status: To do general
Development

Successfully merging a pull request may close this issue.

3 participants