Add an exponential backoff to harvester keep-alive notifications #326

Closed
jinnatar wants to merge 15 commits

Conversation

@jinnatar (Contributor) commented Jan 14, 2022

At first, notifications arrive at roughly the same rate as before, but the delay between them grows gradually over time. Keep-alive is still checked at the same frequency; failures that fall inside the backoff window are only logged, with the next notification threshold included in the log message.

Fixes #317

  • Also includes merging Docker changes from main -> dev
  • A Docker image has been built for testing: ghcr.io/artanicus/chiadog:keepalive_antispam-f73494
  • Example log output with extra debug logging and a shortened 60-second threshold, to make the behaviour easier to follow:
[2022-01-14 23:14:09] [    INFO] --- Connected HDD? The total plot count increased from 0 to 153. (non_decreasing_plots.py:33)
[2022-01-14 23:15:50] [ WARNING] --- Your harvester appears to be offline! No events for the past 101 seconds. (Iteration 0) (keep_alive_monitor.py:111)
[2022-01-14 23:16:51] [ WARNING] --- Your harvester appears to be offline! No events for the past 161 seconds. (Iteration 1) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:17:20.696072 (keep_alive_monitor.py:111)
[2022-01-14 23:17:51] [ WARNING] --- Your harvester appears to be offline! No events for the past 221 seconds. (Iteration 1) (keep_alive_monitor.py:111)
[2022-01-14 23:18:52] [ WARNING] --- Your harvester appears to be offline! No events for the past 282 seconds. (Iteration 2) (keep_alive_monitor.py:111)
[2022-01-14 23:19:52] [ WARNING] --- Your harvester appears to be offline! No events for the past 342 seconds. (Iteration 3) (keep_alive_monitor.py:111)
[2022-01-14 23:20:52] [ WARNING] --- Your harvester appears to be offline! No events for the past 403 seconds. (Iteration 4) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:20:54.446072 (keep_alive_monitor.py:111)
[2022-01-14 23:21:52] [ WARNING] --- Your harvester appears to be offline! No events for the past 463 seconds. (Iteration 4) (keep_alive_monitor.py:111)
[2022-01-14 23:22:53] [ WARNING] --- Your harvester appears to be offline! No events for the past 524 seconds. (Iteration 5) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:23:26.321072 (keep_alive_monitor.py:111)
[2022-01-14 23:23:53] [ WARNING] --- Your harvester appears to be offline! No events for the past 584 seconds. (Iteration 5) (keep_alive_monitor.py:111)
[2022-01-14 23:24:53] [ WARNING] --- Your harvester appears to be offline! No events for the past 644 seconds. (Iteration 6) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:27:14.133572 (keep_alive_monitor.py:111)
[2022-01-14 23:25:53] [ WARNING] --- Your harvester appears to be offline! No events for the past 704 seconds. (Iteration 6) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:27:14.133572 (keep_alive_monitor.py:111)
[2022-01-14 23:26:54] [ WARNING] --- Your harvester appears to be offline! No events for the past 764 seconds. (Iteration 6) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:27:14.133572 (keep_alive_monitor.py:111)
[2022-01-14 23:27:54] [ WARNING] --- Your harvester appears to be offline! No events for the past 824 seconds. (Iteration 6) (keep_alive_monitor.py:111)
[2022-01-14 23:28:54] [ WARNING] --- Your harvester appears to be offline! No events for the past 884 seconds. (Iteration 7) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:32:55.852322 (keep_alive_monitor.py:111)
[2022-01-14 23:29:54] [ WARNING] --- Your harvester appears to be offline! No events for the past 945 seconds. (Iteration 7) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:32:55.852322 (keep_alive_monitor.py:111)
[2022-01-14 23:30:54] [ WARNING] --- Your harvester appears to be offline! No events for the past 1005 seconds. (Iteration 7) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:32:55.852322 (keep_alive_monitor.py:111)
[2022-01-14 23:31:54] [ WARNING] --- Your harvester appears to be offline! No events for the past 1065 seconds. (Iteration 7) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:32:55.852322 (keep_alive_monitor.py:111)
[2022-01-14 23:32:54] [ WARNING] --- Your harvester appears to be offline! No events for the past 1125 seconds. (Iteration 7) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:32:55.852322 (keep_alive_monitor.py:111)
[2022-01-14 23:32:58] [ WARNING] --- Experiencing networking issues? Harvester did not participate in any challenge for 1128 seconds. It's now working again. (time_since_last_farm_event.py:40)
[2022-01-14 23:33:55] [    INFO] --- incident for EventService.HARVESTER is over (keep_alive_monitor.py:115)
[2022-01-14 23:35:55] [ WARNING] --- Your harvester appears to be offline! No events for the past 95 seconds. (Iteration 0) (keep_alive_monitor.py:111)
[2022-01-14 23:36:56] [ WARNING] --- Your harvester appears to be offline! No events for the past 156 seconds. (Iteration 1) To avoid flooding you, the next notification won't be sent before 2022-01-14 23:37:25.442751 (keep_alive_monitor.py:111)
[2022-01-14 23:37:56] [ WARNING] --- Your harvester appears to be offline! No events for the past 216 seconds. (Iteration 1) (keep_alive_monitor.py:111)
[2022-01-14 23:38:56] [ WARNING] --- Your harvester appears to be offline! No events for the past 276 seconds. (Iteration 2) (keep_alive_monitor.py:111)
[2022-01-14 23:39:02] [ WARNING] --- Experiencing networking issues? Harvester did not participate in any challenge for 282 seconds. It's now working again. (time_since_last_farm_event.py:40)
[2022-01-14 23:39:56] [    INFO] --- incident for EventService.HARVESTER is over (keep_alive_monitor.py:115)
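
For illustration, here is a minimal, self-contained sketch of the backoff behaviour described above. The class name BackoffNotifier and the exact doubling rule are assumptions made for this example; the PR's real implementation lives in keep_alive_monitor.py and may differ in detail.

from datetime import datetime, timedelta

class BackoffNotifier:
    """Send at most one notification per backoff window, doubling the
    window after each notification while the incident persists."""

    def __init__(self, base_interval_seconds: int):
        self.base_interval = timedelta(seconds=base_interval_seconds)
        self.iteration = 0
        self.next_allowed = None  # no active incident yet

    def should_notify(self) -> bool:
        now = datetime.now()
        if self.next_allowed is None:
            # First failure of a new incident: notify immediately.
            self.iteration = 1
            self.next_allowed = now + self.base_interval
            return True
        if now < self.next_allowed:
            # Still inside the backoff window: callers only log here.
            return False
        # Window expired: notify again and double the wait.
        self.iteration += 1
        self.next_allowed = now + self.base_interval * 2 ** (self.iteration - 1)
        return True

    def reset(self):
        # Incident over: the next failure starts a fresh sequence.
        self.iteration = 0
        self.next_allowed = None

The "Iteration N" counter and the "next notification won't be sent before ..." timestamps in the log above correspond roughly to iteration and next_allowed in this sketch.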

kanasite and others added 9 commits August 15, 2021 11:14
…i#307)

This commit adds a new notification for plot count increases (e.g. connecting an HDD) and makes the notifications for increases/decreases configurable via the config for every integration.
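
As a rough illustration of that kind of check (the class and method names below are invented for the example, not the project's actual code in non_decreasing_plots.py):

from typing import Optional

class PlotCountChecker:
    """Compare the current plot count against the last seen value and
    report increases (e.g. a reconnected HDD) as well as decreases."""

    def __init__(self):
        self.last_count: Optional[int] = None

    def check(self, current_count: int) -> Optional[str]:
        previous, self.last_count = self.last_count, current_count
        if previous is None or current_count == previous:
            return None
        if current_count > previous:
            return f"Connected HDD? The total plot count increased from {previous} to {current_count}."
        return f"Disconnected HDD? The total plot count decreased from {previous} to {current_count}."
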
* Add a scripts/linux/chiadog.service systemd example that attempts to run chiadog in a more isolated environment: it creates a new limited user on each run (making much of the filesystem read-only) and sets the `.chia/mainnet` folders other than `log` to inaccessible.
* Move the offset file: previously, the `debug.log.offset` file was kept in the chiadog directory, but when running on a read-only filesystem we can't write there. Instead, create a temporary directory at runtime and store the offset file there. This also means we no longer need to delete the offset file on startup.
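
A minimal sketch of the temp-directory approach described in the second bullet, assuming nothing about chiadog's actual module layout (tempfile is standard library):

import os
import tempfile

# Create a per-run working directory instead of writing next to the code,
# so chiadog can run from a read-only filesystem. The directory (and the
# offset file inside it) is unique to this run, so there is no stale
# offset file to delete on startup.
work_dir = tempfile.mkdtemp(prefix="chiadog-")
offset_file = os.path.join(work_dir, "debug.log.offset")
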
…mi#314)

Sometimes it is useful to disable SMTP authentication when sending emails, for example when using a local postfix server, where the default install doesn't require or enable auth.

When the new config option enable_smtp_auth is set to false, the username_smtp and password_smtp config options are ignored.
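
A hedged sketch of what honouring that flag could look like in a notifier built on the standard-library smtplib; the function and parameter names echo the config keys above, but the surrounding code is an assumption, not chiadog's actual SMTP notifier:

import smtplib
from email.message import EmailMessage

def send_mail(msg: EmailMessage, host: str, port: int,
              enable_smtp_auth: bool,
              username_smtp: str = "", password_smtp: str = "") -> None:
    with smtplib.SMTP(host, port) as smtp:
        # Only log in when auth is enabled; a local postfix relay
        # typically accepts mail from localhost without credentials.
        if enable_smtp_auth:
            smtp.login(username_smtp, password_smtp)
        smtp.send_message(msg)
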
@jinnatar jinnatar marked this pull request as ready for review January 14, 2022 21:54
@martomi (Owner) left a comment

Thanks for following up with this! I have some feedback that needs to be addressed before I can upstream this. Let me know what you think.

src/notifier/__init__.py: two review comments (outdated, resolved)
if self._keep_alive_iteration[service] > 0:
    logging.info(f"incident for {service} is over")
    self._keep_alive_iteration[service] = 0
    self._keep_alive_incident_time[service] = None
@martomi (Owner) commented:

Now that I'm looking at this, it's actually making this class more complex than I'd like (and, really, a pain to maintain). It would be justified to refactor this into its own class. That would also let us cover the functionality with unit tests. I'd be hesitant to merge before we have some simple unit-test coverage.

As I see it, it should keep state for a single service, and then we can create instances for every service here. I'd propose the following interface (the name could be better):

from datetime import datetime

class EventThrottleHelper:
    def __init__(self, interval_seconds: int):
        self.event_counter = 0
        self.first_event_time = None
        self.interval_seconds = interval_seconds

    def should_trigger_event(self):
        time_now = datetime.now()

        # Auto-detect when to reset event_counter and
        # first_event_time based on current time,
        # interval_seconds and first_event_time

where you initialize the class with self._last_keep_alive_threshold_seconds[service], and everything else can be encapsulated.

The change in KeepAliveMonitor would then be very minimal: just extend the if-statement on line 80:

if seconds_since_last > self._last_keep_alive_threshold_seconds[service] and \
    throttle_helper[service].should_trigger_event():
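
For reference, a minimal, testable sketch of what the proposed helper could look like once fleshed out. The reset rule and the doubling backoff below are assumptions about the intent, not the reviewer's specification:

from datetime import datetime, timedelta

class EventThrottleHelper:
    def __init__(self, interval_seconds: int):
        self.event_counter = 0
        self.first_event_time = None
        self.interval_seconds = interval_seconds

    def should_trigger_event(self) -> bool:
        time_now = datetime.now()
        if self.first_event_time is None:
            # First event of a new incident: record it and trigger.
            self.first_event_time = time_now
            self.event_counter = 1
            return True
        # Exponential backoff: allow the next event only after
        # interval_seconds * 2^(n-1) has passed since the first event.
        next_allowed = self.first_event_time + timedelta(
            seconds=self.interval_seconds * 2 ** (self.event_counter - 1)
        )
        if time_now < next_allowed:
            return False
        self.event_counter += 1
        return True

    def reset(self):
        # Call when the incident is over so the next one starts fresh.
        self.event_counter = 0
        self.first_event_time = None

Because the class holds its own state, it can be unit-tested in isolation, for example by constructing it with a one-second interval and asserting on should_trigger_event() around short sleeps or a mocked clock.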

.github/workflows/publish-image.yml: review comment (outdated, resolved)
This wasn't caught earlier since the data from the Event isn't used yet. It is purely future-proofing so that the iteration count is available, even if it is not included in the message.
This had no practical application yet.