Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(iroh-dns-server)!: eviction of stale zonestore entries #2997

Merged
merged 26 commits into from
Dec 5, 2024
Merged

Conversation

rklaehn
Copy link
Contributor

@rklaehn rklaehn commented Dec 3, 2024

Description

Configurable eviction of stale zonestore entries.

This works by taking a snapshot of the database in regular intervals and checking if a record is possibly expired. For all possibly expired records a fire and forget message will be sent to the io write actor to check again. The actor will check the expiry again (entry could have been renewed since the last snapshot) and then deletes it if the final check confirms expiry.

Between expiry checks there is a configurable delay, so the thread is not constantly spinning checking for expiry.

We use a second thread since we don't want to block writing new entries by sifting through old entries.

Breaking Changes

  • struct config::Config has a new field zone_store
  • struct metrics::Metrics has a new field store_packets_expired

Notes & open questions

Note: there are two ways to do eviction. One is to carefully keep track of the time for each entry by having a second table that has (timestamp, key) as the key and () as the value. Then you could just evict without doing a full scan by sorting by time ascending.

The downside of course is that you need an entire new table, and you need to update this table every time you update an entry (delete (old time, id), insert (new time, id).

So I decided to just do a full scan instead for simplicity. We can change it if it becomes a problem.

Hm, maybe we should avoid the full scan after all. I can imagine the thing being a bit less responsive than usual while the scan is ongoing. Another idea would be to have an "event log" where you just store (time, id) -> () and then use that to look for eviction candidates. Don't bother cleaning up this event log on every update.

I have now implemented a second table. It is a multimap table from timestamp to id. This gets updated on every write (slight perf downside here), and can be used to scan for evictions without having to do a full scan.

There is quite a lot of code just to expose these config options. We could also omit this and just use reasonable defaults.

Change checklist

  • Self-review.
  • Documentation updates following the style guide, if relevant.
  • Tests if relevant.
  • All breaking changes documented.

@rklaehn rklaehn changed the base branch from main to rklaehn/batch-writes December 3, 2024 16:05
@rklaehn rklaehn requested a review from matheus23 December 3, 2024 16:35
Base automatically changed from rklaehn/batch-writes to main December 3, 2024 16:35
# Conflicts:
#	Cargo.lock
#	iroh-dns-server/Cargo.toml
#	iroh-dns-server/benches/write.rs
#	iroh-dns-server/src/store/signed_packets.rs
Copy link

github-actions bot commented Dec 3, 2024

Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2997/docs/iroh/

Last updated: 2024-12-05T14:31:09Z

Copy link

github-actions bot commented Dec 3, 2024

Netsim report & logs for this PR have been generated and is available at: LOGS
This report will remain available for 3 days.

Last updated for commit: db783b9

@rklaehn rklaehn changed the title feat(iroh-dns-server): eviction of stale zonestore entries feat(iroh-dns-server)!: eviction of stale zonestore entries Dec 3, 2024
a second table that has time -> Set<Key>
@rklaehn rklaehn requested a review from Arqu December 5, 2024 09:02
@rklaehn rklaehn marked this pull request as ready for review December 5, 2024 12:54
Copy link
Collaborator

@Arqu Arqu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Arqu
Copy link
Collaborator

Arqu commented Dec 5, 2024

Aside from the messed up git history I guess :D

Copy link
Contributor

@matheus23 matheus23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is quite a lot of code just to expose these config options. We could also omit this and just use reasonable defaults.

I think the options are neat, but I'll let @Arqu comment more on these preferences.

Some slight worry about keeping invariants, other than that, I think this looks good :)

EDIT: I wrote that comment before Arqu commented, seems like it's fine by him :) 👍

iroh-dns-server/src/store/signed_packets.rs Show resolved Hide resolved
iroh-dns-server/src/store/signed_packets.rs Outdated Show resolved Hide resolved
since time is probably unique in 99.9% of all cases, it does not make a diff...
@rklaehn rklaehn added this pull request to the merge queue Dec 5, 2024
Merged via the queue into main with commit 74884f1 Dec 5, 2024
25 of 26 checks passed
Copy link
Contributor

@matheus23 matheus23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would still like abstracting out insert_packet and remove_packet calls, but generally happy with this

@Arqu Arqu deleted the eviction branch December 5, 2024 21:30
if existing.more_recent_than(&packet) {
res.send(false).ok();
continue;
} else {
replaced = true;
// remove the packet from the update time index
tables.update_time.remove(&packet.timestamp().to_be_bytes(), key.as_bytes())?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be tables.update_time.remove(&existing.timestamp().to_be_bytes(), key.as_bytes())?; (existing instead of packet)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

5 participants