Range Selector #902
Picking up on this a month later: we have a plan for some of the primitives required to do set reconciliation, specifically a protocol for requesting whole collections, sub-collections, and sub-sequences of bytes within blobs. We haven't documented this properly yet, but Rüdiger from our team goes into it in detail here: https://youtu.be/bK9KDJxCfzI?t=271

This isn't set reconciliation itself; we'd need to build that on top. We have an early proof of concept that takes the first steps toward implementing CAR Mirror for the WNFS project: https://github.com/n0-computer/appa/blob/c5afcecbed0bab4b515c9e171e6a5d937ab4f440/src/main.rs#L272-L301

@AIDXNZ, the appa project is an example of "Custom Requests", an abstraction on top of the Iroh data transfer protocol we're designing that would let you tackle exactly these kinds of explorations, such as latency-based optimization, as an extension.
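(As a minimal sketch of what those three request granularities could look like as a Rust type. The `RangeRequest` enum, `Hash` alias, and field names below are invented for this discussion; they are not the actual Iroh protocol or the appa code.)

```rust
use std::ops::Range;

/// Stand-in for a content identifier; in Iroh this would be a BLAKE3 hash.
type Hash = [u8; 32];

/// Hypothetical request shapes, for illustration only.
enum RangeRequest {
    /// Ask for every blob in a collection.
    WholeCollection { root: Hash },
    /// Ask for a named subset of a collection's entries.
    SubCollection { root: Hash, names: Vec<String> },
    /// Ask for a sub-sequence of bytes within a single blob.
    BlobBytes { blob: Hash, bytes: Range<u64> },
}
```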
-
This pertains to the Range-Based Set Reconciliation protocol.
TL;DR: using small ranges for light nodes, embedded devices, and low-bandwidth areas.
This is less of a feature and more of a discussion, as I am just spitballing.
Here are some questions I have
Guidelines for selecting an initial range size (a rough sketch combining these follows the list):
Set size: for smaller sets (files), a large range size may be more appropriate (a lossy-ish approach), while larger sets (files) may benefit from smaller range sizes.
Number of differences: Sets with a higher number of differences may benefit from smaller range sizes to reduce the number of missing elements that need to be transmitted.
Available bandwidth: If bandwidth is limited, smaller range sizes may be more suitable, as they prevent duplicated chunks from being sent over the wire.
Computational power: Larger range sizes may require more computational power to compute the hash values of the ranges, so the available computational resources should be considered when selecting the range size.
Number of provider sets: For example, if you're asking for a file from one node, just use a large range or split the sets in half recursively. But if there are 3+ nodes providing the file, it would make more sense to start with a small range, and maybe use different ranges per provider, in my head at least lol.
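To make those trade-offs concrete, here is a rough Rust sketch that folds the factors above into one heuristic. Everything in it (the `Conditions` struct, the thresholds, the divisors) is made up for illustration; real values would have to come from measurement.

```rust
/// Rough link/set conditions; every field and threshold here is hypothetical.
struct Conditions {
    set_size: u64,         // number of elements (blobs) in the set
    estimated_diff: u64,   // rough guess at how many elements differ
    bandwidth_kbps: u64,   // available bandwidth
    provider_count: usize, // how many nodes can serve the data
}

/// Pick an initial range size (elements per range) from the conditions.
fn initial_range_size(c: &Conditions) -> u64 {
    // One provider: a large range (or plain recursive halving) is fine.
    if c.provider_count <= 1 {
        return c.set_size.max(1);
    }
    // Several providers: start from a fraction of the set so each provider
    // can be asked about different ranges, then shrink further when the
    // link is constrained or many differences are expected.
    let mut size = (c.set_size / 16).max(1);
    if c.bandwidth_kbps < 256 {
        size = (size / 4).max(1); // low bandwidth: keep duplicates off the wire
    }
    if c.estimated_diff > c.set_size / 2 {
        size = (size / 2).max(1); // many differences: narrower ranges
    }
    size
}
```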
Example
Let's imagine a low-resource node requesting a large set of blobs. A possible approach to finding the optimal range size would be to start with a small range size and gradually increase it while measuring the performance (i.e., the number of transmissions required and the duplicates) for each range size. Once the performance begins to degrade or level off, that range size could be considered the optimal range size for the set reconciliation. The node would terminate when the optimal range size is found, when the range size exceeds a predefined maximum value, or when the transfer is complete.
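A minimal sketch of that probing loop, assuming a hypothetical `sync_round` callback that runs one reconciliation pass at a given range size and reports rough cost metrics back:

```rust
/// What one reconciliation pass reports back; the fields are placeholders
/// for whatever metrics a real implementation can measure.
struct RoundStats {
    round_trips: u64,
    duplicate_bytes: u64,
    done: bool,
}

/// Probe for a good range size: start small, double it while the measured
/// cost keeps improving, and stop once it degrades or levels off, a
/// predefined maximum is reached, or the transfer completes.
fn tune_range_size(
    mut sync_round: impl FnMut(u64) -> RoundStats,
    max_range_size: u64,
) -> u64 {
    let mut range_size: u64 = 16; // start with a small range
    let mut best_cost = u64::MAX;
    let mut best_size = range_size;
    loop {
        let stats = sync_round(range_size);
        let cost = stats.round_trips + stats.duplicate_bytes;
        if cost < best_cost {
            best_cost = cost;
            best_size = range_size;
        } else if !stats.done {
            return best_size; // performance degraded or levelled off
        }
        if stats.done || range_size >= max_range_size {
            return best_size;
        }
        range_size *= 2; // gradually increase and measure again
    }
}
```

Doubling keeps the search short (logarithmic in the maximum range size), which matters on a low-resource node.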
Note that the specific implementation details and performance metrics may vary depending on the application, but you could always fall back to the recursive version by default. :)
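For completeness, a toy sketch of that recursive fallback, i.e. classic split-in-half range reconciliation: compare a fingerprint of a range on both sides, and if it differs, split the range and recurse until it is small enough to just exchange the elements. The fingerprint and exchange callbacks are placeholders for the real protocol messages.

```rust
fn reconcile(
    lo: u64,
    hi: u64,
    local_fp: &impl Fn(u64, u64) -> u64,
    remote_fp: &impl Fn(u64, u64) -> u64,
    exchange: &mut impl FnMut(u64, u64),
) {
    if lo >= hi || local_fp(lo, hi) == remote_fp(lo, hi) {
        return; // empty or matching range: nothing to transfer
    }
    if hi - lo <= 16 {
        exchange(lo, hi); // small enough: just send the differing elements
        return;
    }
    let mid = lo + (hi - lo) / 2;
    reconcile(lo, mid, local_fp, remote_fp, exchange);
    reconcile(mid, hi, local_fp, remote_fp, exchange);
}
```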