[wip] feat: willow implementation #2231

Closed
wants to merge 813 commits into from

Conversation

Frando
Member

@Frando Frando commented Apr 25, 2024

Description

Work in progress! Implementation of https://willowprotocol.org/ for iroh

Protocol

The protocol module contains the data structures and basic interactions as defined by the Willow specs. It is IO-less and not generic; all types are concrete.

Session

The session module contains the implementation of the WGPS protocol session.

  • Session setup
  • Private area intersection
  • Set reconciliation
  • Resource control
    • Binding resources
    • Freeing resources
    • Issue initial guarantees
    • Respect issued guarantees
    • Issue new guarantees
  • Live data mode between two peers
    • Subscribe to new inserts and send to active sessions
    • Only send entries for matching shared areas of interest
  • Verified streaming for payloads with bao outboards embedded
  • Payload requests
  • Live data mode in a swarm setting

Net

The net module opens sessions over iroh-net QUIC connections.

Store

The store module contains backing stores for entries and keys.

  • Store traits and generic functionality
    • Entry store trait
    • Key store trait
    • Use iroh-blobs for payload storage
    • Store capabilities by hash and do reference counting
    • Expose hook to prevent GC of payloads and/or do reference counting for payloads
  • PoC memory store
  • move store to own thread & sessions to local pools
  • redb store
    • port @rklaehn's redb PoC
    • figure out if the traits need to change for our delayed commit strategy
    • add commit to the store trait
  • subscriptions
    • reliable/persistent subscriptions

Engine

  • PeerManager (1 session per peer only)
  • Intents
    • changing between ReconcileOnce and Live modes

Other

  • How can builder APIs be done nicely over FFI?

Breaking Changes

Once we integrate, this will be a major breaking change for everything around docs: it will break not only APIs, but also the protocol and the storage. We cannot even offer automatic migration, because signatures change. How exactly we deal with this situation is TBD. Likely we will offer an out-of-tree tool to migrate data from the old iroh-docs to the new willow iroh-docs.

Notes & open questions

Change checklist

  • Self-review.
  • Documentation updates if relevant.
  • Tests if relevant.
  • All breaking changes documented.

@Frando Frando marked this pull request as draft April 25, 2024 10:23
Frando and others added 14 commits July 5, 2024 23:44
## Description

Is the tagline correct?
Can we enable this discord widget?

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [ ] Self-review.
- [ ] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
## Description

- Enable binding of the RPC endpoint to arbitrary addresses
- Add Dockerfile for iroh

## Breaking Changes

- `iroh::node::Node::my_rpc_port` is now `my_rpc_addr` and
returns `Option<SocketAddr>`
- `iroh::node::Builder::rpc_endpoint` takes `Option<SocketAddr>` instead
of `Option<u16>`
- added `iroh::node::Builder::enable_rpc_with_addr`, which allows
specifying the address and port to bind to
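
For callers that previously passed only a port, the migration might look like the sketch below. The helper name `port_to_rpc_addr` and the choice of the loopback address are assumptions made for illustration; only the move from `Option<u16>` to `Option<SocketAddr>` comes from the list above.

```rust
use std::net::{Ipv4Addr, SocketAddr};

/// Hypothetical helper (not part of iroh): converts the old `Option<u16>`
/// port setting into the `Option<SocketAddr>` shape described above.
/// Assumption: the caller wants to keep binding to the loopback interface.
fn port_to_rpc_addr(port: Option<u16>) -> Option<SocketAddr> {
    port.map(|p| SocketAddr::from((Ipv4Addr::LOCALHOST, p)))
}

fn main() {
    // An existing port-only setting becomes a full socket address.
    let addr = port_to_rpc_addr(Some(4919));
    println!("{addr:?}");

    // `None` stays `None`, so "no RPC endpoint" keeps its meaning.
    assert_eq!(port_to_rpc_addr(None), None);
}
```

Binding to an arbitrary address then only requires building a different `SocketAddr`; the value is passed on in the same way.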

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [x] Self-review.
- [x] Documentation updates if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.

---------

Co-authored-by: dignifiedquire <[email protected]>
## Description

Fix n0-computer/iroh-ffi#152

## Breaking Changes

None

## Notes & open questions

## Change checklist

- [x] Self-review.
- [x] ~~Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.~~
- [x] Tests if relevant.
- [x] All breaking changes documented.
## Description

Windows CI was so slow, the transfer was measured in *minutes*, not
*seconds* and printed as such to the CLI.

The CLI tests only expected "seconds" to be printed though. 😬 

Also went ahead and improved performance of `make_test_file` because
damn this test is already slow enough.

## Breaking Changes

None

## Notes & open questions

Idk, do you have any? :P

## Change checklist

- [X] Self-review.
- ~~[ ] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.~~
- ~~[ ] Tests if relevant.~~
- ~~[ ] All breaking changes documented.~~
## Description

Using [swarm-discovery](https://github.com/rkuhn/swarm-discovery), this
PR implements a discovery service for `LocalSwarmDiscovery`, allowing
nodes on the same local network to discover each other without using the
outside internet.

closes #2354

## Notes & open questions

1) Is `LocalNodeDiscovery` too much of a mouthful? `swarm-discovery`
says, of itself, "This library offers a lightweight discovery service
based on mDNS." Soooo, how comfortable are we just calling this `mdns`?
edit: calling this `LocalSwarmDiscovery` & not `mdns`

2) There are some const to bikeshed:
```rust
use std::time::Duration;

/// The n0 local node discovery name
const N0_MDNS_SWARM: &str = "iroh.local.node.discovery";
// edit: now N0_LOCAL_SWARM: &str = "iroh.loca.swarm"

/// Provenance string
const PROVENANCE: &str = "local.node.discovery";
// edit: now "local.swarm.discovery"

/// How long we will wait before we stop sending discovery items
const DISCOVERY_DURATION: Duration = Duration::from_secs(10);
```

3) For the `local_swarm_discovery` example, I pulled some code from
`iroh-cli` into a new `iroh-progress`, so that I could re-use some
progress terminal output in the example. `iroh-progress` might not be
the right name, but what do we think of a separate crate for
reusable UI-ish pieces? If we aren't on board with this, I can just copy
and paste the logic in for now & remove the crate.
edit: Removed this in favor of copy and pasting the code

4) I've added `LocalSwarmDiscovery` as a default discovery service, I'm
assuming that's what we are going for here?

## Change checklist

- [x] Self-review.
- [x] tests
- [x] Documentation updates if relevant.
## Description

<!-- A summary of what this pull request achieves and a rough list of
changes. -->
GitHub Actions `cargo deny` is failing at
https://github.com/n0-computer/iroh/actions/runs/9928231752/job/27438879671

% `cargo update -p bytes`
```
    Updating crates.io index
     Locking 1 package to latest compatible version
    Updating bytes v1.6.0 -> v1.6.1
note: pass `--verbose` to see 152 unchanged dependencies behind latest
```
https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md to get the
fix in
* tokio-rs/bytes#718
## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [ ] Self-review.
- [ ] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
## Description

Modifies the `flaky` workflow to generate a report with all the tests
that failed per matrix combo, so that we can receive it on discord. This
requires modifying the `tests` workflow as well to generate `libtest`
json reports.

Reports are uploaded as job artifacts with a unique name, and retained
for a day.

From my last commit at the time of writing, this is the report we get:
```md
Flaky tests failure:

- **ubuntu-latest all stable**
iroh-net::iroh_net::discovery::local_swarm_discovery::tests::test_local_swarm_discovery
ubuntu-latest default stable
iroh-cli::cli::cli_bao_store_migration
- **windows-latest all stable**
iroh-cli::cli::cli_provide_file_resume
iroh-cli::cli::cli_provide_tree_resume
- **windows-latest default stable**
iroh-cli::cli::cli_provide_file_resume
iroh-cli::cli::cli_provide_tree_resume
- **windows-latest none stable**
iroh-cli::cli::cli_provide_tree_resume
iroh-cli::cli::cli_provide_file_resume

See https://github.com/n0-computer/iroh/actions/workflows/flaky.yaml
```
which reads clear enough in discord imo
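
As an illustration of the report shape above, here is a minimal sketch of how such a message could be rendered. It assumes the failures have already been extracted from the per-job reports and grouped by matrix combo; the function name `render_report` and the map layout are hypothetical and not taken from the workflow itself.

```rust
use std::collections::BTreeMap;

/// Renders a report in the shape shown above.
/// Assumption: `failures` maps a matrix combo label to the names of the
/// tests that failed in it; parsing the `libtest` json is out of scope here.
fn render_report(failures: &BTreeMap<String, Vec<String>>, workflow_url: &str) -> String {
    let mut out = String::from("Flaky tests failure:\n\n");
    for (combo, tests) in failures {
        // Skip matrix combos in which nothing failed.
        if tests.is_empty() {
            continue;
        }
        out.push_str(&format!("- **{combo}**\n"));
        for test in tests {
            out.push_str(test);
            out.push('\n');
        }
    }
    out.push_str(&format!("\nSee {workflow_url}\n"));
    out
}

fn main() {
    let mut failures = BTreeMap::new();
    failures.insert(
        "windows-latest all stable".to_string(),
        vec![
            "iroh-cli::cli::cli_provide_file_resume".to_string(),
            "iroh-cli::cli::cli_provide_tree_resume".to_string(),
        ],
    );
    // A combo without failures does not appear in the report.
    failures.insert("ubuntu-latest none stable".to_string(), Vec::new());

    let url = "https://github.com/n0-computer/iroh/actions/workflows/flaky.yaml";
    print!("{}", render_report(&failures, url));
}
```

Using a `BTreeMap` keeps the combos in a stable, sorted order, so two runs with the same failures produce the same message.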

## Breaking Changes
n/a

## Notes & open questions

- There is another test report format that is more widely used, but for
which I could not find a single decent parser: Jenkins' JUnit XML
format. From my searches, everyone knows how to produce these
files, but not how to read them. Since this is XML, which is not exactly
compatible with serde, reading this kind of file was a task I
considered not worth doing for what we actually want to achieve, which
is simply more visibility over flaky tests.
- That being said, the `libtest` format is not stable, and the `nextest`
feature to obtain it is unstable as well. It does not seem to change
quickly or drastically at all, so I think we will be fine for some time.
Since the underlying format is JSON, it should be easy to adapt if
necessary.

## Change checklist

- [x] Self-review.
- [ ] ~~Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.~~
- [ ] ~~Tests if relevant.~~
- [ ] ~~All breaking changes documented.~~
@@ -0,0 +1,155 @@
//! Types for forms for entries
Contributor

I like the idea of these, but form seems a little confusing as a name, I don't have a better one (yet) though

let (mut send, recv) = conn.open_bi().await?;
send.write_u8(ch.id()).await?;
trace!(?ch, "opened bi stream");
Result::<_, anyhow::Error>::Ok((ch, Some((send, recv))))
Contributor

you can write anyhow::Ok as a shorthand

Member Author

Worked in most places but not all, not sure why.

send_stream: &mut SendStream,
recv_stream: &mut RecvStream,
) -> anyhow::Result<InitialTransmission> {
let our_nonce: AccessChallenge = rand::random();
Contributor

please pass an rng in for all randomness used, both for testing and for different environments

Member Author

Changed it to pass the nonce into the net functions.
In the upper layer (PeerManager) it's not yet pluggable.
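
A minimal sketch of the pattern discussed in this thread, with the nonce passed in rather than generated inside the function. Everything here is a stand-in: `AccessChallenge` is assumed to be a 32-byte array, `InitialTransmission` is trimmed down to one field, and the names `setup_sketch` and `nonce_from` are hypothetical. Stream I/O is left out.

```rust
/// Stand-in for the challenge type used in the snippet above.
/// Assumption: a fixed-size byte array; the real definition may differ.
type AccessChallenge = [u8; 32];

/// Hypothetical, trimmed-down result of the setup step.
#[derive(Debug, PartialEq)]
struct InitialTransmission {
    our_nonce: AccessChallenge,
}

/// The nonce is an argument, so this function has no hidden randomness
/// and returns the same result for the same input.
fn setup_sketch(our_nonce: AccessChallenge) -> InitialTransmission {
    // Sending and receiving on the streams is omitted in this sketch.
    InitialTransmission { our_nonce }
}

/// The caller decides where the bytes come from: an OS-backed generator in
/// production, a fixed pattern in tests.
fn nonce_from(mut fill: impl FnMut(&mut [u8])) -> AccessChallenge {
    let mut nonce = [0u8; 32];
    fill(&mut nonce[..]);
    nonce
}

fn main() {
    // Deterministic source for a test: every byte is 7.
    let nonce = nonce_from(|buf: &mut [u8]| buf.fill(7));
    let init = setup_sketch(nonce);
    assert_eq!(init.our_nonce, [7u8; 32]);
}
```

With this shape, a test can pin the nonce and compare whole messages, while the layer that owns the generator stays free to swap it per environment.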

dignifiedquire and others added 24 commits November 4, 2024 20:51
## Description

icu switched from some draft unicode licence to unicode-3.0. It's
basically MIT with some seasoning.

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [ ] Self-review.
- [ ] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
## Description

Can now be used to generate artifacts without having to batch it
together with making a release.

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [ ] Self-review.
- [ ] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
## Description

This is getting very annoying, but it is very inconvenient to test, and
docs & alignment between GHA and platform are less than ideal.

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [ ] Self-review.
- [ ] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
## Description

This enables the unstable rustfmt option to format the code in docs.
Should make our examples a lot more consistent.

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

This is unstable, so there might be some churn or bad results.  But I
don't think the churn should be much of an issue for us.

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [x] Tests if relevant.
- [x] All breaking changes documented.
## Description

This is it, as tested manually in
#2897

https://github.com/n0-computer/iroh/actions/runs/11682746623/job/32530542557?pr=2897

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [ ] Self-review.
- [ ] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] Tests if relevant.
- [ ] All breaking changes documented.
…obs (#2874)

## Description

Move blobs and tags rpc client and server to iroh-blobs.

Depends on n0-computer/iroh-blobs#7

Todo:

- [x] merge n0-computer/iroh-blobs#7

## Breaking Changes

- Lots of types in client::blobs and client::tags become reexports
- blob share goes away, since it requires the node client

## Notes & open questions

I want to keep the client::blobs and client::tags modules
self-contained, so the idea is that these will reexport all the things
from iroh-blobs::rpc::client that a user will need (except I have
probably forgotten something). Maybe I should use wildcard exports here
even though people dislike them... ?

~~The client::blobs::Client itself is *not* a reexport but a newtype to
hide the ugly type parameters. Same for client::tags::Client.~~

With the changes in quic-rpc, these are now just module reexports!

The Blobs protocol handler now takes an Endpoint, since that was needed
to implement one of the functions.

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [x] Tests if relevant.
- [x] All breaking changes documented.
## Description

This crate is unmaintained, but used indirectly by some dependencies.
No big deal.

## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

<!-- Any notes, remarks or open questions you have to make about the PR.
-->

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [x] Tests if relevant.
- [x] All breaking changes documented.
## Description

Adds a new crate `iroh-relay`. It's structured as follows:
- protos: protocols the server is able to handle, this includes the
relay protocol, stun, and partially disco.
- server: the server code, as it was already in the og mod
- client: the client code, as it was already in the og mod
- server binary (bin `iroh-relay`) as it was previously in iroh-net's
bin.

This moves the code around in the least disruptive possible way to give
us a starting point over which to improve this. Check the notes and open
questions for future work

## Breaking Changes

- `iroh_net::relay` is removed. `RelayUrl`, `RelayMode`, `RelayNode` and
`RelayMap` are moved to the top (`iroh_net`). All other members of this
module are now moved to the new crate `iroh-relay`

## Notes & open questions

- Disco includes a transaction id defined in terms of stun's transaction
id. Every use of this is now directly from `stun_rs` to make it clear
this is not defined in terms of the relay's stun server. The relay can't
read the contents of disco packages so I consider this the right thing
to do, even if the types are actually the same.
- The relay needs to identify Disco packages to queue them separately.
Disco is not yet its own crate because we do not know its future. Based
on this I chose to duplicate the code that does the identification. But
just this signals we should start the work on deciding what disco looks
like in a post ts world to de-duplicate this code and use it from a
single source. The code as it stands is highly unlikely to change;
however, any change to the protocol's wrapper would now need to be
changed in two places, which is ofc not ideal.
- The relay's "protocol" includes quite a lot of non protocol code, that
probably belongs more to the server. I was really tempted to try to move
this code to better places but in reality this is not the time. The
purpose of this PR is among many others, to allow for future work like
this.
- What used to be called the relay module had a lot of relay-related
code without a clear objective or offering. This includes for example,
all the relay map related code. The relay as a protocol, server, and
client, has absolutely no need for these types. Relay topology is
entirely a `iroh`/`iroh-net` topic. On the same page is the
infra-related code. Nothing in the protocol, server or client is
different depending on the infra values. This is entirely a
`iroh`/`iroh-net` topic. Based on these arguments, these types,
functions, etc that were part of the relay module have actually stayed
in `iroh-net` instead.
- I did a couple passes of `udeps`, but it's messy. Optional
dependencies are their own feature so they aren't flagged as unused.
There are (in general, not in the relay) dependencies that differ in
targets, and the features combinations also affect this as well. So the
deps that remain are a "reasonable best effort" in accordance with our
CI checks. In reality, and in particular when checking feature
combinations with `fc`, there's a lot of work left to do on that front.
- The crate's version is 0.29 because there's no point in setting the
version to 0.28, which will never be released, to then later be changed
to 0.29. This might be unexpected but it's the option that reduces
senseless future work.
- DNS code is duplicated. This is unfortunate but since it's only used
in tests to configure the client there's really not much to do here.
Most of what this does are fixes we should try to upstream.

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [ ] ~~Tests if relevant.~~
- [x] All breaking changes documented.
also uses `AbortOnDropHandle` to better cleanup tasks

Closes #2909

---------

Co-authored-by: Divma <[email protected]>
## Description

By using nextest to kill slow tests we get logs when a test is killed.
This is better than the github timeout kicking in as then we get no
feedback on what was stuck.

Tests are now marked as slow after 20s and killed after 60s.


## Breaking Changes

<!-- Optional, if there are any breaking changes document them,
including how to migrate older code. -->

## Notes & open questions

discovery::tests::endpoint_discovery_combined_wrong_only seems to be
a slow test, needing about 30s to complete. To be fair I think it's
fine to call that out as a slow test, so I'm fine with that.

It is possible that this 60s timeout will introduce some new flakiness
on the slow CI machines. We'll have to see and potentially tweak this a
bit.

## Change checklist

- [x] Self-review.
- [x] Documentation updates following the [style
guide](https://rust-lang.github.io/rfcs/1574-more-api-documentation-conventions.html#appendix-a-full-conventions-text),
if relevant.
- [x] Tests if relevant.
- [x] All breaking changes documented.
Depends on n0-computer/iroh-docs#5

---------

Co-authored-by: Diva M <[email protected]>
Co-authored-by: Ruediger Klaehn <[email protected]>
@matheus23
Contributor

what the fuck github

@matheus23
Contributor

Oh wow now i can reopen. Interesting

@matheus23
Contributor

LOL except I can't ?????

@matheus23
Contributor

Anyhow. I didn't intend it to close right now, but intended on closing today.

The implementation is getting extracted in the same vein as iroh-docs, iroh-blobs and iroh-gossip.
Work is going to continue at https://github.com/n0-computer/iroh-willow
