Infrastructure needs #1

Open · 1 of 4 tasks
mguentner opened this issue Jan 29, 2017 · 11 comments

@mguentner (Member) commented Jan 29, 2017

This is a collection of things we might need to get binary cache distribution up and running using IPFS.

  • 1 server (good peering with S3/CloudFront) that publishes to IPFS from Hydra/S3:
    • 2 TB+ storage

3+ initial pin servers that can also be used as gateways (all content is pinned, so no delay as on ipfs.io); see the sketch after this list:

  • Europe 1
  • Europe 2
  • US 1
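
A pin server in this setup would essentially just pin the published release hash and then serve it from its own HTTP gateway. A minimal sketch, assuming go-ipfs defaults (the release hash and the requested path are placeholders):

# fetch and pin a published release on a pin server (hash is a placeholder)
ipfs pin add QmReleaseHashXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

# once pinned, the content is served without gateway delay
# (go-ipfs serves its HTTP gateway on port 8080 by default)
curl http://localhost:8080/ipfs/QmReleaseHashXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/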

The publish server and the pin servers could be linked together using cjdns to improve routing within the "core" distribution infrastructure.

IPFS infrastructure repo:
https://github.com/ipfs/infrastructure

@vcunat (Member) commented May 9, 2017

Do you or someone else have an idea of the space requirements? We could start small, e.g. just x86_64-linux, etc. I think I might set up a server in CZ, at least a day-only one for initial experimentation.

@mguentner (Member, Author) commented May 9, 2017

We had a server running and the test results are documented here:
https://github.com/NixIPFS/infrastructure/blob/master/ipfs_mirror/logbook_20170311.txt

We were lucky to have 256 GB RAM (yes, no typo) and 48 cores (again, no typo), so we had no bottlenecks™.

You need roughly 4 GB of disk space for a nixos-small release and ~60 GB for a normal release. The delta depends on how many new/changed .nars have been built. Double everything, since the current version of the scripts does not reference the data in place but adds it to the data directory of IPFS.
Please note that this only includes output paths; build inputs are not included (yet).
Memory-wise, 4 GB+ is enough.

The scripts (NixIPFS/nixipfs-scripts) still need a garbage collector for old releases.
This is trivial to implement and should be configurable (say, keep 5 nixos-small releases and 3 normal ones).
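
A rough sketch of what such a garbage collector could look like, assuming a hypothetical layout where each release's pinned hash is stored in a file under a releases directory (this is not part of nixipfs-scripts yet):

#!/usr/bin/env bash
# hypothetical GC: keep the newest N releases per channel, unpin everything older
set -euo pipefail

RELEASE_DIR=/var/lib/nixipfs/releases   # assumed layout: $RELEASE_DIR/<channel>/<release>.hash

gc_channel() {
  local channel=$1 keep=$2
  # newest first; everything after the first $keep entries gets unpinned
  ls -1t "$RELEASE_DIR/$channel"/*.hash 2>/dev/null | tail -n +$((keep + 1)) |
  while read -r f; do
    ipfs pin rm "$(cat "$f")"
    rm -- "$f"
  done
}

gc_channel nixos-small 5   # keep 5 -small releases
gc_channel nixos       3   # keep 3 normal releases

ipfs repo gc               # actually reclaim the disk space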

The documented shortcomings of IPFS will be addressed in ipfs/kubo/pull/3867; I suggest waiting with your experiment until that is merged, or creating a custom IPFS package that contains the patches.
However, if you do not plan to mirror your result to another machine, this shouldn't bother you.

Please also have a look at NixIPFS/notes#2

@vcunat (Member) commented May 9, 2017

Thanks, I had read the last link and now also the other two.

So the experiment has ended and nothing is running ATM? Do I understand correctly that the high CPU requirements only happened during import of new paths and should improve once IPFS bitswap is improved? For a future, more stable server I'm thinking of a dedicated Raspberry Pi (4-core) with an external rotating drive.

@mguentner (Member, Author) commented May 9, 2017

Nope, nothing is running currently since the server went away (I did not investigate further since the bitswap issue was a showstopper).

You need quite some processing power to hash all the content. The RPi seems like a good idea; however, the import could be beyond the limits of that platform, so it will take quite some time. On the machine mentioned above, a fresh import took about 2 h (including downloading ~70 GB from S3; the location was AMS, so 💨).

Regarding CPU:
The hashing during the import and the concurrent operations by nixipfs-scripts require a lot of processing time. You can, however, divide the work onto two machines: one IPFS server and one importer (a VM, a single-purpose installation (e.g. put it into multi-user.target), or simply a container on your workstation). The importer then accesses the API exposed by the IPFS server (tunneled through ssh / WireGuard / the like).
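
A minimal sketch of that split, assuming the IPFS server exposes its API on the default port 5001 and the importer reaches it through an SSH tunnel (the host name and the path to import are placeholders):

# on the importer: forward the remote IPFS API to localhost (host is a placeholder)
ssh -N -L 5001:127.0.0.1:5001 ipfs-server.example.org &

# the ipfs CLI can then talk to the remote daemon through the tunnel instead of
# a local one; the import scripts would be pointed at the same API address
ipfs --api /ip4/127.0.0.1/tcp/5001 add -r ./release-to-import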

The bitswap "load explosion" is caused by too much DHT chatter, triggered by pinning the hash(es): the pinner broadcasts its updated wantlist immediately. If the pinner is connected to a lot of other nodes in the swarm, this causes a lot of traffic (IPFS currently connects to "all the nodes"; no routing/forwarding within the swarm is implemented yet).
This traffic then leads to high CPU load. So, on smaller platforms the pin/download speed is reduced by the "management/DHT" traffic (bureaucracy 🖇️).

@CMCDragonkai commented:

Is there a chance that the same IPFS node used for pinning downstream packages can also be used for upstream packages? By downstream I mean the binary cache, and by upstream I mean all the fetchurls, etc.

@vcunat (Member) commented May 19, 2017

Yes, I certainly counted on that.

@mguentner (Member, Author) commented:

I don't see a reason why this shouldn't be possible. However, there is currently no way of caching the "upstream" (I like the expression) packages. Once this is figured out (NixIPFS/notes/issues/1), it can be implemented in e.g. nixipfs-scripts.
(First we need a way to figure out which set of upstream paths belongs to a certain set of downstream paths; then we can make them available together. I wonder how many GB (TB?) all upstream paths of nixpkgs would require.)
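
One possible starting point for mapping downstream paths to their upstream sources, assuming the .drv files are still present on the machine (a sketch only; the output path is a placeholder, and separating actual source tarballs from other build inputs needs more filtering):

# placeholder output path; find its deriver and list the derivation closure,
# which contains the upstream source store paths among the other inputs
out=/nix/store/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-hello-2.10
drv=$(nix-store --query --deriver "$out")
nix-store --query --requisites "$drv"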

@CMCDragonkai commented May 19, 2017

@mguentner That's what we're working on atm. During our work on Forge Package Archiving (the name I gave to IPFS-ing upstream Nix packages), we discovered we had to get a deeper integration into IPFS. Right now we have a Haskell multiaddr implementation (forked and extended from basile-henry's version), https://github.com/MatrixAI/haskell-multiaddr (which I intend to make the official Haskell implementation of Multiaddr), and @plintx is working on integrating multistream and the multistream muxer.

@vcunat (Member) commented May 20, 2017

Nixpkgs already has maintainers/scripts/find-tarballs.nix for determining sources to mirror on the current infrastructure (Amazon). EDIT: the size will be large, but it won't grow much when adding multiple channels/evaluations.
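
For reference, the invocation looks roughly like this (a sketch; the exact flags, argument name and output format may differ between nixpkgs revisions):

# evaluate nixpkgs and emit a JSON list of the fetched sources
nix-instantiate --eval --json --strict maintainers/scripts/find-tarballs.nix \
  --arg expr 'import ./. {}'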

@mguentner (Member, Author) commented:

@vcunat FYI: The bitswap session PR has been merged

@mguentner (Member, Author) commented:

Bitswap sessions should improve the transfer speed. However, since the exporter / initial seeder will be known, IPFS can be run with ipfs daemon --routing=none on that node.
This flag turns off the DHT discovery / management process, so nodes need to be connected manually:
ipfs swarm connect /ip4/x.x.x.x/tcp/4001/ipfs/Qmxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

The process of a full pin would be as follows:

# ipfs running normally, announcing to DHT
systemctl stop ipfs
systemctl start ipfs-no-routing
ipfs swarm connect /ip4/INITIAL_SEEDER_IP/tcp/4001/ipfs/INITIAL_SEEDER_HASH
./pin_nixos.sh
# pin done
systemctl stop ipfs-no-routing
# start ipfs, announce new hashes to DHT
systemctl start ipfs

This would speed up full pins/syncs while allowing partial or even full syncs via the DHT.
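
The ipfs-no-routing unit used above is hypothetical; a minimal sketch of what it could contain (on NixOS one would rather generate this via a module, and the binary path and user are assumptions):

cat > /etc/systemd/system/ipfs-no-routing.service <<'EOF'
[Unit]
Description=IPFS daemon without DHT routing (for fast initial pins)
Conflicts=ipfs.service
After=network.target

[Service]
User=ipfs
ExecStart=/run/current-system/sw/bin/ipfs daemon --routing=none
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
systemctl daemon-reload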
