Merge pull request #78 from KYVENetwork/feat/archway-ssync
Feat/archway ssync
christopherbrumm authored Sep 28, 2023
2 parents 05f8e65 + 177a91b commit 9a6af8f
Showing 18 changed files with 1,122 additions and 298 deletions.
1 change: 1 addition & 0 deletions docs/index.md
Expand Up @@ -27,6 +27,7 @@ Get familiar with KYVE and explore its main concepts.
## Explore the Stack

- **[KYVE](https://www.kyve.network/datalake)** - The Web3 data lake for fetching, storing, and validating data
- **[KSYNC](https://github.com/KYVENetwork/ksync)** - Rapidly sync validated blocks and snapshots from KYVE to every Tendermint-based blockchain application
- **[Data Pipeline](https://www.kyve.network/datapipeline)** - An ELT pipeline for accessing KYVE data
- More products coming soon ...

151 changes: 0 additions & 151 deletions docs/tools/KSYNC/examples.md

This file was deleted.

12 changes: 8 additions & 4 deletions docs/tools/KSYNC/installation.md
Expand Up @@ -4,6 +4,8 @@ sidebar_position: 1

# Installation

## Install with Go (recommended)

To install the latest version of `ksync`, run the following command:

```bash
go install github.com/KYVENetwork/ksync/cmd/ksync@latest
```

To install a previous version, you can specify the version.

```bash
go install github.com/KYVENetwork/ksync/cmd/ksync@v0.5.0
```

Run `ksync version` to verify the installation.

## Install from source

You can also install from source by pulling the ksync repository, switching to the correct version, and building
as follows:
```bash
git checkout tags/vx.x.x -b vx.x.x
make ksync
```

This will build ksync in the `/build` directory. Afterwards, you may want to put it into your machine's PATH
as follows:

```bash
cp build/ksync ~/go/bin/ksync
```
26 changes: 23 additions & 3 deletions docs/tools/KSYNC/overview.md
Expand Up @@ -4,16 +4,36 @@ sidebar_position: 0

# Overview

### Rapidly sync validated blocks and snapshots from KYVE to every Tendermint-based blockchain application.

## What is KSYNC?

Since KYVE permanently validates and archives blocks and state-sync snapshots from several blockchains, this data can be
used to bootstrap nodes. This is especially helpful since most nodes today are pruning nodes, so
finding peers that have the requested blocks becomes harder each day. With KSYNC, nodes can retrieve
the data from KYVE and feed the blocks directly into any Tendermint-based blockchain application in order
to sync blocks and join the network. Furthermore, any Tendermint-based application can rapidly join the network by
applying state-sync snapshots, which are permanently archived on Arweave.

:::info
You can find the source code and additional docs in the GitHub repository [here](https://github.com/KYVENetwork/ksync).
:::

## How does it work?

KSYNC comes with three sync modes which can be applied depending on the type of application. There is DB-SYNC, which syncs blocks by communicating directly with the app and writing the data directly to the database, and then there is P2P-SYNC, where KSYNC mocks a peer in the network which has all the required blocks and streams them to the node over the dedicated block channels.
KSYNC basically replaces the inbuilt Tendermint process and communicates with the app directly over the Tendermint
Socket Protocol (TSP) with the [ABCI](https://github.com/tendermint/spec/blob/master/spec/abci/abci.md) interface.
Once KSYNC has retrieved the requested blocks for the application from a permanent storage provider like Arweave, it
executes them against the app and stores all relevant information directly in the blockstore and state.db databases. The
same applies to _state-sync_ snapshots, where KSYNC offers the snapshots to the app over the ABCI methods.

After a node has been successfully synced with KSYNC, the node can simply fetch the remaining blocks and switch to live mode
as it would have if synced normally. This makes operating nodes much cheaper and may even make archival nodes
obsolete, since blocks archived by KYVE can then be safely dropped from the nodes and synced again with this tool
once needed.

Overview of how KSYNC interacts with the Tendermint application:

<p align="center">
<img width="70%" src="/img/db_sync.png" />
</p>
37 changes: 37 additions & 0 deletions docs/tools/KSYNC/protocol_validators.md
@@ -0,0 +1,37 @@
---
sidebar_position: 3
---

# For KYVE Protocol Validators
This section includes all commands used by KYVE Protocol Validators to participate in _state-sync_ data pools.

## SERVE-SNAPSHOTS

This command is essential for running as a protocol validator in a _state-sync_ pool, since it serves the snapshots to the
protocol validator. KSYNC syncs the blocks with _block-sync_ and waits for the ABCI app to create the snapshots;
once created, they are exposed over a REST API server which the protocol validator can then query.

To serve the snapshots with default settings, run:

```bash
ksync serve-snapshots --binary="/path/to/<binaryd>" --home="/path/to/.<home>" --snapshot-pool-id=<pool-id> --block-pool-id=<pool-id>
```

Once you see that KSYNC is syncing blocks, you can open `https://localhost:7878/list_snapshots`. In the beginning, it should
return an empty array, but after the first snapshot height is reached (check the interval in the data pool settings), you
should see a first snapshot object in the response.
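
As a rough sketch of the check described above, the following helper decides whether a `list_snapshots` response already contains a snapshot. The function name `has_snapshots` is hypothetical, and treating `[]` (ignoring whitespace) as the empty response is an assumption based only on the "empty array" wording above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit code 0) if the response text in $1
# is non-empty and is not an empty JSON array.
has_snapshots() {
  local body
  # Strip all whitespace so "[ ]" and "[]" followed by a newline compare equal.
  body="$(printf '%s' "$1" | tr -d '[:space:]')"
  [ -n "$body" ] && [ "$body" != "[]" ]
}

# Usage (not executed here; it needs a running `ksync serve-snapshots`):
#   if has_snapshots "$(curl -s https://localhost:7878/list_snapshots)"; then
#     echo "first snapshot is available"
#   fi
```

The comparison is deliberately string-based so the sketch needs no JSON tooling; it does not inspect the snapshot objects themselves.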

### Changing snapshot API server port

You can change the snapshot API server port with the flag `--snapshot-port=<port>`.

### Enabling the metrics server and managing its port

You can enable a metrics server, running by default on `http://localhost:8080/metrics`, by adding the flag `--metrics`.
Furthermore, you can change the port of the metrics server by adding the flag `--metrics-port=<port>`.

### Manage pruning

By default, pruning is enabled. That means that all blocks, states and snapshots prior to the snapshot pool height
are automatically deleted, saving a lot of disk space. If you want to disable it, add the flag `--pruning=false`.
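
The optional flags in this section can be combined in a single invocation. The sketch below only prints such a command rather than running it; the helper name `serve_snapshots_cmd` and the port values are illustrative assumptions, while the flags themselves are the ones described above.

```bash
#!/usr/bin/env bash
# Print (do not run) a serve-snapshots command that combines the optional flags.
# Arguments: snapshot API port, metrics port, pruning (true or false).
serve_snapshots_cmd() {
  printf 'ksync serve-snapshots --binary="/path/to/<binaryd>" --home="/path/to/.<home>" '
  printf -- '--snapshot-pool-id=<pool-id> --block-pool-id=<pool-id> '
  printf -- '--snapshot-port=%s --metrics --metrics-port=%s --pruning=%s\n' "$1" "$2" "$3"
}

# Example values only; replace the placeholders before running the printed command.
serve_snapshots_cmd 7879 8081 false
```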

104 changes: 104 additions & 0 deletions docs/tools/KSYNC/settings.md
@@ -0,0 +1,104 @@
---
sidebar_position: 4
---

# Settings

## Backups

Even with the right setup and careful maintenance, it's possible to encounter app-hash errors or other unexpected problems that can lead to node collisions and resyncs from genesis. Especially when syncing an archival node, it's a good idea to create periodic backups of the node's data.

KSYNC offers exactly this option for creating backups. There are two ways to use it:

### 1. BLOCK-SYNC-Backups

With _block-sync_, nodes can be synced by KSYNC from any height up to the latest height available in the data pool.
Backups can be created automatically at an interval, with the following parameters:

```bash
--home string 'home directory of the node (e.g. ~/.osmosisd)'
--backup-interval int 'block interval to write backups of data directory (set 0 to disable backups)'
--backup-keep-recent int 'number of latest backups to keep (0 to keep all backups)'
--backup-compression string 'compression type used for backups ("tar.gz","zip"), if no compression is given the backup will be stored uncompressed'
--backup-dest string 'path where backups should be stored [default = ~/.ksync/backups]'
```

When the specified `backup-interval` is reached (`height % backup-interval = 0`), KSYNC temporarily pauses the sync process and creates a backup.
These backups are copies of the node's data directory (for example, `~/.osmosisd/data`). If compression is enabled (for example, with `--backup-compression="tar.gz"`), the backup is compressed in a parallel process, and the original uncompressed version is deleted after successful compression.
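
The interval condition can be illustrated with a small helper. This is a sketch of the arithmetic stated above, not KSYNC's actual implementation; the function name `is_backup_height` is hypothetical.

```bash
#!/usr/bin/env bash
# Succeeds (exit code 0) when a backup would be due at the given height.
# An interval of 0 means backups are disabled, as described for --backup-interval.
is_backup_height() {
  local height="$1" interval="$2"
  [ "$interval" -gt 0 ] && [ $(( height % interval )) -eq 0 ]
}

for h in 49999 50000 100000; do
  if is_backup_height "$h" 50000; then
    echo "height $h: backup"
  else
    echo "height $h: no backup"
  fi
done
```

With an interval of `50000`, heights `50000` and `100000` trigger a backup while `49999` does not.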

#### Usage

Because backups are disabled by default, only ``backup-interval`` must be set; the other flags are optional.
Since creating a backup takes steadily longer as the data size grows, an interval of more than `20000` blocks is recommended.

Example command to run _block-sync_ with compressed backups:
```bash
ksync block-sync --binary="/path/to/<binaryd>" --home="/path/to/.<home>" --block-pool-id=<pool-id> --target-height=<height> \
  --backup-interval=50000 --backup-compression="tar.gz"
```

### 2. Backup-Command

The backup functionality can also be used as a standalone command. In this case, everything runs in one process,
and the following flags can be used:

```bash
--home string 'home directory of the node (e.g. ~/.osmosisd)'
--backup-keep-recent int 'number of latest backups to keep (0 to keep all backups)'
--backup-compression string 'compression type used for backups ("tar.gz","zip"), if no compression is given the backup will be stored uncompressed'
--backup-dest string 'path where backups should be stored [default = ~/.ksync/backups]'
```

#### Usage

```bash
ksync backup --home="/Users/christopher/.osmosisd" --compression="tar.gz"
```

## Overwrite default endpoints

KSYNC retrieves data from different sources, including a KYVE chain and a storage provider endpoint. Depending on the specified `chain-id`, the default KYVE **chain endpoints** are:

- **Mainnet (`kyve-1`)**: https://api-eu-1.kyve.network
- **Testnet (`kaon-1`)**: https://api-eu-1.kaon.kyve.network
- **Devnet (`korellia`)**: https://api.korellia.kyve.network

The default **storage provider endpoints** are:
- **Arweave (`1`)**: https://arweave.net
- **Bundlr (`2`)**: https://arweave.net
- **KYVE Storage Provider (`3`)**: https://storage.kyve.network _(shouldn't be overwritten)_

You can overwrite the default endpoints with your preferred ones. To do so, add the following flags to any command that uses the listed endpoints:

```bash
--chain-rest string overwrite KYVE chain rest endpoint
--storage-rest string overwrite storage provider rest endpoint
```
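
The relationship between the defaults and an override can be sketched as a simple lookup with a fallback. This mirrors the behavior described above rather than KSYNC's internal code; the function names `default_chain_rest` and `resolve_chain_rest` are hypothetical.

```bash
#!/usr/bin/env bash
# Map a chain-id to the default KYVE chain endpoint listed above.
default_chain_rest() {
  case "$1" in
    kyve-1)   echo "https://api-eu-1.kyve.network" ;;
    kaon-1)   echo "https://api-eu-1.kaon.kyve.network" ;;
    korellia) echo "https://api.korellia.kyve.network" ;;
    *)        echo "unknown chain-id: $1" >&2; return 1 ;;
  esac
}

# Use the override (the value you would pass to --chain-rest) when one is given,
# otherwise fall back to the default for the chain-id.
resolve_chain_rest() {
  local chain_id="$1" override="${2:-}"
  if [ -n "$override" ]; then
    echo "$override"
  else
    default_chain_rest "$chain_id"
  fi
}

resolve_chain_rest kyve-1
resolve_chain_rest kyve-1 "https://api-us-1.kyve.network"
```

The first call prints the default mainnet endpoint; the second prints the override.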

### Example

Use the KYVE chain US endpoint to _block-sync_ your Osmosis node:

```bash
ksync block-sync --chain-rest="https://api-us-1.kyve.network" --binary="/Users/alice/osmosisd" --home="/Users/alice/.osmosisd" --block-pool-id=1 --target-height=42000
```

## Metrics

You can enable useful metrics through the `--metrics` flag for all syncing commands. By default, they are exposed on ``http://localhost:8080/metrics``, and you can specify a custom port with ``--metrics-port``.

The exposed metrics include the following information:

```json
{
"latest_block_hash": "A6C59D5F7487B95B32B71EB97F8FE0EE7BE7B512044FC53B6C4A706594167AF9",
"latest_app_hash": "6BF3787314EC5C1B8FF08334193A31EF562CFE6700C3E6B604C31FD053F7FAF4",
"latest_block_height": "180",
"latest_block_time": "2021-06-18T22:03:40.861352885Z",
"earliest_block_hash": "C8DC787FAAE0941EF05C75C3AECCF04B85DFB1D4A8D054A463F323B0D9459719",
"earliest_app_hash": "E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855",
"earliest_block_height": "1",
"earliest_block_time": "2021-06-18T17:00:00Z",
"catching_up": true
}
```
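
A minimal sketch of how such a response could be inspected from a script, assuming the field names and value formats shown in the sample above. The helper names are hypothetical, and a JSON-aware tool such as `jq` would be more robust than the pattern matching used here.

```bash
#!/usr/bin/env bash
# Succeeds (exit code 0) while the metrics JSON in $1 reports "catching_up": true.
is_catching_up() {
  printf '%s' "$1" | grep -Eq '"catching_up"[[:space:]]*:[[:space:]]*true'
}

# Print the value of "latest_block_height" (a quoted number in the sample above).
latest_block_height() {
  printf '%s' "$1" |
    sed -n 's/.*"latest_block_height"[[:space:]]*:[[:space:]]*"\([0-9]*\)".*/\1/p'
}

# Shortened sample; a real response would come from the metrics endpoint.
sample='{"latest_block_height": "180", "catching_up": true}'
is_catching_up "$sample" && echo "still syncing at height $(latest_block_height "$sample")"
```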