book: frequently asked questions and answers
1 parent 4905b6c, commit d4ac2c4
Showing 7 changed files with 304 additions and 3 deletions.
@@ -40,3 +40,4 @@ deps-bundle.tar.zst
/book/.vitepress/cache
/book/.vitepress/dist
/book/node_modules
/book/bun.lockb
@@ -0,0 +1,65 @@
# Frequently Asked Questions

::: details What hardware do I need to run Frankendancer?

The current Frankendancer hardware requirements are the same
as those of an Agave validator. Refer to the [Hardware](./getting-started.md#hardware-requirements)
section in the [Getting Started](./getting-started.md) guide
for more details.

:::

::: details How can I obtain the Frankendancer binaries?

Frankendancer does not currently provide pre-built binaries.
It is recommended to build the binaries on the same host where
you are planning to run the validator. Frankendancer detects
system properties and tries to build a binary tuned for the
particular host. Take a look at the [getting started](./getting-started.md)
guide for requirements and instructions.

:::

::: details What branch or tag should I build from?

You can always check out the `v0.1` tag, which will point to the
latest release. For more information, refer to the [releases](./getting-started.md#releases)
section.

:::

::: details How do I resolve errors encountered while starting up Frankendancer?

The Frankendancer binary `fdctl` tries to provide helpful error
messages to identify the problem and sometimes even suggests
solutions. Take a look at the [troubleshooting](./troubleshooting.md)
guide for easy steps that can mitigate common issues.

:::

::: details Can Agave and Frankendancer use the same ledger and snapshots?

Yes, Frankendancer is fully compatible with both the snapshot
and the ledger formats of the Agave validator.

:::

::: details How can I monitor the status of my Frankendancer node?

You can use most of the regular monitoring tools and commands
that you would typically use with an Agave validator to monitor
Frankendancer as well. Refer to the [monitoring](./monitoring.md)
guide for some helpful commands.

:::

::: details Why is my node still delinquent?

There could be several reasons, such as the validator being unable
to catch up or not voting properly. Take a look at the
[tuning](./tuning.md) guide for tips on how to configure
Frankendancer to increase the performance of the replay stage so
the validator catches up faster. Also make sure that your node is
staked and the stake is active.

:::
@@ -0,0 +1,69 @@
# Monitoring

The Frankendancer validator can be monitored in much the same way as
an Agave validator.

## Prerequisite

Be sure to build the `solana` binary, i.e. specify `solana` as a
target to the `make` command. The binary should be in the same
directory as `fdctl`. If you have not added that directory to the
`PATH` environment variable, replace `solana` with the full path
to the binary in the following commands.

::: tip NOTE

The list of commands below is not exhaustive. Some commands may not
work without RPC enabled on your validator. Check out the
comments in the `rpc` section of the `default.toml` file to
configure it according to your needs.

:::

## Solana Commands

* Ensure the validator has joined gossip

```sh [bash]
solana -ut gossip | grep <PUBKEY>
```

* Ensure the validator is caught up

```sh [bash]
solana -ut catchup --our-localhost
```

* Ensure the validator is voting

```sh [bash]
solana -ut validators | grep <PUBKEY>
```

* Ensure the validator is producing blocks

```sh [bash]
solana -ut block-production | grep <PUBKEY>
```

::: tip NOTE

You can also use the `agave-validator --ledger <PATH> monitor`
command with Frankendancer. For that, you need to build the
`agave-validator` binary from the `agave` repository.

:::
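
The three `grep <PUBKEY>` checks above can be bundled into one helper. The following is a minimal sketch, not part of Frankendancer: the function name `check_validator` and the `SOLANA` override variable are hypothetical, and the helper only reports whether the identity appears in each listing, which is a weaker signal than reading the columns yourself.

```sh
# Path to the solana binary; override if it is not on PATH (assumed variable name).
SOLANA="${SOLANA:-solana}"

# Report whether the given identity appears in each listing.
check_validator() {
    pubkey="$1"
    for sub in gossip validators block-production; do
        # -F treats the key as a fixed string, -q suppresses grep output
        if "$SOLANA" -ut "$sub" | grep -qF -- "$pubkey"; then
            echo "$sub: found"
        else
            echo "$sub: not found"
        fi
    done
}
```

Call it as `check_validator <PUBKEY>`. A "not found" for `block-production` may simply mean the node has had no leader slots in the current epoch.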

## Frankendancer Metrics

* Look at the Prometheus metrics (on the same host)

```sh [bash]
curl http://localhost:7999/metrics
```

* Run the Frankendancer monitor

```sh [bash]
fdctl monitor --config ~/config.toml
```
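
The metrics endpoint returns the Prometheus text format, which mixes `# HELP` and `# TYPE` comment lines with the samples. As a rough sketch, a small filter can narrow the output to one family of metrics. The helper name `filter_metrics` and the `tile_` prefix in the usage comment are illustrative assumptions, not names taken from Frankendancer.

```sh
# Print only the samples whose metric name starts with the given prefix,
# skipping the comment lines of the Prometheus text format.
filter_metrics() {
    awk -v p="$1" '!/^#/ && index($0, p) == 1'
}

# Intended use (prefix is a placeholder):
#   curl -s http://localhost:7999/metrics | filter_metrics tile_
```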
@@ -0,0 +1,73 @@
# Troubleshooting

This page collects common troubleshooting steps for operators who
encounter errors while building and running Frankendancer. If these do
not address the problem, send a message in the `#firedancer-operators`
channel on the Solana Tech Discord or file an issue on GitHub.

## Building

### General Recommendations

* It is always a good idea to rebuild everything from scratch.
Do a fresh clone of the repository, following the instructions in the
[Getting Started](./getting-started.md#prerequisites) guide. Remember to
check that you're using a supported compiler and to run `./deps.sh`!

* If you're updating an existing repository clone, be sure to update
the solana submodule _after_ pulling the latest changes. For example:

```sh [bash]
~/firedancer $ git fetch
~/firedancer $ git checkout v0.1
~/firedancer $ git submodule update
```

### Specific Errors

* Missing `cargo` binary from Rust toolchain

```sh [bash]
error: the 'cargo' binary, normally provided by the 'cargo' component, is not applicable to the '1.75.0-x86_64-unknown-linux-gnu' toolchain
+ exec cargo +1.75.0 build --profile=release-with-debug --lib -p solana-validator
error: the 'cargo' binary, normally provided by the 'cargo' component, is not applicable to the '1.75.0-x86_64-unknown-linux-gnu' toolchain
make: *** [src/app/fdctl/Local.mk:107: cargo-validator] Error 1
```

This typically happens due to a race condition between installing the
correct version of the Rust toolchain and using it. Reinstalling the
toolchain separately fixes it (replace `1.75.0` with the appropriate version):

```sh [bash]
rustup toolchain uninstall 1.75.0-x86_64-unknown-linux-gnu
rustup toolchain install 1.75.0-x86_64-unknown-linux-gnu
```
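
Since the version in the error message changes over time, the two `rustup` commands above can be wrapped in a parameterized helper. This is a sketch under assumptions: the function name `reinstall_toolchain` and the `RUSTUP` override are hypothetical, and the toolchain name is assumed to follow the `<version>-<arch>-unknown-linux-gnu` pattern shown in the error.

```sh
# Path to rustup; override for testing or non-standard installs (assumed variable name).
RUSTUP="${RUSTUP:-rustup}"

# Uninstall and reinstall one toolchain.
# $1 = version (e.g. 1.75.0), $2 = architecture (defaults to `uname -m`)
reinstall_toolchain() {
    version="$1"
    arch="${2:-$(uname -m)}"
    toolchain="${version}-${arch}-unknown-linux-gnu"
    "$RUSTUP" toolchain uninstall "$toolchain" &&
        "$RUSTUP" toolchain install "$toolchain"
}
```

For the error shown above, `reinstall_toolchain 1.75.0` on an x86_64 host runs the same two commands.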

## Configuring

### General Recommendations

* If there are errors during `fdctl configure init all --config
~/config.toml`, consider running `fdctl configure fini all --config
~/config.toml` to remove all existing configuration and try the `init`
command again. You can also re-run a specific configure stage, for
example, `fdctl configure init workspace --config ~/config.toml`.

* Make sure the `config.toml` specified during this command is the
same as the one specified with the `run` command. Also make sure
that the content is valid TOML.

* Read the output of the command carefully; `fdctl` often prints
a helpful message that contains suggestions on how to resolve some
errors. Be sure to try them out!

## Running

### General Recommendations

* Always run `fdctl configure init all --config ~/config.toml` before
running `fdctl run --config ~/config.toml`. If using a systemd unit,
specify both commands together when starting Frankendancer.

* Make sure the `~/config.toml` being used is the same in the `configure`
and `run` commands.
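
One way to follow both recommendations at once is a small wrapper that takes the config path once and passes it to both commands. The sketch below is illustrative only: `start_frankendancer` and the `FDCTL` override are hypothetical names, and any privilege handling your setup needs is left out.

```sh
# Path to fdctl; override if it is not on PATH (assumed variable name).
FDCTL="${FDCTL:-fdctl}"

# Configure, then run, using the same config file for both steps.
start_frankendancer() {
    config="$1"
    if [ -z "$config" ]; then
        echo "usage: start_frankendancer <config.toml>" >&2
        return 2
    fi
    # `run` only starts if `configure init all` succeeded
    "$FDCTL" configure init all --config "$config" &&
        "$FDCTL" run --config "$config"
}
```

A systemd unit could point its start command at a script that calls `start_frankendancer ~/config.toml`, so the two steps cannot drift apart.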
@@ -0,0 +1,81 @@
# Tuning

## Tiles

To stay caught up with the cluster, the replay stage needs enough
cores and processing power. If you see your validator falling
behind with the default configuration, consider trying out the
following:

### Increase Shred Tiles

Example Original Config:

```toml
[layout]
affinity = "1-18"
quic_tile_count = 2
verify_tile_count = 4
bank_tile_count = 4
solana_labs_affinity = "19-31"
```

Example New Config:

```toml
[layout]
affinity = "1-18"
quic_tile_count = 2
verify_tile_count = 5
bank_tile_count = 2
shred_tile_count = 2
solana_labs_affinity = "19-31"
```

This takes a core from one `bank` tile (transaction execution) and
gives it to an additional `shred` tile (turbine and shred processing;
the original config leaves `shred_tile_count` unset). It takes another
core from a second `bank` tile and gives it to a `verify`
(signature verification) tile.
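
A quick arithmetic check shows that this change only moves cores between tiles and does not add any. The sketch below assumes that an unset `shred_tile_count` means one shred tile, which is what the description above implies but is not stated in the config itself. `sum_tiles` is a throwaway helper.

```sh
# Sum the tile counts passed as arguments.
sum_tiles() {
    total=0
    for n in "$@"; do
        total=$((total + n))
    done
    echo "$total"
}

# Argument order: quic verify bank shred
before=$(sum_tiles 2 4 4 1)   # shred assumed to be 1 when unset
after=$(sum_tiles 2 5 2 2)
echo "before=$before after=$after"   # prints: before=11 after=11
```

Both totals match, so the `affinity = "1-18"` range can stay as it is.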

### Increase Cores for Solana Labs

Example Original Config:

```toml
[layout]
affinity = "1-18"
quic_tile_count = 2
verify_tile_count = 5
bank_tile_count = 2
shred_tile_count = 2
solana_labs_affinity = "19-31"
```

Example New Config:

```toml
[layout]
affinity = "1-16"
quic_tile_count = 1
verify_tile_count = 4
bank_tile_count = 2
shred_tile_count = 2
solana_labs_affinity = "17-31"
```

This takes one core from the `quic` tile and another from the `verify`
tile and gives them both to the Solana Labs threads (where the replay
stage runs).
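
To see how many cores each side ends up with, the affinity ranges can be counted directly. This is a small sketch with a hypothetical helper, `count_cores`, that handles only comma-separated single cores and `lo-hi` ranges. Any other affinity syntax is out of scope here.

```sh
# Count the cores named by an affinity string such as "1-18" or "0,2-4".
count_cores() {
    total=0
    old_ifs=$IFS
    IFS=,
    for part in $1; do
        case "$part" in
            *-*) lo=${part%-*}; hi=${part#*-}
                 total=$((total + hi - lo + 1)) ;;
            *)   total=$((total + 1)) ;;
        esac
    done
    IFS=$old_ifs
    echo "$total"
}

echo "tiles: $(count_cores 1-18) -> $(count_cores 1-16)"          # prints: tiles: 18 -> 16
echo "solana labs: $(count_cores 19-31) -> $(count_cores 17-31)"  # prints: solana labs: 13 -> 15
```

The two cores removed from the tile range are exactly the two gained by `solana_labs_affinity`.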

## QUIC

There is a lot of QUIC traffic in the cluster. If the validator is
having a hard time establishing QUIC connections, it might end up
receiving fewer transactions. Some parameters that can be tuned to
address this are shown below (the two parameters must be set to the
same value):

```toml
[tiles.quic]
max_concurrent_connections = 2048
max_concurrent_handshakes = 2048
```
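
Because the two values must match, a quick check before starting the validator can catch a config where only one of them was edited. The sketch below is deliberately naive: `toml_value` and `check_quic_limits` are hypothetical helpers that match `key = <integer>` lines with `sed` and ignore TOML tables, so they assume each key appears only once in the file.

```sh
# Print the first integer assigned to a key in a TOML file (no table awareness).
# $1 = key, $2 = file
toml_value() {
    sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p" "$2" | head -n 1
}

# Succeed only if both QUIC limits are set and equal.
check_quic_limits() {
    conns=$(toml_value max_concurrent_connections "$1")
    shakes=$(toml_value max_concurrent_handshakes "$1")
    if [ -n "$conns" ] && [ "$conns" = "$shakes" ]; then
        echo "ok: both set to $conns"
    else
        echo "mismatch: connections=$conns handshakes=$shakes"
        return 1
    fi
}
```

Running `check_quic_limits ~/config.toml` reports a mismatch when either key is missing from the file, since the check does not know the defaults.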