Atlas killing RPC #16

Open
AlexRubik opened this issue Apr 6, 2024 · 3 comments
Comments

@AlexRubik

AlexRubik commented Apr 6, 2024

After some number of hours, my RPC dies with a "No available UDP ports in (8000, 10000)" error. I'm on the latest versions of everything I use: Yellowstone gRPC, Jupiter v6 API, Jito validator fork. I have no firewall rules. Judging by this error, ports are being opened but never closed. Maybe the quic client needs to time out faster?

NUM_LEADERS=4
TPU_CONNECTION_POOL_SIZE=4

[2024-04-06T03:19:05.991578373Z INFO  solana_metrics::metrics] datapoint: loaded-programs-cache-stats slot=258599051i hits=10674i misses=0i evictions=0i reloads=0i insertions=0i lost_insertions=0i replace_entry=0i one_hit_wonders=0i prunes_orphan=0i prunes_environment=0i empty_entries=0i
[2024-04-06T03:19:05.995420336Z INFO  solana_quic_client::quic_client] Timedout sending data 141.98.216.132:8016
   0: rust_begin_unwind
             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/std/src/panicking.rs:595:5
   1: core::panicking::panic_fmt
             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/panicking.rs:67:14
   2: core::result::unwrap_failed
             at /rustc/cc66ad468955717ab92600c770da8c1601a4ff33/library/core/src/result.rs:1652:5
   3: solana_quic_client::nonblocking::quic_client::QuicLazyInitializedEndpoint::create_endpoint
   4: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
   5: <futures_util::future::future::map::Map<Fut,F> as core::future::future::Future>::poll
   6: <solana_quic_client::nonblocking::quic_client::QuicClientConnection as solana_connection_cache::nonblocking::client_connection::ClientConnection>::send_data::{{closure}}
   7: tokio::runtime::park::CachedParkThread::block_on
   8: tokio::runtime::context::runtime::enter_runtime
   9: tokio::runtime::runtime::Runtime::block_on
  10: <solana_quic_client::quic_client::QuicClientConnection as solana_connection_cache::client_connection::ClientConnection>::send_data
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
[2024-04-06T03:19:06.008977821Z ERROR solana_metrics::metrics] datapoint: panic program="validator" thread="solWarmQuicSvc" one=1i message="panicked at quic-client/src/nonblocking/quic_client.rs:111:14:
    QuicLazyInitializedEndpoint::create_endpoint bind_in_range: Custom { kind: Other, error: \"No available UDP ports in (8000, 10000)\" }" location="quic-client/src/nonblocking/quic_client.rs:111:14" version="1.17.28 (src:286fd575; feat:3746964731, client:JitoLabs)"
[2024-04-06T03:19:06.014510904Z INFO  solana_quic_client::quic_client] Timedout sending data 141.98.216.132:8014
[2024-04-06T03:19:06.017823392Z INFO  solana_quic_client::quic_client] Timedout sending data 141.98.216.132:8016
[2024-04-06T03:19:06.017834193Z INFO  solana_quic_client::quic_client] Timedout sending data 74.118.143.73:11228
[2024-04-06T03:19:06.017825907Z INFO  solana_quic_client::quic_client] Timedout sending data 74.118.143.73:11228
[2024-04-06T03:19:06.017859140Z INFO  solana_quic_client::quic_client] Timedout sending data 141.98.216.132:8009
[2024-04-06T03:19:06.017863438Z INFO  solana_quic_client::quic_client] Timedout sending data 141.98.216.132:8014
[2024-04-06T03:19:06.017910638Z INFO  solana_quic_client::quic_client] Timedout sending data 141.98.216.132:8009
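
For context, here is a minimal std-only sketch (not the actual quic-client code, just an illustration) of why sockets that are opened but never closed would eventually exhaust the (8000, 10000) range and fail the way the panic above does:

```rust
use std::net::UdpSocket;

// Illustrative approximation of a bind-in-range helper: try each port in
// the range and return the first socket that binds successfully.
fn bind_in_range(range: (u16, u16)) -> std::io::Result<UdpSocket> {
    for port in range.0..range.1 {
        if let Ok(sock) = UdpSocket::bind(("0.0.0.0", port)) {
            return Ok(sock);
        }
    }
    Err(std::io::Error::new(
        std::io::ErrorKind::Other,
        format!("No available UDP ports in {:?}", range),
    ))
}

fn main() -> std::io::Result<()> {
    // Holding every socket in the Vec simulates connections that are opened
    // but never closed; once nothing in the range can be bound any more
    // (in practice the process fd limit may bite first), bind_in_range
    // returns the same "No available UDP ports" error as the panic above.
    let mut leaked = Vec::new();
    loop {
        leaked.push(bind_in_range((8000, 10000))?);
    }
}
```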

Validator startup args:

solana-validator \
  --expected-genesis-hash 5eykt4UsFv8P8NJdTREpY1vzqKqZKvdpKuc147dw2N9d \
  --entrypoint 'entrypoint2.mainnet-beta.solana.com:8001' \
  --entrypoint 'entrypoint3.mainnet-beta.solana.com:8001' \
  --entrypoint 'entrypoint.mainnet-beta.solana.com:8001' \
  --entrypoint 'entrypoint4.mainnet-beta.solana.com:8001' \
  --entrypoint 'entrypoint5.mainnet-beta.solana.com:8001' \
  --no-voting \
  --ledger /mnt/ledger \
  --accounts /mnt/accounts \
  --rpc-port 8899 \
  --identity /home/ubuntu/validator-keypair.json \
  --log /home/ubuntu/solana-validator.log \
  --maximum-local-snapshot-age 3000 \
  --wal-recovery-mode skip_any_corrupted_record \
  --full-rpc-api \
  --block-engine-url 'https://amsterdam.mainnet.block-engine.jito.wtf' \
  --allow-private-addr \
  --minimal-snapshot-download-speed 95985760 \
  --tip-payment-program-pubkey T1pyyaTNZsKv2WcRAB8oVnk93mLJw2XzjtVYqCsaHqt \
  --tip-distribution-program-pubkey 4R3gSG8BpU4t19KYj8CfnbtRpnT8gtk4dvTHxVRwc2r7 \
  --limit-ledger-size 55000000 \
  --geyser-plugin-config "/home/ubuntu/yellowstone-grpc/yellowstone-grpc-geyser/config.json" \
  --account-index program-id \
  --account-index-include-key TokenkegQfeZyiNwAJbNbGKPFXCWuBvf9Ss623VQ5DA \
  --account-index-include-key 3tZPEagumHvtgBhivFJCmhV9AyhBHGW9VgdsK52i4gwP \
  --account-index-include-key AddressLookupTab1e1111111111111111111111111 \
  --accounts-db-cache-limit-mb 150000 \
  --accounts-index-memory-limit-mb 128000 \
  --private-rpc \
  --rpc-send-retry-ms 5

@vovkman
Contributor

vovkman commented Apr 6, 2024

I wouldn't recommend running this on the same server as your RPC. I've never seen this error myself, but I also haven't tried running atlas on the same server as an RPC.

@jesuspc

jesuspc commented Apr 15, 2024

Hey! I've had the same problem a couple of times recently, after I started using atlas-txn-sender, also running the process on the same machine as the RPC. Interestingly, I receive notifications from my cloud provider about protection from a DDoS event at those same times.

@Solarb80

I'm seeing the same problem. If I use systemd to prevent atlas-txn-sender from binding ports in the validator's range, I can see that it attempts to bind those ports from solana-quic-client-1.17.28/src/nonblocking/quic_client.rs. Unfortunately, the port range the quic client uses is hardcoded to 8000-10000. It's not clear to me why the connection cache seems to hold more and more ports over time.
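
For anyone who wants to reproduce that check, here is a minimal sketch of the kind of systemd drop-in that blocks binds in the validator's range (the unit name is illustrative, and SocketBindDeny requires systemd 249 or newer):

```ini
# /etc/systemd/system/atlas-txn-sender.service.d/deny-validator-ports.conf
# Deny UDP binds in the validator's QUIC range; other binds stay allowed.
[Service]
SocketBindDeny=udp:8000-10000
```

After `systemctl daemon-reload` and a restart of the service, any attempt by atlas-txn-sender to bind a UDP port in that range should fail with a permission error in its logs instead of consuming the validator's ports.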
