Indexes v3 #78

Merged
merged 64 commits into from
Feb 1, 2024
f738749
Remove compactindex (8 byte)
gagliardetto Nov 14, 2023
0dae7ce
Add compactindexsized
gagliardetto Nov 14, 2023
78d8860
Replace compactindex with compactindexsized (8 bytes)
gagliardetto Nov 14, 2023
c4d050e
Delete compactindex36
gagliardetto Nov 14, 2023
73406be
Use compactindexsized instead of compactindex36
gagliardetto Nov 14, 2023
77e15b9
compactindexsized: add arbitrary metadata in the header
gagliardetto Nov 14, 2023
a90c9ce
Cleanup tests
gagliardetto Nov 14, 2023
fa63b7a
Add metadata shortcut
gagliardetto Nov 14, 2023
d68a8ef
Better errors; more methods
gagliardetto Nov 15, 2023
1f6b6de
Cleanup
gagliardetto Nov 17, 2023
c3e4e3a
Refactor meta
gagliardetto Nov 17, 2023
bfd8795
Refactor bucketteer
gagliardetto Dec 4, 2023
cf3e130
Refactor indexes
gagliardetto Dec 4, 2023
1fa3195
Refactor lassie fetch
gagliardetto Dec 4, 2023
659f455
Cleanup flags
gagliardetto Dec 4, 2023
dbddacc
Remove deprecated commands
gagliardetto Dec 4, 2023
9ad7951
Remove deprecated
gagliardetto Dec 4, 2023
5838ffe
Refactor cache
gagliardetto Dec 4, 2023
072b6e1
Cleanup max cache
gagliardetto Dec 4, 2023
e36a709
cid_to_offset => cid_to_offset_and_size
gagliardetto Dec 4, 2023
6ee3412
Use github.com/valyala/fasthttp/reuseport
gagliardetto Dec 5, 2023
f11c35f
Cleanup readahead
gagliardetto Dec 5, 2023
4e9d6b7
Equalize usage of index metadata
gagliardetto Dec 5, 2023
93ac5dc
Add index meta to gsfa and sigexists
gagliardetto Dec 5, 2023
5c31f84
Cleanup indexing
gagliardetto Dec 5, 2023
696dfdf
Cleanup gsfa
gagliardetto Dec 5, 2023
2c845bd
GSFA: If no found signatures, return empty result instead of error.
gagliardetto Dec 5, 2023
a8e3938
Merge remote-tracking branch 'upstream/main' into indexes-v3
gagliardetto Dec 5, 2023
0915197
Improve index filenames
gagliardetto Dec 5, 2023
9df2aa2
Add libraries to manage split car files
gagliardetto Dec 5, 2023
7f6a462
Stub jsonParsed
gagliardetto Dec 5, 2023
643f2d7
Statically link the ffi library
gagliardetto Dec 5, 2023
805a239
Cleanup rust
gagliardetto Dec 5, 2023
bc92daa
Cleanup
gagliardetto Dec 5, 2023
204e8f3
Cleanup FFI
gagliardetto Dec 5, 2023
b26ab62
Make network and epoch options more intuitive
gagliardetto Dec 5, 2023
3c87215
Cleanup
gagliardetto Dec 5, 2023
190f353
Add getFirstAvailableBlock; closes #41
gagliardetto Dec 5, 2023
fef1620
Add getSlot; maybe closes #42
gagliardetto Dec 5, 2023
69d1c1d
Add comment
gagliardetto Dec 5, 2023
d09abf8
Seal all indexes at the same time
gagliardetto Dec 10, 2023
77d2443
Refactor config
gagliardetto Dec 10, 2023
c636295
Add support for remote split car pieces
gagliardetto Dec 11, 2023
3a3e63c
Fix tests
gagliardetto Dec 14, 2023
a18e67e
Add json tags
gagliardetto Dec 18, 2023
e813ba2
Fix filename of sig-exists indexes
gagliardetto Dec 19, 2023
b1ad395
Add check-deals command
gagliardetto Jan 12, 2024
bdfafc4
More logs for check-deals
gagliardetto Jan 16, 2024
f0eabe7
Check all pieces before returning an error (all errors together)
gagliardetto Jan 17, 2024
c996ad0
Add deprecated indexes
gagliardetto Jan 18, 2024
4aeb564
chmod +x
gagliardetto Jan 18, 2024
4bd8ca6
Add support for deprecated indexes
gagliardetto Jan 18, 2024
184ed32
Add more documentation to README file.
gagliardetto Jan 18, 2024
753f197
Improve address lookup support for jsonParsed format.
gagliardetto Jan 18, 2024
5d09877
Fix encodeTransactionResponseBasedOnWantedEncoding
gagliardetto Jan 18, 2024
bc6eeec
Add index magic replacer
gagliardetto Jan 21, 2024
b838cea
Cleanup
gagliardetto Jan 21, 2024
5267ce5
check-deals: add whitelist of providers
gagliardetto Jan 26, 2024
d33ec4d
Improve docs
gagliardetto Jan 26, 2024
321e2bb
Cleanup
gagliardetto Jan 26, 2024
042ceb1
Use go 1.21.x
gagliardetto Jan 26, 2024
b30e35f
Fix tests
gagliardetto Jan 26, 2024
056fe16
Add versioning to config, and support for data.car.from_pieces.piece_…
gagliardetto Jan 28, 2024
ff81ac5
Update docs: add version
gagliardetto Jan 28, 2024
4 changes: 2 additions & 2 deletions .github/workflows/build-release.yml
@@ -3,7 +3,7 @@ name: main
on:
push:
tags:
- "v*.*.*"
- 'v*.*.*'

jobs:
build:
@@ -17,7 +17,7 @@ jobs:
- name: Setup go env
uses: actions/setup-go@v3
with:
go-version: '1.20'
go-version: '1.21'
check-latest: true

- name: Build cli
2 changes: 1 addition & 1 deletion .github/workflows/tests.yml
@@ -4,7 +4,7 @@ jobs:
test:
strategy:
matrix:
go-version: [1.20.x]
go-version: [1.21.x]
os: [ubuntu-latest]
runs-on: ${{ matrix.os }}
steps:
1 change: 1 addition & 0 deletions .gitignore
@@ -2,3 +2,4 @@
/bin/
*.car
_site
/.cargo
29 changes: 23 additions & 6 deletions Makefile
@@ -1,24 +1,41 @@
DEFAULT:compile

IPLD_SCHEMA_PATH := ledger.ipldsch
LD_FLAGS := "-X main.GitCommit=`git rev-parse HEAD` -X main.GitTag=`git symbolic-ref -q --short HEAD || git describe --tags --exact-match || git rev-parse HEAD`"
BASE_LD_FLAGS := -X main.GitCommit=`git rev-parse HEAD` -X main.GitTag=`git symbolic-ref -q --short HEAD || git describe --tags --exact-match || git rev-parse HEAD`

ROOT_DIR := $(dir $(realpath $(lastword $(MAKEFILE_LIST))))

build-rust-wrapper:
rm -rf txstatus/lib
cd txstatus && cargo build --release --lib --target=x86_64-unknown-linux-gnu --target-dir=target
cbindgen ./txstatus -o txstatus/lib/transaction_status.h --lang c
echo "build-rust-wrapper done"
jsonParsed-linux: build-rust-wrapper
# build faithful-cli with jsonParsed format support via ffi (rust)
rm -rf ./bin/faithful-cli_jsonParsed
# static linking:
cp txstatus/target/x86_64-unknown-linux-gnu/release/libdemo_transaction_status_ffi.a ./txstatus/lib/libsolana_transaction_status_wrapper.a
LD_FLAGS="$(BASE_LD_FLAGS) -extldflags -static"
go build -ldflags=$(LD_FLAGS) -tags ffi -o ./bin/faithful-cli_jsonParsed .
echo "built old-faithful with jsonParsed format support via ffi (rust)"
compile:
@echo "\nCompiling faithful-cli binary for current platform ..."
go build -ldflags=$(LD_FLAGS) -o ./bin/faithful-cli .
go build -ldflags="$(BASE_LD_FLAGS)" -o ./bin/faithful-cli .
chmod +x ./bin/faithful-cli
compile-all: compile-linux compile-mac compile-windows
compile-linux:
@echo "\nCompiling faithful-cli binary for linux amd64 ..."
GOOS=linux GOARCH=amd64 go build -ldflags=$(LD_FLAGS) -o ./bin/linux/amd64/faithful-cli_linux_amd64 .
GOOS=linux GOARCH=amd64 go build -ldflags="$(BASE_LD_FLAGS)" -o ./bin/linux/amd64/faithful-cli_linux_amd64 .
chmod +x ./bin/linux/amd64/faithful-cli_linux_amd64
compile-mac:
@echo "\nCompiling faithful-cli binary for mac amd64 ..."
GOOS=darwin GOARCH=amd64 go build -ldflags=$(LD_FLAGS) -o ./bin/darwin/amd64/faithful-cli_darwin_amd64 .
GOOS=darwin GOARCH=amd64 go build -ldflags="$(BASE_LD_FLAGS)" -o ./bin/darwin/amd64/faithful-cli_darwin_amd64 .

@echo "\nCompiling faithful-cli binary for mac arm64 ..."
GOOS=darwin GOARCH=arm64 go build -ldflags=$(LD_FLAGS) -o ./bin/darwin/arm64/faithful-cli_darwin_arm64 .
GOOS=darwin GOARCH=arm64 go build -ldflags="$(BASE_LD_FLAGS)" -o ./bin/darwin/arm64/faithful-cli_darwin_arm64 .
compile-windows:
@echo "\nCompiling faithful-cli binary for windows amd64 ..."
GOOS=windows GOARCH=amd64 go build -ldflags=$(LD_FLAGS) -o ./bin/windows/amd64/faithful-cli_windows_amd64.exe .
GOOS=windows GOARCH=amd64 go build -ldflags="$(BASE_LD_FLAGS)" -o ./bin/windows/amd64/faithful-cli_windows_amd64.exe .
test:
go test -v ./...
bindcode: install-deps
171 changes: 139 additions & 32 deletions README.md
@@ -15,6 +15,116 @@ This repo provides the `faithful-cli` command line interface. This tool allows y
- getBlock
- getTransaction
- getSignaturesForAddress
- getBlockTime
- getGenesisHash (for epoch 0)
- getFirstAvailableBlock
- getSlot
- getVersion

## RPC server

The RPC server is available via the `faithful-cli rpc` command.

The command accepts a list of [epoch config files](#epoch-configuration-files) and dirs as arguments. Each config file is specific for an epoch and provides the location of the block/transaction data and the indexes for that epoch. The indexes are used to map Solana block numbers, transaction signatures and addresses to their respective CIDs. The indexes are generated from the CAR file and can be generated via the `faithful-cli index` command (see [Index generation](#index-generation)).

It supports the following flags:

- `--listen`: The address to listen on, e.g. `--listen=:8888`
- `--include`: You can specify one or more (reuse the same flag multiple times) glob patterns to include files or dirs that match them, e.g. `--include=/path/epoch-*.yml`.
- `--exclude`: You can specify one or more (reuse the same flag multiple times) glob patterns to exclude files or dirs that match them, e.g. `--exclude=/something-*/epoch-*.yml`.
- `--debug`: Enable debug logging.
- `--proxy`: Proxy requests to a downstream RPC server if the data can't be found in the archive, e.g. `--proxy=/path/to/my-rpc.json`. See [RPC server proxying](#rpc-server-proxying) for more details.
- `--gsfa-only-signatures`: When enabled, the RPC server will only return signatures for getSignaturesForAddress requests instead of the full transaction data.
- `--watch`: When specified, all the provided epoch files and dirs will be watched for changes and the RPC server will automatically reload the data when changes are detected. Usage: `--watch` (boolean flag). This is useful when you want to provide just a folder and then add new epochs to it without having to restart the server.
- `--epoch-load-concurrency=2`: How many epochs to load in parallel when starting the RPC server. Defaults to number of CPUs. This is useful when you have a lot of epochs and want to speed up the initial load time.
- `--max-cache=<megabytes>`: How much memory to use for caching. Defaults to 0 (no limit). This is useful when you want to limit the memory usage of the RPC server.

NOTES:

- By default, the RPC server doesn't support the `jsonParsed` format. To enable it, build the RPC server with the `make jsonParsed-linux` target.
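
A minimal Go sketch of how a client could query the server described above. It assumes the server speaks standard Solana JSON-RPC 2.0 over HTTP POST and was started with something like `--listen=:8888`; `rpcRequest`, `buildRequest` and `callRPC` are illustrative names and not part of `faithful-cli`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
)

// rpcRequest is a JSON-RPC 2.0 request envelope (hypothetical helper type).
type rpcRequest struct {
	JSONRPC string `json:"jsonrpc"`
	ID      int    `json:"id"`
	Method  string `json:"method"`
	Params  []any  `json:"params"`
}

// buildRequest encodes a JSON-RPC 2.0 request body for the given method.
func buildRequest(id int, method string, params ...any) ([]byte, error) {
	if params == nil {
		params = []any{} // encode as [] rather than null
	}
	return json.Marshal(rpcRequest{JSONRPC: "2.0", ID: id, Method: method, Params: params})
}

// callRPC posts the request to endpoint and returns the raw response body.
func callRPC(endpoint, method string, params ...any) ([]byte, error) {
	body, err := buildRequest(1, method, params...)
	if err != nil {
		return nil, err
	}
	resp, err := http.Post(endpoint, "application/json", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	return io.ReadAll(resp.Body)
}

func main() {
	// Show the request body that would be sent.
	body, err := buildRequest(1, "getFirstAvailableBlock")
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(body))

	// Only contact a server when an endpoint is passed as the first argument,
	// e.g. http://localhost:8888 (assumed listen address).
	if len(os.Args) > 1 {
		out, err := callRPC(os.Args[1], "getFirstAvailableBlock")
		if err != nil {
			fmt.Println("request failed:", err)
			return
		}
		fmt.Println(string(out))
	}
}
```

Methods that take arguments, such as `getBlock`, can pass them through the variadic `params`.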

## Epoch configuration files

To run a Faithful RPC server you need to specify configuration files for the epoch(s) you want to host. An epoch config file looks like this:

```yml
epoch: 0 # epoch number (required)
version: 1 # version number (required)
data: # data section (required)
car:
# Source the data from a CAR file (car-mode).
# The URI can be a local filepath or an HTTP url.
# This makes the indexes.cid_to_offset_and_size required.
# If you are running in filecoin-mode, you can omit the car section entirely.
uri: /media/runner/solana/cars/epoch-0.car
filecoin:
# filecoin-mode section: source the data directly from filecoin.
# If you are running in car-mode, you can omit this section.
# if enable=true, then the data will be sourced from filecoin.
# if enable=false, then the data will be sourced from a CAR file (see 'car' section above).
enable: false
genesis: # genesis section (required for epoch 0 only)
# Local filepath to the genesis tarball.
# You can download the genesis tarball from
# wget https://api.mainnet-beta.solana.com/genesis.tar.bz2
uri: /media/runner/solana/genesis.tar.bz2
indexes: # indexes section (required)
cid_to_offset_and_size:
# Required when using a CAR file; you can provide either a local filepath or a HTTP url.
# Not used when running in filecoin-mode.
uri: '/media/runner/solana/indexes/epoch-0/epoch-0-bafyreifljyxj55v6jycjf2y7tdibwwwqx75eqf5mn2thip2sswyc536zqq-mainnet-cid-to-offset-and-size.index'
slot_to_cid:
# required (always); you can provide either a local filepath or a HTTP url:
uri: '/media/runner/solana/indexes/epoch-0/epoch-0-bafyreifljyxj55v6jycjf2y7tdibwwwqx75eqf5mn2thip2sswyc536zqq-mainnet-slot-to-cid.index'
sig_to_cid:
# required (always); you can provide either a local filepath or a HTTP url:
uri: '/media/runner/solana/indexes/epoch-0/epoch-0-bafyreifljyxj55v6jycjf2y7tdibwwwqx75eqf5mn2thip2sswyc536zqq-mainnet-sig-to-cid.index'
sig_exists:
# required (always); you can provide either a local filepath or a HTTP url:
uri: '/media/runner/solana/indexes/epoch-0/epoch-0-bafyreifljyxj55v6jycjf2y7tdibwwwqx75eqf5mn2thip2sswyc536zqq-mainnet-sig-exists.index'
gsfa: # getSignaturesForAddress index
# optional; must be a local directory path.
uri: '/media/runner/solana/indexes/epoch-0/gsfa/epoch-0-bafyreifljyxj55v6jycjf2y7tdibwwwqx75eqf5mn2thip2sswyc536zqq-gsfa.indexdir'
```
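
The required/optional rules stated in the comments of the config above can be sketched as a small Go check. This is only an illustration: `EpochConfig` is a hypothetical flattened struct (not the type used by `faithful-cli`), the key paths in the messages are inferred from the example, and only the rules spelled out in the comments are covered.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// EpochConfig is a hypothetical, flattened view of an epoch config file.
type EpochConfig struct {
	Epoch                 uint64
	CarURI                string // data.car.uri
	FilecoinEnable        bool   // data.filecoin.enable
	GenesisURI            string // genesis.uri
	CidToOffsetAndSizeURI string
	SlotToCidURI          string
	SigToCidURI           string
	SigExistsURI          string
	GsfaURI               string
}

// isHTTP reports whether uri looks like an HTTP(S) URL.
func isHTTP(uri string) bool {
	return strings.HasPrefix(uri, "http://") || strings.HasPrefix(uri, "https://")
}

// validate applies the rules from the config comments and reports all
// violations together in a single error.
func validate(c EpochConfig) error {
	var problems []string
	require := func(name, value string) {
		if value == "" {
			problems = append(problems, name+" is required")
		}
	}
	if !c.FilecoinEnable {
		// car-mode: the CAR file and its cid_to_offset_and_size index are needed.
		require("data.car.uri", c.CarURI)
		require("indexes.cid_to_offset_and_size.uri", c.CidToOffsetAndSizeURI)
	}
	if c.Epoch == 0 {
		require("genesis.uri", c.GenesisURI)
	}
	require("indexes.slot_to_cid.uri", c.SlotToCidURI)
	require("indexes.sig_to_cid.uri", c.SigToCidURI)
	require("indexes.sig_exists.uri", c.SigExistsURI)
	// gsfa is optional, but when present it must be a local directory.
	if isHTTP(c.GsfaURI) {
		problems = append(problems, "indexes.gsfa.uri must be a local directory path")
	}
	if len(problems) == 0 {
		return nil
	}
	return errors.New(strings.Join(problems, "; "))
}

func main() {
	// An incomplete epoch-0 config: several required entries are missing.
	cfg := EpochConfig{Epoch: 0, CarURI: "/media/runner/solana/cars/epoch-0.car"}
	fmt.Println(validate(cfg))
}
```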

NOTES:

- The `uri` parameter supports both HTTP URIs as well as file based ones (where not specified otherwise).
- If you specify an HTTP URI, you need to make sure that the url supports HTTP Range requests. S3 or similar APIs will support this.
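
One way to check the Range requirement before pointing a config at a URL is to request a single byte and look for a `206 Partial Content` reply. The Go sketch below does that; `supportsRange` and `isRangeResponse` are illustrative helpers, not part of `faithful-cli`, and the check is a heuristic rather than a guarantee.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// isRangeResponse reports whether a reply to a `Range: bytes=0-0` request
// shows that the server honoured the range: status 206 plus a Content-Range header.
func isRangeResponse(status int, contentRange string) bool {
	return status == http.StatusPartialContent && contentRange != ""
}

// supportsRange asks for only the first byte of url and reports whether
// the server answered with a partial response.
func supportsRange(url string) (bool, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return false, err
	}
	req.Header.Set("Range", "bytes=0-0")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	return isRangeResponse(resp.StatusCode, resp.Header.Get("Content-Range")), nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: rangecheck <url>")
		return
	}
	ok, err := supportsRange(os.Args[1])
	if err != nil {
		fmt.Println("request failed:", err)
		return
	}
	fmt.Println("range requests supported:", ok)
}
```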

## Index generation

To run the old-faithful RPC server you need to generate indexes for the CAR files. You can do this via the `faithful-cli index` command.

- `faithful-cli index all <car-file> <output-dir>`: Generate all **required** indexes for a CAR file.
- `faithful-cli index gsfa <car-file> <output-dir>`: Generate the gsfa index for a CAR file.

NOTES:

- You need to have the CAR file available locally.
- The `cid_to_offset_and_size` index has an older version, which you can specify with `cid_to_offset` instead of `cid_to_offset_and_size`.

Flags:

- `--tmp-dir=/path/to/tmp/dir`: Where to store temporary files. Defaults to the system temp dir. (optional)
- `--verify`: Verify the indexes after generation. (optional)
- `--network=<network>`: Which network to use for the gsfa index. Defaults to `mainnet` (other options: `testnet`, `devnet`). (optional)

## RPC server proxying

The RPC server provides a proxy mode which allows it to forward traffic it can't serve to a downstream RPC server. To configure this, simply provide the command line argument `--proxy=/path/to/faithful-proxy-config.json` pointing it to a config file. The config file should look like this:

```json
{
"target": "https://api.mainnet-beta.solana.com",
"headers": {
"My-Header": "My-Value"
},
"proxyFailedRequests": true
}
```

The `proxyFailedRequests` flag will make the RPC server proxy not only RPC methods that it doesn't support, but also retry requests that failed to be served from the archives (e.g. a `getBlock` request that failed to be served from the archives because that epoch is not available).
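
For tooling that needs to read or generate this file, the JSON shape above maps onto a small Go struct. This is a sketch based only on the keys shown in the example; the `ProxyConfig` name and the "target is required" check are assumptions, not the server's actual implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProxyConfig mirrors the keys of the example proxy config (hypothetical type).
type ProxyConfig struct {
	Target              string            `json:"target"`
	Headers             map[string]string `json:"headers"`
	ProxyFailedRequests bool              `json:"proxyFailedRequests"`
}

// parseProxyConfig decodes the JSON and rejects a config without a target.
func parseProxyConfig(data []byte) (ProxyConfig, error) {
	var cfg ProxyConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return ProxyConfig{}, fmt.Errorf("parse proxy config: %w", err)
	}
	if cfg.Target == "" {
		// Assumption: a proxy config without a target is not usable.
		return ProxyConfig{}, fmt.Errorf("proxy config: target is required")
	}
	return cfg, nil
}

func main() {
	raw := []byte(`{
		"target": "https://api.mainnet-beta.solana.com",
		"headers": {"My-Header": "My-Value"},
		"proxyFailedRequests": true
	}`)
	cfg, err := parseProxyConfig(raw)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(cfg.Target, cfg.Headers["My-Header"], cfg.ProxyFailedRequests)
}
```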

### RPC server from old-faithful.net

@@ -47,19 +157,15 @@ $ ../tools/download-gsfa.sh 0 ./epoch0

If you have a local copy of a CAR archive and the indexes, you can run an RPC server that serves data from them. For example:

```
/usr/local/bin/faithful-cli rpc-server-car \
```bash
/usr/local/bin/faithful-cli rpc \
--listen $PORT \
epoch-455.car \
epoch-455.car.*.cid-to-offset.index \
epoch-455.car.*.slot-to-cid.index \
epoch-455.car.*.sig-to-cid.index \
epoch-455.car-*-gsfa-index
/path/to/epoch-455.yml
```

You can download the CAR files either via Filecoin or via the bucket provided by Triton. There are helper scripts in the `tools` folder. To download the full epoch data:

```
```bash
$ mkdir epoch0
$ cd epoch0
$ ../tools/download-epoch.sh 0
@@ -68,7 +174,7 @@ $ ../tools/download-gsfa.sh 0
```

Once files are downloaded there are also utility scripts to run the server:
```
```bash
$ ./tools/run-rpc-server-local.sh 0 ./epoch0
```

@@ -80,24 +186,25 @@ The filecoin RPC server allows provide getBlock, getTransaction and getSignature

You can run it in the following way:

```
faithful-cli rpc-server-filecoin -config 455.yml
```bash
faithful-cli rpc 455.yml
```

The config file points faithful to the location of the required indexes (`455.yaml`):
```
```bash
indexes:
slot_to_cid: './epoch-455.car.bafyreibkequ55hyrhyk6f24ctsofzri6bjykh76jxl3zju4oazu3u3ru7y.slot-to-cid.index'
sig_to_cid: './epoch-455.car.bafyreibkequ55hyrhyk6f24ctsofzri6bjykh76jxl3zju4oazu3u3ru7y.sig-to-cid.index'
sig_exists: './epoch-455.car.bafyreibkequ55hyrhyk6f24ctsofzri6bjykh76jxl3zju4oazu3u3ru7y.sig-exists.index'
gsfa: './epoch-455.car.gsfa.index'
```

Due to latency in fetching signatures, typically the getSignaturesForAddress index needs to be stored in a local directory, but the other indexes can be fetched via HTTP or via local file system access. If you provide a URL, you need to make sure that the url supports HTTP Range requests. S3 or similar APIs will support this.

There is a mode in which you can use a remote gSFA index; in this mode the server returns only signatures, without additional transaction metadata. To enable this mode, run faithful-cli in the following way:

```
faithful-cli rpc-server-filecoin -config 455.yml -gsfa-only-signatures=true
```bash
faithful-cli rpc -gsfa-only-signatures=true 455.yml
```

### Filecoin fetch via CID
@@ -110,7 +217,7 @@ The production RPC server is accessible via `faithful-cli rpc`. More documentati

### Limitations

The testing server (`rpc-server-car` and `rpc-server-filecoin`) only supports single Epoch access. The production server supports handling a full set of epochs.
The (deprecated) testing server (`rpc-server-car` and `rpc-server-filecoin`) only supports single Epoch access. The production server supports handling a full set of epochs.

Filecoin retrievals without a CDN can also be slow. We are working on integration with Filecoin CDNs and other caching solutions. Fastest retrievals will happen if you service from local disk.

@@ -127,8 +234,8 @@ Indexes will be needed to map Solana's block numbers, transaction signatures and
- slot-to-cid: Lookup a CID based on a slot number
- tx-to-cid: Lookup a CID based on a transaction signature
- gsfa: An index mapping Solana addresses to a list of signatures
- cid-to-offset: Index for a specific CAR file, used by the local rpc server (see above) to find CIDs in a car file
- sig-exists: An index to speed up lookups for signatures when using multiepoch support in the production server
- cid-to-offset-and-size: Index for a specific CAR file, used by the local rpc server (see above) to find CIDs in a car file
- sig-exists: An index to speed up lookups for signatures when using multiepoch support in the production server.

### Archive access

@@ -142,7 +249,7 @@ The data that you will need to be able to run a local RPC server is:
1) the Epoch car file containing all the data for that epoch
2) the slot-to-cid index for that epoch
3) the tx-to-cid index for that epoch
4) the cid-to-offset index for that epoch car file
4) the cid-to-offset-and-size index for that epoch car file
5) the sig-exists index for that epoch (optional, but important to speed up multiepoch fetches)
6) Optionally (if you want to support getSignaturesForAddress): the gsfa index

@@ -180,7 +287,7 @@ The data generation flow is illustrated below:
Once you have downloaded rocksdb ledger archives you can run the Radiance tool to generate a car file for an epoch. Make sure you have all the slots available in the rocksdb ledger archive for the epoch. You may need to download multiple ledger snapshots in order to have a full set of slots available. Once you know you have a rocksdb that covers all the slots for the epoch, run the Radiance tool as follows:

```
radiance car create2 107 --db=46223992/rocksdb --out=/storage/car/epoch-107.car
radiance car create 107 --db=46223992/rocksdb --out=/storage/car/epoch-107.car
```

This will produce a car file called epoch-107.car containing all the blocks and transactions for that epoch.
@@ -189,36 +296,36 @@ This will produce a car file called epoch-107.car containing all the blocks and

Once the radiance tooling has been used to prepare a car file (or if you have downloaded a car file externally) you can generate indexes from this car file by using the `faithful-cli`:

```
```bash
NAME:
faithful index
faithful CLI index - Create various kinds of indexes for CAR files.

USAGE:
faithful index command [command options] [arguments...]
faithful CLI index command [command options] [arguments...]

DESCRIPTION:
Create various kinds of indexes for CAR files.

COMMANDS:
cid-to-offset
slot-to-cid
sig-to-cid
all
gsfa
sig-exists
cid-to-offset
slot-to-cid
sig-to-cid
all Create all the necessary indexes for a Solana epoch.
gsfa
sig-exists
help, h Shows a list of commands or help for one command

OPTIONS:
--help, -h show help
```

For example, to generate the three indexes cid-to-offset, slot-to-cid, sig-to-cid, sig-exists you would run:
For example, to generate the four indexes cid-to-offset-and-size, slot-to-cid, sig-to-cid, sig-exists you would run:

```
faithful-cli index all epoch-107.car .
```bash
faithful-cli index all epoch-107.car /storage/indexes/epoch-107
```

This would generate the indexes in the current dir for epoch-107.
This would generate the indexes in `/storage/indexes/epoch-107` for epoch-107.

## Contributing

6 changes: 6 additions & 0 deletions adapters.go
@@ -22,6 +22,12 @@ func byteSliceAsIntegerSlice(b []byte) []uint64 {
// adaptTransactionMetaToExpectedOutput adapts the transaction meta to the expected output
// as per what solana RPC server returns.
func adaptTransactionMetaToExpectedOutput(m map[string]any) map[string]any {
{
_, ok := m["blockTime"]
if !ok {
m["blockTime"] = nil
}
}
meta, ok := m["meta"].(map[string]any)
if !ok {
return m
2 changes: 1 addition & 1 deletion bucketteer/bucketteer.go
@@ -12,7 +12,7 @@ func Magic() [8]byte {
return _Magic
}

const Version = uint64(1)
const Version = uint64(2)

func sortWithCompare[T any](a []T, compare func(i, j int) int) {
sort.Slice(a, func(i, j int) bool {