Monorepo with components required for the implementation of DeRisk on Solana.
- **Database**: Database for both raw and processed data.
- **Raw Data Fetching**: Scripts that fetch raw transactions and DEX data and store them in the database.
- **Data Processing**: Scripts that process both raw data and previously processed data.
- **API**: API with endpoints for fetching data from the database.
- **Frontend**: Frontend with visualizations of the processed data.
Each component is described in detail below.
The project uses Postgres 15 as the database. The schema can be found in `db/schema.sql`. To start the database, run:
docker build -t db -f Dockerfile.db .
export POSTGRES_USER=<username>
export POSTGRES_PASSWORD=<password>
# default port for Postgres is 5432
docker run -p 5432:5432 db
For fetching transactions, the following ENV variables are needed:
- `POSTGRES_USER`: name of the database user
- `POSTGRES_PASSWORD`: database user's password
- `POSTGRES_HOST`: host address, IP address, or DNS of the database
- `POSTGRES_DB`: database name
- `AUTHENTICATED_RPC_URL`: URL of the node provider, including the RPC token, used to initialize the Solana client
- `RATE_LIMIT`: maximum number of RPC calls allowed per second
Then, run the following commands:
docker build -t raw-data-fetching -f Dockerfile.raw-data-fetching .
docker run -e POSTGRES_USER=<username> -e POSTGRES_PASSWORD=<password> -e POSTGRES_HOST=<host> -e POSTGRES_DB=<db_name> -e AUTHENTICATED_RPC_URL=<rpc_url> -e RATE_LIMIT=<rate_limit> raw-data-fetching
The data collection works as follows:
- Signatures of transactions are fetched in batches and stored in the database.
- For each slot containing at least 1 signature, all transactions are fetched, but only those which match the previously saved signatures are kept and saved in the database.
The data is fetched starting from the newest transactions at the time the script is run (T0). Once the available history preceding T0 is stored in the database, the process starts again at time T1, fetching all data between T0 and T1. With the rate limit set high enough, the process approaches a state where the script feeds the database with nearly real-time data. If restarted, the fetching continues from the last transaction stored in the database.
For updating CLOB DEX data, the following environment variables are required:
- `POSTGRES_USER`: name of the database user
- `POSTGRES_PASSWORD`: database user's password
- `POSTGRES_HOST`: host address, IP address, or DNS of the database
- `POSTGRES_DB`: database name
- `AUTHENTICATED_RPC_URL`: Solana RPC URL
Then, run the following commands:
docker build --file ./Dockerfile.ob-liq-fetcher -t ob-liq-fetcher .
docker run -d -e POSTGRES_USER=$POSTGRES_USER -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD -e POSTGRES_HOST=$POSTGRES_HOST -e POSTGRES_DB=$POSTGRES_DB -e AUTHENTICATED_RPC_URL=$AUTHENTICATED_RPC_URL ob-liq-fetcher
Data is collected from the following CLOBs:
The `orderbook_liquidity` table is structured in the following way:
- `timestamp`: Stores the time of the data record.
- `dex`: Identifies the DEX from which the orderbook data was collected.
- `pair`: Represents the token pair in the format `<symbol_x>/<symbol_y>`, e.g. `"SOL/USDC"`.
- `market_address`: Holds the market's pubkey.
- `bids` and `asks`: Hold lists of two-sized tuples where the first entry is the price level and the second entry is the amount of liquidity at that level.
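For illustration, the snippet below shows the shape of the `bids`/`asks` columns and how a 5% depth can be read off one stored snapshot; the numbers are made up.

```python
# Illustrative snapshot in the bids/asks format described above:
# lists of (price level, liquidity at that level) tuples. Values are made up.
bids = [(144.95, 120.0), (144.50, 300.0), (138.00, 900.0)]
asks = [(145.05, 100.0), (145.60, 250.0), (153.00, 700.0)]

mid_price = (bids[0][0] + asks[0][0]) / 2

# Bid-side liquidity available within 5% of the mid price.
cutoff = mid_price * 0.95
bid_depth_5pct = sum(size for price, size in bids if price >= cutoff)
print(mid_price, bid_depth_5pct)
```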
For updating AMM pool data, the following environment variables are required:
- `POSTGRES_USER`: name of the database user
- `POSTGRES_PASSWORD`: database user's password
- `POSTGRES_HOST`: host address, IP address, or DNS of the database
- `POSTGRES_DB`: database name
- `AUTHENTICATED_RPC_URL`: Solana RPC URL
Then, run the following commands:
docker build --file ./Dockerfile.update-amm-pools -t amms .
docker run -d --name amms -e POSTGRES_USER=$POSTGRES_USER -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD -e POSTGRES_HOST=$POSTGRES_HOST -e POSTGRES_DB=$POSTGRES_DB amms
Note that a few AMMs' pools are updated with their corresponding TypeScript SDKs. These are updated in a separate Docker container, which is run as follows:
docker build --file ./Dockerfile.ts-fetching -t ts-fetcher .
docker run -d -e POSTGRES_USER=$POSTGRES_USER -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD -e POSTGRES_HOST=$POSTGRES_HOST -e POSTGRES_DB=$POSTGRES_DB -e AUTHENTICATED_RPC_URL=$AUTHENTICATED_RPC_URL ts-fetcher
Data is collected from the following DEXes:
Data Collection Strategy:
The data collection process involves querying each DEX's first-party API for current pool statuses. This process is automated through a scheduled task that runs the above Docker container, ensuring data freshness. The approach is as follows:
- Initialization: On startup, the Docker container queries the DEX APIs to fetch the initial state of all liquidity pools.
- Continuous Update: The container polls the APIs every 300 seconds to capture any changes in pools' liquidity.
- Data Transformation: Before insertion into the database, data undergoes normalization to fit the amm_liquidity table schema, ensuring consistency across different DEX sources.
For LIFINITY, Sentre, Saber, and Invariant, the data collection is done directly from on-chain data through their respective SDKs. This runs at regular intervals in a Docker container. Generally speaking, the process is very similar to the previous one:
- Initialization: On startup, the Docker container fetches initial on-chain state of all liquidity pools.
- Continuous Update: The container fetches the liquidity every 300 seconds to capture any changes in pools' liquidity.
- Data Transformation: Before insertion into the database, data undergoes normalization to fit the amm_liquidity table schema, ensuring consistency across different DEX sources.
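Both update paths boil down to the same polling loop. Below is a minimal sketch, assuming hypothetical `fetch_pool_states`, `normalize`, and `insert_amm_liquidity` helpers rather than the project's actual functions.

```python
import time

POLL_INTERVAL_SECONDS = 300  # matches the 300-second update interval described above

def run_amm_updater(fetch_pool_states, normalize, insert_amm_liquidity):
    # fetch_pool_states, normalize, and insert_amm_liquidity are hypothetical helpers:
    # query the DEX API or SDK, fit rows to the amm_liquidity schema, write to Postgres.
    while True:
        raw_pools = fetch_pool_states()
        rows = [normalize(pool) for pool in raw_pools]
        insert_amm_liquidity(rows)
        time.sleep(POLL_INTERVAL_SECONDS)
```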
The `amm_liquidity` table is structured to accommodate data from multiple DEXes efficiently. Key fields include:
- `timestamp`: Stores the time of the data record.
- `dex`: Identifies the DEX from which the pool data was sourced, e.g., Orca, Raydium, Meteora.
- `pair`: Represents the token pair of the liquidity pool in the format `<symbol_x>-<symbol_y>`, e.g. `"SOL-USDC"`.
- `market_address`: Holds the pool's pubkey.
- `token_x` and `token_y`: Capture the amounts of the tokens in the liquidity pool.
- `token_x_decimals` and `token_y_decimals`: Define the decimal precision for each token in the pair.
- `additional_info`: Additional info provided by the DEXes.
Parsing of collected transactions is dockerized. Each protocol has a dedicated Dockerfile:
- `Dockerfile.mango-parser` for Mango Markets
- `Dockerfile.solend-parser` for Solend
- `Dockerfile.marginfi-parserV2` for MarginFi
- `Dockerfile.kamino-parser` for Kamino
Example:
docker build -t mango-parser -f Dockerfile.mango-parser .
docker run -e POSTGRES_USER=<db_user> \
-e POSTGRES_PASSWORD=<db_password> -e POSTGRES_HOST=<db_host> \
-e POSTGRES_DB=<db_name> mango-parser
The orchestration of data parsing is managed by `Dockerfile.parsing-pipeline`. To run the pipelines, an environment variable `PROTOCOL` must be set, specifying which protocol's parser to execute. An example of usage can be found below.
Transaction parsing is handled by the following classes:
`KaminoTransactionParserV2` is a class designed to parse transactions for the Kamino protocol. It decodes transaction instructions, associates them with corresponding log messages, and stores relevant data in the database.
- Transaction Decoding: Decodes and validates transaction instructions against the Kamino program ID.
- Event Handling: Parses specific transaction types (events) and stores structured data for each recognized instruction.
- Data Storage: Saves parsed data to database models designed for different aspects of transaction data.
This parser specifically handles the following types of events within the Kamino protocol:
- `initObligation`: Initializes a new lending obligation.
- `initReserve`: Sets up a new reserve.
- Transactional events such as:
  - `depositObligationCollateral`
  - `withdrawObligationCollateral`
  - `depositReserveLiquidity`
  - `redeemReserveCollateral`
  - `borrowObligationLiquidity`
  - `repayObligationLiquidity`
  - `depositReserveLiquidityAndObligationCollateral`
  - `withdrawObligationCollateralAndRedeemReserveCollateral`
  - `liquidateObligationAndRedeemReserveCollateral`
  - `flashRepayReserveLiquidity`
  - `flashBorrowReserveLiquidity`
- These events handle the collateral and debt changes within the Kamino ecosystem.
The parser stores data in the following database tables:
- `kamino_obligation_v2`: Stores information about lending obligations.
- `kamino_reserve_v2`: Contains details about liquidity reserves.
- `kamino_parsed_transactions_v2`: Logs detailed transaction data including the type of event, associated accounts, amounts, and other transaction metadata.
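Schematically, a parser of this kind filters instructions by program ID, decodes them against the IDL, and dispatches recognized events to storage. The sketch below is illustrative only; its names and structure are assumptions, not the actual class internals.

```python
# Illustrative sketch of the decode-and-dispatch pattern; not the actual
# KaminoTransactionParserV2 implementation.
class LendingEventParserSketch:
    HANDLED_EVENTS = {
        "initObligation", "initReserve",
        "depositObligationCollateral", "withdrawObligationCollateral",
        "borrowObligationLiquidity", "repayObligationLiquidity",
        "liquidateObligationAndRedeemReserveCollateral",
    }

    def __init__(self, program_id, decoder, storage):
        self.program_id = program_id
        self.decoder = decoder      # decodes instructions against the protocol IDL
        self.storage = storage      # writes rows to the parsed-transactions tables

    def parse_transaction(self, transaction):
        for instruction in transaction.instructions:
            # Only instructions targeting the protocol's program ID are decoded.
            if instruction.program_id != self.program_id:
                continue
            event = self.decoder.decode(instruction)
            if event is not None and event.name in self.HANDLED_EVENTS:
                self.storage.save(transaction.signature, event)
```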
`MangoTransactionParserV2` is designed to parse and decode transaction events specifically for the Mango protocol. It handles various log events related to trading and lending activities affecting the size of debt and its collateralization on the MangoV4 platform.
- Transaction Decoding: Decodes transaction events using a predefined IDL (Interface Description Language) and associates them with the Mango program ID.
- Event Handling: Identifies and processes all relevant event types within the Mango ecosystem.
This parser handles numerous event types, including but not limited to:
- Trading events like `FillLog`, `FillLogV2`, `FillLogV3`, and `PerpTakerTradeLog`.
- Loan and collateral events such as `WithdrawLoanLog`, `DepositLog`, `WithdrawLog`, and `TokenCollateralFeeLog`.
- Perpetual and futures related logs like `PerpUpdateFundingLog`, `PerpLiqBaseOrPositivePnlLog`, `PerpLiqNegativePnlOrBankruptcyLog`, and `PerpForceClosePositionLog`.
- Flash loan events such as `FlashLoanLog`, `FlashLoanLogV2`, `FlashLoanLogV3`.
- Logs related to conditional swaps, including `TokenConditionalSwapCreateLog`, `TokenConditionalSwapTriggerLog`, and `TokenConditionalSwapCancelLog`.
Parsed events are stored in the `MangoParsedEvents` table, which includes fields like:
- `transaction_id`: Unique identifier of the transaction.
- `event_name`: Name of the event parsed.
- `event_data`: Detailed structured data associated with the event.
`MarginfiTransactionParserV2` is a specialized class designed to decode and parse transaction instructions associated with the Marginfi protocol.
- Transaction Decoding: Decodes transaction instructions that match the Marginfi program ID using the provided IDL.
- Event Handling: Parses a defined set of transaction events related to Marginfi operations, ensuring that each is correctly interpreted and processed.
- Data Storage: Structures and stores parsed transaction data in relevant database tables, facilitating further analysis and record-keeping.
The parser recognizes and handles:
- Account initialization and management: `marginfiAccountInitialize`, `lendingPoolAddBank`
- Lending operations: `lendingAccountDeposit`, `lendingAccountWithdraw`, `lendingAccountBorrow`, `lendingAccountRepay`
- Account and position management: `lendingAccountCloseBalance`, `lendingAccountLiquidate`
- Emission-related transactions: `lendingAccountWithdrawEmissions`, `lendingAccountSettleEmissions`
- Flashloan transactions: `lendingAccountStartFlashloan`, `lendingAccountEndFlashloan`
Parsed data is saved to several database models tailored for different aspects of transaction data:
- `MarginfiLendingAccountsV2`: Captures data about lending accounts.
- `MarginfiBankV2`: Stores information about banks within the Marginfi protocol.
- `MarginfiParsedTransactionsV2`: Logs comprehensive details about each parsed transaction event, including transaction identifiers, event names, and associated data.
To utilize this parser, instantiate it with the path to the Marginfi IDL and the program's public key.
The `SolendTransactionParser` is designed to parse and decode transaction instructions associated with the Solend protocol.
- Transaction Decoding: Decodes transaction instructions based on a predefined set of instruction types using Solend's program ID.
- Event Handling: Identifies and processes a wide range of financial operations such as deposits, withdrawals, loans, and repayments.
This parser handles several types of events linked to the core functionality of Solend, including:
- Market and reserve initialization: `init_lending_market`, `init_reserve`
- Liquidity management: `deposit_reserve_liquidity`, `redeem_reserve_collateral`, `withdraw_obligation_collateral_and_redeem_reserve_liquidity`
- Obligation management: `init_obligation`, `deposit_obligation_collateral`, `withdraw_obligation_collateral`, `refresh_obligation`
- Loan operations: `borrow_obligation_liquidity`, `repay_obligation_liquidity`, `liquidate_obligation`
- Flash loan operations: `flash_loan`, `flash_borrow_reserve_liquidity`, `flash_repay_reserve_liquidity`
- Other relevant transactions like `forgive_debt` and `redeem_fee`
The parsed data is stored in tables such as `solend_reserves` and `solend_obligations`.
Instantiate the parser with the Solend program's public key.
Data parsing orchestration is handled by `Dockerfile.parsing-pipeline`. To run the pipelines, the environment variable `PROTOCOL` is required.
Example:
docker build -t parsing-pipeline -f Dockerfile.parsing-pipeline .
docker run -e PROTOCOL=<protocol_name> -e POSTGRES_USER=<db_user> \
-e POSTGRES_PASSWORD=<db_password> -e POSTGRES_HOST=<db_host> \
-e POSTGRES_DB=<db_name> parsing-pipeline
Allowed protocol names: `mango`, `kamino`, `marginfi`, and `solend`.
Loan states relevant for the slot of the last available parsed transactions are computed using `Dockerfile.event_processing`. To run the container, an ENV variable `PROTOCOL` is required. The loan state for the given protocol will then be computed and saved to the database. The computation is achieved by "replaying" all historical transactions that adjust the token holdings of the given protocol. Deposit events increase the users' collateral, withdrawal events decrease it. Borrowing events increase the users' debt, repayment events decrease it. Liquidation events decrease both the users' collateral and debt.
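A minimal sketch of that replay, assuming an illustrative event shape (`kind`, `user`, `token`, `amount`) rather than the project's actual parsed-transaction schema:

```python
from collections import defaultdict

def replay_loan_states(events):
    # events are assumed to be ordered by slot; field names are illustrative.
    loan_states = defaultdict(
        lambda: {"collateral": defaultdict(float), "debt": defaultdict(float)}
    )
    for e in events:
        state = loan_states[e.user]
        if e.kind == "deposit":
            state["collateral"][e.token] += e.amount       # deposits increase collateral
        elif e.kind == "withdraw":
            state["collateral"][e.token] -= e.amount       # withdrawals decrease collateral
        elif e.kind == "borrow":
            state["debt"][e.token] += e.amount             # borrows increase debt
        elif e.kind == "repay":
            state["debt"][e.token] -= e.amount             # repayments decrease debt
        elif e.kind == "liquidation":
            state["collateral"][e.collateral_token] -= e.collateral_amount
            state["debt"][e.debt_token] -= e.debt_amount   # liquidations decrease both
    return loan_states
```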
The `kamino_loan_states`, `mango_loan_states`, `marginfi_loan_states`, and `solend_loan_states` tables are structured in the following way:
- `slot`: Stores the slot for which the data is relevant.
- `protocol`: Identifies the protocol, i.e. `kamino`, `mango`, `marginfi`, or `solend`.
- `user`: Identifies the user.
- `collateral`: Contains information about the user's collateral in a dictionary format where the keys are token addresses and the values are amounts.
- `debt`: Contains information about the user's debt in a dictionary format where the keys are token addresses and the values are amounts.
While every lending protocol implements its own version of the health factor, it is important to have one unified metric that can be used to compare user health across the whole ecosystem. For that reason we compute the standardized health factor, defined for every user as `total risk-adjusted collateral / total risk-adjusted debt`. The standardized health factor ranges from zero to infinity, and factors below one signify that the loan can be liquidated. Standardized health factors are stored in `{protocol}_health_ratios` tables (protocol can be "solend", "kamino", etc.). They are updated whenever the health factor changes by 1% or more from the latest stored value if the health factor is >= 1.2, or with every 0.25% change if it's below 1.2.
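As a sketch, the standardized health factor and the storage rule above can be expressed as follows (plain Python, not the project's actual code):

```python
def standardized_health_factor(risk_adjusted_collateral_usd, risk_adjusted_debt_usd):
    # total risk-adjusted collateral / total risk-adjusted debt;
    # values below 1.0 mean the loan can be liquidated.
    if risk_adjusted_debt_usd == 0:
        return float("inf")
    return risk_adjusted_collateral_usd / risk_adjusted_debt_usd

def should_store_update(new_hf, last_stored_hf):
    # Store on a >= 1% relative change when the health factor is >= 1.2,
    # or on a >= 0.25% change when it is below 1.2.
    threshold = 0.01 if new_hf >= 1.2 else 0.0025
    return abs(new_hf - last_stored_hf) / last_stored_hf >= threshold
```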
The `kamino_health_factors`, `mango_health_factors`, `marginfi_health_factors`, and `solend_health_factors` tables are structured in the following way:
- `slot`: Stores the slot for which the data is relevant.
- `protocol`: Identifies the protocol, i.e. `kamino`, `mango`, `marginfi`, or `solend`.
- `user`: User's address.
- `health_factor`: Health factor as defined by the corresponding lending protocol.
- `std_health_factor`: Standardized health factor.
- `collateral`: User's total collateral in USD.
- `risk_adjusted_collateral`: User's total collateral in USD, risk adjusted.
- `debt`: User's total debt in USD.
- `risk_adjusted_debt`: User's total debt in USD, risk adjusted.
- `timestamp`: When the entry was generated.
- `last_update`: Timestamp of the last update.
Liquidable debts relevant for the slot of the last available loan states are computed using `Dockerfile.liquidable_debt_processing`. To run the container, an ENV variable `PROTOCOL` is required. The liquidable debts for the given protocol will then be computed and saved to the database. The computation is achieved by taking all loan states of the given protocol and simulating the liquidations that would occur had the collateral token price reached any price level in a range from 0 to the current price + 30%. We then sum (across all users of the given lending protocol) all debt that the liquidators would need to repay in order to liquidate all liquidable loans at the given price level. This way, we obtain the total liquidable debt at the given price for the given protocol. Then, we take the differences between individual price levels; these differences represent the amounts of debt liquidated at the given price level, subject to the assumption that all loans that were liquidable at higher collateral token prices were in fact liquidated. These amounts are then stored in the database.
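A rough sketch of that simulation, with an illustrative `is_liquidable` rule and loan shape standing in for the protocol-specific logic:

```python
def is_liquidable(loan, collateral_price):
    # Illustrative rule: the loan becomes liquidable once the risk-adjusted value of its
    # collateral at the hypothetical price no longer covers its risk-adjusted debt.
    return loan.risk_adjusted_collateral_amount * collateral_price < loan.risk_adjusted_debt_usd

def liquidable_debt_per_price_level(loans, current_price, steps=120):
    top = current_price * 1.3                                  # 0 up to current price + 30%
    prices = [top * i / steps for i in range(steps, 0, -1)]    # descending price levels

    # Total debt liquidators would have to repay at each hypothetical price.
    cumulative = [
        sum(loan.debt_usd for loan in loans if is_liquidable(loan, price))
        for price in prices
    ]

    # Differences between consecutive levels = debt newly liquidable at that price,
    # assuming loans liquidable at higher prices were already liquidated.
    amounts = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]
    return list(zip(prices, amounts))
```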
The `kamino_liquidable_debts`, `mango_liquidable_debts`, `marginfi_liquidable_debts`, and `solend_liquidable_debts` tables are structured in the following way:
- `slot`: Stores the slot for which the data is relevant.
- `protocol`: Identifies the protocol, i.e. `kamino`, `mango`, `marginfi`, or `solend`.
- `collateral_token`: Denotes the token taking the role of collateral in the potential liquidations.
- `debt_token`: Denotes the token taking the role of debt in the potential liquidations.
- `collateral_token_price`: Hypothetical price of the collateral token for which the liquidable debt is measured.
- `amount`: Amount of liquidable debt at the given price which wasn't liquidable at higher price levels.
Amounts of tokens that are available for borrow on the following protocols: Kamino, Marginfi, Solend, Mango. The supplies are stored in the database, meaning the environment variables POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_HOST, and POSTGRES_DB are needed, along with AUTHENTICATED_RPC_URL, which is needed in order to fetch the latest on-chain data. After launching the Docker container and waiting for the first batch of information to be pushed to the database, the supplies will be present in the `public.token_lending_supplies` table.
We compute the liquidity available at orderbook exchanges and swap AMMs. By available liquidity we mean the maximum quantity that can be traded or swapped at the given venue without moving the price by more than 5%.
We compute the available liquidity on swap AMMs by aggregating the available liquidities on every single AMM/venue. For the given AMM, given debt token, given collateral token, and its price, we follow these steps:
- Fetch raw data, i.e., information about the pools (namely which tokens are in the pool and how many of them there are), and save it to the database. This is described in detail in the AMMs section.
- Load raw data for the pools containing the tokens of interest from the database. We load the latest data fetched; there is no need to load historical data.
- Simulate trading/swapping to learn how much of the token of interest can be obtained on the AMM without moving the price by more than 5% (see the sketch below).
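For a constant-product pool (an assumption made here for illustration; the project's AMMs may use different curves), the maximum swap size within a 5% price impact can be derived in closed form:

```python
def max_swap_within_price_impact(reserve_in, reserve_out, max_impact=0.05):
    # For x * y = k, swapping dx of the input token moves the marginal price by a
    # factor of ((x + dx) / x) ** 2.  Solving ((x + dx) / x) ** 2 <= 1 + max_impact:
    dx = reserve_in * ((1 + max_impact) ** 0.5 - 1)
    # Output received for dx under x * y = k (fees ignored).
    dy = reserve_out * dx / (reserve_in + dx)
    return dx, dy

# Example with made-up reserves: a pool holding 1,000,000 USDC and 7,000 SOL.
usdc_in, sol_out = max_swap_within_price_impact(1_000_000, 7_000)
```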
In contrast with AMMs, it is not simple to simulate the liquidity that would be available on orderbook exchanges in case the price of the collateral token drops. Nevertheless, for the given exchange and given pair of tokens, we use the following approximation:
- Fetch raw data, i.e., historical snapshots of the orderbooks for the tokens of interest, and save it to the database. This is described in detail in the CLOB DEXes section.
- Load historical snapshots from the exchange for the tokens of interest. We load historical snapshots from the past 5 days.
- For every snapshot, compute the limit order quantity available within 5% of the mid price in the direction of interest.
- Take the resulting time series of snapshots and compute its 5% quantile, which is a rough estimate of the liquidity that remains in the orderbook when the price of the collateral token drops suddenly (see the sketch below).
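A sketch of the last two steps, reusing the `bids`/`asks` snapshot format of the `orderbook_liquidity` table (plain Python, not project code):

```python
import statistics

def depth_within_5pct(bids, asks):
    # Limit-order quantity within 5% of the mid price on the bid side
    # (the side that absorbs selling of the collateral token).
    mid = (bids[0][0] + asks[0][0]) / 2
    cutoff = mid * 0.95
    return sum(size for price, size in bids if price >= cutoff)

def conservative_orderbook_liquidity(snapshots):
    # snapshots: iterable of (bids, asks) pairs covering the past 5 days.
    depths = [depth_within_5pct(bids, asks) for bids, asks in snapshots]
    if len(depths) < 2:
        return depths[0]
    # 5% quantile of the depth time series as a rough estimate of the liquidity
    # that remains when the collateral price drops suddenly.
    return statistics.quantiles(depths, n=20)[0]
```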
In order to have a unified representation of on-chain liquidity in the database, liquidity normalization is conducted via `Dockerfile.liquidity-normalizer`, which fetches the latest AMM/CLMM/CLOB liquidity from the database, normalizes it (basically transforming all data to orderbook-like data), and then pushes it to the `public.dex_normalized_liquidity` table. The POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_HOST, POSTGRES_DB, and AUTHENTICATED_RPC_URL environment variables are needed in order to run this service.
In order to provide signals of possible underwater loans, the DeRisk project also implements so-called Calls to Action (CTAs), which are generated by the service found in `Dockerfile.call-to-actions`. It needs the following environment variables:
- `POSTGRES_USER`: name of the database user
- `POSTGRES_PASSWORD`: database user's password
- `POSTGRES_HOST`: host address, IP address, or DNS of the database
- `POSTGRES_DB`: database name
This service fetches the available liquidity and liquidable debt for every possible pair and then calculates the price levels at which the liquidable debt exceeds the available liquidity. These price levels are then used to generate a CTA message that is stored in the database and can be queried with the collateral token address, debt token address, and timestamp.
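A minimal sketch of that comparison, with illustrative input shapes:

```python
def cta_price_levels(liquidable_debt_by_price, available_liquidity):
    # liquidable_debt_by_price: list of (collateral_price, liquidable_debt) pairs;
    # available_liquidity: amount of the debt token obtainable without moving the
    # price by more than 5%. Returns the price levels worth alerting on.
    return [
        price
        for price, debt in liquidable_debt_by_price
        if debt > available_liquidity
    ]
```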
The API exposes an endpoint that allows access to data in the database.
To run the API, the following environment variables are required:
- `POSTGRES_USER`: name of the database user
- `POSTGRES_PASSWORD`: database user's password
- `POSTGRES_HOST`: host address, IP address, or DNS of the database
- `POSTGRES_DB`: database name
Then, run the following commands:
docker build --file ./Dockerfile.api -t api .
docker run -d --name api -e POSTGRES_USER=$POSTGRES_USER -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD -e POSTGRES_HOST=$POSTGRES_HOST -e POSTGRES_DB=$POSTGRES_DB -p 3000:3000 api
The API exposes endpoints that allow access to data in the database.
Since some responses may be large, the API uses Gzip compression for the responses. It is advised to use the Accept-Encoding header:
Accept-Encoding: gzip
This will speed up the resolution of requests.
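For example, with Python's `requests` library, which sends `Accept-Encoding` and transparently decompresses gzip responses; the endpoint path and query parameters below are placeholders, not the actual routes:

```python
import requests

response = requests.get(
    "http://localhost:3000/<endpoint>",   # placeholder path, not an actual route
    params={"protocol": "kamino"},        # placeholder query parameters
    headers={"Accept-Encoding": "gzip"},
)
response.raise_for_status()
data = response.json()
```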
Query parameters:
- none
Readiness probe; if the request succeeds, the API is ready to accept requests.
Query parameters:
- `start_time`: start time in unix timestamp format
- `end_time`: end time in unix timestamp format
Returns an array of raw transactions within the given block range.
Query parameters:
- `start_time`: start time in unix timestamp format
- `end_time`: end time in unix timestamp format
- `protocol`: name of the protocol
Returns an array of parsed transactions within the given time range for the given protocol.
Query parameters:
- `token_x`: address of the first token
- `token_y`: address of the second token
Returns bids and asks for the given token pair.
Query parameters:
- `protocol`: name of the protocol
- `collateral_token`: collateral token address
- `debt_token`: debt token address
Returns an array of liquidable debt for the given protocol, collateral token, and debt token.
Query parameters:
- `protocol`: name of the protocol
Returns an array of current loan states.
Query parameters:
- `protocol`: name of the protocol
Returns an array of current loan states.
Query parameters:
- none
Returns an array of call-to-action items.
The frontend is a `streamlit` app which loads the latest outputs to visualize the following:
- Select boxes for the user to choose protocols and tokens of interest.
- Chart of the liquidable debt against the available supply, based on the parameters chosen above.
- Warning message informing the user about the risk of underwater loans.
- Table depicting utilization rates for various tokens.
- Pie charts showing each protocol's collateral, debt and supply for various tokens.
- Table depicting various statistics based on which we can compare the lending protocols, e.g., the number of users or total debt.
- Statistics on individual loans with the lowest health factor.
For running the frontend, run the following commands:
docker build -t frontend -f Dockerfile.frontend .
# default port for Streamlit is 8501
docker run -p 8501:8501 frontend