
fix(watchers): align taker fee validation retries with makers #2263

Merged · 5 commits into dev · Nov 19, 2024

Conversation

shamardy
Collaborator

@shamardy shamardy commented Nov 4, 2024

To Test:
Already tested by @cipig

@shamardy shamardy added the in progress Changes will be made from the author label Nov 4, 2024
@shamardy shamardy changed the title fix(watchers): unify taker fee validation retry logic between maker and watcher fix(watchers): align taker fee validation retries with makers Nov 4, 2024
@shamardy shamardy added under review and removed in progress Changes will be made from the author labels Nov 4, 2024
@shamardy shamardy requested a review from cipig November 4, 2024 21:44
Member

@cipig cipig left a comment


thanks for the fix, it now works fine, errors like this are gone

· 2024-11-03 05:53:07 +0000 [] swap_watcher:632] Entering the taker swap watcher loop LTC-segwit/DGB-segwit with taker fee hash: cf67f954baf648935ec2e419c98c74a657884867c65cc32897acf2ccd52b4495
03 05:53:07, mm2_main::lp_swap::swap_watcher:514] INFO Watcher loop for swap a569ee07-0797-4dc4-b1a1-e928b48e218c stopped with reason Error(mod:717] InvalidTakerFee("utxo_common:2086] WrongPaymentTx(\"Early confirmation: Fee tx Transaction { version: 1, n_time: None, overwintered: false, version_group_id: 0, inputs: [TransactionInput { previous_output: OutPoint { hash: c8c61d412e285087ecbd0f491835697af14ed5c01c4d14af637b7642fe3427f4, index: 0 }, script_sig: , sequence: 4294967295, script_witness: [30450221008458b766554b3d116a7da9c037b80446bbea33759481cfa39b9db267983a4da60220285b2dc2c43e2bb343c82eed65937e435039869f17a82a83d4aa1df6aa3e132301, 027245291c79cd3856857e0edcc4a479a347fc2ced65ea89236b1329fb07de6847] }], outputs: [TransactionOutput { value: 43616781, script_pubkey: 76a914ca1e04745e8ca0c60d8c5881531d51bec470743f88ac }, TransactionOutput { value: 18201068025, script_pubkey: 00149ac6fafb7e02dc5ebe1589ef9b3558486f20b63e }], lock_time: 1730612792, expiry_height: 0, shielded_spends: [], shielded_outputs: [], join_splits: [], value_balance: 0, join_split_pubkey: 0000000000000000000000000000000000000000000000000000000000000000, join_split_sig: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, binding_sig: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, zcash: false, posv: false, str_d_zeel: None, tx_hash_algo: DSHA256 } confirmed before min_block 20271407\")"))

@cipig
Member

cipig commented Nov 6, 2024

still happens sometimes:

· 2024-11-05 20:57:07 +0000 [] swap_watcher:640] Entering the taker swap watcher loop KMD/RVN with taker fee hash: 582548f2b194d9f4881708f343df5e7413d07b7e92b5168552b00e1e085f0aa4
· 2024-11-05 20:57:07 +0000 [] swap_watcher:640] Entering the taker swap watcher loop KMD/XVC with taker fee hash: da618167a5c50a606fc5d42f2350f8fad5b7c5ad94595375667cf36a76f2d1c8

05 20:57:07, mm2_main::lp_swap::swap_watcher:201] ERROR Error validating taker fee: utxo_common:2086] Early confirmation: Fee tx Transaction { version: 1, n_time: Some(1730835680), overwintered: false, version_group_id: 0, inputs: [TransactionInput { previous_output: OutPoint { hash: 584c0503458ef57c73a14719d8261d7a7a2f6e887807dab14bdb6a0837071bfb, index: 0 }, script_sig: 4730440220679900925bfedf55438731dd132461b26adc74eb9602eb7b4b7100a9b123c4d9022005f7ed079aef489bca4a1d1728bf97f11a1db614b65a771752dc0e969c504d62012103ce12053f0513e3a58e19c34655433024b00d717568fd4092120dde413f0e4084, sequence: 4294967295, script_witness: [] }], outputs: [TransactionOutput { value: 3866362, script_pubkey: 76a914ca1e04745e8ca0c60d8c5881531d51bec470743f88ac }, TransactionOutput { value: 3338390538, script_pubkey: 76a914b065f82277f2c4ebdf4e2e75149904b1b39b675a88ac }], lock_time: 1730835680, expiry_height: 0, shielded_spends: [], shielded_outputs: [], join_splits: [], value_balance: 0, join_split_pubkey: 0000000000000000000000000000000000000000000000000000000000000000, join_split_sig: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, binding_sig: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, zcash: false, posv: false, str_d_zeel: None, tx_hash_algo: DSHA256 } confirmed before min_block 1314695

idk which coin it means, it doesn't say... likely XVC

here it's CLAM or FTC

· 2024-11-05 13:40:32 +0000 [] swap_watcher:640] Entering the taker swap watcher loop KMD/CLAM with taker fee hash: 3efd336e7b18596d371a1603ade911e65806ce6c092c53f53b190dcdf39e2bb5
· 2024-11-05 13:40:35 +0000 [] swap_watcher:640] Entering the taker swap watcher loop KMD/FTC-segwit with taker fee hash: bc6170096a469eb848421c50757b7d2e41e23d1b239efbab67ae5a14d2756fe3

05 13:40:35, mm2_main::lp_swap::swap_watcher:201] ERROR Error validating taker fee: utxo_common:2086] Early confirmation: Fee tx Transaction { version: 1, n_time: None, overwintered: false, version_group_id: 0, inputs: [TransactionInput { previous_output: OutPoint { hash: f88e6c56411f7179e6f25466dbdb9f64b7d4dba04bb2a730e8cde721105f116a, index: 1 }, script_sig: , sequence: 4294967295, script_witness: [3045022100cda18fc6f87db1b86e81f6917ca1dc6e486b0ca7ca9b5ac02b3375858399b38d022012820ccde1e50482fbe4d4bd392a2849aff9441bd3bff373e70d71b6e538cb0e01, 028a645cdda5e7785e600cde1f773c1f8f4ed5e6a209c445ea70c7e4fd2cde1cbd] }], outputs: [TransactionOutput { value: 75289575, script_pubkey: 76a914ca1e04745e8ca0c60d8c5881531d51bec470743f88ac }, TransactionOutput { value: 3512585762, script_pubkey: 00149551029d9b85a16f0b72491eb7b63e384b6a6ebf }], lock_time: 1730812809, expiry_height: 0, shielded_spends: [], shielded_outputs: [], join_splits: [], value_balance: 0, join_split_pubkey: 0000000000000000000000000000000000000000000000000000000000000000, join_split_sig: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, binding_sig: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, zcash: false, posv: false, str_d_zeel: None, tx_hash_algo: DSHA256 } confirmed before min_block 5417134

@shamardy
Collaborator Author

shamardy commented Nov 8, 2024

@cipig for this #2263 (comment) do we get

INFO Watcher loop for swap {uuid} stopped with reason Error

Or only an error log and it continues?

@cipig
Member

cipig commented Nov 8, 2024

it's only that error, i checked log file... eg

· 2024-11-07 23:33:15 +0000 [] swap_watcher:640] Entering the taker swap watcher loop KMD/NENG with taker fee hash: f1563d40e56aae002fa1c509f54a854c5a6a1877c4509efef60cbbae49e6bc72
07 23:33:15, mm2_main::lp_swap::swap_watcher:201] ERROR Error validating taker fee: utxo_common:2086] Early confirmation: Fee tx Transaction { version: 1, n_time: None, overwintered: false, version_group_id: 0, inputs: [TransactionInput { previous_output: OutPoint { hash: 5840e59103591c1764781be6de9062bb20219d7aef30cace2bb7aedca13102b0, index: 2 }, script_sig: 4730440220035bb489c9e6e6396be6197de2e8f2f3ef57debcb2991309bdc40b002adb786302205bab1f32db8e2ecdd0cf6e02b5d8dc1d3be1019af3865b4726acff34656409f9012103dab17a731857e96ed1c1f0b00db861bbb8224cc6188977be0270ea098f3b52cd, sequence: 4294967295, script_witness: [] }], outputs: [TransactionOutput { value: 408145278411, script_pubkey: 76a914ca1e04745e8ca0c60d8c5881531d51bec470743f88ac }, TransactionOutput { value: 5992238441747, script_pubkey: 76a9145cfc9c05a09f824be0fce2ae3217efe3b60b3ddf88ac }], lock_time: 1731022261, expiry_height: 0, shielded_spends: [], shielded_outputs: [], join_splits: [], value_balance: 0, join_split_pubkey: 0000000000000000000000000000000000000000000000000000000000000000, join_split_sig: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, binding_sig: 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000, zcash: false, posv: false, str_d_zeel: None, tx_hash_algo: DSHA256 } confirmed before min_block 5732394
07 23:33:54, libp2p_gossipsub::behaviour:3557] INFO IAmrelay: Adding peer: PeerId("12D3KooWNBQWEnXRBUn5oV362aw9K9J29Ci8qmDTXGJP8J21w29h") to the relays list
07 23:34:31, mm2_main::lp_network:196] ERROR Message cannot continue the process due to: utxo_common:2928] InvalidInput("MalformedData")
07 23:34:32, mm2_main::lp_network:196] ERROR Message cannot continue the process due to: utxo_common:2928] InvalidInput("UnexpectedEnd")

i guess the InvalidInput errors are not related to this... first time i saw those
but i use a custom RUST_LOG, to reduce logs: export RUST_LOG="info,mm2_main::mm2::rpc=off,mm2_libp2p::peers_exchange=off,rustls::check=off,rustls::conn=off,atomicdex_gossipsub::behaviour=warn,mm2_libp2p::atomicdex_behaviour=warn,coins::utxo::utxo_builder::utxo_coin_builder=warn", so could also be that it filtered out some log entries... will unset on one seed/watcher and restart

@shamardy
Collaborator Author

shamardy commented Nov 8, 2024

i guess the InvalidInput errors are not related to this... first time i saw those

It's related to this #1238. Are you running any coins in native mode on this KDF node?

@cipig
Member

cipig commented Nov 8, 2024

It's related to this #1238. Are you running any coins in native mode on this KDF node?

No, it's a watcher/seed node and all coins/tokens are enabled with electrum/RPC.

@shamardy
Collaborator Author

shamardy commented Nov 8, 2024

No, it's a watcher/seed node and all coins/tokens are enabled with electrum/RPC.

Ooh, it's a seed node; that's why it receives this message. I will add a fix so that seed nodes that are not running the coin in native mode don't process this.

@shamardy
Collaborator Author

shamardy commented Nov 8, 2024

Ooh, it's a seed node; that's why it receives this message. I will add a fix so that seed nodes that are not running the coin in native mode don't process this.

It will still run tx_enum_from_bytes to make sure it's a valid transaction/message before propagation to other nodes, so you will still get these errors for invalid transactions.

@shamardy
Collaborator Author

shamardy commented Nov 8, 2024

@onur-ozkan should we also set to_propagate to true here

Some(lp_healthcheck::PEER_HEALTHCHECK_PREFIX) => {
    lp_healthcheck::process_p2p_healthcheck_message(&ctx, message).await
},

if let Err(e) = coin.tx_enum_from_bytes(&message.data) {
    log::error!("Message cannot continue the process due to: {:?}", e);
    return;
};

let fut = coin.send_raw_tx_bytes(&message.data);
ctx.spawner().spawn(async {
    if coin.is_utxo_in_native_mode() {
Collaborator Author

Not sure about that. Should we allow seed nodes to send failed swap transactions to electrums? My opinion is no, but I'm opening this up for discussion. There is also this comment #1238 (comment) to think about.

Collaborator

why are we sending the tx only when running on top of native?

Collaborator Author

We use this in case the electrum/s used in the swap are down or censoring transactions: if any node in the network has the coin enabled with a native daemon, it provides an additional way to get the transaction through. After your electrums-manager PR, we can add all the available electrums to the user's list, and since all electrums are tried for broadcasting, this message will only be used as a fallback if all electrums fail. One of the seed nodes could still have a different electrum, which is why I asked "Should we allow seed nodes to send failed swap transactions to electrums?", but the drawback is that this would be resource-exhausting for electrums, which is another reason we only send the tx to nodes running the coin on top of native daemons.

Collaborator

What does "send failed swap transactions" mean? (Are those refund transactions? If so, why should they be treated differently?)

Collaborator Author

They are spend or refund transactions. If electrums censor these transactions or fail to broadcast them, we rely on nodes running a native daemon to get them into the mempool. This should be combined with SPV validation in the future: even if an electrum reports the transaction as broadcast successfully, we verify it, and if SPV validation fails we try this p2p transaction helper as a last resort.

Collaborator

I also wondered what would happen if several nodes in native mode are asked to broadcast the same transaction (e.g. whether nodes get banned).
Apparently not: as far as I can see in the kmd daemon's code, if a transaction is relayed from one node's mempool to other nodes that already have the same transaction in their mempools, they do not consider the originating node as misbehaving.

@shamardy shamardy added the P1 label Nov 11, 2024
@onur-ozkan
Member

@onur-ozkan should we also set to_propagate to true here

Some(lp_healthcheck::PEER_HEALTHCHECK_PREFIX) => {
lp_healthcheck::process_p2p_healthcheck_message(&ctx, message).await
},

As they are sent to a specific topic that is supposed to be subscribed to by only one peer, no.

@shamardy
Collaborator Author

As they are being sent to specific topic that is supposed to be subscribed from only one peer, no.

So this is not for propagating from seed nodes? I thought it meant that; will recheck.

@shamardy
Collaborator Author

From libp2p gossipsub behaviour

    /// This function should be called when [`Config::validate_messages()`] is `true` in order to
    /// propagate messages. Messages are stored in the ['Memcache'] and validation is expected to be
    /// fast enough that the messages should still exist in the cache.
    ///
    /// Calling this function will propagate a message stored in the cache, if it still exists.
    /// If the message still exists in the cache, it will be forwarded and this function will return true,
    /// otherwise it will return false.
    pub fn propagate_message(
        &mut self,
        message_id: &MessageId,
        propagation_source: &PeerId,
    ) -> Result<bool, PublishError> {

We set validate_messages to true here


That's why we propagate after validation

if to_propagate && i_am_relay {
    propagate_message(&ctx, message_id, peer_id);
}

@onur-ozkan
Member

onur-ozkan commented Nov 11, 2024

From libp2p gossipsub behaviour

    /// This function should be called when [`Config::validate_messages()`] is `true` in order to
    /// propagate messages. Messages are stored in the ['Memcache'] and validation is expected to be
    /// fast enough that the messages should still exist in the cache.
    ///
    /// Calling this function will propagate a message stored in the cache, if it still exists.
    /// If the message still exists in the cache, it will be forwarded and this function will return true,
    /// otherwise it will return false.
    pub fn propagate_message(
        &mut self,
        message_id: &MessageId,
        propagation_source: &PeerId,
    ) -> Result<bool, PublishError> {

We set validate_messages to true here

That's why we propagate after validation

if to_propagate && i_am_relay {
    propagate_message(&ctx, message_id, peer_id);
}

Oh, it's for the seed nodes. How did this work until now? If it's not being propagated, how did non-seed nodes receive healthcheck messages?

@shamardy
Collaborator Author

Oh, it's for the seed nodes. How did this work until now? If it's not being propagated, how did non-seed nodes receive healthcheck messages?

No idea, maybe you tested with only one seed node and one light node.

@shamardy
Collaborator Author

shamardy commented Nov 11, 2024

I think we should also separate validation from actually processing messages in all the process-p2p functions; I will look into it and open an issue if needed.

Edit: if we don't already do that somehow.

@onur-ozkan
Member

No idea, maybe you tested with only one seed node and one light node.

Yeah, most likely. So the message should be propagated.

onur-ozkan
onur-ozkan previously approved these changes Nov 12, 2024
Member

@onur-ozkan onur-ozkan left a comment


LGTM other than one nit


pub(crate) const TAKER_FEE_VALIDATION_ATTEMPTS: usize = 6;
pub(crate) const TAKER_FEE_VALIDATION_RETRY_DELAY: f64 = 10.;
Member

Suggested change
pub(crate) const TAKER_FEE_VALIDATION_RETRY_DELAY: f64 = 10.;
pub(crate) const TAKER_FEE_VALIDATION_RETRY_DELAY_SECS: f64 = 10.;

Collaborator Author

Done here 515f6ae
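For context, these constants give the watcher up to six validation attempts spaced ten seconds apart, matching the maker side. A minimal synchronous sketch of that retry semantics (the real watcher code is async, and `validate_with_retries` and its signature are illustrative, not the actual KDF implementation):

```rust
use std::{thread, time::Duration};

// Constants from this PR (the delay was renamed to *_SECS per the review nit).
const TAKER_FEE_VALIDATION_ATTEMPTS: usize = 6;
const TAKER_FEE_VALIDATION_RETRY_DELAY_SECS: f64 = 10.;

// Retry a fallible validation up to `attempts` times, sleeping `delay_secs`
// between attempts; returns the last error if every attempt fails.
fn validate_with_retries<F>(attempts: usize, delay_secs: f64, mut validate: F) -> Result<(), String>
where
    F: FnMut() -> Result<(), String>,
{
    let mut last_err = String::from("no attempts made");
    for attempt in 0..attempts {
        match validate() {
            Ok(()) => return Ok(()),
            Err(e) => last_err = e,
        }
        if attempt + 1 < attempts {
            thread::sleep(Duration::from_secs_f64(delay_secs));
        }
    }
    Err(last_err)
}

fn main() {
    // Simulate an electrum whose tip lags for the first two polls, then
    // catches up: the transient "early confirmation" error clears on retry.
    // (A zero delay is used so the demo runs instantly; the watcher would
    // pass TAKER_FEE_VALIDATION_RETRY_DELAY_SECS.)
    let mut polls = 0;
    let result = validate_with_retries(TAKER_FEE_VALIDATION_ATTEMPTS, 0.0, || {
        polls += 1;
        if polls < 3 {
            Err("early confirmation".to_string())
        } else {
            Ok(())
        }
    });
    assert!(result.is_ok());
    assert_eq!(polls, 3);
    println!("ok");
}
```

A transient failure such as the lagging-electrum case clears as soon as one of the later attempts succeeds; only a persistently invalid fee keeps failing through all six attempts.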

@onur-ozkan
Member

LGTM other than one nit

Also, of course, to_propagate = true needs to be handled for hcheck messages.

@shamardy shamardy added in progress Changes will be made from the author and removed under review labels Nov 13, 2024
@shamardy
Collaborator Author

Also, of course, to_propagate = true needs to be handled for hcheck messages.

Will return validation errors from process_p2p_healthcheck_message to not propagate the message if validation fails as well.

@shamardy
Collaborator Author

Will return validation errors from process_p2p_healthcheck_message to not propagate the message if validation fails as well.

Done here b81bfc6 I will revisit other messages processing and validations in a different PR.

@shamardy shamardy added under review and removed in progress Changes will be made from the author labels Nov 14, 2024
dimxy
dimxy previously approved these changes Nov 15, 2024
Collaborator

@dimxy dimxy left a comment


LGTM

@@ -181,24 +182,31 @@ impl State for ValidateTakerFee {

async fn on_changed(self: Box<Self>, watcher_ctx: &mut WatcherStateMachine) -> StateResult<WatcherStateMachine> {
debug!("Watcher validate taker fee");
let validated_f = watcher_ctx
.taker_coin
.watcher_validate_taker_fee(WatcherValidateTakerFeeInput {
Collaborator

I think this watcher_validate_taker_fee fn used to fail because there is a check inside that tx.height should be above min_block_number. The case where the tx is in the mempool (tx.height is None) is handled and validation does not fail.
But tx.height may also be non-None with a value of '-1' (at least in KMD) if the tx is in a forked branch, so tx validation may fail.

Collaborator Author

But tx.height also may be not None and contain value of '-1' (at least in KMD) if the tx is in a forked branch (so tx validation may fail)

But we use confirmations, not height:

if tx.confirmations > 0 {
    let current_block = try_s!(coin.as_ref().rpc_client.get_block_count().compat().await);
    let confirmed_at = current_block + 1 - tx.confirmations as u64;

I think it fails because get_block_count lags behind the current height of the blockchain, as the watcher can use a different electrum than the taker. This is why the retries fix it.
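The check above reduces to simple arithmetic, which also shows why a lagging tip triggers the error. A minimal sketch (illustrative names, not the actual KDF code):

```rust
// Sketch of the early-confirmation check discussed above; all names are
// illustrative, not the actual KDF implementation.
fn confirmed_before_min_block(confirmations: u64, current_block: u64, min_block: u64) -> bool {
    if confirmations == 0 {
        return false; // tx is still in the mempool, nothing to check yet
    }
    // Height at which the tx received its first confirmation.
    let confirmed_at = current_block + 1 - confirmations;
    confirmed_at < min_block
}

fn main() {
    // Up-to-date tip at 1_314_700: confirmed_at = 1_314_696 >= min_block, OK.
    assert!(!confirmed_before_min_block(5, 1_314_700, 1_314_695));
    // A tip lagging two blocks behind: confirmed_at = 1_314_694 < min_block,
    // so the same tx looks "confirmed too early" until the electrum catches up.
    assert!(confirmed_before_min_block(5, 1_314_698, 1_314_695));
    println!("ok");
}
```

With the confirmation count fixed, a get_block_count result even a couple of blocks behind pushes confirmed_at below min_block, producing the transient "Early confirmation" errors seen in the logs above.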

@dimxy
Collaborator

dimxy commented Nov 17, 2024

I suggest using tx_from_rpc instead of taker_fee_tx in this code

EARLY_CONFIRMATION_ERR_LOG, taker_fee_tx, min_block_number

to see the tx height and confirmations in the logs

@shamardy shamardy merged commit 0a94102 into dev Nov 19, 2024
20 of 23 checks passed
@shamardy shamardy deleted the watcher-validate-fee-attempts branch November 19, 2024 17:16
dimxy added a commit that referenced this pull request Nov 26, 2024
* dev:
  chore(release): update v2.2.0-beta date (#2277)
  chore(release): add changelog entries for v2.2.0-beta (#2240)
  fix(watchers): align taker fee validation retries with makers (#2263)
  feat(tokens): custom token activation for evm (#2141)
  use safer subtraction on healthcheck expiration check (#2272)
  fix(hd-wallet): correctly display evm addr in `get_new_address` response (#2264)
Alrighttt pushed a commit that referenced this pull request Nov 26, 2024
This also fixes propagation of health check messages.
5 participants