-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
the chain shouldn't bill contracts on offline nodes #873
Comments
Edit: OUTDATED I had a look in code and, indeed, node state is not taken into account while billing (node state is only checked to prevent creating node/rent contracts while a node is down). Since billing cycle is 1h, what we could do is considering to discard billing for this cycle if node state is down at end of cycle. Another important point would be to update |
Thanks for the report. In this kind of issues, it’s essential to include the contract type, ID, node ID, and network information. the origin issue related to Terraform plugin so you can also attach any relevant logs or screenshots that might help us to diagnose the issue. Now, given that we are talking about a node contract, To deploy a workload on grid node, a user needs to perform the following steps:
Now with this implementation in mind, can the chain bill a contract while a node is offline? This does not mean that the user will pay for the service during downtime. The node is responsible for updating the chain with consumption reports, and no new reports can be sent while the node is down. Therefore, no further charges will be incurred. Another guess is that node wasn't actually down, but not reachable over RMB due to this issue? in this case you can't communicate with node over RMB but node still report up-time. so how you observed that node state was down? and how you observed that the contract was billing specifically during the downtime? To be able to look into this further, I will need more information on how you arrived at this conclusion. Also I want to request @renauter input here. |
The chain doesn't track if a node is online or not, and it shouldn't since it doesn't care. The only thing provided on chain is the ability to submit uptime reports, so external tools can check if a node was online at some point in the past (and even then this is technically only an approximation). As it stands, it should be noted that "a node" is a trusted actor on chain (as per the grid design), and thus we assume it submits valid usage reports for workloads. Other than that, a node being down is billed if it is a dedicated node being rented, or public IP's are being rented. In case of the former, it has been decided when the feature is implemented that this is client responsibility to cancel the contract in this case. In case of the latter, the billing is correct as the IP is still held on chain so nobody else can use it. Unless there is some case where the chain bills contracts without proper trigger (in which case please clarify that and provide the required details to investigate that), this is all working as intended. |
Thank you all for your inputs.
So we are all good |
Can you please confirm that user will not be billed for downtime nodes in node contracts. I feel somehow confused after reading @renauter and @sameh-farouk anwsers. |
@s-areal Yes, no payment will be made for services or workloads on that node under node contracts (for any node resources), and I think @renauter shares my view. The IPs are not tied to any specific node and are allocated from the farmer IPs pool when the contract is created, and they are returned to the pool only when the contract is terminated. Therefore, we keep charging for them as long as they are reserved by the contract, if any, regardless of the node status. |
Closing for lacking essential info needed to diagnose the issue. |
I realized that we made a mistake in thinking that this design only affects rent contracts. Node contracts are also impacted. We assumed that node contracts are billed only when there is actual consumption of IP/SU/CU/NU, which is true from tfchain’s perspective. However, from zos’s perspective (@muhamadazmy can correct me), SU/CU consumption are recorded only on workload deployment, changing and deletion, regardless of the actual up time. This is the part that we overlooked. |
related to issue |
To the best of my knowledge, the (relevant) flow for node contracts should be like this:
This flow should only bill offline nodes for public IP's in use, which is correct since they are reserved on chain. NRU can't be billed since the offline node can't report NRU, and SU/CU should also not be billed. Of course, it is possible that there is a bug somewhere which causes SU/CU to be billed even if the node is down. |
@LeeSmet This argument (called it a bug) is debatable. Based on the code and code comments, it appears that this particular concern was not taken into account during the implementation phase. Originally, contracts were meant to be billed for any incurred SU/CU, NRU, or IP costs.
tfchain/substrate-node/pallets/pallet-smart-contract/src/billing.rs Lines 40 to 62 in a0d30d0
Also about your suggested flow
I proposed a similar approach, but utilizing the node uptime reports, which I believe would be more pertinent. Why do you think node uptime should be inferred from NRU reports instead? The obstacle lies in the fact that the chain runtime is unaware of the node’s online status and the actual node's uptime because we do not store this information in the chain state. Instead, we only pass it through events to the GraphQL. So I can think of two options: 2- off load this to zos service and make it responsible for trigger the billing for all contracts deployed on it and provide the number of seconds to be billed taking into consideration last report and uptime. |
I don't know how it is actually implemented, only what was discussed and agreed. In this regard, if the implementation deviates from the agreed plan than I do consider that a bug. Also, I am not aware of any fundamental changes which were considered to this flow since it was agreed on. The reason to use NRU reports is because they are a part of the billing flow regardless, and the way they track time is different. If uptime reports would be used, a contract would need to know exactly at what uptime it was deployed, when it was last billed, and need to handle node restarts (as those reset uptime). Uptime is also not tracked on chain as it stands, so that also needs to be added for this to work. On the other hand, in nru reports, the time reported is the time since last bill (capped to 1 hour iirc). This means that if a node can't submit a bill report on time, you can't get billed (since in general not submitting a report would indicate a network error, which means your workload is generally also not reachable/offline). Offloading the billing to nodes has been considered in the past, but it was generally considered against because:
|
@LeeSmet Some considerations regarding NRU reports approach.
ZOS only reports NRU when there is a consumption. if consumption is 0 then no report is sent. ZOS also does not count any private NRU, only the traffic over public IP is reported. For this approach to work, serious changes and decisions would need to be made to how ZOS submits reports and tracks metrics. Otherwise, workloads that do not have public IP addresses or have not sent or received any packets within a given time window would be considered offline, and would not be billed regardless of their online status. |
There shouldn't be a contract billing for a deployment against node of state down
The text was updated successfully, but these errors were encountered: