Rename portal_*RecursiveFindContent to portal_*GetContent #344
Conversation
Just want to point out that there was recent discussion about this. It might be a good opportunity to clarify the behavior of that endpoint as well.
jsonrpc/src/methods/beacon.json
Outdated
"$ref": "#/components/contentDescriptors/ContentKey" | ||
} | ||
], | ||
"result": { |
This method will also require a ContentNotFoundError, see PR: #343.
Sure, I'll add it once that PR is merged.
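For illustration only, here is a minimal sketch (Python) of how a "content not found" JSON-RPC error response could be shaped. The actual error name, code and message are whatever PR #343 specifies; the values below are placeholders, not the spec.

```python
# Placeholder values only: the real ContentNotFoundError code/message are
# defined in PR #343, not here.
CONTENT_NOT_FOUND = {"code": -39000, "message": "Content not found"}

def content_not_found_response(request_id: int) -> dict:
    """Build a JSON-RPC error response for a missing content key."""
    return {"jsonrpc": "2.0", "id": request_id, "error": dict(CONTENT_NOT_FOUND)}
```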
jsonrpc/src/methods/beacon.json
Outdated
},
{
"name": "portal_beaconGetContent",
"summary": "Get content from the local database if it exists, otherwise look up the target content key in the network.",
I do think we need to explicitly specify somewhere that the content needs to be verified by the RPC server, and that an error is thrown if verification fails. Otherwise we will end up with differences between clients, like we have now for the portal_*RecursiveFindContent method (which, as @morph-dev mentioned, we should clarify also, but perhaps in another PR).
Okay, I'll add more detail to the description in that case.
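To make the point concrete, here is a minimal sketch of the behavior being asked for, with local_db, network_lookup and validate used as stand-ins for client internals (none of these names are part of the spec): the server never returns unverified data and raises an error if verification fails.

```python
# Sketch only: local_db, network_lookup and validate stand in for client
# internals and are not part of the JSON-RPC specification.
class ContentValidationError(Exception):
    """Raised when content fetched from the network fails verification."""

local_db: dict[bytes, bytes] = {}

def network_lookup(content_key: bytes) -> bytes | None:
    """Placeholder for the client's recursive content lookup."""
    return None

def validate(content_key: bytes, content: bytes) -> bool:
    """Placeholder for content verification (e.g. against a proof)."""
    return True

def get_content(content_key: bytes) -> bytes | None:
    content = local_db.get(content_key)
    if content is not None:
        return content                    # served straight from the local database
    content = network_lookup(content_key)
    if content is None:
        return None                       # maps to the content-not-found error
    if not validate(content_key, content):
        # The point under discussion: never return unverified data.
        raise ContentValidationError("content fetched from the network failed validation")
    local_db[content_key] = content       # store it (if the storage criteria are met)
    return content
```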
Yes, I agree, it's something that should be discussed and resolved if possible. My first impression from looking through the current JSON-RPC spec is that it isn't very clear or detailed about the expected behavior, having only a single-line description per method. My thoughts are that it might be useful to clarify that as well.
It would be valuable to expand the hive test coverage ASAP to cover these JSON-RPC endpoint semantics so that our clients behave the same; otherwise we're heading down the same path as the original batch of execution clients, which all had different behavior and even different payloads for a long time.
I didn't review the actual specifics of implementation, but I support the overall direction this takes. Moving trin's current behavior towards this API seems like the right move.
I also agree that adding specific language about which endpoints should validate data would be a big improvement.
Yes, it would be good to have hive verify the behavior of the portal JSON-RPC API in clients. Probably a good starting point would be to expand the spec description of all methods where there is any ambiguity so that all teams clearly understand the expected behavior. That can be done in a separate PR.
I can confirm that this is already how Trin behaves.
^ Because of this, I don't think we should add a separate new endpoint.
@kdeme @pipermerriam @bhartnett: ethereum/hive#1170 is a test to make sure all implementations of portal_*RecursiveFindContent match the expected behavior. If you could review it, that would be nice as well.
portal_*RecursiveFindContent is supposed to do what this PR proposes. This situation was caused by us not having test coverage for this in hive, so certain implementations had different outcomes. To rectify the situation, I have written a new Hive test (ethereum/hive#1170) to ensure all Portal clients match the expected behavior of checking their local DB first before initiating an RFC when the portal_*RecursiveFindContent method is called.
I will also prioritize getting negative cases for History done (so validation is tested). Every Portal JSON-RPC method should validate the data it fetches from the network; an error should be thrown if your Portal client can't validate something retrieved from the network. We currently do this in Trin. I think this assumption is required for any JSON-RPC method that retrieves information from the network, so if there is any ambiguity here, more tests should be written to prove validation is being done (I have been planning to do this, I just didn't think it was a priority till now). We should also update our documentation to make this requirement more obvious. If the data isn't validated, it might as well be considered malicious.
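The actual test lives in ethereum/hive#1170; purely as a sketch of the idea (with a hand-rolled JSON-RPC helper rather than the hive harness, and an assumed no-peers setup), the check boils down to storing content on a node and expecting the lookup method to return it from the local DB rather than failing:

```python
import json
import urllib.request

def rpc(url: str, method: str, params: list) -> dict:
    """Minimal JSON-RPC 2.0 call; the real hive test uses its own harness."""
    payload = json.dumps({"jsonrpc": "2.0", "id": 1, "method": method,
                          "params": params}).encode()
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def check_local_db_checked_first(node_url: str, key: str, value: str) -> None:
    # Put the content directly into the node's local database...
    rpc(node_url, "portal_historyStore", [key, value])
    # ...then the lookup should succeed even though the node has no peers,
    # i.e. without a network RFC ever being possible.
    reply = rpc(node_url, "portal_historyRecursiveFindContent", [key])
    # The exact result shape is defined by the spec; the essential assertion
    # is that we get the stored content back, not a content-not-found error.
    assert "error" not in reply, "content was not served from the local database"
```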
I don't feel strongly about this, but I'll explain my reasoning for my loose support for this endpoint. The proposed endpoint is obviously a high-level API, and having it handle the negotiation of how and where it gets the content is less surprising. This allows for an endpoint with good UX. I'm not sure these reasons should be considered compelling, but from a purely architectural and design perspective, the proposed solution seems clean. Kolby's direction of just getting the tests to enforce the current trin behavior is simple, easy, and pragmatic. That's the extent to which I'll weigh in here. No need to prove me wrong as I'm not sure I'm correct.
I don't see the value in a low-level RecursiveFindContent endpoint.
There would be no advantage for testing by having both methods. Currently, in Trin, portal_*RecursiveFindContent already checks the local database before doing a network lookup. TL;DR: if the name is the issue, renaming the existing method seems better than adding a second one.
I would argue that the spec should be updated to make the expected behavior clear; hive testing is secondary to this, but of course still required.
There was some discussion within the Nimbus team about possible use cases of the portal JSON-RPC API for fetching history data for EIP-4444. Two possible scenarios/use cases were raised that an execution client such as Nimbus might need:
Looking through the spec, it appears that we already have many endpoints that could be considered low level. I'm also curious about the expected or current behavior of the other endpoints in Trin and other clients.
I would argue that if Nimbus EL was using Portal, it wouldn't use the JSON-RPC API to fetch history in the first place. The encoding/decoding through the JSON-RPC would kill performance by itself. If anything, they would use a channel directly, with Fluffy just another running service. So both of the use cases you listed wouldn't be done through Portal's JSON-RPC to begin with; Nimbus EL would send messages directly to Fluffy's service, removing the overhead of the JSON-RPC. So I am not sure why Portal's JSON-RPC is being conflated with use cases where it doesn't make sense to use the JSON-RPC in the first place, instead of opting for a more direct connection where Nimbus would have finer control over what Fluffy does and the messages sent. Maybe for a demo? But I don't think we should write our JSON-RPC specification for power users who won't use the JSON-RPC long-term anyway, if at all. Also, if we wanted to turn validation off, that would be a method param; I don't see why it would have to be its own method.
I see. Use cases of this endpoint would include the two scenarios raised above. Without this functionality it is extremely hard to do either of those.
The way this PR defines RecursiveFindContent is implementation specific. I don't think this is about exposing low-level functionality.
Having an RPC endpoint that exposes a raw RFC wouldn't help test anything extra. As I said earlier, it doesn't make sense for Nimbus EL to use a high-level connection such as the JSON-RPC; instead, they would opt for a lower-level channel or pipe, or however they call it in Nim, where they would have finer control over what is being done. I don't think the JSON-RPC makes sense for the Execution Client use case. Intensive web3 applications and rollups often build on top of ELs (not using the JSON-RPC) for a reason.
No. So realistically, I don't see why Nimbus would use the JSON-RPC instead of a channel. If it is short-term for a proof of concept, then it doesn't matter if Fluffy validates the content, as they will end up replacing the JSON-RPC with a channel regardless.
Here is my take on this, replying to many of the comments; if that is a bit much to read, a TLDR:
Then:
This is your interpretation of the specification. Mine is different, and from the naming of the method I directly concluded that it should not look locally but only on the network.
Completely agree with this, and this is also why it is implemented in Fluffy the way it is (not the validation part; it is more obvious that validation needs to be added for this specific method, but it is also not explicitly specified).
I am not sure if there currently is one either. But, because I'm not in favour of adding code/specs that are currently not used, I don't hold this as a reason to keep the lower-level call. It can always be re-added when it turns out to be needed.
Yes, I was going to suggest the same naming change. As this method is something that end users might/will end up using, I think a simpler name like that is valuable. In general, I think at a certain point a bigger effort will be needed to move certain low-level APIs into a separate namespace.
It is true that this is less efficient, yet to this day every full node setup communicates via JSON-RPC (engine_api) between nodes. So there must be some value to having an API there with good UX.
Sort of, but also not really. We mostly want to avoid doing too much integration work without knowing whether this is something users want to use in an integrated setup. And at the same time, the JSON-RPC version would still be useful when users want to use another Portal client, or when Fluffy gets used with other EL clients.
I don't fully agree with this. While it is implementation specific, its result should not be. That is, you should be able to test with this call whether a node can actually find a piece of content that is somewhere available on the network. That is something that can be tested for all clients. But yes, this can be done with the current call.
I'm not saying that we need to make specific adjustments to the JSON-RPC regarding validation just because Nimbus might want this. If Nimbus wants this we can add some non-specified methods to Fluffy first, and if it turns out to be useful for other clients, eventually this would end up in the specification. As long as that is not proven, it should not get added or affect other clients. But regarding validation, we do need to specify this better in the specifications for the current API. It is mostly "obvious" and implied for the "higher" level methods, but I wouldn't say this for all of them. The actual reason why Fluffy does not have this is that it was easier to test with bogus data in the early days, especially for the lower-level calls. I am fine with adding it, but it should be explicitly stated in the specification, especially for the lower-level calls.
I am confused about what this means. I have nothing against better documenting how JSON-RPC methods should behave.
I would be fine with this as well
I get what you mean. I agree documenting which JSON-RPC methods are supposed to validate data is important and something we are missing.
I assume each Portal client will have distinct integration methods, most likely dependent on which programming language is used.
I think this point is very valid and I agree we should better document the JSON-RPC
👍
👍
That will probably be a future discussion but it will be interesting what ends up happening
I think execution clients will only opt to use Portal implementations written in their native language, so I see Fluffy being used with non-Nim ELs as hypothetical. If other ELs were to integrate Fluffy for performance they wouldn't use the JSON-RPC; they would use FFI bindings and interface with Fluffy directly, skipping the overhead. In this case I don't see why we would disable validation on these calls.
This would be a demo/proof of concept then. For a proof of concept, I don't see how the extra delay matters.
I think everyone can say this is generally good. Good UX which is easy to understand is a win
The example given is only true in ideal conditions. In practice network conditions are not ideal, which is why clients will opt for optimized/smarter RFC implementations which can't be tested. The only thing we can test from a raw RFC is that clients can find the content in ideal conditions; we can't create a more complex hive test which takes advantage of implementation-specific details. I don't think we should standardize the RFC implementation, as the diversity in implementations and strategies is good for the health of the network IMO.
I think this is a reasonable approach and is commonly done by EL/CL teams
👍 I agree we need to better document our JSON-RPC specification
I'm talking about a possible use case for the JSON-RPC API here. In Fluffy, the "native" call for this already exists.
I'm well aware of this. That was not my point. My point is that they didn't start with the SSZ protocol solution (which, by the way, is still a protocol solution, not fully integrated). That apparently was too big a step to take at once, for good reason?
I was just giving an example; of course it's hypothetical, as is anything else in terms of integration. I think having the (higher-level, good-UX) JSON-RPC (or any other protocol for that matter, e.g. REST with binary/SSZ) is useful to have, even if only for fast PoC'ing. But there is more: portal_bridges use it, it's the first access point for users interested in the network, etc. (Regarding Nimbus: I would also prefer the integration to be native, but there are other reasons why in nimbus/fluffy we want to start with the JSON-RPC version as a PoC; they are Nim specific and I won't bore you with them.) I also didn't state that I don't want validation on these calls, on the contrary. I stated that it should be specified, and that some of the lower-level calls might benefit, for testing purposes, from not always having validation; see the example I gave.
I don't think this creates any meaningful challenges in testing. We can get around those potential edge cases fairly easily. In practice, I haven't faced a challenge like this yet.
This is done a layer above in Trin as well.
I don't think a data availability solution is comparable to the complexity and security concerns of a consensus implementation. I agree we need better documentation, but I don't think this PR proposes anything not already possible to do with the JSON-RPC. I think the current RPC specification (minus documentation) is sufficient for a PoC.
I want to clarify that the functionality this PR proposes already exists in Trin; this is more a question of documenting expected behaviors and renaming JSON-RPC methods for better UX, which I fully agree with. I agree we should have a higher-level, good-UX JSON-RPC. I don't think we disagree on this at all, and I don't believe my complaints have ever been targeted at that. My complaints were more that there is no distinction between the proposed endpoint and what the existing one already does. So I think we should document the expected behavior and rename the existing methods instead of adding new ones.
I think we both agree on these 2 points, which are the main important ones? "trin_bridge" will eventually be built as an extension of Trin instead of using the JSON-RPC. Currently, we use custom endpoints for it.
I think building a PoC using the JSON-RPC is very valid. I don't think we should spec out special JSON-RPC methods for the sole purpose of building PoCs though.
I think some of this is confusion from when I was talking to @bhartnett
I think we should write our tests using real-world data. I understand that early on being able to quickly test things with fake data was useful, but I don't think there is a problem testing with real data nowadays. So maybe those "benefits" were helpful in the past, but now that we have more infrastructure built we can test things just as easily with real data.
I've pushed an update to remove the new endpoint and have renamed the portal_*RecursiveFindContent endpoints. Note that I also renamed the portal_*TraceRecursiveFindContent endpoints for consistency. I also updated the descriptions to be more detailed. Updating the descriptions of any other endpoints can be a separate PR. As a side note, I noticed that the state network doesn't have a portal_stateTraceRecursiveFindContent. Is there a particular reason for this?
It should be included. It was probably just missed when the endpoint was being added |
Looks good. In the future, maybe we should have a markdown file specifying the requirements of each endpoint like the engine_api has, as it would probably be easier to read when someone is implementing the endpoints. Of course, this can be done in a future PR.
jsonrpc/src/methods/history.json
Outdated
"name": "portal_historyRecursiveFindContent", | ||
"summary": "Look up a target content key in the network", | ||
"name": "portal_historyGetContent", | ||
"summary": "Get content from the local database if it exists, otherwise look up the target content key in the network. After fetching from the network the content is validated and stored in the local database before being returned.", |
"summary": "Get content from the local database if it exists, otherwise look up the target content key in the network. After fetching from the network the content is validated and stored in the local database before being returned.", | |
"summary": "Get content from the local database if it exists, otherwise look up the target content key in the network. After fetching from the network the content is validated and stored in the local database if storage criteria is met before being returned.", |
Maybe we should be a little more specific, as we won't store it if the storage criteria aren't met.
Yes good point, will update.
I'll add it in that case.
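As a quick illustration of what a call to the renamed endpoint looks like on the wire (the content key and content bytes below are made-up placeholders, and the exact result and error schemas are whatever jsonrpc/src/methods/history.json ends up defining):

```python
# Request: a single hex-encoded content key (placeholder value).
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "portal_historyGetContent",
    "params": ["0x00deadbeef"],
}

# Possible success response: validated content, stored locally if the
# storage criteria are met. Result shape per the spec file, shown simplified.
success = {"jsonrpc": "2.0", "id": 1, "result": "0x0badcafe"}

# Possible failure response: the content-not-found error discussed earlier
# (error code is a placeholder; the real value comes from PR #343).
failure = {
    "jsonrpc": "2.0",
    "id": 1,
    "error": {"code": -39000, "message": "Content not found"},
}
```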
… portal_*GetContent descriptions.
There has been no opposition to this PR in over a week, and I sent a message in the Discord about merging it with no opposition either. Since there are 3 approvals and no opposition for a prolonged period of time, I am going to merge the PR.
I recently started working on building a portal RPC client library to be used in Nimbus to fetch the history network data (headers, bodies and receipts) from Portal in preparation for EIP-4444. When implementing the code I realized that, as a user of the portal RPC API, it would be nice to have a higher-level endpoint that is able to look up the content in the local database first before trying to fetch the content from the network.
Without this new endpoint I had to first call portal_historyLocalContent, and then, if a not found error was returned, call portal_historyRecursiveFindContent to look up the content over the network. This works but doesn't store the fetched content in the local database unless I also call portal_historyStore after getting the content from the network lookup. Storing the content in the local database after fetching from the network is a good idea so that a second call for the same content can simply get the data from the local database (assuming that the data is within the node's radius).
Having to do multiple RPC calls isn't ideal and it would be helpful to have an endpoint that does the following: return the content from the local database if it exists, otherwise look it up on the network, validate it, store it locally (if the storage criteria are met), and then return it.
This PR introduces portal_historyGetContent, portal_stateGetContent and portal_beaconGetContent RPC endpoints, which should behave as described above.
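To make the difference concrete, here is a rough sketch of the two approaches, assuming a generic rpc(method, params) helper that raises on JSON-RPC errors (this helper and the simplified result shape are assumptions, not any client's actual API):

```python
from typing import Any, Callable

Rpc = Callable[[str, list], Any]   # stand-in for whatever JSON-RPC transport is used

def get_history_content_before(rpc: Rpc, content_key: str) -> Any:
    """The multi-call workflow described above (sketch)."""
    try:
        return rpc("portal_historyLocalContent", [content_key])
    except Exception:
        # Not in the local database: fetch it over the network...
        result = rpc("portal_historyRecursiveFindContent", [content_key])
        content = result["content"] if isinstance(result, dict) else result
        # ...and store it explicitly, otherwise it is not kept locally.
        rpc("portal_historyStore", [content_key, content])
        return content

def get_history_content_after(rpc: Rpc, content_key: str) -> Any:
    """The single call this PR introduces: local lookup, network fallback,
    validation and (radius-permitting) storage handled inside the node."""
    return rpc("portal_historyGetContent", [content_key])
```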