Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

better error message with semantic errors #35

Open
elv-gilles opened this issue Apr 17, 2024 · 2 comments
Open

better error message with semantic errors #35

elv-gilles opened this issue Apr 17, 2024 · 2 comments

Comments

@elv-gilles
Copy link

I'm taking as example a field network_id of type uint64

NetworkId  uint64 `json:"network_id,omitempty"` 

and the following values entered mistakenly by a user:

  • 123x
  • -123
  • "123:456"

The following errors are returned

with 'encoding/json':

  • 123x: invalid character 'x' after object key:value pair
  • -123: json: cannot unmarshal number -123 into Go struct field alias.qspaces of type uint64
  • "123:456": json: cannot unmarshal string into Go struct field alias.qspaces of type uint64

In the 2 first cases, there's at least a small chance that a user could correct himself.
In the third case, the user is returned an error that has no indication of where it originated (also a user does not necessarily know about Go types).

with 'json/v2':

  • 123x: json: cannot unmarshal Go value of type config.AppConfig: json: cannot unmarshal Go value of type config.QSpaceConfig: json: cannot unmarshal Go value of type config.EthereumQSpaceConfig: jsontext: missing character ',' after object or array value
  • -123: json: cannot unmarshal Go value of type config.AppConfig: json: cannot unmarshal Go value of type config.QSpaceConfig: json: cannot unmarshal Go value of type config.EthereumQSpaceConfig: json: cannot unmarshal JSON number into Go value of type uint64: cannot parse "-123" as unsigned integer: invalid syntax
  • "123:456": json: cannot unmarshal Go value of type config.AppConfig: json: cannot unmarshal Go value of type config.QSpaceConfig: json: cannot unmarshal Go value of type config.EthereumQSpaceConfig: json: cannot unmarshal JSON string into Go value of type uint64

The verbosity of the error message in increased, but the user is still returned a message that does not help finding what and where the problem is.

Test code is in config_errors/config_test.go.

why not returning the offending value and location of the error ?

Ideally, the error would also indicate the name of the field where the failure occurs and would become:

  • "field 'network_id', invalid value '123x' at offset 133
  • "field 'network_id', invalid value '-123' at offset 134, cannot parse "-123" as unsigned integer
  • "field 'network_id', invalid value "123:456" at offset 137, cannot unmarshal JSON string into value of type uint64

With the additional change that - if the json pointer (in SemanticErrror) is available - replace field 'network_id' with path '/x/y/network_id'.

Looking at the code :

  • the field JSONPointer of SemanticErrror is never set (although the decoder has it as StackPointer())
  • the offending value is always at hand and could be included in the error (maybe with a limit of length)
  • the offset of the error could be taken from the decoder and set to the ByteOffset field of SemanticErrror
    • and change the comment of ByteOffset
      • from: ByteOffset indicates that an error occurred after this byte offset.
      • to: ByteOffset indicates that an error occurred at this byte offset.
    • note that in the 123x case, the root error jsontext.SyntacticError has its offset set

additional question: why Hyrum-proof ?

What is the reason for returning an error message (pseudo) randomly prefixed in SemanticError ?

func (e *SemanticError) Error() string {
	var sb strings.Builder
	sb.WriteString(errorPrefix)

	// Hyrum-proof the error message by deliberately switching between
	// two equivalent renderings of the same error message.
	// The randomization is tied to the Hyrum-proofing already applied
	// on map iteration in Go.
	for phrase := range map[string]struct{}{"cannot": {}, "unable to": {}} {
		sb.WriteString(phrase)
		break // use whichever phrase we get in the first iteration
	}
	...

POC / Proposal

I made some changes - limited to the 3 above cases - in this commit 6c0af11 of my json-experiment fork.
With these changes these errors now show like this:

  • 123x: json: cannot unmarshal Go value of type config.EthereumQSpaceConfig within JSON value at "/qspaces/0/ethereum/network_id": jsontext: missing character ',' after object or array value at byte offset 133
  • -123: json: cannot unmarshal JSON number into Go value of type uint64 within JSON value at "/qspaces/0/ethereum/network_id" at byte offset 134: cannot parse "-123" as unsigned integer: invalid syntax
  • "123:456": json: cannot unmarshal JSON string into Go value of type uint64 within JSON value at "/qspaces/0/ethereum/network_id" at byte offset 139: invalid value: "123:456"

which, in my opinion - looks more useful though still a bit long.

Further removing Go value of type XX within JSON from the error string would make an even better error message:

  • 123x: json: cannot unmarshal value at "/qspaces/0/ethereum/network_id": jsontext: missing character ',' after object or array value at byte offset 133
  • -123: json: cannot unmarshal value at "/qspaces/0/ethereum/network_id" at byte offset 134: cannot parse "-123" as unsigned integer: invalid syntax
  • "123:456": json: cannot unmarshal value at "/qspaces/0/ethereum/network_id" at byte offset 139: invalid value: "123:456"

What do you think ?

@dsnet
Copy link
Collaborator

dsnet commented Apr 17, 2024

Thanks for your thoughts. It's been on my TODO list to revamp how errors are populated and rendered. The current status quo isn't reflective of where we would like error reporting to get. What you proposed is fairly similar to what we would like it to eventually become.

why Hyrum-proof ?

This is precisely to provide us the ability to improve the error message in the future with less probability of breaking people who were depending on the error message being of a particular format today. While maintaining the Go stdlib, it was unfortunately common to rollback a CL because it altered the exact text of some error message, and some piece of code was depending on error string parsing to affect program logic.

We can always remove the Hyrum proofing, but not add it back. Once the error message is more stable, we can consider removing the deliberate randomization.

@elv-gilles
Copy link
Author

ok, thanks for your answer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants