Improve existing error handling #46

emhane · 2023-05-16T19:10:02Z

Propagates existing errors from where they are created to the call that triggered them to allow for easy debugging.

Included Nick's test in this issue #44 as a test in stream.rs

KolbyML

PR looks good hopefully we can get Circle CI working, and look a little more at why that test is failing at that point when it should be failing during the transmission if anything.

src/stream.rs

jacobkaufmann

a good step toward improving error handling! however, I want to make some high-level comments rather than leaving many smaller, more specific comments.

while utp probably leans too heavily on the std::io::Error, a lot of the errors introduced here are quite granular. for example, whether or not a send on some internal channel failed or succeeded should not go to the user. that is an internal implementation detail, and we do not want to have to change the error enum if we change the internals. the user should see that something failed, and a low-level log message could indicate the specific issue.

another such example is an invalid packet from the remote peer. the user does not need a granular/specific error about what aspect of the packet was invalid, but only an error that indicates that the remote peer misbehaved according to the protocol. a log message would indicate the specific aspect of the invalid packet.

we should represent what we can with std::io::Error and be conservative in the new errors that we define (that are exposed to the user), so that those errors are meaningful to the user (e.g. have too many connections) and do not reflect internal implementation details. if other relevant internal failures are not visible, then we can add logs.

src/stream.rs

src/conn.rs

emhane · 2023-05-22T07:29:46Z

a good step toward improving error handling! however, I want to make some high-level comments rather than leaving many smaller, more specific comments.

while utp probably leans too heavily on the std::io::Error, a lot of the errors introduced here are quite granular. for example, whether or not a send on some internal channel failed or succeeded should not go to the user. that is an internal implementation detail, and we do not want to have to change the error enum if we change the internals. the user should see that something failed, and a low-level log message could indicate the specific issue.

another such example is an invalid packet from the remote peer. the user does not need a granular/specific error about what aspect of the packet was invalid, but only an error that indicates that the remote peer misbehaved according to the protocol. a log message would indicate the specific aspect of the invalid packet.

we should represent what we can with std::io::Error and be conservative in the new errors that we define (that are exposed to the user), so that those errors are meaningful to the user (e.g. have too many connections) and do not reflect internal implementation details. if other relevant internal failures are not visible, then we can add logs.

I disagree with you, debugging this crate without knowing where in the code the error originates from is much more time consuming. As you can see, mostly the errors I added wrap the io error you chose. In a few places, I think you have chosen the wrong error based on what you wish the program to mean and not what it actually means, for example in the on_timeout function, actually what fails here is a counter that you have implemented and that important piece of information is abstracted away as you chose the io error TimedOut, which matches the function name well but doesn't assist in debugging.

njgheorghita · 2023-05-22T13:20:16Z

Just leaving my 2 cents here...

the user should see that something failed, and a low-level log message could indicate the specific issue.

vs.

debugging this crate without knowing where in the code the error originates from is much more time consuming.

imo this library is not "production" / still in alpha. that being so, the focus should be in aiding developers not users. improving the debugging process > ux. once the library is production ready, these errors can be revisited to improve the ux. a tracking issue to revisit the errors could be helpful so that we can move forward with these improvements in the meanwhile, even at the temporary expense of ux.

jacobkaufmann · 2023-05-23T04:25:06Z

I disagree with you, debugging this crate without knowing where in the code the error originates from is much more time consuming.

I'm not sure which point you disagree with. and I don't understand why you need to define all of these new errors to see where the error occurs in utp. you could achieve the same thing with additional logging/tracing. it would make sense to define new errors if the consumer of the library were to take different actions based on the error variant. for instance, if there are too many existing connections, then the consumer may decide to manually close some idle stream etc. in your example, would knowing that the connection timed out because a counter reached some limit change how you handle the error? if not, and you will only end up logging the error in your application, then the better approach is to emit that information from utp rather than define errors at the granularity of the implementation details.

like I said above, I see that std::io::Error is not expressive enough to represent all of the errors that would be relevant to an application, but we don't need to represent all possible errors at the level of specificity you have here. if the problem you want to solve is more visibility into what's going on, then it would be better to start with adding logging/tracing.

improving the debugging process > ux.

I would reiterate that I think we can achieve your desired improvements to debugging ability in a simpler way. if you are only going to end up logging whatever errors we define here anyway, then we can emit those within the library code. we can introduce new errors where those errors could conceivably influence application behavior.

emhane · 2023-05-23T06:43:34Z

I disagree with you, debugging this crate without knowing where in the code the error originates from is much more time consuming.

I'm not sure which point you disagree with. and I don't understand why you need to define all of these new errors to see where the error occurs in utp. you could achieve the same thing with additional logging/tracing. it would make sense to define new errors if the consumer of the library were to take different actions based on the error variant. for instance, if there are too many existing connections, then the consumer may decide to manually close some idle stream etc. in your example, would knowing that the connection timed out because a counter reached some limit change how you handle the error? if not, and you will only end up logging the error in your application, then the better approach is to emit that information from utp rather than define errors at the granularity of the implementation details.

Who is this "consumer" you're referencing to @jacobkaufmann? Please back your argument with some facts so I can better understand your firm standpoint, which is against mine and @njgheorghita's standpoint. uTP is a transport protocol, the user of a transport protocol will always be a developer. Additional to that fact, a non-developer user wouldn't be able to make more sense of some io error than any other error, in general only devs use command line tools anyway.

like I said above, I see that std::io::Error is not expressive enough to represent all of the errors that would be relevant to an application, but we don't need to represent all possible errors at the level of specificity you have here. if the problem you want to solve is more visibility into what's going on, then it would be better to start with adding logging/tracing.

Language agnostic practice is such that any place where execution of the program stops, an error should be thrown. Anywhere an error occurs but the code should continue executing, the error is written to a log owned by the program. What are you basing your standpoint on that "we don't need to represent all possible errors at the level of specificity you have here"?

improving the debugging process > ux.

I would reiterate that I think we can achieve your desired improvements to debugging ability in a simpler way. if you are only going to end up logging whatever errors we define here anyway, then we can emit those within the library code. we can introduce new errors where those errors could conceivably influence application behavior.

Why is it up to the application using uTP to handle uTPs errors in a way that is meaningful to the consumer=developer?

A reminder that it is very seldom you propagate io::Errors from the underlying libraries which in turn are propagating these errors from interactions with software outside of this program (io operations). Actually in most places you are choosing which io::Error to emit where yourself, also on operations a developer doesn't associate with io operations, which to me is malpractice which, naturally, is confusing. I understand it as you wanting to make the ux seamless with the os here like udp or tcp, but at the end of the day uTP is a user space transport protocol not a kernel space transport protocol, so there errors are expected to be user space errors.

jacobkaufmann · 2023-05-23T19:06:14Z

when I say "user" or "consumer" I mean developer i.e. a user/consumer of the library. sorry for the confusion.

but I want to refocus the discussion here. I want to reiterate that I see that std::io::Error is not expressive enough, and I'm not opposed to defining a new error enum. now correct me if I'm wrong, but the goal of the changes are to improve visibility of internal failures for debugging purposes. assuming that is true, then it would be simpler to add logging/tracing to emit the specific errors within utp. if you want to be able to debug utp, then we can emit the specific logs at the source as opposed to leaking all internal implementation details only to be logged in the application layer. I want to reiterate again that this stance does not imply that we cannot (here or elsewhere) introduce a new error type.

emhane · 2023-05-25T14:07:04Z

when I say "user" or "consumer" I mean developer i.e. a user/consumer of the library. sorry for the confusion.

but I want to refocus the discussion here. I want to reiterate that I see that std::io::Error is not expressive enough, and I'm not opposed to defining a new error enum. now correct me if I'm wrong, but the goal of the changes are to improve visibility of internal failures for debugging purposes. assuming that is true, then it would be simpler to add logging/tracing to emit the specific errors within utp. if you want to be able to debug utp, then we can emit the specific logs at the source as opposed to leaking all internal implementation details only to be logged in the application layer. I want to reiterate again that this stance does not imply that we cannot (here or elsewhere) introduce a new error type.

When the program panics, what does it matter how verbose the error is that is thrown? The program execution stops, I think that is what bothers the developer most. I would even say that enabling log for level 'error' shouldn't show these errors that make the program panic, they should only show the errors that don't make the program panic.

I am debugging the program now, and I don't think theio::Errors you have manually chosen to emit are helpful.

KolbyML · 2023-05-25T21:16:16Z

I agree with jacob. The errors this library returns are meant for the user expecting the library to be bug free.

As demostrated by me finding the overflow bug in #60 in the expainer I show how you can get a Timeout or NotConnected error, but the underlying bug is the same. Error handlying won't help us fix a buggy library debug statements and breakpoints will.

Emilia Hane added 2 commits May 16, 2023 21:04

Improve error handling

635f2af

fixup! Improve error handling

4c3a2e9

This was referenced May 17, 2023

sub-protocol data transmission ethereum/trin#746

Closed

debug utp ethereum/trin#747

Closed

KolbyML reviewed May 19, 2023

View reviewed changes

src/stream.rs Outdated Show resolved Hide resolved

emhane mentioned this pull request May 20, 2023

Bug: stream needs to be spawned in new thread #47

Closed

jacobkaufmann reviewed May 22, 2023

View reviewed changes

src/stream.rs Outdated Show resolved Hide resolved

src/stream.rs Outdated Show resolved Hide resolved

src/conn.rs Outdated Show resolved Hide resolved

Emilia Hane added 2 commits May 22, 2023 10:07

Restore symbol name

8c2c817

Remove out of pr scope test

32c9956

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve existing error handling #46

Improve existing error handling #46

emhane commented May 16, 2023

KolbyML left a comment

jacobkaufmann left a comment

emhane commented May 22, 2023

njgheorghita commented May 22, 2023

jacobkaufmann commented May 23, 2023

emhane commented May 23, 2023

jacobkaufmann commented May 23, 2023

emhane commented May 25, 2023

KolbyML commented May 25, 2023

Improve existing error handling #46

Are you sure you want to change the base?

Improve existing error handling #46

Conversation

emhane commented May 16, 2023

KolbyML left a comment

Choose a reason for hiding this comment

jacobkaufmann left a comment

Choose a reason for hiding this comment

emhane commented May 22, 2023

njgheorghita commented May 22, 2023

jacobkaufmann commented May 23, 2023

emhane commented May 23, 2023

jacobkaufmann commented May 23, 2023

emhane commented May 25, 2023

KolbyML commented May 25, 2023