Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve existing error handling #46

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

emhane
Copy link
Member

@emhane emhane commented May 16, 2023

Propagates existing errors from where they are created to the call that triggered them to allow for easy debugging.

Included Nick's test in this issue #44 as a test in stream.rs

Copy link
Member

@KolbyML KolbyML left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR looks good hopefully we can get Circle CI working, and look a little more at why that test is failing at that point when it should be failing during the transmission if anything.

src/stream.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@jacobkaufmann jacobkaufmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a good step toward improving error handling! however, I want to make some high-level comments rather than leaving many smaller, more specific comments.

while utp probably leans too heavily on the std::io::Error, a lot of the errors introduced here are quite granular. for example, whether or not a send on some internal channel failed or succeeded should not go to the user. that is an internal implementation detail, and we do not want to have to change the error enum if we change the internals. the user should see that something failed, and a low-level log message could indicate the specific issue.

another such example is an invalid packet from the remote peer. the user does not need a granular/specific error about what aspect of the packet was invalid, but only an error that indicates that the remote peer misbehaved according to the protocol. a log message would indicate the specific aspect of the invalid packet.

we should represent what we can with std::io::Error and be conservative in the new errors that we define (that are exposed to the user), so that those errors are meaningful to the user (e.g. have too many connections) and do not reflect internal implementation details. if other relevant internal failures are not visible, then we can add logs.

src/stream.rs Outdated Show resolved Hide resolved
src/stream.rs Outdated Show resolved Hide resolved
src/conn.rs Outdated Show resolved Hide resolved
@emhane
Copy link
Member Author

emhane commented May 22, 2023

a good step toward improving error handling! however, I want to make some high-level comments rather than leaving many smaller, more specific comments.

while utp probably leans too heavily on the std::io::Error, a lot of the errors introduced here are quite granular. for example, whether or not a send on some internal channel failed or succeeded should not go to the user. that is an internal implementation detail, and we do not want to have to change the error enum if we change the internals. the user should see that something failed, and a low-level log message could indicate the specific issue.

another such example is an invalid packet from the remote peer. the user does not need a granular/specific error about what aspect of the packet was invalid, but only an error that indicates that the remote peer misbehaved according to the protocol. a log message would indicate the specific aspect of the invalid packet.

we should represent what we can with std::io::Error and be conservative in the new errors that we define (that are exposed to the user), so that those errors are meaningful to the user (e.g. have too many connections) and do not reflect internal implementation details. if other relevant internal failures are not visible, then we can add logs.

I disagree with you, debugging this crate without knowing where in the code the error originates from is much more time consuming. As you can see, mostly the errors I added wrap the io error you chose. In a few places, I think you have chosen the wrong error based on what you wish the program to mean and not what it actually means, for example in the on_timeout function, actually what fails here is a counter that you have implemented and that important piece of information is abstracted away as you chose the io error TimedOut, which matches the function name well but doesn't assist in debugging.

@njgheorghita
Copy link
Contributor

Just leaving my 2 cents here...

the user should see that something failed, and a low-level log message could indicate the specific issue.

vs.

debugging this crate without knowing where in the code the error originates from is much more time consuming.

imo this library is not "production" / still in alpha. that being so, the focus should be in aiding developers not users. improving the debugging process > ux. once the library is production ready, these errors can be revisited to improve the ux. a tracking issue to revisit the errors could be helpful so that we can move forward with these improvements in the meanwhile, even at the temporary expense of ux.

@jacobkaufmann
Copy link
Collaborator

I disagree with you, debugging this crate without knowing where in the code the error originates from is much more time consuming.

I'm not sure which point you disagree with. and I don't understand why you need to define all of these new errors to see where the error occurs in utp. you could achieve the same thing with additional logging/tracing. it would make sense to define new errors if the consumer of the library were to take different actions based on the error variant. for instance, if there are too many existing connections, then the consumer may decide to manually close some idle stream etc. in your example, would knowing that the connection timed out because a counter reached some limit change how you handle the error? if not, and you will only end up logging the error in your application, then the better approach is to emit that information from utp rather than define errors at the granularity of the implementation details.

like I said above, I see that std::io::Error is not expressive enough to represent all of the errors that would be relevant to an application, but we don't need to represent all possible errors at the level of specificity you have here. if the problem you want to solve is more visibility into what's going on, then it would be better to start with adding logging/tracing.

improving the debugging process > ux.

I would reiterate that I think we can achieve your desired improvements to debugging ability in a simpler way. if you are only going to end up logging whatever errors we define here anyway, then we can emit those within the library code. we can introduce new errors where those errors could conceivably influence application behavior.

@emhane
Copy link
Member Author

emhane commented May 23, 2023

I disagree with you, debugging this crate without knowing where in the code the error originates from is much more time consuming.

I'm not sure which point you disagree with. and I don't understand why you need to define all of these new errors to see where the error occurs in utp. you could achieve the same thing with additional logging/tracing. it would make sense to define new errors if the consumer of the library were to take different actions based on the error variant. for instance, if there are too many existing connections, then the consumer may decide to manually close some idle stream etc. in your example, would knowing that the connection timed out because a counter reached some limit change how you handle the error? if not, and you will only end up logging the error in your application, then the better approach is to emit that information from utp rather than define errors at the granularity of the implementation details.

Who is this "consumer" you're referencing to @jacobkaufmann? Please back your argument with some facts so I can better understand your firm standpoint, which is against mine and @njgheorghita's standpoint. uTP is a transport protocol, the user of a transport protocol will always be a developer. Additional to that fact, a non-developer user wouldn't be able to make more sense of some io error than any other error, in general only devs use command line tools anyway.

like I said above, I see that std::io::Error is not expressive enough to represent all of the errors that would be relevant to an application, but we don't need to represent all possible errors at the level of specificity you have here. if the problem you want to solve is more visibility into what's going on, then it would be better to start with adding logging/tracing.

Language agnostic practice is such that any place where execution of the program stops, an error should be thrown. Anywhere an error occurs but the code should continue executing, the error is written to a log owned by the program. What are you basing your standpoint on that "we don't need to represent all possible errors at the level of specificity you have here"?

improving the debugging process > ux.

I would reiterate that I think we can achieve your desired improvements to debugging ability in a simpler way. if you are only going to end up logging whatever errors we define here anyway, then we can emit those within the library code. we can introduce new errors where those errors could conceivably influence application behavior.

Why is it up to the application using uTP to handle uTPs errors in a way that is meaningful to the consumer=developer?

A reminder that it is very seldom you propagate io::Errors from the underlying libraries which in turn are propagating these errors from interactions with software outside of this program (io operations). Actually in most places you are choosing which io::Error to emit where yourself, also on operations a developer doesn't associate with io operations, which to me is malpractice which, naturally, is confusing. I understand it as you wanting to make the ux seamless with the os here like udp or tcp, but at the end of the day uTP is a user space transport protocol not a kernel space transport protocol, so there errors are expected to be user space errors.

@jacobkaufmann
Copy link
Collaborator

when I say "user" or "consumer" I mean developer i.e. a user/consumer of the library. sorry for the confusion.

but I want to refocus the discussion here. I want to reiterate that I see that std::io::Error is not expressive enough, and I'm not opposed to defining a new error enum. now correct me if I'm wrong, but the goal of the changes are to improve visibility of internal failures for debugging purposes. assuming that is true, then it would be simpler to add logging/tracing to emit the specific errors within utp. if you want to be able to debug utp, then we can emit the specific logs at the source as opposed to leaking all internal implementation details only to be logged in the application layer. I want to reiterate again that this stance does not imply that we cannot (here or elsewhere) introduce a new error type.

@emhane
Copy link
Member Author

emhane commented May 25, 2023

when I say "user" or "consumer" I mean developer i.e. a user/consumer of the library. sorry for the confusion.

but I want to refocus the discussion here. I want to reiterate that I see that std::io::Error is not expressive enough, and I'm not opposed to defining a new error enum. now correct me if I'm wrong, but the goal of the changes are to improve visibility of internal failures for debugging purposes. assuming that is true, then it would be simpler to add logging/tracing to emit the specific errors within utp. if you want to be able to debug utp, then we can emit the specific logs at the source as opposed to leaking all internal implementation details only to be logged in the application layer. I want to reiterate again that this stance does not imply that we cannot (here or elsewhere) introduce a new error type.

When the program panics, what does it matter how verbose the error is that is thrown? The program execution stops, I think that is what bothers the developer most. I would even say that enabling log for level 'error' shouldn't show these errors that make the program panic, they should only show the errors that don't make the program panic.

I am debugging the program now, and I don't think theio::Errors you have manually chosen to emit are helpful.

@KolbyML
Copy link
Member

KolbyML commented May 25, 2023

I agree with jacob. The errors this library returns are meant for the user expecting the library to be bug free.

As demostrated by me finding the overflow bug in #60 in the expainer I show how you can get a Timeout or NotConnected error, but the underlying bug is the same. Error handlying won't help us fix a buggy library debug statements and breakpoints will.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants