Dial i/o timeouts while connecting #109
Comments
That error message does seem to indicate a network or server issue.
That is pretty old, but off the top of my head I don't recall any changes that would affect this.
I'm not absolutely sure but I think the
Thanks for the reply - I'll post an update if I find out anything interesting through testing.
@jackc I did some more investigation, and I'm pretty certain that the

However, in the end I've actually taken the approach of replacing the

```go
wrappedDial := config.DialFunc
config.DialFunc = func(ctx context.Context, network, addr string) (net.Conn, error) {
	var conn net.Conn
	var err error
	for i := 0; i < pgMaxDialAttempts; i++ {
		ok := func() bool {
			// We're manually enforcing a dial timeout here rather than relying on connect_timeout
			// in the connection string because the connect_timeout applies to the full connection
			// process, meaning that any dial retries would fail because the context has already expired.
			ctx, cancel := context.WithTimeout(ctx, time.Second*5)
			defer cancel()
			conn, err = wrappedDial(ctx, network, addr)
			return err == nil
		}()
		if ok {
			break
		}
	}
	return conn, err
}
```

That seems to have worked (in that an initial dial times out after 5 seconds, but a subsequent dial succeeds), although unfortunately since implementing it I've only seen one example of the failure, so it's difficult to be certain. I'm happy to close the issue if you want since I've got a workaround now, but just figured the info could be useful.
I'm periodically seeing connection failures when trying to connect to an Aurora Serverless instance. Most connection attempts are successful, but occasionally we get errors, making me think that there's some underlying network / database issue causing the problems. The error messages look something like this:
We're using v1.7.0 of pgconn and v4.9.0 of pgx. I know these aren't the latest versions, so we can definitely look at updating if there's anything that's likely to help with this issue.
The connection attempt times out after 60 seconds, which makes sense because of this line, and the error message is coming from here.
While investigating this, I noticed there's connection retry logic in the Go sql package, for example here. It automatically retries connecting if `driver.ErrBadConn` is returned. I guess what I'm wondering is: would it make sense to return `ErrBadConn` when a dial timeout happens? Obviously this doesn't solve the underlying issue, but it might mitigate the problem, assuming it's transient.

I'm happy to experiment with this, but I just wanted to ask first since I'm not mega familiar with Go SQL drivers.
Thanks in advance!