Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

driver getting panicked with index out of range #574

Open
jkoleti-uptycs opened this issue Mar 13, 2024 · 9 comments
Open

driver getting panicked with index out of range #574

jkoleti-uptycs opened this issue Mar 13, 2024 · 9 comments

Comments

@jkoleti-uptycs
Copy link

jkoleti-uptycs commented Mar 13, 2024

panic: runtime error: index out of range [1] with length 0

goroutine 199034 [running]:
encoding/binary.bigEndian.PutUint16(...)
	/usr/local/go/src/encoding/binary/binary.go:144
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*chunker).send(0xc004441c20, {0x2b65db0, 0x4866120}, {0x2b55f80?, 0xc00200c928})
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/chunker.go:72 +0x39e
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*outgoing).send(0xc004441c20, {0x2b65db0, 0x4866120}, {0x2b55f80, 0xc00200c928})
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/outgoing.go:260 +0x55
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*messageQueue).send(...)
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/message_queue.go:134
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt4).Close(0xc0032aad00, {0x2b65db0, 0x4866120})
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/bolt4.go:925 +0x1a5
created by github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*server).removeIdleOlderThan in goroutine 199032
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/pool/server.go:187 +0x465

Our Database setup

  • Database: memgraph v2.8 in single instance mode
  • Driver version: neo4j-go-driver v5.10.0 (tried with neo4j-go-driver v5.18, still seeing this issue)
  • Operating system: ubuntu 20.04

We are using the neo4j go driver for connecting to our memgraph database with default config.
One of our applications is getting panicked while running cypher queries.
So built our application with the -race flag enabled and below is the stack trace for a panic.

==================
WARNING: DATA RACE
Write at 0x00c004441c50 by goroutine 198988:
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*chunker).send()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/chunker.go:107 +0x2ca
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*outgoing).send()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/outgoing.go:260 +0x54
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*messageQueue).send()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/message_queue.go:134 +0x1a4
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt4).Close()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/bolt4.go:925 +0x125
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*Pool).unregUnlocked.func1.1()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/pool/pool.go:391 +0x61

Previous write at 0x00c004441c50 by goroutine 198992:
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*chunker).beginMessage()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/chunker.go:47 +0x1ec
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*outgoing).begin()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/outgoing.go:45 +0xd6
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*outgoing).appendGoodbye()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/outgoing.go:244 +0xd5
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*messageQueue).appendGoodbye()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/message_queue.go:129 +0x124
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt4).Close()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/bolt4.go:924 +0x108
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*server).removeIdleOlderThan.func1()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/pool/server.go:187 +0x61

Goroutine 198988 (running) created at:
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*Pool).unregUnlocked.func1()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/pool/pool.go:391 +0x104
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:477 +0x30
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*Pool).unreg()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/pool/pool.go:385 +0x18f
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*Pool).Return()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/pool/pool.go:460 +0x4e4
  github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).executeTransactionFunction.func1()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/session_with_context.go:441 +0x85
  runtime.deferreturn()
      /usr/local/go/src/runtime/panic.go:477 +0x30
  github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).runRetriable()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/session_with_context.go:416 +0x5ed
  github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).ExecuteWrite()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/session_with_context.go:365 +0x86
  github.com/neo4j/neo4j-go-driver/v5/neo4j.SessionWithContext.ExecuteWrite-fm()
      <autogenerated>:1 +0xa7
  github.com/prahaladd/gograph/neo.(*Neo4jConnection).ExecuteQuery()
      /root/go/pkg/mod/github.com/prahaladd/[email protected]/neo/executor.go:139 +0x536
  ...
  ...
  ...

Goroutine 198992 (running) created at:
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*server).removeIdleOlderThan()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/pool/server.go:187 +0x464
  github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*Pool).CleanUp()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/pool/pool.go:146 +0x313
  github.com/neo4j/neo4j-go-driver/v5/neo4j.(*sessionWithContext).Close.func1()
      /root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/session_with_context.go:655 +0x77
==================
panic: runtime error: index out of range [1] with length 0

goroutine 199034 [running]:
encoding/binary.bigEndian.PutUint16(...)
	/usr/local/go/src/encoding/binary/binary.go:144
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*chunker).send(0xc004441c20, {0x2b65db0, 0x4866120}, {0x2b55f80?, 0xc00200c928})
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/chunker.go:72 +0x39e
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*outgoing).send(0xc004441c20, {0x2b65db0, 0x4866120}, {0x2b55f80, 0xc00200c928})
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/outgoing.go:260 +0x55
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*messageQueue).send(...)
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/message_queue.go:134
github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/bolt.(*bolt4).Close(0xc0032aad00, {0x2b65db0, 0x4866120})
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/bolt/bolt4.go:925 +0x1a5
created by github.com/neo4j/neo4j-go-driver/v5/neo4j/internal/pool.(*server).removeIdleOlderThan in goroutine 199032
	/root/go/pkg/mod/github.com/neo4j/neo4j-go-driver/[email protected]/neo4j/internal/pool/server.go:187 +0x465

This issue is happening randomly, the occurrence of this issue is not predictable.

Saw a similar issue that was closed here - #525

@kdescoteaux-uptycs
Copy link

Note that this race happens when the ExecuteWrite pool Return invokes unreg() (because connection is old?) and this launches a go routine to Close the connection which cannot even begin to run until after the connection put in the Idle list.
When ExecuteWrite returns the session could then be closed by the caller, which launches a Cleanup, which will close the old connections in the Idle List in more go routines.
It is not safe to leave the Close to run asynchronously from the main flow of ExecuteWrite unless there is a lock in Close to prevent two simultaneous Closes from running.

@StephenCathcart
Copy link
Contributor

Hey there!

Thanks for sharing the stack trace and your observations, it's very helpful. I've bumped up the priority on this one because it looks like it might be tied to the previous issue #525 and we now have a bit more info to work with. I'll keep you in the loop with any updates or breakthroughs.

@aniketkdm
Copy link

Is there any update on this. We are seeing the same / similar behavior.

@StephenCathcart
Copy link
Contributor

Hi @aniketkdm, are you seeing this problem using a Neo4j server, and if so which version? I'm just asking as the original issue was around Memgraph.

@aniketkdm
Copy link

Hello @StephenCathcart, yes we are seeing same / similar problem. We are using Neo4j aura through GCP and when we are running heavy load (reads and writes) we get %!v(PANIC=Format method: runtime error: invalid memory address or nil pointer dereference) randomly. We have tried adding recovery functions in our code but the panics are not getting recovered in our function. We are using github.com/neo4j/neo4j-go-driver/v5 v5.18.0. Even with the recovery functions, we are not getting stacktrace for these panics.

@justinwalz
Copy link

To chime in, we're also seeing this issue

  • server version neo4j:5.19.0-enterprise (three kubernetes statefulsets on EKS)
  • client version github.com/neo4j/neo4j-go-driver/v5 v5.20.0

@StephenCathcart
Copy link
Contributor

Thanks for the extra information @aniketkdm and @justinwalz that narrows things down quite a bit.

@adam-bedrock
Copy link

Hey was this ever resolved?

@StephenCathcart
Copy link
Contributor

StephenCathcart commented Nov 18, 2024

Hey @adam-bedrock not yet, unfortunately. We're able to reproduce the error, but simply locking the close function doesn't fix the issue - it needs a bit of a refactor. I'll keep you posted when we're able to release a fix. Thanks for your patience.

Update: #610 should fix the race condition we're seeing, it'll be in the next release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants