Nitro can't stop successfully: 'taking too long to stop' and after goes forward #2839

boogeroccam · 2024-12-17T15:03:02Z

Describe the bug

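Nitro does not shut down cleanly after receiving SIGTERM (kill -15). The log reports the signal as sigint, warns that it is taking too long to stop, and then the node keeps creating blocks.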

To Reproduce

Steps to reproduce the behavior:

docker-compose.yaml

version: '3.8'
services:
  arbitrum-node:
    image: 'offchainlabs/nitro-node:v3.2.1-d81324d'
    container_name: arbitrum-node
    ports:
      - "0.0.0.0:8547:8547"
      - "0.0.0.0:8546:8546"
      - "0.0.0.0:6070:6070"
    command:
      #- --init.prune=full
      #- --init.url=file:///home/user/.arbitrum/archive.tar
      - --parent-chain.connection.url=http://eth-node:8545
      - --parent-chain.blob-client.beacon-url=http://eth-node:5052
      - --chain.id=42161
      - --ws.addr=0.0.0.0
      - --ws.origins=*
      - --http.vhosts=*
      - --http.addr=0.0.0.0
      - --http.corsdomain=*
      - --metrics
      - --metrics-server.addr=0.0.0.0
      - --metrics-server.port=6070
    volumes:
      - /data/arbitrum:/home/user/.arbitrum
    restart: unless-stopped
    deploy:
      restart_policy:
        condition: on-failure
      update_config:
        delay: 10s
    stop_grace_period: 30s
    network_mode: host
    user: 1000:1000
    logging:
      driver: json-file
      options:
        max-size: 10m
        max-file: "10"

docker compose up
docker exec arbitrum-node kill -15 1
Wait for the node process to exit. After more than 30 seconds it logs the 'taking too long to stop' warning instead of exiting.

Expected behavior

The process should shut down cleanly and create a snapshot from which we can prune the database.

Additional context

I first downloaded a pruned snapshot from https://snapshot-explorer.arbitrum.io/ and started the node from it. After the database reached 2.9TB, I wanted to prune it. I read other issues about pruning (#2441, #2805) and decided to give Nitro a chance to terminate cleanly first.

The full log from the Nitro container is in the attached file arbitrium-logs.log. Excerpt:

INFO [12-17|13:58:37.697] shutting down because of sigint
INFO [12-17|13:58:37.697] HTTP server stopped                      endpoint=[::]:8547
INFO [12-17|13:58:37.697] HTTP server stopped                      endpoint=[::]:8548
INFO [12-17|13:58:37.697] delayed sequencer: context done          err="context canceled"
INFO [12-17|13:58:38.394] created block                            l2Block=285,722,652 l2BlockHash=5a33dd..b036e3
...
WARN [12-17|13:59:07.700] taking too long to stop                  name=arbnode.MessagePruner delay[s]=30.000
WARN [12-17|13:59:07.701] goroutine 1 [running]:
github.com/offchainlabs/nitro/util/stopwaiter.getAllStackTraces()
	/workspace/util/stopwaiter/stopwaiter.go:121 +0x3d
github.com/offchainlabs/nitro/util/stopwaiter.(*StopWaiterSafe).stopAndWaitImpl(0xc037e9e300, 0x6fc23ac00)
	/workspace/util/stopwaiter/stopwaiter.go:139 +0xe5
github.com/offchainlabs/nitro/util/stopwaiter.(*StopWaiterSafe).StopAndWait(...)
	/workspace/util/stopwaiter/stopwaiter.go:116
github.com/offchainlabs/nitro/util/stopwaiter.(*StopWaiter).StopAndWait(...)
	/workspace/util/stopwaiter/stopwaiter.go:324
github.com/offchainlabs/nitro/arbnode.(*Node).StopAndWait(0xc000764fc0)
	/workspace/arbnode/node.go:977 +0x19a
main.mainImpl.func14()
	/workspace/cmd/nitro/nitro.go:638 +0x17
main.mainImpl.func6()
	/workspace/cmd/nitro/nitro.go:434 +0x35
main.mainImpl()
	/workspace/cmd/nitro/nitro.go:675 +0x5172
main.main()
	/workspace/cmd/nitro/nitro.go:142 +0x13
...
goroutine 64031 [select]:
github.com/ethereum/go-ethereum/core/state.(*subfetcher).loop(0xc11aeda800)
	/workspace/go-ethereum/core/state/trie_prefetcher.go:319 +0x553
created by github.com/ethereum/go-ethereum/core/state.newSubfetcher in goroutine 187
	/workspace/go-ethereum/core/state/trie_prefetcher.go:246 +0x206

INFO [12-17|13:59:08.451] created block                            l2Block=285,722,771 l2BlockHash=fbbf31..d89917
INFO [12-17|13:59:09.452] created block                            l2Block=285,722,775 l2BlockHash=da8fcb..dd1751
INFO [12-17|13:59:10.452] created block                            l2Block=285,722,779 l2BlockHash=87f0ef..50c265
