Add more comprehensive debugging docs
Signed-off-by: David Son <[email protected]>
sondavidb committed Sep 13, 2023
1 parent 1e00571 commit 3a5113f
Showing 1 changed file with 24 additions and 0 deletions.
24 changes: 24 additions & 0 deletions docs/debug.md
@@ -165,6 +165,30 @@ The snapshotter contains custom retry logic when fetching spans(data) from the r
* You can look for `retrying request` within the logs to determine the error and response returned from the remote registry.
* You can also check `operation_duration_remote_registry_get` metric to see how long it takes to complete `GET` from remote registry.

## Removing an image
Removing an image also removes its associated snapshots. Running `sudo nerdctl image rm [image tag]` should remove the snapshots before the image itself is deleted. To confirm the removal, check that the image no longer appears in the output of `sudo nerdctl image ls`.
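
The confirmation step can be scripted. The following is a minimal sketch, not part of the snapshotter tooling: `image_listed` is a hypothetical helper that reads one `repository:tag` entry per line from stdin and succeeds only if the given reference is present.

```bash
# Hypothetical helper: succeeds if the reference given as $1 appears as a
# whole line in the listing read from stdin (one "repository:tag" per line).
image_listed() {
  grep -Fxq -- "$1"
}
```

Assuming your `nerdctl` version accepts Docker-style `--format` templates, the listing could be produced with `sudo nerdctl image ls --format '{{.Repository}}:{{.Tag}}' | image_listed "[image tag]"`. A non-zero exit status then means the image is gone.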

## Restarting the snapshotter
While a graceful stop or restart of the process is preferred, sometimes it is not possible, or it is simply easier to kill the process. However, killing the process often causes issues that prevent the snapshotter from starting properly, particularly if a previously loaded snapshot is still present. For instance, if you pull from a repository that requires credentials and then stop the snapshotter, the snapshotter will fail to start properly if those credentials have expired by the time the daemon starts up.

Many errors related to loaded snapshots can be bypassed by setting `allow_invalid_mounts_on_restart=true` in `/etc/soci-snapshotter-grpc/config.toml`. Note that the snapshotter will likely load in a broken state, and you will be unable to perform common operations (such as pulling another image) until the currently loaded snapshot is removed.
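
Before restarting, it can help to verify that the setting is actually present in the config file. The sketch below uses a hypothetical helper, `invalid_mounts_allowed`, which only checks for an uncommented `allow_invalid_mounts_on_restart = true` line. It does not validate which TOML table the key belongs to, so treat a match as a hint and not as proof that the daemon will pick it up.

```bash
# Hypothetical helper: succeeds if the config file ($1, defaulting to the
# path named above) contains an uncommented line that sets
# allow_invalid_mounts_on_restart to true.
invalid_mounts_allowed() {
  grep -Eq '^[[:space:]]*allow_invalid_mounts_on_restart[[:space:]]*=[[:space:]]*true' \
    "${1:-/etc/soci-snapshotter-grpc/config.toml}"
}
```

For example, `invalid_mounts_allowed && echo "override enabled"` prints the message only when the line is found.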

## Creating a clean slate
If all else fails, starting from a clean slate can get you back to a working state. The following steps reset everything. (NOTE: This wipes your entire container store, so be sure to back up any important files first.)

```bash
sudo killall -2 soci-snapshotter-grpc # SIGINT allows for a more graceful cleanup; omit the -2 flag to send SIGTERM instead

# If necessary, unmount any remaining FUSE mounts, though a SIGINT to the daemon should handle that automatically

sudo rm -rf /var/lib/containerd
sudo rm -rf /var/lib/soci-snapshotter-grpc

sudo systemctl restart containerd

sudo soci-snapshotter-grpc
```
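
If the daemon was killed without a chance to clean up, leftover FUSE mounts can be located before deleting the directories. This is a minimal sketch: `list_fuse_mounts` is a hypothetical helper, and it assumes that the leftover mounts have a filesystem type beginning with `fuse` and live under `/var/lib/soci-snapshotter-grpc`.

```bash
# Hypothetical helper: print mount points whose filesystem type starts with
# "fuse" and whose path lies under the given root directory.
#   $1: mounts table to read (defaults to /proc/mounts)
#   $2: root directory to match (defaults to the snapshotter's data dir)
list_fuse_mounts() {
  awk -v root="${2:-/var/lib/soci-snapshotter-grpc}" '
    $3 ~ /^fuse/ && index($2, root "/") == 1 { print $2 }
  ' "${1:-/proc/mounts}"
}
```

Each path it prints can then be passed to `sudo umount`, for example with `list_fuse_mounts | while read -r m; do sudo umount "$m"; done`.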

# Debugging Tools

## CLI