
SSH server dies when creating lots of connections #1124

Open
NefixEstrada opened this issue Nov 12, 2024 · 3 comments

Comments

@NefixEstrada
Contributor

NefixEstrada commented Nov 12, 2024

Hello,

We've been using Warpgate happily, but we're facing an issue:

  1. We use Ansible
  2. We have an inventory with 14 hosts
  3. When running a (fairly long) playbook, at some point the SSH server dies. The web server still works, but the SSH server stops listening on its port (running ss -tulpn doesn't show it)
  4. Connections that are already open stay alive, but no new connections can be opened
  5. After a restart of the server, it works again

It seems to be related to the number of connections, or to a high rate of opening and closing connections.

How can we further debug this?
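One way to narrow this down is to record exactly when the listener disappears, so the moment can be matched against the Warpgate logs. The following is a minimal sketch, not part of Warpgate: the function names `port_is_listening` and `watch` are made up for illustration, and it only tests whether a plain TCP connection succeeds (it does not perform an SSH handshake).

```python
import socket
import time
from datetime import datetime, timezone
from typing import Optional


def port_is_listening(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def watch(host: str, port: int, interval: float = 5.0,
          max_checks: Optional[int] = None) -> Optional[str]:
    """Poll the port; return the UTC timestamp of the first failed check.

    Returns None if max_checks probes all succeed. With max_checks=None
    the loop runs until the port stops accepting connections.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        if not port_is_listening(host, port):
            return datetime.now(timezone.utc).isoformat()
        checks += 1
        time.sleep(interval)
    return None
```

Calling `watch("<warpgate host>", <ssh port>)` in a separate terminal while the playbook runs would return a timestamp as soon as the SSH port stops accepting connections. Note that each probe is itself a short-lived connection, so a long interval keeps the extra load small.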

Thanks!!

Néfix Estrada
IsardVDI

======

Looking further, this error appears many times in the logs:

ERROR warpgate_core::logging::database: Failed to store log entry error=Exec(SqlxError(Database(SqliteError { code: 14, message: "unable to open database file" })))
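SQLite result code 14 is SQLITE_CANTOPEN. One hypothesis worth ruling out (it is only a guess, nothing in this thread confirms it) is that the process runs out of file descriptors under load, which could prevent it from opening the database file. A minimal Linux-only sketch for comparing a process's open descriptors against its soft limit follows; the helper names are hypothetical and the PID has to be looked up separately.

```python
import os
import re
from typing import Optional, Tuple


def parse_max_open_files(limits_text: str) -> Optional[int]:
    """Extract the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    match = re.search(r"^Max open files\s+(\S+)\s+(\S+)", limits_text, re.MULTILINE)
    if match is None:
        return None
    soft = match.group(1)
    return None if soft == "unlimited" else int(soft)


def fd_usage(pid: int) -> Tuple[int, Optional[int]]:
    """Return (open_fd_count, soft_limit) for a process via /proc (Linux only).

    Reading another user's /proc/<pid>/fd usually requires root.
    """
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as limits_file:
        soft_limit = parse_max_open_files(limits_file.read())
    return open_fds, soft_limit
```

If `fd_usage(pid)` for the Warpgate process shows the count climbing toward the limit while the playbook runs, that would support the hypothesis; if the count stays low, the cause lies elsewhere.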
@codyro

codyro commented Nov 12, 2024

What OS are you running (e.g., RHEL with SELinux enabled)?

ERROR warpgate_core::logging::database: Failed to store log entry error=Exec(SqlxError(Database(SqliteError { code: 14, message: "unable to open database file" })))

Are you seeing this error after you restart the warpgate server?

@NefixEstrada
Contributor Author

@codyro We're running Ubuntu 22.04

This error appears only after a while, when the server is under load. Restarting the server fixes the issue temporarily, but it comes back when there's load again.

@codyro

codyro commented Nov 20, 2024

I was able to test this (albeit on RHEL 9) but couldn't replicate it. I did run into another issue that may be caused by something else; I need to debug it a bit more and will submit a separate issue if necessary.

Here is what I did:

Steps to reproduce

  1. Spin up 20 VMs
  2. Set up Ansible with some debug tasks/roles and an inventory for the test instances
  3. Run ansible-playbook with various fork counts (1, 5, 10, 20) and the default strategy [1]
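To generate connection churn without Ansible in the loop, something like the following sketch could be used. It is an assumption-laden simplification: `open_and_close` and `churn` are hypothetical helpers, the `workers` parameter only loosely mirrors Ansible's fork count, and the connections are plain TCP opens and closes with no SSH handshake or authentication, so it exercises only the accept path of the server.

```python
import socket
from concurrent.futures import ThreadPoolExecutor
from typing import Tuple


def open_and_close(host: str, port: int, timeout: float = 3.0) -> bool:
    """Open one TCP connection and close it immediately; True on success."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def churn(host: str, port: int, total: int = 200, workers: int = 20) -> Tuple[int, int]:
    """Open and close `total` connections using `workers` parallel threads.

    Returns (succeeded, failed).
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda _: open_and_close(host, port), range(total)))
    succeeded = sum(results)
    return succeeded, total - succeeded
```

Running `churn` repeatedly against the SSH port and watching whether the failure count starts rising would show whether raw connection volume alone is enough to trigger the problem, or whether authenticated sessions are needed.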

Playbook(s) used to test

I ran a small playbook to see if everything ran cleanly and quickly. That worked without issues. Since you noted that your playbooks are relatively long, I added another role. This was when I ran into various issues; however, none of them were what you experienced. warpgate continued to listen normally and accepted new connections without a problem.

---
- name: Test warpgate
  hosts: all
  gather_facts: true
  vars:
    ansible_user: "codyr:{{ inventory_hostname }}"
    ansible_host: warpgate.host.com
    ansible_port: 2222
  tasks:
    - name: Print hostname
      ansible.builtin.debug:
        var: ansible_hostname

    - name: Add sshkey to root user
      ansible.posix.authorized_key:
        user: root
        state: present
        key: 'ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIGM0TQe1trzJ4VEsKRhURyJJ7wsr/9UAY2JJdUhaPZfA  cody@test'
  ## Make the playbook run longer
  # roles:
  #   - role: geerlingguy.mysql

Which versions of Ansible and Warpgate are you running? I was also using the SQLite backend and did not see the errors you were receiving.

codyr@wp ~ [2]> podman exec -it systemd-warpgate warpgate --version
warpgate 0.11.0
(hawkansible-venv) codyr@Portia ~/D/g/t/s/w/ansible-warpgate (warpgate-testing)> ansible --version
ansible [core 2.17.5]
