
Using your own compute workers

Adrien Pavão edited this page Nov 4, 2021 · 65 revisions

In these instructions, we guide you step-by-step to create your own "compute worker" (a server to which submissions of challenge participants are sent to be executed).

Hooking up a compute worker to a queue

Go to "Worker Queue Management":

Select "Add Queue":

Give the queue a name. Select whether you want it to be public or private (public queues can be used by everybody). For private queues, select the users you want to share them with. You can edit these settings later.

Get the queue key; you will need it in the next section. Put it in the .env configuration file as per the instructions below.

You can also set a hostname by adding a CODALAB_HOSTNAME env var.

Setting up a server as compute worker

To set up a server as a compute worker, we provide instructions for creating Ubuntu virtual machines (VMs), but you may also use your own machine:

Then ssh into your machine and run the following commands:

Step 1: Install Docker

  • Get docker (quick and dirty way; for a full clean installation see Docker Ubuntu instructions):

    $ curl https://get.docker.com | sudo sh

  • Add yourself to the docker group (so you don't need to run as root):

    $ sudo usermod -aG docker $USER

  • Log out of your server and log back in. Then check your installation:

    $ docker run hello-world

Step 2: Configuration file

  • Create and edit the configuration file .env:

    $ vim .env

Note: vim is an editor that ships with Ubuntu. If you would rather use emacs, first type sudo apt-get install emacs.

  • Add to .env the line BROKER_URL= and paste your worker queue key:

    BROKER_URL=pyamqp://6ad9ac58-88e3-4a22-9be7-6ed5126ef388:40546f9d-0f2e-4413-a6b2-a1d86cad2b30@localhost/37324ab2-ee78-4e8d-a6ee-6089a159d253

To get your key, see the previous section. KNOWN BUG: sometimes the server IP address or URL is replaced by localhost, as in the example above. If this happens, substitute localhost with the IP address or URL of your server.
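
One quick way to apply that substitution is with sed; the following is a sketch where `competitions.example.org` and the demo credentials are placeholders (it works on a demo file so it is safe to try; run the sed line against your real .env instead):

```shell
# Demo file reproducing the bug: the broker URL points at localhost
# (key:secret and vhost are placeholder credentials).
printf 'BROKER_URL=pyamqp://key:secret@localhost/vhost\n' > .env.demo

# Rewrite "@localhost/" to your server's real address (placeholder shown).
sed -i 's|@localhost/|@competitions.example.org/|' .env.demo

cat .env.demo
```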

  • Add BROKER_USE_SSL=True in .env.
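
Putting it together, a complete .env can look like this (the broker key is the example from above; `worker-1.example.org` is a placeholder for the optional hostname mentioned earlier):

```shell
# .env -- compute worker configuration
BROKER_URL=pyamqp://6ad9ac58-88e3-4a22-9be7-6ed5126ef388:40546f9d-0f2e-4413-a6b2-a1d86cad2b30@localhost/37324ab2-ee78-4e8d-a6ee-6089a159d253
BROKER_USE_SSL=True
# Optional: hostname reported for this worker (placeholder value)
CODALAB_HOSTNAME=worker-1.example.org
```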

Step 3: Start the worker

  • Get your worker to start computing (it will start listening to the queue via BROKER_URL and pick up jobs):

Note: this will make a /tmp/codalab directory

docker run \
    --env-file .env \
    --name compute_worker \
    -d \
    --restart unless-stopped \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /tmp/codalab:/tmp/codalab \
    codalab/competitions-v1-compute-worker:1.1.5
  • Get the ID of the docker process

    $ docker ps

  • Make a submission to your competition and check the logs to confirm it is working (since the worker was started with --name compute_worker, docker logs -f compute_worker also works):

    $ docker logs -f <DOCKER PS ID>

  • Should you need to change the queue your compute worker listens to, edit .env again, then run:

    $ docker kill <DOCKER PS ID>

  • Then re-run the docker run command above.
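
The edit-kill-restart cycle above can be wrapped in a small helper; a minimal sketch, where `restart_worker` is a hypothetical name and the container name and image tag are those from Step 3:

```shell
# Sketch of a restart helper: stop the old worker, then start a fresh
# one that picks up the freshly edited .env.
restart_worker() {
    docker kill compute_worker 2>/dev/null || true  # stop the old container
    docker rm compute_worker 2>/dev/null || true    # free up the name
    docker run \
        --env-file .env \
        --name compute_worker \
        -d \
        --restart unless-stopped \
        -v /var/run/docker.sock:/var/run/docker.sock \
        -v /tmp/codalab:/tmp/codalab \
        codalab/competitions-v1-compute-worker:1.1.5
}

# Usage, after editing .env:
#   restart_worker
```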

Remark: to use GPUs, go to the bottom of this guide.

Assigning a queue to a competition

Go to your competition page and click "Edit":

Select the queue that you have created:

Cleaning up periodically

It is recommended to clean up your Docker images and containers periodically:

  1. Run the following command:

     $ sudo crontab -e

  2. Add the following line:

     @daily docker system prune -af

Using GPU

Prerequisite

Install CUDA, the NVIDIA drivers, Docker, and nvidia-docker (system dependent).

Command to launch the worker

In order to use GPU compute workers, replace the command from Step 3 with this one:

sudo mkdir -p /tmp/codalab && nvidia-docker run \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /var/lib/nvidia-docker/nvidia-docker.sock:/var/lib/nvidia-docker/nvidia-docker.sock \
    -v /tmp/codalab:/tmp/codalab \
    -d \
    --name compute_worker \
    --env-file .env \
    --restart unless-stopped \
    --log-opt max-size=50m \
    --log-opt max-file=3 \
    codalab/competitions-v1-nvidia-worker:v1.5-compat
Side remark on GPU computation

The default competition Docker image is codalab/codalab-legacy:py3. In order to use GPUs, you should base your image on codalab/codalab-legacy:gpu, which contains useful dependencies for GPU usage. Here we are talking about the containers in which submissions are executed, not the containers of the workers themselves.
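
For example, a submission image built on the GPU base could start from a Dockerfile like this (a sketch; the pip packages are illustrative, not required):

```dockerfile
# Base image with GPU dependencies provided by CodaLab
FROM codalab/codalab-legacy:gpu

# Add any extra libraries your submissions need (illustrative examples)
RUN pip install --no-cache-dir torch torchvision
```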
