Add support for GPU acceleration in Windows (only NVIDIA validated) #476

dpascualhe · 2024-12-15T18:49:51Z

We have been able to achieve GPU acceleration for NVIDIA GPUs in Windows when launching from within WSL2. The user needs to have a valid installation of WSL2 + CUDA and Docker Desktop, and the Docker container must be launched from within the WSL2 terminal. Some extra flags are required in the docker run command:

docker run -it --gpus all -v /usr/lib/wsl:/usr/lib/wsl -e LD_LIBRARY_PATH=/usr/lib/wsl/lib --device /dev/dri -p 7164:7164 -p 6080:6080 -p 1108:1108 -p 7163:7163 jderobot/robotics-academy:latest

Output from the script:

--- GPU acceleration info ---
GPUs found:
        24d7:00:00.0 3D controller [0302]: Microsoft Corporation Basic Render Driver [1414:008e]
INFO: GPU candidates found at /dev/dri/card0 (keeping /dev/dri/card0).
INFO: 'nvidia-smi' available. Most likely an NVIDIA GPU in WSL.
-----------------------------

Oddly enough, all GPUs seem to be visible within WSL, but they are disguised as Microsoft devices, so the actual vendor information can't be accessed. We'll have to settle for the 'Microsoft' vendor for now. The new set_dri_name.sh adds Microsoft as a last-resort vendor, checks whether any card is available in /dev/dri, and keeps the first one. It also checks whether nvidia-smi can be run from within the container, which would mean that the selected GPU is likely to be NVIDIA. In dual-GPU systems I have not been able to access Intel GPUs, so further testing is required in that regard.
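For readers who have not opened the diff, the selection steps described above can be sketched roughly as follows. This is an illustration only, not the actual set_dri_name.sh: the function names (first_dri_card, nvidia_smi_works, report_gpu) and the optional directory and command arguments are assumptions, added so the logic can be tried without a real GPU.

```shell
# Sketch only: NOT the real set_dri_name.sh from this PR.
# All function names and optional arguments below are hypothetical.

# Print the first card* node found in a DRI directory (default /dev/dri).
first_dri_card() {
    local dri_dir card
    dri_dir="${1:-/dev/dri}"
    for card in "$dri_dir"/card*; do
        # The glob stays literal when nothing matches, so test existence.
        if [ -e "$card" ]; then
            printf '%s\n' "$card"
            return 0
        fi
    done
    return 1
}

# Succeed when the given command (default nvidia-smi) exists and runs.
nvidia_smi_works() {
    local smi
    smi="${1:-nvidia-smi}"
    command -v "$smi" >/dev/null 2>&1 && "$smi" >/dev/null 2>&1
}

# Report the card that would be kept, plus the nvidia-smi hint.
report_gpu() {
    local dri_dir smi card
    dri_dir="${1:-/dev/dri}"
    smi="${2:-nvidia-smi}"
    if card="$(first_dri_card "$dri_dir")"; then
        echo "INFO: GPU candidate found, keeping $card"
    else
        echo "WARN: no card found in $dri_dir"
        return 1
    fi
    # nvidia-smi is only a hint: the vendor itself stays hidden in WSL.
    if nvidia_smi_works "$smi"; then
        echo "INFO: 'nvidia-smi' available. Most likely an NVIDIA GPU in WSL."
    fi
}
```

Calling report_gpu with no arguments inspects /dev/dri and the real nvidia-smi; passing a directory and a command name makes it possible to dry-run the logic on a machine without a GPU.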

javizqh · 2024-12-15T21:45:15Z

Sadly, I cannot test this in Windows. If someone else can test it, it would be appreciated.

dduro2020 · 2024-12-16T13:04:38Z

I have the same problem: I can't test it in Windows.

dpascualhe · 2024-12-16T13:15:15Z

Maybe @codezerro can help?

codezerro · 2024-12-16T15:12:27Z

@dpascualhe how can I help you?

dpascualhe · 2024-12-17T18:07:15Z

@codezerro we need to validate that the new version of the introspection script for GPU selection in this PR works as expected in Windows environments. Given that it is just a single file, the easiest way for me to test it is to simply run the latest RoboticsBackend container, copy the updated file, and then manually launch the entrypoint.

Instructions (all of them must be run from a WSL2 environment with working CUDA drivers; check that you can run nvidia-smi from within the WSL2 environment first):

Start database:

docker run --hostname my-postgres --name academy_db -d -e POSTGRES_DB=academy_db -e POSTGRES_USER=user-dev -e POSTGRES_PASSWORD=robotics-academy-dev -e POSTGRES_PORT=5432 -p 5432:5432 jderobot/robotics-database:latest

Once finished, start RoboticsAcademy, but overriding the entrypoint so that we can get access to the container:

docker run --rm -it $(nvidia-smi >/dev/null 2>&1 && echo "--gpus all" || echo "") -v /usr/lib/wsl:/usr/lib/wsl -e LD_LIBRARY_PATH=/usr/lib/wsl/lib --device /dev/dri -p 6080:6080 -p 1108:1108 -p 7163:7163 -p 7164:7164 --link academy_db --entrypoint /bin/bash jderobot/robotics-academy:latest
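The inline $(nvidia-smi ... && echo "--gpus all" || echo "") in the command above expands to "--gpus all" only when nvidia-smi runs successfully, and to nothing otherwise, so the same command also starts on a machine without a working NVIDIA driver. The same idea written as a small function, as a sketch (gpu_flags is a hypothetical name, and the optional argument exists only so the behaviour can be tried with a stand-in command):

```shell
# Sketch: print "--gpus all" when nvidia-smi (or the command passed as $1)
# exists and exits successfully; print nothing otherwise.
# gpu_flags is a hypothetical helper, not part of RoboticsAcademy.
gpu_flags() {
    local smi
    smi="${1:-nvidia-smi}"
    if command -v "$smi" >/dev/null 2>&1 && "$smi" >/dev/null 2>&1; then
        echo "--gpus all"
    fi
}

# Show what would be spliced into the docker run command on this machine.
echo "GPU flags: '$(gpu_flags)'"
```

In the docker run line, $(gpu_flags) would be left unquoted, as in the original command, so that the two words are passed as separate arguments.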

Copy the updated set_dri_name.sh to the container:

docker cp set_dri_name.sh <robotics backend container id>:set_dri_name.sh

From within the docker container, run the entrypoint:

./entrypoint.sh

Things we need to check:

The GPU info logged by the script (first traces when running the entrypoint).
Access an exercise with Gazebo (e.g., follow line) and check the GPU status (top right).
Run nvidia-smi from another terminal and check that gzserver process is listed (while running follow line).
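The last check can be scripted. A minimal sketch, assuming the process name shows up in the plain nvidia-smi output; check_gpu_process is a hypothetical helper name, and the optional second argument only exists so the check can be tried against a stand-in command:

```shell
# Sketch: succeed when a process name appears in the output of nvidia-smi.
# Usage: check_gpu_process NAME [SMI_COMMAND]
# check_gpu_process is a hypothetical helper, not part of RoboticsAcademy.
check_gpu_process() {
    local name smi
    name="$1"
    smi="${2:-nvidia-smi}"
    # The pipeline status is grep's: it fails when the name is absent
    # or when the command produced no output at all.
    if "$smi" 2>/dev/null | grep -q -- "$name"; then
        echo "OK: $name is listed by $smi"
    else
        echo "FAIL: $name is not listed by $smi"
        return 1
    fi
}
```

While the follow line exercise is running, check_gpu_process gzserver from a second WSL2 terminal should report OK if the simulation is using the GPU.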

(@javizqh @dduro2020 maybe this is a weird process? let me know if you think there's a more straightforward approach)

javizqh · 2024-12-17T20:05:56Z

If you are not building a RADI to test, I cannot think of an easier way to test it

codezerro · 2024-12-19T05:10:05Z

@dpascualhe give me some time.

codezerro · 2024-12-22T04:44:25Z

@dpascualhe first attempt was not very good. I faced some technical issues. I'll update you when I get some results.

dpascualhe · 2024-12-23T11:19:32Z

> @dpascualhe first attempt was not very good. I faced some technical issues. I'll update you when I get some results.

Issues due to the WSL2+CUDA+Docker environment or RoboticsAcademy?

codezerro · 2024-12-25T07:56:21Z

@dpascualhe I did it successfully, but some of the tasks are unclear to me. You can use my system.

dpascualhe · 2025-01-03T20:52:57Z

From @codezerro's PC (Windows 11 + CUDA 12.7 + WSL with Ubuntu 24.04), after following the instructions above:

gzserver and gzclient appear as processes when running nvidia-smi command.
MICROSOFT appears as GPU vendor.
RT Factor goes up to 1.
GPU introspection script output:

-- GPU acceleration info ---
GPUs found:
5dc0:00:00.0 3D controller [0302]: Microsoft Corporation Device [1414:008e]
INFO: GPU candidates found at /dev/dri/card0 (keeping /dev/dri/card0).
INFO: 'nvidia-smi' available. Most likely an NVIDIA GPU in WSL.
GPU selected:
5dc0:00:00.0 3D controller [0302]: Microsoft Corporation Device [1414:008e]
DRI_VENDOR: microsoft
DRI_NAME: card0
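The [1414:008e] pair in the log is the PCI vendor:device ID, and 1414 is Microsoft's PCI vendor ID, which is why the vendor resolves to microsoft even on an NVIDIA machine. As an illustration of reading the vendor from such a line, here is a sketch that maps the numeric ID to a name. It is an assumption that matching on the numeric ID is a reasonable approach; the PR's script may well match on the vendor string instead, and vendor_from_lspci_line is a hypothetical name.

```shell
# Sketch: map the trailing [vendor:device] ID of an `lspci -nn` line to a
# lowercase vendor name. Hypothetical helper, not taken from set_dri_name.sh.
vendor_from_lspci_line() {
    local id
    # Keep the vendor half of the [vvvv:dddd] pair, lowercased.
    id="$(printf '%s\n' "$1" \
        | sed -n 's/.*\[\([0-9a-fA-F]\{4\}\):[0-9a-fA-F]\{4\}\].*/\1/p' \
        | tr 'A-F' 'a-f')"
    case "$id" in
        10de) echo nvidia ;;
        8086) echo intel ;;
        1002) echo amd ;;
        1414) echo microsoft ;;   # what WSL2 reports for the virtual GPU
        *) echo unknown; return 1 ;;
    esac
}
```

Applied to the "GPU selected" line above, the function prints microsoft, matching the DRI_VENDOR value in the log.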

A screenshot for further proof:

@javizqh, from my side, I consider the changes in this PR sufficiently validated for merging (bearing in mind that our testing pool of Windows/WSL/CUDA configurations has been small).

javizqh · 2025-01-04T18:37:27Z

Have you checked that it still works fine in Linux @dpascualhe? If so I will merge it.

Add support for GPU acceleration in Windows (only nvidia validated)

2cca716

dpascualhe linked an issue Dec 15, 2024 that may be closed by this pull request

Add GPU acceleration support for Windows #475

Open

Minor fix

20fde47

dpascualhe requested review from dduro2020 and javizqh December 15, 2024 20:49

dpascualhe marked this pull request as ready for review December 15, 2024 20:49
