
tt-xla plugin is built for Ubuntu 22.04 but TT Cloud runs 20.04 #121

Closed
steeve opened this issue Dec 15, 2024 · 7 comments
Labels: community (issue filed by a community member, not TT)

Comments

@steeve

steeve commented Dec 15, 2024

This results in a GLIBC version mismatch error.
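For reference, one way to confirm the mismatch is to compare the glibc the host ships with the highest GLIBC symbol version the prebuilt plugin links against (a rough diagnostic sketch; the plugin path below is a placeholder):

# glibc provided by the host OS (Ubuntu 20.04 ships 2.31, 22.04 ships 2.35)
ldd --version | head -n 1
# Highest GLIBC symbol version the prebuilt shared object requires
objdump -T /path/to/the/tt-xla-pjrt-plugin.so | grep -o 'GLIBC_[0-9.]*' | sort -Vu | tail -n 1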

I tried patching the Dockerfiles to use Ubuntu 20.04 base images, but the build fails with:

184.5 CMake Error at /usr/local/lib/python3.12/dist-packages/cmake/data/share/cmake-3.31/Modules/FindPackageHandleStandardArgs.cmake:233 (message):
184.5   Could NOT find Python3 (missing: Interpreter) (Required is at least version
184.5   "3.8")
184.5
184.5       Reason given by package:
184.5           Interpreter: Cannot run the interpreter "/opt/ttmlir-toolchain/venv/bin/python"
184.5
184.5 Call Stack (most recent call first):
184.5   /usr/local/lib/python3.12/dist-packages/cmake/data/share/cmake-3.31/Modules/FindPackageHandleStandardArgs.cmake:603 (_FPHSA_FAILURE_MESSAGE)
184.5   /usr/local/lib/python3.12/dist-packages/cmake/data/share/cmake-3.31/Modules/FindPython/Support.cmake:4002 (find_package_handle_standard_args)
184.5   /usr/local/lib/python3.12/dist-packages/cmake/data/share/cmake-3.31/Modules/FindPython3.cmake:602 (include)
184.5   CMakeLists.txt:971 (find_package)
184.5
184.5
184.5 -- Configuring incomplete, errors occurred!
184.5 make[2]: *** [CMakeFiles/llvm-project.dir/build.make:95: llvm-project-prefix/src/llvm-project-stamp/llvm-project-configure] Error 1
184.5 make[1]: *** [CMakeFiles/Makefile2:164: CMakeFiles/llvm-project.dir/all] Error 2
184.5 make: *** [Makefile:91: all] Error 2
------
Dockerfile.ci:27
--------------------
  26 |     # Build the toolchain
  27 | >>> RUN cmake -B toolchain -DTOOLCHAIN=ON third_party/ && \
  28 | >>>     cd third_party/tt-mlir/src/tt-mlir/ && \
  29 | >>>     source env/activate && \
  30 | >>>     cmake -B env/build env && \
  31 | >>>     cmake --build env/build
  32 |
--------------------

Thanks

github-actions bot added the community label on Dec 15, 2024
@nsmithtt

nsmithtt commented Dec 15, 2024

Hey @steeve, thanks for filing an issue! Before getting to the specifics of your issue, we do not officially support Ubuntu 20.04, as noted in the tt-mlir support matrix here. This is a transitive constraint from the tt-metalium project; is there a reason you cannot use 22.04?

That said, it might be possible that it works, but it's not being tested.

With regard to your error specifically, I haven't seen it before; are you able to manually run /opt/ttmlir-toolchain/venv/bin/python?
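A quick way to check this, using the venv path from the build log above (a minimal diagnostic sketch):

# Try the interpreter directly; a GLIBC version error here means the toolchain
# binaries expect a newer glibc than the 20.04 base image provides.
/opt/ttmlir-toolchain/venv/bin/python --version
# List its shared-library dependencies and check for "not found" entries.
ldd /opt/ttmlir-toolchain/venv/bin/python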

@staylorTT
Contributor

I think this may be due to the image being run on the cloud. We can work with them internally to get this figured out.

@steeve
Author

steeve commented Dec 15, 2024

Thanks for getting back! My main problem is that Tenstorrent cloud runs on Ubuntu 20.04 :(

user@tt-metal-code-server-dev-1d76aaf4-deployment-79bddb48f4-tzq6n:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 20.04.6 LTS
Release:        20.04
Codename:       focal

@steeve
Author

steeve commented Dec 15, 2024

> I think this may be due to the image being run on the cloud. We can work with them internally to get this figured out.

That'd be great! I'm in a bit of a bind.

milank94 added and then removed the bug (Something isn't working) label on Dec 16, 2024
@vmilosevic
Contributor

@steeve one thing you can try is to run tt-xla inside a Docker image based on Ubuntu 22.04.
After setting up the drivers, hugepages, and the tt-smi tool, also install Docker on the machine, pull the tt-xla image, and run tt-xla inside the container. You need to mount the device and the host folders when starting the container to make the device accessible.

Here is an example you can try:

docker run --rm -it \
  --device /dev/tenstorrent/0 \
  -v /dev/hugepages:/dev/hugepages \
  -v /dev/hugepages-1G:/dev/hugepages-1G \
  -v /etc/udev/rules.d:/etc/udev/rules.d \
  -v /lib/modules:/lib/modules \
  -v /opt/provisioning_env:/opt/provisioning_env \
  ghcr.io/tenstorrent/tt-xla/tt-xla-ci-ubuntu-22-04:latest

git clone https://github.com/tenstorrent/tt-xla.git
cd tt-xla

source venv/activate
cmake -G Ninja \
  -B build \
  -S .

cmake --build build
cmake --install build

export LD_LIBRARY_PATH="/opt/ttmlir-toolchain/lib/:install/lib:${LD_LIBRARY_PATH}"
pytest -v tests/
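Before building inside the container, it may also be worth a quick sanity check that the device and hugepages mounts made it through (assuming the flags from the docker run command above):

# Device node passed through via --device
ls -l /dev/tenstorrent/
# Hugepages bind mounts passed through via -v
mount | grep hugepages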

@steeve
Author

steeve commented Dec 16, 2024

Thanks @vmilosevic! Unfortunately I don't yet have access to an actual machine; I'm still on a Kubernetes pod.

@steeve
Author

steeve commented Dec 17, 2024

All good, the tt-fe was updated to 22.04. Closing.

steeve closed this as completed on Dec 17, 2024