Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--bootstrap_impl=script breaks py_binary with py_package + py_wheel #2489

Open
ewianda opened this issue Dec 9, 2024 · 8 comments
Open

--bootstrap_impl=script breaks py_binary with py_package + py_wheel #2489

ewianda opened this issue Dec 9, 2024 · 8 comments

Comments

@ewianda
Copy link
Contributor

ewianda commented Dec 9, 2024

🐞 bug report

Affected Rule

py_package + py_wheel

Is this a regression?

Yes, related commit #2409

Description

Using the resulting py_binary with py_package+py_wheel fails to find the interpreter symlink. This also breaks pkg_tar from rules_pkg

🔬 Minimal Reproduction

diff --git a/examples/wheel/BUILD.bazel b/examples/wheel/BUILD.bazel
index 58a43015..f06f6ee8 100644
--- a/examples/wheel/BUILD.bazel
+++ b/examples/wheel/BUILD.bazel
@@ -17,6 +17,7 @@ load("@bazel_skylib//rules:write_file.bzl", "write_file")
 load("//examples/wheel/private:wheel_utils.bzl", "directory_writer", "make_variable_tags")
 load("//python:packaging.bzl", "py_package", "py_wheel")
 load("//python:pip.bzl", "compile_pip_requirements")
 load("//examples/wheel/private:wheel_utils.bzl", "directory_writer", "make_variable_tags")
 load("//python:packaging.bzl", "py_package", "py_wheel")
 load("//python:pip.bzl", "compile_pip_requirements")
+load("//python:py_binary.bzl", "py_binary")
 load("//python:py_library.bzl", "py_library")
 load("//python:py_test.bzl", "py_test")
 load("//python:versions.bzl", "gen_python_config_settings")
@@ -414,3 +415,29 @@ py_console_script_binary(
     pkg = "@pypiserver//pypiserver",
     script = "pypi-server",
 )
+
+py_binary(
+    name = "main_binary",
+    srcs = ["main.py"],
+    main = "main.py",
+    deps = [
+        "//examples/wheel/lib:module_with_data",
+        "//examples/wheel/lib:simple_module",
+    ],
+)
+
+py_package(
+    name = "binary_package",
+    # Only include these Python packages.
+    packages = ["examples"],
+    deps = [":main_binary"],
+)
+
+py_wheel(
+    name = "minimal_binary",
+    distribution = "requires_files",
+    version = "0.1.0",
+    deps = [
+        ":binary_package",
+    ],
+)

🔥 Exception or Error

    runpy.run_path(main_filename, run_name="__main__")
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "/home/ewianda/.cache/bazel/_bazel_ewianda/87bd9802a0763f5c88728552ef326f75/sandbox/linux-sandbox/14/execroot/_main/bazel-out/k8-opt-exec-ST-d57f47055a04/bin/tools/wheelmaker.runfiles/_main/tools/wheelmaker.py", line 631, in <module>
    main()
  File "/home/ewianda/.cache/bazel/_bazel_ewianda/87bd9802a0763f5c88728552ef326f75/sandbox/linux-sandbox/14/execroot/_main/bazel-out/k8-opt-exec-ST-d57f47055a04/bin/tools/wheelmaker.runfiles/_main/tools/wheelmaker.py", line 541, in main
    maker.add_file(package_filename, real_filename)
  File "/home/ewianda/.cache/bazel/_bazel_ewianda/87bd9802a0763f5c88728552ef326f75/sandbox/linux-sandbox/14/execroot/_main/bazel-out/k8-opt-exec-ST-d57f47055a04/bin/tools/wheelmaker.runfiles/_main/tools/wheelmaker.py", line 304, in add_file
    self._whlfile.add_file(package_filename, real_filename)
  File "/home/ewianda/.cache/bazel/_bazel_ewianda/87bd9802a0763f5c88728552ef326f75/sandbox/linux-sandbox/14/execroot/_main/bazel-out/k8-opt-exec-ST-d57f47055a04/bin/tools/wheelmaker.runfiles/_main/tools/wheelmaker.py", line 156, in add_file
    with open(real_filename, "rb") as fsrc:
         ^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'bazel-out/k8-fastbuild/bin/examples/wheel/_main_binary.venv/bin/python3'
Target //examples/wheel:minimal_binary failed to build
@rickeylev
Copy link
Collaborator

I think this is somewhat WAI. A plain py_binary can't really be given as something for py_package to process, and the plain py_binary isn't meant to be redistributable.

The first order problem is that zip doesn't support symlinks. There's some extensions to allow it to store them, though. However, that might be tricky because most of files it is given are going to be symlinks to something else, so we'd need some way to tell "dereference these symlinks, but not these symlinks". Maybe by reading the symlink and see if its non-absolute? IDK.

If you really want to package the whole binary, then you're probably better off packaging the zip file version of the binary. That has special code in its startup to handle the case of coming from a zip file that couldn't store symlinks.

Why do you want to pass a py_binary to py_package?

@aignas
Copy link
Collaborator

aignas commented Dec 10, 2024

Is this also affecting the tar rule from the aspect for usage in rules_oci?

@chowder
Copy link
Contributor

chowder commented Dec 10, 2024

Yeah it's the same problem.

The problem as I see it, is that the venv is created both in the runfiles directory, but also in the directory containing the runfiles directory (is there a name for this?).

Only the former is required, the latter isn't functional (broken symlink) and never gets invoked anyway; stage-1 bootstrap runs the former.

I didn't see an easy way to fix this - usually I would've tried to use runfiles symlinks if I wanted to create a symlink only inside the runfiles directory, but this doesn't let you create arbitrary relative symlinks.

Personally I see this as a deficiency of the related tar rules, but rickylev's already explained why this is a bit tricky.

@ewianda
Copy link
Contributor Author

ewianda commented Dec 10, 2024

Is this also affecting the tar rule from the aspect for usage in rules_oci?

I don't know, It is affecting pkg_tar . Should be related to bazelbuild/rules_pkg#115

@ewianda
Copy link
Contributor Author

ewianda commented Dec 10, 2024

Why do you want to pass a py_binary to py_package?

This is a very old usecase for us where we build wheels for Apache-beam application for Dataflow jobs

@aignas
Copy link
Collaborator

aignas commented Dec 10, 2024

I personally think the py_binary -> py_package -> py_wheel is definitely not the intended usage. It might be better to have py_library -> py_package -> py_wheel + extra entry_points.txt METADATA so that you get a console script when you install a package.

Do I understand correctly that the py_package will pull all of the third party deps and bundle them into the wheel built by bazel?

As for pkg_tar maybe there is also an easy way to tell it to create a particular symlink needed for stage2 bootstrap...


That said, I think the current py_binary behaviour with the --bootstrap=script is something that we have to have for #2161 to work... unless we fully solve #2156 and do our own venvs as part of creating py_binary. Are there other better options?

@ewianda
Copy link
Contributor Author

ewianda commented Dec 10, 2024

@aignas just tested with tar from aspect precisely https://github.com/aspect-build/bazel-examples/tree/main/oci_python_image

ianda/00aeeab5df0e41ab9e87aa07d33c6a73/sandbox/linux-sandbox/15/execroot/_main -w /tmp -S /home/ewianda/.cache/bazel/_bazel_ewianda/00aeeab5df0e41ab9e87aa07d33c6a73/sandbox/linux-sandbox/15/stats.out -D 

/home/ewianda/.cache/bazel/_bazel_ewianda/00aeeab5df0e41ab9e87aa07d33c6a73/sandbox/linux-sandbox/15/debug.out -- external/aspect_bazel_lib~~toolchains~bsd_tar_linux_amd64/tar --create --file bazel-out/k8-fastbuild/bin/benchsci/buildtools/containers/py_image_example.interpreter_layer.tar @bazel-out/k8-fastbuild/bin/benchsci/buildtools/containers/py_image_example.interpreter_tar_manifest.spec)

tar: Error reading archive bazel-out/k8-fastbuild/bin/benchsci/buildtools/containers/py_image_example.interpreter_tar_manifest.spec: Can't open bazel-out/k8-fastbuild/bin/benchsci/buildtools/containers/_archive.venv/bin/python3

@ewianda
Copy link
Contributor Author

ewianda commented Dec 10, 2024

Do I understand correctly that the py_package will pull all of the third party deps and bundle them into the wheel built by bazel?

I think py_package filters which first-party source code goes into the wheel. I think third party deps have to be specified manually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants