-
Notifications
You must be signed in to change notification settings - Fork 548
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: use runfiles strategy to get runfiles root #2115
base: main
Are you sure you want to change the base?
Conversation
@fmeum FYI as you're the RUNFILES resolution expert :) |
@@ -321,19 +324,6 @@ def Create(env: Optional[Dict[str, str]] = None) -> Optional["Runfiles"]: | |||
# Support legacy imports by defining a private symbol. | |||
_Runfiles = Runfiles | |||
|
|||
|
|||
def _FindPythonRunfilesRoot() -> str: | |||
"""Finds the root of the Python runfiles tree.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember that I introduced this function instead of using RUNFILES_DIR
since I thought that the Python runfiles tree may be different from the actual runfiles directory (the one with non-Python data dependencies) when using --build_python_zip
. Do you know whether that's true? If this isn't true, I at least don't understand why the runfiles library would have to deal with RUNFILES_MANFIEST_FILE
at all as the Python runfiles are always arranged in a directory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that sounds vaguely familiar.
It should be easy to test these cases now. Over in tests/base_rules/BUILD.bazel, there's some existing tests for setting these zip flags.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, some of the failures seem to show this.
@mattem Could you share an example of the venv layout? We can try to find logic that works in both cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fmeum The venv created by rules_py when running a binary or test is rooted at the root of the targets runfiles tree, and takes the standard Python venv layout, with dependencies then being symlinked in. For example, a target named foo
in the //py/tests
package depending on the PyPi version of the runfiles helper will have the following package in the site-packages
directory.
bazel-bin/py/tests/external-deps/foo.runfiles/.foo.venv/lib/python3.9/site-packages/runfiles
The runfiles
last segment of the path is the Python package name.
There is also the case where the user plants the venv directory in an arbitrary place in the source tree for IDE support, at that point .foo.venv
could technically be anywhere, but likely at the source tree root or in the same package as the target. Users then likely "run" the binary via the IDE for debugging etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would the path under the venv of some Python package from an external repository look like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For each Python package coming from a Bazel module (not an external pure-Python dep), it would need to write that package name and the corresponding canonical repo name (Label.workspace_name
) to a file.
Then we could look for that file here, find the package name in the path of the current source file and obtain the repo name from it. Feel free to mention me on the PR (draft) that sets up the file if you need help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are we able to tell that a dependency comes from a bazel module vs an external Python dependency? They will have that Label.workspace_name
set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I guess your external deps can come from everywhere? In that case you could also just store the mapping between package and Label.workspace_name
for every package. That would match the logic here if it weren't broken due to the different runfiles structure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, dependencies can come from both external bazel dependencies or fetched from PyPi (etc) by rules_python. In fact in rules_py, there is very little difference between 1p and 3p dependencies.
That would match the logic here if it weren't broken due to the different runfiles structure.
To be clear, the runfiles helper only fails when it is consumed via the PyPi dependency, it works as expected when used via @bazel_tools//tools/python/runfiles
, as I guess it doesn't attempt to fetch the repo name? Secondly, the "normal" runfiles structure is there, it's just up a few more directory segments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, the runfiles helper only fails when it is consumed via the PyPi dependency, it works as expected when used via
@bazel_tools//tools/python/runfiles
, as I guess it doesn't attempt to fetch the repo name?
The runfiles helper in@bazel_tools//tools/python/runfile
is deprecated and not fully functional with Bzlmod.
Secondly, the "normal" runfiles structure is there, it's just up a few more directory segments.
The subtle part is that this failure isn't about not finding the runfiles strategy, it's about identifying the point of reference for lookups in that structure. For example, when you requestrules_python/foo/bar
viaRlocation
, you need to know which Bazel repo is asking for that to look up the actual repo (and thus its canonical name) seen under that name by the caller.
If we had a separate file that maps Python package names to Bazel repo names, we could get the Python package name from the path and thus the Bazel repo name relative to which runfiles paths would be interpreted.
When attempting to calculate the current repository, the runfiles helper used a strategy of walking up the current files directory four times, however this fails to correctly resolve the runfiles root when the runfiles helper is consumed via the PyPi package and linked into a virtual environment (the root will be determined to be the venv
lib
directory).This change makes it so the runfiles root is fetched from the helper strategy, which is either created with the
$RUNFILES_ROOT
environment variable (directory based), or calculates the root via the runfiles manifest (manifest based).With these changes, the runfiles helper correctly identifies the runfiles root when linked into a Python virtual environment.