Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strict rules that check dependencies and type annotations #296

Open
dayfine opened this issue Apr 26, 2020 · 24 comments
Open

Strict rules that check dependencies and type annotations #296

dayfine opened this issue Apr 26, 2020 · 24 comments

Comments

@dayfine
Copy link

dayfine commented Apr 26, 2020

Will we have strict rules that check for deps and type annotations? i.e. py_strict_library, py_strict_binary, py_strict_test.

@abergmeier
Copy link

check for deps

Can you elaborate? Not sure what is meant.

type annotations

Then py_strict* would be a macro also instantiating a type annotation rule?

@dayfine
Copy link
Author

dayfine commented Apr 27, 2020

At google we have (actually) pytype_strict_{binary,library} to check for typings using https://github.com/google/pytype, and check for strict dependencies, i.e. depend on build targets for what you import.

As I realized this involves pytype, I suppose this could be a quite some work...

@thundergolfer
Copy link

People are using macros to do this sort of thing right now, and Bazel Aspects alternatively. In future I think there should be a 'paved road' for writing type-annotated Python in Bazel, though can't say yet whether there'd be rules like py_strict_binary in rules_python that execute pytype, mypy, or pyre.

@rickeylev
Copy link
Collaborator

(I'm the author of the strict deps checking rule at Google)

"Strict deps" is a term we use to mean if you have "import foo" in your code, then some direct dependency provides "foo". It helps prevent your code from breaking when its dependencies's dependencies have changed.

The strict deps checking and pytype checking are separate implementations. We just bundle them into a single macro because usually people want both (both have their own separate stand alone macros, too).

The core design of how strict deps works would probably mostly work for Bazel built Python code. It has to make some assumptions with how things work, of course. We've also had to work around a variety of short comings in the Python providers that starlark currently has.

wrt naming: having macro names that reflect a particular aspect/validator being enforced also comes with a combinatorical drawback. You end up with py_strict_, pytype_, mypy_, pytype_strict_, mypy_strict_*, etc etc. While it works, and there's a certain appeal to having "what it does" right there in the rule name, it's also a headache for new comers ("Which of the 12 rules do I use?") and naming. I wonder if there's a better way (the particular idea in my head is: what if I want to make sure my code works with pytype and mypy and pyre and strict deps and etc mypy_pytype_pyre_strict_library?).

@dayfine
Copy link
Author

dayfine commented Apr 28, 2020

Ok I see that there is open source consideration here that different type checkers may be used. So strict_deps and type checking are definitely two separate considerations.

I wonder if the type checkers can all be hooked into a single entry point (not sure if it makes sense to use more than one type checker at the same time but) so the work can be hidden behind py_library:

py_library(
   ...
   type_checkers = ['pytype']
   ...
)

@abergmeier
Copy link

Maybe rather than having a string have a Provider like PyTypeCheckerInfo or so. This way you are able to pass custom checkers, etc.

@rickeylev
Copy link
Collaborator

Passing a string probably isn't a good idea because it would require the callee to have a mapping of names to functions, e.g., py_library would have to have to depend on and load all possible implementations. Instead passing some sort of function/object would make more sense. But, the basic idea of having a type_checkers arg I like. Or maybe validators (to mirror the "_validation" name bazel gave to such actions), since strict deps (and various other static-analysis checks) aren't about type checking. In any case, I encourage thinking about static analysis type validators in general instead of type checkers specifically, e.g. checking that you don't have unused dependencies, basic syntax check, checking for SQL injections; checking incorrect format strings -- the sort of checks that other languages use compiler hooks for.

Anyways...

Here's a bit of a brain dump for some Bazel things that would make implementing strict deps much easier. These are various assumptions it makes in our implementation. Fundamentally, it's all about mapping a target to the module names, package names, and package attributes that the target provides.

  • rule-type specific providers. e.g. PyBinaryInfo, PyTestInfo, PyLibraryInfo. This would allow it to distinguish between rule types without having to hard code against the Target.kind string name. The distinction matters because of some subtle differences in how they get treated.
  • Information/mechanism to transform a file name to module name. This basically boils down to three things:
    1. Source and bytecode suffixes (.py, .pyc, etc), e.g. the importlib.machinary.*_SUFFIXES values
    2. Information about the "import roots" (PYTHONPATH); I think this corresponds to the workspace root?
  • A way to get the module names a rule directly provides, and if those are packages or modules. The package vs module distinction is important for strict deps. This also ends up being necessary for non-trivial rules; e.g. those that rely on runtime logic that affect the importable names like sys.modules trickery etc.
  • Info (probably a provider of some sort) of what's in the runtime's stdlib, specifically the module names, and packages and package attrs in the runtime's stdlib.
  • Info (probably in the same runtime provider as above) of what the runtime's Python version is; this is so that we know how to parse the file (e.g. a 2.7, 3.6, 3.8 etc parser).

In our implementation, most of those points above are hard coded in some way (e.g., we have a file with a bunch of info about the stdlib, we make various assumptions when transforming a file path to a module name, etc).

@blais
Copy link

blais commented May 17, 2020

+1 to py_strict_* functions and would love pytype support too. These two features by themselves make Bazel a really appealing option for Python code, plus the ability to easily generate par files (I suspect the Python community might otherwise ignore Bazel).

@fahhem
Copy link

fahhem commented Jun 3, 2020

Something like the above, but with labels so it supports custom checkers seems like a good middle ground.

py_strict_library(
    ...
    type_checkers = ["@rules_python//type_checkers:mypy"],
)

Then a couple macros could be provided:

load("@rules_python//type_checkers/defs.bzl", "py_mypy_strict_library")

py_mypy_strict_library(...)

That just add in a type_checkers list and pass everything along.

@blais
Copy link

blais commented Jun 4, 2020

@fahhem This looks very much like what Google does. I would much rather be able to provide a global setting for all Python libraries for the type checkers. If you have a large project, inserting the type_checkers in every single target will become quite redundant real fast. Ditto re. strict targets.

And what about linting? The problem with requiring lint to pass is that one needs to be able to disable that until ready to commit. That smells to me also like a global setting.

@github-actions
Copy link

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

@github-actions github-actions bot added the Can Close? Will close in 30 days if there is no new activity label Apr 14, 2021
@github-actions
Copy link

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

@cnsgsz
Copy link

cnsgsz commented Jun 24, 2023

Is this issue resolved? I don't seem to find an implementation in the thread.

@chrislovecnm
Copy link
Collaborator

I will re-open. But since a different issue was filed in bazel/bazel we may not be able to implement here.

@chrislovecnm chrislovecnm reopened this Jun 24, 2023
@github-actions github-actions bot removed the Can Close? Will close in 30 days if there is no new activity label Jun 24, 2023
Copy link

github-actions bot commented May 3, 2024

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

@github-actions github-actions bot added the Can Close? Will close in 30 days if there is no new activity label May 3, 2024
@matts1
Copy link
Contributor

matts1 commented May 6, 2024

Bumping so it doesn't close this issue

@aignas aignas changed the title Strict rules for PY3 Strict rules that check dependencies and type annotations May 6, 2024
@aignas aignas removed the Can Close? Will close in 30 days if there is no new activity label May 6, 2024
@nachumg
Copy link

nachumg commented Jun 4, 2024

Bump. Any plans to get this done anytime soon?

@hassanhub
Copy link

(from an ex-Googler)
I found both of these features extremely helpful when developing code at Google.
I'm wondering why this hasn't been pushed publicly after this long time (it's been 4 years since this issue was opened).
Any reason why this isn't pushed to the a public release @rickeylev?

@michael-quinlan
Copy link

Also bumping.

Was very valuable when a few of us were at google and we wish it was available in the open verison.

@nachumg
Copy link

nachumg commented Nov 19, 2024

+1

2 similar comments
@bingjeff
Copy link

+1

@andyliu14
Copy link

andyliu14 commented Dec 28, 2024

+1

@rickeylev
Copy link
Collaborator

Related: #2537

github-merge-queue bot pushed a commit that referenced this issue Dec 31, 2024
This adds attributes and fields of use to static analysis.

For type definition files (usually `.pyi` files), the `pyi_srcs` and
`pyi_deps`
fields are added to the rules. They end up in the PyInfo fields
direct_pyi_files
and transitive_pyi_files.

So that static analysis tools can retain access to a target's Python
source files,
even if precompiling is enabled, `direct_original_sources` and
`transitive_original_sources` fields are added to PyInfo.

Work towards #2537,
#296
@rickeylev
Copy link
Collaborator

With #2538, the basics for a basic implementation of strictdeps is now possible.

The last remaining feature gap is handling cases where the importable modules/packages/attributes information is in a file instead of directly within a provider field. This makes it possible to (1) handle importing things from packages (distinguishing between from pkg import attr vs from pkg import missing_module_dependency), (2) handle cases that require a manual definition of what modules/packages/attributes a target makes available, and (3) use a build tool to compute (1) and (2).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests