-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Remove python version specific usage within extensions #1270
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1270
Note: Links to docs will display an error until the docs builds have been completed. ❌ 15 New FailuresAs of commit ac340bb with merge base addcb24 (): NEW FAILURES - The following jobs have failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
Mark has more experience here, We would also want to update the wheel build flow so that it produces py3 wheels not once that specify the version? As well I think integration into the Nova workflows so that we are pushing the right artifacts is important |
a. Is it okay that there is no longer a python module for torchao._C? I don't think we need it but would need tests to work for sure b. Are there other ways to install ao (other than pip install and pip wheel) that I need to test? Or other uses of the custom ops outside of test/test_ops.py that I need to verify? Nope that's it, granted you'll need to see what this does in CI though specifically the build_linux_wheel job which will build us for many different cuda and python versions. Also will need to sanity check that torchao custom extensions work when you download it on a machine with varying python versions c. Do we still want the old behavior to exist? Is there any reason we'd still want to build wheels per python? It's mostly just a function of wanting custom ops working, a python agnostic version would be better d. I assumed py38 was the lowest version supported by ao, but I later learned that pytorch has moved to 3.9 so that should probably be 3.9 now. Does ao follow pytorch in terms of python support? we just drop support when python versions are at EOL https://devguide.python.org/versions/ |
@@ -114,6 +114,7 @@ def get_extensions(): | |||
extension( | |||
"torchao._C", | |||
sources, | |||
py_limited_api=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's nice to have it, but imo not necessary. At least by default, linker will ignore the dependencies, unless one uses some symbols from that library (unless library is linked with -Wno-as-needed
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is necessary only to get python to recognize the extension + therefore the wheel as python agnostic. e.g., for pypi packaging and naming purposes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also for our cppextension to know not to drag in libtorch_python
from importlib.util import find_spec | ||
from pathlib import Path | ||
spec = find_spec("torchao") | ||
assert spec is not None, "torchao python module spec is unexpectedly None" | ||
SO_PATH = Path(spec.origin).parent / "_C.abi3.so" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Easiest solution are usually the best one :)
from importlib.util import find_spec | |
from pathlib import Path | |
spec = find_spec("torchao") | |
assert spec is not None, "torchao python module spec is unexpectedly None" | |
SO_PATH = Path(spec.origin).parent / "_C.abi3.so" | |
SO_PATH = Path(__file__).parent / "_C.abi3.so" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't work in the pip wheel -e . followed by the pip install *.whl case!
In that case the _C.abi3.so will live in site-packages (or wherever the user's install is) so we need a more encompassing way to access the location here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share an example? And perhaps it talks about some gaps in setup.py
as _C
should have been installed in the same folder as __init__.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh this should indeed work in the pip wheel/pip install case, because the init.py file and the _C.abi3.so are in the same folder under site-packages:
It's funny because I actually started off with what you had lol so dang I did some extra work :'/
Do you know where the logic lives to always place the _C under the same folder as init.py? I don't see mention of it in the ao setup.py, so is this a Python guarantee for custom extensions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you know where the logic lives to always place the _C under the same folder as init.py?
@janeyx99 I think it's a setuptools thing
Lines 114 to 116 in 01dc7da
extension( | |
"torchao._C", | |
sources, |
So it automatically understands that it should put _C.so
under torchao
folder. I played around with meson-python a bit for another project, and they don't do the same thing (meson-python will create torchao._C.so
instead)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah thanks for the info @gau-nernst!
Alright thanks for your comments! Sounds like ao would like a single wheel more so I'll make the CI do that as my next step in the PR (amidst getting all the other tests to pass :p) @msaroufim Just FYI I've done the sanity checks and verified that the custom ops worked + detailed it in the test plan above (that was honestly the most tedious part of this task lol). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
py_limited_api
is a nice touch, but strictly speaking is not necessary, , unless cpp_extensions links torch_python with-Wno-as-needed
or something like that- Python have a weird tendency to call their extensions
.so
files even on platforms where they are not (and AOTI follows in that anti-threns), but shouldn't it be.dll
on Windows and.dylib
on MacOS (There are probably already some utils in cpp extensions to decorate library names with platform appropriate prefixes and suffixes )
* initial test * Pad casual mask with zeroes and set decoder max_seq_len to the max sequence length so their shapes are both set to the max_seq_len * Fix control bug for image inputs * Clear image input after submitting a chat * Include empty assistant message for chat * Pipe image input from CLI --------- Co-authored-by: Jack-Khuu <[email protected]> Co-authored-by: vmpuri <[email protected]>
This PR will be landed in two stages in
#1276
#1277
Replace PYBIND usage with torch.ops.load_library in order to allow ao to build 1 wheel for multiple python versions.
This requires pytorch/pytorch#138088
Why is this cool?
You no longer have to build an ao wheel for every python version, saving time, resources, and pypi package space (though this last one might not be a concern).
This proves that there is a way to use torch CPP extensions without dragging an unnecessary python dependency, which was not the case before.
One disclaimer:
This does mean that there will be no more torchao._C python module, which may be BC breaking. There are ways to not BC break (by creating a _C.py and re-exposing all the functions) so I'm not too concerned about that. Though some reviewer should let me know if this is crucial.
Test Plan
Yes, it looks like there is still much CI to fix for ao, but I wanted to be confident that the core change worked before sinking the time into this. How did I gain confidence in this change? Locally, I built ao for python 3.10 using
This produced
torchao-0.7.0+gitac340bb4-cp38-abi3-linux_x86_64.whl
Then, I installed this wheel within other conda environments (with fresh torch) for python 3.9 through python 3.12 using
pip install torchao-0.7.0+gitac340bb4-cp38-abi3-linux_x86_64.whl
To verify this wheel was good, I ranpython test/test_ops.py
to check that all the tests passed, which they do.I also made sure that the python setup.py develop path was not broken. How? I ran
pip install -e .
on the code in this PR and ensured thatpython test/test_ops.py
passed in both 3.9 and 3.10.What do I want review on?
a. Is it okay that there is no longer a python module for torchao._C? A quick GH search shows 0 uses https://github.com/search?q=torchao._C&type=code in public code.
b. Are there other ways to install ao (other than pip install and pip wheel) that I need to test? Or other uses of the custom ops outside of test/test_ops.py that I need to verify?
c. Do we still want the old behavior to exist? Is there any reason we'd still want to build wheels per python?
d. I assumed py38 was the lowest version supported by ao, but I later learned that pytorch has moved to 3.9 so that should probably be 3.9 now. Does ao follow pytorch in terms of python support?
Based on the answers to the above qs, I will clean up this PR accordingly.