-
Notifications
You must be signed in to change notification settings - Fork 777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Opaque Object Layout #4678
base: main
Are you sure you want to change the base?
Opaque Object Layout #4678
Conversation
This allows for variable sized base classes in future by using negative basicsize values.
Variable sized base classes require the use of relative offsets
Metaclasses cannot define `__new__` so `__init__` is the only alternative.
I didn't realise I was going to hit the issue that the minimum supported rust version of pyo3 (1.63) doesn't support generic associated types which I'm using for Rust 1.65 (the first with GAT support) was released on 2022-11-03 and python 3.12 (the first version to allow extending variable sized base classes) was released on 2023-10-02. I'm going to bump the MSRV to 1.65 but I'm open to other suggestions that avoid requiring GATs. I checked the MSRV requirements according to Debian bookworm is supported until June 2026 which seems like a long time to wait... |
4ce2db0
to
401af6c
Compare
7820766
to
de2c2a3
Compare
de2c2a3
to
426b218
Compare
You can't do that unfortunately. Our msrv policy is at https://github.com/PyO3/pyo3/blob/main/Contributing.md#python-and-rust-version-support-policy What you can do is to feature gate this and only enable the functionality on Rust 1.65 and up, or figure out a way to do it without gats. |
That sounds good to me. Do you mean introduce a feature like Would it be sufficient to introduce the feature and have it disabled by default? |
A feature is one way to do it, a cfg that is automatically enabled when the build script detects it's on rust 1.65+ is another method. Both have pros and cons as far as user experience goes, so it's best to have a solution that doesn't need them. The layout code is also reasonably complex, so I'd rather not mix conditional compilation into it if possible. |
The design currently revolves around the
This way the layout is set generically in the base types like The ways I can currently see around this are:
I'm going to implement (1) for now because the steps seem clearest, so this PR can be in a state where it could hypothetically be merged, then I'm open to suggestions if someone finds a way around the limitations I'm running into. Attempt 1The problem here is that the static and variable markers aren't known to be disjoint, so they conflict. trait IsStaticType {}
trait IsVariableType {}
struct PyAny;
impl IsStaticType for PyAny {}
struct PyType;
impl IsVariableType for PyType {}
struct Layout<T> { _data: PhantomData<T> }
trait LayoutMethods {
fn access_data();
}
impl<T: IsStaticType> LayoutMethods for Layout<T> {
fn access_data() {
println!("hi from static");
}
}
impl<T: IsVariableType> LayoutMethods for Layout<T> { // ERROR: conflicting implementations
fn access_data() {
println!("hi from variable");
}
} Attempt 2The problem here is that trait LayoutMethods {
fn access_data() {
panic!("unknown layout")
}
}
impl<T: PyClassImpl> LayoutMethods for T {}
trait StaticLayout: Sized + 'static + LayoutMethods {
fn access_data() {
println!("hi from static");
}
}
impl<T, B> StaticLayout for T
where
T: PyClassImpl<BaseType = B> + 'static,
B: StaticLayout,
{ }
trait VariableLayout: Sized + 'static + LayoutMethods {
fn access_data() {
println!("hi from variable");
}
}
impl<T, B> VariableLayout for T
where
T: PyClassImpl<BaseType = B> + 'static,
B: VariableLayout,
{ }
trait PyClassImpl { type BaseType; }
struct PyAny;
impl LayoutMethods for PyAny {}
impl StaticLayout for PyAny {}
struct PyType;
impl LayoutMethods for PyType {}
impl VariableLayout for PyType {}
struct MyClass {}
impl PyClassImpl for MyClass {
type BaseType = PyAny;
} |
def6754
to
26920d5
Compare
26920d5
to
53a13e1
Compare
I have finished my proof of concept 🙂 It supports using the opaque layout and extending GATs are no longer required, replaced with This is a large change so it might be best to split this over several PRs. I think It would still be a good idea to align on whether the general approach here is good then the specifics can be refined in the individual PRs? I'm also happy to keep everything together if you prefer. There is a lot more code now because in my original post I forgot to remove the temporary use of There is a limitation with extending |
67b25ab
to
c97b64b
Compare
Awesome, thank you for driving this forward! If others don't get to this sooner, I will try to make time for this on a Friday in the next few weeks most likely. I have quite a lot of family demands at the moment so it may take a bit of time to find cognitive space to really dig in to this 🙏 |
This PR introduces the capability to inherit from classes where the exact layout/size is not known using the mechanism described in PEP 697. This allows inheriting from
type
in order to create metaclasses. These metaclasses cannot yet be used with pyo3 classes (i.e.#[pyclass(metaclass=Foo)]
) but this should be a possibility in future. It may also be possible in future for pyo3 classes to extend builtin classes using the limited API and to potentially extend imported classes (egenum.EnumType
).This PR supersedes my first attempt at creating metaclasses: #4621. Previously I was using the normal class layout which relies on the exact layout of every base class to be known. For
type
the base structure isffi::PyHeapTypeObject
. The contents of this structure are intended to be private (even for the 'unlimited' API) and so an alternative mechanism is required (PEP 697).Instead of nesting each base class at the beginning of the derived class in memory, objects created with PEP 697 only require knowledge of how much additional memory the derived class requires and the address of this section of memory is accessed via PyObject_GetTypeData. The PEP 697 mechanism could potentially be used to inherit from other builtins when only the 'limited' API is available however many of these builtins also store items (eg
list
,set
) which introduces more complications, so this PR focuses on inheriting fromPyType
only.This PR is a proof of concept but I am sure that some of the decisions I made aren't the best so feedback is welcome. The current design is as follows:
Before this PR the only possible layout was the 'static' layout where the size of the base class is known. This layout was described by
PyClassObject<T>
(now renamed toPyStaticClassObject
). This PR introduces hides the details of the static layout behind a traitInternalPyClassObjectLayout
so that other layouts are possible.Before this PR:
PyClassObject<T>
was used to obtain the exact layout ofT: PyClassImpl
assuming that layout is the 'static' layout. This PR introducesPyClassImpl::Layout
so that a pyclass declares its layout. The pyclass usesPyTypeInfo::Layout
to setT::Layout
. Each base type declares eitherPyTypeInfo::Layout = PyStaticClassObject
orPyTypeInfo::Layout = PyVariableClassObject
.Potential Improvements
Some questions still remain:
__init__
approach, how to initialise super classes?__init__
approach, how to restrict return value to()
orPyResult<()>
?type -> MyMetaclass -> MySubMetaclass
but I think there are some outstanding problems. I also am not sure how to prevent users from doing this so I have left it for now.setattr
works without using#[pyclass(dict)]
but the set values aren't accessible in the subclasssetattr
on a class extendingtype
does not fail, unlike on a class extendingobject
. I'm not sure if this meanstype
has__dict__
already?Initialisation
I'm not sure what the best way is to handle initialisation.
tp_new
cannot be used with metaclasses (source) however the best semantics of__init__
aren't clear. For now I am requiring that metaclasses implementDefault
which allows the struct to be initialised before__init__
is called (but this is a hack and may have issues properly initialising superclasses). There are runtime checks when a class is constructed that metaclasses do not define#[new]
and do define__init__
. This should be sufficient to ensure that users cannot create a struct with uninitialized memory.An alternative to using
__init__
could be to repurpose#[new]
to wrap it and assign it totp_init
instead in the case of a metaclass but that would be more complex to implement.