From bf8d81b117fcfc223b2a3b79cfd9a6f8d59ae81d Mon Sep 17 00:00:00 2001
From: Nick Cameron
Date: Tue, 17 Mar 2015 08:51:49 +1300
Subject: [PATCH 1/5] DST custom coercions.

Custom coercions allow smart pointers to fully participate in the DST system.
In particular, they allow practical use of `Rc<T>` and `Arc<T>` where `T` is unsized.

This RFC subsumes part of [RFC 401 coercions](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md).
---
 text/0000-dst-coercion.md | 175 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 175 insertions(+)
 create mode 100644 text/0000-dst-coercion.md

diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md
new file mode 100644
index 00000000000..2f9bdd1824e
--- /dev/null
+++ b/text/0000-dst-coercion.md
@@ -0,0 +1,175 @@
+- Feature Name: dst-coercions
+- Start Date: 2015-03-16
+- RFC PR: (leave this empty)
+- Rust Issue: (leave this empty)
+
+# Summary
+
+Custom coercions allow smart pointers to fully participate in the DST system.
+In particular, they allow practical use of `Rc<T>` and `Arc<T>` where `T` is unsized.
+
+This RFC subsumes part of [RFC 401 coercions](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md).
+
+# Motivation
+
+DST is not really finished without this; in particular, there is a need for types
+like reference counted trait objects (`Rc<Trait>`) which are not currently
+well-supported (without coercions, it is pretty much impossible to create such
+values with such a type).
+
+# Detailed design
+
+There is an `Unsize` trait and lang item. This trait signals that a type can be
+converted using the compiler's coercion machinery from a sized to an unsized
+type. All implementations of this trait are implicit and compiler generated. It
+is an error to implement this trait. If `&T` can be coerced to `&U` then there
+will be an implementation of `Unsize<U>` for `T`. E.g., `[i32; 42]:
+Unsize<[i32]>`. Note that the existence of an `Unsize` impl does not signify a
+coercion can itself take place; it represents an internal part of the
+coercion mechanism (it corresponds with `coerce_inner` from RFC 401). The trait
+is defined as:
+
+```
+#[lang="unsize"]
+trait Unsize<T: ?Sized>: ::std::marker::PhantomFn<Self, T> {}
+```
+
+There are implementations for any fixed size array to the corresponding unsized
+array, for any type to any trait that that type implements, for structs and
+tuples where the last field can be unsized, and for any pair of traits where
+`Self` is a sub-trait of `T` (see RFC 401 for more details).
+
+There is a `CoerceUnsized` trait which is implemented by smart pointer types to
+opt-in to DST coercions. It is defined as:
+
+```
+#[lang="coerce_unsized"]
+trait CoerceUnsized<Target>: ::std::marker::PhantomFn<Self, Target> + Sized {}
+```
+
+An example implementation:
+
+```
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<Rc<U>> for Rc<T> {}
+
+// For reference, the definition of Rc:
+pub struct Rc<T: ?Sized> {
+    _ptr: NonZero<*mut RcBox<T>>,
+}
+```
+
+Implementing `CoerceUnsized` indicates that the self type should be able to be
+coerced to the `Target` type. E.g., the above implementation means that
+`Rc<[i32; 42]>` can be coerced to `Rc<[i32]>`.
+
+
+## Newtype coercions
+
+We also add a new built-in coercion for 'newtype's. If `Foo<T>` is a tuple
+struct with a single field with type `T` and `T` has at least the `?Sized`
+bound, then coerce_inner(`Foo<T>`) = `Foo<U>` holds for any `T` and `U` where
+`T` coerces to `U`.
+
+This coercion is not opt-in. It is best thought of as an extension to the
+coercion rule for structs with an unsized field, the extension is that here the
+field conversion is a proper coercion, not an application of `coerce_inner`.
+Note that this coercion can be recursively applied.
+
+
+## Compiler checking
+
+### On encountering an implementation of `CoerceUnsized` (type collection phase)
+
+* The compiler checks that the `Self` type is a struct or tuple struct and that
+the `Target` type is a simple substitution of type parameters from the `Self`
+type (one day, with HKT, this could be a regular part of type checking, for now
+it must be an ad hoc check). We might enforce that this substitution is of the
+form `X/Y` where `X` and `Y` are both formal type parameters of the
+implementation (I don't think this is necessary, but it makes checking coercions
+easier and is satisfied for all smart pointers).
+* The compiler checks each field in the `Self` type against the corresponding field
+in the `Target` type. Either the field types must be subtypes or be coercible from the
+`Self` field to the `Target` field (this is checked taking into account any
+`Unsize` bounds in the environment which indicate that some coercion can take
+place). Note that this per-field check uses only the built-in coercion
+mechanics. It does not take into account `CoerceUnsized` impls (although we
+might allow this in the future).
+* There must be only one field that is coerced.
+* We record in a side table a mapping from the impl to an adjustment. The
+adjustment will contain the field which is coerced and a nested adjustment
+representing that coercion. The nested adjustment will have a placeholder for
+any use of the `Unsize` bound (we should require that there is exactly one such use).
+
+### On encountering a potential coercion
+
+* If we have an expression with type `E` where the type `F` is required during
+type checking and `E` is not a subtype of `F`, nor is it coercible using the
+built-in coercions, then we search for an implementation of `CoerceUnsized<F>`
+for `E`. A match will give us a substitution of the formal type parameters of
+the impl by some actual types.
+* We look up the impl in the side table described above. The substitution is used
+with the placeholder in the recorded adjustment to create a new coercion which
+will map one field of the struct being coerced. That coercion should always be
+valid (if it is not, there is a compiler bug).
+* We create a new adjustment for the coerced expression. This will include the
+index of the field which is deeply coerced and the adjustment for the coercion
+described in the previous step.
+* In trans, the adjustment is used to codegen a coercion by moving the coerced
+value and changing the indicated field to a new type according to the nested
+adjustment.
+
+### Adjustment types
+
+We add `AdjustCustom(usize, Box)` and
+`AdjustNewtype(Box)` to the `AutoAdjustment` enum. These
+represent the new custom and newtype coercions, respectively. We add
+`UnsizePlaceHolder(Ty, Ty)` to the `UnsizeKind` enum to represent a placeholder
+adjustment due to an `Unsize` bound.
+
+### Example
+
+For the above `Rc` impl, we record the following adjustment (with some trivial
+bits and pieces elided):
+
+```
+AdjustCustom(0, AdjustNewType(
+    AutoDerefRef {
+        autoderefs: 1,
+        autoref: AutoUnsafe(mut, AutoUnsize(
+            UnsizeStruct(UnsizePlaceholder(T, U))))
+    }))
+```
+
+When we need to coerce `Rc<[i32; 42]>` to `Rc<[i32]>`, we look up the impl and
+find `T = [i32; 42]` and `U = [i32]` (note that we automatically require that
+`Unsize` is satisfied when looking up the impl). We can therefore replace the
+placeholder in the above adjustment with `UnsizeLength(42)`. That gives us the
+real adjustment to store for trans.
+
+# Drawbacks
+
+Not as flexible as the previous proposal. Can't handle pointer-like types like
+`Option>`.
+
+# Alternatives
+
+The original [DST5 proposal](http://smallcultfollowing.com/babysteps/blog/2014/01/05/dst-take-5/)
+contains a similar proposal with no opt-in trait, i.e., coercions are completely
+automatic and arbitrarily deep. This is a little too magical and unpredictable.
+It violates some 'soft abstraction boundaries' by interfering with the deep
+structure of objects, sometimes even automatically (and implicitly) allocating.
+
+[RFC 401](https://github.com/rust-lang/rfcs/blob/master/text/0401-coercions.md)
+proposed a scheme for proposals where users write their own coercion using
+intrinsics. Although more flexible, this allows for implicit execution of
+arbitrary code. If we need the increased flexibility, I believe we can add a
+manual option to the `CoerceUnsized` trait backwards compatibly.
+
+The proposed design could be tweaked: we could make newtype coercions opt-in
+(this would complicate other parts of the proposal though). We could change the
+`CoerceUnsized` trait in many ways (we experimented with an associated type to
+indicate the field type which is coerced, for example).
+
+# Unresolved questions
+
+None
From 4d803c14371ca7cbfd327b3db9735ed3ba58e31f Mon Sep 17 00:00:00 2001
From: Nick Cameron
Date: Tue, 17 Mar 2015 12:12:36 +1300
Subject: [PATCH 2/5] Tweak newtype coercions, remove ?Sized requirement.

---
 text/0000-dst-coercion.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md
index 2f9bdd1824e..2dea9d73a42 100644
--- a/text/0000-dst-coercion.md
+++ b/text/0000-dst-coercion.md
@@ -66,9 +66,8 @@ coerced to the `Target` type. E.g., the above implementation means that
 ## Newtype coercions
 
 We also add a new built-in coercion for 'newtype's. If `Foo<T>` is a tuple
-struct with a single field with type `T` and `T` has at least the `?Sized`
-bound, then coerce_inner(`Foo<T>`) = `Foo<U>` holds for any `T` and `U` where
-`T` coerces to `U`.
+struct with a single field with type `T`, then coerce_inner(`Foo<T>`) = `Foo<U>`
+holds for any `T` and `U` where `T` coerces to `U`.
 
 This coercion is not opt-in. It is best thought of as an extension to the
 coercion rule for structs with an unsized field, the extension is that here the
@@ -121,7 +120,7 @@ adjustment.
 ### Adjustment types
 
 We add `AdjustCustom(usize, Box)` and
-`AdjustNewtype(Box)` to the `AutoAdjustment` enum. These
+`AdjustNewtype(Box)` to the `AutoAdjustment` enum. These
 represent the new custom and newtype coercions, respectively. We add
 `UnsizePlaceHolder(Ty, Ty)` to the `UnsizeKind` enum to represent a placeholder
 adjustment due to an `Unsize` bound.
From 97452ca19a8dab989057b93c01c6e259bc86222e Mon Sep 17 00:00:00 2001
From: Nick Cameron
Date: Wed, 18 Mar 2015 15:00:57 +1300
Subject: [PATCH 3/5] Remove newtype coercions, make CoerceUnsized coercions more general

---
 text/0000-dst-coercion.md | 124 ++++++++++++++++++++------------------
 1 file changed, 66 insertions(+), 58 deletions(-)

diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md
index 2dea9d73a42..0a390ebce24 100644
--- a/text/0000-dst-coercion.md
+++ b/text/0000-dst-coercion.md
@@ -51,34 +51,60 @@ An example implementation:
 
 ```
 impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<Rc<U>> for Rc<T> {}
+impl<T: ?Sized+CoerceUnsized<U>, U: ?Sized> NonZero<U> for NonZero<T> {}
 
-// For reference, the definition of Rc:
+// For reference, the definitions of Rc and NonZero:
 pub struct Rc<T: ?Sized> {
     _ptr: NonZero<*mut RcBox<T>>,
 }
+pub struct NonZero<T>(T);
 ```
 
 Implementing `CoerceUnsized` indicates that the self type should be able to be
 coerced to the `Target` type. E.g., the above implementation means that
-`Rc<[i32; 42]>` can be coerced to `Rc<[i32]>`.
+`Rc<[i32; 42]>` can be coerced to `Rc<[i32]>`. There will be `CoerceUnsized` impls
+for the various pointer kinds available in Rust and which allow coercions, therefore
+`CoerceUnsized` when used as a bound indicates coercible types. E.g.,
 
+```
+fn foo<T: CoerceUnsized<U>, U>(x: T) -> U {
+    x
+}
+```
+
+Built-in pointer impls:
 
-## Newtype coercions
+```
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<Box<U>> for Box<T> {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<&'a U> for Box<T> {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<&mut 'a U> for Box<T> {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for Box<T> {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*mut U> for Box<T> {}
+
+impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a, 'b: 'a> CoerceUnsized<&'a U> for &mut 'b U {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<&mut 'a U> for &mut 'a U {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<*const U> for &mut 'a U {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<*mut U> for &mut 'a U {}
 
-We also add a new built-in coercion for 'newtype's. If `Foo<T>` is a tuple
-struct with a single field with type `T`, then coerce_inner(`Foo<T>`) = `Foo<U>`
-holds for any `T` and `U` where `T` coerces to `U`.
+impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a, 'b> CoerceUnsized<&'a U> for &'b U {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized, 'b> CoerceUnsized<*const U> for &'b U {}
 
-This coercion is not opt-in. It is best thought of as an extension to the
-coercion rule for structs with an unsized field, the extension is that here the
-field conversion is a proper coercion, not an application of `coerce_inner`.
-Note that this coercion can be recursively applied.
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for *mut U {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*mut U> for *mut U {}
+
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for *const U {}
+```
+
+Note that there are some coercions which are not given by `CoerceUnsized`, e.g.,
+from safe to unsafe function pointers, so it really is a `CoerceUnsized` trait,
+not a general `Coerce` trait.
 
 
 ## Compiler checking
 
 ### On encountering an implementation of `CoerceUnsized` (type collection phase)
 
+* If the impl is for a built-in pointer type, we check nothing, otherwise...
 * The compiler checks that the `Self` type is a struct or tuple struct and that
 the `Target` type is a simple substitution of type parameters from the `Self`
 type (one day, with HKT, this could be a regular part of type checking, for now
@@ -87,63 +113,46 @@ form `X/Y` where `X` and `Y` are both formal type parameters of the
 implementation (I don't think this is necessary, but it makes checking coercions
 easier and is satisfied for all smart pointers).
 * The compiler checks each field in the `Self` type against the corresponding field
-in the `Target` type. Either the field types must be subtypes or be coercible from the
-`Self` field to the `Target` field (this is checked taking into account any
-`Unsize` bounds in the environment which indicate that some coercion can take
-place). Note that this per-field check uses only the built-in coercion
-mechanics. It does not take into account `CoerceUnsized` impls (although we
-might allow this in the future).
+in the `Target` type. Assuming `Fs` is the type of a field in `Self` and `Ft` is
+the type of the corresponding field in `Target`, then either `Ft <: Fs` or
+`Fs: CoerceUnsized<Ft>` (note that this includes built-in coercions).
 * There must be only one field that is coerced.
-* We record in a side table a mapping from the impl to an adjustment. The
-adjustment will contain the field which is coerced and a nested adjustment
-representing that coercion. The nested adjustment will have a placeholder for
-any use of the `Unsize` bound (we should require that there is exactly one such use).
+* We record for each impl, the index of the field in the `Self` type which is
+coerced.
 
-### On encountering a potential coercion
+### On encountering a potential coercion (type checking phase)
 
 * If we have an expression with type `E` where the type `F` is required during
 type checking and `E` is not a subtype of `F`, nor is it coercible using the
-built-in coercions, then we search for an implementation of `CoerceUnsized<F>`
-for `E`. A match will give us a substitution of the formal type parameters of
-the impl by some actual types.
-* We look up the impl in the side table described above. The substitution is used
-with the placeholder in the recorded adjustment to create a new coercion which
-will map one field of the struct being coerced. That coercion should always be
-valid (if it is not, there is a compiler bug).
-* We create a new adjustment for the coerced expression. This will include the
-index of the field which is deeply coerced and the adjustment for the coercion
-described in the previous step.
-* In trans, the adjustment is used to codegen a coercion by moving the coerced
-value and changing the indicated field to a new type according to the nested
-adjustment.
+built-in coercions, then we search for a bound of `E: CoerceUnsized<F>`. Note
+that we may not at this stage find the actual impl, but finding the bound is
+good enough for type checking.
 
-### Adjustment types
+* If we require a coercion in the receiver of a method call or field lookup, we
+perform the same search that we currently do, except that where we currently
+check for coercions, we check for built-in coercions and then for `CoerceUnsized`
+bounds. We must also check for `Unsize` bounds for the case where the receiver
+is auto-deref'ed, but not autoref'ed.
 
-We add `AdjustCustom(usize, Box)` and
-`AdjustNewtype(Box)` to the `AutoAdjustment` enum. These
-represent the new custom and newtype coercions, respectively. We add
-`UnsizePlaceHolder(Ty, Ty)` to the `UnsizeKind` enum to represent a placeholder
-adjustment due to an `Unsize` bound.
 
-### Example
+### On encountering an adjustment (translation phase)
 
-For the above `Rc` impl, we record the following adjustment (with some trivial
-bits and pieces elided):
+* In trans (which is post-monomorphisation) we should always be able to find an
+impl for any `CoerceUnsized` bound.
+* If the impl is for a built-in pointer type, then we use the current coercion
+code for the various pointer kinds (`Box` has different behaviour than `&` and
+`*` pointers).
+* Otherwise, we lookup which field is coerced due to the opt-in coercion, move
+the object being coerced and coerce the field in question by recursing (the
+built-in pointers are the base cases).
 
-```
-AdjustCustom(0, AdjustNewType(
-    AutoDerefRef {
-        autoderefs: 1,
-        autoref: AutoUnsafe(mut, AutoUnsize(
-            UnsizeStruct(UnsizePlaceholder(T, U))))
-    }))
-```
 
-When we need to coerce `Rc<[i32; 42]>` to `Rc<[i32]>`, we look up the impl and
-find `T = [i32; 42]` and `U = [i32]` (note that we automatically require that
-`Unsize` is satisfied when looking up the impl). We can therefore replace the
-placeholder in the above adjustment with `UnsizeLength(42)`. That gives us the
-real adjustment to store for trans.
+### Adjustment types
+
+We add `AdjustCustom` to the `AutoAdjustment` enum as a placeholder for coercions
+due to a `CoerceUnsized` bound. I don't think we need the `UnsizeKind` enum at
+all now, since all checking is postponed until trans or relies on traits and impls.
+
 
 # Drawbacks
 
@@ -164,8 +173,7 @@ intrinsics. Although more flexible, this allows for implicit execution of
 arbitrary code. If we need the increased flexibility, I believe we can add a
 manual option to the `CoerceUnsized` trait backwards compatibly.
 
-The proposed design could be tweaked: we could make newtype coercions opt-in
-(this would complicate other parts of the proposal though). We could change the
+The proposed design could be tweaked: for example, we could change the
 `CoerceUnsized` trait in many ways (we experimented with an associated type to
 indicate the field type which is coerced, for example).
 
From 7468c306816c136dff320515de73e36061b16f7c Mon Sep 17 00:00:00 2001
From: Nick Cameron
Date: Tue, 24 Mar 2015 13:39:12 +1300
Subject: [PATCH 4/5] Address some comments

---
 text/0000-dst-coercion.md | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md
index 0a390ebce24..5c8ab6329a1 100644
--- a/text/0000-dst-coercion.md
+++ b/text/0000-dst-coercion.md
@@ -1,4 +1,4 @@
-- Feature Name: dst-coercions
+- Feature Name: dst_coercions
 - Start Date: 2015-03-16
 - RFC PR: (leave this empty)
 - Rust Issue: (leave this empty)
@@ -51,7 +51,7 @@ An example implementation:
 
 ```
 impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<Rc<U>> for Rc<T> {}
-impl<T: ?Sized+CoerceUnsized<U>, U: ?Sized> NonZero<U> for NonZero<T> {}
+impl<T: ?Sized+CoerceUnsized<U>, U: ?Sized> CoerceUnsized<NonZero<U>> for NonZero<T> {}
 
 // For reference, the definitions of Rc and NonZero:
 pub struct Rc<T: ?Sized> {
@@ -107,7 +107,9 @@ not a general `Coerce` trait.
 * If the impl is for a built-in pointer type, we check nothing, otherwise...
 * The compiler checks that the `Self` type is a struct or tuple struct and that
 the `Target` type is a simple substitution of type parameters from the `Self`
-type (one day, with HKT, this could be a regular part of type checking, for now
+type (i.e., that `Self` is `Foo<Ts>`, `Target` is `Foo<Us>` and that there exist
+`Vs` and `Xs` (where `Xs` are all type parameters) such that `Target = [Vs/Xs]Self`.
+One day, with HKT, this could be a regular part of type checking, for now
 it must be an ad hoc check). We might enforce that this substitution is of the
 form `X/Y` where `X` and `Y` are both formal type parameters of the
 implementation (I don't think this is necessary, but it makes checking coercions
@@ -115,7 +117,8 @@ easier and is satisfied for all smart pointers).
 * The compiler checks each field in the `Self` type against the corresponding field
 in the `Target` type. Assuming `Fs` is the type of a field in `Self` and `Ft` is
 the type of the corresponding field in `Target`, then either `Ft <: Fs` or
-`Fs: CoerceUnsized<Ft>` (note that this includes built-in coercions).
+`Fs: CoerceUnsized<Ft>` (note that this includes some built-in coercions, coercions
+unrelated to unsizing are excluded, these could probably be added later, if needed).
 * There must be only one field that is coerced.
 * We record for each impl, the index of the field in the `Self` type which is
 coerced.
@@ -156,8 +159,7 @@ all now, since all checking is postponed until trans or relies on traits and imp
 
 # Drawbacks
 
-Not as flexible as the previous proposal. Can't handle pointer-like types like
-`Option>`.
+Not as flexible as the previous proposal.
 
 # Alternatives
 
From 4b99006cac67f7ad89c670bcb351d32bc0d5d33f Mon Sep 17 00:00:00 2001
From: Nick Cameron
Date: Wed, 3 Jun 2015 16:55:41 +1200
Subject: [PATCH 5/5] eddyb's changes

---
 text/0000-dst-coercion.md | 28 +++++++++++-----------------
 1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/text/0000-dst-coercion.md b/text/0000-dst-coercion.md
index 5c8ab6329a1..59e27ae0c9b 100644
--- a/text/0000-dst-coercion.md
+++ b/text/0000-dst-coercion.md
@@ -51,13 +51,13 @@ An example implementation:
 
 ```
 impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<Rc<U>> for Rc<T> {}
-impl<T: ?Sized+CoerceUnsized<U>, U: ?Sized> CoerceUnsized<NonZero<U>> for NonZero<T> {}
+impl<T: Zeroable+CoerceUnsized<U>, U: Zeroable> CoerceUnsized<NonZero<U>> for NonZero<T> {}
 
 // For reference, the definitions of Rc and NonZero:
 pub struct Rc<T: ?Sized> {
     _ptr: NonZero<*mut RcBox<T>>,
 }
-pub struct NonZero<T>(T);
+pub struct NonZero<T: Zeroable>(T);
 ```
 
 Implementing `CoerceUnsized` indicates that the self type should be able to be
@@ -75,24 +75,18 @@ fn foo<T: CoerceUnsized<U>, U>(x: T) -> U {
 Built-in pointer impls:
 
 ```
-impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<Box<U>> for Box<T> {}
-impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<&'a U> for Box<T> {}
-impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<&mut 'a U> for Box<T> {}
-impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for Box<T> {}
-impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*mut U> for Box<T> {}
+impl<'a, 'b: 'a, T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<&'a U> for &'b mut T {}
+impl<'a, T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<&'a mut U> for &'a mut T {}
+impl<'a, T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for &'a mut T {}
+impl<'a, T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*mut U> for &'a mut T {}
 
-impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a, 'b: 'a> CoerceUnsized<&'a U> for &mut 'b U {}
-impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<&mut 'a U> for &mut 'a U {}
-impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<*const U> for &mut 'a U {}
-impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a> CoerceUnsized<*mut U> for &mut 'a U {}
+impl<'a, 'b: 'a, T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<&'a U> for &'b T {}
+impl<'b, T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for &'b T {}
 
-impl<T: ?Sized+Unsize<U>, U: ?Sized, 'a, 'b> CoerceUnsized<&'a U> for &'b U {}
-impl<T: ?Sized+Unsize<U>, U: ?Sized, 'b> CoerceUnsized<*const U> for &'b U {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for *mut T {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*mut U> for *mut T {}
 
-impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for *mut U {}
-impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*mut U> for *mut U {}
-
-impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for *const U {}
+impl<T: ?Sized+Unsize<U>, U: ?Sized> CoerceUnsized<*const U> for *const T {}
 ```
 
 Note that there are some coercions which are not given by `CoerceUnsized`, e.g.,