-
-
Notifications
You must be signed in to change notification settings - Fork 609
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add Placement New #17057
base: master
Are you sure you want to change the base?
add Placement New #17057
Conversation
Thanks for your pull request, @WalterBright! Bugzilla referencesYour PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog. Testing this PR locallyIf you don't have a local development environment setup, you can use Digger to test this PR: dub run digger -- build "master + dmd#17057" |
29a5064
to
4db86a4
Compare
311f2cb
to
fd23a02
Compare
I got it to work for some basic cases. Next comes extending it to more complex ones. |
6456c9a
to
c1d3813
Compare
Placement new for structs seem to be working now! |
5f00e37
to
49f4a4c
Compare
If i'm reading this right, this will allow classes in betterC, using a kind of "allocator argument" like in $OTHER_LANGUAGES? Once upon a time I made my own internal fork of dmd that forced betterC on all D code but of course that didn't last long because it couldn't compile Phobos. (I should clarify, my usecase is making a kernel with betterC and so far I have had to use a hacky mixin and alias this in order to use inheritance. Also, will this unlock the door for interfaces?) |
16cdbd6
to
448abed
Compare
I'm sorry, but using a union as primary (singular!) case-study is completely out of order. You can write this: new((&ue)[0..1]) IntegerExp(loc, ~e1.toInteger(), type); Yeah, it's not perfect, but the union is actually the problem... don't wreck everything for this one degenerate case. My wall of text above should be seriously considered... I think it's a really strong design, and solves a whole lot of problems that C++ (and D) has with lifetime flow, it makes the start/end of lifetimes feel natural while also making the typed variables naturally match the lifetimes of the values they represent, and it also solves convenient recycling of memory. |
Given:
how does this look:
? Where:
Not bad! Only one dirty cast for all the new's, no &, and the user needn't use a |
|
This is what a function called void[] mallocate(size_t);
ref T mallocate(T)()
{
return *new(mallocate(T.sizeof)) T;
} No casts. It's completely @safe. No function that returns a T should return an invalid/uninitialised T ever. |
My whole point is, there's an opportunity here to eliminate UB, invalid memory references, and unsafe casts associated with lifetime management from the language specification entirely... and that's a HUGE deal. If you haven't understood the point of what I'm suggesting, then we need to get on a call and talk. |
We both want the same thing. But realistically, there's no way to not have UB when allocating and initializing memory. That's why at some point it's going to have to not be verifiably safe by the compiler. As for void initialized objects,
we already have that in the language, and it is supported. But only in system code. Casting an array of void into a real object is always going to be system code. I appreciate your efforts to make this as safe as practical. But I also want the union case to be easy - this also applies to option types and sum types. And I want things to be correct without adding runtime checks. So here goes, I verified that it works:
I changed your proposal of a slice
Efficient, compact, and no monkey business. |
This said though; you keep using And sorry, let me be more clear; yes I understand there will be unsafe escape hatches (union for instance, that's unsafe by definition), and some low-level coercions are still available and useful, but what I'm trying to achieve is where the common case (allocate and initialise) have no casts, no weird shit, so that flow can actually be completely safe. I thought about static array briefly too. It seems workable, but it's got some ergonomic problems. You can't pass fixed arrays easily, because you need a different overload of a function for every array size, which means this will always be wrapped in something that; receives a dynamic array, does a bounds check, and then slices the appropriately bounded static array, then pass to a call like you show above. The nice thing about a fixed array is that you can elide runtime bounds checking, which is nice, but I think the ergonomic problem carries more weight. Memory allocation is NEVER a high-frequency operation; I can't imagine any situation where I wouldn't tolerate a runtime bounds check in the Perhaps you'd consider accepting BOTH fixed array and also dynamic array; if a dynamic array was supplied, perform a runtime bounds check, static array, static bounds check? I want to re-enforce though that my proposal is designed to work very nicely if we also make |
It doesn't work because it returns a
Placement new has a major use case for unions, option types, and sum types, and these can be very high frequency. See CTFE, which I sped up quite a bit by using emplace on a stack allocated union. I've also written a Javascript interpreter - the operand stack is a critical high-frequency piece of code, and the operands get overwritten like any stack. As I've shown, the static array method generates optimal code, so the user doesn't need to hack around it.
Not a problem, it's still castable to a pointer to a static array, since the length is known in advance.
I haven't implemented placement new for variable sized arrays yet. The proposed solution uses a template to take care of placement new expressions for types of fixed size. I don't think it is a big deal, it's just a few instructions.
I expect a different implementation will be required for new'ing variable length arrays, meaning a
Placement new is never going to be safe. Prefixing a destroy() means that void-initialized objects will also be destroyed, and that's UB for sure. |
e99540d
to
6d6e632
Compare
No, your code you wrote: ref void[T.sizeof] mallocate(T)() {
return *(cast(void[T.sizeof]*) malloc(T.sizeof)); // <-- this is gross and blind cast/thunk is unsafe
} You don't need to write that. Just write what I suggested instead; you don't need the thunk: ref void[T.sizeof] mallocate(T)() {
return malloc(T.sizeof)[0 .. T.sizeof]; // <-- this is not gross and it is safe
}
Don't cast, just slice it to the proper length.
I don't see why you think this... it's definitely safe. The example code you wrote above is already safe, ie: struct S { int i = 1, j = 4, k = 9; }
ref void[T.sizeof] mallocate(T)() {
return malloc(T.sizeof)[0 .. T.sizeof]; // no casts, nothing nasty
}
void main() {
S* ps = new(mallocate!S()) S;
assert(ps.i == 1);
assert(ps.j == 4);
assert(ps.k == 9);
} What's unsafe about this? It's completely fine by my reading.
I don't know what |
|
||
void test6() | ||
{ | ||
S6* ps = new(mallocate!S6()) S6; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the case we recently discussed... but what are all the other cases?
I thought we finally agreed on raw buffer initialisation using sized void buffers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whether the void array buffers or used or a variable is up to the user. I left it in because it makes it most convenient when dealing with unions / optional types / sum types. The implementation doesn't care what type it is, only that it is an lvalue of sufficient size.
I didn't add tests for other cases, as they would be redundant. A tutorial should include them, however.
void test0() | ||
{ | ||
int i; | ||
int* pi = new (i) int; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought we ejected this pattern into space? Please, just no.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I responded to this in the other comment.
6d6e632
to
efb517c
Compare
This is code straight ripped from my kernel:
I believe placement new would replace at least the d_new, but what about the d_delete? |
you're right, that is better. Thanks!
It's equivalent to this:
I.e. the compiler cannot tell what it's stepping on, as it does not know what the provenance of the |
Okay, so, what I'm suggesting by This assumption and assertion only works well when the function is specified to accept If the user has responsibility for the unsafe cast, then the operation itself is safe... this is quite specifically why I reject the idea that new would receive a T so vehemently; it fundamentally undermines the opportunity to make this whole thing safe, and wrecks lifetime analysis. |
D allows conversion of a type to *void in @safe code, but not the other way around. Passing a void[] through new to convert it to an object is not checkably safe. @System code is not necessarily unsafe, it just means its up to the user to ensure its safety as the compiler cannot do it. Memory allocation and initialization is always going to be uncheckable unless the language controls the whole process - like what gc new does and like what stack variables do.
Sure, but it cannot be marked @safe. Not accepting T as an argument means the user will have to cast it to void*, which will look ugly. Casting to void* is a safe operation, so it becomes an ugly construct with no added value. Taking a T argument is useful for unions, option types, and sum types. Although you don't use them, they are popular constructs (and I use them in D and would likely use them even more with placement new!). Note there's an inconsistency in the compiler - in @safe code, a The
Such efficiency matters a great deal with unions, option types, and sum types. We have to be competitive. |
I don't see how you get from
I don't think it looks ugly, it looks appropriate. It's visible to the reader that you're taking the buffer from under a value; which is what you are doing. I think the clarity of that operation holds huge value. And again, you're arguing from niche-cases; the default case that people will type doesn't encounter any of this; the 2 cases you've demonstrated that do are your union (unsafe by definition; special handling is completely appropriate), and a I think it's fundamentally bad for the language at the lowest level to promote or require handling invalid-but-already-typed objects. If the variable hasn't been initialised yet, it's memory, so it's Talking about option types and sum types is a red-herring; I use option types and sum types too, but they are a tool in the toolbox. You don't write placement new ever while using those objects; they have strong API associated with initialisation and assignment. Everything's internal, nobody will handle placement new expressions when interacting with those objects. Optimise for what people actually do in their code, which is allocate memory and then initialise it.
Is that an inconsistency? That's specifically what I'm taking advantage of.
They weren't "chosen"; you just trivially decided it with no warning or discussion, against the strongest advice I can possibly muster.
Complete red herring; I'm not suggesting anything that would change the codegen in any way whatsoever; I'm interested in getting the expression right; clear, self-explanatory, and relatively foolproof. Nothing I'm discussing changes a single thing about the codegen. I have proposed the OPTION to accept |
I avoided that because if
I use emplace 206 times in the compiler front end. I use a more primitive method extensively in the optimizer and back end. It's also used extensively in Phobos. We're just going in circles now repeating ourselves. The good news is that it works just fine for your use case. You can get rid of emplace now! I'm a bit less lucky, as we're still using an ancient version of D for the bootstrap compiler, grr grr.
This is a bit unfair. If Bob writes a proposal, he makes numerous decisions. People then ask for a reference implementation. Bob implements it, discovering that more decisions have to be made. Then the discussion happens. That's normally how things work. |
I don't follow. I don't see how there's any room for special cases here?
How is it used? If you're doing low-level/unsafe hack-ey stuff (like void init, and your unions), then just wrap a principled new in a function you like. The language should present the clearest and most fool-proof spec possible. The principle is as simple and clear as day: before the If you want to do some hacky stuff, then just wrap it in a function that does your preferred casts or type puns in whatever way you like; the world is your oyster. Don't wreck the spec for your personal convenience. If I'm wrong; the spec can be relaxed, but it can never be tightened...
It's because I consistently feel casually dismissed, and I always seem to feel that the only way I can be heard eventually is to repeat myself over and over. I'm having flashbacks of that time we needed to mangle C++ symbols with a namespace; that really did my head in! I don't know how to say this without sounding arrogant, but I know about this stuff better than most, and I am completely confident my spec on rvalue semantics, and placement new/delete (which are both related tools), will lead this aspect of the language to a really great place that people will be excited about. I wish you'd trust me enough to try my proposals verbatim before perverting them. Let them show their merit before casually dismissing them or butchering them into something I'm no longer excited about... you've spent a few days thinking about this stuff, but I've been baking for years or decades, and I also know the landscape; all these semantics are things I've been using for a very long time. I've worked on a broad and varied range of projects and written millions of lines of code in projects where these tools and patterns are present. I know what they're for, where they lead, and how they scale. We must have gone round in circles on the rvalue conversation no less than 10 times (possibly more!), but we got there eventually... so I'm not sure going around in circles isn't eventually productive.
This is a different matter. 'Fine' is not the benchmark we should be trying to achieve when finally correcting some of the most fundamental aspects of the D spec. This is a once-in-a-lifetime opportunity to get this stuff really right... and from my point of view, there is nothing in the language that needs work more important than these 2 things.
I don't understand. I see that the other way around. From my point of view; this is my proposal... I'm glad you liked it and ran with it, but there was a lot more to it that I couldn't easily share in the initial conversation we had. |
I realised what you meant here... this situation naturally occurs when you want to new an array though, so the case still has to be handled? |
efb517c
to
606be08
Compare
I am catching up after 4 days without internet or power. Seattle is a technologically advanced city with a primitive electric grid.
I'm sorry you got that impression. I've provided a rationale for each decision.
You did indeed propose The first line of the first post of the DIP in the n.g. on Oct 30: "Based on a suggestion by Manu Evans:" The dip is about passing an lvalue rather than a pointer.
It's a good thing we're friends, Manu! If you want things verbatim, write the DIP the way you want it. I wrote it the way I thought would work best. I've never seen an idea that survived into a specification without modification, like the replacement of void* with void[]. I've also never seen a specification that survived into an implementation without modification. (I rewrote the DIP after this implementation.) You've wanted 3 things, and I want them too:
The first two now have implementations and will work for your use cases. The other aspect of placement new works for my use cases. We should be both happy about this.
I didn't see the more, other than what I mentioned earlier. I know placement new from C++, having implemented it. I also implemented that in an earlier version of D: https://dlang.org/deprecate.html#Class%20allocators%20and%20deallocators and the implementation code for it is still present (and I made some use of it!).
Ok, but you've also made it clear it is not what you proposed :-) The hinge of our disagreement is placement new cannot be made |
Not at all, the hinge of our disagreement is that accepting a By writing: T x;
new(x) T; There's nothing you could do to make it look more like a value of I think if you were going to design this API, it's not possible to make the API worse than that; it communicates exactly the wrong thing, and I don't think you could inspire misunderstanding and user error more confidently if you tried. Before Yes I know you can write |
Could the current semantics of C++'s " Placement new" described in https://en.cppreference.com/w/cpp/language/new give insights on what type(s) that should be allowed to be passed as argument to |
C++ accepts |
The trouble with the function apparently accepting |
This implements Placement New, described in https://github.com/WalterBright/documents/blob/master/placementnew.md