Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add __rvalue(expression) builtin #17050

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

WalterBright
Copy link
Member

This adds the __rvalue(expression) builtin, which causes expression to be treated as an rvalue, even if it is an lvalue.

@WalterBright WalterBright added Enhancement WIP Work In Progress - not ready for review or pulling labels Nov 3, 2024
@dlang-bot
Copy link
Contributor

Thanks for your pull request, @WalterBright!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "master + dmd#17050"

@WalterBright WalterBright force-pushed the __rvalue branch 4 times, most recently from 0f383fd to 60c0f9e Compare November 3, 2024 07:24
@thewilsonator thewilsonator added Needs Changelog A changelog entry needs to be added to /changelog Needs Spec PR A PR updating the language specification needs to be submitted to dlang.org labels Nov 3, 2024
@WalterBright WalterBright force-pushed the __rvalue branch 5 times, most recently from 082c48d to 6e0e44f Compare November 4, 2024 08:07
@nordlow
Copy link
Contributor

nordlow commented Nov 4, 2024

Can this __rvalue(...) be used in place of core.lifetime.move(...)?

@nordlow
Copy link
Contributor

nordlow commented Nov 4, 2024

Will the call to use in

S s;
__rvalue(s)
use(s);

be either

  1. a defined behavior where s is set to S.init by __rvalue(s)
  2. a compiler error (optionally in @safe code) at least in the case where S has indirections or destructors, or
  3. an undefined (@system) behavior when S has indirections or destructors

?

I prefer option 2 and is the closest to what Rust does. Can it be implemented with constant-time overhead by adding a status bit to the (parameter) variable declaration indicating that its contents has been invalidated by a move?

I recently saw a https://youtu.be/08gvuBC-MIE?t=1839 in which Jon Kalb admits that the standard committe made a mistake by forcing moved from object to in a so called "fully formed state" instead of a so called "partially formed state". As this prevents certain kinds of optimizations. For details see https://youtu.be/08gvuBC-MIE?t=1839.

@WalterBright
Copy link
Member Author

@nordlow I'll write a proper document for this after I figure out just what the end result will be!

@WalterBright WalterBright force-pushed the __rvalue branch 2 times, most recently from 86b02a7 to 8ccedfb Compare November 5, 2024 06:41
@nordlow
Copy link
Contributor

nordlow commented Nov 5, 2024

Thanks. Please see updates to my comment at #17050 (comment).

@WalterBright
Copy link
Member Author

My opinion on move semantics is that once you've moved s to t, then s's lifetime is over, and it should be in the default initialized state (a concept C++ doesn't have).

@nordlow
Copy link
Contributor

nordlow commented Nov 5, 2024

My opinion on move semantics is that once you've moved s to t, then s's lifetime is over, and it should be in the default initialized state (a concept C++ doesn't have).

Is this realized by

  1. applying, if present, move constructor of S moving s to t and
  2. resetting all the bytes at s to S.init?

@nordlow
Copy link
Contributor

nordlow commented Nov 5, 2024

Have you considered it making it a compiler error to access s after it has been moved? If not, why? I'm asking because this would lead to slight better performance in debug mode at least. And this is one of the reasons why Rust has this behavior.

@WalterBright
Copy link
Member Author

@WalterBright
Copy link
Member Author

Have you considered it making it a compiler error to access s after it has been moved?

Yes, but it requires Data Flow Analysis, which is slow.

@nordlow
Copy link
Contributor

nordlow commented Nov 6, 2024

Have you considered it making it a compiler error to access s after it has been moved?

Yes, but it requires Data Flow Analysis, which is slow.

Ok, thanks.

Afaict, the complexity of supporting Rust-style r-value semantics depends on the context in which __rvalue would be used. For instance, in

S use(S);
S x;
auto y = __rvalue(x)
use(y); // allowed
use(x);  // disallowed

such a analysis could be implemented in the compiler with negligible overhead using an extra status bit in the Declaration node.

But in the general case I realize now that the compiler needs to recurse into all function calls that are passed l-values by reference as arguments.

Do you have a good reference to which data flow analysis in general and its applications such as this one?

@nordlow
Copy link
Contributor

nordlow commented Nov 6, 2024

I currently experimenting with using __rvalue defined in the branch of this MR in my code. Is __rvalue currently supposed to be wrapped in core.lifetime.move? If so, is

static if (__traits(compiles, { int x; const y = __rvalue(x); })) {
	import core.stdc.string : memcpy;
	T move(T)(return scope ref T source) @trusted {
		scope(exit) {
			static immutable init = T.init;
			memcpy(&source, &init, T.sizeof);
		}
		return __rvalue(source);
	}
	void move(T)(ref T source, ref T destination) @trusted {
		scope(exit) {
			static immutable init = T.init;
			memcpy(&source, &init, T.sizeof);
		}
		destination = __rvalue(source);
	}
} else
	public import core.lifetime : move;

/// unary move()
pure nothrow @nogc @safe unittest {
	auto x = S(42);
	assert(x == S(42));
	const y = move(x);
	assert(y == S(42));
	assert(x == S.init);
}

/// binary move()
pure nothrow @nogc @safe unittest {
	auto x = S(42);
	assert(x == S(42));
	S y;
	move(x, y);
	assert(y == S(42));
	assert(x == S.init);
}

version(unittest) {
	struct S { @disable this(this); int x; }
}

a suitable rewrite of the core.lifetime.move overloads?

@TurkeyMan
Copy link
Contributor

__rvalue() is a bad name for a move() intrinsic, the answer you seek is yes, __rvalue is exactly a replacement for move(), and it should be named move(). I'll argue for this before it's merged, but we're making very good progress here! :)

@TurkeyMan
Copy link
Contributor

Also no, this can't be 'wrapped', it's an intrinsic; it needs to be renamed move, you can't wrap it in a function named move.

@TurkeyMan
Copy link
Contributor

The end goal is to completely delete core.lifetime. Don't try to shoehorn this in there; that stuff is all dead.

@nordlow
Copy link
Contributor

nordlow commented Nov 6, 2024

What about the binary overload of move() and moveEmplace?

I'm asking because this MR needs to include the druntime modifications to core.lifetime that makes full use of __rvalue for the sake of deletion of the current very convoluted implementations of move and moveEmplace in core.lifetime.

It's important to note that current behaviour of core.lifetime.move conditionally resets the T source to its T.init value when certain conditions hold for T; specifically

"If T is a struct with a destructor or postblit defined, source is reset to its .init value after it is moved into target, otherwise it is left unchanged. "

I'm not sure this is in line with the behavior of __rvalue that Walter proposes in this MR.

See https://dlang.org/phobos/core_lifetime.html#.move for details.

Luckily druntime is now part of the dmd repo.

@tgehr
Copy link
Contributor

tgehr commented Nov 6, 2024

I agree with @nordlow that __rvalue is more low-level than move. E.g., what happens if you pass a non-POD struct twice to the same function call using __rvalue.

@nordlow
Copy link
Contributor

nordlow commented Nov 6, 2024

I agree with @nordlow that __rvalue is more low-level than move. E.g., what happens if you pass a non-POD struct twice to the same function call using __rvalue.

Nevertheless, I personally believe it is highly preferable to make all the overloads of move and moveEmplace become builtins using their existing name. This is gonna be a breaking change for code that rely on those symbols being templates but projects can be adjusted.

Btw, I'm working on migrating std.traits to become builtin __traits for the sake of reducing template bloat in std.traits. I yet again remind us all of the fact that the C++ standard, per definition, enforces implementations to lower all symbols in std.traits to builtins. For the same reason that I believe that most (or all) druntime's traits and std.traits should be converted to builtin __traits.

@WalterBright
Copy link
Member Author

such a analysis could be implemented in the compiler with negligible overhead using an extra status bit in the Declaration node.

Such certainly looks tempting, but it falls short as soon as the flow control becomes non-trivial. The flow analysis I know comes from class notes in a class I took on optimizations.

@WalterBright
Copy link
Member Author

emplace() will be replaced with the placement new operator, I posted a DIP on it in the development forum.

@WalterBright
Copy link
Member Author

E.g., what happens if you pass a non-POD struct twice to the same function call using __rvalue.

Currently it will get destructed twice.

@WalterBright
Copy link
Member Author

This work is heavily based on Timon Gehr's and Manu Evans' contributions.

@TurkeyMan
Copy link
Contributor

E.g., what happens if you pass a non-POD struct twice to the same function call using __rvalue.

Currently it will get destructed twice.

Another case naturally resolved with caller-destruction!

@TurkeyMan
Copy link
Contributor

I've started working with this branch and trying to build out the peripheral stuff.
One issue that needs to be solved is this error:

struct S
{
    this(typeof(this) s) { writeln("move"); }
    this(ref typeof(this) s) { writeln("copy"); }
}
error : `struct S` may not define both a rvalue constructor and a copy constructor

This error is erroneous; it is not only valid, but it is indeed the entire point of this work.
Can we delete that error message and then see there that leads? I wonder if there might need to be some tweak to proper overload selection with respect to rvalue/lvalue arguments?

Given the struct above, we should be able to write:

void main()
{
  S x;
  S a = x;
  S b = __rvalue(x);
}

And this should emit the output:

copy
move

@TurkeyMan
Copy link
Contributor

There's also still this issue:

struct S
{
    int i, j, k;

    this(S s)
    {
        S* a = &s; // if i assign the addresses to locals...
        S* b = &x;
        assert(a is b); // this works

        assert(&s is &x); // this doesn't work
    }
}

__gshared S x;

void main()
{
//    S t = __rvalue(x); // this doesn't work; it should call the constructor, but it doesn't
    S t2 = S(__rvalue(x)); // this does call the constructor...
}

It's weird that if I assign the pointers to locals and compare them, it's fine, but if I compare the & expressions, the assert fails.

Also, the first statement in main doesn't call the constructor like it should.

@TurkeyMan
Copy link
Contributor

I can't do anything with __rvalue until these issues are resolved.

@WalterBright
Copy link
Member Author

Ok I will check them out.

@WalterBright
Copy link
Member Author

Can we delete that error message and then see there that leads?

It leads to reopening https://issues.dlang.org/show_bug.cgi?id=19871 and https://issues.dlang.org/show_bug.cgi?id=23036

@TurkeyMan
Copy link
Contributor

Okay, well we'll need to work thorough those properly... is that change made in your branch, or can you commit a branch with that change so I can build it?

@WalterBright
Copy link
Member Author

--- a/compiler/src/dmd/clone.d
+++ b/compiler/src/dmd/clone.d
@@ -1655,7 +1655,7 @@ bool needCopyCtor(StructDeclaration sd, out bool hasCpCtor)

     if (cpCtor)
     {
-        if (rvalueCtor)
+        if (0 && rvalueCtor)
         {

@WalterBright
Copy link
Member Author

The problem is it goes into infinite recursion and runs off the end of the stack

@TurkeyMan
Copy link
Contributor

TurkeyMan commented Nov 28, 2024

Okay, well I'm sure that's fixable; just some semantic or terminal condition isn't geared quite right.
C++ has a similar-ish overload set, and it never caused any problems. There's a solution here...

@TurkeyMan
Copy link
Contributor

What about this one:

S t = __rvalue(x); // this doesn't work; it should call the constructor, but it doesn't

And also that odd one where I can compare the pointers if I first assign them to locals, but I can't compare the & expressions directly? I've seen a lot of weirdness in various tests I've setup; I reckon whatever weird-ness leads to that & comparison bug might solve other problems I've encountered.

@TurkeyMan
Copy link
Contributor

TurkeyMan commented Nov 28, 2024

Okay, I'm seeing the nature of the issue. The rules surrounding when default ctors should be generated need to be updated in relation to move semantics, and that spec update should inhibit the issues you're seeing.

I'm having a bit of a poke to see if I can implement the changes, but it might be a bit too broad for my level of compiler knowledge...

What I'm trying to do:

  1. there's a lot of code relating isRvalueConstructor hasRvalueConstructor, etc; those should be renamed is/hasMoveConstructor; and then this:
dsymbolsem.d:
        sd.hasCopyCtor = buildCopyCtor(sd, sc2);
+++     sd.hasMoveCtor = buildMoveCtor(sd, sc2);
        sd.postblit = buildPostBlit(sd, sc2);

Because a default move constructor will become a concept, and then some rules regarding when constructors are synthesised need adjustment, something like:

COPY CTOR:

  this(ref S);

  Compiler generates default copy ctor if:
  1. no user defined copy ctor
  2. no user defined move ctor (possibly issue an error that if a move constructor is specified, then a copy constructor should also be specified)
  3. no user defined copy-assignment (issue an error that it is illegal to specify elaborate assignment but not also elaborate construct)
  4. all members can copy (ie, not-const, copy ctor is not @disable, etc)
  -> do: element-wise copy-construction

  #2 may not be strictly necessary, but I see it inviting errors if users specify an elaborate move, but not an elaborate copy


MOVE CTOR (formerly "rvalue constructor"):

  this(S);

  ** if -preview=rvaluerefparam

  Compiler generates default move ctor if:
  1. no user defined copy ctor
  2. no user defined move ctor
  3. no user defined move-assignment (issue an error that it is illegal to specify elaborate assignment but not also elaborate construct)
  4. all members can move (ie, not const, move ctor is not @disable, etc)
  -> do: element-wise move-construction

  In this case, if a copy-ctor WAS specified, a default move will NOT be generated; the rvalue argument which would normally select the move-ctor will naturally match the copy-ctor as fall-back because rvalue can pass to ref, and this is the appropriate behaviour.


  ** if NOT -preview=rvaluerefparam

  Compiler generates default move ctor if:
  1. no user defined move ctor
  2. no user defined move-assignment (issue an error that it is illegal to specify elaborate assignment but not also elaborate construct)
  3. all members can move (ie, not const, move ctor is not @disable, etc)
  -> do: IF user-defined copy-ctor is specified; forward to copy ctor
         ELSE element-wise move-construction

  Element-wise move is identical to the default-copy; but just wrap `__rvalue(el)` for each element; which will invoke appropriate move semantics accordingly.

The reason that it should fall-back to move construction or forward to move construction, is that we can't know if there's any 'special sauce' in the copy that performs special record keeping, which would need to be mirrored in a move constructor; so we can't safely synthesise a move ctor in the presence of a copy ctor. We must defer to the user-implemented logic.

So; no additional logic needs to be added to the compiler if -preview=rvaluerefparam, because the fall-back will happen naturally (since calling with an rvalue can select a ref; the copy ctor). If that is not enabled; the compiler needs to implement a forward to the copy ctor.

There's a secondary conversation about similarly synthesising default opAssign's, but there are 2 or 3 reasonable options, and that conversation can wait for later....

@TurkeyMan
Copy link
Contributor

@WalterBright I don't really know how to diagnose compiler issues without a lot of trouble when they become complicated; is there some dev logging, or anything like that which emits the flow of of the compiler doing it's work? I don't see how to easily follow what it's doing without intensively time-consuming stepping through the debugger, which illustrate the program flow but mostly obscuring the context of the source code or expressions that are being evaluated at any given moment.

@WalterBright
Copy link
Member Author

I just add custom, targeted printf's as needed, and when I solve the problem, I delete them. Enabling generic logging results in a hopeless mass of irrelevant detail.

@WalterBright
Copy link
Member Author

The cause of the infinite loop is this line:

            if (!isCopyConstructorCallable(argStruct, arg, tprm, sc, pMessage))

in the function:

private extern(D) MATCH argumentMatchParameter (TypeFunction tf, Parameter p,
    Expression arg, ubyte wildmatch, int flag, Scope* sc, const(char)** pMessage)

I have yet to figure out how to fix it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement Needs Changelog A changelog entry needs to be added to /changelog Needs Spec PR A PR updating the language specification needs to be submitted to dlang.org WIP Work In Progress - not ready for review or pulling
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants