Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In what order are isolated assertions verified? #5862

Open
hmijail opened this issue Oct 26, 2024 · 12 comments
Open

In what order are isolated assertions verified? #5862

hmijail opened this issue Oct 26, 2024 · 12 comments
Labels
part: documentation Dafny's reference manual, tutorial, and other materials

Comments

@hmijail
Copy link

hmijail commented Oct 26, 2024

What change in documentation do you suggest?

In the JSON log of dafny measure-complexity --isolated-assertions, are vcResults assigned a vcNum according to their verification order? For example, is vcNum 1 verified before vcNum 2?

Asking because I have a log that lists the vcNums of the vcResults of a verificationResult out of order. E.g., I'm seeing a vcResult list with vcNums [4,1,3,2] in the JSON. So I wonder if they are being verified in JSON order or in vcNum order. This is important because, IIUC, if e.g. vcNum 3 fails verification, then subsequent verifications are unreliable.

When those example vcNums are sorted as [1,2,3,4], their line:col follow the order one encounters when reading the source code. This is the order that I intuitively (naively?) expected for verification. However some languages change the order of evaluation of things; so I wonder how this works during Dafny's verification.

Issue #5805 also made me think of this.

@hmijail hmijail added the part: documentation Dafny's reference manual, tutorial, and other materials label Oct 26, 2024
@hmijail
Copy link
Author

hmijail commented Oct 28, 2024

I noticed that with --cores=1, the JSON order seems to be the vcNum order. So I guess that solves one source of order variation.

Still remains the question of whether verification of vcNum x+1 logically follows vcNum x.

@keyboardDrummer
Copy link
Member

are vcResults assigned a vcNum according to their verification order? For example, is vcNum 1 verified before vcNum 2?

Usually one verifies using multiple cores, so multiple could be verifying at the same time, and the first assertion to start might be the last to finish.

if e.g. vcNum 3 fails verification, then subsequent verifications are unreliable.

Can you elaborate on what the problem is? As long as you get at least one error when a method/lemma does not verify, that's fine right?

@hmijail
Copy link
Author

hmijail commented Nov 5, 2024

I am measuring the brittleness of isolated assertions to estimate which ones could yield the biggest improvement if "stabilized".
Hence, it would be great to know which assertions merit attention and which can be discarded because a previous failure.
And for that, I need to know the definition of "previous".

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Nov 5, 2024

I don't see why you need to take into account previous failures. I think you could use dafny measure-complexity, which returns the VCs that had the highest resource count, and optimize those. How does that sound? If you're using --isolate-assertions, dafny measure-complexity will show the assertions with the highest resource count usage.

@hmijail
Copy link
Author

hmijail commented Nov 5, 2024

I don't see why you need to take into account previous failures.

If I understood correctly, here it was established that a failed assertion renders any following assertion useless, right?

I am currenly using measure-complexity --isolate-assertions. In each iteration, some assertion might fail, which taints the following assertions. Those tainted assertions' resource usage shouldn't go into the stats.

Or am I missing something?

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Nov 5, 2024

If I understood correctly, #5618 (comment) it was established that a failed assertion renders any following assertion useless, right?

No, that was only the case in that particular example, because the previous assertion proved false.

I think you'd be OK not doing anything particular when an assertion fails, and still looking at the resource counts of subsequent assertions.

In the example where you have:

assert false;
assert SomeThingDifficult(x,y,z);

Then the resource count of assert SomeThingDifficult(x,y,z) will be close to zero, because assert false before it means that anything can be proven, so then the difficult assertion will never be flagged as needing to be optimized.

To conclude, if there are assertions that have a high resource count, that means they are doing significant work despite the assertions before them, and they could benefit from being optimized.

@hmijail
Copy link
Author

hmijail commented Nov 5, 2024

But the problem is that it will rarely be as clear as assert false. Instead, it will be something that sometimes fails, sometimes verifies correctly, and sometimes times out. Will any of those turn into an assume false for subsequent assertions?

I guess that answering that is too complicated, and that's why I am trying to just bypass the whole problem by ignoring the "tainted" assertions.

So to circle back... isn't there an order to assertions?

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Nov 5, 2024

I don't see a problem to solve. If you have:

assert ProvingThisIsDifficultAndShouldBeOptimized(0);
assert ComplicatedButBoilsDownToFalse();
assert ProvingThisIsDifficultAndShouldBeOptimized(1);

then dafny measure-complexity will show you only ProvingThisIsDifficultAndShouldBeOptimized(0) consumed many resources, so it automatically avoids reporting the "tainted" ProvingThisIsDifficultAndShouldBeOptimized(1) as problematic.

@hmijail
Copy link
Author

hmijail commented Nov 5, 2024

  1. In your example, if assertions are not ordered / can be reordered, then how do we know that ComplicatedButBoilsDownToFalse wasn't asserted first and then tainted everything else?
    If this can happen, the implication is that if there is any consistent failure in an isolated assertion, then the other isolated assertions in the member can't be trusted.
    Is this right?

  2. If as you said an assert false makes subsequent assertions' consumption go close to zero, that would also be an useful heuristic to indirectly tease out problems. But again, only if there is an order!

Taking a step back, it looks like it's difficult to state whether there is an order or not? Should I understand that there isn't one? Or, can you point to something I could read to learn more?

@keyboardDrummer
Copy link
Member

keyboardDrummer commented Nov 6, 2024

I'm not sure what you mean by assertions being ordered or not. Statements in a program are ordered, and the order of statements matters to the meaning of the program. All statements that occur after an assert statement may assume that the asserted expression holds, so the order of assert statements can affect their verification.

For example:

assert x > 5; // This may fail
assert x > 4; // This will never fail because it is guaranteed by assuming `x > 5`

Versus

assert x > 4; // This may fail
assert x > 5; // This may fail as well

It is safe to assume the expression of an assertion after asserting it, because if it not true, then at least one error will be emitted and Dafny will not let you run your program.

@hmijail
Copy link
Author

hmijail commented Nov 6, 2024

I'm not sure what you mean by assertions being ordered or not.

The mental model I'm fighting is that of the C standards, which allow compilers to reorder statements and evaluation of expressions with rather minimal requirements, for the sake of optimization. This is one source for nasty surprises and much "language lawyering".

In Dafny, my initial assumption was that there is no such reordering; that if vcNum 3 fails, only assertions with vcNum>3 will be tainted. So when I found unordered vcNums in the logs, I feared some C-type reordering might be happening.

From what you are saying now, that seems to not be the case. That's all I wanted to know.

@keyboardDrummer
Copy link
Member

Okay, good. The vcNum is an internal number and may or may not relate to statement ordering, but it has no effect on semantics.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
part: documentation Dafny's reference manual, tutorial, and other materials
Projects
None yet
Development

No branches or pull requests

2 participants