Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: execution traces #199

Merged
merged 52 commits into from
Oct 10, 2023
Merged

feat: execution traces #199

merged 52 commits into from
Oct 10, 2023

Conversation

karmacoma-eth
Copy link
Collaborator

(draft, will need to rebase and fix ci probably)

the traces in test_symbolic_create() now look like this:

Traces:
    this_address::Concat(2987291460, p_x_uint256)
        new contract @ 0xaaaa0002::0x60a060405234801561001057600080fd5b506040516101073803806101078339818101604052810190610032919061008c565b602a8103610043576100426100b9565b5b8060808181525050506100e8565b600080fd5b6000819050919050565b61006981610056565b811461007457600080fd5b50565b60008151905061008681610060565b92915050565b6000602082840312156100a2576100a1610051565b5b60006100b084828501610077565b91505092915050565b7f4e487b7100000000000000000000000000000000000000000000000000000000600052600160045260246000fd5b60805160096100fe6000396000505060096000f3fe6080604052600080fd(p_x_uint256)
        ← 6080604052600080fd
    ← 0000000000000000000000000000000000000000000000000000000000000001

Traces:
    this_address::Concat(2987291460, p_x_uint256)
        new contract @ 0xaaaa0002::0x60a060405234801561001057600080fd5b506040516101073803806101078339818101604052810190610032919061008c565b602a8103610043576100426100b9565b5b8060808181525050506100e8565b600080fd5b6000819050919050565b61006981610056565b811461007457600080fd5b50565b60008151905061008681610060565b92915050565b6000602082840312156100a2576100a1610051565b5b60006100b084828501610077565b91505092915050565b7f4e487b7100000000000000000000000000000000000000000000000000000000600052600160045260246000fd5b60805160096100fe6000396000505060096000f3fe6080604052600080fd(p_x_uint256)
        ← 4e487b710000000000000000000000000000000000000000000000000000000000000001 (error: <class 'halmos.exceptions.Revert'>)
    ← 0000000000000000000000000000000000000000000000000000000000000000
+ return_scheme in message output
+ new color scheme
Every test has a revert case where value > 0, before splitting and actually executing the test logic with value == 0
EvmExceptions: halt the current context, but execution continues normally
HalmosExceptions: halt the current path, but the test continues normally

+ add stuck() method to CallContext (returns internal errors encountered by itself or any of its subcalls)
+ better trace rendering for stuck executions
+ change subcalls() and logs() to be iterators instead of lists
@daejunpark
Copy link
Collaborator

btw, not sure why the ci tests aren't triggered. would it be because of the conflicts?

f"Unsupported console function: selector = 0x{funsig:0>8x}, "
f"calldata = {hexify(arg)}"
)
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no HalmosException here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no we should continue execution, we don't really care if a console call is missing IMO


ex.calls.append((exit_code_var, exit_code, ex.output))
# TODO: check if still needed
ex.calls.append((exit_code_var, exit_code, ex.context.output.data))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently call_id is generated using calls: call_id = len(ex.calls). but perhaps we can consider removing them in a separate pr.

src/halmos/sevm.py Outdated Show resolved Hide resolved
src/halmos/sevm.py Outdated Show resolved Hide resolved
offset : offset + 32
]
)
)
elif opcode == EVM.CALLDATASIZE:
if ex.calldata is None:
calldata = None if ex.message().is_create() else ex.message().data
# TODO: is optional calldata necessary?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if i remember correctly, we discussed this before, and concluded that we require calldata to be given explicitly. but let's deal with this in a separate pr.

src/halmos/sevm.py Outdated Show resolved Hide resolved
@@ -2464,8 +2634,15 @@ def run(self, ex0: Exec) -> Tuple[List[Exec], Steps]:
offset: int = int_of(ex.st.pop(), "symbolic RETURNDATACOPY offset")
# size (in bytes)
size: int = int_of(ex.st.pop(), "symbolic RETURNDATACOPY size")

# TODO: do we need to pass returndatasize here?
Copy link
Collaborator

@daejunpark daejunpark Sep 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, RETURNDATACOPY doesn't zero-pad, but revert for out-of-bounds access.
https://github.com/ethereum/execution-specs/blob/master/src/ethereum/shanghai/vm/instructions/environment.py#L444-L447

note that all the other copy opcodes (CODECOPY, EXTCODECOPY, CALLDATALOAD, CALLDATACOPY) do zero-pad, lol

let's comment this non-intuitive behavior here for future contributors.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah ok I didn't realize this was the purpose. Made some related improvements in 623a464

src/halmos/sevm.py Outdated Show resolved Hide resolved
# vm state
this: Address # current account address
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need to keep this separately? it should be always equal to context.message.target, right?

if keeping this is only for convenient purpose, we can remove this, and write a helper function this() like caller(). this way, we can avoid making inconsistency by mistake in the future.

src/halmos/sevm.py Show resolved Hide resolved
giving explicit contract and test function names makes it easier to invoke a specific test from the command line
also:

- changes the system recursion limit so that we can in fact process up to 1024 messages deep
- introduces a MAX_CALL_DEPTH global constant in halmos.sevm
- sevm.run() now checks the current execution context and throws if it is over MAX_CALL_DEPTH
- splits CallContext.stuck() into:
     - CallContext.is_stuck(), O(1) check that returns a bool
     - CallContext.get_stuck_reason(), O(n) traversal to fetch the internal exception
- we use None as a sentinel value for context.output.data, meaning that if execution finishes
  with no output (as opposed to empty output), then it encountered an error and is stuck.
  As a result, we must be careful to:
     - not set context.output.data=None on normal halting and EvmException (hence the change
       to mloc() to return empty bytes instead of None)
     - set context.output.data=None on HalmosException (and really any other unexpected issue)

TODO:
- processing calls 1024 is quite slow (20+s), will investigate and fix later
Copy link
Collaborator

@daejunpark daejunpark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did a second pass of review, and found a couple more semantics issues.

i think the outstanding semantics issues need to be fixed before merging this pr. the performance issues could be handled in a separate pr, though.

in the meantime, today, i will write more tests for semantics correctness, and include them in this pr. that might also help for fixing the merge conflicts with the current main.

hevm_fail = isinstance(context.output.error, FailCheatcode)
return hevm_fail or any(is_global_fail_set(x) for x in context.subcalls())


Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: evaluate performance and make it O(1) if needed.

execs_to_model.append((idx, ex))
else:
stuck.append((opcode, idx, ex))
continue
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since is_global_fail_set is not O(1), we may want to reorg this if-statements sequence, so that it isn't executed for not error or non-assert-fail reverts at all. or, we can just make it O(1).

not necessarily to be done in this pr, but at least commenting it as todo.

src/halmos/sevm.py Outdated Show resolved Hide resolved
src/halmos/sevm.py Show resolved Hide resolved

# add to worklist
new_ex.next_pc()
stack.append((new_ex, step_id))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

todo: now that the create failure semantics is supported, this for-loop can be made consistent with the one for call_known(). further, their common logic can be factored out to avoid duplication.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed, do you mind if we do a cleanups as a subsequent PR though in the interest of getting this merged faster?

"max_width" in self.options
and len(out) >= self.options["max_width"]
):
if len(out) >= self.options.get("max_width", 2**64):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2**64 is nice. can we just have it as the default value of max_width, so that it can be also displayed in the --help message?

- type annotations
- missing assert
- remove unused output field from Exec
- in calldata(): return empty list instead of None of create calls
target=new_addr,
caller=caller,
value=value,
data=create_hexcode,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is the calldata set to the creation code here?

according to the evm spec, calldata should be set to empty:
https://github.com/ethereum/execution-specs/blob/master/src/ethereum/shanghai/vm/instructions/system.py#L116-L117

if it's only for the tracing purpose, i think it is better to create a new field for its own, rather than repurposing the calldata field. i know that there is the calldata() helper function, but given that the calldata is crucial for correctness, i'd like to avoid any potential mistakes caused by this discrepancy in the future.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here is how I ended up this design. We need to represent either a regular message (has a target and calldata but no code) or a create message (technically has no target address and no calldata but has code)

a. we could represent them both in the same class using disjoint fields (ie. separate code and calldata fields)
b. we could represent them both in the same class with overloaded fields
c. we could represent them using separate classes like MessageCall and CreateCall

I went with (b.) because the semantic difference is pretty small and easily handled with a helper function to know if message.data corresponds to code or calldata. I similarly "hijacked" the target field because it's convenient to track the address being executed in that context.

I'm not a fan of option (a.) because the semantics of that union object is weird and you could end up in an inconsistent state (having both code and calldata), I think if we're going to refactor going with option (c.) would be cleaner:

  • it would be very clear at creation time if you're making a MessageCall or CreateCall
  • checking if something is a message call or create call is done simply via a type check rather than a helper function
  • they could easily be rendered/printed/traced differently
  • it's impossible to end up in an inconsistent state

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i think we need more time to discuss this further. let's keep this as it is for the scope of this pr.

@daejunpark
Copy link
Collaborator

daejunpark commented Oct 3, 2023

in the meantime, today, i will write more tests for semantics correctness, and include them in this pr. that might also help for fixing the merge conflicts with the current main.

i added new comprehensive tests for the returndata and calldata manipulation, and reverting behaviors: 2fa17b4

@karmacoma-eth karmacoma-eth merged commit d5be133 into main Oct 10, 2023
109 checks passed
@karmacoma-eth karmacoma-eth deleted the exec-traces branch October 10, 2023 00:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants