Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature/#244] Parse symbols from build to print readable traces #309

Merged
merged 13 commits into from
Jul 17, 2024

Conversation

pillip
Copy link
Contributor

@pillip pillip commented Jun 11, 2024

Change Type

  • Bug fix (Not Breaking-change. Solve issue.)
  • New feature (Not Breaking-change. Add new feature.)
  • Breaking change (Bug fix or New featrure that cause Breaking change.)
  • Code Convention (Edit Code Convention.)
  • Refactoring (Refactor Code without feature change.)

Description

This PR introduces several significant updates and new functionalities that parse and covert raw hex code to human readable string. The changes primarily focus on enhancing the Mapper class with methods for AST parsing and adding the DeployAddressMapper class for managing deployed contract address mappings.

Related Issues

This PR resolves #244

Changes

  1. Mapper class

    • Implemented using the Singleton pattern to manage contract and node mapping information from AST
    • Use AstNode and ContractMappingInfo dataclasses
    • Methods
      • parse_ast: Recursively parses AST nodes.
      • add_contract_mapping_info: Adds a new contract mapping
      • get_contract_mapping_info_by_name: Retrieves contract mapping by contract name
      • get_contract_mapping_info_by_bytecode: Retrieves contract mapping by bytecode suffix
      • append_node: Appends a new AST node to contract mapping info
      • find_nodes_by_address: Converts hex addresses to human-readable strings including contract name, event name, function name, and variable name
  2. DeployAddressMapper class

    • Implemented using the Singleton pattern
    • Added to manage deployed contract address mappings with predefined mappings such as HEVM and SVM.
  3. Modify functions when used for printing traces

    • render_trace: Added logic to look up deployed contract names.
    • rendered_calldata, hexify, find_nodes_by_address: Updated to support contract name parameters for searching nodes in specific contracts.

Tests

> cd halmos/examples/simple
> halmos -vvvv

Screenshot

image

pillip added 5 commits June 8, 2024 10:39
* Define AstNode and CotractMappingInfo dataclassses for storing
  contract's AST data
* Define SingletonMeta metaclass to enforce singleton pattern for Mapper
  class
* Implement Mapper class using SingletonMeta for managing contract
  mapping info from AST
* Add methods to mapper:
    * add_contract_mapping_info: Add a new contract mapping
    * get_contract_mapping_info_by_name: Get contract mapping by
      contract_name
    * get_contract_mapping_info_by_bytecode : Get contract mapping by
      bytecode
    * append_node : Append a new AST node to contract mapping info
* Add ignored node types to _PARSING_IGNORED_NODE_TYPES
* Implement methods for AST parsing:
    * parse_ast: Recursively parse AST nodes
    * _get_nodes_info: Helper method to extract id, name, address and
      visibility of node
    * _get_node_address: Helper method to extract address based on node
      type
    * _get_current_contract: Helper method to determine the name of
      current contract being parsed
* Add testcases of parse_ast method
* Add parsing logic in parse_build_out function at __main__.py
* Update _get_node_info method in Mapper classe to prepend 0x to the
  node address for facilitating trace comparisons
* Implement find_nodes_by_address method in Mapper class to convert hex
  address to human readable string
    * Returns formatted string(contract name, event name, function name
      and variable name)
    * Handles tricky case where address is 0x
* Use find_nodes_by_address method in hexify function
* Add DeployAddressMapper class to manage deployed contract address
  mappings with predefined mappings such as HEVM and SVM
* Update get_contract_mapping_info_by_bytecode method to finding by
  bytecode suffix
* Add deployed contract name lookup logic in render_trace function
* Update rendered_calldata, hexify function and find_nodes_by_address
  method to supprot contract name parameter for searching node in
  specific contract
nodes: List[AstNode]


class SingletonMeta(type):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a metaclass for implementing the Singleton pattern

@@ -313,23 +316,25 @@ def decode_hex(hexstring: str) -> Optional[bytes]:
return None


def hexify(x):
def hexify(x, contract_name: str = None):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By setting the default value of contract_name to None, existing usage of this function can use without modification

src/halmos/mapper.py Outdated Show resolved Hide resolved
@karmacoma-eth
Copy link
Collaborator

this looks great to me, thanks @pillip!

Copy link
Collaborator

@daejunpark daejunpark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

amazing, thanks for the contribution! i left a couple of comments.

src/halmos/__main__.py Outdated Show resolved Hide resolved
tests/test_mapper.py Outdated Show resolved Hide resolved
tests/test_mapper.py Outdated Show resolved Hide resolved
return cls._instances[cls]


class Mapper(metaclass=SingletonMeta):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although i prefer avoiding non-constant global objects, i understand that it's too combersome to refer to this inside hexify() without a global reference. also, this should work with multiple tests, even with --test-parallel, since this Mapper class is fully initialized during the build-out parsing phase, and never updated during test execution later.

perhaps we can add a comment to clarify this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense, it is logically a global object and it is immutable-like (after the parsing phase). I'm not too worried about enforcing the immutability here, a comment should be fine

# }

for contract_mapping_info in self._contracts.values():
if contract_mapping_info.bytecode.endswith(bytecode):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you please elaborate on why endswith is used here? is it because contract_mapping_info.bytecode is a contract creation (or init) bytecode, and bytecode is a runtime bytecode? if so, i think this may not work for a constructor with non-empty arguments, where the init bytecode is structured as: <contract-creation-preamble> + <runtime-bytecode> + <constructor-arguments>. what do you think? cc: @karmacoma-eth

node_type: str
id: int
name: str
address: str
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming: precisely speaking, this is not an address, but a selector (4-byte signature) for functions, events, or errors.

clarifying this or renaming it would be great.

return result.strip() if result != "" and address != "0x" else address


class DeployAddressMapper(metaclass=SingletonMeta):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm concerned about DeployAddressMapper being a singleton object. unlike Mapper which is per project, DeployAddressMapper is per test. sharing a singleton across multiple tests don't seem like a good practice to me. also, i don't think this would would work with --test-parallel, due to potential race conditions. what do you think? cc: @karmacoma-eth

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, I don't think DeployAddressMapper should be a singleton (or it should be scoped to a single test context).

It seems to currently work with --test-parallel because it is not updated during the test run, it is only updated lazily in render_trace (so from the main thread). This likely means that values from a test can "leak" to another thread

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

little repro:

contract A {}

contract Test82 is Test, SymTest {
    function setUp() public {
        console2.log("Test82.setUp");
    }

    function test_DeployMapperLeak1_double_deploy() external {
        A a = new A();
        assert(address(a) != address(b));
    }

    function test_DeployMapperLeak2_deploys_nothing() external {
        vm.deal(address(this), 1 ether);
        payable(address(0xaaaa0002)).transfer(1 ether);
    }
}

if deploys_nothing runs first, it has the expected trace (0xaaaa0002 is a naked address, not an A)

Trace:
    CALL Test82::test_DeployMapperLeak2_deploys_nothing
        CALL HEVM_ADDRESS::0xc88a5e6d00000000000000000000000000000000000000000000000000000000aaaa00010000000000000000000000000000000000000000000000000de0b6b3a7640000
        ↩ 0x
        CALL 0xaaaa0002::0x (value: 1000000000000000000)
        ↩ f_ret_256(call_exit_code_02())
    ↩ REVERT f_ret_256(call_exit_code_02()) (error: Revert())

If double_deploy runs first, the DeployAddressMapper is now polluted and the trace for deploys_nothing wrongly indicates that eth was transferred to an A instance:

Trace:
    CALL Test82::test_DeployMapperLeak2_deploys_nothing
        CALL HEVM_ADDRESS::0xc88a5e6d00000000000000000000000000000000000000000000000000000000aaaa00010000000000000000000000000000000000000000000000000de0b6b3a7640000
        ↩ 0x
        CALL A::0x (value: 1000000000000000000)    # wrong label here
        ↩ f_ret_256(call_exit_code_02())
    ↩ STOP 0x

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for further clarification, karma!

if it is nontrivial to fix this, i'm ok with adding a todo comment, and opening a separate pr to fix it later.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indeed, let's merge as-is and fix up later. I wanted to try a quick fix by introducing scopes for the singletons, but it's trickier than I thought

Copy link
Collaborator

@daejunpark daejunpark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left todo notes for further improvements

@daejunpark daejunpark merged commit 7ba5a74 into a16z:main Jul 17, 2024
52 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parse symbols from build to print readable traces
3 participants