From 795a2ba2777fb4b8eb26c4270a50eea8a7334485 Mon Sep 17 00:00:00 2001 From: Charles Cooper Date: Fri, 15 Jul 2022 13:11:23 -0400 Subject: [PATCH] EIP-5202: Standard Factory Contract Format (#5202) * draft factory EIP * add discussions-to * Update EIPS/draft-factory-standard.md Co-authored-by: El De-dog-lo <3859395+fubuloubu@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Micah Zoltu * rename eip 5202, change a relative link * change discussions-to to ethereum magicians * expand description * Update EIPS/eip-5202.md Co-authored-by: Pandapip1 <45835846+Pandapip1@users.noreply.github.com> * add note: no contracts on mainnet starting with magic * minor wording change * add ed as eip coauthor Co-authored-by: El De-dog-lo <3859395+fubuloubu@users.noreply.github.com> Co-authored-by: Micah Zoltu Co-authored-by: Pandapip1 <45835846+Pandapip1@users.noreply.github.com> --- EIPS/eip-5202.md | 105 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 105 insertions(+) create mode 100644 EIPS/eip-5202.md diff --git a/EIPS/eip-5202.md b/EIPS/eip-5202.md new file mode 100644 index 0000000000..a22b01a23a --- /dev/null +++ b/EIPS/eip-5202.md @@ -0,0 +1,105 @@ +--- +eip: 5202 +title: Factory contract format +description: Define a bytecode container format for indexing and utilizing factory contracts +author: Charles Cooper (@charles-cooper), Edward Amor (@skellet0r) +discussions-to: https://ethereum-magicians.org/t/erc-5202-standard-factory-contract-format/9851 +status: Draft +type: Standards Track +category: ERC +created: 2022-06-23 +requires: 170 +--- + +## Abstract +Define a standard for "factory" contracts, or contracts which represent initcode that is stored on-chain. + +## Motivation +To decrease deployer contract size, a useful pattern is to store initcode on chain as a "factory" contract, and then use `EXTCODECOPY` to copy the initcode into memory, followed by a call to `CREATE` or `CREATE2`. However, this comes with the following problems: + +- It is hard for external tools and indexers to detect if a contract is a "regular" runtime contract or a "factory" contract. Heuristically searching for patterns in bytecode to determine if it is initcode poses maintenance and correctness problems. +- Storing initcode byte-for-byte on-chain is a correctness and security problem. Since the EVM does not have a native way to distinguish between executable code and other types of code, unless the initcode explicitly implements ACL rules, *anybody* can call such a "factory" contract and execute the initcode directly as ordinary runtime code. This is particularly problematic if the initcode stored by the factory contract has side effects such as writing to storage or calling external contracts. If the initcode stored by the factory contract executes a `SELFDESTRUCT` opcode, the factory contract could even be removed, preventing the correct operation of downstream deployer contracts that rely on the factory existing. For this reason, it would be good to prefix factory contracts with a special preamble to prevent execution. + +## Specification +The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119. + +A factory contract MUST use the preamble `0xFE71`. 6 bits are allocated to the version, and 2 bits to the length encoding. The first version begins at 0 (`0b000000`), and versions increment by 1. The value `0b11` for `` is reserved. In the case that the length bits are `0b11`, the third byte is considered a continuation byte (that is, the version requires multiple bytes to encode). The exact encoding of a multi-byte version is left to a future ERC. + +A factory contract MUST contain at least one byte of initcode. + +A factory contract MAY insert any bytes (data or code) between the version byte(s) and the initcode. If such variable length data is used, the preamble must be `0xFE71`. The `` represent a number between 0 and 2 (inclusive) describing how many bytes `` takes, and `` is the big-endian encoding of the number of bytes that `` takes. + +## Rationale +- To save gas and storage space, the preamble should be as minimal as possible. + +- It is considered "bad" behavior to try to CALL a factory contract directly, therefore the preamble starts with `INVALID (0xfe)` to end execution with an exceptional halting condition (rather than a "gentler" opcode like `STOP (0x00)`). + +- To help distinguish a factory contract from other contracts that may start with `0xFE`, a "magic" byte is used. The value `0x71` was arbitrarily chosen by taking the last byte of the keccak256 hash of the bytestring "factory" (i.e.: `keccak256(b"factory")[-1]`). + +- An empty initcode is disallowed by the spec to prevent what might be a common mistake. + +- Users may want to include arbitrary data or code in their preamble. To allow indexers to ignore these bytes, a variable length encoding is proposed. To allow the length to be only zero or one bytes (in the presumably common case that `len(data bytes)` is smaller than 256), two bits of the third byte are reserved to specify how many bytes the encoded length takes. + +- In case we need an upgrade path, version bits are included. While we do not expect to exhaust the version bits, in case we do, a continuation sequence is reserved. Since only two bytes are required for `` (as [EIP-170](./eip-170.md) restricts contract length to 24KB), a `` value of 3 would never be required to describe ``. For that reason, the special `` value of `0b11` is reserved as a continuation sequence marker. + +- The length of the initcode itself is not included by default in the preamble because it takes space, and it can be trivially determined using `EXTCODESIZE`. + +- The EOF ([EIP-3540](./eip-3540.md)) could provide another way of specifying factory contracts, by adding another section kind (3 - initcode). However, it is not yet in the EVM, and we would like to be able to standardize factory contracts today, without relying on EVM changes. If, at some future point, section kind 3 becomes part of the EOF spec, and the EOF becomes part of the EVM, this ERC will be considered to be obsolesced since the EOF validation spec provides much stronger guarantees than this ERC. + + +## Backwards Compatibility +Needs discussion + +## Reference Implementation + +```python +from typing import Optional, Tuple + +def parse_factory_preamble(bytecode: bytes) -> Tuple[int, Optional[bytes], bytes]: + """ + Given bytecode as a sequence of bytes, parse the factory preamble and + deconstruct the bytecode into: + the ERC version, preamble data and initcode. + Raises an exception if the bytecode is not a valid factory contract + according to this ERC. + arguments: + bytecode: a `bytes` object representing the bytecode + returns: + (version, + None if is 0, otherwise the bytes of the data section, + the bytes of the initcode, + ) + """ + if bytecode[:2] != b"\xFE\x71": + raise Exception("Not a factory!") + + erc_version = (bytecode[2] & 0b11111100) >> 2 + + n_length_bytes = bytecode[2] & 0b11 + if n_length_bytes == 0b11: + raise Exception("Reserved bits are set") + + data_length = int.from_bytes(bytecode[3:3 + n_length_bytes], byteorder="big") + + if n_length_bytes == 0: + preamble_data = None + else: + data_start = 3 + n_length_bytes + preamble_data = bytecode[data_start:data_start + data_length] + + initcode = bytecode[3 + n_length_bytes + data_length:] + + if len(initcode) == 0: + raise Exception("Empty initcode!") + + return erc_version, preamble_data, initcode +``` + +## Security Considerations + +There could be contracts on-chain already which happen to start with the same prefix as proposed in this ERC. However, this is not considered a serious risk, because the way it is envisioned that indexers will use this is to verify source code by compiling it and prepending the preamble. + +As of 2022-07-08, no contracts deployed on the Ethereum mainnet have a bytecode starting with `0xFE71`. + +## Copyright +Copyright and related rights waived via [CC0](../LICENSE.md).