Contributor

@kevaundray kevaundray commented Aug 17, 2025

What was wrong?

Not mergeable in its current state.

Related to #1209

How was it fixed?

This initial PR is made to start discussion on implementing this. Something like this would be useful for both:

  • zkEVMs that need to create merkle proofs based on state that has been read
  • Block-level access lists that need to keep track of state changes across the block

The idea being that an oracle/state object can be wrapped to record whenever state has been accessed/modified.
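
As a rough illustration of the wrapping idea above, here is a minimal sketch. `InMemoryState` and `RecordingState` are hypothetical names invented for this example; they are not the spec's `State` nor anything from this PR, and slots are modeled as plain `bytes`.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Slot = Tuple[bytes, bytes]  # (address, storage key), simplified to raw bytes


class InMemoryState:
    """Hypothetical stand-in for a state object; not the spec's `State`."""

    def __init__(self) -> None:
        self._storage: Dict[Slot, bytes] = {}

    def get_storage(self, address: bytes, key: bytes) -> bytes:
        # Unset slots read as 32 zero bytes.
        return self._storage.get((address, key), b"\x00" * 32)

    def set_storage(self, address: bytes, key: bytes, value: bytes) -> None:
        self._storage[(address, key)] = value


@dataclass
class RecordingState:
    """Wraps a state object and records every slot that is read or written."""

    inner: InMemoryState
    reads: Set[Slot] = field(default_factory=set)
    writes: Set[Slot] = field(default_factory=set)

    def get_storage(self, address: bytes, key: bytes) -> bytes:
        self.reads.add((address, key))
        return self.inner.get_storage(address, key)

    def set_storage(self, address: bytes, key: bytes, value: bytes) -> None:
        self.writes.add((address, key))
        self.inner.set_storage(address, key, value)
```

A zkEVM prover could use something like `reads` to decide which merkle proofs to build, and a block-level access list could be derived from `reads` and `writes` together.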

Notes

  • The original issue defined a quite minimal interface. This PR tries to reduce the diff by first introducing the idea of an Oracle and turning all of the previous state management functions into methods.

To get what the original issue described, I believe we would move most of the methods into something like an EVMOracle structure that depends on a MerkleOracle.

  • This PR does not modify the monkey patching that happens with the optimized impl for State

Comment on lines 185 to 186
def state_transition(chain: BlockChain, block: Block, oracle=None) -> None:
    """
Contributor Author

@kevaundray kevaundray Aug 17, 2025

Main TL;DR: the object used to fetch state is now an explicit parameter to the state transition function instead of being implicit.

Comment on lines 873 to 872
- sender_account = get_account(block_env.state, sender)
+ sender_account = block_env.oracle.get_account(sender)
Contributor Author

Most changes look like this:

Before:

func(state, params)

After:

oracle.func(params)

Comment on lines +9 to +11
# Use generic types for compatibility across forks
Account = Any
Address = Bytes20
Contributor Author

I initially thought that this interface would be fork-agnostic, but perhaps it makes sense to have one per fork?

This is required for SSTORE gas calculations per EIP-2200.
The implementation should use state snapshots/checkpoints to track
pre-transaction values.
TODO: The oracle does not have a `begin_transaction` method, so it kind of breaks here.
Contributor Author

Note: TODO

from ethereum_types.bytes import Bytes20, Bytes32
from ethereum_types.numeric import Uint

# TODO: This file is currently Osaka-specific -- we could move state.py into here to mitigate this.
Contributor Author

Note: TODO

@kevaundray
Contributor Author

kevaundray commented Aug 17, 2025

cc @SamWilsn @gurukamath since you were tagged in that issue -- would be good to know if this is directionally what you were thinking first, since there is likely some cleanup required

Comment on lines +44 to +47
def get_storage(
    self, address: Address, key: Bytes32  # noqa: U100
) -> Bytes32:  # noqa: U100
    """Get storage value at `key` for the given `address`."""
Contributor Author

`# noqa: U100` was needed to pass the tox lint check.

Comment on lines +98 to +112
def account_has_code_or_nonce(
    self, address: Address  # noqa: U100
) -> bool:  # noqa: U100
    """
    Check if account has non-zero code or nonce.

    Used during contract creation to check if address is available.
    """

def account_has_storage(self, address: Address) -> bool:  # noqa: U100
    """
    Check if account has any storage slots.

    Used during contract creation to check if address is available.
    """
Contributor Author

As noted in the PR description, methods like these can be moved into something like an "EVMOracle", since they can be implemented using get_account plus a few extra LOC to check if there is code/nonce/storage.

The user would then only implement MerkleOracle, which then gets wrapped in an EVMOracle so that the EVM can use the relevant methods
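
A minimal sketch of that layering, under stated assumptions: the `Account` dataclass here is a simplified stand-in (the spec's `Account` differs), `DictMerkleOracle` is a hypothetical implementation used only to exercise the wrapper, and only `account_has_code_or_nonce` is shown because this simplified account carries no storage information.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


@dataclass(frozen=True)
class Account:
    """Simplified account for illustration; the spec's `Account` differs."""

    nonce: int = 0
    balance: int = 0
    code: bytes = b""


class MerkleOracle(Protocol):
    """Minimal interface a user would implement (assumed shape)."""

    def get_account(self, address: bytes) -> Account:
        ...


class DictMerkleOracle:
    """Hypothetical in-memory implementation used only for this example."""

    def __init__(self, accounts: Dict[bytes, Account]) -> None:
        self._accounts = accounts

    def get_account(self, address: bytes) -> Account:
        # Missing accounts are returned as empty accounts.
        return self._accounts.get(address, Account())


class EVMOracle:
    """Wraps a `MerkleOracle` and derives EVM-level helpers from it."""

    def __init__(self, inner: MerkleOracle) -> None:
        self._inner = inner

    def get_account(self, address: bytes) -> Account:
        return self._inner.get_account(address)

    def account_has_code_or_nonce(self, address: bytes) -> bool:
        # Derived purely from `get_account`; no extra oracle method needed.
        account = self.get_account(address)
        return account.nonce != 0 or account.code != b""
```

With this split, the derived check lives in one place and a new backend only has to supply `get_account`.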

Contributor

This is the motivation behind the minimal state interface in #1209. Everything that isn't off the shelf that can be done in the specs should be done there, even if the performance is terrible.

@@ -19,7 +19,6 @@
from ethereum.utils.numeric import ceil32

Contributor Author

blockhash was not modified, but it should likely also be part of "state" now that EIP-2935 has been implemented.

Comment on lines 136 to 143
# TODO: Hack imo
if self.fork.is_after_fork("ethereum.osaka"):
    from ethereum.state_oracle import MemoryMerkleOracle

    oracle = MemoryMerkleOracle(self.alloc.state)
    state_or_oracle = {"oracle": oracle}
else:
    state_or_oracle = {"state": self.alloc.state}
Contributor Author

Since the change currently only applies to Osaka

Contributor Author

There are quite a few places in the ethereum_spec_tools directory that rely on accessing the state parameter explicitly, so I'll wait to see if this is directionally correct before modifying those

@SamWilsn
Contributor

Our primary concern is readability, and I fear feeding the state oracle through everything will hurt that. Would an approach like our trace implementation work?

Otherwise I think this looks pretty good!

@kevaundray
Contributor Author

> Our primary concern is readability, and I fear feeding the state oracle through everything will hurt that. Would an approach like our trace implementation work?
>
> Otherwise I think this looks pretty good!

I have no strong opinion on how it's implemented :) -- when you say feeding the state oracle through everything, do you mean places outside of BlockEnv? In most places we are just replacing block_env.state with block_env.oracle.

@kevaundray
Contributor Author

@SamWilsn I have added a commit with an approach that is closer to evm_trace, using a global oracle -- was this what you had in mind?

Comment on lines +149 to 150
# TODO: Remove this, since it's not being used.
"state": self.alloc.state,
Contributor Author

TODO

Contributor Author

Need to remove state from BlockEnvironment to finish this

return old


def get_state_oracle() -> MerkleOracle:
Contributor

You're still returning an object here. We want this to be as flat as possible. For each function on the MerkleOracle, create a function in this file exposing it.
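
One way to read this request, sketched under assumptions: `DictOracle` is a hypothetical backend, and only `get_storage` is flattened here. The `_state_oracle` global, `set_state_oracle`, and the error message follow the names visible in this PR's diff, but the bodies are illustrative.

```python
from typing import Dict, Optional, Tuple


class DictOracle:
    """Hypothetical oracle backed by a dict; stands in for a real backend."""

    def __init__(self, storage: Dict[Tuple[bytes, bytes], bytes]) -> None:
        self._storage = storage

    def get_storage(self, address: bytes, key: bytes) -> bytes:
        return self._storage.get((address, key), b"\x00" * 32)


_state_oracle: Optional[DictOracle] = None


def set_state_oracle(oracle: Optional[DictOracle]) -> Optional[DictOracle]:
    """Install `oracle` globally and return the previously installed one."""
    global _state_oracle
    old = _state_oracle
    _state_oracle = oracle
    return old


def get_storage(address: bytes, key: bytes) -> bytes:
    """Flat, module-level counterpart of the oracle's `get_storage`."""
    if _state_oracle is None:
        raise RuntimeError(
            "No global state oracle set. Call set_state_oracle() first."
        )
    return _state_oracle.get_storage(address, key)
```

Callers would then write `get_storage(address, key)` directly and never hold an oracle object themselves.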

Comment on lines +35 to +38
if _state_oracle is None:
    raise RuntimeError(
        "No global state oracle set. Call set_state_oracle() first."
    )
Contributor

There should always be a default. That way the reader can easily trace the implementation of the unmodified spec. The average reader isn't going to be going through the whole codebase (instead just reading a function or implementing an EIP), so locality is very important.
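
A minimal sketch of what "always a default" could look like, assuming the default is an in-memory implementation constructed over empty state at import time. `InMemoryOracle` is a hypothetical name, and how a real default would obtain its initial state is left open here.

```python
from typing import Dict, Optional, Tuple


class InMemoryOracle:
    """Hypothetical default backend; a reader can trace it in this module."""

    def __init__(
        self, storage: Optional[Dict[Tuple[bytes, bytes], bytes]] = None
    ) -> None:
        self._storage = dict(storage or {})

    def get_storage(self, address: bytes, key: bytes) -> bytes:
        return self._storage.get((address, key), b"\x00" * 32)


# The default is installed at import time, so the flat functions below
# never have to handle a missing oracle.
_state_oracle: InMemoryOracle = InMemoryOracle()


def set_state_oracle(oracle: InMemoryOracle) -> InMemoryOracle:
    """Swap in a different oracle and return the one it replaced."""
    global _state_oracle
    old = _state_oracle
    _state_oracle = oracle
    return old


def get_storage(address: bytes, key: bytes) -> bytes:
    return _state_oracle.get_storage(address, key)
```

Because `set_state_oracle` returns the previous oracle, a recording or proving backend can be installed temporarily and the default restored afterwards.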

Contributor Author

I think this implies that the state_oracle folder will be copied between forks, unless we have a fork-agnostic default.

Contributor Author

@kevaundray kevaundray Aug 20, 2025

I also think that we would need some state to create the default oracle with

Address = Bytes20


class MerkleOracle(Protocol):
Contributor

Hm, so we have a vague rule that if something changes between forks, it must be placed into the fork's folder. For example, keccak256 is used in every fork and isn't likely to disappear, so it lives in ethereum.crypto, whereas Address might get bumped to 32 bytes in the future, so it's defined per fork.

If the MPT is going to get replaced (or is fairly likely), it might be best to define it per fork.

Contributor Author

related to #1371 (comment)

Comment on lines +21 to +22
self, address: Address # noqa: U100
) -> Account: # noqa: U100
Contributor

I think if you raise NotImplementedError it mutes the linter, but I'm not 100% sure on that.
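
For reference, the suggested shape might look like the sketch below. `Address = bytes` is a simplification of the `Bytes20` alias in the diff, and whether this body actually silences U100 depends on the linter configuration, which has not been verified here.

```python
from typing import Any, Protocol

Account = Any  # mirrors the generic alias used in the PR
Address = bytes  # simplification; the PR uses Bytes20


class MerkleOracle(Protocol):
    def get_account(self, address: Address) -> Account:
        """Get the account at `address`."""
        # An explicit stub body instead of a docstring-only method.
        raise NotImplementedError
```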

Address = Bytes20


class MerkleOracle(Protocol):
Contributor

"Oracle" is a bit of a foreign term for non-cryptography folks. The base protocol could just be State.

That said, we've managed to keep protocols and most OOP out of the spec so far, so we might want to rethink how we organize this.

@kevaundray
Contributor Author

Discussed with @SamWilsn offline and this may not be strictly better than what is already on main wrt the design constraints for the execution specs. Putting this back in draft.

The main change I'm looking to add is state tracking, which this should have made easier. If not, we can close it and make a separate PR for state tracking.

@kevaundray kevaundray marked this pull request as draft August 21, 2025 11:26