Ethereum EVM Source Code Analysis

The Ethereum Virtual Machine (EVM) is a crucial component of the Ethereum blockchain, responsible for executing smart contract code. This article provides an in-depth analysis of the EVM's source code, focusing on its structure, functionality, and key components.

Introduction

Smart contracts are central to Ethereum, so understanding the EVM module is essential when studying the platform. The EVM module implements the virtual machine that executes smart contracts, handling both contract creation and invocation. This article explores the EVM's implementation through its source code.

It's important to note that fully understanding Ethereum smart contract implementation requires more than just the EVM module. The Solidity compiler project is also critical, as the EVM merely interprets and executes compiled contract instructions. However, this article focuses solely on the EVM's role.

EVM Implementation Structure

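The EVM's core object is the EVM struct, which represents one Ethereum virtual machine instance. A new EVM object is created for each transaction that is processed. The EVM relies on three main components: the interpreter (EVMInterpreter), the virtual machine configuration (vm.Config), and the state database (StateDB).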

The interpreter executes contract instructions sequentially, while the JumpTable in vm.Config contains information about each operation. Four different instruction sets exist for different Ethereum versions: constantinopleInstructionSet, byzantiumInstructionSet, homesteadInstructionSet, and frontierInstructionSet.

Ethereum Virtual Machine: EVM

Creating the EVM

For each transaction processed, the system creates a new EVM instance in the ApplyTransaction function:

func ApplyTransaction(config *params.ChainConfig, bc ChainContext, author *common.Address, gp *GasPool, statedb *state.StateDB, header *types.Header, tx *types.Transaction, usedGas *uint64, cfg vm.Config) (*types.Receipt, uint64, error) {
    msg, err := tx.AsMessage(types.MakeSigner(config, header.Number))
    if err != nil {
        return nil, 0, err
    }
    context := NewEVMContext(msg, header, bc, author)
    vmenv := vm.NewEVM(context, statedb, config, cfg)
    _, gas, failed, err := ApplyMessage(vmenv, msg, gp)
    if err != nil {
        return nil, 0, err
    }
    // ... rest of the function
}

The NewEVM function creates the virtual machine instance:

func NewEVM(ctx Context, statedb StateDB, chainConfig *params.ChainConfig, vmConfig Config) *EVM {
    evm := &EVM{
        Context:      ctx,
        StateDB:      statedb,
        vmConfig:     vmConfig,
        chainConfig:  chainConfig,
        chainRules:   chainConfig.Rules(ctx.BlockNumber),
        interpreters: make([]Interpreter, 0, 1),
    }
    evm.interpreters = append(evm.interpreters, NewEVMInterpreter(evm, vmConfig))
    evm.interpreter = evm.interpreters[0]
    return evm
}

Creating Contracts

When a transaction's recipient is empty, it indicates contract creation. The EVM.Create method handles this:

func (evm *EVM) Create(caller ContractRef, code []byte, gas uint64, value *big.Int) (ret []byte, contractAddr common.Address, leftOverGas uint64, err error) {
    contractAddr = crypto.CreateAddress(caller.Address(), evm.StateDB.GetNonce(caller.Address()))
    return evm.create(caller, &codeAndHash{code: code}, gas, value, contractAddr)
}

The actual creation happens in the EVM.create method, which performs several checks before executing the contract code through the run function.
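The checks that precede execution can be illustrated with a minimal sketch. It assumes the guards are a call-depth limit, a balance check, a nonce bump, and an address-collision check, which is how go-ethereum of this era is generally understood to behave. The toyState type, the string addresses, and the preCreateChecks name are hypothetical and exist only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// maxCallDepth is the call/create depth limit (1024 in go-ethereum's params).
const maxCallDepth = 1024

var (
	errDepth               = errors.New("max call depth exceeded")
	errInsufficientBalance = errors.New("insufficient balance for transfer")
	errAddressCollision    = errors.New("contract address collision")
)

// toyState is a hypothetical, map-backed stand-in for StateDB.
type toyState struct {
	balance map[string]uint64
	nonce   map[string]uint64
	code    map[string][]byte
}

// preCreateChecks runs the guards that come before the init code executes.
func preCreateChecks(s *toyState, depth int, caller, addr string, value uint64) error {
	if depth > maxCallDepth {
		return errDepth
	}
	if s.balance[caller] < value {
		return errInsufficientBalance
	}
	// The caller's nonce is bumped before the collision check.
	s.nonce[caller]++
	if s.nonce[addr] != 0 || len(s.code[addr]) != 0 {
		return errAddressCollision
	}
	return nil
}

func main() {
	s := &toyState{
		balance: map[string]uint64{"alice": 10},
		nonce:   map[string]uint64{},
		code:    map[string][]byte{"taken": {0x00}},
	}
	fmt.Println(preCreateChecks(s, 0, "alice", "fresh", 5))
	fmt.Println(preCreateChecks(s, 0, "alice", "taken", 5))
	fmt.Println(preCreateChecks(s, 0, "alice", "fresh", 50))
}
```

Only after guards like these pass would the real implementation transfer the value, run the init code, and store the returned bytes as the contract's code.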

Calling Contracts

The EVM provides four methods for contract calls: Call, CallCode, DelegateCall, and StaticCall.

The EVM.Call method is the most fundamental:

func (evm *EVM) Call(caller ContractRef, addr common.Address, input []byte, gas uint64, value *big.Int) (ret []byte, leftOverGas uint64, err error) {
    // Various checks and setup
    ret, err = run(evm, contract, input, false)
    // Error handling and return
}
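The call methods differ mainly in the execution context the callee sees. The sketch below models that difference for Call, CallCode, DelegateCall, and StaticCall. The frame struct, the callKind constants, and childFrame are hypothetical names, not go-ethereum types.

```go
package main

import "fmt"

type callKind int

const (
	kindCall callKind = iota
	kindCallCode
	kindDelegateCall
	kindStaticCall
)

// frame describes the execution context a contract sees.
type frame struct {
	Caller   string // what the CALLER opcode reports
	Self     string // whose storage is read and written
	CodeFrom string // whose code is executed
	Value    uint64 // what the CALLVALUE opcode reports
	ReadOnly bool   // true once state changes are forbidden
}

// childFrame derives the callee's context from the parent's.
func childFrame(parent frame, kind callKind, target string, value uint64) frame {
	switch kind {
	case kindCallCode:
		// Run the target's code against the parent's own storage.
		return frame{Caller: parent.Self, Self: parent.Self, CodeFrom: target, Value: value, ReadOnly: parent.ReadOnly}
	case kindDelegateCall:
		// Like CallCode, but the original caller and value are preserved.
		return frame{Caller: parent.Caller, Self: parent.Self, CodeFrom: target, Value: parent.Value, ReadOnly: parent.ReadOnly}
	case kindStaticCall:
		// Like Call, but with no value transfer and no state changes.
		return frame{Caller: parent.Self, Self: target, CodeFrom: target, Value: 0, ReadOnly: true}
	default:
		return frame{Caller: parent.Self, Self: target, CodeFrom: target, Value: value, ReadOnly: parent.ReadOnly}
	}
}

func main() {
	parent := frame{Caller: "eoa", Self: "A", CodeFrom: "A", Value: 7}
	fmt.Printf("%+v\n", childFrame(parent, kindCall, "B", 3))
	fmt.Printf("%+v\n", childFrame(parent, kindDelegateCall, "B", 0))
}
```

In this model DelegateCall is the only variant that keeps the parent's caller and value, which is why it suits library-style code that should act on behalf of the calling contract.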

Interpreter Object: EVMInterpreter

The EVMInterpreter drives contract execution: it steps through the bytecode and looks up each opcode, but it does not implement the instructions itself. Instead, it delegates the actual work to the operation objects in the JumpTable.
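A toy version of this dispatch pattern may make the division of labor clearer. The sketch supports only three opcodes (STOP, ADD, PUSH1) and uses uint64 stack items instead of 256-bit words. The toyOp struct and the runToy function are hypothetical simplifications, not the real interpreter.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	opStop  = 0x00
	opAdd   = 0x01
	opPush1 = 0x60
)

var (
	errOutOfGas       = errors.New("out of gas")
	errStackUnderflow = errors.New("stack underflow")
)

// toyOp bundles what the loop needs to know about one opcode.
type toyOp struct {
	execute  func(pc *int, code []byte, stack *[]uint64)
	gas      uint64
	minStack int
	halts    bool
}

// newToyJumpTable maps opcode bytes to their operations; nil means invalid.
func newToyJumpTable() [256]*toyOp {
	var jt [256]*toyOp
	jt[opStop] = &toyOp{halts: true, execute: func(*int, []byte, *[]uint64) {}}
	jt[opAdd] = &toyOp{gas: 3, minStack: 2, execute: func(_ *int, _ []byte, st *[]uint64) {
		s := *st
		s[len(s)-2] += s[len(s)-1]
		*st = s[:len(s)-1]
	}}
	jt[opPush1] = &toyOp{gas: 3, execute: func(pc *int, code []byte, st *[]uint64) {
		*pc++ // consume the immediate byte
		var v uint64
		if *pc < len(code) {
			v = uint64(code[*pc])
		}
		*st = append(*st, v)
	}}
	return jt
}

// runToy fetches, validates, charges for, and executes one opcode at a time.
func runToy(code []byte, gas uint64) ([]uint64, uint64, error) {
	jt := newToyJumpTable()
	var stack []uint64
	for pc := 0; pc < len(code); pc++ {
		op := jt[code[pc]]
		switch {
		case op == nil:
			return nil, gas, fmt.Errorf("invalid opcode 0x%02x", code[pc])
		case len(stack) < op.minStack:
			return nil, gas, errStackUnderflow
		case gas < op.gas:
			return nil, gas, errOutOfGas
		}
		gas -= op.gas
		op.execute(&pc, code, &stack)
		if op.halts {
			break
		}
	}
	return stack, gas, nil
}

func main() {
	// PUSH1 2, PUSH1 3, ADD, STOP
	stack, gasLeft, err := runToy([]byte{opPush1, 2, opPush1, 3, opAdd, opStop}, 10)
	fmt.Println(stack, gasLeft, err)
}
```

The loop never needs to know what ADD does. Supporting a new instruction means adding a table entry, not changing the loop.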

Precompiled Contracts

Ethereum includes several precompiled contracts at specific addresses:

var PrecompiledContractsByzantium = map[common.Address]PrecompiledContract{
    common.BytesToAddress([]byte{1}): &ecrecover{},
    common.BytesToAddress([]byte{2}): &sha256hash{},
    common.BytesToAddress([]byte{3}): &ripemd160hash{},
    common.BytesToAddress([]byte{4}): &dataCopy{},
    common.BytesToAddress([]byte{5}): &bigModExp{},
    common.BytesToAddress([]byte{6}): &bn256Add{},
    common.BytesToAddress([]byte{7}): &bn256ScalarMul{},
    common.BytesToAddress([]byte{8}): &bn256Pairing{},
}

These contracts implement specific functionality directly in Go rather than through EVM bytecode.
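As a rough illustration, a precompile can be modeled as a native object that reports its gas requirement and then runs. The sketch assumes a two-method interface (RequiredGas and Run) and the commonly cited gas schedules for SHA-256 (60 plus 12 per 32-byte word) and the identity copy (15 plus 3 per word). The one-byte address keys and the runPrecompile helper are simplifications for this example.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

var (
	errOutOfGas     = errors.New("out of gas")
	errNoPrecompile = errors.New("no precompile at address")
)

// precompile is the assumed shape of a natively implemented contract.
type precompile interface {
	RequiredGas(input []byte) uint64
	Run(input []byte) ([]byte, error)
}

// wordCount rounds the input length up to whole 32-byte words.
func wordCount(input []byte) uint64 { return uint64(len(input)+31) / 32 }

type sha256hash struct{}

func (sha256hash) RequiredGas(input []byte) uint64 { return 60 + 12*wordCount(input) }
func (sha256hash) Run(input []byte) ([]byte, error) {
	h := sha256.Sum256(input)
	return h[:], nil
}

type dataCopy struct{}

func (dataCopy) RequiredGas(input []byte) uint64 { return 15 + 3*wordCount(input) }
func (dataCopy) Run(input []byte) ([]byte, error) {
	return append([]byte(nil), input...), nil
}

// toyPrecompiles uses a single byte as the address for brevity.
var toyPrecompiles = map[byte]precompile{
	2: sha256hash{},
	4: dataCopy{},
}

// runPrecompile charges the required gas up front, then runs native code.
func runPrecompile(addr byte, input []byte, gas uint64) ([]byte, uint64, error) {
	p, ok := toyPrecompiles[addr]
	if !ok {
		return nil, gas, errNoPrecompile
	}
	cost := p.RequiredGas(input)
	if gas < cost {
		return nil, gas, errOutOfGas
	}
	out, err := p.Run(input)
	return out, gas - cost, err
}

func main() {
	out, gasLeft, err := runPrecompile(4, []byte("abc"), 100)
	fmt.Printf("%s %d %v\n", out, gasLeft, err)
}
```

Because the gas cost is a simple function of the input length, no bytecode has to be interpreted to price or execute the call.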

Gas Consumption

Gas is consumed for instruction execution, memory usage, and state storage. The interpreter calculates gas costs before executing each instruction:

func (in *EVMInterpreter) Run(contract *Contract, input []byte, readOnly bool) (ret []byte, err error) {
    // ...
    cost, err = operation.gasCost(in.gasTable, in.evm, contract, stack, mem, memorySize)
    if err != nil || !contract.UseGas(cost) {
        return nil, ErrOutOfGas
    }
    // ...
}

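Gas consumption therefore includes the cost of executing each instruction, the cost of expanding memory, and the cost of writing to state storage.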
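Two pieces of this accounting are small enough to sketch: deducting gas from a contract's balance, and pricing memory growth. The example assumes the widely documented memory formula of 3 gas per 32-byte word plus words squared divided by 512. The toyContract type and the helper names are hypothetical, and overflow handling for very large sizes is omitted.

```go
package main

import "fmt"

// toyContract tracks only the remaining gas of an executing contract.
type toyContract struct{ Gas uint64 }

// UseGas deducts gas and reports whether enough was available.
func (c *toyContract) UseGas(gas uint64) bool {
	if c.Gas < gas {
		return false
	}
	c.Gas -= gas
	return true
}

// memoryCost is the total fee for a memory of size bytes:
// a linear term (3 per word) plus a quadratic term (words^2 / 512).
func memoryCost(size uint64) uint64 {
	words := (size + 31) / 32
	return 3*words + words*words/512
}

// expansionFee is what growing memory from oldSize to newSize costs.
func expansionFee(oldSize, newSize uint64) uint64 {
	if newSize <= oldSize {
		return 0
	}
	return memoryCost(newSize) - memoryCost(oldSize)
}

func main() {
	c := &toyContract{Gas: 100}
	fee := expansionFee(32, 1024)
	fmt.Println(fee, c.UseGas(fee), c.Gas)
}
```

Only the difference between the new and old totals is charged, so memory that has already been paid for is never billed twice.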

Jump Table: vm.Config.JumpTable

The jump table contains operation definitions for all EVM instructions:

type operation struct {
    execute        executionFunc
    gasCost        gasFunc
    validateStack  stackValidationFunc
    memorySize     memorySizeFunc
    halts          bool
    jumps          bool
    writes         bool
    valid          bool
    reverts        bool
    returns        bool
}

Different Ethereum versions use different instruction sets, with constantinopleInstructionSet being the most comprehensive.
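One way to picture the relationship between the four sets is as layers, each fork extending the previous one, with the newest applicable set chosen from the chain rules. The sketch below lists only a few opcodes per fork. The toySet type, the forkRules struct, and selectSet are hypothetical, and the real tables hold operation structs rather than names.

```go
package main

import "fmt"

// toySet maps an opcode byte to its mnemonic.
type toySet map[byte]string

// extend copies base and adds the opcodes introduced by a fork.
func extend(base, extra toySet) toySet {
	out := make(toySet, len(base)+len(extra))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range extra {
		out[k] = v
	}
	return out
}

var (
	// A tiny excerpt of the original opcodes.
	frontier = toySet{0x00: "STOP", 0x01: "ADD", 0xf1: "CALL"}

	homestead = extend(frontier, toySet{0xf4: "DELEGATECALL"})

	byzantium = extend(homestead, toySet{
		0x3d: "RETURNDATASIZE", 0x3e: "RETURNDATACOPY",
		0xfa: "STATICCALL", 0xfd: "REVERT",
	})

	constantinople = extend(byzantium, toySet{
		0x1b: "SHL", 0x1c: "SHR", 0x1d: "SAR",
		0x3f: "EXTCODEHASH", 0xf5: "CREATE2",
	})
)

// forkRules says which forks are active at the current block.
type forkRules struct {
	IsHomestead, IsByzantium, IsConstantinople bool
}

// selectSet picks the newest instruction set the rules allow.
func selectSet(r forkRules) toySet {
	switch {
	case r.IsConstantinople:
		return constantinople
	case r.IsByzantium:
		return byzantium
	case r.IsHomestead:
		return homestead
	default:
		return frontier
	}
}

func main() {
	set := selectSet(forkRules{IsHomestead: true, IsByzantium: true})
	fmt.Println(set[0xfa], len(set))
}
```

An opcode that is missing from the selected set is treated as invalid, so a contract using REVERT would fail on a pre-Byzantium chain.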

Jump Instructions

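Jump instructions (JUMP and JUMPI) require that the instruction at the destination be JUMPDEST. Checking the byte value alone is not enough, because the same byte can also appear inside the data of a PUSH instruction. The EVM therefore validates the destination using a bit vector that distinguishes between code and data: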

func (c *Contract) validJumpdest(dest *big.Int) bool {
    udest := dest.Uint64()
    if dest.BitLen() >= 63 || udest >= uint64(len(c.Code)) {
        return false
    }
    if OpCode(c.Code[udest]) != JUMPDEST {
        return false
    }
    // Additional validation using bit vectors
}
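The code-versus-data analysis can be sketched as a single pass over the bytecode that marks the immediate bytes following each PUSH instruction. This version uses a []bool for readability, whereas a packed bit vector would be more compact. The function names are hypothetical.

```go
package main

import "fmt"

const (
	opJumpdest = 0x5b
	opPush1    = 0x60
	opPush32   = 0x7f
)

// dataBitmap returns a slice in which true marks a byte that is PUSH data.
func dataBitmap(code []byte) []bool {
	isData := make([]bool, len(code))
	for pc := 0; pc < len(code); {
		op := code[pc]
		pc++
		if op >= opPush1 && op <= opPush32 {
			// PUSHn is followed by n immediate bytes.
			n := int(op-opPush1) + 1
			for ; n > 0 && pc < len(code); n-- {
				isData[pc] = true
				pc++
			}
		}
	}
	return isData
}

// validJumpdest accepts dest only if it is a JUMPDEST byte in a code position.
func validJumpdest(code []byte, dest uint64) bool {
	if dest >= uint64(len(code)) || code[dest] != opJumpdest {
		return false
	}
	return !dataBitmap(code)[dest]
}

func main() {
	// PUSH1 0x5b, JUMPDEST, STOP: offset 1 is data, offset 2 is code.
	code := []byte{opPush1, 0x5b, opJumpdest, 0x00}
	fmt.Println(validJumpdest(code, 1), validJumpdest(code, 2))
}
```

A production implementation would compute the bitmap once per contract and reuse it, since the sketch above rescans the code on every call.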

Storage

The EVM provides three storage areas: the stack, memory, and permanent storage (StateDB).

Stack

The stack provides last-in-first-out storage for computation:

type Stack struct {
    data []*big.Int
}
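A minimal sketch of the operations built on such a slice follows. The toyStack type and its method names are assumptions for illustration, with dup and swap numbered the way the DUPn and SWAPn opcodes are. The EVM also caps the stack at 1024 items, a check the sketch omits along with underflow checks.

```go
package main

import (
	"fmt"
	"math/big"
)

// toyStack keeps the top of the stack at the end of the slice.
type toyStack struct {
	data []*big.Int
}

func (s *toyStack) push(v *big.Int) { s.data = append(s.data, v) }

// pushInt is a convenience wrapper for small literal values.
func (s *toyStack) pushInt(v int64) { s.push(big.NewInt(v)) }

// pop removes and returns the top item.
func (s *toyStack) pop() *big.Int {
	top := len(s.data) - 1
	v := s.data[top]
	s.data = s.data[:top]
	return v
}

// dup pushes a copy of the n-th item from the top (n = 1 is the top).
func (s *toyStack) dup(n int) {
	s.push(new(big.Int).Set(s.data[len(s.data)-n]))
}

// swap exchanges the top item with the item n positions below it.
func (s *toyStack) swap(n int) {
	top := len(s.data) - 1
	s.data[top], s.data[top-n] = s.data[top-n], s.data[top]
}

func main() {
	s := &toyStack{}
	s.pushInt(1)
	s.pushInt(2)
	s.pushInt(3)
	s.dup(3)  // stack: 1 2 3 1
	s.swap(1) // stack: 1 2 1 3
	fmt.Println(s.pop(), len(s.data))
}
```

dup copies the value rather than the pointer, so later in-place arithmetic on one entry cannot silently change the other.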

Memory

Memory provides linear storage for contract execution:

type Memory struct {
    store       []byte
    lastGasCost uint64
}
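The following sketch shows how a byte-slice memory of this kind might be grown and accessed. It assumes that callers resize memory, in whole 32-byte words, before writing to it. The toyMemory type and its methods are hypothetical, and the gas bookkeeping field is left out.

```go
package main

import "fmt"

// toyMemory is a zero-initialized, byte-addressable scratch area.
type toyMemory struct {
	store []byte
}

// wordAlign rounds size up to a multiple of 32 bytes.
func wordAlign(size uint64) uint64 { return (size + 31) / 32 * 32 }

// resize grows the memory to at least size bytes; it never shrinks.
func (m *toyMemory) resize(size uint64) {
	if cur := uint64(len(m.store)); cur < size {
		m.store = append(m.store, make([]byte, size-cur)...)
	}
}

// set copies value into memory at offset; the caller must resize first.
func (m *toyMemory) set(offset uint64, value []byte) {
	if len(value) == 0 {
		return
	}
	if offset+uint64(len(value)) > uint64(len(m.store)) {
		panic("memory must be resized before writing")
	}
	copy(m.store[offset:], value)
}

// get returns a copy of size bytes starting at offset, zero-padded.
func (m *toyMemory) get(offset, size uint64) []byte {
	out := make([]byte, size)
	if offset < uint64(len(m.store)) {
		copy(out, m.store[offset:])
	}
	return out
}

func main() {
	m := &toyMemory{}
	m.resize(wordAlign(5))
	m.set(2, []byte{0xaa, 0xbb})
	fmt.Println(len(m.store), m.get(1, 3))
}
```

Separating resize from set mirrors the gas model: the expansion is priced first, and the write happens only after the fee has been paid.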

Permanent Storage: StateDB

The StateDB provides permanent storage on the Ethereum blockchain, maintaining all account states and contract storage.
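Because a failed call must leave no trace in permanent storage, a state database needs a way to undo writes. The sketch below illustrates one such approach, a journal of undo actions with snapshot and revert. It is a toy model under that assumption, not the StateDB implementation, and toyState and its methods are hypothetical names.

```go
package main

import "fmt"

// toyState is a key-value store that records how to undo each write.
type toyState struct {
	storage map[string]uint64
	journal []func() // undo actions, oldest first
}

func newToyState() *toyState {
	return &toyState{storage: map[string]uint64{}}
}

// setState writes a value and journals the action that would undo it.
func (s *toyState) setState(key string, value uint64) {
	prev, existed := s.storage[key]
	s.journal = append(s.journal, func() {
		if existed {
			s.storage[key] = prev
		} else {
			delete(s.storage, key)
		}
	})
	s.storage[key] = value
}

// snapshot returns an identifier for the current state.
func (s *toyState) snapshot() int { return len(s.journal) }

// revertToSnapshot undoes, newest first, every write made after id.
func (s *toyState) revertToSnapshot(id int) {
	for i := len(s.journal) - 1; i >= id; i-- {
		s.journal[i]()
	}
	s.journal = s.journal[:id]
}

func main() {
	s := newToyState()
	s.setState("a", 1)
	id := s.snapshot()
	s.setState("a", 2)
	s.setState("b", 3)
	s.revertToSnapshot(id) // the inner call failed, so undo its writes
	fmt.Println(s.storage)
}
```

A caller would take a snapshot before running a contract and revert to it if execution returns an error.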

Additional Helper Objects

intPool

The intPool keeps a pool of reusable big.Int values so that the interpreter can avoid allocating a new integer for every operation:

type intPool struct {
    pool *Stack
}
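The reuse pattern can be sketched with two methods, one that hands out an integer and one that takes it back. The example assumes a cap on the pool size (256 here) and that callers overwrite a value obtained from the pool before using it. The toyIntPool type stores a plain slice rather than a Stack, purely to keep the example short.

```go
package main

import (
	"fmt"
	"math/big"
)

// poolLimit caps how many integers are retained for reuse.
const poolLimit = 256

type toyIntPool struct {
	pool []*big.Int
}

// get returns a pooled integer if one is available, else a new one.
// The returned value may hold stale contents and should be overwritten.
func (p *toyIntPool) get() *big.Int {
	if n := len(p.pool); n > 0 {
		v := p.pool[n-1]
		p.pool = p.pool[:n-1]
		return v
	}
	return new(big.Int)
}

// put returns integers to the pool, dropping them once the pool is full.
func (p *toyIntPool) put(is ...*big.Int) {
	for _, i := range is {
		if len(p.pool) >= poolLimit {
			return
		}
		p.pool = append(p.pool, i)
	}
}

func main() {
	p := &toyIntPool{}
	a := p.get().SetInt64(7)
	p.put(a)
	b := p.get().SetInt64(9) // reuses the allocation behind a
	fmt.Println(a == b, b)
}
```

After put, the caller must stop using the integer, because the next get may hand the same pointer to unrelated code.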

logger

The EVM includes logging functionality through JSONLogger and StructLogger for debugging and monitoring purposes.

Frequently Asked Questions

What is the EVM's primary function?

The Ethereum Virtual Machine executes smart contract code in a secure, isolated environment. It processes contract creation and invocation requests, ensuring proper gas accounting and state changes.

How does gas consumption work in the EVM?

Gas is consumed for computational steps, memory usage, and storage operations. Each instruction has an associated gas cost, and transactions must include sufficient gas to complete execution.

What are precompiled contracts?

Precompiled contracts are commonly used functions implemented directly in Go at fixed addresses. They provide optimized implementations for cryptographic operations and other frequently used functionality.

How does the EVM handle contract calls?

The EVM supports four call types: Call, CallCode, DelegateCall, and StaticCall.

What is the purpose of the jump table?

The jump table maps opcodes to their implementation details, including execution functions, gas costs, and stack validation. This allows for flexible updates to the instruction set across different Ethereum versions.

How does the EVM ensure jump destination validity?

The EVM uses bit vectors to distinguish between code and data in contract bytecode. This ensures jump destinations point to valid instructions rather than data bytes.

What storage options are available to contracts?

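Contracts can use the stack for intermediate computation, memory for temporary linear storage during execution, and the StateDB for permanent storage on the blockchain.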

Conclusion

The EVM module implements the virtual machine that executes Ethereum smart contracts. This article has explored its structure, including contract creation and invocation, instruction execution, gas accounting, and storage management.

Understanding the EVM is essential for Ethereum development, but it represents only part of the smart contract ecosystem. The Solidity compiler and other components also play crucial roles in the complete smart contract lifecycle.

For those interested in deeper exploration, studying the EVM source code provides valuable insights into blockchain virtual machine design and implementation.