Understanding Blockchain P2P Networks: Structure and Protocols

In the previous article, we introduced the core components of blockchain technology. Today, we dive into the first essential element: the peer-to-peer (P2P) network.

P2P technology is widely used across various domains, from streaming media and direct communication to file sharing and collaborative processing. Well-known P2P protocols and overlay networks include BitTorrent, ED2K, Gnutella, and Tor; the first two power familiar file-sharing tools such as BitTorrent clients and eDonkey.

Cryptocurrencies like Bitcoin and Ethereum implement their own unique P2P network protocols, which differ significantly from traditional P2P models. This article focuses on the P2P networks underpinning blockchain technology, specifically those of Bitcoin and Ethereum.

Given the complexity of blockchain P2P networks, we will concentrate on four key aspects:

Network connections and topology
Node discovery
Local area network (LAN) penetration
Node interaction protocols

By the end of this article, you will have a clear understanding of the topology and operational principles of mature blockchain P2P networks.

Network Connections and Topology

Network Connections

Most blockchain projects run on top of the TCP/IP protocol suite, which places their own protocols at the application layer alongside HTTP and SMTP. This clears up a common misconception: blockchain technology does not "overthrow the internet"; at most, it may eventually compete with specific application-layer protocols such as HTTP.

Where HTTP follows a client-server model, blockchain nodes connect to one another as peers in a decentralized, point-to-point topology, a shift central to Ethereum’s vision of Web3.0.

Bitcoin’s P2P network is complex, especially when considering mining pool interactions and lightweight nodes. Here, we focus solely on full nodes. Bitcoin uses TCP for communication, with the mainnet default port set to 8333.

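Ethereum’s P2P network differs in that its sessions are encrypted and it uses both UDP and TCP: UDP for node discovery and TCP for data exchange. The Go Ethereum client (geth) listens on port 30303 by default for both; port 30301 is the default of the standalone bootnode tool rather than of regular mainnet nodes.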

Topology Structure

P2P networks can have centralized, semi-centralized, or fully distributed topologies. Bitcoin’s full-node network is fully distributed and spreads data by flooding: a transaction originates at one node, which passes it to its connected neighbors; each neighbor relays it to its own neighbors, and so on until the entire network has seen it.
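To make the flooding idea concrete, here is a minimal sketch that simulates it on a small graph. The five-node network and the `flood` function are hypothetical illustrations, not Bitcoin code; the real client announces data to peers before sending it, which this toy ignores.

```python
from collections import deque

def flood(adjacency, origin):
    """Return {node: hops}, the number of relays before each node first sees the message."""
    seen = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in seen:  # a node relays a given message only once
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return seen

# Hypothetical five-node overlay; the names are illustrative only.
network = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}
hops = flood(network, "A")
```

Every node is reached even though no node knows the full topology, which is the property that lets the full-node network work without a central coordinator.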

The interaction between full nodes and Simplified Payment Verification (SPV) clients resembles a semi-centralized structure. SPV nodes connect to a full node, which acts as a proxy to broadcast transactions on their behalf.

Node Discovery

Node discovery is the first step for any blockchain node joining the network. It’s akin to asking for directions in an unfamiliar place—without a map, you rely on nearby individuals.

Initial Node Discovery

In Bitcoin, initial node discovery occurs in two ways:

DNS Seed Nodes: Community maintainers, such as Bitcoin Core developer Pieter Wuille (sipa), run domains (e.g., seed.bitcoin.sipa.be) that resolve to the IP addresses of multiple active nodes. Clients then connect to those addresses on port 8333.
Hard-Coded Seed Nodes: The client software includes predefined node addresses. If all DNS seeds fail, the node attempts to connect to these hard-coded addresses.
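The two-step lookup above can be sketched as follows. This is an assumption-laden illustration rather than the client's actual logic: `discover_peers` and its parameters are hypothetical names, and the `resolve` argument exists only so the function can be exercised without network access.

```python
import socket

def discover_peers(dns_seeds, fallback_peers, port=8333, resolve=socket.getaddrinfo):
    """Resolve each DNS seed; fall back to hard-coded peers if none resolve."""
    peers = []
    for seed in dns_seeds:
        try:
            infos = resolve(seed, port, type=socket.SOCK_STREAM)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue     # this seed is unreachable, try the next one
        for _family, _type, _proto, _canon, sockaddr in infos:
            address = (sockaddr[0], sockaddr[1])
            if address not in peers:
                peers.append(address)
    # Hard-coded addresses are used only when every DNS seed failed.
    return peers or list(fallback_peers)
```

Calling `discover_peers(["seed.bitcoin.sipa.be"], fallback_peers=[])` on a machine with internet access would return whatever addresses the seed currently advertises.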

Ethereum follows a similar approach, using hard-coded seed nodes in its software.

Post-Startup Node Discovery

After initial discovery, nodes dynamically maintain their peer lists. In Bitcoin, nodes exchange peer lists with neighbors. Ethereum uses a more structured approach: a discovery protocol modeled on Kademlia (KAD), a Distributed Hash Table (DHT) design. Nodes exchange discovery messages over UDP and rank peers by the XOR distance between node IDs; Ethereum uses this to locate nodes rather than to store data. Once nodes are discovered, data exchange switches to TCP.
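The core of Kademlia is its distance metric: the XOR of two node IDs, read as an integer. The sketch below uses tiny 4-bit integers as hypothetical IDs (real IDs are long hashes), and the function names are ours, not Ethereum's.

```python
def xor_distance(a: int, b: int) -> int:
    """Kademlia distance between two node IDs."""
    return a ^ b

def bucket_index(local_id: int, remote_id: int) -> int:
    """Index of the routing-table bucket: position of the highest differing bit."""
    distance = xor_distance(local_id, remote_id)
    if distance == 0:
        raise ValueError("a node does not store itself in its own table")
    return distance.bit_length() - 1

def closest(known_ids, target, count=3):
    """Return the `count` known IDs nearest to `target` by XOR distance."""
    return sorted(known_ids, key=lambda node_id: xor_distance(node_id, target))[:count]

local = 0b1010
peers = [0b1011, 0b1000, 0b0010, 0b1111, 0b0101]
nearest = closest(peers, local, count=2)
```

A lookup repeatedly asks the nearest known peers for peers that are nearer still, so each round roughly halves the remaining distance.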

Blacklists and Persistent Connections

Public blockchains operate in open environments, making them potential targets for attacks. Bitcoin allows users to manually blacklist suspicious nodes and whitelist trusted ones. Ethereum implements account-level blacklisting at the application layer. While not part of standard protocols, these measures enhance security—though network-level firewalls remain a common alternative.
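A blacklist/whitelist check of this kind can be sketched in a few lines. The `PeerPolicy` class is a hypothetical illustration, not any client's actual API, and letting the whitelist override the blacklist is simply a choice made for this sketch.

```python
import ipaddress

class PeerPolicy:
    """Decide whether to accept a peer based on banned and trusted address ranges."""

    def __init__(self):
        self.banned = []
        self.trusted = []

    def ban(self, cidr):
        self.banned.append(ipaddress.ip_network(cidr))

    def trust(self, cidr):
        self.trusted.append(ipaddress.ip_network(cidr))

    def allows(self, ip):
        address = ipaddress.ip_address(ip)
        if any(address in network for network in self.trusted):
            return True  # assumption: trusted entries win over bans
        return not any(address in network for network in self.banned)

policy = PeerPolicy()
policy.ban("203.0.113.0/24")   # documentation-only address range
policy.trust("203.0.113.9")    # a single trusted host inside the banned range
```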

LAN Penetration

Most nodes sit on local networks behind a router, so nodes on the public internet cannot open connections to them directly. To accept inbound connections, such a node needs a port mapping in the router’s network address translation (NAT) table, which Universal Plug and Play (UPnP) can set up automatically.

NAT rewrites the addresses and ports in packet headers so that many local addresses can share one public IP. UPnP lets a device discover the router and request a port mapping without manual configuration.

Both Bitcoin and Ethereum support UPnP for automatic LAN penetration, provided the router supports NAT and UPnP.
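The effect of a port mapping can be shown with a toy model of the router’s table. `NatTable` is a hypothetical class for illustration; a real UPnP request goes to the router over the network (for example via the `AddPortMapping` action of its gateway service) instead of calling a local method.

```python
class NatTable:
    """Toy model of the port-mapping table kept by a home router."""

    def __init__(self, public_ip):
        self.public_ip = public_ip
        self.mappings = {}  # external port -> (internal ip, internal port)

    def add_port_mapping(self, external_port, internal_ip, internal_port):
        if external_port in self.mappings:
            raise ValueError(f"external port {external_port} is already mapped")
        self.mappings[external_port] = (internal_ip, internal_port)

    def route_inbound(self, external_port):
        """Return the internal destination for an inbound packet, or None if it is dropped."""
        return self.mappings.get(external_port)

router = NatTable("203.0.113.10")
router.add_port_mapping(8333, "192.168.1.20", 8333)  # what a node asks for via UPnP
```

Without the mapping, `route_inbound` returns `None` and the outside peer’s connection attempt never reaches the node.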

Node Interaction Protocols

Once connected, nodes communicate using specific commands embedded in message headers. Commands fall into two categories: requests and data exchanges.
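As an illustration of a command embedded in a message header, the sketch below builds the 24-byte header of Bitcoin’s original wire format: a 4-byte network magic, a 12-byte NUL-padded command name, a 4-byte little-endian payload length, and a 4-byte checksum taken from a double SHA-256 of the payload. The function names are ours.

```python
import hashlib
import struct

MAINNET_MAGIC = bytes.fromhex("f9beb4d9")
HEADER_FORMAT = "<4s12sI4s"  # magic, command, payload length, checksum

def checksum(payload: bytes) -> bytes:
    """First four bytes of SHA-256 applied twice to the payload."""
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]

def build_message(command: str, payload: bytes = b"") -> bytes:
    name = command.encode("ascii")
    if len(name) > 12:
        raise ValueError("command names are at most 12 bytes")
    # struct pads the 12-byte command field with NUL bytes.
    header = struct.pack(HEADER_FORMAT, MAINNET_MAGIC, name, len(payload), checksum(payload))
    return header + payload

def parse_header(data: bytes):
    magic, name, length, check = struct.unpack(HEADER_FORMAT, data[:24])
    return magic, name.rstrip(b"\x00").decode("ascii"), length, check

verack = build_message("verack")  # a command with an empty payload
```

A receiver reads the fixed-size header first, learns the command and payload length from it, and only then reads the payload.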

The first step is a handshake: nodes exchange version information to ensure compatibility. Ethereum’s handshake additionally negotiates keys so that the rest of the session is encrypted; Bitcoin’s original protocol sends its messages in plaintext.

After handshaking, nodes maintain persistent connections. Bitcoin uses PING/PONG messages as heartbeats. Ethereum integrates similar functionality into its node discovery process.
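A heartbeat of this kind can be sketched as follows: the ping carries a random 64-bit nonce and the pong echoes it, so the sender can match a reply to its request. The helper names are hypothetical, and timeouts and message framing are left out.

```python
import os
import struct

def make_ping():
    """Return (nonce, payload) for a new ping."""
    nonce = struct.unpack("<Q", os.urandom(8))[0]
    return nonce, struct.pack("<Q", nonce)

def make_pong(ping_payload: bytes) -> bytes:
    """Build the reply by echoing the nonce from the ping."""
    (nonce,) = struct.unpack("<Q", ping_payload)
    return struct.pack("<Q", nonce)

def pong_matches(expected_nonce: int, pong_payload: bytes) -> bool:
    """Check that a pong answers the ping we actually sent."""
    if len(pong_payload) != 8:
        return False
    return struct.unpack("<Q", pong_payload)[0] == expected_nonce

nonce, ping_payload = make_ping()
reply = make_pong(ping_payload)
```

A peer that fails to return a matching pong within some timeout can be treated as dead and disconnected.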

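Request commands include getaddr (to ask a peer for its list of known nodes) and getdata (to request specific blocks or transactions). The inv command, by contrast, announces the hashes of data a node has available rather than transmitting the data itself. Block synchronization, which is critical for blockchain functionality, occurs in two ways: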

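Header-First: Nodes sync headers first, then request block bodies.
Block-First: Nodes request full blocks directly.

Header-first reduces network load: a node can validate the much smaller header chain before downloading any block bodies, and can then request those bodies from several peers in parallel.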
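The two phases of header-first synchronization can be sketched with a toy chain. The header layout here (a dict with `hash`, `prev`, and `body_digest`) is a hypothetical simplification; real Bitcoin headers are 80-byte structures that also carry a Merkle root and proof of work, none of which is modeled.

```python
import hashlib

GENESIS_PREV = "00" * 32

def header_hash(prev_hash: str, body_digest: str) -> str:
    return hashlib.sha256((prev_hash + body_digest).encode()).hexdigest()

def make_chain(bodies):
    """Build a toy chain and return (headers, bodies_by_hash)."""
    headers, store, prev = [], {}, GENESIS_PREV
    for body in bodies:
        digest = hashlib.sha256(body).hexdigest()
        current = header_hash(prev, digest)
        headers.append({"hash": current, "prev": prev, "body_digest": digest})
        store[current] = body
        prev = current
    return headers, store

def headers_first_sync(headers, fetch_body):
    # Phase 1: check that the headers link together before fetching any body.
    prev = GENESIS_PREV
    for header in headers:
        expected = header_hash(prev, header["body_digest"])
        if header["prev"] != prev or header["hash"] != expected:
            raise ValueError("broken header chain")
        prev = header["hash"]
    # Phase 2: fetch bodies (in practice, in parallel from several peers)
    # and verify each one against its already-validated header.
    chain = []
    for header in headers:
        body = fetch_body(header["hash"])
        if hashlib.sha256(body).hexdigest() != header["body_digest"]:
            raise ValueError("body does not match header")
        chain.append(body)
    return chain

headers, store = make_chain([b"genesis", b"block 1", b"block 2"])
synced = headers_first_sync(headers, store.__getitem__)
```

Because a bad header chain is rejected in phase 1, a node never spends bandwidth on block bodies that belong to an invalid chain.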

Conclusion

Blockchain P2P networks address two primary challenges: resource location and resource acquisition. Node discovery and LAN penetration solve the first; interaction protocols address the second.

While this article focused on Bitcoin and Ethereum, most blockchain projects implement similar P2P mechanisms. The stability of the P2P layer directly impacts the entire network’s resilience. As blockchain networks form distributed webs akin to the internet, one might even design a node crawler to map all connected nodes—an intriguing project for enthusiasts.

Frequently Asked Questions

What is the role of P2P networks in blockchain?
P2P networks enable decentralized communication between nodes, ensuring no single point of failure. They facilitate transaction broadcasting, block propagation, and consensus without central servers.

How do nodes find each other in Bitcoin?
Bitcoin nodes use DNS seeds and hard-coded addresses for initial discovery. After startup, they exchange peer lists with connected nodes to dynamically update their connections.

Why is LAN penetration important for blockchain nodes?
Many nodes operate behind firewalls or NAT. LAN penetration techniques like UPnP allow these nodes to receive incoming connections, ensuring they fully participate in the network.

What is the difference between Bitcoin and Ethereum P2P networks?
Bitcoin uses TCP and a simple flood algorithm. Ethereum supports both TCP and UDP, employs encryption, and uses the Kademlia protocol for efficient node discovery.

How do block synchronization methods differ?
Header-first synchronization reduces bandwidth by fetching headers first and bodies later. Block-first requires full blocks immediately, which is simpler but less efficient.

Can I manually manage node connections?
Yes. Bitcoin clients let users blacklist suspicious nodes or whitelist trusted ones, and Ethereum applies account-level blacklisting at the application layer. Network-level firewalls offer a further layer of security and control.