Through our previous studies, we have gained a macro-level understanding of how nodes in a blockchain network collaborate externally. Next, we will delve into the blockchain from a more micro perspective.
Have you ever wondered what the inside of each block in a blockchain looks like? How does it store such a vast and complex amount of data?
This article will take you on a deep dive to find out!
The Overall Structure of a Blockchain
A blockchain gets its name from its structure: it is literally a chain of blocks, each containing transaction information, linked together end to end. Each block is a single link in this chain.
How are blocks able to link to each other sequentially?
Each block points to the previous one through a value called the parent hash (explained in the section on the block header). That block in turn points to the one before it, and so on. In this way, blocks connect to form a chain that can be traced all the way back to the genesis block.
This is another example of large-scale collaboration. Each block just needs to follow its own simple rules to form a complex system.
The Overall Structure of a Single Block
Each block primarily consists of two parts: the block header and the block body. The block header is mainly used to store certain attributes related to the block itself, while the block body is used to store the actual transaction data records.
A block is connected to a parent block before it and a child block after it, as shown in the diagram below:
The Block Body
Let's start with the block body and see how it stores transaction data.
The block body holds all the transaction records that were verified and included when the current block was created. These records are hashed through a Merkle tree to produce a single Merkle root, and this root is recorded in the block header.
What is a Merkle Root?
First, understand the Merkle tree. A Merkle tree is a binary hash tree: a binary tree whose nodes contain cryptographic hash values. It is a data structure used for quickly summarizing large amounts of data and verifying its integrity. The term "tree" is often used in computer science to describe a data structure with branches.
In the Bitcoin network, Merkle trees are used to summarize all transactions in a block. They generate a digital fingerprint for the entire transaction set and provide an efficient way to verify whether a transaction exists within a block.
Generating a complete Merkle tree requires recursively hashing pairs of hash nodes and inserting the newly generated hash nodes into the Merkle tree until only one hash node remains. This node is the root of the Merkle tree.
In simple terms, you can think of a Merkle tree as an upside-down tree where each branch can only split into two smaller branches, and eventually, each smallest branch will hold two leaves.
Here, each leaf is a transaction record, and each branch point is a hash value. Each hash value is calculated based on the hash values of the two smaller branches or leaves that the branch point splits into.
The hash values of these branch nodes converge upward to a higher branch point, where they are hashed again to generate another hash value. This process continues all the way up to the root of the tree. The hash value calculated for this root is the root hash value. This structure makes it quick to locate a specific transaction within the tree.
A key characteristic of a Merkle tree is: any change in the underlying data will propagate to its parent node, all the way up to the root.
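The pairing process described above can be sketched in a few lines of Python. This is a minimal sketch, not Bitcoin's actual implementation: the transaction byte strings are hypothetical stand-ins for real serialized transactions, and the function names are our own. It does follow two Bitcoin conventions: hashing twice with SHA-256, and pairing an unpaired last hash with a copy of itself.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    """Hash twice with SHA-256, as Bitcoin does for transactions and blocks."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaf_hashes: list) -> bytes:
    """Combine hashes pairwise, level by level, until one hash remains."""
    if not leaf_hashes:
        raise ValueError("a Merkle tree needs at least one leaf")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # An unpaired last hash is paired with a copy of itself.
            level.append(level[-1])
        level = [double_sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical stand-ins for serialized transactions.
transactions = [b"alice->bob:1", b"bob->carol:2", b"carol->dave:3"]
leaves = [double_sha256(tx) for tx in transactions]
root = merkle_root(leaves)

# Change one transaction and rebuild: the change propagates up to the root.
tampered = [b"alice->bob:9"] + transactions[1:]
tampered_root = merkle_root([double_sha256(tx) for tx in tampered])

print(root.hex())
print(tampered_root.hex())
print(root != tampered_root)  # True
```

Because the root depends on every leaf, comparing two 32-byte roots is enough to tell whether two transaction sets differ anywhere.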
Understanding the Merkle tree will give you a deeper appreciation of the block body structure diagram mentioned above.
The Block Header
The block header is primarily composed of three sets of data. The first set is the hash of the parent block. This parent hash is used to connect the block to its previous block. The second set of data is related to miner competition and involves the difficulty, the timestamp, and the nonce (a number that miners vary while searching for a valid block). The third set is the root hash value calculated from the block body we just discussed, i.e., the Merkle root.
It's important to focus on one concept here: What is a parent hash?
Performing a hash operation on the data in a block's header generates a hash value. Any change in the block header's data will cause this hash value to change. Therefore, this hash value can serve as a unique identifier for the block.
This hash value can be used to find the corresponding block in the blockchain. For a new block that follows, this hash value is the parent hash.
One crucial point to note: A block does not store its own hash value; it only stores the hash value of its parent block. Its own hash value will be stored in the child block as that child's parent hash.
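To make this concrete, here is a sketch of how a block's hash can be computed from its header fields. Two details go beyond what is described above: real Bitcoin headers also carry a 4-byte version field, and the difficulty is stored in a compact form usually called "bits". The function and parameter names are our own, and the sample values are the widely published fields of the Bitcoin genesis block, transcribed here by hand.

```python
import hashlib
import struct


def block_hash(version: int, parent_hash: str, merkle_root: str,
               timestamp: int, bits: int, nonce: int) -> str:
    """Serialize an 80-byte header and return its double SHA-256 as hex."""
    header = struct.pack(
        "<I32s32sIII",                      # little-endian, no padding
        version,
        bytes.fromhex(parent_hash)[::-1],   # displayed hashes are byte-reversed
        bytes.fromhex(merkle_root)[::-1],
        timestamp,
        bits,                               # compact encoding of the difficulty
        nonce,
    )
    assert len(header) == 80
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()               # reverse again for display


# Fields of the genesis block; it has no parent, so its parent hash is all zeros.
genesis_hash = block_hash(
    version=1,
    parent_hash="00" * 32,
    merkle_root="4a5e1e4baab89f3a32518a88c31bc87f"
                "618f76673e2cc77ab2127b7afdeda33b",
    timestamp=1231006505,
    bits=0x1D00FFFF,
    nonce=2083236893,
)
print(genesis_hash)
```

Notice that the hash is an output of the function, not one of its inputs: nothing in the header stores the block's own hash. The printed value is what the next block would record as its parent hash.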
Because each block's header contains the hash of its parent, and a header holds exactly one parent hash, every block has exactly one parent. Any block can therefore be traced back to the genesis block (the first block) by following parent hashes.
Introducing the concept of the parent hash not only links blocks together but also ensures the immutability of the blockchain.
Since the block header contains the parent block's hash, the current block's hash is also influenced by this value. If the data in the parent block changes, its hash value will inevitably change. This means a child block can no longer connect to the previous block using the original parent hash.
Therefore, anyone who wants to alter the data in one block would have to recalculate that block and every block after it. This recalculation requires an enormous amount of computing power, which makes such an attack practically infeasible and further ensures the security of the blockchain network.
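The broken-link effect can be demonstrated with a toy chain. This is a deliberately simplified sketch: `ToyBlock`, `build_chain`, and `first_broken_link` are hypothetical names, the header holds only a parent hash and a stand-in for the Merkle root, and there is no mining, so it shows only how tampering is detected, not how expensive it is to repair.

```python
import hashlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyBlock:
    parent_hash: str   # hash of the previous block's header
    merkle_root: str   # stand-in for the summary of the block body

    def hash(self) -> str:
        """The block's identifier is computed from its header, never stored in it."""
        header = (self.parent_hash + self.merkle_root).encode()
        return hashlib.sha256(header).hexdigest()


def build_chain(bodies: List[str]) -> List[ToyBlock]:
    chain = []
    parent = "00" * 32  # the first block has no real parent
    for body in bodies:
        block = ToyBlock(parent, hashlib.sha256(body.encode()).hexdigest())
        chain.append(block)
        parent = block.hash()
    return chain


def first_broken_link(chain: List[ToyBlock]) -> Optional[int]:
    """Return the index of the first block whose parent hash no longer matches."""
    for i in range(1, len(chain)):
        if chain[i].parent_hash != chain[i - 1].hash():
            return i
    return None


chain = build_chain(["tx set 0", "tx set 1", "tx set 2", "tx set 3"])
print(first_broken_link(chain))  # None: every link is intact

# Forge the contents of block 1 without touching any later block.
chain[1].merkle_root = hashlib.sha256(b"forged tx set").hexdigest()
print(first_broken_link(chain))  # 2: block 2 no longer points at block 1
```

To hide the forgery, the attacker would have to rewrite the parent hash in block 2, which changes block 2's own hash and breaks block 3, and so on to the tip of the chain.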
The second set of data in the block header—difficulty, timestamp, and Nonce—will be explained in detail later when discussing mining. For now, we just need to know that the block header contains these elements to help build an overall concept early on.
Why Is the Block Designed This Way?
We have basically dissected the internal structure of a block. But have you considered the advantages of designing a block in this manner?
First, understand that the block header is 80 bytes, while a typical transaction is at least 250 bytes. Furthermore, the average block contains over 500 transactions. Therefore, a complete block body containing all of its transactions is more than 1000 times larger than the block header.
The blockchain is a distributed network, so data needs to be stored on various nodes. However, the complete data of the Bitcoin network might add up to tens or even hundreds of gigabytes—more than an ordinary terminal can handle. Many Bitcoin clients are designed to run on devices with limited space and power.
Devices like smartphones, tablets, and embedded systems cannot store all the data of the Bitcoin network. So, what can be done?
This is where the brilliance of the block structure shines. Many nodes in the Bitcoin network are primarily used for verifying transactions. They only need to download the block headers, not the transaction information contained in each block, to complete transaction verification.
Such a blockchain, without transaction information, is only a fraction of the size (about 1/1000th) of the complete blockchain, greatly saving terminal storage space.
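The figures quoted above can be checked with some quick arithmetic. The 10-minute block interval used in the second half is an assumption on our part (it is Bitcoin's target interval, but it is not stated above), so treat the yearly estimate as a rough figure.

```python
HEADER_BYTES = 80        # fixed size of a block header
MIN_TX_BYTES = 250       # per-transaction size quoted above
TXS_PER_BLOCK = 500      # per-block transaction count quoted above

body_bytes = MIN_TX_BYTES * TXS_PER_BLOCK
ratio = body_bytes / HEADER_BYTES
print(body_bytes)   # 125000
print(ratio)        # 1562.5

# Assumption (not stated above): roughly one block every 10 minutes.
blocks_per_year = 6 * 24 * 365
header_bytes_per_year = blocks_per_year * HEADER_BYTES
print(header_bytes_per_year)  # 4204800, about 4.2 MB of headers per year
```

Even with these lower-bound figures, a block body is over 1500 times the size of its header, which is consistent with the "about 1/1000th" estimate.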
It is precisely because of the block's internal structure that a terminal can verify a specific transaction without downloading full block bodies, relying on the block headers and the Merkle tree structure. This is called Simplified Payment Verification (SPV). Nodes operating this way are called SPV nodes. The principle behind the implementation will be explained in the next article.
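As a preview, here is a minimal sketch of the idea behind such a check. A node that holds only a Merkle root can confirm that a transaction belongs to a block if someone supplies the sibling hashes along the path from that transaction up to the root. The function names and the `(sibling, sibling_is_left)` proof format are our own illustration, not Bitcoin's wire format.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_levels(leaves):
    """Return every level of the tree, from the leaves up to the root."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        level = list(levels[-1])
        if len(level) % 2 == 1:
            level.append(level[-1])          # pair a lone hash with itself
        levels.append([double_sha256(level[i] + level[i + 1])
                       for i in range(0, len(level), 2)])
    return levels


def merkle_proof(leaves, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    proof = []
    for level in merkle_levels(leaves)[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1                  # neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_proof(leaf, proof, root):
    """Recompute the path to the root using only the leaf and its siblings."""
    current = leaf
    for sibling, sibling_is_left in proof:
        pair = sibling + current if sibling_is_left else current + sibling
        current = double_sha256(pair)
    return current == root


# Hypothetical stand-ins for the transactions of one block.
txs = [b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e"]
leaves = [double_sha256(tx) for tx in txs]
root = merkle_levels(leaves)[-1][0]          # what the block header would store

proof = merkle_proof(leaves, 3)              # a full node builds this for tx-d
print(len(proof))                                        # 3
print(verify_proof(leaves[3], proof, root))              # True
print(verify_proof(double_sha256(b"fake"), proof, root)) # False
```

For five transactions the proof holds three hashes. The proof grows with the height of the tree rather than with the number of transactions, which is why a lightweight device can afford the check.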
Frequently Asked Questions
What is the main purpose of the Merkle root in a Bitcoin block?
The Merkle root serves as a digital fingerprint for all transactions within a block. It allows for efficient and secure verification of whether a specific transaction is included in the block without needing to download the entire block's data.
How does the parent hash contribute to blockchain security?
The parent hash links each block to its predecessor. Altering a block would change its hash, breaking the link with all subsequent blocks. To successfully alter the chain, every following block would need to be recomputed, which is computationally infeasible, thus securing the network.
What is an SPV node, and what does it do?
An SPV (Simplified Payment Verification) node is a lightweight client that does not store the entire blockchain. It only downloads block headers to verify transactions, relying on the Merkle tree structure to prove inclusion, making it suitable for devices with limited resources.
Why is the block header much smaller than the block body?
The header contains only summary data (like hashes and metadata), while the body contains all the actual transaction details. This design allows nodes to verify transactions efficiently without storing massive amounts of data.
Can the structure of a Bitcoin block be changed?
Changing the core structure of a Bitcoin block would require a consensus among the network participants. Such changes are proposed through Bitcoin Improvement Proposals (BIPs) and are implemented only if widely accepted, ensuring network stability.
What happens if a transaction within a block is incorrect or fraudulent?
The consensus rules of the Bitcoin network ensure all transactions in a block are validated by miners before inclusion. An invalid transaction would be rejected by honest nodes during the validation process and would not be included in a valid block.
Summary
The tight connection between the block body and the block header, the coordinated sequence of block headers and parent-child blocks, and the interlocking of various nodes create an impregnable chain. The deeper one studies Bitcoin, the more one can appreciate its power and the astonishingly clever ideas behind it.
We are not studying code here but rather observing the internal structure of a Bitcoin block from the perspective of overall design philosophy. We hope this article has provided you with a clear understanding.