Merkle Root
  • Hash256
  • Hash160
  • Reverse Bytes
Youtube Twitter

Merkle Root A fingerprint for all the transactions in a block.

A merkle root is created by hashing together pairs of TXIDs, which gives you a short yet unique fingerprint for all the transactions in a block.

This merkle root is then used as a field in a block header, which means that every block header will have a short representation of every transaction inside the block.

Why do we use a merkle root?

If we wanted to create a unique fingerprint for all the transactions in a block, we could just hash all the TXIDs together in one go. However, if we later wanted to check that a TXID was a part of that hash, we would need to know all the other TXIDs too:

You would need every other TXID to recreate the same hash.

You would need every other TXID to recreate the same hash.

But with a merkle tree, if we want to check that a TXID is part of the merkle root, we would only need to know some of the hashes along the path of the tree:

You only need the right branches (the "merkle proof") to reconstruct the merkle root

You only need the right branches (the "merkle proof") to reconstruct the merkle root

As a result, by using a merkle root as our fingerprint for the block header, we can later find out if a transaction exists in a block without having to know every other TXID in the block.

A merkle tree is just an efficient way to prove that something is in a set, without having to know the full set.

Is it really worth it?

There doesn't appear to be much benefit to using merkle trees when you have a small number of transactions in a block, but you can really see the difference when you have a much larger number of starting "leaves" in the tree:

In this block of 28 transactions, instead of needing 27 other hashes, we only need 5 to prove that a TXID is used as part of the final hash.

In this block of 28 transactions, instead of needing 27 other hashes, we only need 5 to prove that a TXID is used as part of the final hash.

Merkle trees are much more efficient, especially when you're dealing with a blocks that have 2,000+ transactions...

Merkle Proof Example

Let's say we have only have a block header for a block with 2,352 transactions, and we want to check that a specific TXID is inside the block.

  • Without a merkle root (i.e. just a simple hash of all the txids in the block header), we would need to download 75,232 bytes (2,351 x 32 byte TXIDs) of data to recreate the fingerprint in the block header and verify that the TXID exists in the block.
  • With a merkle root, we only need to download 384 bytes (12 x 32 byte branches along the path of the merkle tree) to recreate the merkle root and verify that the TXID exists in the block.

Here's what the merkle proof looks like:

txid
----
51a3dd31a49acb157d010f08e5c4774721d6dd39217866f2ed42d209b66a6ff6

merkle proof
------------
50ba87bdd484f07c8c55f76a22982f987c0465fdc345381b4634a70dc0ea0b38 left
96b8787b1e3abed802cff132c891c2e511edd200b08baa9eb7d8942d7c5423c6 right
65e5a4862b807c83b588e0f4122d4ca2d46691d17a1ec1ebce4485dccc3380d4 left
1ee9441ddde02f8ffb910613cd509adbc21282c6e34728599f3ae75e972fb815 left
ec950fc02f71fc06ed71afa4d2c49fcba04777f353a001b0bba9924c63cfe712 left
5d874040a77de7182f7a68bf47c02898f519cb3b58092b79fa2cff614a0f4d50 left
0a1c958af3e30ad07f659f44f708f8648452d1427463637b9039e5b721699615 left
d94d24d2dcaac111f5f638983122b0e55a91aeb999e0e4d58e0952fa346a1711 left
c4709bc9f860e5dff01b5fc7b53fb9deecc622214aba710d495bccc7f860af4a left
d4ed5f5e4334c0a4ccce6f706f3c9139ac0f6d2af3343ad3fae5a02fee8df542 left
b5aed07505677c8b1c6703742f4558e993d7984dc03d2121d3712d81ee067351 left
f9a14bf211c857f61ff9a1de95fc902faebff67c5d4898da8f48c9d306f1f80f left

merkle root
-----------
17663ab10c2e13d92dccb4514b05b18815f5f38af1f21e06931c71d62b36d8af

The merkle proof contains a list of the hashes along the branches of the merkle tree that we need to get to the merkle root. We just start with the TXID "leaf" we want to check, and recursively concatenate and hash through this merkle proof, which will give us the merkle root at the bottom.

The branches of the merkle proof also need to let you know if they are on the "left" or "right", so you can concatenate each pair in the correct order before hashing them together.

So whilst merkle trees take a little more effort in the beginning, they save energy when it comes to verification later on.

When are merkle trees useful in bitcoin?

Thanks to merkle trees, you can create "thin nodes" (or "lightweight wallets") that can verify when a transaction has made it in to a block, without the overhead of having to download and store the entire the blockchain.

These wallets just download and store block headers, and use the merkle roots inside them (along with merkle proofs they receive from full nodes) to verify that a transaction has made it in to a block.

Block headers are just 80 bytes, whereas each block can be 1,000,000+ bytes.

Block headers are just 80 bytes, whereas each block can be 1,000,000+ bytes.

I have no experience with this though, so here are some interesting links instead:

How do you create a merkle root?

Here's a technical diagram that explains how merkle roots are created in bitcoin:

And here's some Ruby code for creating a merkle root (from an array of TXIDs). It's worth a read, even if you're not a programmer at the moment.

require 'digest' # Need this for the SHA256 hash function

# Hash function used in the merkle root function (and in bitcoin in general)
def hash256(hex)
    binary = [hex].pack("H*")
    hash1 = Digest::SHA256.digest(binary)
    hash2 = Digest::SHA256.digest(hash1)
    result = hash2.unpack("H*")[0]
    return result
end

def merkleroot(txids)
  # 0. Keep an array of results for each level of hashing
  result = []

  # 1. Split up array in to pairs
  txids.each_slice(2) do |one, two|
    # 2. Concatenate each pair (or concatenate with itself if not a pair)
    if (two)
      concat = one + two
    else
      concat = one + one
    end

    # 3. Hash the concatenated pair and add to results array
    result << hash256(concat)
  end

  # Recursion: Exit Condition - Stop recursion when we have one final hash result.
  if result.length == 1
    # Convert the result to a string and return it
    return result.join('')
  end

  # Recursion: Do the same thing again for the array of hashed pairs.
  merkleroot(result)
end


# Test (e.g. block 000000000003ba27aa200b1cecaad478d2b00432346c3f1f3986da1afd33e506)
txids = [
  "8c14f0db3df150123e6f3dbbf30f8b955a8249b62ac1d1ff16284aefa3d06d87",
  "fff2525b8931402dd09222c50775608f75787bd2b87e56995a7bdd30f79702c4",
  "6359f0868171b1d194cbee1af2f16ea598ae8fad666d9b012c8ed2b79a236ec4",
  "e9a66845e05d5abc0ad04ec80f774a7e585c6e8db975962d069a522137b80c1d"
]

txids = txids.map {|x| x.scan(/../).reverse.join('') } # TXIDs must be in little endian
result = merkleroot(txids) # The result is in little endian, so lets convert it back to big endian...
puts result.scan(/../).reverse.join('') # f3e94742aca4b5ef85488dc37c06c3282295ffec960994b2c0d5ac2a25a95766

Why is it called a "merkle" root?

Because Ralph Merkle patented it in 1979.

Common Misconception.

Code

By Greg Walker,

Last Updated: 14 Feb 2019
  • 14 Feb 2019: merkle root - using full copy of ruby code instead of including it - more reliable for now
  • 14 Feb 2019: Merkle Root - complete rewrite. Updated diagrams and included nice ruby code example.
  • 25 Nov 2018: Subtitles for pages in Glossary. Looks good.
  • 20 Mar 2018: CSS - Separated mobile CSS in to its own file, and changed all em to px
  • 28 Feb 2017: hash-function.md - added xlinks
  • 26 Feb 2017: txid.md - edited
  • 03 Jan 2017: Fixed broken HTML when using pandoc.
  • 01 Jan 2017: Moved /manual/guide/ and /manual/reference/ to /guide/ and /glossary/

Hey there, it's Greg.

I'll let you know about cool website updates, or if something seriously interesting happens in bitcoin.


Don't worry, it doesn't happen very often.