
Checksum

A simple method for error-checking data.

A checksum is a small piece of data that allows you to check whether another piece of data is the same as expected.

For example, in Bitcoin, addresses include checksums so they can be checked to see if they have been typed in correctly.


How do they work?

In Bitcoin, checksums are created by hashing data through SHA256 twice, and then taking the first 4 bytes of the result.

This gives you a small, reliable, and fairly unique snippet of data. A bit like a fingerprint.

You would then keep the data and the checksum together, so that you can check that the whole thing has been typed in correctly the next time you use it.

If you make one small mistake (in any part), the data will no longer match the checksum.
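
As a minimal sketch of this idea in Ruby (the helper names `checksum` and `valid?` and the sample data are assumptions chosen for illustration), the following keeps some hexadecimal data and its checksum together, then shows that changing a single character makes the check fail:

```ruby
require 'digest'

# First 4 bytes of SHA256(SHA256(data)), where data is a hexadecimal string
def checksum(hex)
  bytes = [hex].pack("H*")
  Digest::SHA256.digest(Digest::SHA256.digest(bytes))[0, 4].unpack("H*")[0]
end

# Returns true if the last 4 bytes (8 hex characters) match the checksum of the rest
def valid?(hex_with_checksum)
  data  = hex_with_checksum[0...-8]
  given = hex_with_checksum[-8..-1]
  checksum(data) == given
end

data     = "00662ad25db00e7bb38bc04831ae48b4b446d12698" # sample data (hexadecimal)
original = data + checksum(data)                        # keep the data and checksum together

# Simulate a typo: change the third character from "6" to "7"
typo = original.dup
typo[2] = "7"

puts valid?(original) # the untouched string verifies
puts valid?(typo)     # the string with the typo almost certainly does not
```

The check only tells you that something is wrong. It cannot tell you which character was mistyped.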

So basically, a checksum is a handy little error-checking tool.

Where are checksums used in Bitcoin?

Checksums are included in:

These two keys are commonly transcribed (copied, pasted, typed, written down, etc.), so it’s useful for them to contain checksums.

The presence of a checksum enables software to validate these types of keys when they are typed in. The software won’t be able to tell you what the key should be, but at least it will be able to save you from sending money to the wrong address due to a typo.

Creating a checksum.

As mentioned, checksums in Bitcoin are created by hashing data through SHA256 twice and taking the first 4 bytes.

You could call a checksum in Bitcoin a “truncated SHA256 hash”.

Example.

data                 = learnmeabitcoin
sha256(sha256(data)) = 52bbde771cbf39f8a7db44372ba3ed2336276f95e3a8723388a28943cc95df57
checksum             = 52bbde77

1 byte = 2 hexadecimal characters

Code.

This is how you might calculate a checksum in Ruby:

require 'digest'

def checksum(data)
    # 1. Convert the hexadecimal string to raw bytes before hashing it (data must be a hex string)
    binary = [data].pack("H*")

    # 2. Hash the data twice
    hash1 = Digest::SHA256.digest(binary)
    hash2 = Digest::SHA256.digest(hash1)

    # 3. Take the first 4 bytes
    checksum = hash2[0,4]

    # 4. Convert binary back to hexadecimal and return result
    hex = checksum.unpack("H*")[0] # unpack returns an array, so [0] just grabs the first result
    return hex
end

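# checksum() expects a hexadecimal string, so pass it hex data (here, the address data from the example further down)
puts checksum("00662ad25db00e7bb38bc04831ae48b4b446d12698")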

Checking a checksum.

You can verify a checksum by calculating the expected checksum for a piece of data, and comparing it with the one given.

Example: checking if an address is valid.

A common situation is checking that a given address is valid (all addresses come with a checksum inside).

To do this, you first of all need to decode the address from base58. Then you separate the data part from the checksum, and verify that the checksum you calculate from the data matches the one given.

address        = "1AKDDsfTh8uY4X3ppy1m7jw1fVMBSMkzjP" # typical P2PKH address
base58_decoded = "00662ad25db00e7bb38bc04831ae48b4b446d1269817d515b6" # (base58 decoding not shown here)

data           = "00662ad25db00e7bb38bc04831ae48b4b446d12698" # 1-byte prefix + 20-byte public key hash
checksum       = "17d515b6" # 4-byte checksum

data_checksum  = checksum("00662ad25db00e7bb38bc04831ae48b4b446d12698") # calculate the checksum of the data part only
               = "17d515b6" # check it matches the one given

A base58-decoded address contains a prefix, the hash160 of something (e.g. a public key), and a checksum. But all you really need to know here is that the checksum is the last 4 bytes.

Code

This Ruby code uses the same checksum() function as above.

require 'digest'

# Checksum function
def checksum(data)
    binary = [data].pack("H*")
    hash1 = Digest::SHA256.digest(binary)
    hash2 = Digest::SHA256.digest(hash1)
    checksum = hash2[0,4]
    hex = checksum.unpack("H*")[0]
    return hex
end

# Get an address and decode it from base58
address        = "1AKDDsfTh8uY4X3ppy1m7jw1fVMBSMkzjP" # example address
base58_decoded = "00662ad25db00e7bb38bc04831ae48b4b446d1269817d515b6" # (base58 decoding not shown here)

# Separate the data part from the checksum
data           = base58_decoded[0...-8] # everything apart from the last 8 characters
given_checksum = base58_decoded[-8..-1] # the last 8 characters (4 bytes)

# Calculate the checksum for the data
data_checksum = checksum(data)

# Check to see if it matches the checksum given
verify = data_checksum == given_checksum

# Results
puts data_checksum
puts given_checksum
puts verify #=> true

As long as you can get to the data and the checksum, the verification part is pretty straightforward.
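
The base58 decoding step is not shown in the code on this page. As a rough sketch of how the whole check could look end to end, the Ruby below adds a simple decoder and wraps everything in one function. The names `BASE58_ALPHABET`, `base58_decode` and `valid_address?` are hypothetical helpers written for this example, not part of any library, and the function only verifies the checksum (it does not check the prefix or the length of the address).

```ruby
require 'digest'

# The 58 characters used by Bitcoin's base58 (no 0, O, I or l)
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

# Decode a base58 string into a hexadecimal string
def base58_decode(string)
  # Treat the characters as the digits of one large base-58 number
  number = 0
  string.each_char do |char|
    digit = BASE58_ALPHABET.index(char)
    raise ArgumentError, "invalid base58 character: #{char}" if digit.nil?
    number = number * 58 + digit
  end

  # Convert the number to hex, padded to a whole number of bytes
  hex = number.zero? ? "" : number.to_s(16)
  hex = "0" + hex if hex.length.odd?

  # Each leading "1" in base58 stands for one leading zero byte
  leading_zero_bytes = string[/\A1*/].length
  ("00" * leading_zero_bytes) + hex
end

# First 4 bytes of SHA256(SHA256(data)), where data is a hexadecimal string
def checksum(hex)
  bytes = [hex].pack("H*")
  Digest::SHA256.digest(Digest::SHA256.digest(bytes))[0, 4].unpack("H*")[0]
end

# Returns true if the checksum inside the address matches the rest of the data
def valid_address?(address)
  decoded = base58_decode(address)
  return false if decoded.length <= 8 # nothing left over besides a checksum
  checksum(decoded[0...-8]) == decoded[-8..-1]
rescue ArgumentError
  false # contains a character that is not in the base58 alphabet
end

puts base58_decode("1AKDDsfTh8uY4X3ppy1m7jw1fVMBSMkzjP")
puts valid_address?("1AKDDsfTh8uY4X3ppy1m7jw1fVMBSMkzjP") # example address from above
puts valid_address?("1AKDDsfTh8uY4X3ppy1m7jw1fVMBSMkzjQ") # same address with the last character changed
```

Ruby's arbitrary-size integers keep the decoder short, since the whole address can be handled as a single number.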

FAQ

Why only the first 4 bytes?

It would be safer and more reliable to use the full hash result as a checksum. However, this would make addresses much longer, as the entire 32-byte hash would have to be included.

Taking the first 4 bytes gives you enough “uniqueness” to be pretty sure the original data is correct, without making the final address inconveniently long. It’s a balance between reliability and convenience.

What are the chances you make a mistake, but still get the same checksum result?

The checksum is effectively a random 4-byte value, which can take 2^32 (0x100000000) different values, so there is roughly a 1 in 4,294,967,296 chance of that happening.

So pretty slim.
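
To see where a figure like this comes from, here is a small sketch in Ruby that counts the distinct values a 4-byte checksum can take (the variable names are just for illustration, and it assumes every checksum value is equally likely):

```ruby
checksum_bytes  = 4
possible_values = 256 ** checksum_bytes # each byte has 256 possible values

puts possible_values          # number of distinct 4-byte checksums, in decimal
puts possible_values.to_s(16) # the same number in hexadecimal
puts 1.0 / possible_values    # chance that corrupted data still matches by accident
```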

Thanks

  • Gregory Maxwell, for the quick computer science lesson on (and the history of) checksums.


By Greg Walker,

Last Updated: 12 Nov 2020
  • 12 Nov 2020: checksum.md - ruby code for checking if an address is valid by verifying the checksum inside
  • 21 Jul 2020: renamed /guide/ to /technical/