
Hash Function

A small program that scrambles data.

A hash function is a mini computer program that takes data, scrambles it, and gives you a unique fixed-length result.

The cool thing about hash functions is that:

  • You can put as much data as you want into the hash function, but it will always return a result of the same length.
  • The result is effectively unique, so you can use it as a way to identify that data.

So in other words, a hash function allows you to create a digital fingerprint for whatever data you put into it.
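Here's a minimal Ruby sketch of the fixed-length idea, using SHA-256 from the standard digest library. The fingerprint helper name is just something I've made up for this example.

```ruby
require 'digest'

# Returns the SHA-256 hash of a string as hexadecimal characters.
# (fingerprint is a hypothetical helper name used for this example.)
def fingerprint(data)
  Digest::SHA256.hexdigest(data)
end

# Three inputs of very different sizes
inputs = ["a", "learnmeabitcoin", "x" * 1_000_000]

inputs.each do |data|
  hash = fingerprint(data)
  puts "#{data.bytesize} bytes in -> #{hash.length} hex characters out"
end
```

Whatever the size of the input, the result is always 64 hexadecimal characters (32 bytes).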

Hash function properties

A good hash function has 3 important properties that make it useful.

Note: The SHA256 hash function is the main one used in Bitcoin, so I’ll use that in my upcoming examples.

1. You cannot work out the original data from the result.

A cryptographic hash function produces a random-looking result (with no discernible patterns), so there is no way of “going backwards” through the hash function to figure out what the original data was.

This is the defining property of a cryptographic hash function. You may be able to reconstruct the original data from the result of a “basic” hash function, but a cryptographic hash function’s job is to make this as difficult as possible.
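To illustrate, here's a rough Ruby sketch of the only practical way to “reverse” a hash: guessing inputs and checking each one. The guess_input helper and the three-letter input are assumptions I've chosen to keep the search tiny.

```ruby
require 'digest'

# Guess-and-check search over every lowercase string of the given length.
# Returns the matching input, or nil if none of the candidates match.
# (guess_input is a hypothetical helper name used for this example.)
def guess_input(target_hash, length)
  ('a'..'z').to_a.repeated_permutation(length) do |letters|
    candidate = letters.join
    return candidate if Digest::SHA256.hexdigest(candidate) == target_hash
  end
  nil
end

target = Digest::SHA256.hexdigest("btc")
puts guess_input(target, 3) # tries at most 26 * 26 * 26 = 17,576 candidates
```

This only works because the input is known to be very short. The number of candidates grows exponentially with the length of the data, so guessing quickly becomes impractical for anything realistic.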

2. The same data always returns the same result.

A hash function scrambles data systematically, so that the same input will always produce the same result.

If you put some data into a hash function, you can be sure that data is going to produce the same result every time.

For example:

data                sha256(data)
---------------     ----------------------------------------------------------------
learnmeabitcoin     ef235aacf90d9f4aadd8c92e4b2562e1d9eb97f0df9ba3b508258739cb013db2
learnmeabitcoin     ef235aacf90d9f4aadd8c92e4b2562e1d9eb97f0df9ba3b508258739cb013db2
learnmeabitcoin     ef235aacf90d9f4aadd8c92e4b2562e1d9eb97f0df9ba3b508258739cb013db2
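You can check this yourself with a few lines of Ruby. This is only a small sketch, and the hash_three_times helper is a name I've made up for it.

```ruby
require 'digest'

# Hash the same string three separate times and collect the results.
# (hash_three_times is a hypothetical helper name used for this example.)
def hash_three_times(data)
  3.times.map { Digest::SHA256.hexdigest(data) }
end

results = hash_three_times("learnmeabitcoin")
puts results
puts "all identical? #{results.uniq.length == 1}"
```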

3. Different data produces different results.

If you put unique data into the hash function, the hash function will give you a unique result.

Even the smallest changes in data return wildly different results.

For example:

data                sha256(data)
----------------    ----------------------------------------------------------------
learnmeabitcoin     ef235aacf90d9f4aadd8c92e4b2562e1d9eb97f0df9ba3b508258739cb013db2
learnmeabitcoin1    f94a840f1e1a901843a75dd07ffcc5c84478dc4f987797474c9393ac53ab55e6
learnmeabitcoin2    b9638ef00b064055b5d0b408414be02f3ab66cce752c7ac3b7595b0fffaa6567
learnmeabitcoin3    c6fd80741e150fb7ee71453fb0a2a391261f6a0d4d60759b843639e6cbae7b91
learnmeabitcoin4    255da46dc8699fffd841b7c66a31eeb4f8eda8e1ca6850c7356376518f52d3c1
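To put a number on “wildly different”, here's a small Ruby sketch that counts how many of the 256 output bits change between two hashes. The differing_bits helper is my own name for this example.

```ruby
require 'digest'

# Counts how many bits differ between the SHA-256 hashes of two strings.
# (differing_bits is a hypothetical helper name used for this example.)
def differing_bits(a, b)
  hash_a = Digest::SHA256.digest(a).bytes
  hash_b = Digest::SHA256.digest(b).bytes

  # XOR each pair of bytes, then count the 1 bits in the result
  hash_a.zip(hash_b).sum { |x, y| (x ^ y).to_s(2).count("1") }
end

puts differing_bits("learnmeabitcoin", "learnmeabitcoin1")
```

With a good hash function you would expect roughly half of the bits (somewhere around 128 out of 256) to change, even though the input only changed by one character.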

If two different pieces of data were found to return the same result it would be called a “collision”, and it would mean the hash function was broken.

Where are hash functions used in Bitcoin?

1. Transaction Hashes

You hash transaction data to get a TXID (Transaction ID, Transaction Hash).

  • The ability to hash a long string of transaction data into a short, unique string allows you to create a unique identifier for each transaction.

2. Block Hashes (and Mining)

You hash block headers to get a block hash.

  • So you can also create a unique ID for each block.
  • The fact that each hash result is random allows for the mechanism of mining.
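As a rough illustration of why random-looking results allow for mining, here's a toy Ruby sketch. It is heavily simplified: real mining hashes an 80-byte block header and compares the result against a numeric target, whereas this just appends a counter to some made-up data and looks for a hash starting with 00. The helper names and the example data are my own assumptions.

```ruby
require 'digest'

# Double SHA-256 of raw bytes, returned as a hexadecimal string.
def hash256_bytes(bytes)
  Digest::SHA256.hexdigest(Digest::SHA256.digest(bytes))
end

# Toy "mining": keep changing a number (the nonce) until the hash
# happens to start with the given prefix. There is no shortcut,
# because you cannot predict which nonce will give a suitable hash.
def toy_mine(data, prefix)
  nonce = 0
  loop do
    hash = hash256_bytes("#{data}#{nonce}")
    return [nonce, hash] if hash.start_with?(prefix)
    nonce += 1
  end
end

nonce, hash = toy_mine("example block data", "00")
puts "nonce #{nonce} gives #{hash}"
```

A longer prefix takes more attempts on average to find, which is roughly how the difficulty of mining can be adjusted.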

3. Addresses

A public key is hashed (using both SHA256 and RIPEMD160) in the process of creating a bitcoin address.

  • The fact that you cannot work backwards from a hash result potentially helps with the security of public keys when they are placed inside locking scripts.
  • RIPEMD-160 produces a digest that’s shorter than the length of the public key, which reduces the length of the resulting address.

How do you hash data in Bitcoin?

There are two main methods for hashing data in Bitcoin, and they have the following names:

  1. Hash256 - Double SHA-256
  2. Hash160 - SHA-256 followed by RIPEMD-160

1. Hash256

This involves putting data through the SHA-256 hash function, then putting the result through SHA-256 again. Or in other words, it’s just “double SHA-256”. We call it Hash256 for short.

This is the most common method for hashing data in Bitcoin. It’s used when hashing transaction data to create TXIDs, and when hashing block headers during mining.

require 'digest'

def hash256(hex)
  # convert hexadecimal string to byte sequence first
  binary = [hex].pack("H*") # H = hex string (high nibble first), * = multiple bytes
  
  # SHA-256 (first round)
  hash1 = Digest::SHA256.digest(binary)
  
  # SHA-256 (second round)
  hash2 = Digest::SHA256.digest(hash1)
  
  # convert from byte sequence back to hexadecimal
  hash256 = hash2.unpack("H*")[0]
  
  return hash256
end

puts hash256("aa") #=> e51600d48d2f72eb428e78733e01fbd6081b349528335bf21269362edfae185d

2. Hash160

This involves putting data through the SHA-256 hash function, then putting the result through the RIPEMD-160 hash function. We call it Hash160 for short.

RIPEMD-160 produces a shorter hash digest (160 bits / 20 bytes) compared to SHA-256 (256 bits / 32 bytes), so it’s typically used when you want to produce a shorter hash than what you’d get from using Hash256.

It’s mainly used when shortening public keys and scripts in the process of creating legacy addresses (e.g. addresses beginning with a 1 or a 3). It has not been used in any recent developments that require hashing of data in Bitcoin.

require 'digest'

def hash160(data)
  # convert hexadecimal string to byte sequence first
  binary = [data].pack("H*") # H = hex string (high nibble first), * = multiple bytes
  
  # SHA-256
  sha256 = Digest::SHA256.digest(binary)
  
  # RIPEMD-160
  ripemd160 = Digest::RMD160.digest(sha256)
  
  # convert from byte sequence back to hexadecimal
  hash160 = ripemd160.unpack("H*").join
  
  return hash160
end

puts hash160("aa") #=> 58d179ca6112752d00dc9b89ea4f55a585195e26
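If you want to confirm the digest sizes for yourself, here's a quick Ruby sketch. The digest_sizes helper is a hypothetical name, and it assumes Digest::RMD160 is available in your Ruby installation.

```ruby
require 'digest'

# Returns the length in bytes of each stage of Hash160 for a hex string.
# (digest_sizes is a hypothetical helper name used for this example.)
def digest_sizes(hex)
  binary = [hex].pack("H*")                   # hex string to bytes
  sha256 = Digest::SHA256.digest(binary)      # first stage
  ripemd160 = Digest::RMD160.digest(sha256)   # second stage

  { sha256: sha256.bytesize, hash160: ripemd160.bytesize }
end

p digest_sizes("aa")
```

The SHA-256 stage gives 32 bytes and the final Hash160 result gives 20 bytes, no matter how much data you start with.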

A common mistake when hashing

A common mistake when hashing data in bitcoin is to insert strings into the hash function, and not the underlying byte sequences those strings actually represent.

For example, let’s say we have the hexadecimal string ab.

If we insert this string directly into the hash function, your programming language will actually send the ASCII encoding of each of these characters into the hash function, which looks like this in binary:

"ab" = 01100001 01100010
sha256(0110000101100010) = fb8e20fc2e4c3f248c60c39bd652f3c1347298bb977b8b4d5903b85055620603

But what we actually want to send into the hash function is the byte this hexadecimal string represents, which looks like this in binary:

0xab = 10101011
sha256(10101011) = 087d80f7f182dd44f184aa86ca34488853ebcc04f0c60d5294919a466b463831

This is why we usually need to “pack” our hexadecimal strings into bytes first before hashing.

Most programming languages will have functions that allow you to do this:

# Ruby
hex = "ab"
binary = [hex].pack("H*") # H = hex string (high nibble first), * = multiple bytes

// PHP
$hex = "ab";
$binary = pack("H*", $hex);

You will probably see a bunch of garbled text if you print out these converted binary values. This makes sense, because your programming language tries to display the raw bytes as characters when printing, and they probably no longer correspond to printable characters in the ASCII table.
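If you want to inspect the packed bytes without the garbled output, here's a small Ruby sketch of a few ways to look at them. The pack_hex helper is a name I've made up for this example.

```ruby
# Converts a hexadecimal string to the bytes it represents.
# (pack_hex is a hypothetical helper name used for this example.)
def pack_hex(hex)
  [hex].pack("H*")
end

binary = pack_hex("ab")

p binary                 # escaped view of the raw byte
p binary.bytes           # the byte as an integer (0xab = 171)
p binary.bytesize        # one byte, not two characters
p binary.unpack1("H*")   # back to the hexadecimal string
```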

Remember that hash functions take in binary data as the input, so we need to be specific about the binary data we want to insert.

All bitcoin data is just a bunch of bytes at the end of the day. We just use their hexadecimal string representation for convenience from time to time.

If you forget to convert your hexadecimal strings to their corresponding bytes beforehand, your programming language will assume you want to send the binary representation of the characters in the string, and this will produce a completely different hash result than expected.

This is by far the most common issue people have when hashing data in Bitcoin for the first time. So if you’re not getting the right hash results, this is probably where you’re going wrong.
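To see the difference for yourself, here's a minimal Ruby sketch that hashes the hexadecimal string ab both ways. The two helper names are my own for this example.

```ruby
require 'digest'

# Hashes the characters of the string (the common mistake).
def sha256_of_string(hex)
  Digest::SHA256.hexdigest(hex)
end

# Hashes the bytes that the hexadecimal string represents.
def sha256_of_bytes(hex)
  Digest::SHA256.hexdigest([hex].pack("H*"))
end

puts sha256_of_string("ab") # hash of the two ASCII characters "a" and "b"
puts sha256_of_bytes("ab")  # hash of the single byte 0xab
```

The two results should be completely different, and should correspond to the two sha256 results shown earlier for "ab" and 0xab.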

By Greg Walker,

Last Updated: 19 Apr 2021
  • 19 Apr 2021: changed button text
  • 19 Apr 2021: code comments about packing data using H*
  • 19 Apr 2021: added hash256 and hash160 tools
  • 18 Apr 2021: added code examples for hash256 and hash160
  • 14 Apr 2021: added note about hashing strings and hashing bytes
  • 21 Jul 2020: renamed /guide/ to /technical/