Hash Function

Fingerprints for data

Diagram showing data being put into a hash function and a fingerprint coming out the other side.

A hash function is a programming tool that creates fingerprints for data.

It takes in any amount of data, scrambles it, and returns a short and unique result for that data.

SHA-256 (Text)

Hash a string of text using the SHA-256 hash function.

This is just a quick example of the SHA-256 hash function. It hashes text (ASCII characters) rather than bytes given as hexadecimal. For hashing actual raw data in Bitcoin, use the SHA-256 and HASH256 tools instead.

SHA = Secure Hash Algorithm, 256 = 256 bits (the size of the hash result).

The result of the hash function is just a bunch of bytes. So the letters and numbers you're seeing are just bytes of data represented by hexadecimal characters, which is typically how the outputs of hash functions are displayed.
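
To make the bytes-versus-hexadecimal point concrete, here's a small Ruby sketch. The helper name digest_and_hex is just something I've made up for illustration. It takes the raw bytes that come out of SHA-256 and displays them as hexadecimal characters:

```ruby
require 'digest'

# Return the raw SHA-256 bytes along with their hexadecimal representation.
# (digest_and_hex is an illustrative helper name, not part of any library.)
def digest_and_hex(text)
  raw = Digest::SHA256.digest(text) # the raw bytes of the hash
  hex = raw.unpack1("H*")           # the same bytes shown as hexadecimal characters
  [raw, hex]
end

raw, hex = digest_and_hex("hello")
puts raw.bytesize # 32 (bytes)
puts hex.length   # 64 (two hexadecimal characters per byte)
puts hex
```

So the 64 characters you see on screen are just a convenient way of displaying 32 bytes.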

Technical terms

Diagram showing the technical terms for the input and output of a hash function.

Here are some technical terms that crop up from time to time:

"Preimage" is probably the most awkward technical term you'll come across, but it just refers to some specific data you insert into the hash function.

I prefer to use the terms "data" and "hash", but don't be surprised if you run into the other terms now and then.

Properties

What makes a hash function a hash function?

There are a few different properties that separate a basic hash function from a strong hash function.

Basic Hash Function

There are a few basic properties of hash functions that you've probably already noticed:

  1. You always get the same result for the same data. If you and someone else insert the same data into the same hash function, you'll both get the same result. This is known as being deterministic.
    Diagram showing a hash function producing the same output for the same input.
    Deterministic
  2. You get a fixed-length result no matter the size of the data. A hash function can take in any amount of data, and it will scramble and compress it to produce a (usually) shorter result. The size of the hash varies between hash functions, but for SHA-256 it's 256 bits (32 bytes) in size.
    Diagram showing a hash function producing a fixed-size output no matter the size of the input.
    Fixed-length result
  3. You get wildly different results with small changes to the data. What comes out of the hash function appears to be random. Even the smallest change to the data you insert into the hash function will produce completely different results. This is known as the avalanche effect.
    Diagram showing a hash function producing wildly different outputs for slightly different inputs.
    Avalanche effect

Those are the obvious features you'd expect from a good "fingerprinting machine".
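
You can check these three properties for yourself. The following is a rough sketch in Ruby, where the example inputs and the differing_bits helper are my own choices for demonstration:

```ruby
require 'digest'

# Count how many of the 256 output bits differ between two hex digests.
# (differing_bits is an illustrative helper name.)
def differing_bits(hex_a, hex_b)
  (hex_a.to_i(16) ^ hex_b.to_i(16)).to_s(2).count("1")
end

a = Digest::SHA256.hexdigest("hello")
b = Digest::SHA256.hexdigest("hello")
c = Digest::SHA256.hexdigest("hellp")          # one character changed
d = Digest::SHA256.hexdigest("hello" * 10_000) # a much larger piece of data

# 1. Deterministic: the same data gives the same hash
puts a == b # true

# 2. Fixed-length result: always 64 hex characters (32 bytes)
lengths = [a, c, d].map(&:length)
puts lengths.inspect # [64, 64, 64]

# 3. Avalanche effect: roughly half of the 256 bits should differ
puts differing_bits(a, c)
```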

Strong Hash Function

(aka Cryptographic Hash Function)

There are some properties of strong hash functions (e.g. SHA-256) that you may not have noticed:

  1. A strong hash function is irreversible. If I gave you a hash result, you wouldn't be able to work out the original data that I used to create it. The only way you could hope to find it would be to try different inputs in a brute-force search, which would be computationally infeasible given the massive range of possible outputs. Anyway, this property of not being able to work backwards is known as one-wayness.
    Diagram showing how you are unable to work backwards from the output of a strong hash function to calculate the input.
    One-wayness
  2. Every piece of data should have its own unique result. Obviously this is technically impossible as there are infinite combinations of data out there. However, a secure hash function should make it computationally infeasible to find two different pieces of data that produce the same hash digest. This property is known as collision resistance.
    Diagram showing how no two inputs to a strong hash function will have the same output.
    Collision resistance
  3. You can't control the result of a strong hash function. There's no way you can figure out how to construct an input to give you a specific result from a hash function. If you want a specific result, you just have to keep hashing different pieces of data to get the kind of result you want. This property is known as preimage resistance.
    Diagram showing how you cannot manipulate the input to a strong hash function to control the results of the output.
    Preimage resistance

A hash function is referred to as a "cryptographic hash function" if it achieves these 3 strong properties.

This means that it's usually slower than a basic hash function (although still pretty fast overall), but it also means it can be relied upon to be unpredictable and to produce unique results for different pieces of data, which is an important feature when it comes to cryptography.

So as you can guess, Bitcoin uses cryptographic hash functions.

Technical terms

This is just for my reference if nothing else, as I keep forgetting what each of these terms means when reading about secure hash functions in textbooks.

Anyway, in technical terms, a "cryptographic hash function" should possess the 3 following key properties:

Preimage Resistance
It's difficult to construct an input that produces a specific output.
Diagram showing preimage resistance of a cryptographic hash function.
Second Preimage Resistance (Weak Collision Resistance)
Given some data and its hash result, it's hard to find another piece of data that will hash to the same result. This is occasionally referred to as weak collision resistance.
Diagram showing second preimage resistance of a cryptographic hash function.
Collision Resistance
It's hard to find any two pieces of data that hash to the same result. This is similar to the last property, but whereas in second preimage resistance you are stuck with some starting data and have to find another piece of data that produces the same result, with collision resistance you are free to choose any two pieces of data that produce the same hash result. This is therefore sometimes referred to as strong collision resistance.
Diagram showing collision resistance of a cryptographic hash function.
  • These three terms seem to get mixed up from time to time (especially because second preimage resistance is a form of collision resistance).
  • To put it another way, you have preimage resistance, and two types of collision resistance.
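
Collision resistance depends heavily on the size of the hash result. As a rough illustration, the sketch below uses a deliberately weak toy hash (tiny_hash, a made-up helper that keeps only the first 2 bytes of SHA-256) and finds two different inputs with the same result almost instantly:

```ruby
require 'digest'

# Toy hash: keep only the first 2 bytes (16 bits) of SHA-256.
# This is deliberately weak and only here for demonstration.
def tiny_hash(data)
  Digest::SHA256.digest(data).byteslice(0, 2).unpack1("H*")
end

# Try the inputs "0", "1", "2", ... until two of them share a tiny hash.
# With only 65,536 possible results, a repeat is guaranteed within 65,537 tries.
def find_collision
  seen = {}
  counter = 0
  loop do
    input = counter.to_s
    digest = tiny_hash(input)
    return [seen[digest], input, digest] if seen.key?(digest)
    seen[digest] = input
    counter += 1
  end
end

first, second, shared = find_collision
puts "#{first.inspect} and #{second.inspect} both hash to #{shared}"
```

Note that we were free to pick any two inputs here, which is why this is a collision search and not a second preimage search. With the full 256-bit result of SHA-256, the same kind of search would be expected to take around 2^128 attempts, which is what makes it computationally infeasible.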

And what is a preimage exactly?

Preimage (mathematics) - The set of arguments of a function corresponding to a particular subset of the range.
thefreedictionary.com

So a "preimage" is something that you put into a function that maps to a specific result.

Bitcoin

What hash functions does Bitcoin use?

There are five methods for hashing data in Bitcoin:

  1. HASH256 – Double SHA-256 (most common)
  2. HASH160 – SHA-256 + RIPEMD-160
  3. SHA-256 – Single SHA-256
  4. HMAC-SHA512 – HMAC with SHA-512
  5. PBKDF2 – Password Based Key Derivation Function 2

In Bitcoin we hash bytes of data. So in the tools below, you need to represent bytes by using hexadecimal characters (where every byte is made from two hex characters). See a common mistake with hashing for more details.

HASH256

SHA-256(SHA-256(data))

If you're hashing something in Bitcoin, you're almost always using HASH256.

This works by putting the data through SHA-256, then taking the result and putting it through SHA-256 again.

HASH256

Double SHA-256. Used for hashing block headers, transaction data, and mostly anything that needs to be hashed in Bitcoin.

You may notice that the hashes you get for block data and transaction data appear to be backwards. This is because block hashes and transaction hashes are actually in reverse byte order when searching for them using bitcoin-cli and on blockchain explorers.

This is the primary method for hashing data in Bitcoin. It's sometimes referred to as "double-SHA256" or "SHA-256d", but in code it's most commonly called hash256 for short.

Here are some examples of where you'll find HASH256 being used in Bitcoin:

Code

require 'digest'

def hash256(hex)
  # convert hexadecimal string to byte sequence first
  binary = [hex].pack("H*") # H = hex string (high nibble first), * = multiple bytes

  # SHA-256 (first round)
  hash1 = Digest::SHA256.digest(binary)

  # SHA-256 (second round)
  hash2 = Digest::SHA256.digest(hash1)

  # convert from byte sequence back to hexadecimal
  hash256 = hash2.unpack("H*")[0]

  return hash256
end

puts hash256("aa") #=> e51600d48d2f72eb428e78733e01fbd6081b349528335bf21269362edfae185d
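
As for the reverse byte order mentioned earlier, here's a minimal sketch of how you might flip a hash between the two forms. The reverse_bytes helper is my own naming, and I'm using the well-known genesis block hash as the example:

```ruby
# Reverse the byte order of a hexadecimal string.
# (reverse_bytes is an illustrative helper name.)
def reverse_bytes(hex)
  [hex].pack("H*").reverse.unpack1("H*")
end

puts reverse_bytes("01020304") #=> 04030201

# The genesis block hash as you'd see it on a blockchain explorer
explorer_order = "000000000019d6689c085ae165831e934ff763ae46a2a6c172b3f1b60a8ce26f"

# The same hash in the byte order that comes out of the hash function
puts reverse_bytes(explorer_order)
```

Note that it's the bytes that get reversed (pairs of hex characters), not the individual characters.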

Why do we hash twice?

Satoshi never mentioned why they chose double-SHA256 when designing Bitcoin.

Satoshi was probably concerned about something called a length extension attack, and it has been recommended in some literature (e.g. Cryptography Engineering) to use double-SHA256 to protect against it.

These kinds of attacks are not a concern for Bitcoin though, so either Satoshi was misguided in their choice of using double-SHA256, or they just wanted to be extra cautious.

Satoshi standardized on using double-SHA256 for 32-byte hashes, and SHA256+RIPEMD160 (each once) for 20-byte hashes, presumably because of (likely misguided) concern about certain attacks (like length extension attacks, which only apply when hashing secret data), and then used those everywhere.
Pieter Wuille, bitcoin.stackexchange.com

Either way, hashing data twice is now just a quirk of Bitcoin.

If you designed Bitcoin from scratch today there would be no benefit to using double-SHA256. In fact, recent upgrades to Bitcoin now favor using single-SHA256 where possible (e.g. script hashes in P2WSH).

HASH160

RIPEMD-160(SHA-256(data))

HASH160 is used infrequently in Bitcoin.

This works by putting the data through SHA-256, then taking the result and putting it through another hash function called RIPEMD-160.

HASH160

SHA-256 + RIPEMD-160. Used for shortening a public key or script before converting to an address.

HASH160 is mainly used when constructing legacy addresses:

Upgrades to Bitcoin over the years have made little further use of HASH160 (P2WPKH still hashes the public key with it, but P2WSH and P2TR do not), so it's now mostly found in addresses for legacy locking scripts.

Code

require 'digest'

def hash160(data)
  # convert hexadecimal string to byte sequence first
  binary = [data].pack("H*") # H = hex string (high nibble first), * = multiple bytes

  # SHA-256
  sha256 = Digest::SHA256.digest(binary)

  # RIPEMD-160
  ripemd160 = Digest::RMD160.digest(sha256)

  # convert from byte sequence back to hexadecimal
  hash160 = ripemd160.unpack("H*").join

  return hash160
end

puts hash160("aa") #=> 58d179ca6112752d00dc9b89ea4f55a585195e26

Why RIPEMD-160?

RIPEMD-160 produces a shorter digest than SHA-256:

Hash Function | Digest Size         | Example
SHA-256       | 32 bytes (256 bits) | e51600d48d2f72eb428e78733e01fbd6081b349528335bf21269362edfae185d
RIPEMD-160    | 20 bytes (160 bits) | 58d179ca6112752d00dc9b89ea4f55a585195e26

So RIPEMD-160 is ideal for creating shorter (yet still secure) fingerprints for public keys and scripts.

As for why Satoshi chose RIPEMD-160 over something like SHA-1 (which also produces 160-bit digests), I'm not sure. It could have been because RIPEMD-160 was known to be more collision resistant at the time (SHA-1 has had collisions since, so a wise choice in hindsight), or because Satoshi simply preferred to use a hash function designed by a separate organization.

Unlike all SHA-1 and SHA-2 algorithms, RIPEMD-160 is the only one that was not designed by NIST and NSA, but rather by a team of European researchers. Even though there is no indication that any of the SHA algorithms are artificially weakened or contain backdoors (introduced by the US government, that is), RIPEMD-160 might appeal to some people who heavily distrust governments.
Christof Paar, Understanding Cryptography
It is worth noting that Satoshi could've used SHA2-256 twice and truncated the second digest to 160 bits as this is equally secure. The fact that he didn't is some evidence to show that his decision was a conscious decision to use RIPEMD-160 over the NSA suit of algorithms.
liamzebedee, bitcoin.stackexchange.com

Either way, RIPEMD-160 is a fine choice for use as a 160-bit hash function, even if we don't really use it much any more in Bitcoin.

The use of SHA-256 + RIPEMD-160 helps to protect against length extension attacks too (even though this is once again unnecessary).

SHA-256

SHA-256(data)

This is where you just put the data through SHA-256 once. Nothing special this time.

SHA-256

Single SHA-256. Hash bytes of data using the SHA-256 hash function.

This tool only accepts bytes of data in the form of hexadecimal characters. This is different to the SHA-256 (Text) tool at the top of the page, which accepts any text data, but that's just an example tool and is not the way data is hashed in Bitcoin.

More recent changes to Bitcoin have started to use a single round of SHA-256 (instead of HASH256):

However, it's nowhere near as prevalent as HASH256. So as a general rule, if you're hashing something in Bitcoin, it's most likely to be HASH256 and not a single SHA-256.

Code

require 'digest'

def sha256(hex)
  # convert hexadecimal string to byte sequence first
  binary = [hex].pack("H*") # H = hex string (high nibble first), * = multiple bytes

  # SHA-256 (single)
  hash = Digest::SHA256.digest(binary)

  # convert from byte sequence back to hexadecimal
  sha256 = hash.unpack("H*")[0]

  return sha256
end

puts sha256("aa") #=> bceef655b5a034911f1c3718ce056531b45ef03b4c7b1f15629e867294011a7d

HMAC-SHA512

HMAC with SHA-512

We use HMAC-SHA512 when we want to hash some data with an additional secret key.

So SHA-512 is the actual hash algorithm, and HMAC (Hash-based Message Authentication Code) is the method for combining the two pieces of data together using that hash algorithm.

HMAC-SHA512

Used when deriving extended keys.

  • data: seed, or (private/public key + 4-byte index)
  • key: "Bitcoin seed" (ASCII), or chain code

HMAC-SHA512(data, key)

HMAC-SHA512 is used in Bitcoin when creating extended keys in HD Wallets:

SHA-512 is used within the HMAC when creating extended keys because it produces a 64-byte hash result, which means you can chop up the result to get a new private key (the first 32 bytes) and a new chain code (the last 32 bytes) to form the child extended key.

Code

require 'openssl'

# Example data
data = "67f93560761e20617de26e0cb84f7234aaf373ed2e66295c3d7397e6d7ebe882ea396d5d293808b0defd7edd2babd4c091ad942e6a9351e6d075a29d4df872af"
key = "Bitcoin seed"

# HMAC-SHA512
hmac = OpenSSL::HMAC.hexdigest(OpenSSL::Digest::SHA512.new, key, [data].pack("H*"))
puts hmac #=> f79bb0d317b310b261a55a8ab393b4c8a1aba6fa4d08aef379caba502d5d67f9463223aac10fb13f291a1bc76bc26003d98da661cb76df61e750c139826dea8b
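
The "chopping up" of the 64-byte result described above could be sketched like this. The split_hmac helper and the short example seed are assumptions of mine for demonstration, and this only shows the split itself, not the full process of deriving extended keys:

```ruby
require 'openssl'

# Split a 64-byte HMAC-SHA512 result into two 32-byte halves.
# (split_hmac is an illustrative helper; the seed below is just example data.)
def split_hmac(key, data)
  hmac = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA512"), key, data)
  [hmac.byteslice(0, 32), hmac.byteslice(32, 32)]
end

seed = ["000102030405060708090a0b0c0d0e0f"].pack("H*")
left, right = split_hmac("Bitcoin seed", seed)

puts left.unpack1("H*")  # first 32 bytes (used for the private key)
puts right.unpack1("H*") # last 32 bytes (used for the chain code)
```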

PBKDF2

Password Based Key Derivation Function 2

PBKDF2 is used when you want to hash data multiple times.

PBKDF2

Create a seed for a HD Wallet from a mnemonic sentence (and optional passphrase).

  • password: mnemonic sentence
  • salt: "mnemonic" + passphrase (the salt must start with the prefix "mnemonic")
  • iterations: 2048
  • algorithm: HMAC-SHA512
  • length: 64 bytes

PBKDF2(password, salt, iterations, algorithm, length)

Never enter your mnemonic sentence into a website, or use a mnemonic sentence generated by a website. Websites can easily save the seed and use it to steal all your bitcoins.

The fact that PBKDF2 uses multiple iterations for hashing means that it is intentionally slow, which makes it more difficult to perform brute-force attacks.

So PBKDF2 is not actually a hash algorithm itself, but instead uses an existing hash algorithm (e.g. HMAC-SHA512) with multiple repetitions (in a specific way) before producing the final result.

In Bitcoin, PBKDF2 is used on the mnemonic sentence to create the initial seed for use in HD Wallets.

Bitcoin uses 2,048 iterations of PBKDF2 to convert a mnemonic sentence (and optional passphrase) to a seed. This makes it more time-consuming to try and crack someone else's mnemonic sentence and/or passphrase.

Code

require 'openssl'

# Example data
mnemonic = "punch shock entire north file identify"
passphrase = ""

# Prepare data for PBKDF2
password = mnemonic
salt = "mnemonic#{passphrase}" # "mnemonic" is always used in the salt with optional passphrase appended to it
iterations = 2048
keylength = 64
digest = OpenSSL::Digest::SHA512.new

# PBKDF2
result = OpenSSL::PKCS5.pbkdf2_hmac(password, salt, iterations, keylength, digest)
seed = result.unpack("H*")[0] # convert to hexadecimal string
puts seed #=> e1ca8d8539fb054eda16c35dcff74c5f88202b88cb03f2824193f4e6c5e87dd2e24a0edb218901c3e71e900d95e9573d9ffbf870b242e927682e381d109ae882
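
If you're curious what those "multiple repetitions (in a specific way)" look like, here's a hand-written sketch of PBKDF2 for a single output block. It's only for illustration (pbkdf2_single_block is my own helper name, and you should use your library's PBKDF2 function in practice). It works here because one block of HMAC-SHA512 is already 64 bytes, which is all we need for the seed:

```ruby
require 'openssl'

# PBKDF2 for a single output block, written out by hand.
# With HMAC-SHA512 one block is 64 bytes, which covers a 64-byte seed.
def pbkdf2_single_block(password, salt, iterations)
  digest = OpenSSL::Digest.new("SHA512")

  # First round: HMAC the salt with the block number (1) appended as 4 bytes
  u = OpenSSL::HMAC.digest(digest, password, salt.b + [1].pack("N"))
  result = u.bytes

  # Remaining rounds: HMAC the previous result, then XOR it into the running total
  (iterations - 1).times do
    u = OpenSSL::HMAC.digest(digest, password, u)
    result = result.zip(u.bytes).map { |x, y| x ^ y }
  end

  result.pack("C*")
end

mnemonic = "punch shock entire north file identify"
by_hand = pbkdf2_single_block(mnemonic, "mnemonic", 2048)
library = OpenSSL::PKCS5.pbkdf2_hmac(mnemonic, "mnemonic", 2048, 64, OpenSSL::Digest.new("SHA512"))

puts by_hand == library # both approaches should agree
```

Each round depends on the result of the previous one, so there's no shortcut to skipping ahead. That's what makes every guess at a mnemonic sentence cost 2,048 rounds of hashing.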

Usage

Where is hashing used in Bitcoin?

Hash functions are a useful general-purpose tool in programming, and they're used liberally throughout Bitcoin.

Here are the most important examples:

Mining

Diagram showing a block hash needing to get below a target value for it to be added on to the blockchain.

This is the most famous use of the hash function in Bitcoin.

A block header gets hashed, and the resulting block hash is interpreted as an integer. This integer must be below a certain target value for the block to be considered "valid" or "mined".

The fact that the result of the hash function is uncontrollable (preimage resistance) and wildly different for each nonce value (avalanche effect) creates a network-wide lottery, where nobody is in control of when the next block is mined.
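
Here's a toy version of that lottery. It's a simplified sketch and not real mining: the header is just a made-up string with a 4-byte nonce on the end, and the target is far easier than anything on the actual network:

```ruby
require 'digest'

def hash256(bytes)
  Digest::SHA256.digest(Digest::SHA256.digest(bytes))
end

# Interpret a 32-byte hash as an integer. Bitcoin reads the hash in reverse
# byte order (little-endian), hence the reverse.
def hash_to_integer(hash)
  hash.reverse.unpack1("H*").to_i(16)
end

# Keep changing the nonce until the hash falls below the target.
# (toy_mine and its header format are simplified stand-ins for a real block header.)
def toy_mine(header, target)
  nonce = 0
  nonce += 1 until hash_to_integer(hash256(header.b + [nonce].pack("V"))) < target
  nonce
end

target = 2**244 # toy target: roughly 1 in 4,096 hashes will be below it
nonce = toy_mine("example block header", target)
puts "found nonce #{nonce}"
```

There's no way to work out a winning nonce in advance. All you can do is keep trying different values, and a lower target just means more attempts on average.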

Blockchain

Diagram showing how block hashes are used to create a chain of blocks.

Each block in the blockchain references the hash of the previous block. This connects all the blocks in the blockchain together, and prevents anyone from changing the contents of a block anywhere in the chain.

Any change to a block lower down in the chain will change its hash, and therefore the blocks above it will no longer be connected to it and will no longer be part of the longest chain.
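
To see how a change breaks the chain, here's a minimal sketch using toy blocks. ToyBlock is a made-up structure holding just a previous hash and some data, whereas real block headers contain more fields:

```ruby
require 'digest'

# A toy block: just the previous block's hash plus some data.
# (ToyBlock is a simplified stand-in; real block headers have more fields.)
ToyBlock = Struct.new(:prev_hash, :data) do
  def block_hash
    Digest::SHA256.hexdigest(Digest::SHA256.digest(prev_hash + data))
  end
end

# Build a chain where each block references the hash of the one before it.
def build_chain(items)
  prev_hash = "00" * 32
  items.map do |data|
    block = ToyBlock.new(prev_hash, data)
    prev_hash = block.block_hash
    block
  end
end

# The chain is connected if every block points at the actual hash of its parent.
def connected?(chain)
  chain.each_cons(2).all? { |parent, child| child.prev_hash == parent.block_hash }
end

chain = build_chain(["block 0", "block 1", "block 2"])
puts connected?(chain) # true

chain[0].data = "tampered" # change a block lower down in the chain
puts connected?(chain) # false
```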

TXID

Diagram showing a TXID being created by hashing transaction data.

The data for each individual transaction is hashed to create a TXID (Transaction ID). This creates a unique reference number for every transaction (deterministic).

This allows you to reference coins created in previous transactions as inputs for spending in future transactions, as well as being able to search for transactions in a blockchain explorer.

The fact that it's hard to find two pieces of data that hash to the same result (collision resistance) means that every transaction can have its own short and unique reference number.

Merkle Root

Diagram showing a merkle root being created by hashing TXIDs together in pairs and then being committed to the block header.

Every block header includes a fingerprint for all of the transaction data included in the block.

This fingerprint is called the merkle root, and it's basically all of the TXIDs hashed together in a tree-like structure.

Hashing allows you to "commit" all the transaction data to the block header (deterministic). Therefore, if anyone changes transaction data in the block, it will no longer match the fingerprint in the header (collision resistance), and the modified block will be invalid.
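
The tree-like hashing can be sketched as follows. This is a simplified version that assumes the TXIDs are already in the byte order that comes out of the hash function (not the reversed order you see on blockchain explorers), and the TXIDs themselves are made up:

```ruby
require 'digest'

def hash256(bytes)
  Digest::SHA256.digest(Digest::SHA256.digest(bytes))
end

# Compute a merkle root from a non-empty list of TXIDs (hex strings).
# Assumes the TXIDs are already in the byte order produced by HASH256.
def merkle_root(txids)
  level = txids.map { |txid| [txid].pack("H*") }

  while level.size > 1
    level << level.last if level.size.odd? # odd count: pair the last hash with itself
    level = level.each_slice(2).map { |left, right| hash256(left + right) }
  end

  level.first.unpack1("H*")
end

txids = ["aa" * 32, "bb" * 32, "cc" * 32] # made-up TXIDs for demonstration
puts merkle_root(txids)
```

Changing any one of the TXIDs changes every hash above it in the tree, all the way up to the root.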

Checksum

Diagram showing a checksum being created by hashing data and taking the first 4 bytes of the result.

Some checksums are just the truncated hash of some data.

These checksums are bundled with data to allow you to check if the data has been input correctly. For example, a checksum is included at the end of a legacy bitcoin address, so if you type in one part of the address incorrectly, the data will not match the checksum (or vice versa) (avalanche effect), and the error can be detected before you make the transaction.

Checksums are also used in networking to help make sure the contents of a message have not been lost during transit (deterministic).
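
Here's a rough sketch of the kind of checksum used in legacy addresses (the first 4 bytes of a HASH256). The helper names and the payload are made up for demonstration:

```ruby
require 'digest'

# Checksum = first 4 bytes of HASH256(data), returned as hexadecimal.
def checksum(hex)
  bytes = [hex].pack("H*")
  Digest::SHA256.digest(Digest::SHA256.digest(bytes)).byteslice(0, 4).unpack1("H*")
end

# Check that the last 4 bytes of some data match the checksum of the rest.
def checksum_valid?(hex_with_checksum)
  data = hex_with_checksum[0...-8]
  check = hex_with_checksum[-8, 8]
  checksum(data) == check
end

payload = "00" + "11" * 20 # made-up version byte + 20-byte hash
with_checksum = payload + checksum(payload)
puts checksum_valid?(with_checksum) # true

typo = with_checksum.sub("11", "12") # simulate a typo in the data
puts checksum_valid?(typo) # false (the data no longer matches the checksum)
```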

Public Key Hash

Diagram showing a public key being shortened by hashing it.

A public key is either 33 or 65 bytes in size. However, before it gets converted to an address, it gets put through a hash function to shorten it to a 20-byte public key hash.

This allows you to create slightly shorter addresses than if you had not hashed the public key beforehand (fixed-length result).

Extended Keys

Diagram showing child keys being derived by hashing a previous extended key.

Hierarchical Deterministic Wallets allow you to create multiple private keys from a single seed.

Each extended private key is created by hashing the previous extended private key, which gives you a completely new, unique, and independent private key to use (collision resistance).

This illustrates the security of hash functions, as each new result of the hash function is reliable enough to use as a private key, because you cannot work backwards (preimage resistance) to work out a previous private key from another.

Signing Transactions

When you sign a transaction, you actually sign a hash of the transaction data.

Hashing data creates a shortened fingerprint for it (fixed-length result), and it's more efficient to sign the hash of a transaction (i.e. 32 bytes) than the full transaction data itself (e.g. 250+ bytes). The hash you sign is also unique for each piece of transaction data (collision resistance), so the resulting signature cannot be reused within a different transaction.

The reason hash functions were invented in the first place was to improve the efficiency of signing long messages.

Notes

How do hash functions work?

Excellent question, I'm glad you asked.

It's at this point I'd usually say that this is "outside the scope of this article" and then distract you with lots of technical terminology and hand waving.

So I made this video on how SHA-256 works instead.

As I say though, a hash function just scrambles and compresses the underlying bits (the 1s and 0s) of computer data. And that's all you really need to know.

A common mistake when hashing

A common mistake when hashing data in Bitcoin is to insert strings into the hash function, and not the underlying byte sequences those strings actually represent.

For example, let's say we have the hexadecimal string ab.

If we insert this directly into the hash function as a string, your programming language will actually send the ASCII encoding of each of these characters into the hash function, which looks like this in binary:

"ab" = 01100001 01100010
sha256(0110000101100010) = fb8e20fc2e4c3f248c60c39bd652f3c1347298bb977b8b4d5903b85055620603

Byte (ASCII)

Convert between a byte and an ASCII character.

ASCII Table

The following characters are from Code Page 437, which is a popular 8-bit ASCII character set.

Basically, all ASCII character sets contain the same standard letters, numbers, and symbols (between 0x20 and 0x7e). These are historically known as the printable characters.

Code Page 437 extended this with an additional 128 characters (between 0x80 and 0xff) to include international, box drawing, and mathematical characters too. This additional set of characters is commonly referred to as "extended ASCII".

Code Page 437 also replaced the obsolete control characters (between 0x01 and 0x1f) from the original ASCII standard (e.g. ISO 646) with decorative characters instead.

There is no specific ASCII character set used in Bitcoin, but this is a popular one, and it's good for demonstrating how bytes can be assigned to characters.

Standard ASCII

Decorative Characters
Binary Hexadecimal Decimal Character
00000000 00 0
00000001 01 1
00000010 02 2
00000011 03 3
00000100 04 4
00000101 05 5
00000110 06 6
00000111 07 7
00001000 08 8
00001001 09 9
00001010 0a 10
00001011 0b 11
00001100 0c 12
00001101 0d 13
00001110 0e 14
00001111 0f 15
00010000 10 16
00010001 11 17
00010010 12 18
00010011 13 19
00010100 14 20
00010101 15 21 §
00010110 16 22
00010111 17 23
00011000 18 24
00011001 19 25
00011010 1a 26
00011011 1b 27
00011100 1c 28
00011101 1d 29
00011110 1e 30
00011111 1f 31
Printable Characters
Binary Hexadecimal Decimal Character
00100000 20 32 (space)
00100001 21 33 !
00100010 22 34 "
00100011 23 35 #
00100100 24 36 $
00100101 25 37 %
00100110 26 38 &
00100111 27 39 '
00101000 28 40 (
00101001 29 41 )
00101010 2a 42 *
00101011 2b 43 +
00101100 2c 44 ,
00101101 2d 45 -
00101110 2e 46 .
00101111 2f 47 /
00110000 30 48 0
00110001 31 49 1
00110010 32 50 2
00110011 33 51 3
00110100 34 52 4
00110101 35 53 5
00110110 36 54 6
00110111 37 55 7
00111000 38 56 8
00111001 39 57 9
00111010 3a 58 :
00111011 3b 59 ;
00111100 3c 60 <
00111101 3d 61 =
00111110 3e 62 >
00111111 3f 63 ?
01000000 40 64 @
01000001 41 65 A
01000010 42 66 B
01000011 43 67 C
01000100 44 68 D
01000101 45 69 E
01000110 46 70 F
01000111 47 71 G
01001000 48 72 H
01001001 49 73 I
01001010 4a 74 J
01001011 4b 75 K
01001100 4c 76 L
01001101 4d 77 M
01001110 4e 78 N
01001111 4f 79 O
01010000 50 80 P
01010001 51 81 Q
01010010 52 82 R
01010011 53 83 S
01010100 54 84 T
01010101 55 85 U
01010110 56 86 V
01010111 57 87 W
01011000 58 88 X
01011001 59 89 Y
01011010 5a 90 Z
01011011 5b 91 [
01011100 5c 92 \
01011101 5d 93 ]
01011110 5e 94 ^
01011111 5f 95 _
01100000 60 96 `
01100001 61 97 a
01100010 62 98 b
01100011 63 99 c
01100100 64 100 d
01100101 65 101 e
01100110 66 102 f
01100111 67 103 g
01101000 68 104 h
01101001 69 105 i
01101010 6a 106 j
01101011 6b 107 k
01101100 6c 108 l
01101101 6d 109 m
01101110 6e 110 n
01101111 6f 111 o
01110000 70 112 p
01110001 71 113 q
01110010 72 114 r
01110011 73 115 s
01110100 74 116 t
01110101 75 117 u
01110110 76 118 v
01110111 77 119 w
01111000 78 120 x
01111001 79 121 y
01111010 7a 122 z
01111011 7b 123 {
01111100 7c 124 |
01111101 7d 125 }
01111110 7e 126 ~
01111111 7f 127

Extended ASCII

International Characters
Binary Hexadecimal Decimal Character
10000000 80 128 Ç
10000001 81 129 ü
10000010 82 130 é
10000011 83 131 â
10000100 84 132 ä
10000101 85 133 à
10000110 86 134 å
10000111 87 135 ç
10001000 88 136 ê
10001001 89 137 ë
10001010 8a 138 è
10001011 8b 139 ï
10001100 8c 140 î
10001101 8d 141 ì
10001110 8e 142 Ä
10001111 8f 143 Å
10010000 90 144 É
10010001 91 145 æ
10010010 92 146 Æ
10010011 93 147 ô
10010100 94 148 ö
10010101 95 149 ò
10010110 96 150 û
10010111 97 151 ù
10011000 98 152 ÿ
10011001 99 153 Ö
10011010 9a 154 Ü
10011011 9b 155 ¢
10011100 9c 156 £
10011101 9d 157 ¥
10011110 9e 158
10011111 9f 159 ƒ
10100000 a0 160 á
10100001 a1 161 í
10100010 a2 162 ó
10100011 a3 163 ú
10100100 a4 164 ñ
10100101 a5 165 Ñ
10100110 a6 166 ª
10100111 a7 167 º
10101000 a8 168 ¿
10101001 a9 169
10101010 aa 170 ¬
10101011 ab 171 ½
10101100 ac 172 ¼
10101101 ad 173 ¡
10101110 ae 174 «
10101111 af 175 »
Box Drawing Characters
Binary Hexadecimal Decimal Character
10110000 b0 176
10110001 b1 177
10110010 b2 178
10110011 b3 179
10110100 b4 180
10110101 b5 181
10110110 b6 182
10110111 b7 183
10111000 b8 184
10111001 b9 185
10111010 ba 186
10111011 bb 187
10111100 bc 188
10111101 bd 189
10111110 be 190
10111111 bf 191
11000000 c0 192
11000001 c1 193
11000010 c2 194
11000011 c3 195
11000100 c4 196
11000101 c5 197
11000110 c6 198
11000111 c7 199
11001000 c8 200
11001001 c9 201
11001010 ca 202
11001011 cb 203
11001100 cc 204
11001101 cd 205
11001110 ce 206
11001111 cf 207
11010000 d0 208
11010001 d1 209
11010010 d2 210
11010011 d3 211
11010100 d4 212
11010101 d5 213
11010110 d6 214
11010111 d7 215
11011000 d8 216
11011001 d9 217
11011010 da 218
11011011 db 219
11011100 dc 220
11011101 dd 221
11011110 de 222
11011111 df 223
Mathematical Symbols
Binary Hexadecimal Decimal Character
11100000 e0 224 α
11100001 e1 225 ß
11100010 e2 226 Γ
11100011 e3 227 π
11100100 e4 228 Σ
11100101 e5 229 σ
11100110 e6 230 µ
11100111 e7 231 τ
11101000 e8 232 Φ
11101001 e9 233 Θ
11101010 ea 234 Ω
11101011 eb 235 δ
11101100 ec 236
11101101 ed 237 φ
11101110 ee 238 ε
11101111 ef 239
11110000 f0 240
11110001 f1 241 ±
11110010 f2 242
11110011 f3 243
11110100 f4 244
11110101 f5 245
11110110 f6 246 ÷
11110111 f7 247
11111000 f8 248 °
11111001 f9 249
11111010 fa 250 ·
11111011 fb 251
11111100 fc 252
11111101 fd 253 ²
11111110 fe 254
11111111 ff 255 (non-breaking space)

Hash functions work on the underlying 1s and 0s of computer data, which is what I'm referring to here with the word "binary".

But what we actually want to send into the hash function is the byte this hexadecimal string represents, which looks like this in binary:

0xab = 10101011
sha256(10101011) = 087d80f7f182dd44f184aa86ca34488853ebcc04f0c60d5294919a466b463831

This is why we usually need to "pack" our hexadecimal strings in to bytes first before hashing. Most programming languages will have functions that allow you to do this. For example:

# Ruby
hex = "ab"
binary = [hex].pack("H*") # H = hex string (high nibble first), * = multiple bytes

// PHP
$hex = "ab";
$binary = pack("H*", $hex);

You will probably see a bunch of garbled text if you print out these converted binary values directly. This makes sense, because your programming language tries to convert this binary data back to characters when printing it out, and this binary data probably refers to some weird characters (code points) in the ASCII table.

In short, remember that hash functions take in binary data as the input, so we need to be specific about the binary data we want to insert.

If you forget to convert your hexadecimal strings to their corresponding bytes beforehand, your programming language will assume you want to send the binary representation of the characters in the string, and this will produce a completely different hash result than expected.
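
You can see the difference for yourself with a quick side-by-side comparison in Ruby (a minimal sketch using the same ab example from above):

```ruby
require 'digest'

hex = "ab"

# Mistake: hashing the string itself (the two ASCII bytes 0x61 0x62)
as_string = Digest::SHA256.hexdigest(hex)

# Correct: hashing the single byte 0xab that the hex string represents
as_bytes = Digest::SHA256.hexdigest([hex].pack("H*"))

puts hex.bytesize               # 2 (bytes sent to the hash function by mistake)
puts [hex].pack("H*").bytesize  # 1 (the byte we actually wanted to hash)
puts as_string
puts as_bytes
puts as_string == as_bytes # false
```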

This is by far the most common issue people have when hashing data in Bitcoin for the first time. So if you're not getting the right hash results, this is probably where you're going wrong.

And I should know, I've done it myself.

All bitcoin data is just a bunch of bytes at the end of the day. We just use their hexadecimal string representation for convenience when sharing and displaying them on websites.

Summary

A hash function is the Swiss Army knife in the programmer's toolbox.

You'll be hashing things frequently when working with Bitcoin, so it's worth getting the hang of them in whatever programming language you're using.

Satoshi very much understood their properties, and utilized them for various purposes when developing Bitcoin. But their most ingenious decision was to leverage the unpredictability of hash functions to create a network-wide lottery, which is what underpins the revolutionary mechanism of mining and blockchain technology.

So it just goes to show that if you understand the fundamental properties of a tool, you can find new and interesting ways to use it.

Lastly, if you want a proper technical definition of a hash function:

Cryptographic hash functions map input strings of arbitrary (or very large) length to short fixed length output strings.
Bart Preneel, The First 30 Years of Cryptographic Hash Functions

But I just think of them as fingerprinting machines for data.