Bytes

Storage units for data

Diagram showing 8 individual bits making up a single hexadecimal byte.

This is a quick guide to explain what bytes are, and how they're used in bitcoin. This will be useful if you're working with transactions and other raw data in bitcoin.

You're going to be doing some low-level programming when working with bitcoin data, so it's good to know what you're looking at.

I'm not a computer scientist either, but bytes are pretty straightforward.

Intro

When using bitcoin day-to-day, you will frequently see raw bytes of data. And when you start working with bitcoin, you'll soon find that all the core data is made up of raw bytes too.

Bitcoin is a computer program, and computers read and communicate using bytes. So if you plan on working with bitcoin on a technical level, it's useful to know what bytes are, how they're displayed, and the types of data they store.

Basics

Before we get on to bytes, we need to know what a "bit" is.

Bit

A bit is the smallest unit of data that a computer can hold. This can either be a 1 or a 0 (i.e. a single transistor being "on" or "off").

These bits are the building blocks for storing data inside computers.

Ben Eater has some great videos on how computers work.

If you ever get stuck when programming, just remember that it's all just a bunch of ones and zeros at the end of the day.

Byte

A byte is just a group of 8 individual bits.

A byte is a convenient unit of storage for data on a computer. Just as it's sometimes more practical to measure weight in kilograms instead of grams, it's sometimes more practical to measure data in bytes instead of bits.

Bytes are actually the default measurement for data, and most of the data you'll work with in bitcoin is measured in bytes.

A byte is 8 individual bits. So a 32-byte private key is 256 bits (32 x 8 = 256).
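The arithmetic above can be sketched in a few lines of Python. The names here (BITS_PER_BYTE, bits_in) are just made up for illustration and don't come from any bitcoin library.

```python
BITS_PER_BYTE = 8

def bits_in(num_bytes: int) -> int:
    """Return how many bits are in the given number of bytes."""
    return num_bytes * BITS_PER_BYTE

# A 32-byte private key is 256 bits long.
print(bits_in(32))  # 256

# Each bit has 2 possible states, so one byte has 2^8 possible values.
print(2 ** BITS_PER_BYTE)  # 256
```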

Representing Bytes

We can represent a byte in two ways: as 8 binary digits (bits), or as 2 hexadecimal characters.

Tip: The lowest value bit is on the right

Tip: Half of a byte is called a "nibble". But that's not important to know for Bitcoin.

This works because if you split a byte into two 4-bit halves, each of those 4-bit halves can handle 16 different combinations of 1s and 0s. So if you give each of those halves its own hexadecimal character, you can represent a byte using just two characters instead of eight.

This shorter representation is called a hexadecimal byte.
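Here's a rough sketch of that splitting in Python. The helper name byte_to_hex is hypothetical (in practice you'd probably just use Python's built-in formatting), but writing it out by hand shows how each 4-bit half maps to one hexadecimal character.

```python
def byte_to_hex(value: int) -> str:
    """Convert a byte value (0-255) to two hexadecimal characters."""
    if not 0 <= value <= 255:
        raise ValueError("a byte must be between 0 and 255")
    high = value >> 4    # the top 4 bits (left half)
    low = value & 0x0F   # the bottom 4 bits (right half)
    digits = "0123456789abcdef"  # 16 characters, one per 4-bit combination
    return digits[high] + digits[low]

byte = 0b01101011
print(format(byte, "08b"))  # 01101011
print(byte_to_hex(byte))    # 6b
```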

We often display bytes in hexadecimal format to save space, which is much more convenient when you're representing multiple bytes of data.

So the next time you see a private key like this:

61dc9ff8b15450212970c6fa997338bb205dd48ffaca3a056b09a3b44a244d76

You're actually looking at 32 individual bytes:

61 dc 9f f8 b1 54 50 21 29 70 c6 fa 99 73 38 bb 20 5d d4 8f fa ca 3a 05 6b 09 a3 b4 4a 24 4d 76

When I first started working with Bitcoin I made the mistake of interpreting things like private keys as strings of individual letters and numbers. But in reality it's a sequence of bytes, where every two characters is one byte.

It doesn't matter if the hexadecimal characters are uppercase or lowercase. For example, 6b is the same as 6B.
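To see this for yourself, here's a short Python sketch that turns the example private key above into actual bytes. It only uses the built-in bytes.fromhex, and the variable names are just for illustration.

```python
hex_string = "61dc9ff8b15450212970c6fa997338bb205dd48ffaca3a056b09a3b44a244d76"

# Every two hexadecimal characters make up one byte.
key_bytes = bytes.fromhex(hex_string)
print(len(hex_string))  # 64
print(len(key_bytes))   # 32

# The first byte is 0x61, which is 97 as an integer.
print(key_bytes[0])  # 97

# Uppercase and lowercase hexadecimal characters give the same byte.
print(bytes.fromhex("6b") == bytes.fromhex("6B"))  # True
```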

Storing Data in Bytes

So, how can we store useful things like numbers, text, and other things inside these bytes?

It honestly just depends on how you interpret the bits.

A byte can hold different types of data depending on how you interpret that byte. For example, the byte 01100111 could represent the letter "g" or the number 103. It all depends on what kind of data structure you're working with.
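As a quick illustration, here's the same byte read as a number and as a letter in Python. Nothing changes about the underlying bits; only the interpretation does.

```python
byte = 0b01100111

# Interpreted as a number
print(byte)       # 103
print(hex(byte))  # 0x67

# Interpreted as an ASCII character
raw = bytes([byte])
print(raw.decode("ascii"))  # g

# And back to a number again from the raw byte
print(int.from_bytes(raw, byteorder="big"))  # 103
```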

Numbers

You can use bytes to store whole numbers (aka integers).

If we want to store a number in a computer, we store it using its binary representation inside the bits of a byte.

And if we want to store bigger numbers, we just use more bytes.
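Here's a small Python sketch of how bigger numbers need more bytes. The helper min_bytes_needed is a made-up name, and this assumes non-negative integers stored most-significant byte first (byte order is a separate topic).

```python
def min_bytes_needed(number: int) -> int:
    """Smallest number of bytes that can hold a non-negative integer."""
    if number < 0:
        raise ValueError("this sketch only handles non-negative integers")
    # Round the bit length up to a whole number of bytes (at least 1).
    return max(1, (number.bit_length() + 7) // 8)

for number in (103, 255, 256, 4190024921):
    size = min_bytes_needed(number)
    stored = number.to_bytes(size, byteorder="big")
    print(number, size, stored.hex())

# 103 1 67
# 255 1 ff
# 256 2 0100
# 4190024921 4 f9beb4d9
```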

Examples

Numbers are very common inside bitcoin data:

Text

You can also use bytes to store text. You can do this by assigning each character its own byte.

There is no official standard for encoding text inside bitcoin data, but the most commonly used mapping of characters-to-bytes is ASCII:

ASCII Table

The following characters are from Code Page 437, which is a popular 8-bit character set based on ASCII.

Basically, all ASCII character sets contain the same standard letters, numbers, and punctuation (between 0x20 and 0x7e). These are historically known as the printable characters.

Code Page 437 extended this with an additional 128 characters (between 0x80 and 0xff) to include international, box drawing, and mathematical characters too. This additional set of characters is commonly referred to as "extended ASCII".

Code Page 437 also replaced the obsolete control characters (between 0x01 and 0x1f) from the original ASCII standard (e.g. ISO 646) with decorative characters instead.

There is no specific ASCII character set used in Bitcoin, but this is a popular one, and it's good for demonstrating how bytes can be assigned to characters.

Standard ASCII

Decorative Characters
Binary Hexadecimal Decimal Character
00000000 00 0
00000001 01 1 ☺
00000010 02 2 ☻
00000011 03 3 ♥
00000100 04 4 ♦
00000101 05 5 ♣
00000110 06 6 ♠
00000111 07 7 •
00001000 08 8 ◘
00001001 09 9 ○
00001010 0a 10 ◙
00001011 0b 11 ♂
00001100 0c 12 ♀
00001101 0d 13 ♪
00001110 0e 14 ♫
00001111 0f 15 ☼
00010000 10 16 ►
00010001 11 17 ◄
00010010 12 18 ↕
00010011 13 19 ‼
00010100 14 20 ¶
00010101 15 21 §
00010110 16 22 ▬
00010111 17 23 ↨
00011000 18 24 ↑
00011001 19 25 ↓
00011010 1a 26 →
00011011 1b 27 ←
00011100 1c 28 ∟
00011101 1d 29 ↔
00011110 1e 30 ▲
00011111 1f 31 ▼
Printable Characters
Binary Hexadecimal Decimal Character
00100000 20 32 (space)
00100001 21 33 !
00100010 22 34 "
00100011 23 35 #
00100100 24 36 $
00100101 25 37 %
00100110 26 38 &
00100111 27 39 '
00101000 28 40 (
00101001 29 41 )
00101010 2a 42 *
00101011 2b 43 +
00101100 2c 44 ,
00101101 2d 45 -
00101110 2e 46 .
00101111 2f 47 /
00110000 30 48 0
00110001 31 49 1
00110010 32 50 2
00110011 33 51 3
00110100 34 52 4
00110101 35 53 5
00110110 36 54 6
00110111 37 55 7
00111000 38 56 8
00111001 39 57 9
00111010 3a 58 :
00111011 3b 59 ;
00111100 3c 60 <
00111101 3d 61 =
00111110 3e 62 >
00111111 3f 63 ?
01000000 40 64 @
01000001 41 65 A
01000010 42 66 B
01000011 43 67 C
01000100 44 68 D
01000101 45 69 E
01000110 46 70 F
01000111 47 71 G
01001000 48 72 H
01001001 49 73 I
01001010 4a 74 J
01001011 4b 75 K
01001100 4c 76 L
01001101 4d 77 M
01001110 4e 78 N
01001111 4f 79 O
01010000 50 80 P
01010001 51 81 Q
01010010 52 82 R
01010011 53 83 S
01010100 54 84 T
01010101 55 85 U
01010110 56 86 V
01010111 57 87 W
01011000 58 88 X
01011001 59 89 Y
01011010 5a 90 Z
01011011 5b 91 [
01011100 5c 92 \
01011101 5d 93 ]
01011110 5e 94 ^
01011111 5f 95 _
01100000 60 96 `
01100001 61 97 a
01100010 62 98 b
01100011 63 99 c
01100100 64 100 d
01100101 65 101 e
01100110 66 102 f
01100111 67 103 g
01101000 68 104 h
01101001 69 105 i
01101010 6a 106 j
01101011 6b 107 k
01101100 6c 108 l
01101101 6d 109 m
01101110 6e 110 n
01101111 6f 111 o
01110000 70 112 p
01110001 71 113 q
01110010 72 114 r
01110011 73 115 s
01110100 74 116 t
01110101 75 117 u
01110110 76 118 v
01110111 77 119 w
01111000 78 120 x
01111001 79 121 y
01111010 7a 122 z
01111011 7b 123 {
01111100 7c 124 |
01111101 7d 125 }
01111110 7e 126 ~
01111111 7f 127 ⌂

Extended ASCII

International Characters
Binary Hexadecimal Decimal Character
10000000 80 128 Ç
10000001 81 129 ü
10000010 82 130 é
10000011 83 131 â
10000100 84 132 ä
10000101 85 133 à
10000110 86 134 å
10000111 87 135 ç
10001000 88 136 ê
10001001 89 137 ë
10001010 8a 138 è
10001011 8b 139 ï
10001100 8c 140 î
10001101 8d 141 ì
10001110 8e 142 Ä
10001111 8f 143 Å
10010000 90 144 É
10010001 91 145 æ
10010010 92 146 Æ
10010011 93 147 ô
10010100 94 148 ö
10010101 95 149 ò
10010110 96 150 û
10010111 97 151 ù
10011000 98 152 ÿ
10011001 99 153 Ö
10011010 9a 154 Ü
10011011 9b 155 ¢
10011100 9c 156 £
10011101 9d 157 ¥
10011110 9e 158 ₧
10011111 9f 159 ƒ
10100000 a0 160 á
10100001 a1 161 í
10100010 a2 162 ó
10100011 a3 163 ú
10100100 a4 164 ñ
10100101 a5 165 Ñ
10100110 a6 166 ª
10100111 a7 167 º
10101000 a8 168 ¿
10101001 a9 169 ⌐
10101010 aa 170 ¬
10101011 ab 171 ½
10101100 ac 172 ¼
10101101 ad 173 ¡
10101110 ae 174 «
10101111 af 175 »
Box Drawing Characters
Binary Hexadecimal Decimal Character
10110000 b0 176 ░
10110001 b1 177 ▒
10110010 b2 178 ▓
10110011 b3 179 │
10110100 b4 180 ┤
10110101 b5 181 ╡
10110110 b6 182 ╢
10110111 b7 183 ╖
10111000 b8 184 ╕
10111001 b9 185 ╣
10111010 ba 186 ║
10111011 bb 187 ╗
10111100 bc 188 ╝
10111101 bd 189 ╜
10111110 be 190 ╛
10111111 bf 191 ┐
11000000 c0 192 └
11000001 c1 193 ┴
11000010 c2 194 ┬
11000011 c3 195 ├
11000100 c4 196 ─
11000101 c5 197 ┼
11000110 c6 198 ╞
11000111 c7 199 ╟
11001000 c8 200 ╚
11001001 c9 201 ╔
11001010 ca 202 ╩
11001011 cb 203 ╦
11001100 cc 204 ╠
11001101 cd 205 ═
11001110 ce 206 ╬
11001111 cf 207 ╧
11010000 d0 208 ╨
11010001 d1 209 ╤
11010010 d2 210 ╥
11010011 d3 211 ╙
11010100 d4 212 ╘
11010101 d5 213 ╒
11010110 d6 214 ╓
11010111 d7 215 ╫
11011000 d8 216 ╪
11011001 d9 217 ┘
11011010 da 218 ┌
11011011 db 219 █
11011100 dc 220 ▄
11011101 dd 221 ▌
11011110 de 222 ▐
11011111 df 223 ▀
Mathematical Symbols
Binary Hexadecimal Decimal Character
11100000 e0 224 α
11100001 e1 225 ß
11100010 e2 226 Γ
11100011 e3 227 π
11100100 e4 228 Σ
11100101 e5 229 σ
11100110 e6 230 µ
11100111 e7 231 τ
11101000 e8 232 Φ
11101001 e9 233 Θ
11101010 ea 234 Ω
11101011 eb 235 δ
11101100 ec 236 ∞
11101101 ed 237 φ
11101110 ee 238 ε
11101111 ef 239 ∩
11110000 f0 240 ≡
11110001 f1 241 ±
11110010 f2 242 ≥
11110011 f3 243 ≤
11110100 f4 244 ⌠
11110101 f5 245 ⌡
11110110 f6 246 ÷
11110111 f7 247 ≈
11111000 f8 248 °
11111001 f9 249 ∙
11111010 fa 250 ·
11111011 fb 251 √
11111100 fc 252 ⁿ
11111101 fd 253 ²
11111110 fe 254 ■
11111111 ff 255 (non-breaking space)

As you can see, each byte represents a different character. You just need the ASCII table at hand to see which byte corresponds to which character.
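Instead of looking up each character by hand, you can let a programming language do the lookup. This is a minimal Python sketch using the built-in "ascii" and "cp437" codecs. One caveat: as far as I know, Python's cp437 codec decodes the bytes below 0x20 as control characters rather than the decorative characters in the table, so only the printable and extended ranges are shown here.

```python
# Text -> bytes (one byte per ASCII character)
text = "bitcoin"
encoded = text.encode("ascii")
print(encoded.hex())  # 626974636f696e

# Bytes -> text
decoded = bytes.fromhex("626974636f696e").decode("ascii")
print(decoded)  # bitcoin

# Bytes above 0x7f are not ASCII, so they need an extended character set
# such as Code Page 437 to be displayed as characters.
print(bytes([0x80]).decode("cp437"))  # Ç
print(bytes([0xE3]).decode("cp437"))  # π
```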

Examples

Text is not frequently stored inside Bitcoin data, but it does show up in a few places:

Settings (Bit Field)

Sometimes you can use the underlying bits inside bytes of data to represent multiple on/off settings.

This is known as a bit field, and it's an efficient way to store multiple settings in the smallest space possible.
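Here's a rough sketch of a bit field in Python using bitwise operators. The flag names (FLAG_A, FLAG_B, FLAG_C) and helper functions are hypothetical and don't correspond to any real bitcoin settings; they just show how individual bits can be switched on and off inside a single number.

```python
# Hypothetical settings, one per bit (these names are made up for illustration).
FLAG_A = 1 << 0  # 0b001
FLAG_B = 1 << 1  # 0b010
FLAG_C = 1 << 2  # 0b100

def set_flag(field: int, flag: int) -> int:
    """Turn a setting on."""
    return field | flag

def clear_flag(field: int, flag: int) -> int:
    """Turn a setting off."""
    return field & ~flag

def has_flag(field: int, flag: int) -> bool:
    """Check whether a setting is on."""
    return (field & flag) != 0

field = 0
field = set_flag(field, FLAG_A)
field = set_flag(field, FLAG_C)
print(format(field, "032b"))    # 00000000000000000000000000000101
print(format(field, "08x"))     # 00000005
print(has_flag(field, FLAG_B))  # False

field = clear_flag(field, FLAG_A)
print(field)  # 4
```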

Examples

Bit fields are used in a handful of places in bitcoin:

Just Bytes

Sometimes bytes don't have to represent anything in particular. They can just be useful enough as a unique piece of data.

Examples

The output from a hash function is just a unique series of bytes, and this makes them useful as "fingerprints" for data:

And sometimes we might just use a random-looking set of bytes with no meaning at all:

So you don't have to interpret bytes as anything at all. Having a unique set of bytes to use as a "fingerprint" for data can be useful enough on its own.

A block hash actually gets interpreted as both a unique identifier and a number. It's usually just a unique identifier for the block, but it also gets interpreted as a number during the mining process to see if the block hash is below the current target value.
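To make the "fingerprint or number" idea concrete, here's a hedged Python sketch. It uses plain SHA-256 from the standard hashlib module as the example hash function, and the target value and byte order are assumptions chosen purely for illustration. It is not how a real block hash is calculated or checked.

```python
import hashlib

data = b"hello"

# As a fingerprint: the same data always gives the same 32 bytes.
digest = hashlib.sha256(data).digest()
print(len(digest))   # 32
print(digest.hex())  # the fingerprint, displayed as hexadecimal

# As a number: the exact same bytes can also be read as one big integer.
# (Reading the bytes as big-endian here is just an assumption for this sketch.)
as_number = int.from_bytes(digest, byteorder="big")

# Hypothetical target, only to show that a comparison is possible.
target = 1 << 255
print(as_number < target)
```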

Custom Data

Ultimately you can use bytes to represent anything you like. It doesn't have to just be numbers and text (although they're the most common). It all depends on how you choose to interpret the combinations of 1s and 0s inside those bytes.

Examples

These are some encodings that are unique to bitcoin:

Working with Bytes

Being able to work with raw bytes in bitcoin is especially important when it comes to networking or when you're hashing data.

The good news is you can work with raw bytes of data in any decent programming language. This varies between languages, but it usually involves writing out each individual byte value in some sort of special string or array, like so:

"\xF9\xBE\xB4\xD9"
{0xF9, 0xBE, 0xB4, 0xD9}

But this kind of format isn't easy to pass around, so you'll often find yourself converting these bytes into hexadecimal strings for display purposes:

"f9beb4d9"

This is how I typically display bytes on this website, and how you'll usually see bytes being displayed on blockchain explorers.

Sometimes you'll want to get the integer value that these bytes represent too:

4190024921

So when working with Bitcoin data, you want to be able to convert back and forth between actual bytes, hexadecimal strings, and integers in your programming language of choice.

Here are some quick examples on how to do this in a few common programming languages:

# Work with bytes directly using hexadecimal characters to represent each byte.
bytes = "\xF9\xBE\xB4\xD9" #=> (jargon - tries to display character encoding for each byte value)

# Bytes -> Hexadecimal String
hex_string = bytes.unpack("H*")[0] #=> "f9beb4d9"

# Hexadecimal String -> Bytes
bytes = [hex_string].pack("H*") #=> (jargon - tries to display a character encoding for each byte value)

# Hexadecimal String -> Integer
integer = hex_string.to_i(16) #=> 4190024921

Ruby can be a bit awkward when displaying bytes. If you try to print out bytes directly, Ruby will try to display them using each byte's character encoding (instead of their hexadecimal representation), so you need to convert to hexadecimal to display them when debugging.

The "pack" and "unpack" functions are the most useful when it comes to working with raw bytes of data in languages like Ruby and Python (in Python they live in the "struct" module), so it's worth getting to know them.
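Here's a small sketch of pack and unpack using Python's standard struct module. The format string ">I" means a 4-byte unsigned integer with the most significant byte first; the values are the same example bytes used throughout this section.

```python
import struct

# Integer -> 4 raw bytes (">" = big-endian, "I" = 4-byte unsigned integer)
raw_bytes = struct.pack(">I", 4190024921)
print(raw_bytes.hex())  # f9beb4d9

# 4 raw bytes -> integer (unpack always returns a tuple)
(integer,) = struct.unpack(">I", raw_bytes)
print(integer)  # 4190024921
```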

# NOTE: Python 3.5+

# Create bytes
raw_bytes = b'\xf9\xbe\xb4\xd9'

# Convert bytes to hex string
hex_string = b'\xf9\xbe\xb4\xd9'.hex()
print(hex_string) #=> f9beb4d9

# Convert hex string to bytes
raw_bytes = bytes.fromhex('f9beb4d9')
print(raw_bytes) #=> b'\xf9\xbe\xb4\xd9'

# Convert bytes to integer
integer = int.from_bytes(b'\xf9\xbe\xb4\xd9', byteorder="big") # second argument is endianness
print(integer) #=> 4190024921

See little-endian for details about endianness in bitcoin.
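As a quick illustration of why the byteorder argument matters, here's the same set of bytes converted to an integer in both byte orders. This is only a sketch of the general idea using Python's built-in int.from_bytes and int.to_bytes.

```python
raw_bytes = bytes.fromhex("f9beb4d9")

# Same bytes, two different integers depending on the byte order.
big = int.from_bytes(raw_bytes, byteorder="big")
little = int.from_bytes(raw_bytes, byteorder="little")
print(big)     # 4190024921
print(little)  # 3652501241

# Converting an integer back to bytes needs the byte order (and a length) too.
print(little.to_bytes(4, byteorder="little").hex())  # f9beb4d9
print(little.to_bytes(4, byteorder="big").hex())     # d9b4bef9
```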

package main

import (
    "encoding/hex" // converting between byte slice and hexadecimal string
    "fmt"
    "math/big" // converting between byte slice and integer
)

func main() {

    // Create a byte slice
    rawBytes := []byte{0xF9, 0xBE, 0xB4, 0xD9}
    fmt.Println(rawBytes) //=> [249 190 180 217]

    // Convert byte slice to hexadecimal string
    hexString := hex.EncodeToString(rawBytes)
    fmt.Println(hexString) //=> f9beb4d9

    // Convert hexadecimal string to byte slice
    byteSlice, err := hex.DecodeString(hexString)
    if err != nil {
        panic(err) // hexString was not valid hexadecimal
    }
    fmt.Println(byteSlice) //=> [249 190 180 217]

    // Convert byte slice to integer (bytes are read as big-endian)
    integer := new(big.Int).SetBytes(byteSlice)
    fmt.Println(integer) //=> 4190024921

    // Convert integer to byte slice (big-endian)
    byteSliceFromInteger := integer.Bytes()
    fmt.Println(byteSliceFromInteger) //=> [249 190 180 217]

}

Summary

A byte is just a group of 8 bits:

byte (binary):
┌─┬─┬─┬─┬─┬─┬─┬─┐
│0│1│1│0│1│0│1│1│
└─┴─┴─┴─┴─┴─┴─┴─┘

To save on space, we usually represent each byte using 2 hexadecimal characters instead of using 8 individual bits:

byte (hexadecimal):
┌─┬─┐
│6│B│
└─┴─┘

Bytes are most commonly used in bitcoin to represent numbers (e.g. private keys and output amounts), but they're also often used as unique fingerprints for larger pieces of data (e.g. TXIDs and block hashes).

But at the end of the day, the combination of 1s and 0s inside bytes can be used to store any kind of data.