
blk.dat

The blk.dat files in ~/.bitcoin/blocks/ contain the raw block data received by a Bitcoin Core node.

These blk.dat files basically store "the blockchain".

How do they work?

Every block that your node receives gets appended to a blk.dat file.

Also, instead of being stored in one massive file, the blockchain is split across multiple blk*.dat files.

~/.bitcoin/blocks
blk00000.dat
blk00001.dat
blk00002.dat
...

Your node first adds blocks to blk00000.dat, then when it fills up it moves on to blk00001.dat, then blk00002.dat ...and so on.
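As a rough sketch of the naming scheme, the file names can be generated and listed like this in Python. The helper names blk_filename and list_blk_files are made up for illustration, and the five-digit zero padding is taken from the listing above.

```python
from pathlib import Path


def blk_filename(index):
    """Return the name of the block file with this number, e.g. 0 -> 'blk00000.dat'."""
    return f"blk{index:05d}.dat"


def list_blk_files(blocks_dir):
    """Return the blk*.dat files in blocks_dir, in the order the node filled them."""
    # The numbers are zero-padded, so a plain name sort is also numeric order.
    return sorted(Path(blocks_dir).glob("blk*.dat"))
```

For example, list_blk_files(Path.home() / ".bitcoin" / "blocks") would return the files starting with blk00000.dat.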

Example

The data in blk.dat files is stored in binary, and each new block gets appended to the end of the file.

We can look at the genesis block by reading the first 293 bytes of blk00000.dat:

f9beb4d91d0100000100000000000000000000000000000000000000000000000000000000000000000000003ba3edfd7a7b12b27ac72c3e67768f617fc81bc3888a51323a9fb8aa4b1e5e4a29ab5f49ffff001d1dac2b7c0101000000010000000000000000000000000000000000000000000000000000000000000000ffffffff4d04ffff001d0104455468652054696d65732030332f4a616e2f32303039204368616e63656c6c6f72206f6e206272696e6b206f66207365636f6e64206261696c6f757420666f722062616e6b73ffffffff0100f2052a01000000434104678afdb0fe5548271967f1a67130b7105cd6a828e03909a67962e0ea1f61deb649f6bc3f4cef38c4f35504e51ec112de5c384df7ba0b8d578a4c702b6bf11d5fac0000000000

(See the od command below for getting hexadecimal data from a binary file.)

Structure

The data above can be split into five pieces:

Data

[ magic bytes ][    size     ][        block header        ][  tx count  ][          transaction data          ]
 <- 4 bytes ->  <- 4 bytes ->  <-        80 bytes        ->  <- varint ->  <-            remainder           ->

The size field is what allowed me to figure out that I needed to read 293 bytes to get the whole block.
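Here is a minimal sketch of splitting one record into those five pieces in Python. The function names read_varint and split_record are illustrative, not part of any library. The varint rule used is bitcoin's usual one: a first byte below 0xfd is the value itself, while 0xfd, 0xfe and 0xff mean the value follows in the next 2, 4 or 8 bytes (little-endian).

```python
def read_varint(data, pos):
    """Decode a bitcoin varint at data[pos]; return (value, position after it)."""
    first = data[pos]
    if first < 0xFD:
        return first, pos + 1
    width = {0xFD: 2, 0xFE: 4, 0xFF: 8}[first]
    value = int.from_bytes(data[pos + 1:pos + 1 + width], "little")
    return value, pos + 1 + width


def split_record(data):
    """Split the record at the start of `data` into its five pieces."""
    magic = data[0:4]
    size = int.from_bytes(data[4:8], "little")  # size field is little-endian
    block = data[8:8 + size]                    # the block itself is `size` bytes
    header = block[:80]
    tx_count, pos = read_varint(block, 80)
    return {
        "magic": magic,
        "size": size,
        "header": header,
        "tx_count": tx_count,
        "tx_data": block[pos:],                 # everything after the tx count
    }
```

Passing the first 293 bytes of blk00000.dat to split_record should give a size of 285 and a tx_count of 1, going by the data shown above.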

The size is given as 1d010000, so to get this in a human-readable format:

  1. Swap the endianness to get 0000011d
  2. Convert from hexadecimal to decimal to get 285

So in addition to the initial 8 bytes for the magic-bytes + size, I know the size of the upcoming block data is going to be 285 bytes.
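The same conversion can be done in one step in Python, since int.from_bytes can read the bytes as little-endian directly. The variable names here are only for illustration.

```python
size_field = bytes.fromhex("1d010000")

# Reading the field as little-endian swaps the byte order and converts in one go.
block_size = int.from_bytes(size_field, "little")

# Magic bytes (4) + size field (4) + the block itself.
record_size = 4 + 4 + block_size

print(block_size)   # 285
print(record_size)  # 293
```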

Notes

1. Blocks are not downloaded in order.

If you are parsing the blk.dat files, be aware that blocks are not going to be in order. For example, you may encounter blocks in this order as you run through the file:

A B C E F D

This is because your bitcoin node will download blocks in parallel to download the blockchain as quickly as possible. Your node will download blocks further ahead of the current one as it goes, instead of waiting to receive each block in order.

The maximum distance ahead your node will fetch from (or the "maximum out-of-orderness") is controlled by BLOCK_DOWNLOAD_WINDOW in the bitcoin source code.
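If you need the blocks in chain order, one approach is to read the hash and previous-block hash of every record and link them up afterwards. The sketch below is just that, a sketch: iter_blocks and chain_order are made-up names, it assumes the mainnet magic bytes from the example above, that the file is stored unobfuscated as in that example, and that the blocks form a single chain with no stale blocks. A block's hash is the double SHA-256 of its 80-byte header, and the previous block's hash sits in bytes 4 to 36 of the header.

```python
import hashlib

MAGIC = bytes.fromhex("f9beb4d9")  # assumption: mainnet magic bytes, as in the example


def iter_blocks(f):
    """Yield (block_hash, prev_block_hash) for each record in an open blk.dat file."""
    while True:
        prefix = f.read(8)
        # Stop at end of file, or at anything that is not the start of a record.
        if len(prefix) < 8 or prefix[:4] != MAGIC:
            break
        size = int.from_bytes(prefix[4:8], "little")
        block = f.read(size)
        if len(block) < size:
            break
        header = block[:80]
        digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
        # Hashes are conventionally displayed in reverse byte order.
        yield digest[::-1].hex(), header[4:36][::-1].hex()


def chain_order(pairs):
    """Order (hash, prev_hash) pairs so that each block follows its parent."""
    pairs = list(pairs)
    hashes = {h for h, _ in pairs}
    child_of = {prev: h for h, prev in pairs}
    # Start from the block whose parent is not among the blocks we read.
    current = next(h for h, prev in pairs if prev not in hashes)
    ordered = [current]
    while current in child_of:
        current = child_of[current]
        ordered.append(current)
    return ordered
```

You would open a file with open("blk00000.dat", "rb") and pass it to iter_blocks. To order the whole chain you would need to collect the pairs from every blk*.dat file first, because a block's parent may be in a different file.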

2. The maximum blk.dat file size is 128MiB (134,217,728 bytes)

This limit is set by MAX_BLOCKFILE_SIZE in the bitcoin source code.

Linux Tools

As mentioned, the data inside a blk.dat file is binary, so you probably won't be able to make much sense of it if you open one in a text editor. But no matter, because binary data can easily be converted to hexadecimal, and there are two commands for the job:

1. od

This is a simple one. It dumps out the contents of files in your format of choice.

od -x --endian=big -N 293 -An blk00000.dat
  • -x <- show hexadecimal
  • --endian=big <- display each 2-byte group in big endian, so the bytes appear in the same order as in the file
  • -N 293 <- number of bytes to read
  • -An <- do not show file offset

"od" stands for octal dump, but you can dump out data in formats other than octal.

2. hexdump

This is similar to od, but it also gives you the option of displaying ASCII text from the data (which is handy for looking at messages contained inside transaction data).

$ hexdump -C -s 8 -n 285 blk00000.dat

00000008  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000018  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000028  00 00 00 00 3b a3 ed fd  7a 7b 12 b2 7a c7 2c 3e  |....;...z{..z.,>|
00000038  67 76 8f 61 7f c8 1b c3  88 8a 51 32 3a 9f b8 aa  |gv.a......Q2:...|
00000048  4b 1e 5e 4a 29 ab 5f 49  ff ff 00 1d 1d ac 2b 7c  |K.^J)._I......+||
00000058  01 01 00 00 00 01 00 00  00 00 00 00 00 00 00 00  |................|
00000068  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000078  00 00 00 00 00 00 ff ff  ff ff 4d 04 ff ff 00 1d  |..........M.....|
00000088  01 04 45 54 68 65 20 54  69 6d 65 73 20 30 33 2f  |..EThe Times 03/|
00000098  4a 61 6e 2f 32 30 30 39  20 43 68 61 6e 63 65 6c  |Jan/2009 Chancel|
000000a8  6c 6f 72 20 6f 6e 20 62  72 69 6e 6b 20 6f 66 20  |lor on brink of |
000000b8  73 65 63 6f 6e 64 20 62  61 69 6c 6f 75 74 20 66  |second bailout f|
000000c8  6f 72 20 62 61 6e 6b 73  ff ff ff ff 01 00 f2 05  |or banks........|
000000d8  2a 01 00 00 00 43 41 04  67 8a fd b0 fe 55 48 27  |*....CA.g....UH'|
000000e8  19 67 f1 a6 71 30 b7 10  5c d6 a8 28 e0 39 09 a6  |.g..q0..\..(.9..|
000000f8  79 62 e0 ea 1f 61 de b6  49 f6 bc 3f 4c ef 38 c4  |yb...a..I..?L.8.|
00000108  f3 55 04 e5 1e c1 12 de  5c 38 4d f7 ba 0b 8d 57  |.U......\8M....W|
00000118  8a 4c 70 2b 6b f1 1d 5f  ac 00 00 00 00           |.Lp+k.._.....|
00000125
  • -C <- canonical display: show the bytes in the order they appear in the file (the same byte order that is used in bitcoin), along with the ASCII text
  • -s <- start point (offset in bytes)
  • -n <- number of bytes to read

Show hexadecimal data only.

You can chain some commands together so that you only get raw hexadecimal data output:

hexdump -C -s 8 -n 285 blk00000.dat | cut -c 11-58 | tr '\n' ' ' | tr -d ' '
  • cut -c 11-58 <- keeps only characters 11 to 58 of each line (the hexadecimal columns)
  • tr '\n' ' ' <- translate new lines into spaces
  • tr -d ' ' <- deletes all spaces
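If you would rather do this from a script, a small Python function gives the same raw hexadecimal output. This is only a sketch: read_hex is a made-up name, and its offset and length arguments play the same role as -s and -n above.

```python
def read_hex(path, offset, length):
    """Return `length` bytes of the file at `path`, starting at `offset`, as hex."""
    with open(path, "rb") as f:
        f.seek(offset)                 # like hexdump -s
        return f.read(length).hex()    # like hexdump -n, minus the formatting
```

For example, read_hex("blk00000.dat", 8, 285) should return the same string as the chained command above.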

Resources