Little-Endian
The order of bytes that a computer likes to read in.
“Little-endian” refers to the order in which bytes of information can be processed by a computer.
For example, here’s a hexadecimal number:
12345678
And here’s that same number in little-endian:
78563412
A little-endian computer processor will read this number as the same value that we read as 12345678.
It’s just two different ways of writing the same thing.
Bitcoin likes using little-endian format for some data, so when working with code, you often have to get things into little-endian format for them to work.
How does it work?
Let’s look at that first number again:
12345678
When you read it from left to right, you start with the most-significant value first (i.e. starting with 10000000 and ending with 8). So as you’re processing this number in your head, you could say you’re starting with the big-end first.
But if you think about it, it could make just as much sense to start with the little-end first.
87654321
In this format, you’re working your way up from the least-significant value and finishing with the most-significant value.
Sure, it’s a complete shift in the way you’re used to reading numbers, but if you’re a computer, this is arguably a much more logical way to do it. And that’s why many computer processors read data the “little-endian” way.
But wait, this still isn’t the same format as the actual little-endian number from the start…
Bytes.
Computers read through data in chunks. Or to be more precise, they process data in “bytes”.
Now, 1 byte is just some space in a computer’s memory, and holds 2 hexadecimal characters. So as you’re reversing the data into little-end-first order, you actually do it 1 byte at a time:
Big Endian:
                +------+------+------+------+
Byte Number:    |  0   |  1   |  2   |  3   |
                +------+------+------+------+
Data:           |  12  |  34  |  56  |  78  |
                +------+------+------+------+

Little-Endian:
                +------+------+------+------+
Byte Number:    |  0   |  1   |  2   |  3   |
                +------+------+------+------+
Data:           |  78  |  56  |  34  |  12  |
                +------+------+------+------+
And so by reversing 2 characters (1 byte) at a time, we get 12345678 in little-endian:
78563412
Ta da.
So the little end is still first, but you’re taking 1 byte of that little-end at a time (and not simply 1 character at a time).
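To make the difference concrete, here is a minimal sketch in POSIX shell that reverses the same number both ways. The function names reverse_chars and reverse_bytes are made up for this illustration, and the byte version assumes the input has an even number of characters.

```shell
# Reverse a hex string 1 character at a time (little-end first, but NOT little-endian).
reverse_chars() (
  hex=$1
  out=
  while [ -n "$hex" ]; do
    rest=${hex#?}             # everything after the first character
    out=${hex%"$rest"}$out    # put the first character in front of the result
    hex=$rest
  done
  printf '%s\n' "$out"
)

# Reverse a hex string 1 byte (2 characters) at a time (little-endian).
reverse_bytes() (
  hex=$1
  # Assumption: whole bytes only, so refuse odd-length input.
  [ $(( ${#hex} % 2 )) -eq 0 ] || exit 1
  out=
  while [ -n "$hex" ]; do
    rest=${hex#??}            # everything after the first byte
    out=${hex%"$rest"}$out    # put the first byte in front of the result
    hex=$rest
  done
  printf '%s\n' "$out"
)

reverse_chars 12345678   # 87654321
reverse_bytes 12345678   # 78563412
```

Only the second result matches the little-endian number from the start, because the 2 characters inside each byte keep their order.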
Why is little-endian used in bitcoin?
Because that’s the way Bitcoin was designed.
It may not be the most user-friendly (or popular [1]) choice, but modern computers almost always use the little-endian format internally, so this decision is a way of improving speed. [2]
Example of little-endian in bitcoin.
The majority of the fields in transaction data are in little-endian format.
The first 4 bytes (8 characters) of any serialized transaction tell you the version number of that transaction.
0100000001528dd30e90e54ff5321758214b86b344c1867140b44e49975934727051158a0a000000008b4830450221008e332006edbbbda724f5955f55e29ec1dd526f9a7f7599b5c801860b3e378e4e02201c3f501bf1f43010e85a25abbd0fc4547491c334744cc4728d86914a59811dd4014104212b6993b785b677e55a886f9353b1d216c939c86b96d5d86e8f3bd8d8ffe2164ecf7c0f6ecc8c525a4850f896af1a7612fb7520ad88f77717ee4c824ab00582ffffffff01f06c3577000000001976a914d1a4db47565243b566b5fc400ff59400ac254cb988ac00000000
01000000 = version number (little-endian)
00000001 = version number (big-endian)
Or in other words, the format of this transaction is version 1.
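Remember, to convert from little-endian to big-endian, reverse the order of the bytes (pairs of characters). If you prefer to do it in two steps, swap the 2 characters within each pair first, then reverse the whole string.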
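As a rough sketch, the version field can be pulled out and decoded in POSIX shell like this. The helper reverse_bytes and the variable names are invented for this example, and tx_start holds only the first few bytes of the transaction above rather than the whole thing.

```shell
# Hypothetical helper: reverse a hex string 1 byte (2 characters) at a time.
reverse_bytes() (
  hex=$1
  # Assumption: whole bytes only, so refuse odd-length input.
  [ $(( ${#hex} % 2 )) -eq 0 ] || exit 1
  out=
  while [ -n "$hex" ]; do
    rest=${hex#??}            # everything after the first byte
    out=${hex%"$rest"}$out    # put the first byte in front of the result
    hex=$rest
  done
  printf '%s\n' "$out"
)

# The start of the serialized transaction (truncated for this example).
tx_start=0100000001528dd30e

# The version is the first 4 bytes (8 hex characters).
version_le=$(printf '%s' "$tx_start" | cut -c1-8)
version_be=$(reverse_bytes "$version_le")

# Read the big-endian hex as an ordinary number.
version=$(( 0x$version_be ))

printf 'little-endian: %s\n' "$version_le"
printf 'big-endian:    %s\n' "$version_be"
printf 'version:       %d\n' "$version"
```

This prints 01000000, then 00000001, then 1.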
Code
Bash
Here’s how you can swap endianness on the command line (using grep to match every 2 characters, tac to reverse the order, then tr to remove the line breaks to give you a string):
echo 12345678 | grep -o .. | tac | echo $(tr -d '\n')
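Going the other way, you sometimes start with an ordinary number and need it as little-endian bytes. Here is a minimal sketch for a 4-byte field such as the transaction version. The function name to_le32 is made up for this example, and it assumes the number is non-negative and fits in 4 bytes.

```shell
# Hypothetical helper: encode a number as 4 little-endian bytes (8 hex characters).
to_le32() (
  hex=$(printf '%08x' "$1")     # big-endian hex, zero-padded to 8 characters
  # Assumption: the value fits in 4 bytes, so anything longer is rejected.
  [ "${#hex}" -eq 8 ] || exit 1
  out=
  while [ -n "$hex" ]; do
    rest=${hex#??}              # everything after the first byte
    out=${hex%"$rest"}$out      # put the first byte in front of the result
    hex=$rest
  done
  printf '%s\n' "$out"
)

to_le32 1            # 01000000
to_le32 305419896    # 78563412 (305419896 is 0x12345678)
```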
PHP
Here’s a similar method in PHP (switching every 2 characters again) [3]:
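// converts a hexadecimal string to little-endian by reversing the order of its bytes
$string = '12345678'; // a string, so that str_split() can cut it into 2-character pieces
$little_endian = implode('', array_reverse(str_split($string, 2))); // hyphens are not allowed in variable names
echo $little_endian; // 78563412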
