Into the Weeds: Bitcoin Wallets, Keys, and Seeds — VegaX Research Report
In this installment of the Bitcoin Educational series, we return from a brief stint battling misconceptions around the bitcoin network’s energy usage, and instead expand upon cryptographic fundamentals first explored here. Readers are sure to get deep into the weeds of then cryptographic technologies that underlie the bitcoin network.
For those who have followed crypto markets, or engaged with any of the protocols, the phrases “wallet” and “seed” are likely familiar. But before venturing into the forest of complexity, let us first take a moment to define those terms in the context of bitcoin. Generally speaking, a wallet is a software application that displays a “balance” of coins associated with a user’s public keys (addresses) and additionally secures a user’s private keys — with which one can access and spend coins. A seed, on the other hand, is just a piece of data. Specifically, a seed is a randomly generated 64-byte piece of data from which all subsequent private and public keys will be derived.
The content of this installment will be focused on examining the concepts and features of Hierarchical Deterministic (HD) wallets, and why these types of wallets are so prevalent in the bitcoin landscape today. Because of the technical nature of this content, we will proceed in as methodical a manner as possible.
We will cover:
- A brief review of wallet seeds and their different iterations.
- What an HD wallet is, and why a person might use one?
- How do HD wallets work?
- “Master” and “Extended” Private and Public keys.
- “Child” keys and how they are derived.
Randomly Generated Security
As previously stated, a “seed” is a randomly generated 64-bit number represented in hexadecimal format — this is the base from which all of a wallet’s private / public key pairs and addresses are derived. In modern wallets, the seed is represented by a mnemonic sentence: a series of 12 to 24 words containing the sequence of randomly generated numbers (seed) within them.
While most wallets do not display the series of steps taken to create the mnemonic sentence, it is beneficial to understand the process. The first step is to generate entropy — the source of randomness. This can be thought of as a very large number that has not been generated before, and will never be generated again. It’s best to consider entropy as a series of bits (e.g. 001010101110101) as this is how computers store data. The second step is to translate that entropy into our mnemonic sentence; the exact (technical) method of this conversion surpasses the scope of this piece, but interested readers can review the standard for themselves outlined by the BIP-39 proposal.
The final step would then be to convert the mnemonic sentence into the 64-byte hexadecimal seed. This happens by putting the mnemonic sentence with an optional passphrase through the Password-Based Key Derivation Function 2 (PBKDF2). Essentially this function hashes the two inputs multiple times until it produces the final 64-byte (512-bit) seed.
HD Wallets: What & Why?
A hierarchical deterministic wallet is a wallet that generates all of its keys and addresses from a single source. It is deterministic because it generates keys and addresses the same way each time, and is hierarchical because those keys and addresses can be organized into a tree.
One benefit of using an HD wallet is that only a single source of backup is required for security. A basic wallet would generate a new private / public key pair independently, each time you received a new payment. HD wallets, on the other hand, enable the creation of master private keys which can generate billions of “child” private / public key pairs. In that case, one would need only to back up the single seed, as the master private key would then be able to deterministically generate all child keys.
Each child key in the wallet can also generate its own keys, meaning that users can organize their keys into hierarchical trees. One use case for this could be to utilize different parts of the tree to differentiate between accounts.
Master private keys have a corresponding master public key, this key can generate the exact same child public keys without any knowledge of the associate private key. One could therefore send a master public key to a different computer / server stack to generate new receiving addresses, without fear that the private keys would be compromised in the event of a server hack. This is useful for applications like hardware wallets, which store private keys on a dedicated device but do not sacrifice the convenience of being able to generate receiving addresses on separate devices. These are all great reasons to use an HD wallet, and in the next section, we will explore the processes that enable the wallet to function as it does.
How does an HD Wallet Work?
It all begins with the seed, the randomly generated 64-byte hexadecimal value. Again, in modern wallets, this value is most commonly represented as a mnemonic sentence — also known as a “seed phrase.” A wallet needs a large, randomly generated number from which to derive all subsequent keys and addresses, and storing that number as a mnemonic sentence is a convenient way of doing so.
The next step is to create the wallet’s master private key. This is done by putting the seed through a hash function called a “hash-based message authentication code” (HMAC) to generate another set of 64 bytes. Within the new set of 64 bytes, the first 32 bytes are the private key and the last 32 bytes are the chain code. A chain code is an extra 32 bytes that we couple with the private key — this then creates what is known as an extended key. We will revisit this concept later, but for now, understand that a chain code is a “secret” piece of data that is not broadcasted on the network. This will be important for later discussion on child keys.
Circling back to extended keys: these are private or public keys that can be used to derive new keys (child keys) in an HD wallet. A single extended private key can be used as a source for all subsequent child private / public key pairs in a wallet. Additionally, an extended public key will be able to generate the exact same child public keys (as the extended private key), without any knowledge of the associated private keys. The private key that makes up half of the extended key can still be used to create a corresponding public key as described in a previous installment.
Child Keys and Nuances
Recall that in an HD wallet, new keys and addresses are derived from the same source (seed) and those subsequent outputs can be organized into a tree — they are hierarchical. In this way, we can take our master extended private key (private key + chain code) and run it through the HMAC function and thus generate child keys. An index number is also hased with the master extended private key in this step. This is included so that multiple child keys can be created from a single master extended private key, as each time the index number changes, the hashing output is different. For reference, a single extended key can generate 2,147,483,648 of these child keys.
We mentioned in the previous section that it is possible to create an extended public key that can generate the same child public keys as would be generated by an extended private key — in the resulting private / public key pairs. To construct the extended public key, we would need the public key from the extended private key, and couple that with the same chain code used to create the extended private key.
The master extended private key creates child private keys by running the contents of the corresponding extended public key through the HMAC function and adding the result to the original private key. Similarly, the master extended public key creates new child public keys by putting its contents through the HMAC function, and adding the result to the original public key. And so, because the original private key and public key have been adjusted by the same amount (chain code and index number), the new child private keys and public keys will correspond.
Why do we extend keys with a chain code?
In short, we add this code to a key because it delineates the subsequent child keys generated from the original key — as they are not solely derived from the preceding key. Say, for example, that we used one of the public keys to receive some bitcoin, this public key would then be visible on the blockchain. If chain codes were not used to derive the child key that we used, someone would be able to take that public key and derive all the children for it.
But by using chain codes, which are secret pieces of data that are not broadcasted to the network, any would-be malicious entities are unable to derive the children. And to reiterate: when we couple a normal key with that piece of chain code, we refer to that output as an extended key.
Moreover, it is not possible to tell that two public keys (or addresses) in a tree are part of the same wallet — i.e. that they were derived from the same master extended key. Although the child keys are derived from the extended key in a specific manner, the actual public keys themselves do not share any resemblance; it should appear that they were generated completely independently.
One should take great care to properly secure the extended keys, as again, anyone who can access them would be able to derive all of their children. For example, if someone’s extended master public key was leaked, anyone willing to would be able to derive all children and thus find all addresses associated with those child public keys. Now, because those are public keys, the funds are not at risk of being stolen, but the would-be malicious entity could see exactly how much bitcoin is owned in that wallet.
In the event that an extended public key and any child private key were leaked, a malicious entity would be able to take those two extended keys and calculate the master extended private key. In such a case, the entity would be able to generate all private / public key pairs and therefore steal all funds owned by the wallet. Something to keep in mind with regard to security protocols.
Conclusions
The bitcoin network continues to benefit from advances made in public key cryptography. While these concepts may seem more technical than other aspects of the bitcoin network’s technology stack, they are nonetheless important to understand to increase one’s knowledge of the network.
Along those same lines, because the majority of wallets today abstract away the processes described in this piece, most users are completely unaware of what goes on “behind the scenes.” Speaking from experience, increasing one’s general knowledge of the technological systems that underlie the bitcoin network is a surefire way to become more comfortable with transacting on the network, as well.
We hope this piece has helped expand your technical knowledge, thank you for reading!
🚀Better Indexes, Better Returns — VegaX
Questions? 👉🏼 info@vegaxholdings.com
🎯Follow, Message, Tweet, Clap, & Join Us Here:
As we continue to expand our digital footprint, here is where you can find us:
VegaX Site | VegaX Twitter | VegaX FB | VegaX LinkedIn | VegaX IG
VegaX Holdings creates next-generation indices and financial products that are needed to support the growth of the cryptocurrency industry and improve returns for investors worldwide. Learn how to get better returns today: www.VegaXHoldings.com