pk.org: Computer Security/Lecture Notes

Bitcoin Overview

An architecture walkthrough

Paul Krzyzanowski – 2025-10-05

Bitcoin is not just digital money. It is a distributed system that maintains a shared, tamper-resistant record of transactions without a central authority.

To understand how it works, we’ll start with the cryptographic structures that make this possible, then see how transactions, blocks, and consensus fit together into one coherent design.

Cryptographic Foundations

Bitcoin's cryptographic security relies on hash functions and public key cryptography in the form of digital signatures. We won't review public key cryptography, but before we look at Bitcoin itself, we need to cover three basic structures related to hashing: hash functions, hash pointers, and Merkle trees. You’ve already seen hash functions before, but we’ll review them briefly in context.

Hash functions

A hash function takes any input, whether a short message or an entire book, and produces a fixed-length output, called a digest. Bitcoin uses SHA-256, which generates a 256-bit (32-byte) hash.

Cryptographic hash functions have three crucial properties:

  1. Preimage resistance: given a hash, it’s infeasible to find an input that produces it.

  2. Collision resistance: it’s infeasible to find two inputs that produce the same hash.

  3. Avalanche effect: a one-bit change in the input drastically changes the output.

A cryptographic hash serves as a compact checksum for a message. If a message has been modified, it will yield a different hash. By associating a hash with a message, we have a basis for managing the integrity of that message: being able to detect if the message gets changed.

Bitcoin relies on cryptographic hash functions to detect tampering, commit to large sets of data, and limit the rate at which new blocks are added.
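As a minimal sketch of these properties, the following uses Python's standard hashlib module. The messages and the helper names sha256_hex and differing_bits are made up for illustration. It shows that inputs of any size produce a 32-byte digest and that a one-character change flips a large share of the output bits.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as 64 hex characters (256 bits)."""
    return hashlib.sha256(data).hexdigest()

def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Illustrative inputs (assumptions, not Bitcoin data)
short = b"pay Bob 1 BTC"
changed = b"pay Bob 2 BTC"        # one character differs
long_input = b"x" * 1_000_000     # a much larger input

print(sha256_hex(short))
print(sha256_hex(changed))

# The digest is always 32 bytes, regardless of input size
print(len(hashlib.sha256(long_input).digest()))

# Avalanche effect: count how many of the 256 output bits changed
print(differing_bits(hashlib.sha256(short).digest(),
                     hashlib.sha256(changed).digest()))
```

On a typical run, roughly half of the 256 output bits differ between the two short messages, even though the inputs differ in a single character.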

Hash pointers

Pointers are used in data structures to allow one data element to refer to another. In processes, a pointer is a memory location. It could also be an object reference, like a file name. In distributed systems, a pointer may be the name or IP address of a computer, along with an object identifier.

A hash pointer is like a regular pointer, but with authentication built in.

Instead of only storing the address of some data, it also stores the hash of that data. If anyone changes the data, its hash changes too, and the mismatch becomes evident.

Hash pointers are common in authenticated data structures. For example, Git uses them to link commits. Each commit stores the hash of its parent and the hash of the directory tree it represents. If a file changes, its hash changes, which propagates up to the commit hash. That’s why Git can instantly detect if history was rewritten.

Blockchains: tamper-resistant linked lists

The same structures that use pointers can be adapted to use hash pointers to create tamper-evident structures.

For example, a linked list can be constructed with each element containing a hash pointer to the next element instead of a pointer.

Figure: A blockchain

Adding a new block is easy. You allocate the block, copy the head hash pointer into it (the next pointer), and update the head hash pointer to point to the new block and contain a hash of that block.

If an adversary modifies, say, block 1, we can detect that: the hash pointer in block 2 will point to block 1, but the hash of block 1 will no longer match the hash in the pointer.

For a successful attack, the adversary will also need to modify the hash value in the hash pointer in block 2. That will make the hash pointer in block 3 invalid, so that will need to be changed.

The adversary will need to change all the hash pointers leading up to the head of the list. If we can protect the head of the list so that the adversary cannot modify it, then we will always be able to detect tampering. A linked list connected by hash pointers is called a blockchain.

In a simple version, there is a “root pointer” that identifies the head of the chain (just like with linked lists). If anything deep in history changes, the root hash will no longer match. As long as the root pointer is carefully protected, tampering with the list can always be detected.
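The linked list described above can be sketched in a few lines. This is an illustrative model, not Bitcoin's block format: the Block and Chain classes, the field names, and the use of 32 zero bytes as the "no previous block" marker are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class Block:
    data: bytes
    prev: Optional["Block"]   # ordinary pointer to the previous block
    prev_hash: bytes          # hash of the previous block (zeros for the first)

    def hash(self) -> bytes:
        # The hash covers the data and the stored hash of the predecessor
        return hashlib.sha256(self.prev_hash + self.data).digest()

class Chain:
    def __init__(self) -> None:
        self.head: Optional[Block] = None
        self.head_hash: bytes = bytes(32)   # the protected "root pointer"

    def append(self, data: bytes) -> None:
        # Copy the head hash pointer into the new block, then update the head
        block = Block(data, self.head, self.head_hash)
        self.head = block
        self.head_hash = block.hash()

    def verify(self) -> bool:
        # Walk from the head back to the first block, checking every hash
        expected = self.head_hash
        block = self.head
        while block is not None:
            if block.hash() != expected:
                return False
            expected = block.prev_hash
            block = block.prev
        return expected == bytes(32)

chain = Chain()
for item in (b"data 1", b"data 2", b"data 3"):
    chain.append(item)
print(chain.verify())            # the untouched chain checks out

oldest = chain.head
while oldest.prev is not None:   # find the first block that was added
    oldest = oldest.prev
oldest.data = b"tampered"
print(chain.verify())            # the change no longer matches the stored hashes
```

The only value that has to be protected is head_hash. An adversary who cannot change it cannot alter any earlier block without verify reporting a mismatch.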

Bitcoin doesn’t have a single central root pointer, but each node implicitly tracks the chain’s head through these hashes.

Merkle Trees

A Merkle tree (proposed by Ralph Merkle in 1979) is a binary tree structure used to summarize and verify large collections of data efficiently. Each node in the tree contains a hash, and the hash of a parent node is computed from the hashes of its two children.

Figure: Merkle Tree

At the bottom, leaf nodes store hash pointers to data blocks. Each leaf stores both a reference to a data block and the hash of its contents. Every non-leaf node stores the hash of the concatenation of its two child hashes. This continues up the tree until we reach a single Merkle root, which uniquely represents the entire dataset.

If even one data block changes, its hash changes, which in turn alters every hash up to the root. This means the Merkle root serves as a compact fingerprint for the integrity of all underlying data.

Merkle trees also allow efficient proofs of inclusion. To verify that a specific data block is part of a dataset, we only need to provide the hashes along the path from that leaf to the root. The verifier can recompute the root and compare it to the known value. This proof requires only a small number of hashes, which grows logarithmically with the number of data blocks, making it practical even for very large datasets.
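A simplified sketch of a Merkle root and an inclusion proof follows. It assumes single SHA-256 hashing, a non-empty list of blocks, and duplication of the last hash when a level has an odd number of entries; the function names are invented for this example, and the details differ from Bitcoin's exact rules.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_levels(blocks: list) -> list:
    """Return every level of the tree, from the leaf hashes up to the root."""
    level = [h(b) for b in blocks]
    levels = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]   # pad odd levels (an assumption)
        levels.append(level)
        # Each parent is the hash of its two children concatenated
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    levels.append(level)                  # the final level holds only the root
    return levels

def merkle_root(blocks: list) -> bytes:
    return build_levels(blocks)[-1][0]

def inclusion_proof(blocks: list, index: int) -> list:
    """Collect the sibling hash at each level on the path from leaf to root."""
    proof = []
    for level in build_levels(blocks)[:-1]:
        sibling = index ^ 1               # neighbor within the same pair
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof

def verify_proof(block: bytes, proof: list, root: bytes) -> bool:
    current = h(block)
    for sibling_hash, sibling_is_left in proof:
        if sibling_is_left:
            current = h(sibling_hash + current)
        else:
            current = h(current + sibling_hash)
    return current == root

blocks = [b"tx-a", b"tx-b", b"tx-c", b"tx-d", b"tx-e"]
root = merkle_root(blocks)
proof = inclusion_proof(blocks, 2)
print(len(proof))                           # number of hashes in the proof
print(verify_proof(b"tx-c", proof, root))   # genuine block
print(verify_proof(b"tx-x", proof, root))   # forged block
```

The verifier needs only the block, the proof, and the trusted root. It never sees the other data blocks.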

Real-world examples:

In Bitcoin, each block header contains a single Merkle root that summarizes all transactions in that block. This design lets lightweight Bitcoin clients verify that a transaction is included in a block, without needing to download or store every transaction. The Merkle root links all transaction data to the block header, which is then protected by proof of work, making the entire structure tamper-evident.


The Ledger: Transactions vs. Accounts

Bitcoin was introduced anonymously in 2009 by a person or group under the pseudonym Satoshi Nakamoto and is considered to be the first blockchain-based cryptocurrency. Bitcoin was designed as an open, distributed, public system. There is no authoritative entity, and anyone can participate in operating the servers.

To understand Bitcoin’s ledger, start by comparing it to a familiar model: a bank account.

Account-based systems

A traditional bank maintains one entry per customer: an account balance. When you transfer $100 to someone, your balance decreases and theirs increases. The system modifies account records directly. Beyond auditing, there is no need to maintain a log of all transactions; we simply care about tracking account sums. With a centralized system, all trust resides in this trusted third party. The system fails if the bank disappears, the banker makes a mistake, or if the banker is corrupt.

Bitcoin’s transaction-based system

Bitcoin achieves decentralization by tracking the flow of value through transactions rather than maintaining account balances. Every transfer of bitcoin is a record of ownership changing from one party to another, and the complete history of these transfers forms the ledger.

Instead of keeping mutable balances, Bitcoin records a list of transactions, each describing how existing coins are spent and who receives them. The current state of the system is the set of all unspent transaction outputs (UTXOs).

For example:

Each new transaction consumes previous outputs and creates new ones, forming a chain of ownership. No one can spend an output twice, and no central authority is needed to maintain balances. We’ll look next at how individual transactions are constructed, broadcast, and verified.
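To make the UTXO model concrete, here is a toy ledger, offered as a sketch only. The transaction IDs, owner names, and the Output and Transaction classes are hypothetical; real transactions identify owners through addresses and signatures rather than names.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Output:
    owner: str     # stand-in for an address
    amount: int    # value in satoshis (1 BTC = 100,000,000 satoshis)

@dataclass(frozen=True)
class Transaction:
    txid: str       # hypothetical identifier; real IDs are hashes
    inputs: tuple   # (txid, output_index) pairs naming earlier outputs
    outputs: tuple  # Output objects created by this transaction

def apply_transaction(utxos: dict, tx: Transaction) -> None:
    """Consume the outputs a transaction references and add the ones it creates."""
    if len(set(tx.inputs)) != len(tx.inputs):
        raise ValueError("transaction references the same output twice")
    for ref in tx.inputs:
        if ref not in utxos:
            raise ValueError(f"output {ref} is missing or already spent")
    for ref in tx.inputs:
        del utxos[ref]                       # spent outputs leave the set
    for index, out in enumerate(tx.outputs):
        utxos[(tx.txid, index)] = out        # new outputs become spendable

def balance(utxos: dict, owner: str) -> int:
    return sum(o.amount for o in utxos.values() if o.owner == owner)

# The current state is just the set of unspent outputs
utxos = {("tx0", 0): Output("alice", 50)}
tx1 = Transaction("tx1", (("tx0", 0),),
                  (Output("bob", 30), Output("alice", 20)))
apply_transaction(utxos, tx1)
print(balance(utxos, "alice"), balance(utxos, "bob"))

try:
    apply_transaction(utxos, tx1)            # tries to spend ("tx0", 0) again
except ValueError as err:
    print("rejected:", err)
```

No balance is ever stored. A balance is computed on demand by summing the unspent outputs that belong to an owner.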


Keys and Addresses

So far, we have talked about who sends and receives Bitcoin, but we have not defined who that really is. Bitcoin has no usernames or accounts. Identity is defined by cryptography.

Keys

Every participant generates a public/private key pair using Elliptic Curve Cryptography (ECC). The private key is a random 256-bit number known only to its owner. The public key is derived from it mathematically.

When someone creates a transaction, they include a digital signature produced with their private key. Anyone can verify that signature using the corresponding public key, which confirms that the transaction was authorized by whoever controls that key.

Ownership in Bitcoin simply means the ability to prove control over a private key. This gives Bitcoin its pseudonymity: a person’s on-chain identity is a key pair.

Addresses

Public keys are long and not convenient to use directly. To simplify and add safety, Bitcoin uses addresses, which are compact, readable identifiers derived from public keys.

Conceptually, a Bitcoin address answers the question: who can spend this output? When you send Bitcoin, your transaction output specifies the recipient’s address. Later, when that recipient spends the funds, they reveal their public key and digital signature. Each node verifies that the public key hashes to the same address and that the signature is valid, proving that the spender controls the matching private key.

All Bitcoin addresses are created in roughly the same way:

  1. Start with the public key.

  2. Hash it with SHA-256, then hash that result with RIPEMD-160 to get a short 160-bit fingerprint.

  3. Add a version identifier and checksum to detect typing errors.

  4. Encode the result in a human-friendly alphabet.

The earliest format uses Base58Check, an encoding built on a 58-character alphabet that omits easily confused characters such as 0, O, I, and l. These addresses typically begin with a 1. They are still common and supported everywhere.
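Steps 3 and 4 of the address recipe can be sketched for the Base58Check format. The function below assumes the 160-bit fingerprint from step 2 has already been computed, because RIPEMD-160 is not available in every Python build. The function name and the example fingerprint are placeholders, not real key material.

```python
import hashlib

# The Base58 alphabet leaves out 0, O, I, and l to avoid confusion
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58check_encode(version: int, fingerprint: bytes) -> str:
    """Encode a version byte plus a public-key fingerprint as an address."""
    data = bytes([version]) + fingerprint
    # Checksum: first 4 bytes of SHA-256 applied twice to the versioned data
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    full = data + checksum

    # Treat the bytes as one large number and convert it to base 58
    n = int.from_bytes(full, "big")
    encoded = ""
    while n > 0:
        n, remainder = divmod(n, 58)
        encoded = ALPHABET[remainder] + encoded

    # Each leading zero byte is written as the character '1'
    leading_zeros = len(full) - len(full.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded

# Placeholder 20-byte fingerprint (an assumption; not a real public-key hash)
fingerprint = hashlib.sha256(b"example public key").digest()[:20]
print(base58check_encode(0x00, fingerprint))
```

With version byte 0x00, the encoded data starts with a zero byte, which is why these addresses begin with a 1. A mistyped character will almost always fail the checksum when the address is decoded.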

A newer format uses Bech32, which is designed to be more compact and easier to read or type correctly. Bech32 addresses usually start with bc1. They are used by modern wallets because they integrate with Bitcoin’s segregated witness feature, which reduces transaction size and fees.

No matter the format, both types of addresses serve the same purpose: they are hashes of public keys used to identify who can spend a given output. Given a public key, anyone can compute the address. Given only the address, it is computationally infeasible to recover the public key or private key.

Transactions

A transaction contains inputs and outputs. Inputs identify where the bitcoin comes from, and outputs identify to whom it is being transferred.

If Alice wants to send bitcoin to Bob, she creates a message that is a Bitcoin transaction and sends it to one or more Bitcoin nodes. When a node receives a transaction, it forwards it to its peers, other nodes it knows about. Typically, within a few seconds, most nodes on the network will have a copy of the transaction and can verify it.

The Bitcoin network is not a database. It is built around a ledger, the complete list of all transactions. There are no user accounts that can be queried. In her transaction, Alice must provide one or more links to previous transactions that together add up to at least the amount she wants to send. These links to earlier transactions are called inputs. Each input names a prior transaction by its ID and identifies a specific output from that transaction.

Alice’s effective balance is not stored anywhere. It is the sum of all outputs that sent her money, minus the outputs she has already spent. This total is the sum of her unspent transaction outputs (UTXOs).

When a Bitcoin node receives a transaction, it performs several checks:

  1. Signature validation: For each input, the node verifies the digital signature using the provided public key to confirm that the spender controls the corresponding private key.

  2. Address match: The node hashes the provided public key and checks that it matches the address that the previous output locked to.

  3. Double-spend protection: The node ensures that none of the referenced outputs have already been spent by another transaction.

  4. Value verification: The node checks that total input value is at least the total output value. Any difference is the transaction fee.

A Bitcoin transaction typically contains:

Inputs
Each input identifies where coins come from by referencing a specific output of an earlier transaction, and it provides a signature and public key. A transaction may include multiple inputs, possibly from different addresses controlled by the same user.
Outputs
Each output specifies a destination address and the amount to transfer to that address.
Change
Because every input must be completely spent, any excess value is returned to the sender as a new output that sends the remaining amount back to one of the sender’s addresses.
Transaction fee
The difference between the total input value and the total output value, paid to the miner who includes the transaction in a block. Block space is limited: Bitcoin enforces a block weight limit of 4 million weight units, which corresponds to roughly 1 MB of non-witness data. A typical transaction is around 250 bytes. To get confirmed quickly, users may offer higher fees when the network is busy.
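The relationship between inputs, outputs, change, and the fee is simple arithmetic, sketched below. The function names, address strings, and amounts are hypothetical, and values are in satoshis.

```python
def build_outputs(input_values, recipient, amount, change_address, fee):
    """Return (address, value) outputs that pay amount and return the change."""
    total_in = sum(input_values)
    change = total_in - amount - fee
    if change < 0:
        raise ValueError("inputs do not cover the amount plus the fee")
    outputs = [(recipient, amount)]
    if change > 0:
        # Inputs are spent in full, so the excess goes back to the sender
        outputs.append((change_address, change))
    return outputs

def implied_fee(input_values, outputs):
    """The fee is never stated explicitly: it is inputs minus outputs."""
    return sum(input_values) - sum(value for _, value in outputs)

# Alice spends two earlier outputs to pay Bob 100,000 satoshis
inputs = [60_000, 50_000]
outputs = build_outputs(inputs, "bob-address", 100_000,
                        "alice-change-address", fee=500)
print(outputs)
print(implied_fee(inputs, outputs))
```

If the sender forgets the change output, the entire excess becomes the fee, which is why wallets construct the change output automatically.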

Blocks: Grouping Transactions

If Bitcoin is just a list of transactions, why bother grouping them into blocks?

Blocks make it possible to:

Every few minutes, new transactions circulate across the network. Some Bitcoin nodes, known as miners, collect batches of valid transactions into a block and attempt to publish the next block in the chain. Other nodes, called full nodes, do not mine but verify all transactions and blocks to maintain the integrity of the ledger.

Block structure

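Each block contains a header and a list of transactions. The header includes the hash of the previous block, the Merkle root of the block's transactions, a timestamp, the difficulty target, and a nonce.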

The header’s previous block hash links it to the chain. The Merkle root summarizes all transactions in the block. If even one transaction changes, the Merkle root changes, which in turn changes the block hash.

Immutability

This design makes the chain tamper-evident. If someone alters a past transaction, its hash changes, invalidating the Merkle root and the block hash. Every later block references that hash, so the entire chain from that point onward no longer matches. By linking each block to the one before it through hashing, the blockchain creates a permanent, verifiable record of history. The next section explains why rewriting that history is not just detectable but computationally infeasible.

Proof of Work and Consensus

Now that we have blocks, we need a way for the network to agree on which block becomes the next one.

The concurrency problem

Multiple miners can discover new blocks around the same time. If two blocks reference the same previous block, the chain temporarily splits into competing branches, which is called a fork. The network needs a way to resolve these conflicts consistently.

Hashcash and the origin of Proof of Work

The solution comes from Hashcash, proposed by Adam Back in 1997 as a way to fight email spam. It required the sender to perform a small, useless computation, finding a value whose hash started with a certain number of zeros, before sending a message. That small cost discouraged mass spamming but was negligible for legitimate users.

Bitcoin repurposes this idea for consensus.

Proof of Work

In Bitcoin, miners must find a nonce, a number in the header that they can freely vary, so that the resulting hash of the block header is less than a target value. Formally:

\[H(\text{block header}) < \text{target}\]

Since hash outputs are unpredictable, the only way to find a valid nonce is by trial and error. This computation takes time and energy, making block creation intentionally slow and costly.
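The trial-and-error search can be sketched as a loop over nonce values. This is a toy model under stated assumptions: the header is an arbitrary byte string rather than Bitcoin's 80-byte header, the hash is read as a big-endian integer, and the target is far easier than any real one so that the loop finishes quickly. Applying SHA-256 twice follows Bitcoin's practice.

```python
import hashlib

def header_hash(header: bytes, nonce: int) -> int:
    """Double SHA-256 of the header plus a 4-byte nonce, as an integer."""
    data = header + nonce.to_bytes(4, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")

def mine(header: bytes, target: int, max_nonce: int = 2**32):
    """Try nonces in order until H(header, nonce) < target."""
    for nonce in range(max_nonce):
        if header_hash(header, nonce) < target:
            return nonce
    return None   # nonce space exhausted; a miner would vary other fields

# Demo target: the top 12 bits of the hash must be zero,
# so about 1 in 4,096 attempts succeeds
target = 2**244
header = b"previous hash | merkle root | timestamp"   # placeholder contents

nonce = mine(header, target)
print(nonce)
# Verification takes a single hash computation
print(header_hash(header, nonce) < target)
```

Halving the target doubles the expected number of attempts, while checking a claimed nonce always costs one hash computation.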

Difficulty adjustment

To keep the average time between blocks near 10 minutes, the network automatically adjusts the difficulty every 2016 blocks (roughly two weeks). If miners collectively produce blocks too fast, the target value decreases, making the puzzle harder. If blocks arrive too slowly, the target increases, making the puzzle easier.
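The adjustment scales the target by the ratio of the time the last 2016 blocks actually took to the time they should have taken. The sketch below captures that rule along with Bitcoin's limit of a factor of four per adjustment. It leaves out other details of the real implementation, such as the cap on the maximum target, and the names are chosen for this example.

```python
BLOCK_INTERVAL = 10 * 60                        # desired seconds per block
RETARGET_BLOCKS = 2016
EXPECTED_TIMESPAN = RETARGET_BLOCKS * BLOCK_INTERVAL   # two weeks in seconds

def retarget(old_target: int, actual_timespan: int) -> int:
    """Scale the target by actual time / expected time, within a factor of 4."""
    clamped = max(EXPECTED_TIMESPAN // 4,
                  min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN

old_target = 2**224   # illustrative value, not a real network target

# Blocks arrived in one week instead of two: the target is cut in half
print(retarget(old_target, EXPECTED_TIMESPAN // 2) == old_target // 2)

# Blocks took four weeks: the target doubles, making the puzzle easier
print(retarget(old_target, EXPECTED_TIMESPAN * 2) == old_target * 2)
```

Because a smaller target means fewer acceptable hashes, halving the target doubles the expected work per block.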

Why this works

Once a valid block is found, verifying it is trivial; any node can check the hash in microseconds. But generating it took substantial work. This asymmetry lets the network recognize the most “expensive” chain as the legitimate one.

Chain selection

Nodes always follow the longest valid chain, meaning the one with the greatest cumulative proof of work. If a fork occurs, miners continue building on whichever branch grows fastest.
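The distinction between "longest" and "most cumulative work" matters when branches were mined at different difficulties. As a rough sketch, the expected number of hash attempts needed for a block with a given target is 2^256 / (target + 1), and a chain's work is the sum over its blocks. Representing a chain as a list of targets is a simplification made for this example.

```python
def block_work(target: int) -> int:
    """Expected number of hash attempts to find a hash below target."""
    return 2**256 // (target + 1)

def chain_work(targets) -> int:
    """Cumulative work of a chain, given the target of each of its blocks."""
    return sum(block_work(t) for t in targets)

def best_chain(chains):
    """Pick the branch with the greatest cumulative proof of work."""
    return max(chains, key=chain_work)

easy = 2**250          # illustrative targets, far easier than real ones
hard = 2**248          # four times harder than `easy`

branch_a = [easy, easy, easy]    # three blocks, little work each
branch_b = [hard, hard]          # two blocks, more work in total

print(chain_work(branch_a), chain_work(branch_b))
print(best_chain([branch_a, branch_b]) is branch_b)
```

Here the shorter branch wins because its two blocks represent more expected hashing than the three easier blocks combined.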

For an attacker to modify an earlier transaction, they’d have to redo all proof of work from that block onward and surpass the rest of the network. With thousands of miners contributing massive computational power, catching up is practically impossible.

Why rewriting history is impractical

Each block’s hash is the result of extensive proof of work. Changing even a single transaction in an old block would require recomputing the proof of work for that block and for every subsequent block that builds upon it. While this is theoretically possible, the rest of the network continues to add new blocks, making it nearly impossible for an attacker to catch up.

Over time, Bitcoin mining has evolved from individuals using CPUs, to GPUs, to FPGAs, and finally to custom ASICs (Application-Specific Integrated Circuits) designed solely for SHA-256 hashing. Because the odds of finding a valid block are low, miners often combine their resources into mining pools and share rewards based on contributed work.

Finding a valid block is essentially a random event, similar to winning a lottery, where the chance of success is proportional to the miner’s computational power.

An attacker who controlled more than half of the network's total computational power could, in principle, rewrite recent history by outpacing honest miners. This is known as a 51% attack. However, the cost of acquiring and operating enough hardware to do this across the global Bitcoin network is so high that such an attack is effectively infeasible in practice.


Mining Rewards and Incentives

Why would anyone devote their hardware and electricity to perform this computation?

Block rewards

Each newly mined block includes one special transaction, the coinbase transaction, that creates new bitcoins from nothing. This is how new coins enter circulation.

The initial reward in 2009 was 50 BTC per block. Every 210,000 blocks (roughly every four years), it halves. As of 2025, the reward is 3.125 BTC per block.
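The halving schedule can be written as a short function of block height. This sketch works in satoshis (one hundred-millionth of a bitcoin) and uses integer halving, which rounds down. The function and constant names are chosen for this example.

```python
HALVING_INTERVAL = 210_000
INITIAL_REWARD = 50 * 100_000_000    # 50 BTC expressed in satoshis

def block_reward(height: int) -> int:
    """Reward in satoshis for the block at the given height."""
    halvings = height // HALVING_INTERVAL
    return INITIAL_REWARD >> halvings   # each halving is an integer shift

def total_supply() -> int:
    """Sum the rewards of every block until the reward reaches zero."""
    total = 0
    halvings = 0
    while (INITIAL_REWARD >> halvings) > 0:
        total += HALVING_INTERVAL * (INITIAL_REWARD >> halvings)
        halvings += 1
    return total

for height in (0, 210_000, 420_000, 630_000, 840_000):
    print(height, block_reward(height) / 100_000_000, "BTC")

print(total_supply() / 100_000_000, "BTC in total")
```

Because each era pays half as much as the one before, the total forms a geometric series that converges to just under 21 million BTC.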

Transaction fees

In addition to the block reward, miners collect all the transaction fees from the transactions in the block. Each fee equals the difference between the total input and output values of that transaction.

Over time, as the block reward continues to halve, transaction fees are expected to become the main incentive for mining.

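The reward is tracked in satoshis, the smallest unit of bitcoin (one hundred-millionth of a BTC), and each halving rounds down. After 32 halvings, the reward is a single satoshi; the 33rd halving reduces it to zero. From then on, there will be a maximum of just under 21 million bitcoins in circulation, and the only reward will be the sum of all the transaction fees in that block.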

Economic balance

Proof of work creates security through cost. Miners act honestly because their revenue depends on following the rules. Any attempt to cheat or fork the chain would destroy their own reward. This self-interest forms the backbone of Bitcoin’s decentralized stability.


Putting It All Together

Bitcoin’s architecture layers several ideas into one coherent system:

Layer            Purpose
Cryptography     Provides integrity and authorization via hashes and signatures.
Data structures  The blockchain and Merkle trees maintain an authenticated, append-only ledger.
Consensus        Proof of work synchronizes a distributed network without trust.
Economics        Rewards and fees motivate participation and honest behavior.

Key ideas applied in Bitcoin's architecture include:

Each transaction transfers ownership of specific outputs. Miners package these transactions into blocks, perform proof of work to secure them, and broadcast the new blocks. Every node validates each block and updates its local copy of the ledger. The chain with the most accumulated work becomes the authoritative record.


Bitcoin is a system where cryptography replaces trust, computation replaces authority, and incentives replace enforcement. Its design shows how a distributed network of strangers can maintain a shared, immutable ledger without any central control.


Addendum: The Evolution of Cryptocurrencies After Bitcoin

Bitcoin introduced the world to a working model of decentralized digital money. It remains the largest and most influential cryptocurrency, but it also inspired a vast ecosystem of alternatives. Since Bitcoin’s launch in 2009, more than 10,000 cryptocurrencies have been created. Most have failed or never gained significant adoption, but a few introduced important new ideas.

Ethereum and Smart Contracts

The most influential successor to Bitcoin is Ethereum, proposed in 2013 by Vitalik Buterin and launched in 2015. While Bitcoin’s purpose is narrow—to securely transfer value—Ethereum was designed as a general-purpose platform for decentralized applications.

Ethereum introduced the concept of a smart contract: a small program that runs on the blockchain and automatically executes when its conditions are met. Smart contracts are written in a high-level language called Solidity and compiled into bytecode that runs on the Ethereum Virtual Machine (EVM), a global, distributed computation engine replicated across all Ethereum nodes.

Every node must execute the same contract code and arrive at the same result, which ensures consensus but also makes computation expensive and slow. Because execution happens redundantly on thousands of nodes, even simple operations consume energy and time. Smart contracts must therefore be deterministic: no contract can rely on random values or external data unless those values are provided through a trusted intermediary known as an oracle.

Bugs or design flaws in smart contracts cannot easily be patched once deployed, and errors have caused several major losses, such as the 2016 DAO exploit that led to a controversial Ethereum hard fork.

A simple smart contract might hold escrowed funds and release them automatically when a deadline passes or when certain data from another contract is received. More complex examples include:

This idea transformed blockchain systems from passive ledgers into programmable platforms. Ethereum’s flexibility led to a surge of experimentation and new tokens built on its network, though it also introduced security challenges, scalability limits, and high transaction fees during peak use.

Beyond Proof of Work

Bitcoin’s proof of work mechanism provided decentralized consensus but at high computational and environmental cost. Over time, other systems have explored alternatives that maintain security while reducing energy use.

Proof of Stake (PoS)

The leading alternative is proof of stake, where validators are chosen to create blocks based on the amount of cryptocurrency they hold and are willing to “stake” as collateral. If they act dishonestly, they lose part of their stake. This approach replaces computation with financial commitment. Ethereum transitioned to proof of stake in 2022 during its major Merge upgrade, cutting energy consumption by more than 99%.

Other Models

Other systems experiment with hybrid or alternative consensus designs:

These models aim to improve scalability, speed, and efficiency while maintaining decentralization.

Stablecoins

One of the biggest practical problems in cryptocurrency is volatility: the value of most cryptocurrencies fluctuates wildly, making them speculative investments (gambles) rather than useful vehicles for commerce. A solution emerged in the form of stablecoins, which are digital tokens pegged to the value of a stable asset such as the U.S. dollar.

Common types include:

Stablecoins are now critical in cryptocurrency trading and DeFi applications because they provide a stable unit of account within an otherwise volatile ecosystem.

NFTs and Meme Coins

By 2020, public interest in blockchain extended far beyond currency. Two particularly visible, though sometimes questionable, phenomena were non-fungible tokens (NFTs) and meme coins.

NFTs

Figure: Bored Apes

An NFT is a unique token that represents ownership of a specific digital asset, such as an artwork, song, or in-game item. Each token is recorded on a blockchain and can include metadata linking to the digital object it represents.

In most cases, the token does not contain the actual image or file, only a URL or reference to where that file is stored, often on a centralized web server or a distributed file system such as IPFS. If the file or the hosting server disappears, the link in the NFT may no longer resolve. Anyone can still view or copy the underlying file, which makes the concept of “ownership” ambiguous: the blockchain records ownership of the token, not exclusive rights to the asset itself.

Figure: First 5000 Days

NFTs became a cultural phenomenon in 2021. Collections such as Bored Ape Yacht Club (BAYC) and CryptoPunks sold for hundreds of thousands to millions of dollars per image. Some NFTs of digital art were auctioned at traditional houses such as Christie’s, where Beeple’s Everydays: The First 5000 Days sold for over $69 million. Its current value is estimated to be less than $100.

Much of the excitement was speculative, driven by social status, scarcity, and celebrity endorsements rather than intrinsic artistic or technical value.

While NFTs demonstrated new ways to represent digital ownership, the market’s collapse in 2022 highlighted how little of that ownership translated into lasting worth or control.

Meme Coins

Meme coins are cryptocurrencies created as jokes or social experiments, often referencing internet memes. The most famous example is Dogecoin, created in 2013 as a parody of Bitcoin.

Despite its origins, it gained a strong community and widespread attention after public endorsements from figures such as Elon Musk. Thousands of imitators followed, most with little purpose beyond speculation and humor. These coins show how easy it is to create tokens, but also how difficult it is for most to maintain value or legitimacy.

The Landscape Today

Bitcoin remains the first, largest, and most influential cryptocurrency by both market value and cultural impact. It proved that decentralized consensus could sustain a working financial system. Most later projects either built upon or reacted to Bitcoin’s model:

As of today, thousands of cryptocurrencies exist, but the vast majority have little adoption or real utility. Bitcoin’s original principles, decentralization, transparency, and cryptographic trust, remain the foundation of the entire digital currency ecosystem.

