Special thanks to Vishwanath Raman from Oasis Labs and David Whittington from AR.IO for feedback and insights.
Can distributed ledger technology decrease cloud outages, helping companies save over $26.5 billion per year? Or can blockchains help us build a fully connected and autonomous driving network, helping reduce the nearly 1.9 million annual traffic deaths worldwide? Could we even use the same technology that powers Bitcoin and Ethereum to enable humans to become an interplanetary species?
The unifying element in these use cases–cryptocurrency, cloud outages, autonomous vehicle networks, and even space travel–is overcoming byzantine failures. While blockchains can’t yet support the throughput or performance required for many of these applications, overcoming byzantine failures and allowing businesses and institutions to easily share open data in trustless environments has the potential to realize over $3 trillion per year in economic value.
At Kwil, we often describe our database software as a “Byzantine Fault Tolerant Postgres”. This article will explain precisely and practically what “Byzantine Fault Tolerant Postgres” means and how it enables real-world, trustless digital infrastructure.
“Byzantine Fault Tolerant Postgres” is the combination of two distinct technologies: byzantine fault tolerant (BFT) consensus algorithms and the PostgreSQL database engine.
Byzantine fault tolerance arises from the “Byzantine Generals Problem”, a distributed systems problem conceived by researchers in 1982. The Byzantine Generals Problem imagines a hypothetical scenario of Byzantine armies, each with its own general, camped outside an enemy city. Without a central authority to command and coordinate attacks, the generals must collectively vote on whether to attack or retreat via messengers.
The key challenge in the Byzantine Generals Problem is the generals have no way of knowing whether the other generals are loyal or traitors. For example, General 1 could tell Generals 2 and 3 to attack, and General 2 could tell General 3 that General 1 said to retreat (as shown below). General 3 cannot know whether General 1 or General 2 is the traitor, and thus, the three generals cannot progress safely.
For a system of n number of traitors to proceed safely, there must be 3n + 1 generals; therefore, > 2/3 of the nodes must agree and act honestly. If >= 1/3 of the nodes in the system act deceptively, then the nodes will not be able to discern the safe path forward (i.e. the system loses liveness). I won’t go into the mathematical proof for the 2/3 threshold, but you can read a succinct and accessible explanation here.
The key breakthrough with byzantine fault tolerance is that networks can be run and maintained by parties that do not trust each other. The most popular application of byzantine fault tolerance today is cryptocurrency networks; however, other real-world examples include enabling fault-tolerant safety systems on Boeing airplanes and protecting against nodes corrupted by radiation in SpaceX’s International Space Station landing procedures.
In a Postgres database network, byzantine fault tolerance means that more than 2/3 of the nodes must agree on the safe path forward. Assuming a blockchain-based approach, this means that greater than 2/3 of the nodes must agree on each block, including all the transactions within the block, before progressing to the next block. Therefore, BFT Postgres networks can be run without trusting the other parties running nodes.
PostgreSQL, also known as Postgres, is an advanced, open-source, object-relational database engine. It is among the most popular pieces of software worldwide, used in production at organizations ranging from the International Space Station to Reddit and Instagram. Relational databases comprise 60% of the total database market, with Postgres commanding 17.4% alone. In the last week, you have probably used or interacted with an application that relies on a Postgres database.
The key to Postgres (and other SQL databases) is that it uses the relational model, storing data in structured tables. Tables are comprised of columns and rows, where each row represents an individual piece of data (also called a record), and each column represents an attribute. By storing data in a relational structure, data can be normalized, enabling better maintainability as a dataset grows and the ability to combine data in ways that yield unique insights. For example, a university that needs to manage information about its academic staff and their professional activities could maintain a relational database with the following normalized structure:
In the example above, storing data in a normalized structure offers several advantages. First, the normalized structure minimizes data redundancy: each piece of information is stored only once, reducing the likelihood of inconsistencies and errors. Second, it simplifies updates in the database: changes to a professor’s details, like a department change, only need to be made in one table, making the update process more efficient and less error-prone. Third, the relational structure makes it easier to combine relationships and queries across tables, such as the relationship between courses and publications that are relevant to the subject. Lastly, normalization makes database maintenance easier, especially as the volume of data grows over time.
A further benefit of the relational model, especially in contrast to NoSQL databases, is transactional integrity, guaranteeing that all or none of a database operation is executed and that concurrent transactions do not interfere (also known as ACID transactions). This is critical in fields where precision, consistency, and reliability are key, such as banking and healthcare.
As inferred from the name, “Byzantine Fault Tolerant Postgres” combines the two technologies mentioned above: a byzantine fault tolerant consensus algorithm running over multiple Postgres databases.
In a Kwil network, each node maintains a BFT blockchain and a Postgres database. When a state-changing transaction enters the system (e.g. database creation, database drop, insert/update/delete statement, or a funds transfer), the node first broadcasts the transaction to the other validator nodes. Each validator node then uses the transaction's digital signature to determine whether the transaction’s origin is legitimate.
To guarantee strong consistency and data storage and deletion, nodes must first achieve consensus on the valid transactions in a block. Once a block is confirmed, the nodes execute the transactions on their local Postgres database. To propose or validate the next block, a validator must prove that it correctly applied the changes from the previous block. This is proven by generating a proof containing the result of all the proposed changes from the previous block, applying a deterministic ordering schema, and generating a resulting hash. The resulting hash is then included in the next block, and each node must have the correct hash to continue in the block mining process. This process ensures that each node stores the data on its Postgres database, guaranteeing that nodes properly insert, update, and delete data.
The idea of a byzantine fault tolerant relational database is not new; multiple academic articles have been published about how a generalizable BFT Database could help improve reliability and uptime while protecting against malicious actors in a distributed system. However, until recently, consensus algorithms were neither sophisticated enough nor could handle the data throughput required to be truly byzantine fault tolerant. Recent interest and funding for blockchain technology, largely due to public interest in cryptocurrencies, has spurred significant advances in BFT research and development.
The main advantage of a byzantine fault tolerant database is that it allows for databases to be operated and shared among parties that may not “trust” each other. In a cryptocurrency network, parties may not “trust” each other because malicious actors seek to subvert the system for financial gain. In space landing systems, parties may not “trust” each other because of radiation-induced “bit flips” that corrupt a computer’s memory. Byzantine fault-tolerant databases enable trustless digital infrastructure.
A key advantage of trustless digital infrastructure is that, for the first time, it establishes inviolable rules for how an application will behave. Because no centralized actor can subvert, change, or otherwise disrupt the infrastructure, users can confidently rely on that infrastructure without the de-platforming and high rent-seeking risks in centralized systems. For example, the idOS Network uses Kwil to store user data across a multi-node, multi-party, BFT database network. The idOS’s top consumers—regulated DeFi protocols such as Gnosis Pay—require a stable and reliable identity protocol, not an identity company, to identify and KYC its users. Using a centralized identity company would mean that Gnosis Pay could not make cryptographic guarantees about how its entire protocol would function; the identity company could suddenly disrupt the protocol. Using an identity protocol means that Gnosis Pay has an inviolable guarantee for how its user onboarding and KYC will behave, and it can build its platform on the premise that those guarantees will not change. For the idOS, using a byzantine fault tolerant database allows it to make these inviolable guarantees needed to power regulated DeFi protocols.
A “Byzantine Fault Tolerant Postgres” database is the unlock for trustless digital infrastructure, enabling a freer, fairer, and more resilient internet. As throughput, latency, and generalized computing capabilities continue to accelerate, trustless technologies like a BFT database will be able to reshape the ways we think about and require trust in technology.
Does building with a BFT database sound interesting to you? Kwil’s binaries are available and free to use, and we are excited to go open source with our v0.7 release next month. Anyone can start a multi-node, BFT database network with just four terminal commands.
Although the current Kwil release (v0.6) is based on the simpler, less feature-rich SQLite database engine, the upcoming release (v0.7) uses the PostgreSQL engine.
Questions or comments? Disagree with this article? Join our Discord, or reach out to [email protected]. As always, we welcome community feedback and the opportunity to push decentralized technologies collaboratively into the future.
[1] Image source: https://www.cl.cam.ac.uk/teaching/2122/ConcDisSys/dist-sys-notes.pdf
[2] Image source: https://www.cl.cam.ac.uk/teaching/2122/ConcDisSys/dist-sys-notes.pdf