RESOURCE CENTER
Highway: Efficient Consensus with Flexible Finality
January 19, 2021 | We propose Highway, a new consensus protocol that is safe and live in the classical partially synchronous BFT model, while at the same time offering practical improvements over existing solutions.
Daniel Kane1, Andreas Fackler, Adam Gągol3, and Damian Straszak
Computer Science and Engineering Department, UC San Diego
CasperLabs AG | Cardinal Cryptography
Abstract
There has been recently a lot of progress in designing efficient partially synchronous BFT consensus protocols that are meant to serve as core consensus engines for Proof of Stake blockchain systems. While the state-of-the-art solutions attain virtually optimal performance under this theoretical model, there is still room for improvement, as several practical aspects of such systems are not captured by this model. Most notably, during regular execution, due to financial incentives in such systems, one expects an overwhelming fraction of nodes to honestly follow the protocol rules and only few of them to be faulty, most likely due to temporary network issues. Intuitively, the fact that almost all nodes behave honestly should result in stronger confidence in blocks finalized in such periods, however it is not the case under the classical model, where finality is binary.
We propose Highway, a new consensus protocol that is safe and live in the classical par- tially synchronous BFT model, while at the same time offering practical improvements over existing solutions. Specifically, block finality in Highway is not binary but is expressed by fraction of nodes that would need to break the protocol rules in order for a block to be re- verted. During periods of honest participation finality of blocks might reach well beyond 1∕3 (as what would be the maximum for classical protocols), up to even 1 (complete certainty). Having finality defined this way, Highway offers flexibility with respect to the configuration of security thresholds among nodes running the protocol, allowing nodes with lower thresh- olds to reach finality faster than the ones requiring higher levels of confidence.
1. Introduction
Since the introduction of Bitcoin [Nak08] and the concept of a decentralized, tamperproof database – a blockchain – a number of different paradigms have been developed to design such databases. Recently, the idea of building such systems based on PoS (Proof of Stake) has gained significant popularity. While in the original PoW (Proof of Work, as used in Bitcoin) mechanism that is used for incentivizing participation and securing the system, the voting power of a participant is proportional to the amount of computational power possessed, in PoS the voting power is pro- portional to the amount of tokens (digital currency specific to this system). A popular choice in such systems is then to periodically delegate a fixed size committee of participants which then is responsible for running the consensus on which blocks to add to the blockchain. This way of building a blockchain has two substantial advantages over vanilla PoW systems such as Bitcoin:
1) it allows to run one of the classical permissioned consensus protocols that have been developed over the last 4 decades,
2) it allows to not only reward nodes for participation but also penalize misbehavior, by slashing security deposits of the offending committee members.
There has been recently tremendous progress in the design of permissioned consensus protocols that can be used as core engines in such PoS blockchains [AMN+19, BKM18, BG17, CS20, GLSS19, GAG+19, YMR+19, ZRAP18]. A vast majority of them are designed in the partially synchronous BFT model [DLS88] which asserts that communication between nodes becomes eventually synchronous and that no more than a given fraction of nodes, say 1∕3 (which is optimal in this model), are dishonest and may violate the protocol in an arbitrary way. State-of-the-art protocols such as Hotstuff [YMR+19], Tendermint [BG17] and Streamlet [CS20] come close to optimality with respect to bandwith, latency of finalization and, also importantly, simplicity. However, there are several practical properties of such blockchain systems that are not captured by this classical model, and consequently, significant room for improvement remains. One such important aspect is that the partition of nodes into honest and Byzantine might not accurately reflect their true attitude. In fact, according to the model, even “honest” nodes that have missed several protocol messages because of a DDoS attack or even a temporary network failure, are considered Byzantine. In a situation where more than 1∕3 of nodes suffered (even for a few seconds) from such a network issue, protocols in the classical BFT model are not guaranteed to function properly.
On the other hand, besides these occasional offline periods, it is fair to assume that in a real-world system an overwhelming fraction, if not all, of the nodes honestly follow the protocol rules. This is a consequence of the financial incentives for honest participation. Indeed, it is in the best interest of committee members to make sure they actively participate in the consensus protocol, as they are paid a salary for honest work and are penalized for being offline or not contributing enough to the protocol progress. In fact, because of penalties for protocol offences, it is highly unlikely that an adversary tries an attack which is not guaranteed to succeed, as otherwise it risks significant losses. Therefore, with the only exception of large-scale, coordinated attacks that are intended to bring down the whole system, one should always expect almost all nodes to behave honestly.
Motivated by this realization there have been several works that design protocols which are safe in the classical sense while at the same time trying to offer better guarantees in “typical” scenarios. In this paper we propose a new protocol – Highway – that contributes to this line of work. The security of Highway is still formalized on grounds of the partially synchronous BFT model, thus in particular it achieves safety and liveness in the most demanding setting when 1∕3 of all nodes are Byzantine. However, on top of that, Highway offers the following two features that make it particularly attractive in real-world deployments. First of all, in periods of honest participation of a large fraction of nodes, it allows to reach finality of blocks with “confidence” much higher than the typical threshold of 1∕3. To give an example, if a block reaches finality confidence of 0.8 (which is possible in Highway) then at least 80% of the nodes would need to violate the protocol in order to revert the block from the chain. This stands in contrast with the classical notion of finalization that is binary: either a block is finalized (this means finality confidence of 1∕3) or it is not. The second practical improvement in Highway is that it achieves flexibility akin to the notion defined in [MNR19]. The nodes participating in Highway might be configured with different security trade-offs between the allowed number of Byzantine and crashing nodes (nodes that might go offline but are otherwise honest) in the protocol. Flexibility then means that despite these differences in configuration, all the nodes run a single version of the protocol and perform the same actions, only the finality decisions they make depend on the chosen parameters. A practical consequence is that nodes with lower security thresholds might reach finality much faster than nodes with higher thresholds, but as long as both these nodes’ assumptions are satisfied they finalize the same blocks and stay in agreement.
Technically, Highway can be categorized as a DAG-based protocol [Bai16, GLSS19, MMS99, ZRAP18], in which nodes jointly maintain a common history of protocol messages, forming a directed acyclic graph representing the causality order. In its design, Highway derives from the CBC-Casper approach [ZRAP18] and significantly improves upon it by the use of a new finality mechanism, message creation schedule and spam prevention mechanism. We believe that the conceptual simplicity of DAG-based protocols along with the desirable practical features of the Highway protocol make it a solid choice for a consensus engine in a Proof of Stake-based blockchain.
2. Our Results
2.1 Model
We consider a system with a fixed set of n validators, each of them equipped with a public key that is known to other nodes. This model matches the scenario of “permissioned blockchain”, but the protocol can be applied to semi-permissionless scenario as well by rotating the set of validators. Our model makes the following assumptions:
- (Reliable point-to-point communication) We assume that channels do not drop messages and all messages in the protocol are authenticated by digital signature of the sender.
- (Partially synchronous network) There exists a publicly known bound Δ and an unknown Global Stabilization Time (GST) so that after GST, whenever a validator sends a message, it reaches the recipient within time Δ. Additionally, we assume that validators have bounded clock drift1. Such version of partial synchrony is known as a known Δ flavour, for the discussion on the version without publicly known Δ, see Subsection 6.3.
- (Byzantine faults) We assume that f out of n validators are under total control of an ad- versary, and hence can arbitrarily deviate from the protocol. We do not make a global as- sumption on the relation between f and n, as safety and liveness require different bounds, and the latter have an interaction with number of crashing nodes as well.
- (Crashing faults) We assume that c out of n nodes can become permanently unresponsive at some point in the protocol
2.2 Consensus in the Context of Blockchain
In a typical blockchain system, validators are tasked with performing an iterative consensus on an ever-growing chain of transactions that they receive from the external environment. In the process, they enclose transactions into blocks forming a blockchain, in which each block refers to its predecessor by hash. The first one, the genesis block, is part of the protocol definition.
In Highway, as the set of validators is either constant or subject only to very controlled changes (between different eras, see Subsection 4.2), specific validators are directly appointed to construct a block in a given time slot. They do so by enclosing transactions from their local queue and hash of the block that they believe should be the predecessor.
As it may happen that a validator does not refer to the last constructed block (either intentionally, or due to a network failure), the set of blocks is a tree, with a unique path leading from each block to the root: the genesis block G. The main goal of the consensus protocol in such a scenario is to choose a single branch from such a tree. For a block B, we refer to all the blocks that are on the other branches as competing with B, as if any of them would be chosen, B could not.
2.3 Practical Challenges
Strong optimistic finality. While since the initial definition of the partially synchronous model by Dwork, Lynch and Stockmeyer [DLS88] a vast body of research was created to optimize vari- ous parameters of protocols in this setting, most of it was written under the semi-formal assump- tion that the existence of more than n∕3 dishonest nodes predates the existence of any provable guarantees for such protocols. Such an assumption stems from the fact that, as proven in the orig- inal paper, it is not possible for a partially synchronous protocol to guarantee both liveness and finality if n < 3f + 1.
In spite of this negative result, it is however possible to provide additional finality guarantees in the scenario where for a prolonged period even dishonest nodes do not deviate from the protocol and actively work to finalize blocks. Although it may not seem like an important observation from the perspective of classical security models, we stress that it carries significant practical consequences – in most cases of blockchain deployments it may be assumed that most of the time basically all the nodes will actively collaborate to achieve consensus2. It is hence possible to provide much stronger finality guarantees for blocks created during such periods, so to revert them much more than n∕3 validators would have to collude.
Flexibility. Once the classical n < 3f + 1 bound is left aside, a natural tradeoff between finality and liveness occurs – the stronger finality guarantee we require, the more validators need to honestly collaborate to finalize the block with such a guarantee. The tradeoff could be resolved by all the validators agreeing on a common finality threshold that they intend to use, but such solution would have significant limitations.
In Highway, however, there is no need to agree upon a common threshold, so every validator is able to use a different one, or even several different thresholds. Besides eliminating the need to make an additional consensus on this particular hyperparameter, one important implication of such feature is that it allows validators to play slightly different roles in the ecosystem – for example some validators may deal mainly with finalizing relatively small transactions, in which case small latency is more important than very high security (and, as will become apparent after the protocol is presented, reaching higher thresholds usually takes more time), while others can prioritize safety over latency3 .
2.4 Our Contribution
We present Highway - a consensus protocol achieving strong optimistic finality that is flexible by allowing validators to use different confidence thresholds to convince themselves that a given block is “finalized” (both confidence threshold and finality will be properly defined in Subsection 3.3).
Unless some validators actively deviate from the protocol, the finality of a block can only increase for a given validator, which intuitively corresponds to the ever-increasing number of confirma- tions for a block in PoW scenario. However, unlike in PoW, in Highway the confidence levels for a given block can be directly interpreted as the number of validators that would need to misbehave in order to reverse such a block, what we formalize as the following theorem:
Theorem 1 (Finality). If an honest validator reaches finality with confidence threshold t ≥ f for a given valid block B, then no honest validator will ever reach finality with confidence threshold t for a block competing with B.
Note that while due to the aforementioned impossibility result [DLS88] it is not possible to prove liveness for confidence threshold n and higher, in practice the vast majority of validators will not deviate from the protocol most of the times. In such times, arbitrarily high confidence thresholds can be reached, which makes block constructed during such periods virtually impossible to revert.
Next, we provide a bound on the number of honest validators needed to guarantee that the protocol will continue finalizing blocks with given confidence threshold. We note that it is in line with the classical n ≥ 3f + 1 bound with the added notion of “crashing faults”, denoted by c, which disrupts the consensus process, but not as much as the Byzantine ones.
Theorem 2 (Liveness). For every confidence threshold 0 ≤ t < n, if f ≤ t and c < n−3t, then 3 2 the chain of blocks finalized with confidence t grows indefinitely for each honest validator.
2.5 Related Work
The line of work on partially synchronous protocols was initiated with the seminal work of Dwork, Lynch and Stockmeyer [DLS88], and gained popularity with the introduction of PBFT[CL99] protocol and its numerous versions [BKM18, KAD+09, MNR19]. Classically, protocols in this model attain resilience against less than n∕3 malicious nodes, due to a known bound stating that is it not possible to provide both safety and liveness with higher number of Byzantine faults [DLS88]. Some of the works however explore the concept of “flexibility” understood usually as providing the strongest n∕3 security in the general partially synchronous model, and additional guarantees in case some additional conditions, such as network synchronicity or limited adver- sarial behavior, are met.
Gasper[BHK+20], the newly proposed candidate for Ethereum 2.0 consensus mechanism, ana- lyzed the case of additional guarantees in case the network satisfies synchronicity assumptions. Later very similar considerations got a proper formal treatment with the introduction of the snap- and-chat family of protocols[NTT20]. The snap-and-chat protocols define two “levels of finality” (formalized in the paper as two ledgers, one being an extension of another). The first one, faster, relies on the classical partially synchronous assumptions and is guaranteed to be live and safe as long as less than n∕3 nodes are faulty. The second level provides a stronger n∕2 resilience against adversarial nodes, but is live only as long as the network is synchronous. As in practical deployment assuming network synchronicity requires rather pessimistic assumptions about net- work latency, the second finality can be assumed to progress significantly slower. The provided construction is very modular and allows to use a wide variety of protocols to provide for the first and second finality levels, and the consistency between both levels is guaranteed.
Protocol that carries perhaps the most similarities with Highway when it comes to achieved se- curity guarantees is Flexible BFT[MNR19]. It defines the new type of faulty node, the alive-but- corrupt node, that fully cooperates with honest nodes unless it is able to perform a successful attack on protocol safety. Intuitively, it models the practical situation in which nodes without an explicit incentive do not cheat, as that would mean loosing rewards in the PoS system. Similarly as in Highway, the protocol is able to tolerate much more adversarial nodes if they do not aim to merely break the liveness, but are interested only in breaking safety - the bounds match the bounds in our paper. Flexible BFT also introduces, as the name suggests, certain flexibility for the nodes when it comes to choosing the parameters related to the finality - each of the nodes can have independent assumptions about number of faulty nodes of each kind, and it is guaranteed that two honest nodes with correct assumptions can’t finalize competing blocks, and if all nodes have correct assumptions, the protocol will continue making progress. The biggest conceptual difference is that in Flexible BFT, assumptions held by nodes explicitly influence their behavior in the protocol, while in Highway the assumed confidence threshold influences only the local computations performed on the DAG whose form is not influenced by specific choices of confi- dence thresholds made by validators. There are two main consequences of this difference. First, in Highway validators can update their confidence thresholds and they are able to recompute fi- nality of all of the blocks without the need of communicating with other validators, while in case of Flexible BFT that would require rerunning the whole protocol. Second, perhaps even more importantly, Flexible BFT can stall for everyone in case some of the honest validators incorrectly assume the number of faulty parties. In contrast, in Highway unit production never stalls, and the consensus may stall from the perspective of a given validator only if that specific validator made incorrect assumptions on the number of faulty parties.
To illustrate this difference, consider the scenario in which there is a big group of overly-conservative honest nodes incorrectly assuming that 90% of the nodes are honest – in such scenario in Flexible BFT even less conservative honest nodes will not be able to finalize blocks, while less conserva- tive honest validators in Highway will not be influenced by such a choice of other validators, as it does not influence the communication between them in any way – in fact, validators doesn’t even have explicit means of checking confidence thresholds chosen by the others.
3 Protocol
We denote by G the “genesis block” of the blockchain, which is considered part of the protocol definition. Except the genesis block, every other block B consists of a reference to its parent, denoted prev(B), and the content of the block – typically a list of transactions. The parent refer- ence in B is realized by including the hash of prev(B) in B and thus there cannot be any cycles in the block graph. We also denote by next(B) the set of all blocks for which B is the parent. We recursively define the height H of a block: the height H(G) of the genesis block is 0 and H(B) = 1 +H(prev(B)) for any other block B. We say that a block B1 is a descendant of B2 and write B2 ≤ B1 in case when one can reach B1 from B2 by following parent links (in particular H(B2) ≤ H(B1)...
CONTINUE READING
Download White Paper PDF >
DOWNLOAD NOW