Cosmos without Tendermint: Exploring Narwhal and Bullshark

07.29.2022 | joachimneu, Georgios Konstantopoulos, andrewkirillov

As more and more blockchain systems get deployed to production, two problems are frequently encountered:

  1. Achieving consensus with high throughput and low latency
  2. Building a distributed application on top of that consensus

One system which addresses these two problems is Cosmos. Cosmos uses Tendermint, a high-performance BFT consensus algorithm, and the Cosmos SDK, a toolkit which enables developers to launch their own proof-of-stake blockchain on top of Tendermint.

However, Tendermint was conceived years ago, and researchers have made great strides on BFT consensus since then.

Many of us at Paradigm have been excited about the latest developments in high-throughput, low-latency consensus using directed acyclic graphs (DAGs), in particular the Narwhal mempool and the Tusk and Bullshark consensus algorithms. (Check out these blog posts for a gentler introduction to DAG-based consensus protocols.)

Narwhal/Bullshark (N/B) promises higher transaction throughput as well as responsiveness: the confirmation latency of N/B is a function of the actual network delay, rather than of the network delay upper bound ∆ assumed under eventual synchrony. In contrast, Tendermint is not responsive, and consequently its latency is bottlenecked by the pessimistic delay bound ∆. Because of how the mempool (Narwhal) and consensus (Bullshark) interact, N/B’s performance also suffers less from faulty or malicious behavior, or from network hiccups, than that of other leader-based consensus protocols.

Given the above advantages, we thought to ourselves: Can we replace Tendermint with Narwhal & Bullshark (N/B), while targeting compatibility with the Cosmos SDK stack?

At a recent two-day internal hackathon, we built a proof-of-concept which achieved that! For this purpose, we attached to the N/B codebase a small shim that allows it to “speak the language” (APIs) of Cosmos clients on the one hand and Cosmos applications on the other. That was enough to port over some of the simple decentralized application code examples given in the Cosmos documentation to use N/B instead of Tendermint.

You can find the proof-of-concept code here: https://github.com/gakonst/narwhal-abci-evm

How Does The Cosmos Stack Work?

To understand what exactly we did, let’s take a look at the anatomy of a Cosmos node:

A Cosmos node consists of an instance of Tendermint Core (TC) for everything related to consensus, and an instance of the application consisting of the useful logic whose execution we want to decentralize. TC offers RPC endpoints to end-user clients (e.g., for submitting transactions, or for querying the application’s state).

TC talks to the local instance of the application state machine via the Application Blockchain Interface (ABCI). For the purposes of ABCI, TC acts as a client which initiates requests, and the application acts as a server which replies with responses. A simple request/response pair is “Query”. TC uses “Query” to forward to the application any end-user queries about the application state received via the RPC.
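To make the client/server roles concrete, here is a minimal Python sketch of the “Query” round trip. It is an illustration only and not the real ABCI wire format: the `QueryRequest`/`QueryResponse` types, the `/balance` path, and the `KVApp` class are hypothetical and heavily simplified.

```python
from dataclasses import dataclass, field


@dataclass
class QueryRequest:
    path: str  # hypothetical: which part of the state to read
    data: str  # hypothetical: the key to look up


@dataclass
class QueryResponse:
    code: int  # 0 means success in this sketch
    value: str


@dataclass
class KVApp:
    """Toy application playing the ABCI 'server' role."""
    state: dict = field(default_factory=dict)

    def query(self, req: QueryRequest) -> QueryResponse:
        if req.path == "/balance" and req.data in self.state:
            return QueryResponse(code=0, value=str(self.state[req.data]))
        return QueryResponse(code=1, value="")


def handle_rpc_query(app: KVApp, path: str, data: str) -> str:
    """Consensus-engine side: forward an end-user RPC query to the app."""
    resp = app.query(QueryRequest(path=path, data=data))
    return resp.value if resp.code == 0 else "not found"


app = KVApp(state={"alice": 15})
print(handle_rpc_query(app, "/balance", "alice"))
print(handle_rpc_query(app, "/balance", "bob"))
```

The consensus engine never interprets the state itself; it only relays the request and the application’s answer.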

Other important request/response pairs “deliver” the ledger of transactions that consensus has been reached upon to the application, where they are used as inputs to drive the application’s state machine. In particular, when a new block is confirmed in consensus, “BeginBlock” is called with block metadata, followed by “DeliverTx” for each transaction in the block, “EndBlock” again with block metadata, and “Commit” to persist the resulting state. Note that since all Tendermint instances reach consensus on the transaction ledger and thereby on the sequence of ABCI calls to the application, the application state machine gets replicated in lockstep across all nodes of the network.
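The call sequence described above can be sketched as follows. This is a toy model under stated assumptions: `CounterApp`, its transaction format, and the hash used as a stand-in for a state root are all made up for illustration. The point is only that replicas fed the same ledger through the same sequence of calls end up in the same state.

```python
import hashlib
import json


class CounterApp:
    """Toy deterministic state machine: each tx adds an amount to a named counter."""

    def __init__(self):
        self.state = {}
        self.pending = {}
        self.calls = []

    def begin_block(self, height):
        self.calls.append("BeginBlock")
        self.pending = dict(self.state)  # work on a copy until Commit

    def deliver_tx(self, tx):
        self.calls.append("DeliverTx")
        name, amount = tx
        self.pending[name] = self.pending.get(name, 0) + amount

    def end_block(self, height):
        self.calls.append("EndBlock")

    def commit(self):
        self.calls.append("Commit")
        self.state = self.pending
        # Hash of the canonical state, standing in for a state root.
        encoded = json.dumps(self.state, sort_keys=True).encode()
        return hashlib.sha256(encoded).hexdigest()


def deliver_block(app, height, txs):
    """Drive one confirmed block through the application."""
    app.begin_block(height)
    for tx in txs:
        app.deliver_tx(tx)
    app.end_block(height)
    return app.commit()


# The agreed-upon ledger: two blocks of transactions.
ledger = [[("a", 1), ("b", 2)], [("a", 5)]]

replicas = [CounterApp() for _ in range(4)]
roots = []
for replica in replicas:
    for height, txs in enumerate(ledger, start=1):
        root = deliver_block(replica, height, txs)
    roots.append(root)

print(len(set(roots)) == 1)  # all replicas agree on the final state hash
```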

What We Did

A natural point to hook into this stack was thus to remove TC, and replace it with N/B, augmented with a shim that both provides an RPC endpoint to clients, and delivers the consensus ledger via ABCI to the application.
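As a rough illustration of the shim’s two roles (this is not the proof-of-concept’s actual code), the Python sketch below wires a stand-in consensus engine to a toy application. `FakeConsensus`, `EchoApp`, `Shim`, and all method names are hypothetical. One assumption worth flagging: the sketch has the shim synthesize block heights itself, since the ordered output of a DAG-based protocol does not necessarily come with Tendermint-style block metadata.

```python
class FakeConsensus:
    """Stand-in for N/B: buffers transactions and emits them in one agreed order."""

    def __init__(self):
        self.buffer = []
        self.subscribers = []

    def submit(self, tx):
        self.buffer.append(tx)

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def emit_ordered_batch(self):
        batch, self.buffer = self.buffer, []
        for callback in self.subscribers:
            callback(batch)


class EchoApp:
    """Toy ABCI-style application that stores every delivered transaction."""

    def __init__(self):
        self.committed = []
        self.staged = []
        self.heights = []

    def begin_block(self, height):
        self.heights.append(height)
        self.staged = []

    def deliver_tx(self, tx):
        self.staged.append(tx)

    def end_block(self, height):
        pass

    def commit(self):
        self.committed.extend(self.staged)

    def query(self, index):
        return self.committed[index] if index < len(self.committed) else None


class Shim:
    def __init__(self, consensus, app):
        self.consensus = consensus
        self.app = app
        self.height = 0  # synthesized by the shim (assumption of this sketch)
        consensus.subscribe(self.on_ordered_batch)

    # Client-facing side: what an RPC endpoint would call.
    def broadcast_tx(self, tx):
        self.consensus.submit(tx)

    def abci_query(self, index):
        return self.app.query(index)

    # Application-facing side: replay ordered output as ABCI-style calls.
    def on_ordered_batch(self, batch):
        self.height += 1
        self.app.begin_block(self.height)
        for tx in batch:
            self.app.deliver_tx(tx)
        self.app.end_block(self.height)
        self.app.commit()


consensus = FakeConsensus()
app = EchoApp()
shim = Shim(consensus, app)
shim.broadcast_tx("tx-1")
shim.broadcast_tx("tx-2")
consensus.emit_ordered_batch()
print(shim.abci_query(0), shim.abci_query(1), app.heights)
```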

Indeed, after about two days of hacking, we were able to run a simple ABCI app consisting of an EVM execution environment on top of N/B, issuing transactions and querying their outcome via the Tendermint Core-style RPC exposed by the shim:

The demo consensus network is run by four nodes (each running on localhost), whose RPC endpoints are reachable on TCP ports 3002, 3009, 3016, and 3023, respectively. There are three accounts, Alice (initially 1.5 ETH), Bob (initially 0 ETH), and Charlie (initially 0 ETH). Alice attempts a double spend, sending 1 ETH each to Bob and Charlie in two different transactions that are submitted to the nodes at ports 3009 and 3016, respectively. Since Alice holds only 1.5 ETH, at most one of the two transactions can succeed. Eventually, nodes reach consensus on which transaction gets executed in Foundry’s EVM, and the application state is updated in lockstep across all nodes. The update is reflected in subsequent balance queries.
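The outcome of the demo can be reproduced with a few lines of Python. This is a simplified model, not the EVM: it ignores gas and signatures, tracks balances in wei as plain integers, and the particular `agreed_order` is a hypothetical consensus output chosen for illustration.

```python
ETH = 10**18  # wei per ETH


def apply_transfer(balances, sender, recipient, amount):
    """Apply a transfer if the sender can afford it; report success."""
    if balances[sender] < amount:
        return False
    balances[sender] -= amount
    balances[recipient] += amount
    return True


def run_node(ordered_txs):
    """Execute the agreed transaction order from the demo's initial state."""
    balances = {"alice": 3 * ETH // 2, "bob": 0, "charlie": 0}
    results = [apply_transfer(balances, *tx) for tx in ordered_txs]
    return balances, results


tx_to_bob = ("alice", "bob", 1 * ETH)
tx_to_charlie = ("alice", "charlie", 1 * ETH)

# Hypothetical consensus output: Charlie's transaction happens to be ordered first.
agreed_order = [tx_to_charlie, tx_to_bob]

# Four nodes execute the same order and therefore reach the same state.
nodes = [run_node(agreed_order) for _ in range(4)]
balances, results = nodes[0]
print(results)
print({name: wei / ETH for name, wei in balances.items()})
```

Whichever transfer consensus orders first succeeds; the second fails the balance check on every node, so no node ever disagrees about who was paid.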

Conclusion

We built a prototype Cosmos/ABCI application that used Narwhal/Bullshark as the consensus algorithm instead of Tendermint.

In that process, we learned that ABCI is quite Tendermint-specific, despite its aspiration to be more generic. For instance, it assumes a simple blockchain structure, with certain metadata present in the block headers (e.g., the previous block’s state root). The latest consensus protocols, however, whether they are based on multiple parallel chains or DAGs, no longer fit this simple mold.

To move beyond the proof-of-concept stage, more would need to be done:

  • Benchmarking and optimizing performance. While we implemented a proof of concept which successfully shows EVM transactions being delivered to and executed in the application, we did not fully benchmark the system (e.g., to provide TPS numbers of EVM execution), or minimize overhead such as in the communication between consensus and application. We leave that as future work, looking to also support optimizations such as EVM parallelization to further improve throughput.
  • A turn-key testnet/L1 with high-performance consensus, EVM execution, and Ethereum JSON-RPC. The application we built uses just Foundry’s EVM, and does not support Ethereum JSON-RPC APIs. It’d be nice if we could instead integrate N/B with Foundry’s Anvil. Specifically, an RPC shim would redirect new transactions to N/B for ordering and proxy all remaining RPC calls to Anvil’s Ethereum JSON-RPC. The sequence of ordered transactions from N/B would be fed to Anvil (for instance via ABCI’s BeginBlock/DeliverTx/EndBlock/Commit) to drive (modified-to-be-deterministic) lockstep block production in Anvil, replicating the state of Anvil and its EVM across all participants. The result could be a simple turn-key high-performance testnet/L1 featuring N/B’s powerful consensus and Anvil/Ethereum/EVM’s expressive RPC and execution. Similar to the Cosmos stack seen earlier, in this tandem, all the networking logic is provided by the consensus layer, and Anvil can just act as the EVM runtime & state persistence layer.
  • Full Cosmos proof-of-concept. We built a scoped-down ABCI app where we only support ABCI consensus messaging. To support the full Cosmos SDK and showcase a full end-to-end integration, the rest of ABCI (and probably ABCI++) would need to be implemented, necessitating extending the ABCI/RPC shim (and potentially the N/B codebase) with functionalities such as validator set reconfiguration or light client support.
  • Improvements of the N/B implementation itself. For instance, the research codebase we have used does not provide the asynchronous fallback described in the Bullshark paper.
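The RPC shim proposed in the Anvil item above could route requests roughly as follows. This is a hedged sketch of future work, not something that exists in the proof-of-concept: `RpcRouter`, the two callbacks, and the `"accepted"` placeholder result are hypothetical (a real endpoint would return the transaction hash for `eth_sendRawTransaction`), and `fake_anvil` merely stands in for an HTTP proxy to Anvil.

```python
# JSON-RPC methods that carry new transactions and therefore need ordering.
ORDERED_METHODS = {"eth_sendRawTransaction"}


class RpcRouter:
    def __init__(self, submit_to_consensus, proxy_to_anvil):
        self.submit_to_consensus = submit_to_consensus
        self.proxy_to_anvil = proxy_to_anvil

    def handle(self, request):
        if request["method"] in ORDERED_METHODS:
            # New transactions go to consensus for ordering first.
            self.submit_to_consensus(request["params"][0])
            return {"jsonrpc": "2.0", "id": request["id"], "result": "accepted"}
        # Everything else (reads, filters, ...) is proxied unchanged.
        return self.proxy_to_anvil(request)


submitted = []


def fake_anvil(request):
    """Stand-in for forwarding the request to Anvil's JSON-RPC endpoint."""
    return {"jsonrpc": "2.0", "id": request["id"], "result": "0x0"}


router = RpcRouter(submitted.append, fake_anvil)

send = router.handle({
    "jsonrpc": "2.0", "id": 1,
    "method": "eth_sendRawTransaction", "params": ["0xdeadbeef"],
})
read = router.handle({
    "jsonrpc": "2.0", "id": 2,
    "method": "eth_getBalance",
    "params": ["0x0000000000000000000000000000000000000000", "latest"],
})
print(submitted, send["result"], read["result"])
```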

Acknowledgments: Special thanks to Zaki Manian, Lefteris Kokoris-Kogias, Matt Huang, and Dan Robinson for fruitful discussions and comments on an earlier draft of this post, and to Achal Srinivasan for the beautiful diagrams.


Disclaimer: This post is for general information purposes only. It does not constitute investment advice or a recommendation or solicitation to buy or sell any investment and should not be used in the evaluation of the merits of making any investment decision. It should not be relied upon for accounting, legal or tax advice or investment recommendations. This post reflects the current opinions of the authors and is not made on behalf of Paradigm or its affiliates and does not necessarily reflect the opinions of Paradigm, its affiliates or individuals associated with Paradigm. The opinions reflected herein are subject to change without being updated.
