March 7, 2023 – 10 min read – Security
Zondax explores a hybrid Polkadot host node implementation by analyzing specs and code, and by preparing a clear and detailed execution plan that takes possible risks and blockers into consideration.
This article summarizes Zondax's efforts and findings in evaluating the potential risks and impact of a project to create an alternate host node for the Polkadot network, using a hybrid and progressive approach. This initial pre-engineering phase focused on analyzing the current situation and preparing a clear and detailed execution plan that takes possible risks and blockers into consideration.
A Polkadot host provides the environment in which the runtime executes, and serves as a gateway for validating transactions and ensuring network security. Host nodes play a key role in maintaining the integrity of the network, coordinating with other nodes to validate transactions, and storing a complete copy of the blockchain.
The host node provides the underlying infrastructure for the runtime to execute, such as access to the network, access to the blockchain ledger, and access to the consensus protocol. The runtime, in turn, provides the necessary logic for the host node to operate, such as the rules for validating transactions and executing smart contracts.
The Polkadot network (and other related chains) has been successfully running for a few years. Though there are official specifications for the host node, a body of collective evolution and experience now exists beyond them. While the official specifications and test suite provided by the W3F are helpful, comparing them with the only host node in production (that of Parity Technologies) revealed opportunities for improvement, and we observed areas where additional documentation would be beneficial.
Regarding the specification tests, which may be run against the different implementations of Polkadot: these may not cover all requirements of a host node, as many aspects deep inside the APIs are difficult to test.
We inspected alternate host node implementations, to get some intuition for why they have not yet reached production level. Both of the considered host nodes -- Kagome and Gossamer -- are being written from scratch, based on the official specifications for the Polkadot host node.
For the reasons mentioned in the previous paragraph, we have reached the conclusion that relying solely on the official specifications may not be sufficient to build a production-level host node. Furthermore, given the fast pace of changes in the Parity Technologies Polkadot code-base, which often precedes the specifications, it can be challenging for projects building a host node from scratch to successfully adapt and keep up with Parity's rhythm.
We believe that this variance between Parity's implementation and the w3f specification partly explains why Kagome and Gossamer have not yet achieved production level quality.
This motivates our chosen methodology: we propose a hybrid and progressive approach that involves starting with the existing Parity host node (which is implemented in Rust) and gradually replacing substantial areas with new C++ implementations.
Such a modular approach presents less risk than starting from scratch on a new host node implementation, and will allow us to fully comprehend the role of each replaced component as we go along.
In a little more detail, we will start from a fork of the Parity host node implementation, and gradually replace crates with our own C++ implementations. To maintain a modular structure, we will create a separate repository for all of the developed C++ code. This repository will also contain Rust crates, which will act as wrappers for the C++ code. The fork will depend directly on these crates. Our goal is to make switching out the original implementation for our own as simple as changing the dependency from the original module to our crate.
There are two approaches we can take to achieve this structure: use -sys style crates, or use the cxx crate.
We chose cxx, as it reduces the complexity of the project, allowing us to focus on the actual node implementation rather than on the interface between the Parity node and our modules. Of course, even with the cxx crate, we need to write some code to connect the foreign function interface (FFI) with our fork of Substrate's node.
The mayon repository will contain all our C++ code, together with the Rust crate wrappers around it on which the forked node will depend. The goal of the mayon repository is to facilitate the integration of re-written modules into the existing node; the latter should simply depend on the corresponding crates within mayon in a seamless manner.
All crates associated with this project are located within the crates folder, with each sub-folder representing a distinct crate. Each crate is responsible for its own build process, though they should be relatively similar in structure. The core of each crate is a pair of files used in conjunction to enable seamless interoperability and to drive the build process for the crate and its C++ library.
There are two main reasons why C++ was chosen for re-writing the host node:
Lower-level language: C++ is a lower-level language than Rust and is closer to the hardware, which provides greater control over system resources. This can be helpful for fine-tuning and optimizing performance, and can lead to more efficient and faster code.
Larger user base: C++ has a large user base and has been around for a long time, which means there are many existing code libraries, tools, resources and examples to work with. This can save time and effort in development.
Overall, these factors make re-writing the host node in C++ the most appropriate choice.
Based on the knowledge gained during development, we plan to collaborate with the Web3 Foundation's spec team to improve Polkadot's specifications and bring them up to scratch. Our goal is to have specifications in place before changes are implemented in the Polkadot ecosystem. This will benefit future alternate host node implementations and help establish a truly decentralized ecosystem for Polkadot. High-quality specifications will also mitigate the risk of losing expertise when key members leave the Polkadot community.
Adopting a hybrid approach certainly comes with its own challenges. Our reliance on Parity's Polkadot host node implementation means that we must stay abreast of any changes made to their node. We therefore evaluated the implementation in terms of stability, code churn, number of dependencies, code readability, and features that may be difficult to translate from Rust to C++.
To objectively evaluate code stability, we used a tool called Hercules to gather statistics on the history of project files. This will allow us to prioritize the re-implementation in C++ of crates which are deemed stable, as these will require less maintenance work to keep up to date with Parity’s host node. This approach allows for a more efficient re-writing process.
To keep our C++/Rust implementation up to date with the evolution of Parity's node implementation, we plan to keep the code-base of our C++ changes separate from the fork itself. This separation will simplify the process of re-pointing the fork's dependencies to our C++ crates. Automated tools will be set up to inform us of any upstream changes. Once these tools are running smoothly, updating node components that have already been re-implemented should be a minor effort compared to the overall project.
The Polkadot host node heavily relies on external crates, particularly from Substrate, which poses risks to the project. Combining Rust's Cargo with third-party C++ build tooling can also result in compile-time errors. To address these issues, the project considers using Nix, a powerful package manager that ensures reproducible builds, creates isolated development environments, and supports multiple programming languages. A proof of concept was created using the crane and fenix libraries to package the Rust project and its dependencies.
As we embark on our project to create an alternate host node for the Polkadot network, we face numerous challenges, including complications stemming from our choice to use the C++ programming language.
Handling asynchrony in C++
One of the most pressing challenges is handling asynchrony. Unlike Rust, C++ has no built-in async/await model, which makes writing concurrent programs, and handling asynchronous calls between the fork of Parity's node in Rust and our C++ code, more difficult.
To call asynchronous C++ code from Rust, we must write the asynchronous code in C++, then use cxx to call the C++ functions that perform the asynchronous operations. The asynchrony will thus be handled at the level of C++ code.
We are exploring various libraries, such as Asio, Seastar, and cppcoro, to find the best fit for our needs. cppcoro seems to be the most ergonomic; its API design is quite similar to Rust's, as it uses monadic combinators to chain operations.
To assess interoperability between Rust async and C++ async paradigms, we decided to write a proof of concept, with asynchronous Rust tasks interacting with C++ tasks using Asio. The tasks use asynchronous channels to talk to each other, mirroring the current Polkadot design which uses channels to decouple subsystems.
Although using cxx to call asynchronous C++ code from Rust is possible, it is more challenging than using Rust's built-in async/await functionality. It requires effort and care to handle synchronization and communication between the Rust and C++ code correctly.
One of the subtle differences between Rust's asynchronous model and that of C++ is that C++ does not use a wake-up mechanism to inform the executor that a task is ready to be polled. Therefore, it will be necessary to add a layer between the two to coordinate and connect tasks. We note that there is an opinionated Rust crate called cxx-async, built on cxx, which aims to facilitate this process.
Translating Rust Macros to C++
In addition to the challenges posed by its extensive dependencies and high code churn, Parity's node also heavily relies on Rust macros, which adds overhead to the task of translating the project to C++.
Rust macros are expanded at compile-time, and while there is a tool in the Rust ecosystem (cargo-expand) to expand them into normal Rust code, this can result in complex code that may not be easily translated to C++.
To re-implement macros in C++, we will need to manually translate their logic and use constexpr functions and variables to perform compile-time code generation and manipulation. However, this process is time-consuming and error-prone, as we must ensure that the behavior of the macros is preserved in the C++ code.
We have demonstrated the effectiveness of our development process by creating a proof of concept, which involves substituting one of Parity's Rust modules with our C++ implementation.
We decided to start with the core-primitives and parachain primitives crates, as these are important for many Polkadot node crates.
The primitives are mostly re-exports of types defined in Substrate, but with concrete type bounds. This makes the primitives crates a good target for gaining insight into how the overall process will go.
We emphasize that our current re-write -- the core-primitives and parachain crates in mayon/crates -- mostly interfaces some types to C++, so that other C++ modules can import them. We believe it is better to implement an FFI layer which interfaces with the definitions of imported types. This implies that at some point we will need to re-write the most used Substrate modules and include them in our code base.
This proof of concept exposed various complications which we had not foreseen. Firstly, there are limitations in the cxx crate itself: it only supports simple enum types, and has no support for tuple structs or for custom derives on structs and enums that are defined as shareable. The lack of support for custom macro calls does not pose a significant challenge, but it does lead to repetitive code, as an intermediary layer is necessary to manage type conversion between the FFI type and the original type. Secondly, there are issues of API compatibility. Changes to one module should not have any impact on other Rust modules that use it through a clearly defined API. Yet even if a type in one module is meant to be fully shareable, it may need to derive custom traits; these custom traits could be the only thing other Polkadot components know about the type, so the type may need to be defined differently to accommodate them.
At the moment, our C++ code is compiled as part of the Rust compilation step. Attempting to compile the C++ code directly for testing will fail, as the cxx crate generates header files containing type definitions the C++ code requires. To ensure that unit testing the Rust code also exercises the C++ code, we will write our unit tests as Rust code that calls the Rust-C++ modules we want to test.
A similar approach is used in Substrate, and we believe it should work effectively. Precisely, we will have a Rust layer that interacts with C++, and it is this layer that our unit tests exercise. There may also be some C++ code that is not part of the Rust layer but is still used as a dependency of the interfaced C++ code. In that case, the C++ code can be tested directly using C++ unit testing, and a make rule can be created to run the tests. The C++ code in question may include core functions such as computing hashes, performing verification, and performing type conversions.
To test C++ modules, we believe doctest would be a good candidate testing library. It is a lightweight, fast, and flexible C++ testing framework that allows tests to be written directly alongside the source code, making it easy to keep tests next to the code under test in the same project.
As for integration testing, we currently have several proofs of concept in the repository. However, once we fork the paritytech/polkadot repository, it should be a simple matter of running the integration test suite defined in Parity's repository. For unit testing, we plan to use our already defined C++ testing framework and add unit tests while writing C++ code, since we expect to diverge from Parity's code-base as we move parts to C++.
In conclusion, rewriting the Polkadot host node in C++ using a hybrid approach is a challenging but exciting opportunity. Despite the complexity of the original project's dependencies and the difficulties of asynchronous programming, the team is confident of completing the project with careful planning and a modular approach. Access to the knowledge of the Web3 Foundation and the Parity developer community is a valuable resource in achieving this goal. The project requires a stable team with diverse skills and reliable communication channels. Zondax is confident in the project's potential and excited about its positive impact on the Polkadot ecosystem.
Mayon repository: https://github.com/Zondax/mayon/
Final report: https://github.com/Zondax/mayon/blob/main/docs/report/HybridHost_Zondax_Report.pdf