The EVM’s ability to run computations on-chain has a major weakness: the cost of running code is often too great. Because many of the use-cases for this kind of public computation involve computationally expensive operations, we’ve seen a rapid rise in the use of systems that aim to alleviate those costs.

One such class of solutions is Zero-Knowledge Proof (ZK, ZKP) systems, which give us the means to ship the expensive computation off to a bare-metal server while letting us reassure an on-chain user that we did the computation correctly. As long as checking that proof is cheaper—be it in terms of memory usage or computation time—than computing the result directly on-chain, such systems are a boon to those with computationally-intensive use-cases.

Despite its utility in such situations, the state of ZKP tooling is woefully insufficient. Traditional programming has tools developed over many years to make our lives easier, but these novel uses of ZKP see the ecosystem under-equipped to deal with any problems that might arise.

While these uses of ZKP are crucial to many blockchain-based systems, there are no widely-available tools to ensure that such uses are correct and bug-free. Despite this, more and more systems are pushing critical paths off-chain using this technique, which puts the ecosystem as a whole in an awkward position.

The solution to all this is Formal Verification (FV), a technique that lets us prove that our ZK circuits work as they should on an ongoing basis.

Recently, Reilabs worked with Worldcoin to design, develop, and then later formally verify the key components of the World ID Protocol. We focused in particular on the Semaphore Merkle Tree Batcher—also known as Semaphore MTB—which is a service responsible for creating proofs for two key ZK circuits that are then verified on chain. These circuits ensure the correct insertion and removal of user identities from World ID’s public user record, and are hence extremely important.

Zero Knowledge Proving

If you’ve not run into the term before, a Zero-Knowledge proof system is a cryptosystem that aims to provide its users with two main features.

Confidentiality: This means that one of the entities in the cryptographic exchange (the one that we call the prover) is able to demonstrate to the other entity (that we call the verifier) that a given statement is true. The key is that the prover wants to be able to do this without revealing any more information to the verifier than the fact that the statement is true!

Conciseness: If the data and computations required to verify a proof of a claim are as big as just doing the computation yourself, there’s no point to this at all. To that end, these proof systems try to ensure that the size of the proof is only a fraction of the size of the data required to perform the computation oneself. At the same time, they also want to ensure that the process of checking that proof by the verifier is cheaper than just doing the whole thing on chain.

These expensive computations get written in the ZK system as constructs called circuits, which take a set of inputs. These inputs have restrictions and transformations imposed on them by the circuit definition, and we call these restrictions constraints.
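As a toy illustration—this is ordinary Python, not a real proof system or the Gnark API—a circuit can be thought of as a predicate over a witness, built from exactly such constraints:

```python
# A "circuit" for the claim "I know x such that x * x == y", modeled as
# a set of constraints over a witness. A real ZK system compiles such
# constraints into an algebraic form; here we just evaluate them.

def square_circuit(witness):
    """Return True only if the witness satisfies every constraint."""
    x, y = witness["x"], witness["y"]
    constraints = [
        x * x == y,      # the arithmetic relation being proven
        0 <= x < 2**16,  # a range constraint restricting the input
    ]
    return all(constraints)

print(square_circuit({"x": 12, "y": 144}))  # True: all constraints hold
print(square_circuit({"x": 12, "y": 145}))  # False: relation violated
```

The range constraint on the second line is the kind of restriction that is easy to forget, which matters later when we talk about missing constraints.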

Semaphore MTB uses circuits written using the very flexible Gnark library. As a very well-designed ZK proof system—or, more accurately, a few proof systems in one—Gnark made it very simple to design and develop the core circuits for the protocol. One proves that a set of identities was correctly inserted into a tree, while the other proves that identities were properly removed from that same tree.
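To make the Merkle-tree framing concrete, here is a toy sketch of what an insertion does to the tree’s root (using SHA-256 purely as a stand-in for the circuit-friendly hash functions such systems actually use):

```python
import hashlib

def h(left, right):
    # Stand-in hash; real circuits use circuit-friendly hashes instead.
    return hashlib.sha256(left + right).digest()

def root(leaves):
    """Merkle root of a power-of-two-sized list of leaf values."""
    level = leaves
    while len(level) > 1:
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

leaves = [bytes([i]) * 32 for i in range(4)]
old_root = root(leaves)

leaves[2] = b"\xaa" * 32  # "insert" an identity at index 2
new_root = root(leaves)

# An insertion circuit's job is to prove that the (old_root, new_root)
# pair corresponds to exactly this one write and nothing else.
```

The circuit never sees the whole tree; it checks a Merkle path, which is what makes verification so much cheaper than recomputing the record on-chain.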

So What’s the Problem?

Despite fantastic libraries like Gnark, the tooling for working with ZK proofs as part of a larger protocol is still extremely immature, a problem compounded by the lack of industry expertise in working with such systems.

All told, this makes it very easy to get things wrong, which is a worrying state of affairs when the security and correctness of key protocols—like World ID—rely on those involved getting it right. Indeed, it is at a point where even those that write the tools and libraries can easily make mistakes; this vulnerability in Gnark, discovered recently by Reilabs, is one such example.

We’ve tended to find that vulnerabilities in ZK systems usually arise from missing constraints. In other words, our circuit—or the building blocks it is made from, which we call gates—allows inputs or outputs that shouldn’t be allowed. As a result it becomes possible to prove things that are not intended to be provable.

This becomes even more of a problem when we tie the ability to verify a proof to changing the state of the protocol. If we could, for example, prove that we inserted some identity into the tree of identities without actually doing that insertion, then we could break the soundness of World ID. There would be an identity that the system thinks exists, but doesn’t really, which puts the protocol into a state that its design does not and cannot account for, and leads to a compromise of the protocol.
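As a hypothetical sketch of how a missing constraint breaks soundness—this is illustrative Python, not the actual Semaphore MTB circuits—consider an insertion rule that forgets to require the target slot to be empty:

```python
# Toy model of an insertion circuit's constraints on a single tree slot.
# The intended rule: you may only write an identity into an EMPTY slot.

EMPTY = 0

def insertion_accepts_buggy(old_leaf, new_leaf):
    # Buggy constraint set: only checks the new value, forgetting to
    # require that the slot being written to was previously empty.
    return new_leaf != EMPTY

def insertion_accepts_fixed(old_leaf, new_leaf):
    # With the missing constraint added, overwriting an occupied slot
    # is no longer provable.
    return old_leaf == EMPTY and new_leaf != EMPTY

# Overwriting an existing identity (old_leaf = 42) should be rejected:
print(insertion_accepts_buggy(42, 7))  # True  — the bug: a proof would verify
print(insertion_accepts_fixed(42, 7))  # False — correctly rejected
```

In the buggy version a prover can convince the verifier of an "insertion" that silently destroys an existing identity, which is precisely the kind of state the protocol’s design cannot account for.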

Formal Verification to the Rescue

Traditional approaches to finding bugs in protocols that use ZK circuits have mostly amounted to manual audits. Someone sitting down to read both what the circuit should be doing and what the circuit actually does—and then trying to spot where the two differ—is unfortunately exceedingly error-prone.

Circuits and their definitions are exceedingly subtle. A successful audit requires not only an understanding of how the circuits should work and how they do work, but also an appreciation for the intricacies of the techniques used to define those circuits, and for how those circuits are used in the system as a whole. It is only through this understanding that confidence in such a system can be built.

What formal verification allows us to do is take the circuits, as they are actually defined, and import them into a formal setting. These formal representations are designed to be reasoned about using mathematical and logical systems, so we can take them and attempt to prove that all of the specified properties hold.

The process can be broadly broken down into four steps:

Extraction: We integrate with the system where the circuits are defined, and use automated tooling—such as the Gnark Lean Extractor for Semaphore MTB and its use of Gnark—to turn these circuits into formal definitions. These definitions then exist in Lean, our theorem-proving environment of choice, which means we’re now ready to reason about them.

Equivalence: The extracted representation is not identical to the circuits, as circuit definitions themselves are quite opaque to formal reasoning. Instead, the formal definitions are in an equivalent but not identical form to the originals. This means we can actually prove properties about them without resorting to logical gymnastics that might obscure the proving process.

Invariants: Once we have ensured the equivalence of these circuit representations, we can use mathematical and logical reasoning on them to construct theorems about what the circuits should do. The aim of this step is to make sure that we have a theorem for each of the required properties for those circuits; this way our prover will tell us if the circuits violate any of them.

Compositional Reasoning: We can use our theorems to reason about the correctness of circuits from the correctness of gates, but also the correctness of systems from the correctness of the circuits they depend on. We can prove theorems about how these circuits and their verification can change the system state, giving us far more confidence than if we only looked at the individual pieces.
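The invariants step can be sketched as a Lean theorem statement. All of the names below (`MerkleTree`, `Identity`, `InsertionCircuit`, and so on) are hypothetical stand-ins, not the actual extracted Semaphore MTB definitions:

```lean
-- Hypothetical sketch: if the extracted insertion circuit accepts a
-- witness relating the old root to newRoot, then newRoot really is the
-- root of the old tree with the identity inserted at the given index.
theorem insertion_is_sound
    (tree : MerkleTree) (idx : Nat) (id : Identity) (newRoot : Hash)
    (h : InsertionCircuit tree.root idx id newRoot) :
    newRoot = (tree.insert idx id).root := by
  sorry  -- the real proof is elaborated against the extracted definitions
```

A theorem of this shape is what lets the prover flag any circuit change that would make an unintended state transition provable.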

For World ID we were easily able to show that both the insertion and deletion circuits worked as they should; that means no pretending to insert identities or fake-deleting them. What’s even better, though, is that we were able to show that the whole system will always make progress. It can never get stuck in a state where identities cannot be inserted or removed, and thereby cannot be censored.

This is a hugely important result for such a large-scale protocol, and is a huge vote of confidence for World ID’s robustness.

The Benefits

These would be good results for a protocol like World ID even if they were a one-off, but one of the biggest benefits of formal verification is that it is an ongoing process.

Continuous Auditing: Where an audit is a one-off check that things work how they are meant to at the time of the audit, FV can go a step beyond that. Because we can extract the circuits on CI every time something changes, the theorems holding can be a prerequisite for merging new code. With a traditional auditing approach, you’d be looking at the cost of a new audit every time you changed even the smallest part of the system, but with FV you have a CI failure any time a change breaks one of those key system invariants.

Machine Checking: The more traditional audits you have, the more chance you have of an auditor making a mistake. With an approach based on formal verification, all of the properties we want to check are checked by a machine. We know that if the theorems hold, then those theorems are mathematically and logically sound; in other words, we know that the system behaves how it should.

Clear Specifications: One of the hardest parts of a traditional audit is getting to the bottom of exactly what a system or component should do. We think that formal verification is a great aid here as well, as once the theorems are proven you get a clear and concise specification of the intended behavior. This specification is unambiguous and machine-checkable, as well as existing alongside any assumptions that have been made, and we think it’s a great complement to manual auditing efforts.

All of the above benefits were realized for World ID and Semaphore MTB.

We proved theorems that match all of the requirements for the World ID system, and thereby showed that the system should work as intended and expected.

Those theorems were set up to be checked on CI after every single change to the code. This has made it impossible to unknowingly merge code that breaks the invariants of the protocol.

We also accompanied our formal work with a detailed report that describes both our approach to the protocol audit and the results of our work. This report serves as a fantastic jumping-off point for any future work, and also as an explanation of why readers should have confidence in World ID.

Formal Verification is Pretty Great

At Reilabs, we tend to think about this whole process as a huge vote of confidence for the integrity of any protocol that gets formally verified. This is true whether we find issues or not.

With World ID this is confidence not only in its current state, but in its evolution, as the “audit” process keeps going in the background. To us—and we would argue it should be the same for everybody else—such a clear result demonstrates that the protocol is well-engineered and can be trusted not only by the implementors at Worldcoin, but by the ecosystem as a whole.

If you’re interested in bringing our Formal Verification efforts to bear on your project, please contact us at hi@reilabs.io.

We hope you have enjoyed this post. If you’d like to stay up to date and receive emails when new content is published, subscribe here!
