Back to all posts

Introducing Clean, a Formal Verification DSL for ZK Circuits in Lean4

We are really excited to share our initial steps towards building clean, an embedded DSL and formal verification framework for ZK circuits in Lean4. As we recently shared, Zero Knowledge circuits are full of bugs, but fortunately, techniques like formal verification can provide a huge confidence boost in the correctness of ZK circuits. Clean enables us to define circuits in Lean4, specify their desired properties, and -- most importantly -- formally prove them!
This work is part of the zkEVM Formal Verification Project which aims to provide infrastructure and tooling to enable formal verification for zkEVMs. Read about clean below, or watch our presentation on it in the zkEVM project updates call.

Objectives

Our objective is to build an embedded DSL for writing ZK circuits in Lean4, that allows us to reason about them in a formal way. We believe that co-locating circuit definitions with their desired specification and correctness proof will allow us to create a robust library of reusable formally verified circuit gadgets.

We currently target AIR arithmetization, and we assume to have a table lookup primitive available by the underlying proof system. However, our long-term goal is to turn clean into a universal ZK framework that targets all other arithmetizations as well.

How to formally verify ZK circuits

In order to reason formally about ZK circuits, we first need to define a formal model. This involves the following steps:

  1. Defining what primitives our circuit language supports, i.e., which are the operations that we can use to define circuits.

  2. Defining the semantics of such primitives.

  3. Defining the properties we are interested in formally proving for a given circuit.

Our language allows us to specify a circuit, which is composed of two main objects.

At a fundamental level, for a given circuit, we are interested in how the satisfaction of the constraints and the witness are related. In other words: if a witness satisfies the constraints, what property can we infer about it?

Let's make a concrete example. Consider a circuit that is composed of only one constraint and defined over one variable x:

C1 : x * (x - 1) === 0
This is a very common gadget that ensures that x is a boolean value, i.e., it is either 0 or 1. The specification of this circuit can be expressed as:
x = 0 ∨ x = 1
Albeit being a very simple example, it shows the basic idea: we are interested in formally proving that if an assignment to the variables satisfies the constraints, then the specification holds as well. We are also interested in the other direction: if an honest prover holds a witness that satisfies the specification, then there exists an assignment of the variables that satisfies the constraints. Take the following alternative implementation of a boolean check:
C2 : x === 0
This new constraint is sound, because the only valid assignment that satisfies it is x = 0, which is a boolean value. However, it is not complete, as an honest prover that holds a valid boolean x = 1 cannot provide a witness that satisfies the constraint.

More formally, the two properties we want to prove are:

DSL design

In our DSL, we support four basic operations for defining circuits.

inductive Operation (F : Type) [Field F] where
  | witness : (m: ) -> (compute : Environment F -> Vector F m) -> Operation F
  | assert : Expression F -> Operation F
  | lookup : Lookup F -> Operation F
  | subcircuit : {n : } -> SubCircuit F n -> Operation F

Indeed, we can:

To enhance usability, we provide a way to define a circuit using a monadic interface, with a lot of convenience functions. This interface allows us to define the circuits using very natural syntax constructs.

Design of the composable verification framework

The main building block of our framework is the FormalCircuit structure.

structure FormalCircuit (F: Type) (β α: TypeMap)
  [Field F] [ProvableType α] [ProvableType β]
where
  main: Var β F -> Circuit F (Var α F)
  assumptions: β F -> Prop
  spec: β F -> α F -> Prop

  soundness:
     offset : ,  env,
    -- for all inputs that satisfy the assumptions
     b_var : Var β F,  b : β F, eval env b_var = b ->
    assumptions b ->
    -- if the constraints hold
    constraints_hold.soundness env (circuit.main b_var |>.operations offset) ->
    -- the spec holds on the input and output
    let a := eval env (circuit.output b_var offset)
    spec b a

  completeness:
    -- for all environments which _use the default witness generators for local variables_
     offset : ,  env,  b_var : Var β F,
    env.uses_local_witnesses (circuit.main b_var |>.operations offset) ->
    -- for all inputs that satisfy the assumptions
     b : β F, eval env b_var = b ->
    assumptions b ->
    -- the constraints hold
    constraints_hold.completeness env (circuit.main b_var |>.operations offset)

This structure tightly packages, in a dependent-type way, the following objects:

A FormalCircuit encapsulates a formally proved, reusable gadget: when instantiating a FormalCircuit as a subcircuit, we are able to reuse the soundness and completeness proofs of the subcircuit to prove the soundness and completeness properties of the whole circuit. This is accomplished by automatically replacing the constraints of a subcircuit with its (formally verified) spec. In this way, we can formally verify even large circuits by applying a divide-et-impera approach: we start by defining and proving correctness of low-level reusable gadgets, and then combine them to build more and more complex circuits.

A concrete example: 8-bit addition

Let's walk through one of the simple gadgets we have implemented and verified: addition on 8-bit numbers. This is a gadget that takes as input two bytes and an input carry, and returns the sum of the two bytes modulo 256, and the output carry.
First, we need to define the input and output shapes.

structure Inputs (F : Type) where
  x: F
  y: F
  carry_in: F

structure Outputs (F : Type) where
  z: F
  carry_out: F

Now, we define the assumptions the circuit makes on the input values, and the specification that the circuit should satisfy.

def assumptions (input : Inputs (F p)) :=
  let { x, y, carry_in } := input
  x.val < 256  y.val < 256  (carry_in = 0  carry_in = 1)

def spec (input : Inputs (F p)) (out : Outputs (F p)) :=
  let { x, y, carry_in } := input
  out.z.val = (x.val + y.val + carry_in.val) % 256 
  out.carry_out.val = (x.val + y.val + carry_in.val) / 256

The main circuit is defined as follows:

def add8_full_carry (input : Var Inputs (F p)) :
    Circuit (F p) (Var Outputs (F p)) := do
  let { x, y, carry_in } := input

  -- witness the result
  let z <- witness (fun eval => mod_256 (eval (x + y + carry_in)))

  -- do a lookup over the byte table for z
  lookup (ByteLookup z)

  -- witness the output carry
  let carry_out <- witness (fun eval => floordiv (eval (x + y + carry_in)) 256)

  -- ensures that the output carry is boolean
  -- by instantiating the Boolean.circuit as a subcircuit
  assertion Boolean.circuit carry_out

  -- main 8-bit addition constraint
  assert_zero (x + y + carry_in - z - carry_out * (const 256))

  return { z, carry_out }
Finally, we define the FormalCircuit structure, which packages all those definitions, together with the soundness and completeness proofs
def circuit : FormalCircuit (F p) Inputs Outputs where
  main := add8_full_carry
  assumptions := assumptions
  spec := spec
  soundness := by
    ...
  completeness := by
    ...

Notice that this definition is generic over the field prime p. However, we require an additional assumption on the prime, namely p > 512, otherwise the circuit is not sound!

variable {p : } [Fact p.Prime]
variable [p_large_enough: Fact (p > 512)]

Verifying concrete AIR tables

The FormalCircuit abstraction provides a modular definition of verified circuits, and it is mostly arithmetization-agnostic. However, we want to target AIR as a concrete arithmetization, as it is a very popular choice in the zkVM design space. AIR circuits are defined over traces: constraints are specified together with an application domain, which represent which rows they should be applied to. In principle, one could choose an arbitrary domain. However, in practice we choose domains that have a succinct representation in terms of vanishing polynomial.

Here are some examples of domains that have succinct representations and are used in practice:

As a concrete example, let's say that we want to give a constraint over a trace for correctly computing the Fibonacci sequence, which is defined as follows:

$$ \begin{cases} F_0 = 0 \\ F_1 = 1 \\ F_n = F_{n-2} + F_{n-1} \end{cases} $$

We can implement it with a trace composed of two columns: $x$ and $y$. The invariant we want to prove is: on the $i$-th row, $x_i$ should contain $F_i$ and $y_i$ should contain $F_{i+1}$. We can achieve this behaviour by imposing the following constraints:

Fibonacci trace

It is straightforward to see that if a trace satisfies those constraints, then in the $i$-th row we will find the $i$-th Fibonacci number in the first column.

Notice that this set of constraints can also be thought of as defining a correct sequence of states, one for each row:

In our framework, we support AIR constraints by:

You can check out the soundness proof for the Fibonacci table using 8-bit addition here, which satisfies the following, slightly more complicated, specification:

$$ \text{For every row } i, \quad x_i = (F_i \mod 256) $$

Upcoming work

The framework is still in early stages of development, but it is already showing promising results. Some planned next steps are:

The whole project is open source and available on GitHub, make sure to check it out!

zkSecurity offers auditing, research, and development services for cryptographic systems including zero-knowledge proofs, MPCs, FHE, and consensus protocols.

Learn more →

Share This Article