// inside head tag

Towards formal verification of the first RISC-V zkVM

Security

February 17, 2025

Overview

We are excited to announce our initial effort towards formal verification of RISC Zero, the first zkVM based on the RISC-V architecture, using the Lean proof assistant. The outcome of this important milestone is a proof-of-concept methodology enabling mechanised reasoning about RISC Zero zk circuits, together with the verification of selected RISC-V circuit components to demonstrate both the feasibility and the scalability of the approach. By leveraging the rigour of formal methods, we can eliminate entire classes of potential bugs and vulnerabilities, paving the way for more secure and reliable zkVMs in blockchain and privacy-preserving applications.

The RISC Zero zkVM is implemented in a custom DSL called Zirgen, specifically designed to meet the requirements of AIR-circuit design. Zirgen's compilation pipeline transforms programs written in this high-level DSL (at the time of this project, this was an eDSL in C++, but will be soon migrated out into a separate DSL) into a Rust/C++/CUDA/Metal prover/verifier pair. This process involves an intermediate compilation stage to an MLIR dialect (which we refer to as the “Zirgen MLIR” onwards or, simply, the “MLIR”), which serves as a flexible and extensible abstraction layer. This step ensures that Zirgen programs can be optimised and lowered systematically before being converted into the final circuit format, which is used to generate and verify zk-proofs.

We perform the formal verification of the RISC Zero zkVM at the level of the Zirgen MLIR, rather than trying to reason about C++, the host language of the Zirgen eDSL, or the high-level Zirgen circuit descriptions. We chose to target the MLIR because of its concise syntax, use of Static Single Assignment (SSA) form, and well-defined semantics. These features make the MLIR an ideal foundation for formal reasoning: its compact representation reduces complexity, SSA simplifies data flow analysis, and its clear semantics ensure precise interpretation of program behaviour. Although this effort, stricto sensu, does not offer complete guarantees that the final generated circuit is bug-free (as for that one would also have to prove correctness of the compilation step from the MLIR to the actual circuit), it still offers extremely high confidence in the correctness of the core logic of the VM and ensures that bugs are substantially less likely to originate from errors in the higher-level representations or their transformation to MLIR. This significantly reduces the potential surface area for critical or any other errors in the overall system.

Formalising the Zirgen MLIR

In order to formalise the semantics of the Zirgen MLIR, we:

  1. define Abstract Syntax Trees (ASTs) that are used to represent the syntax of Zirgen MLIR programs;
  2. define a structure used to represent state of programs; and
  3. define semantic functions to assign meaning to individual nodes in ASTs in terms of their effects on program state.

In the remainder of the blog, we diverge slightly from the formalisation only in that we rename some of the constructs to be more informative to the reader.

MLIR syntax

We start by defining in Lean the various inductive types required for capturing the syntax (ASTs) and semantics of the Zirgen MLIR. The MLIR operates with the following types of values:

  1. Non-negative integers from the finite prime field $\mathbb{F}_p$, where $p = 15 * 2^{27} + 1$, represented by the  Felt (short for field element) type. Prime fields are essential for zk proofs, and different frameworks opt for different values of the underlying prime.
  2. The standard propositions, represented by the Prop type in Lean. These are predominantly used, via a shallow embedding, to represent circuit constraints, and we therefore refer to them as constraints going onwards.
  3. Arrays of optional felts, which we refer to as buffers, and which are used to represent general-purpose storage. In our formalisation, their corresponding type is Buffer. Intuitively, one can think of a Buffer as a row of a zk proof witnesses.
  4. Lists of buffers, denoted by BufferList, the need for which we will discuss shortly.

The MLIR also comes with program variables, which are separated in Lean into FVars, CVars, and BVars (according to the type of values they hold—felts, constraints, or buffers), but the reader can think of them as just Strings for all intents and purposes.

Also, as the execution comes in two modes—deterministic and non-deterministic, discussed shortly—we introduce the appropriate inductive type that captures this:

inductive Mode := | Det | NonDet

Next, we define the ASTs of MLIR expressions/operators (which are pure in the sense that they do not change the program state):

inductive Op : Mode → Type where
  -- Constants
  | Const : Felt → Op x              -- F_p field elements
  | True  :        Op x              -- the Boolean true
  -- Felts
  | Add : FVar → FVar → Op x         -- addition in F_p
  | Inv : FVar        → Op NonDet    -- *-inverse in F_p (always non-det)
  -- Logic and constraints
  | Isz    : FVar        → Op NonDet -- check-for-zero (always non-det)
  | AndEqz : CVar → FVar → Op x      -- conjunction with a check-for-zero
  -- Buffers
  | Alloc : ℕ               → Op x   -- buffer allocation with size
  | Get   : BVar → Back → ℕ → Op x   -- buffer lookup
  -- | ...

together with the AST for MLIR statements:

inductive MLIR : Mode → Type where
  -- Meta
  | Assign   : String      → Op x   → MLIR x   -- assignment
  | Eqz      : FVar                 → MLIR x   -- check-for-zero
  | If       : FVar        → MLIR x → MLIR x   -- conditional
  | Nondet   : MLIR NonDet          → MLIR Det -- non-deterministic block
  | Sequence : MLIR x      → MLIR x → MLIR x   -- sequencing
  -- Ops
  | Fail     :                   MLIR x        -- execution failure
  | Set      : BVar → ℕ → FVar → MLIR Nondet   -- set a buffer cell
  -- | ...

where the Back parameter of a Get expression (discussed shortly) can be thought of as Nat. This is a fairly straightforward language definition, with mostly standard constructs. There are, however, two points that we would like to highlight:

  • both expressions and statements are parameterised by the execution mode (most can be executed in both modes, but, for example, Inv and Isz are exclusively non-deterministic,  Nondet carries a block to be executed in non-deterministic mode, etc.)—having this allows us, by construction, to disregard certain classes of ill-formed programs and simplifies subsequent semantics; and
  • the language, interestingly has no notion of jumps or loops, just a conditional If.

Technical detail: Non-deterministic blocks, in essence, give the witness generator the freedom to compute in whichever way it sees fit the witness values for which the constraints do not provide an explicit computational formula. For example, if we end up with a constraint y=x^2  the witness generator might not have the explicit means to calculate x given y , and encoding the square root computation procedure into the constraints would be unreasonably expensive (in addition, there could be two square roots of a finite-field element if the element is a quadratic residue, and so the constraints would somehow also have to encode the choice of which one should be returned).

MLIR program state

For us to define the semantics of MLIR statements, we first need to understand what our program state looks like. This state is defined in Lean as follows, eliding several fields that have been created for technical reasons only:

structure State where
  cycle       : ℕ
  hasFailed   : Prop
  felts       : Map FVar Felt
  constraints : Map CVar Prop
  buffers     : Map BVar BufferList
  -- ...

We can view this state as having:

  • two global variables: cycle, the meaning of which we discuss shortly; and hasFailed, which tells us whether or not the current execution has failed; and
  • a typed first-order variable store, separated into felts and  constraints; and
  • a persistent storage, captured by buffers.

Technical detail: Execution of MLIR programs occurs in cycles, captured by the global variable cycle, which is initialised to zero. At the start of every cycle, to preserve history, a copy of the entire current memory is created by extending every BufferList in the codomain of buffers with a copy of its last element. This enforces the invariant that all these lists of buffers are of length at most cycle + 1. Values from previous cycles can be obtained via the Back parameter of the Get operator. As all of our verified examples used only a single cycle, we onward abstract BufferLists and treat storage as a map from BVars to Buffers.

MLIR semantics

Let us first take a look at how to give meaning to some of the operators from Op, recalling that these operators are all pure:

def Op.eval {x} (st : State) (op : Op x) : Option Val :=
  match op with
  -- Felts
  | Const felt => .some <| .FVal felt
  -- Arith
  | Add fx fy => .some <| .FVal <| st.felts[fx].get! + st.felts[fy].get!
  | Inv fx    => .some <| .FVal <| if st.felts[fx]!.get! = 0 then 0 else st.felts[fx]!.get!⁻¹
  -- Logic 
  | Isz fx => .some <| .Val <| if st.felts[fx]!.get! = 0 then 1 else 0
  -- Constraints  
  | True         => .some <| .CVal _root_.True
  | AndEqz cx fx => .some <| .CVal <| st.constraints[cx].get! ∧ st.felts[fx].get! = 0
  -- Buffers
  | Alloc size         => .some <| .BVal <| List.replicate size .none
  | Get bx back offset => getImpl st bx back offset
  -- ...

We can see that the operator evaluation function, Op.eval, takes a state and an operator, performs the appropriate evaluation, and returns an optional value:

inductive Val where
  | BVal : Buffer → Val
  | CVal : Prop   → Val
  | FVal : Felt   → Val 

which can be either:

  • a single new row (Buffer): for example, Alloc returns a new empty buffer, that is, a fixed-width sequence of .none : Option Felt; or
  • a polynomial constraint (Prop), noting that we use a shallow embedding into Lean: for example, the operation True returns a constraint _root_.True : Prop, that is, the constant True from Lean’s underlying theory, whereas the AndEqz c x  conjuncts constraints[c] with the constraint felts[val] = 0; or
  • an integer in $\mathbb{F}_p$ (Felt): for example, the operation Add a b evaluates to a + b : Felt after querying a and b from the felts variable store of the underlying state st.

Unsurprisingly, as operators are not side-effecting, they are combined with an assignment in the evaluation of commands, to which we turn immediately:

def MLIR.run {α : Mode} (program : MLIR α) (st : State) : State :=
  match program with
    | Assign x op          => st.update x (op.eval st)
    | Eqz fx               => withEqZero st.felts[fx]!.get! st
    | If fx prog           => if st.felts[fx]!.get! = 0 then st else prog.run st
    | Nondet prog          => prog.run st
    | Sequence prog₁ prog₂ => prog₂.run (prog₁.run st)
    | Fail                 => {st with hasFailed := true}
    | Set bx offset fx     => st.set! bx offset st.felts[fx]!.get!
    -- ...

We can see that the behaviour of MLIR statement is described as a state transformer, that is, a function State → State. This behaviour is fairly straightforward and expected, with only a few notes:

  • the variable assignment, Assign, takes a string and updates the appropriate underlying variable store based on what Op.eval returns, thus tying the two functions together;
  • Nondet, which activates non-deterministic mode, only runs the underlying prog, with the type constraints ensuring the switch back to deterministic mode afterwards;
  • we represent programs with nested sequencing using Sequence, instead of having an overarching sequence of commands.

With these two evaluation functions, we have successfully explained to Lean what Zirgen MLIR is.

Reasoning about circuits

With the syntax and semantics in place, we could now construct an MLIR program by hand:

def ToyProgram : MLIRProgram :=
  .Sequence (.Assign "a" (.Const 42)) (.Assign "b" (.Const 24601))

state some of its properties:

def BoringProperty : Prop :=
  ((ToyProgram.runProgram .initial).getConstraints = True

and prove them:

example : BoringProperty := by
  simp [BoringProperty, MLIR.run, Op.eval, State.update, State.initial, Map.empty]

This technically works and we could now write down all RISC-V circuits, specify their expected behaviour and prove they all conform to the specifications. The problem with this approach is that the first and the last steps do not scale beyond toy examples. Let us take a closer look at why.

First of all, writing programs by hand with nested constructors is tedious. Furthermore, even if one ends up generating/translating programs from some source description, they are still difficult to understand for humans. The way we address this is with lightweight notation, based on the syntax of the actual Zirgen MLIR. For example, the ToyProgram above can be written as:

def ToyProgram : MLIRProgram :=
  "a" ←ₐ .Const 42;
  "b" ←ₐ .Const 24601

Second of all, and more importantly, proofs of complex propositions become unwieldy very, very quickly. There are several facets to this issue, as statements to prove at this stage contain:

  • semantics of the Zirgen MLIR, adding overhead to what needs to be reasoned about;
  • artefacts stemming from technical details of how we model the semantics in Lean, introducing unnecessary complexity; and
  • large unoptimised terms that can grow to legendary proportions, making proving intractable for both Lean and human beings.

To address all of the above, we have an external tool, which is essentially a pre-processing step that generates Lean in which we:

  1. Read a circuit description in Zirgen. This happens individually for both the witness and the constraints.
  2. Optimise programs and generate proofs that all optimisations are semantics-preserving. Two of the most important optimisations are:
    1. splitting the programs into smaller chunks; and
    2. reordering the reads and writes so as to minimise the amount of information we need to keep track of.
  3. State two verification conditions (VCs), separately for witness and constraints generation. One can view these VCs as simplified statements of what the circuit-description programs do, devoid of most artefacts introduced by the Lean model itself, distilling their meanings into general mathematical formulas.


As such, instead of having to reason about the model in terms of the language semantics (ToyProgram.runProgram .initial), as we did in the proof of the BoringProperty, we can utilise the verification condition. Importantly, since it follows directly from the semantics of the language that assignments never generate any constraints, the verification condition we obtain is in the approximate form vc : (ToyProgram.runProgram .initial).getConstraints ↔ True. This allows us to write:

example : BoringProperty := by unfold BoringProperty; rewrite [vc]; rfl

which happens to convince Lean that BoringProperty holds. The main takeaway here is that vc allows us to sidestep the language semantics when reasoning about the effects of running the program.

With all of this infrastructure in place, we are ready to reason about some circuits! We normally prove two main results for each circuit:

  1. constraints_of_witness, which is of the approximate form:

∀ input output,
  Witness.Code.run Witness.start_state input output → 
  Constraints.Code.run Constraints.start_state input output

and which proves that witnesses generated by circuits comply with the constraints synthesised by their respective circuits; and

  1. spec_of_constraints, which is of the approximate form:

∀ input output,
  Constraints.Code.run Constraints.start_state input output →
  Specification input output

and which proves that constraints generated by circuits conform to their respective Specifications, that is, the expected behaviours.

The user is, of course, free to prove additional properties about the programs: for example, if they wish to prove absence of unconstrainedness, they would prove the right-to-left direction of the implication in constraints_of_witness.

Case study - IsZero

Running the extractor

Considering Zirgen MLIR is a C++ EDSL, our initial input is a C++ program. The IsZero component is defined as follows:

Val IsZeroImpl::set(Val val) {
  NONDET {
    isZeroBit->set(isz(val));
    invVal->set(inv(val));
  }
  IF (isZeroBit) { eqz(val); }
  IF (1 - isZeroBit) { eq(val * invVal, 1); }
  return isZeroBit;
}

The extractor then performs the following operations for both witness and constraints generation:

  1. Instructs the C++ codebase to emit the intermediate MLIR.
  2. Translates the MLIR into something that Lean can understand. As mentioned, we do not synthesise the AST directly, but instead have a nice representation that resembles the MLIR, ending up with Lean-readable constraints. For comparison, below we give a small, illustrative excerpt, with MLIR on the top and the corresponding Lean on the bottom:

// C++ EDSL
%0 = cirgen.const 1
%1 = cirgen.true
%2 = cirgen.get %arg0[0] back 0 : <1, constant>
%3 = cirgen.get %arg1[0] back 0 : <2, mutable>
%4 = cirgen.and_eqz %1, %2 : <default>

// Lean
"%0" ←ₐ .Const 1;
"%1" ←ₐ ⊤;
"%2" ←ₐ .Get ⟨"in"0 0;
"%3" ←ₐ .Get ⟨"data"0 0;
"%4" ←ₐ ⟨"%1"⟩ &₀ ⟨"%2"⟩;

  1. Optimises each of the programs. For example, we forget felts whose values we no longer need, we reorder reads and writes, and we split constraint generation into parts 0, 1 and 2.
  2. Generates and proves the Constraints.closed_form lemma that equates the semantics of the program with its verification condition:

-- LHS: Semantics of Cosntraints.Code
Constraints.Code.run (start_state ([in0]) ([data0, data1])) ↔
-- RHS: Verification condition
((if data0 = (0 : Felt)
    then True
    else in0 = (0 : Felt)) ∧
  if (1 : Felt) - data0 = (0 : Felt)
  then True
  else in0 * data1 - (1 : Felt) = (0 : Felt))

Note that the entire statement is synthesised, including the right-hand side, which we obtain by tracing the semantics of the Zirgen language encoded in the model and doing some optimisations. For future reference, here is the Witness.closed_form lemma as well:

-- LHS: Semantics of Witness.Code
Witness.Code.run (start_state [in0]) ([data0,data1]) ↔
-- RHS: Verification condition
(some (if in0 = (0 : Felt)
       then (1 : Felt)
       else (0 : Felt)) = data0 ∧
 some (if in0 = (0 : Felt)
       then (0 : Felt)
       else in0⁻¹) = data1) ∧
 (¬((1 : Felt) -
   if in0 = (0 : Felt)
   then (1 : Felt)
   else (0 : Felt)) = (0 : Felt) →
     (if in0 = (0 : Felt)
      then (0 : Felt)
      else in0 * in0⁻¹) - (1 : Felt) = (0 : Felt))

The job of the machine is now done and, unfortunately, a human is required from here on.

Verifying the circuit

As mentioned, we customarily state and prove two statements. For the constraints_of_witness, we have:

1  theorem constraints_of_witness :
2    Witness.Code.run (Witness.start_state [input]) (output) →
3    Constraints.Code.run (Constraints.start_state [input] output) := by
4    -- Generic cleanup, could be automated
5    rcases output with _ | ⟨y1, _ | ⟨y₂, ⟨_ | _⟩⟩⟩ <;> try simp at *
6    simp only [Option.isSome_iff_exists] at h₁
7    rcases h₁ with ⟨⟨w₁, h₁⟩, ⟨w₂, h₂⟩⟩; subst h₁ h₂
8    -- Use the verification conditions
9    rw [Constraints.WP.closed_form, Witness.WP.closed_form]
10   -- Proof of the main invariant
11   aesop

For the proof above, Lean does most of the work for us. The first several steps (lines 4-7), are bookkeeping that could be easily automated. After we use the verification conditions in line 9, we end up with a very simple statement that can be automated by invoking aesop in line 11, which is a general-purpose proving procedure. Note that the verification condition has trivialised the proof to the point where aesop is able to prove the main invariant.

For the spec_of_constraints, we first need to write a specification (that is, the expected/desired behaviour), of the circuit. The IsZero circuit has a single input x, a value to be checked for being equal to zero, and slightly counter-intuitively, two outputs y₁ and y₂. We would like to set the former to 1 if the input is zero, and to 0 otherwise. For the latter, while we would not go as far as to say that ‘we desire’ it be set to the inverse of x if x is non-zero and allow it to be unconstrained otherwise, but we can prove (and do prove) that this is indeed the behaviour of the circuit.

Funnily enough, this goes a little bit against what specifications should be, as ideally one would not so much as look at the implementation of the circuit before stating its desired behaviour, so as to avoid that specifications that essentially encode bugs of the implementation.

Without further ado, here is the specification:

def Spec (x : Felt) (y₁ y₂ : Option Felt) : Prop :=
  ( x = 0 ∧ y₁ = some 1 ) ∨
  ( x ≠ 0 ∧ y₁ = some 0 ∧ y₂ = x⁻¹ )

And here is the lemma itself:

1  theorem spec_of_constraints :
2    Constraints.Code.run (Constraints.start_state [x] [y₁, y₂]) →
3    Spec x y₁ y₂ := by
4    -- Generic cleanup, could be automated
5    unfold Spec
6    simp only [Option.isSome_iff_exists] at hy₁ hy₂
7    rcases hy₁ with ⟨is0, hy₁_val⟩; subst y₁
8    rcases hy₂ with ⟨inv, hy₂_val⟩; subst y₂
9    -- Use the verification condition
10   rewrite [Constraints.WP.closed_form]
11   -- Proof of the main invariant
12   simp [sub_eq_iff_eq_add]
13   intros hy₁ hy₂
14   split_ifs at hy₁ <;> aesop'
15   rw [inv_eq_of_mul_eq_one_right hy₂]

Initially, the proof follows a similar structure to constraints_of_witness. Everything until line 11 is mostly bookkeeping and the subsequent proof of the main invariant is once again made simple by the verification condition, where most of the work is really just using the lemma inv_eq_of_mul_eq_one_right that gives us a * b = 1 → a⁻¹ = b. We have thus established that the constraints imply our desired behaviour, that is, the Specification.

Conclusions and Future Work

In summary, the two main outcomes of this formal verification endeavour are:

  1. A methodology for using the Lean proof assistant to reason about zk circuits written using Zirgen. This methodology is based on: a formal model of the Zirgen MLIR in Lean; a verification condition generator; and a semantics-preserving optimiser.
  2. A selection of parts of the RISC-V circuit that we have analysed, as a proof-of-concept. These proofs demonstrate both the viability and the scalability of the approach. In particular, in addition to the IsZero gadget described above, we have worked on:
    • OneHot<k>, a family of circuits that are parameterised over k and encode the invariant that among a sequence of bits, exactly one of them is set. To confirm that the proof extractor can process large circuits, we made it generate the verification condition for OneHot20 (constraints, witness), and also proven correct its baby version, OneHot2. We did not perform the proof for OneHot20 due to time constraints, but believe that the proof of OneHot2 would scale in Lean.
    • Decoder, which is a complex circuit that uses bit manipulation to split inputs into bit-vectors for instruction decoding. Both the specification and the proofs are quite involved.

With this initial infrastructure in place, we very much look forward to the next steps. First and foremost, it is important to note that this formalisation was conducted only over a period of several months in early-to-mid 2023. Given the experimental nature of the project, it was rapidly iterated on in order to establish the fastest-yet-reasonable way forward. Therefore, the model, while being conceptually sound, does not cover all of the features and is not necessarily optimal for formal reasoning, thus requiring a bit of a spruce-up before it can be used for the verification of more complex circuits. Moreover, since this project was completed RISC Zero has made significant progress, and we need to reflect these changes in the mechanisation. Finally, the version of Lean 4 from 2023 is positively ancient by today’s standards, and we need to bring the formalisation up to date in order to take advantage of some of the recently introduced features.

Our ultimate goal is to work closely with RISC Zero to verify the entirety of the RISC-V (RV32I) circuit against the upcoming Lean extraction of the SAIL RISC-V specification by Galois and Lindy Labs, while also specifying and verifying custom extensions unique to the RISC Zero implementation.

Latest articles