We are excited to announce our initial effort towards formal verification of RISC Zero, the first zkVM based on the RISC-V architecture, using the Lean proof assistant. The outcome of this important milestone is a proof-of-concept methodology enabling mechanised reasoning about RISC Zero zk circuits, together with the verification of selected RISC-V circuit components to demonstrate both the feasibility and the scalability of the approach. By leveraging the rigour of formal methods, we can eliminate entire classes of potential bugs and vulnerabilities, paving the way for more secure and reliable zkVMs in blockchain and privacy-preserving applications.
The RISC Zero zkVM is implemented in a custom DSL called Zirgen, specifically designed to meet the requirements of AIR-circuit design. Zirgen's compilation pipeline transforms programs written in this high-level DSL (at the time of this project, an eDSL embedded in C++, soon to be migrated into a standalone DSL) into a Rust/C++/CUDA/Metal prover/verifier pair. This process involves an intermediate compilation stage to an MLIR dialect (which we refer to as the “Zirgen MLIR” from here on, or simply the “MLIR”), which serves as a flexible and extensible abstraction layer. This step ensures that Zirgen programs can be optimised and lowered systematically before being converted into the final circuit format, which is used to generate and verify zk-proofs.
We perform the formal verification of the RISC Zero zkVM at the level of the Zirgen MLIR, rather than trying to reason about C++, the host language of the Zirgen eDSL, or the high-level Zirgen circuit descriptions. We chose to target the MLIR because of its concise syntax, use of Static Single Assignment (SSA) form, and well-defined semantics. These features make the MLIR an ideal foundation for formal reasoning: its compact representation reduces complexity, SSA simplifies data flow analysis, and its clear semantics ensure precise interpretation of program behaviour. Although this effort, stricto sensu, does not offer complete guarantees that the final generated circuit is bug-free (for that, one would also have to prove correctness of the compilation step from the MLIR to the actual circuit), it still offers extremely high confidence in the correctness of the core logic of the VM and ensures that bugs are substantially less likely to originate from errors in the higher-level representations or their transformation to MLIR. This significantly reduces the surface area for critical errors, or indeed any errors, in the overall system.
In order to formalise the semantics of the Zirgen MLIR, we:
- define inductive types in Lean that capture the syntax (ASTs) of MLIR operators and statements;
- define the program state over which MLIR programs operate; and
- define evaluation functions that give meaning to operators (as pure computations) and to statements (as state transformers).
In the remainder of the blog, we diverge slightly from the formalisation only in that we rename some of the constructs to be more informative to the reader.
We start by defining in Lean the various inductive types required for capturing the syntax (ASTs) and semantics of the Zirgen MLIR. The MLIR operates with the following types of values:
- The Felt (short for field element) type. Prime fields are essential for zk proofs, and different frameworks opt for different values of the underlying prime.
- The Prop type in Lean. These are predominantly used, via a shallow embedding, to represent circuit constraints, and we therefore refer to them as constraints going onwards.
- Buffer. Intuitively, one can think of a Buffer as a row of zk proof witnesses.
- BufferList, the need for which we will discuss shortly.

The MLIR also comes with program variables, which are separated in Lean into FVars, CVars, and BVars (according to the type of values they hold: felts, constraints, or buffers), but the reader can think of them as just Strings for all intents and purposes.
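For concreteness, here is a minimal sketch of how these value types could be modelled in Lean; the prime below is RISC Zero's baby bear prime, used purely for illustration, and the actual definitions in the formalisation may differ:
import Mathlib

-- Minimal sketch, not the formalisation's actual definitions.
-- The prime is RISC Zero's baby bear prime, p = 15 * 2^27 + 1, shown for illustration only.
abbrev P : ℕ := 15 * 2 ^ 27 + 1

abbrev Felt := ZMod P                -- prime-field elements
abbrev Buffer := List (Option Felt)  -- a row of (possibly unset) witness cells
abbrev BufferList := List Buffer     -- one Buffer per execution cycle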
Also, as the execution comes in two modes—deterministic and non-deterministic, discussed shortly—we introduce the appropriate inductive type that captures this:
inductive Mode := | Det | NonDet
Next, we define the ASTs of MLIR expressions/operators (which are pure in the sense that they do not change the program state):
inductive Op : Mode → Type where
-- Constants
| Const : Felt → Op x -- F_p field elements
| True : Op x -- the trivially true constraint
-- Felts
| Add : FVar → FVar → Op x -- addition in F_p
| Inv : FVar → Op NonDet -- *-inverse in F_p (always non-det)
-- Logic and constraints
| Isz : FVar → Op NonDet -- check-for-zero (always non-det)
| AndEqz : CVar → FVar → Op x -- conjunction with a check-for-zero
-- Buffers
| Alloc : ℕ → Op x -- buffer allocation with size
| Get : BVar → Back → ℕ → Op x -- buffer lookup
-- | ...
together with the AST for MLIR statements:
inductive MLIR : Mode → Type where
-- Meta
| Assign : String → Op x → MLIR x -- assignment
| Eqz : FVar → MLIR x -- check-for-zero
| If : FVar → MLIR x → MLIR x -- conditional
| Nondet : MLIR NonDet → MLIR Det -- non-deterministic block
| Sequence : MLIR x → MLIR x → MLIR x -- sequencing
-- Ops
| Fail : MLIR x -- execution failure
| Set : BVar → ℕ → FVar → MLIR NonDet -- set a buffer cell
-- | ...
where the Back parameter of a Get expression (discussed shortly) can be thought of as a Nat. This is a fairly straightforward language definition, with mostly standard constructs. There are, however, two points that we would like to highlight:
- the ASTs are indexed by the execution Mode, and certain constructs are tied to a specific mode (Inv and Isz are exclusively non-deterministic, Nondet carries a block to be executed in non-deterministic mode, etc.)—having this allows us, by construction, to disregard certain classes of ill-formed programs and simplifies the subsequent semantics (see the small example after this list); and
- the conditional If has no else branch: as its semantics below show, its body is executed only when the guard felt is non-zero.
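To illustrate the mode discipline, here is a small hand-written snippet (a hypothetical example of ours, not taken from the formalisation) that non-deterministically stores the inverse of a felt into a buffer cell; stripping the Nondet wrapper would make it fail to type-check, since Inv and Set exist only in non-deterministic mode:
-- Hypothetical snippet built from the ASTs above: guess x⁻¹ non-deterministically
-- and store it into cell 0 of the buffer "out".
def invSnippet : MLIR .Det :=
  .Nondet (.Sequence
    (.Assign "i" (.Inv ⟨"x"⟩))   -- Inv : FVar → Op NonDet
    (.Set ⟨"out"⟩ 0 ⟨"i"⟩))      -- Set : BVar → ℕ → FVar → MLIR NonDet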
Technical detail: Non-deterministic blocks, in essence, give the witness generator the freedom to compute, in whichever way it sees fit, the witness values for which the constraints do not provide an explicit computational formula. For example, if we end up with a constraint y = x², the witness generator might not have the explicit means to calculate x given y, and encoding a square-root computation procedure into the constraints would be unreasonably expensive (in addition, a finite-field element that is a quadratic residue has two square roots, and so the constraints would somehow also have to encode the choice of which one should be returned).
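As a concrete, standalone illustration of this guess-and-constrain pattern (a sketch of ours using Mathlib, not part of the formalisation): a single multiplication constraint is enough to pin down an inverse that the prover supplied non-deterministically, so no inversion procedure ever needs to be encoded in the constraints.
import Mathlib

-- Standalone sketch: if the prover non-deterministically supplies `guess` and the
-- circuit only constrains x * guess = 1, then guess is forced to equal x⁻¹.
example {F : Type} [Field F] (x guess : F) (h : x * guess = 1) : x⁻¹ = guess :=
  inv_eq_of_mul_eq_one_right h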
For us to define the semantics of MLIR statements, we first need to understand what our program state looks like. This state is defined in Lean as follows, eliding several fields that have been created for technical reasons only:
structure State where
cycle : ℕ
hasFailed : Prop
felts : Map FVar Felt
constraints : Map CVar Prop
buffers : Map BVar BufferList
-- ...
We can view this state as having:
- cycle, the meaning of which we discuss shortly; and
- hasFailed, which tells us whether or not the current execution has failed; and
- the variable stores felts and constraints; and
- the buffer store buffers.
Technical detail: Execution of MLIR programs occurs in cycles, captured by the global variable cycle, which is initialised to zero. At the start of every cycle, to preserve history, a copy of the entire current memory is created by extending every BufferList in the codomain of buffers with a copy of its last element. This enforces the invariant that all of these lists of buffers have length at most cycle + 1. Values from previous cycles can be obtained via the Back parameter of the Get operator. As all of our verified examples used only a single cycle, from here on we abstract away BufferLists and treat storage as a map from BVars to Buffers.
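To make this abstraction concrete, here is an illustrative lookup function (ours, not the formalisation's), written against the Felt/Buffer/BufferList sketch from earlier; with a single cycle, back is always 0 and the read degenerates to an indexed lookup into the current row:
-- Illustrative only; assumes the sketched Felt/Buffer/BufferList definitions above.
-- Read cell `offset` from the row `back` cycles before the current one.
def readBack (bl : BufferList) (back offset : ℕ) : Option Felt :=
  match bl.reverse.get? back with        -- back = 0 selects the current (last) row
  | some row => (row.get? offset).join   -- .none if out of range or the cell is unset
  | none     => none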
Let us first take a look at how to give meaning to some of the operators from Op, recalling that these operators are all pure:
def Op.eval {x} (st : State) (op : Op x) : Option Val :=
match op with
-- Felts
| Const felt => .some <| .FVal felt
-- Arith
| Add fx fy => .some <| .FVal <| st.felts[fx]!.get! + st.felts[fy]!.get!
| Inv fx => .some <| .FVal <| if st.felts[fx]!.get! = 0 then 0 else st.felts[fx]!.get!⁻¹
-- Logic
| Isz fx => .some <| .FVal <| if st.felts[fx]!.get! = 0 then 1 else 0
-- Constraints
| True => .some <| .CVal _root_.True
| AndEqz cx fx => .some <| .CVal <| st.constraints[cx]!.get! ∧ st.felts[fx]!.get! = 0
-- Buffers
| Alloc size => .some <| .BVal <| List.replicate size .none
| Get bx back offset => getImpl st bx back offset
-- ...
We can see that the operator evaluation function, Op.eval, takes a state and an operator, performs the appropriate evaluation, and returns an optional value:
inductive Val where
| BVal : Buffer → Val
| CVal : Prop → Val
| FVal : Felt → Val
which can be either:
- a buffer (a Buffer): for example, Alloc returns a new empty buffer, that is, a fixed-width sequence of .none : Option Felt; or
- a constraint (a Prop), noting that we use a shallow embedding into Lean: for example, the operation True returns a constraint _root_.True : Prop, that is, the constant True from Lean's underlying theory, whereas AndEqz c x conjoins constraints[c] with the constraint felts[x] = 0; or
- a felt (a Felt): for example, the operation Add a b evaluates to a + b : Felt after querying a and b from the felts variable store of the underlying state st.

Unsurprisingly, as operators are not side-effecting, they are combined with an assignment in the evaluation of commands, to which we turn immediately:
def MLIR.run {α : Mode} (program : MLIR α) (st : State) : State :=
match program with
| Assign x op => st.update x (op.eval st)
| Eqz fx => withEqZero st.felts[fx]!.get! st
| If fx prog => if st.felts[fx]!.get! = 0 then st else prog.run st
| Nondet prog => prog.run st
| Sequence prog₁ prog₂ => prog₂.run (prog₁.run st)
| Fail => {st with hasFailed := true}
| Set bx offset fx => st.set! bx offset st.felts[fx]!.get!
-- ...
We can see that the behaviour of an MLIR statement is described as a state transformer, that is, a function State → State. This behaviour is fairly straightforward and expected, with only a few notes:
- Assign takes a string and updates the appropriate underlying variable store based on what Op.eval returns, thus tying the two functions together;
- Nondet, which activates non-deterministic mode, only runs the underlying prog, with the type constraints ensuring the switch back to deterministic mode afterwards; and
- programs are sequenced pairwise via Sequence, instead of having an overarching sequence of commands.

With these two evaluation functions, we have successfully explained to Lean what Zirgen MLIR is.
With the syntax and semantics in place, we could now construct an MLIR program by hand:
def ToyProgram : MLIRProgram :=
.Sequence (.Assign "a" (.Const 42)) (.Assign "b" (.Const 24601))
state some of its properties:
def BoringProperty : Prop :=
(ToyProgram.runProgram .initial).getConstraints = True
and prove them:
example : BoringProperty := by
simp [BoringProperty, MLIR.run, Op.eval, State.update, State.initial, Map.empty]
This technically works and we could now write down all RISC-V circuits, specify their expected behaviour and prove they all conform to the specifications. The problem with this approach is that the first and the last steps do not scale beyond toy examples. Let us take a closer look at why.
First of all, writing programs by hand with nested constructors is tedious. Furthermore, even if one ends up generating/translating programs from some source description, they are still difficult to understand for humans. The way we address this is with lightweight notation, based on the syntax of the actual Zirgen MLIR. For example, the ToyProgram above can be written as:
def ToyProgram : MLIRProgram :=
"a" ←ₐ .Const 42;
"b" ←ₐ .Const 24601
Second of all, and more importantly, proofs of complex propositions become unwieldy very, very quickly. There are several facets to this issue, chief among them that the statements to prove at this stage force us to reason directly through the language semantics and all of the associated state bookkeeping.
To address all of the above, we have an external tool, which is essentially a pre-processing step that generates Lean code for us: it translates each circuit from the Zirgen MLIR into its Lean encoding and synthesises a verification condition that captures the effect of running that code.
As such, instead of having to reason about the model in terms of the language semantics (ToyProgram.runProgram .initial), as we did in the proof of the BoringProperty, we can utilise the verification condition. Importantly, since it follows directly from the semantics of the language that assignments never generate any constraints, the verification condition we obtain is in the approximate form vc : (ToyProgram.runProgram .initial).getConstraints ↔ True. This allows us to write:
example : BoringProperty := by unfold BoringProperty; rewrite [vc]; rfl
which happens to convince Lean that BoringProperty holds. The main takeaway here is that vc allows us to sidestep the language semantics when reasoning about the effects of running the program.
With all of this infrastructure in place, we are ready to reason about some circuits! We normally prove two main results for each circuit:
constraints_of_witness, which is of the approximate form:
∀ input output,
Witness.Code.run Witness.start_state input output →
Constraints.Code.run Constraints.start_state input output
and which proves that witnesses generated by circuits comply with the constraints synthesised by their respective circuits; and
spec_of_constraints, which is of the approximate form:
∀ input output,
Constraints.Code.run Constraints.start_state input output →
Specification input output
and which proves that constraints generated by circuits conform to their respective Specifications, that is, the expected behaviours.
The user is, of course, free to prove additional properties about the programs: for example, if they wish to prove absence of unconstrainedness, they would prove the right-to-left direction of the implication in constraints_of_witness.
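For reference, such a statement would be of the approximate form (mirroring the shapes above; this is not a statement we have proved):

∀ input output,
Constraints.Code.run Constraints.start_state input output →
Witness.Code.run Witness.start_state input output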
Considering that Zirgen is a C++ eDSL, our initial input is a C++ program. The IsZero component is defined as follows:
Val IsZeroImpl::set(Val val) {
NONDET {
isZeroBit->set(isz(val));
invVal->set(inv(val));
}
IF (isZeroBit) { eqz(val); }
IF (1 - isZeroBit) { eq(val * invVal, 1); }
return isZeroBit;
}
The extractor then performs the following translation, for both witness and constraint generation, from the Zirgen MLIR into our Lean encoding:
// Zirgen MLIR
%0 = cirgen.const 1
%1 = cirgen.true
%2 = cirgen.get %arg0[0] back 0 : <1, constant>
%3 = cirgen.get %arg1[0] back 0 : <2, mutable>
%4 = cirgen.and_eqz %1, %2 : <default>
// Lean
"%0" ←ₐ .Const 1;
"%1" ←ₐ ⊤;
"%2" ←ₐ .Get ⟨"in"⟩ 0 0;
"%3" ←ₐ .Get ⟨"data"⟩ 0 0;
"%4" ←ₐ ⟨"%1"⟩ &₀ ⟨"%2"⟩;
The extractor also generates a Constraints.closed_form lemma that equates the semantics of the program with its verification condition:
-- LHS: Semantics of Constraints.Code
Constraints.Code.run (start_state ([in0]) ([data0, data1])) ↔
-- RHS: Verification condition
((if data0 = (0 : Felt)
then True
else in0 = (0 : Felt)) ∧
if (1 : Felt) - data0 = (0 : Felt)
then True
else in0 * data1 - (1 : Felt) = (0 : Felt))
Note that the entire statement is synthesised, including the right-hand side, which we obtain by tracing the semantics of the Zirgen language encoded in the model and doing some optimisations. For future reference, here is the Witness.closed_form lemma as well:
-- LHS: Semantics of Witness.Code
Witness.Code.run (start_state [in0]) ([data0,data1]) ↔
-- RHS: Verification condition
(some (if in0 = (0 : Felt)
then (1 : Felt)
else (0 : Felt)) = data0 ∧
some (if in0 = (0 : Felt)
then (0 : Felt)
else in0⁻¹) = data1) ∧
(¬((1 : Felt) -
if in0 = (0 : Felt)
then (1 : Felt)
else (0 : Felt)) = (0 : Felt) →
(if in0 = (0 : Felt)
then (0 : Felt)
else in0 * in0⁻¹) - (1 : Felt) = (0 : Felt))
The job of the machine is now done and, unfortunately, a human is required from here on.
As mentioned, we customarily state and prove two statements. For the constraints_of_witness, we have:
1 theorem constraints_of_witness :
2 Witness.Code.run (Witness.start_state [input]) (output) →
3 Constraints.Code.run (Constraints.start_state [input] output) := by
4 -- Generic cleanup, could be automated
5 rcases output with _ | ⟨y1, _ | ⟨y₂, ⟨_ | _⟩⟩⟩ <;> try simp at *
6 simp only [Option.isSome_iff_exists] at h₁
7 rcases h₁ with ⟨⟨w₁, h₁⟩, ⟨w₂, h₂⟩⟩; subst h₁ h₂
8 -- Use the verification conditions
9 rw [Constraints.WP.closed_form, Witness.WP.closed_form]
10 -- Proof of the main invariant
11 aesop
For the proof above, Lean does most of the work for us. The first several steps (lines 4-7) are bookkeeping that could be easily automated. After we use the verification conditions in line 9, we end up with a very simple statement that can be discharged by invoking aesop in line 11, which is a general-purpose proving procedure. Note that the verification condition has trivialised the proof to the point where aesop is able to prove the main invariant.
For the spec_of_constraints, we first need to write a specification (that is, the expected/desired behaviour) of the circuit. The IsZero circuit has a single input x, a value to be checked for being equal to zero, and, slightly counter-intuitively, two outputs y₁ and y₂. We would like to set the former to 1 if the input is zero, and to 0 otherwise. For the latter, we would not go quite as far as to say that we ‘desire’ it to be set to the inverse of x when x is non-zero and left unconstrained otherwise, but we can prove (and do prove) that this is indeed the behaviour of the circuit.
Funnily enough, this goes a little bit against what specifications should be: ideally, one would not so much as look at the implementation of the circuit before stating its desired behaviour, so as to avoid specifications that essentially encode bugs of the implementation.
Without further ado, here is the specification:
def Spec (x : Felt) (y₁ y₂ : Option Felt) : Prop :=
( x = 0 ∧ y₁ = some 1 ) ∨
( x ≠ 0 ∧ y₁ = some 0 ∧ y₂ = x⁻¹ )
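As a quick illustrative sanity check (ours, assuming the Spec and Felt definitions above are in scope): for a zero input, the first disjunct holds and y₂ is left completely arbitrary, reflecting the unconstrainedness discussed above.
-- Illustrative only: the zero input selects the first disjunct, for any y₂.
example (y₂ : Option Felt) : Spec 0 (some 1) y₂ := by
  unfold Spec
  exact Or.inl ⟨rfl, rfl⟩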
And here is the lemma itself:
1 theorem spec_of_constraints :
2 Constraints.Code.run (Constraints.start_state [x] [y₁, y₂]) →
3 Spec x y₁ y₂ := by
4 -- Generic cleanup, could be automated
5 unfold Spec
6 simp only [Option.isSome_iff_exists] at hy₁ hy₂
7 rcases hy₁ with ⟨is0, hy₁_val⟩; subst y₁
8 rcases hy₂ with ⟨inv, hy₂_val⟩; subst y₂
9 -- Use the verification condition
10 rewrite [Constraints.WP.closed_form]
11 -- Proof of the main invariant
12 simp [sub_eq_iff_eq_add]
13 intros hy₁ hy₂
14 split_ifs at hy₁ <;> aesop'
15 rw [inv_eq_of_mul_eq_one_right hy₂]
Initially, the proof follows a similar structure to constraints_of_witness. Everything until line 11 is mostly bookkeeping and the subsequent proof of the main invariant is once again made simple by the verification condition, where most of the work is really just using the lemma inv_eq_of_mul_eq_one_right that gives us a * b = 1 → a⁻¹ = b. We have thus established that the constraints imply our desired behaviour, that is, the Specification.
In summary, the two main outcomes of this formal verification endeavour are: a proof-of-concept methodology for mechanised reasoning about RISC Zero circuits (the Lean model of the Zirgen MLIR together with the verification-condition extractor), and the verification of selected circuit components. In addition to the IsZero gadget described above, we have worked on:
- OneHot<k>, a family of circuits that are parameterised over k and encode the invariant that among a sequence of bits, exactly one of them is set. To confirm that the proof extractor can process large circuits, we made it generate the verification condition for OneHot20 (constraints, witness), and we have also proven correct its baby version, OneHot2. We did not perform the proof for OneHot20 due to time constraints, but believe that the proof of OneHot2 would scale in Lean.
- Decoder, which is a complex circuit that uses bit manipulation to split inputs into bit-vectors for instruction decoding. Both the specification and the proofs are quite involved.

With this initial infrastructure in place, we very much look forward to the next steps. First and foremost, it is important to note that this formalisation was conducted over a period of only several months in early-to-mid 2023. Given the experimental nature of the project, it was rapidly iterated on in order to establish the fastest-yet-reasonable way forward. Therefore, the model, while conceptually sound, does not cover all of the features and is not necessarily optimal for formal reasoning, thus requiring a bit of a spruce-up before it can be used for the verification of more complex circuits. Moreover, since this project was completed, RISC Zero has made significant progress, and we need to reflect these changes in the mechanisation. Finally, the version of Lean 4 from 2023 is positively ancient by today’s standards, and we need to bring the formalisation up to date in order to take advantage of some of the recently introduced features.
Our ultimate goal is to work closely with RISC Zero to verify the entirety of the RISC-V (RV32I) circuit against the upcoming Lean extraction of the SAIL RISC-V specification by Galois and Lindy Labs, while also specifying and verifying custom extensions unique to the RISC Zero implementation.