Design of Symmetric-Key Primitives for Advanced Cryptographic Protocols



Introduction
Block ciphers are a fundamental primitive of modern cryptography. They are used in a host of symmetric-key constructions, e.g., directly as a pseudorandom permutation to encrypt a single block of data; inside a mode of operation to generate a stream cipher for authenticated encryption; or, after some tweaking, in a Merkle-Damgård or sponge construction to generate hash functions. This last example, hash functions, constitutes a fundamental primitive in its own right, owing to its fitness to approximate a random oracle and thereby admit a security proof based on this idealization.
While the security of standard block ciphers and hash functions such as AES, 3DES, SHA2-256, and SHA3/Keccak is well understood and widely agreed upon, their design targets efficient implementation in software and hardware. The design constraints that make these primitives efficient in their niche are different from the constraints that would make them efficient for use in advanced cryptographic protocols such as zero-knowledge proofs and multi-party computation (MPC). This mismatch in design constraints has prompted a departure from the standardized basic algorithms in favor of new designs such as LowMC [1], MiMC [3], and Jarvis [5]. The distinguishing feature of these ciphers is the alternative target for optimization: running time, gate count, memory footprint, and power consumption are all left by the wayside in favor of the number of non-trivial arithmetic operations. These ciphers can thus be characterized as arithmetization-oriented, as opposed to traditional ciphers, which do not share this optimization target.
Arithmetization-oriented cipher design should not be understood from the perspective of traditional cipher design. The relevant attacks and security analyses are different. Traditional constructions and modes of operation must be lifted to the arithmetic setting and their security proofs must be redone. The target applications are different and provide the designer with a new collection of tools to secure a design against attacks without adversely affecting efficiency. This efficiency is captured in terms of arithmetic metrics that vary subtly by application, but jointly stand in stark contrast to traditional efficiency metrics such as those mentioned above.
As a field in its own right, the design of arithmetization-oriented ciphers is in its nascency. Rather than blindly optimizing for a single vaguely defined metric and shipping the resulting construction as soon as possible, it is worthwhile and timely to stop and re-evaluate formerly optimal strategies with respect to this new field. The contribution of this work is not just the proposal of two new ciphers, although that was, and still is, certainly its motivation. The more important contribution consists of the steps taken towards a more systematic exploration and mapping of the problem and design landscape that these ciphers inhabit. Our ciphers, Vision and Rescue, merely represent the Marvellous culmination of our journey.
This paper is structured in accordance with a progressive refinement of focus. First, in Section 2, we characterize the common features of the advanced cryptographic protocols that arithmetization-oriented ciphers cater to, and identify and clarify the various efficiency metrics that are relevant in those contexts. Next, in Section 3, we explore the space of design considerations. In particular, we identify important differences (compared to standard symmetric cipher design) in terms of the security analysis as well as in terms of the tricks and techniques that can be employed to marry security with efficiency. Having surveyed the design space, we then motivate our position in Section 4; here we provide concrete answers to questions raised in the preceding sections regarding the security rationale, potential pitfalls, and application constraints. Lastly, in Sections 5 and 6 we present the logical consequence of these design decisions: concrete specifications for Vision and Rescue, two families of the Marvellous universe.
We use three advanced cryptographic protocols as running examples of applications, and as guiding beacons, throughout this paper: zero-knowledge proof systems for the Turing or RAM models of computation, zero-knowledge proof systems for the circuit model of computation, and multi-party protocols. In particular, our discussion characterizes all three as arithmetic modalities of computation. We spend ample time identifying the correct efficiency metrics along with non-trivial design tools that these protocols enable. Finally, in Section 7 we compare Vision and Rescue to MiMC with respect to the relevant metrics in these applications.

Arithmetization
Zero-knowledge proof systems for arbitrary computations, multi-party computation, and indeed even fully homomorphic encryption, share more than a superficially similar characterization of complexity. Underlying these advanced cryptographic protocols is something more fundamental: the protocols stipulate applying algebraic operations to mathematical objects, and somehow these operations correspond to computations. This correspondence is not new. It was originally introduced by Razborov [37] as a mechanical method in the context of computational complexity and first applied to cryptographic protocols by Lund et al. [32]. This method, known as arithmetization, characterizes a computation as a sequence of natural arithmetic operations on finite field elements.
Arithmetization translates computational problems, such as determining whether a nondeterministic Turing machine halts in T steps, into algebraic problems involving low-degree multivariate polynomials over a finite field. A subsequent interactive proof system that establishes the consistency of these polynomials simultaneously establishes that the computation was performed correctly. Similarly, the arithmetic properties of finite fields enable the transformation of a computational procedure for one machine, for instance calculating the value of a function f(x_1, x_2, x_3), into a procedure to be run jointly by several interactive machines. The practical benefit of this transformation stems from the participants' ability to provide secret inputs x_i, and to obtain the function's corresponding value without revealing any more information about those inputs than is implied by this evaluation. In both cases, the complexity of the derived protocol is determined by that of the arithmetization.
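To make the idea of arithmetization concrete, the following toy sketch (a hypothetical illustration, not a construction from this paper) encodes boolean gates as polynomial constraints over a prime field F_p: booleanity as x(x − 1) = 0, AND as z = xy, and XOR as z = x + y − 2xy.

```python
# Toy arithmetization sketch: encode boolean gates as polynomial
# constraints over a prime field F_p. Hypothetical example for
# illustration only.
p = 2**61 - 1  # a Mersenne prime; any large prime field works

def is_bool(x):
    # booleanity constraint: x * (x - 1) == 0 forces x in {0, 1}
    return (x * (x - 1)) % p == 0

def and_gate(x, y, z):
    # AND gate as a degree-2 constraint: z == x * y
    return (z - x * y) % p == 0

def xor_gate(x, y, z):
    # XOR over F_p: z == x + y - 2xy
    return (z - (x + y - 2 * x * y)) % p == 0

# A valid witness for z = x AND y with x = 1, y = 0:
x, y, z = 1, 0, 0
assert is_bool(x) and is_bool(y) and is_bool(z)
assert and_gate(x, y, z)
assert xor_gate(1, 1, 0)
```

A proof system that checks all such constraints vanish on a committed witness thereby checks that the boolean computation was performed correctly.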
In the remainder of this section we survey three applications of arithmetization in cryptography: zero-knowledge proofs in the Turing or RAM models of computation, zero-knowledge proofs in the circuit model of computation, and multi-party computation. The purpose of this survey is to introduce the mechanics and to set the stage for analyzing efficiency and design techniques. These modalities of computation provide the reference frame according to which the rest of the paper proceeds.
The astute reader will notice that fully homomorphic encryption is frequently listed among the target applications of arithmetization-oriented ciphers and yet is missing from both the above discussion and the surveys below. The ciphers proposed in this paper rely heavily on a family of techniques we call acausal computation, in which the state of the system at the next computational step cannot be described as having been caused by the state at the previous step; see Section 3.1 for a more precise description of this term. To the best of our knowledge, fully homomorphic encryption does not presently admit acausal computations. As a result, our ciphers are ill-suited to this application scenario. We opt to restrict focus to the context and aspects relevant for Vision and Rescue; while the design of ciphers for fully homomorphic encryption shares much of the context surveyed in this paper and can properly be characterized as a subfield of arithmetization-oriented cipher design, we leave this particular question out of the scope of the present paper and to future work.

Zero-Knowledge Proofs
A zero-knowledge (ZK) proof system is a protocol between a prover and a verifier whereby the former convinces the latter that their common input ℓ is a member of a language L ⊂ {0, 1}*. The proof system is complete and sound with soundness error ϵ if it guarantees that the verifier accepts (outputs 1) when ℓ ∈ L and rejects with probability ≥ 1 − ϵ when ℓ ∉ L. When this soundness guarantee holds only against computationally bounded provers we call it an argument system. The proof system is zero-knowledge if the transcript reveals no information beyond the validity of the membership claim. We are concerned here with languages L that capture generic computations in different models of computation.
Scalable, transparent arguments of knowledge. Let L be a language decidable in nondeterministic time T(n), like the NEXP-complete bounded halting problem: deciding whether a nondeterministic machine halts within T cycles. Following [10], we say that a ZK proof system for L is scalable if two conditions are satisfied simultaneously for all instances ℓ with |ℓ| = n: (i) proving time scales quasi-linearly, like poly(n) + T(n) · poly log T(n), and (ii) verification time scales like poly(n) + poly log T(n).
It is transparent if all verifier messages are public coins; such systems require no trusted setup phase. It is an argument of knowledge if there exists an extractor that efficiently recovers a witness to the membership of ℓ in L by interacting with any prover who has a sufficiently high probability of convincing the verifier.
Argument systems that possess all of the properties above are referred to as ZK-STARKs, and have recently been implemented in practice [10], following theoretical constructions [11,12] (cf. [9] for a prior, non-ZK, STARK).
To reap the benefits of a scalable proof system, it is important to encode computations succinctly, and one natural way to achieve this is via an Algebraic Intermediate Representation (AIR), as suggested in [10]. Both Turing machines and Random Access Memory (RAM) machines can be represented succinctly using AIRs, which we describe briefly now and more formally in Appendix D. An Algebraic Execution Trace (AET) is similar to an execution trace of a computation. It is an array with t rows (one row per time step) and w columns (one column per register). The size of the AET is t · w. The main property distinguishing an AET from a standard execution trace is that each entry of the array is an element of a finite field F_q. The transition function of the computation is now described by an Algebraic Intermediate Representation (AIR). An AIR is a set P of polynomials over 2w variables X = (X_1, ..., X_w), X′ = (X′_1, ..., X′_w), representing, respectively, the current and next state of the computation, such that a transition from state s = (s_1, ..., s_w) ∈ F_q^w to state s′ = (s′_1, ..., s′_w) is valid iff all polynomials in P evaluate to 0 when the values s, s′ are assigned to the variables X, X′, respectively. (See Appendix C for an example.) To maximize the efficiency of ZK-STARKs, we wish to minimize the three main parameters of the AIR: the computation time t, the state width w, and the maximal degree d of an AIR constraint (polynomial) in P. While the degree d does not affect the size of the AET, it does affect the execution time and the proof size.
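The AET/AIR machinery above can be sketched on a toy computation (a hypothetical Fibonacci-like trace, not one of the AIRs from this paper): the trace has width w = 2, and the AIR consists of two polynomials in (X_1, X_2, X′_1, X′_2) that must vanish on every pair of consecutive rows.

```python
# Minimal AET/AIR sketch (hypothetical example): a Fibonacci-like
# computation over F_p with width w = 2. The AIR is the polynomial set
# P = { X1' - X2,  X2' - (X1 + X2) }, evaluated on consecutive rows.
p = 97

def air(s, s_next):
    # returns the evaluations of all polynomials in P on (s, s')
    return [(s_next[0] - s[1]) % p,
            (s_next[1] - (s[0] + s[1])) % p]

# Build the trace (the AET): t = 5 rows, w = 2 columns
trace = [(1, 1)]
for _ in range(4):
    a, b = trace[-1]
    trace.append((b, (a + b) % p))

# The trace is valid iff every AIR polynomial evaluates to 0
# on each transition between consecutive rows.
assert all(v == 0
           for s, s_next in zip(trace, trace[1:])
           for v in air(s, s_next))
```

Here t = 5, w = 2, and the maximal constraint degree is d = 1; a real cipher's AIR would use nonlinear constraints of some small degree d.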
Circuit model. Numerous ZK proof systems operate in the model of arithmetic circuits, meaning that the language L is that of satisfiable arithmetic circuits. Succinct computations can be "unrolled" into arithmetic circuits, and several compilers exist that achieve this, e.g., [13,36,39]. Such circuits are specified by directed acyclic graphs with labeled nodes and edges. The edges, or wires, carry a value taken from some ring; the nodes, or gates, perform some operation from that ring on the values carried by their input wires and assign the corresponding output value to their output wires. An assignment to the wires is valid if and only if, for every gate, the value on the output wires matches that gate's operation applied to the values on its input wires. In the context of zero-knowledge proofs, the prover generally proves knowledge of an assignment to the input wires of a circuit computing a one-way function, meaning that the corresponding output matches a given public output. Alternatively, the prover can prove satisfiability, i.e., that there exists a corresponding input, which makes sense in a context where it is also possible for no such input to exist.
Recent years have seen a concentration of effort towards Quadratic Arithmetic Programs (QAPs) [28] and Rank-One Constraint Satisfaction (R1CS) systems [13] for encoding circuits and wire assignments in an algebraically useful way. The circuit is represented as a list of triples ((a_i, b_i, c_i))_i. A vector s of assignments to all wires is valid iff for all i, ⟨a_i, s⟩ · ⟨b_i, s⟩ = ⟨c_i, s⟩.
R1CS systems can be defined over any ring; when this ring is Z/pZ, i.e., the field of integers modulo some prime p, the R1CS instance captures exactly an intermediate step of the ZK-SNARK family of proof systems [28]. Additional transparent systems such as Ligero [4], Bulletproofs [17] and Aurora [14] also accept R1CS over different fields as their input. For the purpose of efficient R1CS-style proofs, the degree of the constraints describing a cipher is as important as their number: any algebraic constraint of degree higher than two must first be translated into multiple constraints of degree two, and the complexity parameter we seek to minimize is the number of R1CS constraints needed to specify the computation.
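The rank-one constraint shape can be sketched on a toy instance (hypothetical, for illustration): encoding y = x³ with witness vector s = (1, x, t, y) and intermediate t = x², each triple asserting ⟨a_i, s⟩ · ⟨b_i, s⟩ = ⟨c_i, s⟩.

```python
# R1CS sketch (hypothetical toy instance): encode y = x^3 over F_p with
# witness vector s = (1, x, t, y), where t = x^2 is an intermediate wire.
# Each triple (a_i, b_i, c_i) asserts <a_i, s> * <b_i, s> = <c_i, s>.
p = 101

def dot(u, s):
    return sum(ui * si for ui, si in zip(u, s)) % p

def r1cs_satisfied(triples, s):
    return all(dot(a, s) * dot(b, s) % p == dot(c, s)
               for a, b, c in triples)

# Constraint 1: x * x = t ; Constraint 2: t * x = y
triples = [((0, 1, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)),
           ((0, 0, 1, 0), (0, 1, 0, 0), (0, 0, 0, 1))]

x = 5
s = (1, x, x * x % p, pow(x, 3, p))
assert r1cs_satisfied(triples, s)
```

The degree-3 relation y = x³ costs two rank-one constraints, illustrating why higher-degree constraints must be flattened into multiple degree-two constraints.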

Multi-Party Computation (MPC)
A multi-party computation is the joint evaluation of a function of individually known but globally secret inputs. In recent years, MPC protocols have converged to a linearly homomorphic secret sharing scheme whereby each participant is given a share of each secret value such that locally adding shares of different secrets generates the shares of the secrets' sum. We use the bracket notation [·] to denote shared secrets.
Using a linear sharing scheme, additions are essentially free and multiplication requires communication between the parties.The number of such multiplications required to perform a computation is a good first estimate of the complexity of an MPC protocol.
However, while one multiplication requires one round of communication, in many cases it is possible to batch many multiplications into a single round. Moreover, some communication rounds can be executed in an offline phase, before receiving the input to the computation. These offline rounds are cheaper than the online rounds, as the former do not affect the protocol's latency whereas the latter completely determine it. To assess the MPC-friendliness of a cipher one must therefore take three metrics into account: the number of multiplications, the number of offline communication rounds, and the number of online communication rounds.
An important family of techniques with a relatively low multiplication count, offline complexity, and online complexity is masked operations, such as the technique suggested by Damgård et al. [23]. This protocol raises a shared secret to a large power while offloading the bulk of the computation to the offline phase. Suppose for instance that the protocol wishes to compute [a^e] for some exponent e, given only the shared secret [a]. A similar procedure enables the computation of inverses with only a handful of multiplications [6].
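The masking idea can be sketched in the clear (no real secret sharing; bracketed values are simulated as plain integers, and the exponent e is a hypothetical choice): offline, sample a random mask r ≠ 0 and precompute r^{−e}; online, open the masked value a·r, exponentiate the public value locally, and multiply by the precomputed share, since (a·r)^e · r^{−e} = a^e.

```python
# Sketch of a Damgard-style masked exponentiation, simulated in the
# clear. Offline: sample random r != 0 and precompute r^{-e}.
# Online: open a*r (which reveals nothing about a), exponentiate the
# public value, and multiply by the precomputed r^{-e}.
import random

p = 2**31 - 1  # prime field F_p
e = 65537      # hypothetical large exponent

def masked_pow(a):
    # offline phase (input-independent)
    r = random.randrange(1, p)
    r_inv_e = pow(r, -e % (p - 1), p)   # r^{-e} mod p, computed offline
    # online phase: one opening and one multiplication
    opened = a * r % p                  # public masked value
    return pow(opened, e, p) * r_inv_e % p

a = 123456789
assert masked_pow(a) == pow(a, e, p)
```

The online cost is independent of the size of e; all exponentiations by e land either in the offline phase or on a public value.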
We extend this range of techniques in two ways. First, we adapt the technique of Damgård et al. for exponents of the form 1/α; while the online complexity is the same, our technique reduces the offline complexity by exploiting the concise representation of α. Second, we introduce a new technique to efficiently evaluate the compositional inverse of sparse linearized polynomials. This novel technique is a contribution of independent interest. We cover these masked operation techniques in more detail as part of our benchmarking. The key observation is that some polynomials with large powers can be efficiently computed over MPC, even when counting the offline phase.

Design Considerations and Concepts
Cipher design has been a subject of research since the publication of the Data Encryption Standard (DES) [20]. Since then, science has progressed to the point where designing a new block cipher can be as simple as following a formula: choose a family of basic operations (e.g., ARX) and a general structure (e.g., (G)Feistel or SPN), pick components with known cryptographic properties (e.g., S-boxes with high non-linearity, a linear layer with fast diffusion), add constants to break symmetry, and set the number of rounds based on theoretical arguments (e.g., the wide trail strategy) or using automated tools (e.g., MILP and SAT solvers [31,33,34]). This approach, if used properly, results in a secure algorithm that is resistant to known attacks.
In contrast, arithmetization-oriented ciphers necessitate different design considerations and cannot be understood in the context of traditional cipher design. While such a design formula could exist in principle, the field has not yet progressed to the point where it can be articulated. In this section, we highlight and discuss the considerations identified during our design process that are unique to arithmetization-oriented algorithms. This section is independent of our cipher designs. To the extent that it raises questions or concerns, these are answered and addressed in the context of the Marvellous cipher designs in Section 4.

Acausal Computation
In a procedural model of computation, the state of the system at any point in time can be uniquely determined as a simple and efficiently computable function of the system's state at the previous point in time. The arithmetic modalities of computation considered in this paper are capable of violating this procedural intuition. While all participants in the protocols are deterministic and procedural computers, some emergent phenomena are best interpreted either with respect to a different time axis or without any respect at all to the passage of time. From this perspective, they seem to undermine the constraining character of causality or violate it altogether. We call these phenomena acausal computations.
It is possible to design and define ciphers in terms of acausal computations. Doing so can offer security against particular or general attacks without having to increase the number of rounds. This benefits the efficiency of the advanced cryptographic protocol capable of computing the acausal operations efficiently. As a result of this design strategy, the cipher might be more expensive to evaluate on traditional, progressive computers; however, this is not the defining metric to begin with.
Consider for example the inversion operation x ↦ x^(q−2), for x ∈ F_q. When the field is large, so is the exponent, and as a result a progressive evaluation is expensive. We show how this operation is captured efficiently by acausal computation in various arithmetic modalities.
In the case of zero-knowledge proofs, the particular variant of acausal computation is known as non-determinism. The honest prover, who has evaluated the cipher locally, is in possession of all the intermediate states including x and y = x^(q−2), and the verifier possesses only commitments to these values. The verifier is incapable of computing the values directly, but establishing that the expressions x(1 − xy) and y(1 − xy) both evaluate to 0 accomplishes the desired effect: convincing the verifier that y was computed correctly from x.
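This non-deterministic check can be sketched directly (a simple simulation of the constraint check, with q a prime chosen for illustration): the prover supplies both x and the claimed y, and the two low-degree constraints hold exactly when y = x^(q−2), including the corner case x = 0, where y must also be 0.

```python
# Non-deterministic inversion check (sketch): instead of computing
# y = x^{q-2} with a huge exponentiation, verify the two degree-3
# constraints x*(1 - x*y) == 0 and y*(1 - x*y) == 0.
q = 2**127 - 1  # a Mersenne prime, so F_q is a prime field

def inversion_constraints_hold(x, y):
    t = (1 - x * y) % q
    return x * t % q == 0 and y * t % q == 0

x = 42
y = pow(x, q - 2, q)                         # honest prover's witness
assert inversion_constraints_hold(x, y)
assert inversion_constraints_hold(0, 0)      # the x = 0 corner case
assert not inversion_constraints_hold(x, y + 1)
```

The verifier's work is two multiplications per constraint, independent of the size of the exponent q − 2.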
In the case of multi-party computations, the acausal computation originates from the capability of masked operations to offload certain calculations to the offline phase, where they do not affect online efficiency. In particular, in the offline phase the protocol prepares two shared values [a] and [b] satisfying a ≠ 0 and b = a^(−(q−2)). Then, in the online phase, [ax] is opened and the result is inverted locally before being multiplied with [b], yielding [y] = (ax)^(q−2)[b]. This description ignores the special care necessary when x = 0, but the point remains that the number of online multiplications (which are expensive) is independent of the power to which the shared secrets are raised.
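Simulated in the clear (no real secret sharing; bracketed values are plain integers here, and q is a prime chosen for illustration), the masked inversion above looks as follows: offline, sample a ≠ 0 and set b = a^(−(q−2)); online, open a·x, invert the public value locally, and multiply by b, since (a·x)^(q−2) · b = x^(q−2).

```python
# Sketch of the masked inversion described above, simulated in the
# clear. Offline: random a != 0 and b = a^{-(q-2)}. Online: one
# opening, one local (public) inversion, one multiplication.
import random

q = 2**61 - 1  # prime field F_q

def masked_inverse(x):
    # offline phase (input-independent)
    a = random.randrange(1, q)
    b = pow(a, -(q - 2) % (q - 1), q)   # b = a^{-(q-2)} mod q
    # online phase
    opened = a * x % q                  # public masked value
    return pow(opened, q - 2, q) * b % q

x = 987654321
assert masked_inverse(x) == pow(x, q - 2, q)
```

As in the text, the case x = 0 needs special care (here the opened value is 0, which leaks that fact); the point of the sketch is only that the online multiplication count does not depend on the exponent.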
Another example of acausal computation arises in the polynomial modeling prelude to deploying a Gröbner basis attack, which is arguably another arithmetic modality of computation. The observation here is that the attack does not need to follow the same sequence of events involved in evaluating the cipher. The attacker can search for x and y simultaneously and require their consistency through the polynomial equation xy − 1 = 0. Note that the adversary may choose to ignore the case x = 0 if the probability of this event is sufficiently small, as it is when working over large fields.
The takeaway is that acausal computation adds another dimension to cipher design, and opens the door to completely new types of constructions. While all algorithms presented in this paper (and, to the best of our knowledge, all published algorithms in this space) follow a traditional, progressive structure of an iterative cipher, acausal computation admits a departure from this strategy. Since progressive evaluation is no longer necessary, one can go a step further and consider operations whose progressive evaluation is expensive. The space of admissible ciphers extends even beyond functions to relations between pairs of objects, or even tuples, whose defining computations admit an efficient acausal representation. We leave the design of relational ciphers, along with a compelling selling point therefor, as an open question.

Efficiency Metrics
Unlike their traditional counterparts, arithmetization-oriented ciphers do not attempt to minimize execution time, circuit area, energy consumption, memory footprint, etc., at least not as a first-order consideration. Instead, these ciphers optimize algebraic complexity, described in terms of AIR or R1CS constraints for zero-knowledge proofs, and in terms of the number of multiplications and the number of offline and online rounds of communication for MPC. The common feature of these metrics is the gratuitous nature of linear operations. With respect to non-linear operations, each metric introduces its own subtleties. Even the cost of a single multiplication differs from metric to metric, depending on where in the cipher that multiplication is located.
To illustrate this discrepancy, consider a state consisting of m field elements in some field F_q. Suppose that we want to square one of these m elements over a non-binary field. This requires 1 multiplication in an MPC protocol, but an entire row (m entries) in the algebraic execution trace of a STARK proof. Should we want to raise the element to a higher power α, we can use masking techniques in MPC at a fixed online cost that is independent of α, and yet it would require roughly log_2(α) R1CS constraints. The exception is when α has a small inverse in Z/(q − 1)Z; then the R1CS representation can be optimized with an acausal computation.
At the risk of stating the obvious, even when restricting attention to zero-knowledge proof systems, ciphers can have a different cost depending on whether they are encoded as R1CS or AIR. For instance, raising a value to the power α requires roughly log_2(α) R1CS constraints, meaning that the cost is the same for all values of α in the half-open interval (2^(⌈log_2(α)⌉−1), 2^(⌈log_2(α)⌉)]. In contrast, a system encoded in AIR can specify the maximal degree d of the polynomials describing the system, giving rise to a cost of roughly log_d(α) AIR constraints.
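One possible accounting for the R1CS cost of x ↦ x^α (a sketch under the assumption that each squaring and each multiplication in a square-and-multiply chain is one degree-2 constraint, which is one common way to count) gives ⌊log_2(α)⌋ + popcount(α) − 1 constraints, i.e., growth like log_2(α):

```python
# Hypothetical R1CS constraint count for x -> x^alpha via
# square-and-multiply: floor(log2(alpha)) squarings plus
# (popcount(alpha) - 1) multiplications, one constraint each.
def r1cs_constraints_for_power(alpha):
    squarings = alpha.bit_length() - 1
    multiplications = bin(alpha).count("1") - 1
    return squarings + multiplications

# alpha = 3 (compute x^2, then x^2 * x): 2 constraints
assert r1cs_constraints_for_power(3) == 2
# alpha = 5 (compute x^2, x^4, then x^4 * x): 3 constraints
assert r1cs_constraints_for_power(5) == 3
```

By contrast, an AIR of maximal degree d can express x^α in roughly log_d(α) constraints, and an acausal encoding of α with a small inverse in Z/(q − 1)Z can do better still.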
Another subtlety is introduced by acausality, which is used precisely because its effect on the relevant efficiency metrics is small. In particular, low-degree power maps, low-degree affine polynomials, and their functional inverses all have efficient acausal computations.
Importantly, and unlike in the case of traditional cipher design, the size of the field over which the cipher is defined is immaterial to its cost of operation. For example, changing the base field of a hash function from F_(2^128) to F_(2^256) doubles the digest length at no additional cost.
The flip side of the cheapness of native field operations is the expensiveness of the non-native operations that traditional ciphers are typically composed of. For example, the exclusive-or operation is extremely cheap for traditional ciphers because the platforms on which they run represent everything as sequences of bits; however, applying the same operation to elements of an odd-characteristic field requires first computing this bit expansion, which is prohibitively expensive. Arithmetization-oriented ciphers must therefore sacrifice the security benefits conferred by mixing algebras.

Cryptanalytic Focus
In traditional cipher design, statistical attacks, particularly differential and linear cryptanalysis, are considered to be the main threat. The alternatives to statistical attacks, algebraic attacks, seldom deliver better results despite being a subject of active research. However, the opposite seems to be the case for arithmetization-oriented ciphers, for two reasons. First, the flexibility in choosing the field size, the gratuitous nature of scalar multiplication, and acausal computation allow killing statistical attacks in a rather small number of rounds.
Second, and more importantly, the optimization of ciphers for arithmetic modalities of computation has the unfortunate side-effect of enabling attacks that exploit their low arithmetic complexity. Any cipher whose operations are described by simple polynomials gives rise to a range of attacks that manipulate those same polynomials algebraically (and enjoy the speedup afforded by acausal computation). While it is true that any function from finite fields to finite fields can be represented by a polynomial, the problem is that arithmetization-oriented ciphers make this polynomial representation concise and thereby reduce the complexity of algebraic attacks that are otherwise wildly infeasible. Among this class of algebraic attacks we count the interpolation attack [29] and the GCD attack [3, §4.2], and, warranting particularly close attention, Gröbner basis attacks. For an overview of the processes involved in Gröbner basis attacks, we refer the reader to Appendix A; we proceed here assuming familiarity with these concepts.
The interpolation and GCD attacks rely on the univariate polynomial expression of the ciphertext as a function of the plaintext (or vice versa). Their complexity, and their countermeasures, are mostly understood. In essence, it is sufficient to ensure that the univariate polynomial describing the algorithm is of high enough degree and is dense for the algorithm to be deemed secure against these attacks.
In contrast to these attacks, Gröbner basis attacks admit a multivariate polynomial description and are much more difficult to qualify in terms of complexity. This difficulty stems from a variety of sources:
- The field of arithmetization-oriented cipher design is relatively new, spurred by recent progress in advanced cryptographic protocols. For ciphers not optimized for arithmetic complexity, merely storing the multivariate polynomials in memory tends to be prohibitively expensive, let alone running a Gröbner basis algorithm on them. As a result, Gröbner basis attacks are rarely considered and poorly studied.
- There may be many ways to encode a cipher as a system of multivariate polynomials, or more generally, to encode an attackable secret as the common solution of a set of multivariate polynomial equations. As such, Gröbner basis attacks do not constitute one definite algorithm but a family of attacks whose members depend on the particular choices made while modeling the cipher as a collection of polynomials.
- The complexity of Gröbner basis algorithms is understood only for systems of polynomial equations satisfying a property called regularity, which corresponds to the algorithms' worst-case behavior. Even if a given system of polynomial equations is regular, it is difficult to prove that this is the case without actually running the algorithm. The complexity of Gröbner basis computation for irregular systems can be characterized in terms of the system's degree of regularity, but once again there is no straightforward way to compute this degree without actually running the Gröbner basis algorithm.
- In some cases, the actual Gröbner basis calculation is relatively simple, but the corresponding variety contains parasitical solutions in the field closure. Additional steps are then required to extract the correct base field solution, and these post-processing steps may be prohibitively complex. The parasitical solutions are typically eliminated by converting the Gröbner basis into one with a lexicographic monomial order, at which point at least one basis polynomial is univariate; factorizing this polynomial identifies the solutions in the base field. The complexity of monomial order conversion can be, and often is, captured via that of the FGLM algorithm [26]; however, an alternative algorithm called the Gröbner Walk does not have a rigorous complexity analysis and yet is sometimes observed to outperform FGLM in practice [16]. More fundamentally, the diverse range of options afforded to the attacker when modeling the cipher in terms of polynomials, as well as during other steps of the Gröbner basis attack, suggests that this typical strategy of monomial order conversion and factorization may be merely one out of many: it is eminently plausible that there are alternative strategies to filter out parasitical extension field solutions.
The dual design criteria of having an efficient arithmetization and offering security against Gröbner basis attacks seem to be fundamentally at odds with each other. A concise polynomial description of a cipher benefits both the algebraic attack and the advanced cryptographic protocol that uses it. Consequently, the question of security against Gröbner basis attacks is the crucial concern raised by arithmetization-oriented ciphers, and no such proposal is complete without explicitly addressing it.
We observe that the non-deterministic encodings used in zero-knowledge proofs have a counterpart in the cipher's polynomial modeling, and make both the zero-knowledge proof and the Gröbner basis algorithm more efficient. Furthermore, we conjecture that this duality is necessarily the case, even for tricks and techniques that we may have overlooked.
The relative importance of Gröbner basis attacks is illustrated by Jarvis [5] and MiMC [3], two arithmetization-oriented ciphers that were proposed with explicit consideration for a wide range of attacks, but not attacks based on computing Gröbner bases. However, shortly after its publication, a Gröbner basis attack that requires only a single plaintext-ciphertext pair was used to discover non-ideal properties in Jarvis [2]. An investigation of MiMC using the same attack was argued to be infeasible [2, Sec. 6]. While finding the Gröbner basis is easy, the next two steps, monomial order conversion and factorization of the resulting univariate polynomial, are not, owing to the infeasibly large number of parasitical solutions in the field closure.
However, relying on the large number of parasitical solutions for security against Gröbner basis attacks is a new security argument, and a risky one. The simple observation that using more than one plaintext-ciphertext pair makes the system of equations overdetermined, and thus filters out all parasitical extension field solutions with overwhelming probability, seems to undermine this argument. We note that the complexity analysis of overdetermined polynomial system solving requires delicate attention, and it is conceivable that the resulting attack is also infeasible, but for a different reason. However, the point is that even if this is the case, MiMC's security is not guaranteed by the large number of parasitical solutions. Either way, these observations raise the question whether there is a systematic argument for Gröbner basis security that does not depend on the particular flavor of the attack. In Section 4.5 we answer this question positively by providing such an argument.

Translation of Existing Cryptographic Constructions
Cryptographic primitives are generally not used directly but as part of a larger scheme (e.g., a mode of operation). When using an arithmetization-oriented primitive as part of such a scheme, it is important to address several concerns, starting with efficiency. Consider for example AES-CTR - it is easy to see that this mode of operation mixes two algebras: F_{2^8} for the block cipher part and the integers Z for the counter. The result is a mode of operation with a prohibitively inefficient arithmetization.
Another important aspect that is easy to overlook is that the interface of the scheme may not be properly defined to work with field elements. As an instructive example we consider sponge constructions. A sponge construction generates a hash function from an underlying permutation by iteratively applying it to a large state. The state of a sponge function is defined to consist of b = r + c bits, where r and c are called the rate and the capacity of the sponge, respectively. In other words, sponge functions are inherently defined to work over vector spaces over F_2 (with exclusive-or and conjunction as their native operations). We show in Section 4.4 how to fix this mismatch of algebras specifically for the case of sponges; however, such a straightforward fix might not always be possible.
Perhaps most important is the observation that even if the security of a construction is well understood in the traditional setting, this knowledge may not be transferable to the arithmetized variant. A case in support of this point is the Merkle-Damgård construction in the face of Gröbner basis attacks. We observe that in certain cases the degree of regularity grows slowly but surely as a function of the round number when Davies-Meyer is used, whereas the degree of regularity remains constant (!) when Miyaguchi-Preneel is used. This is a surprising observation since PGV hash functions are believed to be interchangeable in practice. We conjecture that the absence of growth is due to the interface through which the unknowns are introduced. In particular, in Miyaguchi-Preneel the chaining value is introduced via the key interface, and since this value is initially known, the key schedule does not contribute to the complexity of the resulting system of equations. We suspect that this vulnerability has an analogue in traditional cryptanalysis. We note that the practical interchangeability of the PGV constructions remains an open question.
The key takeaway from this section is that using schemes that are well understood in the traditional model may not be straightforward in the arithmetic model. Before instantiating a primitive in an existing construction, it is important to check that the construction is efficient with respect to the application at hand, properly defined, and that its security proof translates to the computational model being considered.

Concluding Words
Our survey of the advanced cryptographic protocols employing arithmetic modalities of computation is by no means complete. Consequently, our matching survey of the design considerations induced by the advanced cryptographic protocols that we do cover is likewise incomplete.
For example, fully homomorphic encryption is missing from our list of cryptographic protocols and yet induces other design considerations, starting with the conjectured unavailability of acausal operations. Another difference is that multi-party computations and zero-knowledge proofs tend to be flexible with respect to the operating field, and in principle this field is not an input to the security calculation. By contrast, the field of choice in fully homomorphic encryption is intricately linked to the security provided by the encryption scheme. A third difference with fully homomorphic encryption is that both additions and multiplications accrue noise, although the noise increase due to multiplication is much greater. As a result, additions are not free but merely much cheaper than multiplications.
It is possible that we overlooked other advanced cryptographic protocols employing arithmetic modalities of computation, or that others are yet to be invented. If there is demand on the part of these protocols for symmetric ciphers, then the design considerations for such ciphers ought to be re-evaluated in light of the target protocol and application. In such an event, the points and questions raised by our analysis provide an ample roadmap for such a reassessment.
Lastly, we note that the field of algebraic attacks seems rather underexplored. As a result, it is difficult to make a compelling security argument valid for the entire family of attacks. We expect third-party analysis to contribute to fleshing out this field and hope that this analysis confirms the merit of our design principle for addressing algebraic attacks (Section 4.5).

Design Decisions
Following the discussion in Section 3 it is clear that designing an arithmetization-oriented cipher is different from designing a traditional one, and that the different considerations lead to different design decisions. In this section we explain and motivate the design decisions we made in the course of developing the Marvellous designs Vision (Section 5) and Rescue (Section 6). We begin by explaining our motivating principles.

General Structure
Vision and Rescue are two primitives based on substitution-permutation networks operating over fields of even and odd order, respectively. Both families manipulate a state of m > 1 elements seen as a column vector. A round of the function includes two steps. In every step an S-box is applied to each of the m state elements, followed by an MDS matrix which mixes the elements together.
The S-box consists of a power map, possibly composed with an affine layer. The exact maps used are detailed in the parts specific to Vision and Rescue, and so is the difference between the odd and even step within each round. The cipher is an iterative application of the round function N times with different round keys derived from the key schedule.
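The general structure described above can be sketched as follows. This is an illustrative toy in the Rescue flavour, not the specification of either cipher: the parameters (p = 17, m = 2, α = 3), the round keys, and the 2x2 mixing matrix (merely invertible, not a verified MDS matrix) are all assumptions chosen so the sketch fits in a few lines.

```python
# Toy sketch of one Marvellous-style round: S-box layer, mixing layer,
# key injection, twice per round. NOT the real Vision/Rescue parameters.

p, m, alpha = 17, 2, 3                 # toy prime field, state size, exponent
inv_alpha = pow(alpha, -1, p - 1)      # exponent realizing x -> x^(1/alpha)
MIX = [[1, 1], [1, 2]]                 # toy invertible matrix (det = 1 mod p)

def mix(state):
    return [sum(MIX[i][j] * state[j] for j in range(m)) % p for i in range(m)]

def round_function(state, k1, k2):
    # Step 1: "inverse" power map S-box, then mixing, then key injection.
    state = [pow(x, inv_alpha, p) for x in state]
    state = [(x + k) % p for k, x in zip(k1, mix(state))]
    # Step 2: low-degree power map S-box, then mixing, then key injection.
    state = [pow(x, alpha, p) for x in state]
    state = [(x + k) % p for k, x in zip(k2, mix(state))]
    return state
```

Since every layer (power map with gcd(α, p − 1) = 1, invertible matrix, key addition) is a bijection, the round is a permutation of the full state space.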

Key Schedule
The key schedule of the algorithms reuses the round function. The master key is fed through the plaintext interface and random round constants are used where the subkey is normally injected. The subkeys are then determined as the value of the state immediately following the constant injection.
The round constants are derived in the following way: we use SHAKE256 to expand a short seed into enough randomness from which one samples the first round constant (with rejection as necessary to ensure that the bit string does not represent a subfield element nor an integer larger than or equal to the prime modulus). All subsequent constants are obtained by applying an affine transformation to the previous one. The first round constant and the coefficients of the affine transformation can be generated deterministically using the code provided in [38].
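The derivation above can be sketched for the prime-field case as follows. The seed and the affine coefficients a, b are placeholders (the reference code in [38] fixes the real values), and the subfield-element rejection, which only applies in the binary-field case, is omitted.

```python
# Hedged sketch of the round-constant derivation over F_p: SHAKE256 expands
# a seed, the first constant is obtained by rejection sampling (discarding
# candidates >= p), and later constants follow c <- a*c + b mod p.
import hashlib

def round_constants(seed: bytes, p: int, a: int, b: int, count: int):
    nbytes = (p.bit_length() + 7) // 8
    stream = hashlib.shake_256(seed).digest(64 * nbytes)  # ample randomness
    first = None
    for i in range(0, len(stream), nbytes):
        candidate = int.from_bytes(stream[i:i + nbytes], "big")
        if candidate < p:                  # rejection sampling
            first = candidate
            break
    constants = [first]
    for _ in range(count - 1):
        constants.append((a * constants[-1] + b) % p)
    return constants
```

The procedure is fully deterministic given the seed, so all parties can regenerate the same constants.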
In recent years, driven by the advent of lightweight cryptography, complex key schedules have fallen out of favor. For the Marvellous designs we have decided to take the opposite approach, namely a heavy (i.e., non-linear) key schedule. This complexity is motivated by the following arguments:
- The domain of arithmetization-oriented ciphers is relatively new and it pays to err on the side of safety until the landscape of possible attacks has been explored more thoroughly.
- One of the use cases of arithmetization-oriented ciphers is hashing; and in this case it is possible to completely hide the complexity overhead of the key schedule as its input is a known IV or fixed key. In other cases it may be possible to amortize the cost of the key schedule over the cost of the entire execution.
- A straightforward Gröbner basis attack on the block cipher represents a key recovery from one or a few plaintext-ciphertext pairs. When the key schedule is simple - say, linear - then the same variables that are used to represent the key in one round can be reused across all other rounds. A complex key schedule introduces many more variables and equations, making the system of equations that much more difficult to solve. Reusing the round function in the key schedule is a conceptually simple way to require at least as many polynomials and variables in the polynomial modeling step as are required to attack the hash function.
- A less straightforward Gröbner basis attack on the block cipher targets the injected subkeys rather than the master key. However, as these are different and have no attackable relation, they must be treated as independent variables. Consequently, at least 2N plaintext-ciphertext pairs (one per step) are necessary to uniquely determine these subkeys. With the resulting explosion in the number of variables and equations, even a very mild degree of regularity makes the system of equations unsolvable in practice.

Efficiency
Throughout the Marvellous designs, we only use arithmetization-efficient maps or the functional inverse of one. The realization of functional inverse maps is not always efficient when implemented straightforwardly. However, all advanced protocols considered in this paper enable acausal computations that make the functional inverse of an arithmetization-efficient operation itself arithmetization-efficient as well.
A particularly useful property of our designs is the inverse trade-off between m and N (i.e., the number of field elements in the state and the number of rounds, respectively). We see that for higher m, the degree of regularity grows faster in each round. This allows us to treat m as a parameter that can be tweaked in order to favor a lower multiplicative depth in exchange for a lower base field size. For example, a large m can be used to build an n-ary Merkle tree rather than a binary one and thus shrink the authentication paths. As the S-boxes operate in parallel, a large m allows m multiplications to be compressed into a single communication round in an MPC protocol.

Arithmetic Sponge
A sponge construction generates a hash function from an underlying permutation by iteratively applying it to a large state [15]. Traditionally, the state is thought of as consisting of b = r + c bits, where r and c are called the rate and the capacity of the sponge, respectively. In every iteration of the absorbing phase, r bits of the input are injected into the state until there are no new input bits left; in every iteration of the squeezing phase, r bits of the state are read out until the desired output length is met. We slightly adapt this definition to allow for hashing of field elements. Instead of working over bits, the rate part now consists of r_q field elements of F_q. The remaining c_q = m − r_q elements of the state constitute the capacity, and their size determines the security of the sponge.
To turn the block ciphers into permutations, the secret key is fixed to zero. The resulting permutation is then used in a sponge construction to obtain an extendable-output function and, if the output length is fixed, a hash function. We note that the sponge mode can also be used to turn the permutations into stream ciphers. However, exploring this option is beyond the scope of this paper.
The resulting sponge absorbs (using field addition) and squeezes r_q field elements per iteration and offers c_q · log_2(q)/2 bits of security. Note that increasing r_q while keeping c_q fixed increases the throughput of the sponge without significantly affecting the cost, and without affecting the security at all.
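The absorb/squeeze flow over field elements can be sketched as follows. The inner permutation here is a stand-in (a few power-map and mixing steps over a toy field), not Vision or Rescue, and padding is out of scope; only the sponge mechanics are the point.

```python
# Minimal sketch of a sponge over F_p whose rate and capacity are counted in
# field elements (r_q and c_q) rather than bits. The permutation is a toy.

p, r_q, c_q = 2**64 - 59, 2, 2
m = r_q + c_q

def toy_permutation(state):
    # gcd(3, p - 1) = 1 for this p, so each cubing is a bijection;
    # adding the state sum is an invertible linear mix (since m + 1 != 0).
    for _ in range(4):
        state = [pow(x + i + 1, 3, p) for i, x in enumerate(state)]
        total = sum(state) % p
        state = [(x + total) % p for x in state]
    return state

def sponge_hash(inputs, digest_len=1):
    assert len(inputs) % r_q == 0          # padding is out of scope here
    state = [0] * m
    for i in range(0, len(inputs), r_q):   # absorbing phase
        for j in range(r_q):
            state[j] = (state[j] + inputs[i + j]) % p  # inject via field addition
        state = toy_permutation(state)
    out = []
    while len(out) < digest_len:           # squeezing phase
        out.extend(state[:r_q])
        if len(out) < digest_len:
            state = toy_permutation(state)
    return out[:digest_len]
```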

Security
We now give a high-level overview of the algorithms' security countermeasures, applied to inoculate them against attacks. A more rigorous description of how each attack is prevented can be found in Appendices G-H.

Statistical Attacks
In the design of traditional symmetric-key primitives, resistance to differential and linear cryptanalysis is of utmost importance. Building on the work of Nyberg [35], we see that S-boxes consisting of a power map have good differential and linear properties and that these properties can be easily derived once the field is specified. Using the wide trail strategy we bound the maximum differential probability and linear correlation of an active S-box.
An interesting observation here is that these quantities improve directly (from the designer's point of view) as a function of the field size. Since the state is treated as a column vector (rather than a matrix as in AES), fast diffusion is also achieved: the MDS matrix guarantees that if at least one S-box is active before it, the total number of active S-boxes before and after the matrix is at least m + 1.
We require that the field F_q is at least 4 bits wide and that q ≥ 2m, so that a state-wide MDS matrix exists; this offers resistance against differential and linear cryptanalysis after four rounds. We refer the interested reader to [22] for a more elaborate description of the wide trail strategy, and to Appendices G-H for the exact derivation in our case.
It is unclear what linear cryptanalysis would look like for ciphers operating over elements of F_p with p prime. Normally, linear cryptanalysis searches for a linear combination of input, output, and key bits that is unbalanced, i.e., biased towards 0 or towards 1. As such, linear cryptanalysis seems tailored to work over the field F_2. No analogue to this behavior exists for F_p. All reasonable extensions of linear cryptanalysis to this case would treat an element in F_p as we would normally treat bits and search for an expression approximating a ciphertext (which is an element of F_p^m) as a multivariate linear polynomial over F_p. Since the polynomial describing the cipher is dense and of high degree, there is no straightforward way to find such a linear approximation. However, we stress that we do not have a rigorous argument for the inapplicability of linear cryptanalysis in this setting, and the dual questions - how to lift linear cryptanalysis to this setting, and how many rounds can be attacked - remain open.
Structural Attacks
Self-similarity attacks work by splitting a cipher into multiple sub-ciphers that are similar to one another, for some definition of similarity. This makes it possible to attack one of the sub-ciphers and use the self-similarity to cleverly link this part with the other ones.
The invariant subspace attack works by observing that all the values of a certain subspace that enter the round function are mapped to outputs within a subspace. The input and output subspaces do not have to be the same.
The standard way to resist these attacks is to inject round constants which break the similarity between different parts of the algorithm, and lift the state out of the subspace being attacked. In our case, we opt to add these constants to the key schedule, resulting in a "fresh" value injected into the state in every round. Note that even when instantiated inside a sponge function, the key schedule still outputs a non-zero value in each step and that value is injected into the state. The improved efficiency in this case is due to the fact that the constants can be precomputed and hard-coded into the realization of the algorithm.

Algebraic Attacks
The way to resist most algebraic attacks is to ensure that the polynomial describing the output of the primitive is dense and of high degree. Initially, having a high-degree polynomial appears to be at odds with having an efficient arithmetization-oriented primitive. We resolve this apparent contradiction by using maps of high degree that can be computed efficiently owing to acausal operations, and by operating on m ≥ 2 state elements. The MDS matrix then ensures good mixing, resulting in a dense polynomial.
Gröbner basis attacks are of particular interest here. Recall that a Gröbner basis attack consists of the following steps: (i) computing the Gröbner basis in degrevlex order; (ii) converting the Gröbner basis into lex order; (iii) factorizing the univariate polynomial and back-substituting its roots.

We use the following design principle to argue security against Gröbner basis attacks: the security of arithmetization-oriented ciphers against Gröbner basis attacks should come from the infeasible complexity of computing the Gröbner basis in degrevlex order.
This principle guarantees that the Gröbner basis security is independent of the presence of parasitical solutions in the field closure; if present and large in number, these parasitical solutions represent a superfluous security argument because the attacker has to get past step (i) in order to get to step (iii). More importantly, with this principle, the number of parasitical extension field solutions required for an infeasible univariate factorization is no longer a constraining factor in determining the number of rounds.
In order to guarantee that finding the first Gröbner basis is prohibitively expensive, we implement the cipher and an attack, and observe the degree of regularity experimentally for small round numbers. We assume a constant relation between the observed concrete degree of regularity and the degree of regularity of a regular system with the same number of equations, degrees, and variables. Conservatively assuming ω = 2 as the linear algebra constant and extrapolating from there, we set the number of rounds such that this Gröbner basis attack has the required complexity.
This Gröbner basis attack represents a preimage search attacking the sponge-based hashing mode as described in Section 4.4 with m = 2, in which one data block is absorbed and one digest block is squeezed out. Alternative Gröbner basis attacks induce a greater complexity due to the increase in the number of variables without a disproportionate increase in the number of equations. When used as a block cipher instead of as a hash function, variables and equations need to be introduced to account for the key schedule. Since this key schedule is as complex as the sponge-based hash function, the resulting Gröbner basis attack must be at least as expensive. These attacks on Vision and Rescue are discussed in more detail in Appendices G-H.

Number of Rounds
To set the number of rounds for each primitive we consider ℓ, the maximal number of rounds that can be attacked by any of the attacks above. Our analysis shows that algebraic attacks other than Gröbner basis ones do not extend beyond three rounds, and that statistical attacks can be used against four rounds at most. For most parameters we considered, the most dangerous attack is the Gröbner basis attack; we discuss its analysis when determining the number of rounds in the respective sections. Having determined ℓ, we set the number of rounds to 2ℓ with a minimum of 10 rounds.

Concluding Words
Given the present state of development of arithmetization-oriented cipher design, our design choices can hardly be argued to be the right ones or to enjoy widespread consensus as being good ones. However, the common theme throughout all choices is the preference for erring on the side of safety, thereby minimizing the risk of unforeseen fatal attacks. As we see in the sequel, even with this conservative approach, our ciphers are extremely efficient in the use cases we identified. Still, we hope that independent third-party analysis reaches the conclusion that our design choices were indeed too conservative, and that the complexity and security margins can safely be reduced.

Vision
We now describe the first family of the Marvellous universe of ciphers, Vision, whose design is inspired by that of AES. Since we already discussed the design decisions leading to Vision in Section 4, we only discuss here the technical specification of the algorithm.

The State
The native field in which Vision operates is F_{2^{n/m}}, where n is the desired security level (in bits) and m is the number of field elements in the state. The state is viewed as a column vector of m field elements and is an element of the vector space (F_{2^{n/m}})^m.

S-Box
The S-box of Vision consists of two operations composed with one another. The first is the inverse power map x → x^{2^{n/m} − 2}, which coincides with the multiplicative inverse x → 1/x for x ≠ 0 and maps 0 to 0. Similar to the S-box of Rijndael, the multiplicative inverse is followed by an affine polynomial. Recall that an F_2-linearized affine polynomial is of the form B(X) = Σ_i b_i X^{2^i} + c with coefficients in F_{2^{n/m}}. Such a polynomial is a permutation over F_{2^{n/m}} if and only if its linear part has only the root 0 in F_{2^{n/m}}. In fact, two S-boxes (π_1, π_2) are used in Vision. The first S-box, π_1, consists of the inversion function composed with the functional inverse of the F_2-affine polynomial. The second S-box, π_2, uses the multiplicative inversion composed with a direct evaluation of the same polynomial. When evaluated in the forward direction (i.e., in encryption mode), π_1 ensures that the algebraic degree of the polynomial description of the algorithm is sufficiently high. Similarly, when evaluated backwards (i.e., in decryption mode), π_2 achieves the same goal.

Round Function
The round function consists of two steps. In each step, the state goes through a non-linear layer followed by a multiplication with an MDS matrix. The non-linear layer applies π_1 to each of the m elements if this is the first step of the round, and applies π_2 if it is the second. Following the non-linear step, an MDS matrix, which is the same for both steps, is used to mix the state. A schematic description of a single round (two steps) of Vision is depicted in Figure 1 and the pseudo-code of the cipher is listed in Algorithm 1.

Algorithm 1: Vision
Input: Plaintext P, round keys K_s for 0 ≤ s ≤ 2N
Output: Vision(K, P)

To generate the ciphertext from a given plaintext, the round function is iterated N times with a key injection before the first round, between every two steps, and after the last round.

Choosing the Number of Rounds
As explained in Section 4, the number of rounds is determined by the Gröbner basis attack. Experiments on reduced parameters show that the base-2 logarithm of the complexity of such an attack is lower-bounded by 5.5mN. Accounting for a factor 2 security margin, we recommend 2⌈n/(5.5m)⌉ rounds, with a minimum of 10 rounds, for ciphers operating on a state of m elements and targeting an n-bit security level.
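The recommendation above reduces to a one-line formula; the sketch below simply restates it, assuming the 5.5mN lower bound on the log2 attack complexity and the factor-2 margin stated in the text.

```python
# Vision round-number rule: N = max(10, 2 * ceil(n / (5.5 * m))) for an
# n-bit security target and a state of m field elements.
import math

def vision_rounds(n: int, m: int) -> int:
    return max(10, 2 * math.ceil(n / (5.5 * m)))
```

For instance, targeting n = 128 bits with m = 4 gives 2⌈128/22⌉ = 12 rounds, while a wide state such as m = 12 hits the minimum of 10 rounds.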

Rescue
The second family of algorithms in the Marvellous universe is Rescue. Rescue is similar to Vision, but operates on elements of prime fields rather than binary ones.

The State
The native field in which Rescue operates is F_p. The state is viewed as a column vector of m field elements, i.e., an element of the vector space F_p^m. The security level afforded by the algorithm is m · log_2(p).

S-Box
Similar to Vision, Rescue uses a pair of S-boxes π_1 and π_2. The S-boxes consist of the power maps x^{1/α} and x^α, respectively, where α is the smallest prime such that gcd(p − 1, α) = 1.
For most fields, α = 3 suffices. When possible we recommend choosing the field such that α = 3 is viable. In some cases the field is determined by the intended application and cannot be chosen freely. For example, the 255-bit prime field F_r, which is used for the multiplications made over the BLS12-381 curve used by ZCash, does not satisfy gcd(r − 1, 3) = 1, making α = 3 unsuitable in this case. Instead, to use this field one can choose α = 5 since gcd(r − 1, 5) = 1.
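The selection rule is mechanical and can be sketched directly; the trial-division primality test is a simplification that is fine for the small values of α that occur in practice.

```python
# Smallest prime alpha with gcd(p - 1, alpha) = 1, which is exactly the
# condition for x -> x^alpha to be a permutation of F_p.
from math import gcd

def smallest_alpha(p: int) -> int:
    candidate = 3          # alpha = 2 never works: p - 1 is even for odd p
    while True:
        is_prime = all(candidate % q for q in range(2, int(candidate**0.5) + 1))
        if is_prime and gcd(p - 1, candidate) == 1:
            return candidate
        candidate += 2
```

As a sanity check, the permutation property itself can be verified exhaustively on a small field.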

Round Function
The round function of Rescue consists of two steps. In the first step, π_1 is used, followed by an MDS matrix. In the second step, π_2 is used, again followed by an MDS matrix.
To generate the ciphertext from a given plaintext, the round function is iterated N times with a key injection before the first round, between each two steps, and after the last round.
A schematic description of a single round (two steps) of Rescue can be found in Figure 2 and the pseudo-code of the cipher is listed in Algorithm 2. Note that here, similar to Vision, both steps are efficient for the prover and for multi-party computations owing to the low degree of x^α: even the x^{1/α} step can be handled through its low-degree inverse.

Algorithm 2: Rescue
Input: Plaintext P, round keys K_s for 0 ≤ s ≤ 2N
Output: Rescue(K, P)

Choosing the Number of Rounds
Similar to the case of Vision, we see that the most prominent attack is the Gröbner basis attack. We find that the base-2 logarithm of the attack complexity is lower-bounded by 4mN. Accounting for a factor two security margin, we set the number of rounds to 2⌈log_2(p)/4⌉ with a minimum of 10 rounds, for a security level of m log_2(p).
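As with Vision, the rule is a one-liner; the sketch below assumes the 4mN bound and the factor-two margin stated above (note that m cancels against the m·log_2(p) security target).

```python
# Rescue round-number rule: N = max(10, 2 * ceil(log2(p) / 4)).
import math

def rescue_rounds(p: int) -> int:
    return max(10, 2 * math.ceil(math.log2(p) / 4))
```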

Benchmarks
In this section we analyze the efficiency of Vision and Rescue with respect to three use cases: AIR constraints for ZK-STARKs (Section 7.1), zero-knowledge proofs based on R1CS systems (Section 7.2), and MPC protocols (Section 7.3). Section 7.4 provides a comparison of the algorithms with MiMC-q/q and MiMC-2p/p.

Notation. We use the following conventions. Variables of multivariate polynomials are denoted with capital letters (X, K, R, ...). Plain variables denote the current state and primed variables (X′, K′, R′) denote variables describing the state at the next cycle of the computation. We limit ourselves to constraints involving only two consecutive states. We use [i, j] (or [i]) to select the indicated element from a matrix (resp. vector). When not affixed to a vector, the notation [m] is shorthand for the set {1, ..., m}. Furthermore, we extend set-builder notation to indicate multiple set members for each satisfied conditional.

AIR Constraints for ZK-STARKs
We begin by realizing the two algorithms in AIR, the Domain-Specific Language (DSL) used to encode ZK-STARKs. For the sake of readers not versed in the relevant definitions related to STARKs [10], we recall these, along with a simple motivating example, in Appendices C and D.

Encoding of a Vision Step as a Set of AIR Constraints
We present an AIR with w = 4m, t = 2 and degree d = 2 for a single step of Vision. The sponge-based Vision hash replaces the key schedule with fixed constants, and hence has half the width of the cipher (w = 2m) and the same length. We describe only the second step in the round, in which B(X) is used. The first step, which uses B^{-1}(X), is analogous. First we deal with computing the key schedule, which requires 2m variables, denoted K[1], ..., K[m] and R[1], ..., R[m].

1. The first cycle is used to compute the map x → x^{q−2}, mapping x to its inverse when x is nonzero and otherwise keeping x unchanged. The following set of constraints (polynomials) ensures this:

K[i] · K′[i] = R[i], R[i] · K[i] = K[i], R[i] · K′[i] = K′[i], for i ∈ [m].

To see this, notice that when K[i] ≠ 0 the second constraint forces R[i] = 1, so the first constraint forces K′[i] = K[i]^{−1}; and when K[i] = 0 the first constraint forces R[i] = 0, so the last constraint forces K′[i] = 0 as well.

2. The affine polynomial B is quartic, i.e., of degree 4 = 2^2, and so there exists a quadratic polynomial in K[1], ..., K[m] and R[1], ..., R[m] that computes the concatenation of the quartic polynomial B along with the linear transformation M and the addition of the step constant C_k used in the kth step.

A single step of the cipher is identical to the key schedule, with the main difference being that instead of adding a step constant (denoted C_k above) we add the kth key expansion during that stage. It follows that with 2m additional variables and essentially the same set of constraints as above, we have accounted for the full AIR of the Vision round.
The Vision hash is a sponge construction and so the keys are fixed to certain known constants. The key schedule is dropped, leading to an AIR of width w = 2m and t = 2 cycles per step.
Note that one could use different AIRs than described above to capture the same computation, just as we could use different AIRs to capture the Fibonacci computation of the example in Appendix C. For instance, one may increase the number of cycles per step from 2 to 2m, while decreasing the width from 4m to 4, by operating on the m state registers sequentially instead of in parallel. However, this alternative description does not reduce the overall size of the AET, which stands at 8m per step (and 16m per round). Similar trade-offs can be applied to Rescue as well, which we discuss next.
(Figure 3: an adapted Rescue round, spanning step 2 of round k and step 1 of round k + 1.)

Encoding of a Rescue Step as a Set of AIR Constraints
Rescue is quite similar to Vision but simpler from an algebraic perspective. The main difference between the two ciphers is that the inverse step of Vision is replaced with a cubing operation (i.e., α = 3) and the quartic polynomial is removed. The result is that each step of the Rescue key schedule or state function involves only m cubic polynomials (or inverses thereof), so we can encode it via an AIR using d = 3 with a single cycle per step and width m. The representation of the Rescue round function admits an optimization owing to acausal computation. Consider an adapted round as shown in Figure 3. Here, the first step of the adapted round is "folded" into its second step. This leaves the first and last steps of the entire primitive to be taken separately. We connect S and S′ from the middles of rounds k and k + 1 using m cubic equations, effectively skipping the evaluation of the state after round k. The result is that we can encode the adapted round function via an AIR with a single cycle per round, d = 3 and width m; the constraints eliminate the intermediate subkey K_{2k} by substitution. We conclude that the Rescue state function AIR has degree d = 3, state width w = m and t = 1 cycle per round. Since the above encoding does not require K_{2k}, the key schedule admits a similar optimization. As a result, the Rescue key schedule AIR also has degree d = 3, state width w = m and t = 1 cycle per round. When the cipher is used as a hash in sponge mode, Rescue does not require an AIR for the key schedule; this was also the case for Vision.

Zero-Knowledge Proofs Based on R1CS Systems
In this section we evaluate the efficiency of Vision, Rescue and MiMC when encoded as rank-one constraint satisfaction (R1CS) systems. Such systems are used by many zero-knowledge proof systems that operate on arithmetic circuits, such as [36], ZK-SNARK [13], Aurora [14], Ligero [4], and Bulletproofs [17].

Encoding of a Vision Step as a System of Rank-one Constraints
Recalling the two cycles of the AIR for Vision recounted earlier for constructing each of the key and round steps (Section 7.1), we convert them into a system of R1CS constraints. Consider the key schedule first; the cipher round is identical. The first cycle is converted into 3m R1CS constraints. The second cycle splits the evaluation of the affine polynomial into two parts, each involving one squaring and thus m constraints per part, resulting in a total of 2m constraints for the second cycle. For this latter constraint we notice that over binary fields (of size 2^k, integer k) squaring is F_2-linear, so the matrix coefficients can be absorbed into constants α_j satisfying α_j^2 = M[i, j]·b_3. Since each step involves both the key derivation and the cipher step, we observe that the cost of a Vision block cipher step is 10m R1CS constraints, and that of a round is 20m.
When used in sponge hash mode the key schedule is fixed, and so the number of R1CS constraints per step is halved. This gives a total of 5m constraints per step (and twice that number per round).

Encoding of a Rescue Step as a System of Rank-one Constraints
To efficiently encode a step of Rescue for α = 3, we use two R1CS constraints to compute the cube of a state variable, giving a total of 2m constraints for the cubing operations over the whole state. The step using the inverse cubing map is analogous. The linear combinations due to the MDS matrix M can be integrated into these 2m constraints. Since the same computation is applied to the key schedule when used as a cipher, we count 4m constraints per step, twice as many (8m) per round, and 2m constraints per step for Rescue used in sponge hash mode because the key schedule is fixed.
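The two-constraints-per-cube count can be made concrete with a minimal R1CS sketch. The field size and the flat (A, B, C) representation below are toy choices for illustration, not taken from any particular proof system.

```python
# Cubing x with two rank-one constraints, each of the form
# <a,w> * <b,w> = <c,w> over the witness w = (1, x, y, z),
# where y = x*x and z = y*x.

p = 101  # toy field

def r1cs_holds(A, B, C, w):
    dot = lambda v: sum(vi * wi for vi, wi in zip(v, w)) % p
    return all(dot(a) * dot(b) % p == dot(c) for a, b, c in zip(A, B, C))

def cube_witness(x):
    y = x * x % p
    return [1, x, y, y * x % p]    # w = (1, x, x^2, x^3)

# Constraint 1: x * x = y.  Constraint 2: y * x = z.
A = [[0, 1, 0, 0], [0, 0, 1, 0]]
B = [[0, 1, 0, 0], [0, 1, 0, 0]]
C = [[0, 0, 1, 0], [0, 0, 0, 1]]
```

A valid witness satisfies both constraints; tampering with the claimed cube breaks the second one.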

MPC with Masked Operations
In this section we explore how to implement Vision and Rescue over MPC using masked operations. We consider three masked operation techniques: one technique to find the inverse of a shared field element due to Bar-Ilan and Beaver [6]; one technique to raise a shared element to an arbitrary but known power due to Damgård et al. [23]; and one new technique to compute the compositional inverse of a low-degree linearized polynomial. The descriptions of the last two techniques can be found in Appendix E.
The common strategy behind these techniques is to apply a random, unknown mask to a shared secret value and to open the masked result. The operation proper is then applied to the opened value, giving a known but still-masked output. The mask on this output is removed by combining it with the output of a dual operation applied to the original shared random mask. The benefit of these techniques comes from shifting the computation of the mask and its dual to the offline phase, which is possible because this computation does not depend on the value to which the operation is applied. In the online phase, the regular operation is computed locally (i.e., without communication); the dual operation does require communication, but it is cheaper.
The first two of these techniques require zero-tests: sub-protocols that produce a sharing of 1 if their input is a sharing of 0, and a sharing of 0 otherwise. Our MPC implementations of Rescue and Vision are agnostic of the particular zero-test as well as of the secret-sharing mechanism. In the sequel we present figures without taking the zero-test into account.

Computing a Vision Round over MPC
Recall that elements of the state in Vision are members of the extension field F_{2^{n/m}}. Since we use a linear secret-sharing scheme, we can perform the additions and multiplications-by-constants from Vision in a straightforward manner, namely by manipulating shares locally. In particular, this means that applications of the MDS matrix to the working state impose no extra cost. However, nonlinear operations do not admit such a straightforward realization and instead require creative solutions to retain an efficient implementation.
Only two component blocks of Vision induce a cost: the inversion operation, and the evaluation of the polynomials B and B^{-1}. All other operations are linear and thus free. Recall that the state of Vision consists of m field elements. Therefore, each round includes m initial inversions and m inverse-polynomial evaluations, followed by another m inversions and m regular polynomial evaluations. These m executions are independent and can therefore be performed in parallel. The cipher consists of N rounds in total. The key schedule algorithm doubles these numbers, but its cost can be amortized over the entire execution of the protocol, so we neglect it here.
To evaluate the inversion step, we use the technique due to Bar-Ilan and Beaver [6], for which pseudocode is given in Appendix E. This procedure requires 2 communication rounds and works for all non-zero elements x ∈ F_{2^n}. In scenarios where the shared value is unlikely to be zero (i.e., if the field is large enough), this technique can be used directly. Ignoring the zero-test, the total cost of this method is 1 communication round: it is possible to merge a multiplication with an opening call.
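The Bar-Ilan and Beaver trick can be sketched in a few lines. The toy model below is our own illustration, not the paper's pseudocode: it uses 3-party additive sharing over a small prime field for readability (the paper works over binary fields F_{2^{n/m}}), and the secure multiplication is replaced by a trusted-dealer stand-in so the masking pattern stays visible.

```python
# Toy sketch of the Bar-Ilan--Beaver inversion trick: open the masked
# value x*r, invert it publicly, and unmask with [r].

import random

p = 2**31 - 1  # illustrative prime field

def share(x, parties=3):
    s = [random.randrange(p) for _ in range(parties - 1)]
    return s + [(x - sum(s)) % p]

def open_(shares):
    return sum(shares) % p

def mul(xs, ys):  # stand-in for one secure multiplication (1 round)
    return share(open_(xs) * open_(ys) % p)

def inverse(xs):
    rs = share(random.randrange(1, p))   # offline: random nonzero mask [r]
    c = open_(mul(xs, rs))               # online: open the masked value x*r
    c_inv = pow(c, p - 2, p)             # local, public inversion
    return [c_inv * r % p for r in rs]   # [x^{-1}] = (x r)^{-1} * [r]

x = 424242
assert open_(inverse(share(x))) == pow(x, p - 2, p)
```

Note that the opening of x·r leaks nothing about x because r is uniform and nonzero, and that the only communication is the multiplication-plus-opening, matching the 1-round online cost stated above (zero-test ignored).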
A similar approach can be used to compute B^{-1}(x). To the best of our knowledge, this masking technique is novel and is thus an independent contribution of this paper. In the interest of brevity, we describe it only in Appendix E.2, together with pseudocode.
The implementation of a round of Vision follows straightforwardly from these building blocks, along with linear (and thus local) operations. A round of Vision consists of 2 calls to the inversion protocol at a total cost of 2 communication rounds (ignoring the zero-test), the evaluation of B^{-1}(x) with an overall cost of 3 communication rounds (2 of which can be precomputed in an offline phase), and the evaluation of B(x) at a cost of 2 communication rounds. While these operations are performed on each of the m state elements, they are independent and hence parallelizable. The total complexity of a Vision round is therefore: # offline rounds: 2, # online rounds: 5.

Computing a Rescue Round over MPC
The only nonlinear operations of Rescue to take into account are the α and inverse-α power maps. To compute these, we have adapted, for arbitrarily large α, the exponentiation technique introduced by Damgård et al. [23]. This way, we can offload a portion of the computation to an offline phase and retain a constant online complexity (i.e., 1 round). A small adaptation of this technique computes the inverse power map at the same online cost. We summarize this adaptation in Appendix E.3. Each procedure requires ⌈log_2 α⌉ + 2 multiplications in total, and ⌈log_2 α⌉ + 2 communication rounds (including the 1 online round). In the case of the inverse-α map, obtaining [r^{-1}] can be combined with the exponentiation, reducing the number of communication rounds by one. All operations on r can be executed in parallel during an offline phase, as they depend neither on the input nor on each other.
The implementation of Rescue is now straightforward. Each power map is applied in parallel to all m elements of the state. The multiplication with the public MDS matrix is free. The cost of a single round is therefore: # offline rounds: ⌈log_2 α⌉ + 1, # online rounds: 2, # multiplications: 2m · (⌈log_2 α⌉ + 2).

Comparison
To compare MiMC with Vision and Rescue, we set m = 2, n = 128, p = 2^64 + 13, q = 2^129 − 45 and α = 3. For the purpose of the present comparison, the number of AIR constraints (of degree d) is given by the value of w · t; we ignore the zero-test for MPC and observe that the offline parts can be done in parallel for all rounds. We stress that, in the interest of a fair comparison with MiMC, we provide figures of merit only for the case m = 2, which is, on the one hand, the smallest m we deem secure and, on the other hand, the largest m that allows for such a comparison. However, since our designs achieve an inverse trade-off between m and the number of rounds, setting m to a higher value would show the Marvellous designs to be even more efficient than they are portrayed here.
We compare the three algorithms for AIR (Table 1), R1CS (Table 2) and masked MPC (Table 3) in two scenarios: as block ciphers and as sponge functions. For 128-bit block cipher security, we require 24 rounds of Vision; 32 rounds of Rescue; 82 rounds of MiMC-q/q; and 164 rounds of MiMC-2p/p. Since the absorption and squeezing of inputs and outputs in the case of MiMCHash-q/q are not operations native to the working field, they require complex arithmetic. By contrast, MiMCHash-2q/q is much better suited to arithmetization and is thus what we compare our algorithms (in hash mode) against. In sponge mode these parameters offer only 32 bits of security against collisions; nevertheless, they allow for an apples-to-apples comparison with the same rate and capacity. Note that the field size does not change the cost under the metrics we consider in this paper (i.e., arithmetic complexity).

Conclusion
This paper explores the design of secure and efficient symmetric-key primitives for advanced cryptographic protocols based on arithmetization. It starts by surveying three protocols that fit this description: zero-knowledge proofs for the Turing or RAM models of computation, zero-knowledge proofs for the circuit model, and multi-party computation.
The design considerations unique to arithmetization-oriented primitives are then discussed. We show that the set of efficient building blocks available to the designer of an arithmetization-oriented cipher differs from that available to the designer of a traditional cipher. We also observe that the efficiency metrics are different, as are the security concerns. This last point we discuss at length, particularly in the context of Gröbner basis attacks, and we propose a strategy to ensure that an arithmetization-oriented cipher is secure against this class of attacks.
After this discussion we turn to designing two new families of arithmetization-oriented ciphers, Vision and Rescue, new members of the Marvellous universe. These primitives are benchmarked with respect to three use cases: the ZK-STARK proof system; proof systems based on Rank-One Constraint Satisfaction (R1CS) systems; and multi-party computation (MPC). We compare them in these settings to MiMC and find the Marvellous algorithms to be extremely efficient: despite the conservative nature of these designs, they outperform MiMC in all but a few cases.

A Gröbner Basis Attacks
We recall here some basic facts about attacking symmetric primitives using Gröbner basis algorithms. For more general information on the underlying mathematics, we refer the reader to Cox et al. [21]. For a specific description of the steps involved in attacking block ciphers with Gröbner bases, we refer to the excellent summary by Buchmann et al. [16].
An ideal I ⊆ F_q[x] = F_q[x_1, ..., x_n] is the algebraic span of a list of polynomials {b_1(x), ..., b_m(x)}, meaning that every member f(x) ∈ I can be expressed as a weighted sum of the basis elements with coefficients taken from the polynomial ring: f(x) = Σ_{i=1}^{m} a_i(x) · b_i(x) with a_i(x) ∈ F_q[x]. An ideal can be spanned by many different bases; among these, Gröbner bases are particularly useful for computational tasks such as deciding membership, equality, or consistency. The task we are interested in is polynomial system solving: computing the ideal's variety, i.e., the set of common solutions obtained when equating all ideal members to zero.
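For intuition, a variety can be computed by brute force when the field and the system are tiny. The toy example below (our own, chosen only to illustrate the definition) finds the common zeros of two polynomials over F_7 by exhaustive search; real attacks replace this enumeration with Gröbner basis machinery precisely because exhaustive search does not scale.

```python
# Toy illustration of an ideal's variety: the common zeros of a small
# system over F_7, found by exhaustive search.

q = 7
f1 = lambda x, y: (x * x + y - 1) % q   # x^2 + y - 1
f2 = lambda x, y: (x + y - 1) % q       # x + y - 1

variety = [(x, y) for x in range(q) for y in range(q)
           if f1(x, y) == 0 and f2(x, y) == 0]

# Subtracting the equations gives x^2 - x = 0, so x is 0 or 1.
assert variety == [(0, 1), (1, 0)]
```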
A monomial order is a rule for ordering a polynomial's terms. This rule is not just a convenience for mathematicians reading and writing polynomials; it also affects how polynomials are stored on a computer, as well as the complexity of various operations on ideals. In general, the computation of a Gröbner basis is fastest with respect to the degree reverse lexicographic (degrevlex) order. However, whenever the variety contains a substantial number of solutions, a Gröbner basis in lexicographic (lex) order is preferable. A Gröbner basis in lex order guarantees the presence of at least one univariate basis polynomial. Factoring this polynomial and back-substituting its roots generates another, simpler Gröbner basis, again in lex order; iterated back-substitution produces all solutions. The FGLM [26] and Gröbner Walk [19] algorithms transform a Gröbner basis for one monomial order into one for another order.
The focus on degrevlex order for computing the first Gröbner basis owes in large part to the success of the celebrated F4 and F5 algorithms [25,27]. In every iteration, these algorithms extend the working set of polynomials via multiplication by monomials up to a certain step degree, before reducing the extended polynomials using linear algebra techniques, essentially Gaussian elimination on the Macaulay matrix. The F5 algorithm stands out in this regard because it can be proven not to terminate before the step degree reaches the ideal's degree of regularity [7,8], which is informally equal to the degree of the Gröbner basis in a degree-refining order such as degrevlex (but not lex). If a system of polynomial equations {f_i(x) = 0}_i is regular, exhibiting no non-trivial algebraic dependencies in the same sense that non-singular matrices exhibit no linear dependencies, then the degree of regularity is given by the Macaulay bound: d_reg = 1 + Σ_{i=1}^{m} (deg(f_i) − 1). When there are more equations than unknowns, the system of equations cannot be regular, and the worst-case behavior of F5 is instead captured by semi-regular systems. The degree of (semi-)regularity is then defined as the degree of the first non-positive term in the power series expansion of HS(z) = Π_{i=1}^{m} (1 − z^{deg(f_i)}) / (1 − z)^n, where m is the number of equations and n the number of variables. Note that when m ≤ n this formal power series is a polynomial and the Macaulay bound indicates one more than its degree; this is what justifies re-using the term degree of regularity. However, while F5 must reach this degree before it terminates, the degree of the resulting Gröbner basis is typically much smaller.
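The degree of (semi-)regularity just described is easy to compute numerically. The sketch below (our own helper, under the standard convention that the sought degree is the index of the first non-positive power-series coefficient) expands HS(z) = Π_i (1 − z^{d_i}) / (1 − z)^n, implementing the division by (1 − z)^n as n successive prefix sums.

```python
# Degree of (semi-)regularity from the Hilbert series
# HS(z) = prod_i (1 - z^{d_i}) / (1 - z)^n.

def semi_regular_degree(degrees, n, max_deg=200):
    # Numerator prod_i (1 - z^{d_i}) as a coefficient list.
    num = [1]
    for d in degrees:
        new = num + [0] * d
        for i, c in enumerate(num):
            new[i + d] -= c
        num = new
    # Dividing by (1 - z) is a running prefix sum; do it n times.
    coeffs = num + [0] * max_deg
    for _ in range(n):
        for i in range(1, len(coeffs)):
            coeffs[i] += coeffs[i - 1]
    for i, c in enumerate(coeffs):
        if c <= 0:
            return i
    raise ValueError("increase max_deg")

# For m = n the result matches the Macaulay bound 1 + sum_i (deg f_i - 1):
# three cubics in three variables give 1 + 3*2 = 7.
assert semi_regular_degree([3, 3, 3], 3) == 7
```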
Regardless of whether the system is regular, knowledge of the degree of regularity provides a lower bound on the complexity of computing a Gröbner basis, namely that of running Gaussian elimination on a Macaulay matrix of degree d_reg. The quantity ((n + d_reg) choose d_reg)^ω therefore bounds the attack complexity, where ω ≥ 2 is the linear algebra constant: ω = 3 for standard Gaussian elimination; ω ≈ 2.37 if fast multiplication techniques [41] are used; and ω = 2 when sparse linear algebra techniques such as Wiedemann's algorithm [40] can be used.
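This bound is a one-liner to evaluate. The helper below (name and interface our own) reports the base-2 logarithm of ((n + d_reg) choose d_reg)^ω, i.e., the attack cost in bits, for a chosen linear algebra constant ω.

```python
# Bit-complexity of the Groebner basis lower bound discussed above:
# omega * log2( binom(n + d_reg, d_reg) ).

from math import comb, log2

def groebner_bits(n, d_reg, omega=2):
    return omega * log2(comb(n + d_reg, d_reg))

# e.g. a system in 20 variables with degree of regularity 40, with the
# conservative choice omega = 2 used in our security arguments:
security = groebner_bits(20, 40)
```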
Buchmann et al., writing before the above-mentioned results on the degree of regularity were established, observe that for specially chosen monomial orders the Gröbner basis comes for free as a result of clever polynomial modeling [16]. The bottleneck of the attack then consists of the monomial order conversion using either FGLM or the Gröbner Walk.
In stark contrast, the security rationale underlying our cipher designs is explicit about the designed intractability of the first Gröbner basis computation step. Whatever steps come after might be of greater or lesser complexity; either way they are irrelevant to the security consideration. In particular, the security of our ciphers is determined with respect to the Gröbner basis calculation in degrevlex order with ω = 2. The degree of regularity is experimentally compared against that of regular systems of the same dimensions for small round numbers. In the case of Vision we observe that the experimental degree of regularity and the degree of regularity of regular systems are equal; in the case of Rescue, we observe a linear relation and extrapolate from there.

B Experimental Results Using Gröbner Bases
Vision. Due to the high complexity of calculating the degree of regularity (i.e., of performing the Gröbner basis calculation and observing the degree of the resulting basis) even for round-reduced versions, we have few results even after running the experiment for 60 hours. The one observed data point, coupled with the prohibitive complexity of obtaining more, justifies the assumption that the attacked system behaves like a regular system with the same number of equations and variables. We extrapolate this finding and show the complexity of constructing a degree reverse lexicographic Gröbner basis of Vision for different numbers of rounds and parameters m. We found these results to be independent of the field size.

Rescue. We performed the same experiments for Rescue. We calculated the degree of the Gröbner basis output by the Gröbner basis algorithm for several round-reduced versions of Rescue and found that this concrete degree was exactly half the degree of regularity of regular systems, independently of the field size. We show the complexity of constructing a degree reverse lexicographic Gröbner basis of round-reduced versions of Rescue for different m, assuming the same concrete-to-regular degree ratio holds even for larger round numbers. For comparison, we also show the complexity if the system were regular.

C STARK Intuition
We start by recalling the relevant definitions from [10], along with a simple motivating example.
Scalable Transparent Arguments of Knowledge (STARKs) built from Interactive Oracle Proofs (IOPs), like [9,10], express computations using an Algebraic Execution Trace (AET): for a computation with t steps and an internal state captured by w registers, the trace is a t × w array. Each entry of this array is an element of a finite field F.
Before presenting formal definitions, we motivate them using a simple example. Suppose the prover wishes to prove the statement below, where p is prime and F_p is the finite field of size p: "∃ x_0, x_1 ∈ F_p such that y is the qth element of the Fibonacci sequence defined recursively for i > 1 by x_i = x_{i−1} + x_{i−2} mod p." An execution trace proving this statement is a (q + 1) × 1 array in which the ith state is, supposedly, x_i. To verify the correctness of the statement, the verifier must check that the following two conditions hold:
- boundary constraints: the last entry equals y.
- transition relation constraints: for each i ≤ q − 1, the ith register plus the (i+1)st register equals the (i+2)nd register. This can be captured succinctly by a constraint of the form x_{i+2} − x_{i+1} − x_i = 0, applied to each consecutive triple of states in the trace. Satisfying a constraint always means setting it to 0, so the right-hand side above is redundant; henceforth we shall write only the left-hand side, namely x_{i+2} − x_{i+1} − x_i.

Alternatively, the execution trace could be a q × 2 array in which the ith state supposedly contains (x_i, x_{i+1}). Now the verifier checks two constraints for each pair of consecutive states; using X, Y to denote the two registers of the current state and X′, Y′ those of the next, these are X′ − Y and Y′ − X − Y. The boundary constraint now checks that the [q, 2]-entry of the execution trace equals y.
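The (q + 1) × 1 variant of this example can be written out directly. The sketch below (toy parameters of our choosing) builds the execution trace and checks the transition constraint x_{i+2} − x_{i+1} − x_i = 0 on every consecutive triple, together with the boundary constraint on the last entry.

```python
# The Fibonacci AET example as code: build the (q+1) x 1 trace and
# check transition and boundary constraints.

p = 97  # toy prime
q = 10

def trace(x0, x1):
    xs = [x0, x1]
    for _ in range(q - 1):
        xs.append((xs[-1] + xs[-2]) % p)
    return xs  # q + 1 entries: x_0 .. x_q

def verify(tr, y):
    transitions = all((tr[i + 2] - tr[i + 1] - tr[i]) % p == 0
                      for i in range(len(tr) - 2))
    boundary = tr[-1] == y
    return transitions and boundary

tr = trace(1, 1)
assert verify(tr, tr[-1])
assert not verify(tr, (tr[-1] + 1) % p)
```

A real STARK verifier does not read the trace; it checks low-degree extensions of these same constraints, but the constraint logic is exactly the one exercised here.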
Rescue. For Rescue, the use of square-and-multiply does not require any specific protocol adaptation. Both power maps can be obtained from an invocation of square_multiply([x], e), where e = α or e = α^{-1} mod p − 1. As with Vision, the linear components do not contribute to the cost. Consequently, the total complexity of one round of this implementation of Rescue is: # offline rounds: 0, # online rounds: ⌈log_2 α⌉ + ⌈log_2 p⌉, # multiplications: 2m · (⌈log_2 α⌉ + ⌈log_2 p⌉).
Comparison. As before, we consider 24 rounds of Vision with n = 128 and m = 2; 34 rounds of Rescue with p = 2^64 + 13, α = 3 and m = 2; and 82 rounds of MiMC-q/q with q = 2^129 − 45; each a parameter set targeting 128 bits of security. This consideration gives rise to the following table of comparison.

G Cryptanalytic Strength of Vision
In this section we argue the security of Vision and explain how it resists certain attacks.

The Wide Trail Strategy
In order to argue the security of Vision, we follow the same line of reasoning as for Rijndael and apply the wide trail strategy to our construction. From Nyberg [35] we obtain the differential and linear properties of the inversion function over arbitrary binary fields: for the field F_{2^{n/m}} we have δ = 2^{−n/m+2} and |λ| = 2^{−⌈n/2m⌉+1}. Since our diffusion layer is an MDS matrix applied to m state elements, we have at least m + 1 active S-boxes every two steps. Requiring n/m ≥ 4, we find that a four-round trail has a maximal differential probability of 2^{4(m+1)(−n/m+2)} < 2^{−2n} and a maximal absolute correlation of 2^{4(m+1)(−⌈n/2m⌉+1)}, which is sufficient to resist potential differential and linear attacks given that a large enough security margin was taken.
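These trail bounds are simple exponent arithmetic and can be checked for concrete parameters. The sketch below (our own helper; it works with base-2 exponents only, and the correlation exponent is derived by counting the same 4(m+1) active S-boxes with Nyberg's |λ| value) evaluates both bounds for Vision's n = 128, m = 2.

```python
# Four-round wide-trail exponents for Vision's inversion S-box,
# using delta = 2^{-n/m+2} and |lambda| = 2^{-ceil(n/2m)+1} from Nyberg.

from math import ceil

def trail_bounds(n, m):
    active = 4 * (m + 1)                       # active S-boxes over four rounds
    dp_exp = active * (-(n // m) + 2)          # log2 of max differential prob.
    corr_exp = active * (-ceil(n / (2 * m)) + 1)  # log2 of max abs. correlation
    return dp_exp, corr_exp

dp, corr = trail_bounds(128, 2)
assert dp < -2 * 128   # differential probability below 2^{-2n}
assert corr < -128     # correlation comfortably below 2^{-n}
```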

Algebraic Degree
The algebraic degree of a function f is defined as the degree of the largest monomial in the algebraic normal form of f. Ciphers with a low algebraic degree are potentially vulnerable to higher-order differential attacks, as introduced by Knudsen [30]. For our construction, the S-box has algebraic degree n/m − 1 after two steps (taking into account both B(x) and B^{-1}(x)). The maximal algebraic degree that can be reached by a polynomial over F_{2^{n/m}} is n/m − 1; thus this degree is achieved already in one round, as per our design strategy following [35].
Interpolation Attacks. Jakobsen and Knudsen introduced the interpolation attack in [29]. Here the attacker constructs polynomials using input/output pairs of the cipher. Since the complexity of calculating GCDs or Lagrange interpolation is linear in the degree of the polynomial, the polynomial representations of the cipher need a high degree to resist this attack. These attacks lend themselves to meet-in-the-middle variants, in which the attacker tries to find a concise rational expression in the plaintext or ciphertext. Since B^{-1}(x) is of full degree and dense, we expect two rounds of Vision to be sufficient to create a complex rational expression between the plaintext and the ciphertext.
To thwart potential meet-in-the-middle attacks, we consider three rounds of the cipher sufficient to resist interpolation attack variants.
Invariant Subfield Attacks. Finally, we consider attacks which make use of an invariant subfield. Recall that for F_{2^{n/m}}, any field F_{2^s} where s divides n/m is a subfield. An adversary might be able to attack the cipher by making it work over one of these subfields: the adversary inputs a value lying in a subfield and receives an output which again lies in that subfield. We require that the affine polynomial has coefficients that do not lie in any proper subfield of F_{2^{n/m}}, thus frustrating this attack.
For a discussion of linear cryptanalysis in prime fields, see Section 4.5.

Interpolation Attacks
We consider polynomial descriptions of the cipher over F_p^m after several rounds. Since the inverse-α power map is of high degree, two rounds of the cipher already attain a polynomial degree close to the maximum, p. Moreover, due to this power map, the polynomial expression is dense. From [29], we know that meet-in-the-middle variants of the interpolation attack are possible; however, this attack becomes infeasible after three rounds.
Gröbner Bases. As for Vision, we provide equations encoding the preimage of the Rescue sponge function, where a single unknown message block was absorbed and one known message block was squeezed out. In contrast to Vision, it is now possible to fold equations across two steps in order to reduce the number of variables and equations. As before, we use [m] to denote {1, ..., m}, unless the brackets are a suffix to a vector or matrix, in which case the indicated element is meant.
This encoding introduces m new variables and as many new equations of degree α per extra full round. The first step introduces one variable and m equations, whereas the last step introduces no variables and one equation. Along with the m variables representing a single state to start from, we thus have in total 1 + mN variables and 1 + mN equations. If the system of equations were regular, the Macaulay bound would give d_reg = 1 + Σ_{i=1}^{1+mN} (deg(f_i) − 1) = (α − 1)(mN + 1) + 1. Experimentally, we observe a concrete degree of regularity d_con = ⌈d_reg/2⌉ for small round numbers. We extrapolate from here, assuming the concrete degree remains half that of a regular system.
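The formula above is easily evaluated for candidate parameter sets. The helper below (our own naming) computes the regular-system Macaulay bound for the Rescue preimage encoding and applies the experimentally observed halving.

```python
# Macaulay bound for the Rescue preimage system, with the observed
# concrete degree d_con = ceil(d_reg / 2).

from math import ceil

def rescue_degrees(alpha, m, N):
    d_reg = (alpha - 1) * (m * N + 1) + 1  # regular-system bound
    d_con = ceil(d_reg / 2)                # experimentally observed degree
    return d_reg, d_con

assert rescue_degrees(alpha=3, m=2, N=10) == (43, 22)
```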
[a]. The protocol generates a random nonzero blinding factor [r] and computes [r^{−e}] in the offline phase. In the online phase, the parties multiply [a] with [r], open [ar], and locally raise this public value to the power e. The result of this exponentiation is then multiplied with [r^{−e}], giving (ar)^e · [r^{−e}] = [a^e r^e r^{−e}] = [a^e].
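The mask-open-unmask pattern just described can be sketched end to end. As in the earlier inversion sketch, the model below is our own illustration: 3-party additive sharing over a small prime field, with trusted-dealer stand-ins for the offline material and for the single secure multiplication.

```python
# Toy sketch of the masked exponentiation: open a*r, exponentiate
# publicly, and unmask with the precomputed [r^{-e}].

import random

p = 2**31 - 1

def share(x):
    s = [random.randrange(p) for _ in range(2)]
    return s + [(x - sum(s)) % p]

def open_(shares):
    return sum(shares) % p

def mul(xs, ys):  # stand-in for one secure multiplication
    return share(open_(xs) * open_(ys) % p)

def masked_pow(a_sh, e):
    r = random.randrange(1, p)
    r_sh = share(r)                        # offline: [r]
    r_inv_e = share(pow(r, -e, p))         # offline: [r^{-e}]
    c = open_(mul(a_sh, r_sh))             # online: open a*r
    c_e = pow(c, e, p)                     # local public exponentiation
    return [c_e * s % p for s in r_inv_e]  # (ar)^e * [r^{-e}] = [a^e]

a, e = 987654321, 5
assert open_(masked_pow(share(a), e)) == pow(a, e, p)
```

Only the multiplication-plus-opening touches the network in the online phase, matching the constant 1-round online cost claimed above.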

Figure 1: A single round (two steps) of Vision.

Figure 2: One round (two steps) of Rescue, where the addition with the key is taken over a prime field.
m be the ith field element of the kth step constant, and let B(Z) = b_0 + b_1 Z + b_2 Z^2 + b_3 Z^4 be the quartic polynomial used by Vision.

Figure 3: An adapted representation of a round of Rescue, better suited for STARK evaluation.

Figure 4: Experimental results of round-reduced Vision for parameters m. The bottom-left graphs show on the vertical axis the degree of regularity, with experiments denoted by asterisks. The upper graph shows the resulting complexity of constructing a Gröbner basis assuming the system is regular, with the grey dotted lines marking 128, 192 and 256 bits of complexity.

Figure 5: Experimental results of round-reduced Rescue for parameters m. The bottom-left graphs show on the vertical axis the degree of regularity of regular systems (in blue), and half that number (in green), with experimental observations denoted by asterisks. The upper graph shows the resulting complexity of constructing a Gröbner basis, with the grey dotted lines marking 128, 192 and 256 bits of complexity.
d_reg polynomials in n variables. At this point there are