Integral Cryptanalysis of WARP based on Monomial Prediction

WARP is a 128-bit block cipher published by Banik et al. at SAC 2020 as a lightweight alternative to AES. It is based on a generalized Feistel network and achieves the smallest area footprint among 128-bit block ciphers in many settings. Previous analysis results include integral key-recovery attacks on 21 out of 41 rounds. In this paper, we propose integral key-recovery attacks on up to 32 rounds by improving both the integral distinguisher and the key-recovery approach substantially. For the distinguisher, we show how to model the monomial prediction technique proposed by Hu et al. at ASIACRYPT 2020 as a SAT problem and thus create a bit-oriented model of WARP taking the key schedule into account. Together with two additional observations on the properties of WARP's construction, we extend the best previous distinguisher by 2 rounds (as a classical integral distinguisher) or 4 rounds (for a generalized integral distinguisher). For the key recovery, we create a graph-based model of the round function and demonstrate how to manipulate the graph to obtain a cipher representation amenable to FFT-based key recovery.


Introduction
Lightweight cryptographic primitives protect an increasing number of interconnected, highly resource-constrained devices transmitting sensitive information in our daily life, such as healthcare devices, the Internet of Things, or sensor networks. Cryptographers have designed a variety of new symmetric-key primitives fitting the constraints of these settings. The most prominent effort in this direction is the ongoing NIST LWC project aiming to standardize a lightweight authenticated encryption algorithm, as well as a lightweight hash function. Among the symmetric primitives, lightweight block ciphers have attracted attention as a building block of lightweight authenticated encryption schemes occupying a very small hardware footprint. While the first generation of lightweight block ciphers such as PRESENT [BKL+07] and LED [GPPR11] mainly focused on hardware footprint, more recent (tweakable) block cipher designs often target different criteria, such as low latency in QARMA [Ava17] or low energy consumption in MIDORI [BBI+15].
WARP is a lightweight 128-bit block cipher presented by Banik et al. at SAC 2020 [BBI+20] as a low-area drop-in replacement of AES-128. Such a lightweight, low-area alternative to AES-128 with the same interface size is attractive as it is easy to integrate in existing systems. Previous 128-bit designs like MIDORI and GIFT-128 [BPP+17] predominantly follow a Substitution-Permutation Network (SPN) structure. However, SPN ciphers are not ideal in terms of area, particularly where a unified encryption and decryption circuit is required, since designing fully involutory components with minimal area is challenging [GPV19]. Banik et al. thus designed WARP using the Generalized Feistel Structure (GFS) design paradigm, which is involutory in nature. With its nibble-oriented 4-bit branches, the F-function consists only of a small 4-bit S-box followed by the key addition, an order of operations inspired by PICCOLO [SIH+11]. The resulting design shares similarities with LBlock [WZ11] and LBlock-like functions, but provides better diffusion of active S-boxes. At the same time, the design is significantly smaller than the prior ones for both encryption-only and unified encryption and decryption implementations.
The designers of WARP also provide security analysis against differential, linear, impossible differential, and integral attacks [BBI+20]. Regarding impossible differential attacks, they show a 21-round impossible differential distinguisher. Applying the method of Sasaki and Todo [ST17], we find that zero-correlation distinguishers also reach 21 rounds ($e_{i+120} \nrightarrow e_{i+60}$ and $e_{i+56} \nrightarrow e_{i+124}$ for $0 \le i \le 3$). For integral attacks, the designers argue that (nibble-oriented) integral distinguishers can cover no more than 20 rounds, and key-recovery attacks can extend these by at most 1 round. Additionally, there are a few third-party analysis results focusing primarily on WARP's differential properties [KY21a, TB21]. Teh and Biryukov [TB21] provided a more accurate analysis against differential attacks taking the clustering effect into account. They also investigate the security of WARP against boomerang attacks. The results are summarized in Table 1.

Specification of WARP
WARP is a 128-bit block cipher that targets small-footprint circuits among 128-bit lightweight block ciphers and follows a variant of the 32-branch GFS design paradigm. It receives a 128-bit plaintext with a 128-bit key and performs 40 full rounds plus 1 partial round to produce a 128-bit ciphertext. The internal state of WARP can be represented as $X = X_0 \| X_1 \| \cdots \| X_{31}$, where $X_i \in \{0, 1\}^4$. WARP splits the 128-bit master key $K$ into two 64-bit halves, i.e., $K = K^{(0)} \| K^{(1)}$, and $K^{((r-1) \bmod 2)}$ is used as the round key in the $r$th round. The $i$th nibble of the round key $K^{(b)}$ with $b = (r-1) \bmod 2$ is denoted by $K^{(b)}_i$, where $b \in \{0, 1\}$ and $0 \le i \le 15$. We denote the $i$th nibble in the input of the $r$th round by $X^{(r-1)}_i$, where $1 \le r \le 41$ and $0 \le i \le 31$. We also sometimes use bitwise indexing, denoting the $i$th bit of $X^{(r-1)}$ counting from the left (MSB) by $x^{(r-1)}_i$, where $0 \le i \le 127$. When it is clear from the context, we only use a number between 0 and 127 to represent a certain bit of the internal state.

Figure 1: The round function of WARP.
As Figure 1 illustrates, the round function of WARP first applies the same S-box $S : \{0, 1\}^4 \to \{0, 1\}^4$ as well as the round-key addition on each pair of consecutive nibbles of the internal state. Next, a round constant is added and a permutation $\pi : \{0, \ldots, 31\} \to \{0, \ldots, 31\}$ is applied to the positions of the nibbles. The last round of WARP does not include the nibble permutation.

Boolean Functions
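To represent bit vectors, we use bold italic lowercase letters, e.g., $\boldsymbol{x} \in \mathbb{F}_2^n$ denotes the $n$-bit vector $\boldsymbol{x} = (x_0, \ldots, x_{n-1})$. We also denote the $n$-bit zero vector by $\boldsymbol{0}$, and $\boldsymbol{e}_i$ represents the unit vector in which all coordinates are zero except for the $i$th coordinate, which is equal to one. We also define a partial order over the set of $n$-bit vectors, such that for any $n$-bit vectors $\boldsymbol{x}$ and $\boldsymbol{y}$, $\boldsymbol{x} \le \boldsymbol{y}$ if $x_i \le y_i$ for all $i$. A Boolean function $f : \mathbb{F}_2^n \to \mathbb{F}_2$ can be uniquely represented by a multivariate polynomial in $\mathbb{F}_2[x_0, \ldots, x_{n-1}] / (x_0^2 + x_0, \ldots, x_{n-1}^2 + x_{n-1})$, which is called its algebraic normal form (ANF) and is defined as follows:
$$f(\boldsymbol{x}) = \bigoplus_{\boldsymbol{u} \in \mathbb{F}_2^n} a_{\boldsymbol{u}} \cdot \prod_{i=0}^{n-1} x_i^{u_i}, \qquad a_{\boldsymbol{u}} \in \mathbb{F}_2.$$
To compactly represent a monomial $\prod_{i=0}^{n-1} x_i^{u_i}$, we use $\boldsymbol{x}^{\boldsymbol{u}}$ or $\pi_{\boldsymbol{u}}(\boldsymbol{x})$. Moreover, given a Boolean function $f = \bigoplus_{\boldsymbol{u} \in \mathbb{F}_2^n} a_{\boldsymbol{u}} \cdot \boldsymbol{x}^{\boldsymbol{u}}$, the coefficients of its ANF can be uniquely determined from its values and vice versa:
$$a_{\boldsymbol{u}} = \bigoplus_{\boldsymbol{x} \le \boldsymbol{u}} f(\boldsymbol{x}), \qquad f(\boldsymbol{x}) = \bigoplus_{\boldsymbol{u} \le \boldsymbol{x}} a_{\boldsymbol{u}}.$$
A vectorial Boolean function is a function $f : \mathbb{F}_2^n \to \mathbb{F}_2^m$ whose $m$ coordinates are Boolean functions in $n$ variables, and it is compactly denoted by $\boldsymbol{y} = f(\boldsymbol{x})$. To show the presence and the absence of the monomial $\pi_{\boldsymbol{u}}(\boldsymbol{x})$ in the ANF of $\pi_{\boldsymbol{v}}(\boldsymbol{y})$, we use $\pi_{\boldsymbol{u}}(\boldsymbol{x}) \to \pi_{\boldsymbol{v}}(\boldsymbol{y})$ and $\pi_{\boldsymbol{u}}(\boldsymbol{x}) \nrightarrow \pi_{\boldsymbol{v}}(\boldsymbol{y})$, respectively.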
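The conversion between the values of a Boolean function and its ANF coefficients can be carried out with the binary Möbius transform. The following is a minimal Python sketch and not code from the paper; the name `moebius_transform`, the toy function, and the convention that bit $i$ of the integer index holds $x_i$ are assumptions made for illustration.

```python
def moebius_transform(values):
    """Binary Moebius transform: maps a truth table to ANF coefficients.

    Entry u of the result is the XOR of values[x] over all x <= u
    (bitwise partial order). The transform is its own inverse.
    """
    coeffs = list(values)
    n = len(coeffs).bit_length() - 1
    assert len(coeffs) == 1 << n, "length must be a power of two"
    for i in range(n):
        step = 1 << i
        for x in range(len(coeffs)):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
    return coeffs


# Toy function (assumption): f(x0, x1) = x0*x1 XOR x1.
# Index convention (assumption): bit i of the integer index holds x_i.
truth_table = [0, 0, 1, 0]
anf = moebius_transform(truth_table)
# anf[u] == 1 means the monomial x^u appears in the ANF of f.
```

Applying `moebius_transform` to `anf` returns the truth table again, which mirrors the "vice versa" direction of the relation above.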

Integral Cryptanalysis
The idea of integral analysis was first introduced as a theoretical generalization of differential cryptanalysis by Lai [Lai94] and as a practical attack by Daemen et al. [DKR97]. Knudsen and Wagner formalized the concept [KW02]. The core idea of integral cryptanalysis is finding a set of inputs such that the sum of the resulting outputs is key-independent in some positions. In the context of symmetric-key cryptography, any primitive can be represented as a vectorial Boolean function in a combination of secret and public variables. The set of inputs in bit-based integral analysis usually consists of inputs taking all possible combinations in $d$ input bits, whereas the remaining bits take a fixed value. Such input sets form an affine subspace of dimension $d$, so the sum of the resulting outputs is the $d$th derivative of the corresponding vectorial Boolean function with respect to the active input bits, i.e., those bits that are not fixed [Lai94]. An approach to find the key-independent output positions is detecting those output bits whose ANF does not include monomials containing key bits as well as all active input bits, since the $d$th derivative of such output bits with respect to the active bits is constant (zero-sum or one-sum). However, in terms of computational complexity, the vectorial Boolean functions corresponding to cryptographic primitives are usually very hard to construct directly, and hence cryptographers have to use indirect approaches to inspect the algebraic properties of vectorial Boolean functions.
Monomial prediction [HSWW20] is a new technique to determine the presence of a particular monomial in the product of the coordinate functions of a vectorial Boolean function when directly constructing it is computationally infeasible. This technique exploits the fact that a cryptographic vectorial Boolean function $f : \mathbb{F}_2^{n_0} \to \mathbb{F}_2^{n_r}$ is usually a composition of several simpler vectorial Boolean functions $f_i : \mathbb{F}_2^{n_{i-1}} \to \mathbb{F}_2^{n_i}$:
$$f = f_r \circ f_{r-1} \circ \cdots \circ f_1,$$
where the algebraic representation of each smaller vectorial Boolean function $f_i$ is available. Assuming that $\boldsymbol{x}^{(i)} = f_i(\boldsymbol{x}^{(i-1)})$, for any $\boldsymbol{u}^{(i)} \in \mathbb{F}_2^{n_i}$, we can describe $\pi_{\boldsymbol{u}^{(i)}}(\boldsymbol{x}^{(i)})$ in variables $\boldsymbol{x}^{(i-1)}$ as follows:
$$\pi_{\boldsymbol{u}^{(i)}}(\boldsymbol{x}^{(i)}) = \bigoplus_{\boldsymbol{u}^{(i-1)} :\ \pi_{\boldsymbol{u}^{(i-1)}}(\boldsymbol{x}^{(i-1)}) \to \pi_{\boldsymbol{u}^{(i)}}(\boldsymbol{x}^{(i)})} \pi_{\boldsymbol{u}^{(i-1)}}(\boldsymbol{x}^{(i-1)}).$$
Accordingly, to follow the presence of monomials in a sequence of vectorial Boolean functions that are applied one after another, Hu et al. [HSWW20] defined the concept of monomial trails and proposed how to model them using MILP.
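Definition 1 (Monomial Trail [HSWW20]). Let $\boldsymbol{x}^{(i)} = f_i(\boldsymbol{x}^{(i-1)})$ for $1 \le i \le r$. A sequence of monomials $(\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)}), \pi_{\boldsymbol{u}^{(1)}}(\boldsymbol{x}^{(1)}), \ldots, \pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)}))$ is an $r$-round monomial trail connecting $\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)})$ and $\pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)})$ with respect to the composition function $f = f_r \circ f_{r-1} \circ \cdots \circ f_1$ if $\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)}) \to \pi_{\boldsymbol{u}^{(1)}}(\boldsymbol{x}^{(1)}) \to \cdots \to \pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)})$. If there is at least one monomial trail from $\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)})$ to $\pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)})$, we write $\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)}) \rightsquigarrow \pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)})$.
For a given composition sequence of vectorial Boolean functions, the following lemma guarantees the absence of a monomial in the ANF of the resulting composite function.
Lemma 1 ([HSWW20]). If there is no monomial trail from $\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)})$ to $\pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)})$, then $\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)}) \nrightarrow \pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)})$.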
Hu et al. [HSWW20] also show that the parity of the number of monomial trails connecting $\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)})$ and $\pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)})$ perfectly determines the presence or absence of $\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)})$ in the ANF of $\pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)})$. This requires counting the monomial trails. Moreover, Hu et al. prove that the monomial prediction technique and the 3-Subset Division Property without Unknown subset (3SDPwoU) [HLM+20] are equivalent and perfect detection algorithms to detect the presence or absence of monomials in the ANF of Boolean functions. The downside is that for composite vectorial Boolean functions coming from cryptographic primitives, the number of monomial trails between two monomials can be extremely large, which makes counting the number of monomial trails computationally difficult. However, as we will show in the next sections, Lemma 1 is sufficient to detect key-independent bits in integral cryptanalysis.
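The relation between trail counts and monomial presence can be checked by brute force on a toy composition. The sketch below is an illustration under stated assumptions, not the authors' tool: it composes the WARP S-box (values as listed in Table 3) with itself, counts two-step monomial trails, and compares their parity with the ANF of the composite. The helper names `anf_coefficients`, `propagation_table`, and `count_trails`, as well as the LSB-first bit indexing, are hypothetical choices.

```python
SBOX = [0xc, 0xa, 0xd, 0x3, 0xe, 0xb, 0xf, 0x7,
        0x8, 0x9, 0x1, 0x5, 0x0, 0x2, 0x4, 0x6]


def anf_coefficients(truth_table):
    """Binary Moebius transform of a truth table of length 2^n."""
    coeffs = list(truth_table)
    step = 1
    while step < len(coeffs):
        for x in range(len(coeffs)):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
        step <<= 1
    return coeffs


def propagation_table(sbox, n=4):
    """table[u][v] = 1 iff x^u appears in the ANF of y^v for y = sbox(x)."""
    size = 1 << n
    table = [[0] * size for _ in range(size)]
    for v in range(size):
        # Truth table of the product of the output bits selected by v.
        product_tt = [1 if (sbox[x] & v) == v else 0 for x in range(size)]
        for u, coeff in enumerate(anf_coefficients(product_tt)):
            table[u][v] = coeff
    return table


def count_trails(tables, u_in, u_out):
    """Count monomial trails u_in -> ... -> u_out through the given tables."""
    counts = {u_in: 1}
    for table in tables:
        nxt = {}
        for u, c in counts.items():
            for v, present in enumerate(table[u]):
                if present:
                    nxt[v] = nxt.get(v, 0) + c
        counts = nxt
    return counts.get(u_out, 0)


one_step = propagation_table(SBOX)
composite = propagation_table([SBOX[SBOX[x]] for x in range(16)])
```

For every pair `(u, v)`, the parity of `count_trails([one_step, one_step], u, v)` should equal `composite[u][v]`; in particular, a trail count of zero implies that the monomial is absent, which is the statement of Lemma 1 on this toy example.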
Key-recovery for integral attacks. The easiest way to extend an integral distinguisher where some bit positions sum to 0 across $N$ queries to a key-recovery attack is to evaluate the sum for each of the $2^k$ key candidates, where $k$ key bits are necessary to partially decrypt the ciphertext to the relevant target bits. Several techniques have been proposed to reduce this key-recovery complexity of $O(N \cdot 2^k)$ partial decryptions under certain circumstances, including the partial-sum technique by Ferguson et al. [FKL+00], the Meet-in-the-Middle (MitM) technique by Sasaki and Wang [SW12], and the Fast Fourier Transform (FFT) technique by Todo and Aoki [TA14]. We refer to Subsection 4.2 for more details.

Modeling Monomial Prediction for Block Ciphers
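Denoting the key, plaintext, and ciphertext variables by $\boldsymbol{k}$, $\boldsymbol{x}$, and $\boldsymbol{y}$, respectively, a block cipher $\boldsymbol{y} = E(\boldsymbol{k}, \boldsymbol{x})$ can be considered as a family of functions indexed by $\boldsymbol{k}$: $E_{\boldsymbol{k}} : \mathbb{F}_2^n \to \mathbb{F}_2^n$. Hence, any product of ciphertext bits can be expressed as a Boolean function $f(\boldsymbol{k}, \boldsymbol{x})$:
$$f(\boldsymbol{k}, \boldsymbol{x}) = \bigoplus_{\boldsymbol{u} \in \mathbb{F}_2^n} \bigoplus_{\boldsymbol{v} \in \mathbb{F}_2^k} a_{\boldsymbol{v},\boldsymbol{u}} \cdot \boldsymbol{k}^{\boldsymbol{v}} \boldsymbol{x}^{\boldsymbol{u}}, \qquad (1)$$
where $k$ denotes the key size and $a_{\boldsymbol{v},\boldsymbol{u}} \in \mathbb{F}_2$. In the single-key setting, $\boldsymbol{x}$ is a public variable, whereas $\boldsymbol{k}$ takes a fixed unknown value. For a fixed key $\boldsymbol{k}$ and for any $\boldsymbol{u} \in \mathbb{F}_2^n$, the coefficient of $\boldsymbol{x}^{\boldsymbol{u}}$ in Equation 1 is determined by $a_{\boldsymbol{u}}(\boldsymbol{k}) = \bigoplus_{\boldsymbol{v} \in \mathbb{F}_2^k} a_{\boldsymbol{v},\boldsymbol{u}} \cdot \boldsymbol{k}^{\boldsymbol{v}}$. If $a_{\boldsymbol{v},\boldsymbol{u}} = 0$ for all $\boldsymbol{v} \ne \boldsymbol{0}$, then $a_{\boldsymbol{u}}(\boldsymbol{k}) = a_{\boldsymbol{0},\boldsymbol{u}}$, and as a result, $a_{\boldsymbol{u}}(\boldsymbol{k})$ is still key-independent, though it may not be zero. An attacker can compute $\bigoplus_{\boldsymbol{x} \in C_{\boldsymbol{u}}} f(\boldsymbol{k}, \boldsymbol{x})$, where $C_{\boldsymbol{u}}$ is a set of plaintexts such that the variables in $\{x_i : u_i = 1\}$ take all possible combinations and the remaining variables are fixed to some arbitrary values; this sum depends only on the coefficients $a_{\boldsymbol{w}}(\boldsymbol{k})$ with $\boldsymbol{w} \ge \boldsymbol{u}$ and on the fixed values. If $\pi_{\boldsymbol{v}}(\boldsymbol{k}) \cdot \pi_{\boldsymbol{w}}(\boldsymbol{x}) \nrightarrow f(\boldsymbol{k}, \boldsymbol{x})$ for all $\boldsymbol{w} \ge \boldsymbol{u}$ and for all $\boldsymbol{v} \in \mathbb{F}_2^k \setminus \{\boldsymbol{0}\}$, then $a_{\boldsymbol{w}}(\boldsymbol{k})$ is key-independent for all $\boldsymbol{w} \ge \boldsymbol{u}$, and hence $\bigoplus_{\boldsymbol{x} \in C_{\boldsymbol{u}}} f(\boldsymbol{k}, \boldsymbol{x})$ is key-independent.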
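The argument can be illustrated on a small keyed Boolean function. The function `f` below is a made-up toy (3 plaintext bits, 2 key bits), not a component of WARP, and the helper names `cube_sum` and `key_independent` are hypothetical; the sketch only shows how the sum over a cube depends on which monomials contain both key bits and all active bits.

```python
from itertools import product


def f(k, x):
    """Toy keyed function (assumption): x0x1x2 + k0x0x1 + k1x2 + x0 + k0k1."""
    k0, k1 = k
    x0, x1, x2 = x
    return (x0 & x1 & x2) ^ (k0 & x0 & x1) ^ (k1 & x2) ^ x0 ^ (k0 & k1)


def cube_sum(k, active, fixed):
    """XOR of f(k, x) over the cube: `active` positions take all values,
    the remaining positions keep their value from `fixed`."""
    total = 0
    for bits in product((0, 1), repeat=len(active)):
        x = list(fixed)
        for pos, bit in zip(active, bits):
            x[pos] = bit
        total ^= f(k, tuple(x))
    return total


def key_independent(active, fixed):
    """True if the cube sum takes the same value for every key."""
    sums = {cube_sum(k, active, fixed) for k in product((0, 1), repeat=2)}
    return len(sums) == 1
```

With active bits $\{x_0, x_1\}$, the monomial $k_0 x_0 x_1$ contains the whole cube together with a key bit, so `key_independent([0, 1], (0, 0, 0))` is false. With all three bits active, only $x_0 x_1 x_2$ survives and the sum is the key-independent constant 1 (a one-sum).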

CP Encoding of Monomial Prediction
Every cryptographic primitive can be divided into smaller vectorial Boolean functions whose ANFs are available. Therefore, modeling the propagation of monomial trails through the building blocks of cryptographic primitives, such as S-box, Xor, and Copy, or generally small Boolean functions, is sufficient to model the whole primitive. To model the behavior of a small vectorial Boolean function in terms of the propagation of monomial trails, we define the monomial prediction table (MPT).

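Definition 2 (Monomial Prediction Table (MPT)). For a vectorial Boolean function $f : \mathbb{F}_2^n \to \mathbb{F}_2^m$ with $\boldsymbol{y} = f(\boldsymbol{x})$, the monomial prediction table of $f$ is the $2^n \times 2^m$ binary table with $\mathrm{MPT}(\boldsymbol{u}, \boldsymbol{v}) = 1$ if $\pi_{\boldsymbol{u}}(\boldsymbol{x}) \to \pi_{\boldsymbol{v}}(\boldsymbol{y})$, and $\mathrm{MPT}(\boldsymbol{u}, \boldsymbol{v}) = 0$ otherwise.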
Application to WARP's S-box. Table 5 represents the MPT of WARP's S-box. Given that $\mathrm{MPT}(\boldsymbol{u}, \boldsymbol{v})$ can be interpreted as a Boolean function in variables $(\boldsymbol{u}, \boldsymbol{v})$, we can utilize the Quine-McCluskey [Qui52] or Espresso [BHMSV84] algorithms to minimize its conjunctive normal form (CNF) in our CP and SAT models. For instance, all valid monomial propagations $(u_0, u_1, u_2, u_3) \to (v_0, v_1, v_2, v_3)$ through WARP's S-box can be encoded as the satisfying solutions of a minimized CNF in these eight variables, where $u_0$ and $v_0$ correspond to the MSB of input and output, respectively.
General propagation rules. The propagation rules for monomial prediction of other building blocks of block ciphers and their CP/SAT encoding are summarized as follows.
Proposition 1 (And or Multiplication in $\mathbb{F}_2$), Proposition 4 (Negation of 3-bit Xor), and Proposition 5 (Branching Point or Copy) each state the function $f$ of the respective building block together with the constraints that encode its valid monomial propagations.
Model for WARP. To denote the monomials corresponding to the input of the $(i+1)$th round, we use $\pi_{\boldsymbol{u}^{(i)}}(\boldsymbol{x}^{(i)})$. Accordingly, assuming that we are analyzing $r$ rounds of WARP, $\pi_{\boldsymbol{u}^{(0)}}(\boldsymbol{x}^{(0)})$ and $\pi_{\boldsymbol{u}^{(r)}}(\boldsymbol{x}^{(r)})$ represent the monomials corresponding to the plaintext and ciphertext, respectively, where $1 \le r \le 41$. To model the key schedule of WARP, we denote by $\pi_{\boldsymbol{v}^{(i)}}(\boldsymbol{k}^{(i)})$ the monomials in the round key of the $i$th round, and $\pi_{\boldsymbol{v}^{(0)}}(\boldsymbol{k}^{(0)})$ represents the monomials in variables of the master key $\boldsymbol{k}$. Given that a monomial $\pi_{\boldsymbol{v}^{(i)}}(\boldsymbol{k}^{(i)}) \cdot \pi_{\boldsymbol{u}^{(i)}}(\boldsymbol{x}^{(i)})$ is uniquely specified by $\boldsymbol{u}^{(i)}$ and $\boldsymbol{v}^{(i)}$, as illustrated in Figure 2, our CP and SAT models are described based on variables in $\boldsymbol{u}^{(i)}$ and $\boldsymbol{v}^{(i)}$, for $0 \le i \le r$. In addition to the main variables represented in Figure 2, we define some additional variables corresponding to the input and output of S-boxes as well. Next, using the propagation rules for S-box, Copy, and Xor with constant, we link the variables of our model to encode the propagation of monomial trails through $r$ rounds of WARP, taking the key schedule into account. As a result, any feasible solution of the constructed CP model corresponds to a valid monomial trail.
To check the absence of a monomial $\pi_{\boldsymbol{v}}(\boldsymbol{k}^{(0)}) \cdot \pi_{\boldsymbol{u}}(\boldsymbol{x}^{(0)})$ for a certain $\boldsymbol{u} \in \mathbb{F}_2^n$ and any possible value of $\boldsymbol{v} \in \mathbb{F}_2^k$ in the ANF of the $i$th output bit, we check the satisfiability of the model, where $\boldsymbol{u}^{(0)} = \boldsymbol{u}$ and $\boldsymbol{u}^{(r)} = \boldsymbol{e}_i$. If the model is unsatisfiable, then there is no valid monomial trail starting from $\pi_{\boldsymbol{v}^{(0)}}(\boldsymbol{k}^{(0)}) \cdot \pi_{\boldsymbol{u}}(\boldsymbol{x}^{(0)})$, which guarantees that the $i$th output bit is balanced over the set of inputs $C_{\boldsymbol{u}} := \{\boldsymbol{x} \in \mathbb{F}_2^n : \boldsymbol{x} \le \boldsymbol{u}\}$, according to Lemma 1. If we only fix the variables in $\boldsymbol{u}^{(0)}$ corresponding to the nonzero values of $\boldsymbol{u}$ and let the rest of its variables be free, then the unsatisfiability of the model guarantees the absence of $\pi_{\boldsymbol{v}}(\boldsymbol{k}^{(0)}) \cdot \pi_{\boldsymbol{w}}(\boldsymbol{x}^{(0)})$ in the ANF of the $i$th output bit for all $\boldsymbol{v} \in \mathbb{F}_2^k$ and for all $\boldsymbol{w} \ge \boldsymbol{u}$. The advantage is that the constant part of the resulting cube of plaintexts over which the $i$th output bit is balanced need not take the value zero, which gives us more freedom to adjust the overall complexity of the resulting integral attack. Moreover, we can include the constraint $\boldsymbol{v}^{(0)} \ne \boldsymbol{0}$ to check whether the $i$th output bit is key-independent, and detect the one-sum property as well.
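The propagation rules for basic operations can be cross-checked by brute force, and a (non-minimized) CNF can be derived directly from the set of valid propagations. The sketch below is an illustration under stated assumptions and does not reproduce the encodings of the propositions: it enumerates valid propagations from the ANF and forbids each invalid pair $(\boldsymbol{u}, \boldsymbol{v})$ with one clause, whereas the paper minimizes the CNF with Quine-McCluskey or Espresso. All function names and the LSB-first bit indexing are hypothetical choices.

```python
from itertools import product


def anf_coefficients(truth_table):
    """Binary Moebius transform of a truth table of length 2^n."""
    coeffs = list(truth_table)
    step = 1
    while step < len(coeffs):
        for x in range(len(coeffs)):
            if x & step:
                coeffs[x] ^= coeffs[x ^ step]
        step <<= 1
    return coeffs


def propagations(func, n, m):
    """Set of (u, v) such that x^u appears in the ANF of y^v, y = func(x)."""
    valid = set()
    for v in range(1 << m):
        tt = [1 if (func(x) & v) == v else 0 for x in range(1 << n)]
        for u, coeff in enumerate(anf_coefficients(tt)):
            if coeff:
                valid.add((u, v))
    return valid


def naive_cnf(valid, n, m):
    """One clause per invalid (u, v); a literal (idx, val) reads 'bit idx == val'."""
    clauses = []
    for u in range(1 << n):
        for v in range(1 << m):
            if (u, v) not in valid:
                bits = [(u >> i) & 1 for i in range(n)]
                bits += [(v >> j) & 1 for j in range(m)]
                # This clause is falsified only by the assignment `bits`.
                clauses.append([(idx, 1 - b) for idx, b in enumerate(bits)])
    return clauses


def solutions(clauses, n, m):
    """All (u, v) whose bit assignment satisfies every clause."""
    sols = set()
    for bits in product((0, 1), repeat=n + m):
        if all(any(bits[idx] == val for idx, val in cl) for cl in clauses):
            u = sum(b << i for i, b in enumerate(bits[:n]))
            v = sum(b << j for j, b in enumerate(bits[n:]))
            sols.add((u, v))
    return sols


def and2(x):      # y = x0 * x1
    return (x & 1) & (x >> 1)


def xor2(x):      # y = x0 + x1
    return (x & 1) ^ (x >> 1)


def nxor3(x):     # y = x0 + x1 + x2 + 1 (negation of 3-bit Xor)
    return 1 ^ (x & 1) ^ ((x >> 1) & 1) ^ (x >> 2)


def copy2(x):     # (y0, y1) = (x, x)
    return 0b11 * x
```

For example, `propagations(copy2, 1, 2)` contains exactly the pairs with $u = v_0 \lor v_1$, and `solutions(naive_cnf(valid, n, m), n, m)` returns `valid` again for each operation.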
We implemented the automatic method to search for integral distinguishers based on monomial prediction with three popular encoding methods: CP, MILP, and SAT. It is worth noting that, during our experiments, the SAT encoding resulted in much better performance than the MILP or CP encoding methods. Our approach to taking the key schedule into account is applicable to linear and nonlinear key schedules. However, the efficiency of solving the resulting model may decrease where the key schedule involves many basic operations.

Improved Integral Distinguishers of WARP
The best previous integral distinguisher for WARP was reported by its designers [BBI+20], who propose a 20-round integral distinguisher discovered by a nibble-wise model based on the 2-Subset Division Property (2SDP) [Tod15]. The designers of WARP also claimed that finding an integral distinguisher with a lower data complexity is only possible for fewer than 20 rounds. Here, thanks to the higher accuracy of our bit-wise model based on monomial prediction to search for integral distinguishers, we not only find a 20-round integral distinguisher with lower data complexity and more balanced output bits, but we are also able to find integral distinguishers for up to 22 rounds of WARP. Exploiting the intrinsic properties of WARP, we also prove that the distinguishers derived from our automatic search can be extended by one further round at the end as well as at the beginning. As a result, we can improve the integral distinguishers of WARP by 4 rounds in total.
To look for the longest integral distinguishers of WARP, we check all of the 128 possible plaintext structures with only one constant bit to see whether they yield a key-independent bit at the output. Accordingly, we discovered that for 26 out of 128 possible positions for the constant bit at the input, there is at least one output bit with the zero-sum property after 22 rounds. Setting bit (2) or bit (66) to a constant value results in the highest number of balanced bits at the output (Equation 2).
To find an integral distinguisher with a lower data complexity, we also implemented the method introduced by Eskandari et al. [EKKT18] to minimize the number of active bits, taking the one-sum property into account as well. Consequently, we verified that there is no integral distinguisher with a lower data complexity for 22 rounds. Next, applying the same strategy for 21 rounds, we found distinguishers that are optimal for 21 rounds in terms of data complexity (Equation 4). For 20 rounds, we discovered a distinguisher with one constant nibble at the input yielding 13 more output bits with the zero-sum property in comparison to the integral distinguisher proposed by the designers: (64, 65, 66, 67). To disprove the implicit claim of the designers regarding the nonexistence of a 20-round integral distinguisher with data complexity lower than $2^{124}$ chosen plaintexts, we provide 20-round distinguishers with a data complexity of $2^{123}$ chosen plaintexts (Equation 5).
Extension to a generalized integral distinguisher. The designers of WARP put the Xor with the sub-key after the S-box application to avoid the complementary property of Feistel-type structures. However, as shown in the following theorem, this property lets us extend all of our discovered integral distinguishers by one further round.
Theorem 1. Any integral distinguisher for WARP built upon a multiset of even size which yields at least one key-independent bit after $r$ rounds can be extended to an $(r+1)$-round generalized integral distinguisher with the same data complexity.
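Proof. Let $\mathcal{C}$ denote the multiset of input plaintexts. If the $j$th bit of $\bigoplus_{\mathcal{C}} X^{(r)}_{2i}$ is key-independent for some $0 \le i \le 15$, then the same position in $\bigoplus_{\mathcal{C}} X^{(r+1)}_{\pi(2i)}$ must be key-independent as well, since $X^{(r+1)}_{\pi(2i)} = X^{(r)}_{2i}$. If the $j$th bit of $\bigoplus_{\mathcal{C}} X^{(r)}_{2i+1}$ is key-independent for some $0 \le i \le 15$, then the $j$th bit of $\bigoplus_{\mathcal{C}} \big(S(X^{(r+1)}_{\pi(2i)}) \oplus X^{(r+1)}_{\pi(2i+1)}\big)$ is key-independent, since the two sums differ only by the round-key nibble (and any round constant), which is added an even number of times and therefore cancels.
In Subsection 4.1, we show how to prepend another initial round to the distinguishers.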
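The cancellation used in the proof can be checked numerically. The sketch below is illustrative only: it models a single odd branch as $S(a) \oplus b \oplus K$ for pairs $(a, b)$ drawn at random, uses the S-box values from Table 3, and compares the Xor-sum with and without the key nibble. The function names and the random multiset are assumptions.

```python
import random

SBOX = [0xc, 0xa, 0xd, 0x3, 0xe, 0xb, 0xf, 0x7,
        0x8, 0x9, 0x1, 0x5, 0x0, 0x2, 0x4, 0x6]


def branch_sum(pairs, key):
    """XOR over the multiset of S(a) ^ b ^ key (odd branch one round earlier)."""
    total = 0
    for a, b in pairs:
        total ^= SBOX[a] ^ b ^ key
    return total


def key_free_sum(pairs):
    """XOR over the multiset of S(a) ^ b, computable without any key guess."""
    total = 0
    for a, b in pairs:
        total ^= SBOX[a] ^ b
    return total


rng = random.Random(2022)  # fixed seed, arbitrary choice
even_multiset = [(rng.randrange(16), rng.randrange(16)) for _ in range(64)]
odd_multiset = even_multiset[:-1]
```

For `even_multiset`, `branch_sum` agrees with `key_free_sum` under every key nibble; for `odd_multiset`, the two differ by exactly the key nibble, which is why the theorem requires a multiset of even size.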

Experimental verification.
To experimentally validate the outcomes of our automatic tool to search for integral distinguishers, we also discovered some practical integral distinguishers with very low data complexity for up to 15 rounds of WARP, which are summarized in Table 6. The very low data complexity of these distinguishers allows anyone to verify their correctness with very limited computational resources.

Key-Recovery Attack Based on Integral Distinguishers
We now extend our 22-round integral distinguisher from Equation 2 to a 32-round key-recovery attack by prepending 1 initial round and appending 9 final rounds. The alternative 22-round distinguisher from Equation 3 could be similarly extended.

Prepending 1 Round
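We first prepend an initial round to the 22-round integral distinguisher from Equation 2. This distinguisher requires an input set of all $2^{127}$ chosen plaintexts $X^{(1)}$ where bit 2 in nibble $X^{(1)}_0$ takes a constant value $c$, so $X^{(1)}_0$ ranges over the set of suitable values for this constant $c$. If we prepend one round, this nibble is computed as $X^{(1)}_0 = S(X^{(0)}_{10}) \oplus X^{(0)}_{11} \oplus K^{(0)}_5$, where $K^{(0)}_5$ is constant, as illustrated in Figure 3. Thus, we get a suitable input set for the 22-round distinguisher from Equation 2 by choosing the plaintexts such that bit 2 of $S(X^{(0)}_{10}) \oplus X^{(0)}_{11}$ is constant, since adding the constant key nibble only changes the value of the constant in $X^{(1)}_0$. We use this input set to obtain a 23-round distinguisher with data complexity $2^{127}$ and 9 zero-sum bits in $X^{(23)}$.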
Combined with the extension of Theorem 1, we obtain a generalized, set-based integral distinguisher for 1 + 22 + 1 = 24 rounds with the same data complexity of $2^{127}$. To the best of our knowledge, this is the best distinguisher for WARP so far. The approach can similarly be applied to the 21-round distinguisher with data complexity $2^{124}$ in Equation 4 and the 20-round distinguisher with data complexity $2^{123}$ in Equation 5 to obtain distinguishers for 1 + 21 + 1 = 23 and 1 + 20 + 1 = 22 rounds, respectively, with the previous complexities.

Cost Model for Appending Key-Recovery Rounds
The (1 + 22)-round distinguisher of Figure 3 gives us 9 zero-sum bits at the output of $2^{127}$ queried plaintexts. In the design document [BBI+20, Appendix B.3], the designers discuss a 20-round integral distinguisher and claim that due to the high data complexity, a key-recovery attack can only extend the distinguisher by at most one round and the resulting complexity is "almost close to an exhaustive key search". However, this estimate appears to assume a straightforward key-recovery approach where each of the queried plaintexts is partially decrypted under each of the key guesses, which would require $2^{124+4}$ one-round decryptions in their example.
MitM FFT key-recovery approach. With more advanced approaches for integral attacks such as the Meet-in-the-Middle (MitM) technique [SW12] and the Fast Fourier Transform (FFT) technique [TA14], as well as a dedicated analysis exploiting WARP's properties such as the position of the key addition, we can cover substantially more rounds. Consider a target nibble with $\bigoplus_{\mathcal{C}} X^{(r)}_i = 0$ and $i$ odd for an $r$-round integral distinguisher with an input structure $\mathcal{C}$ of even size $|\mathcal{C}|$. When we want to recover this nibble based on some key guess and a given set of ciphertexts, WARP's Feistel structure permits us to use the Meet-in-the-Middle approach and independently recover its two contributing branches as
$$\bigoplus_{\mathcal{C}} X^{(r)}_i = \bigoplus_{\mathcal{C}} S(X^{(r+1)}_{\pi(i-1)}) \oplus \bigoplus_{\mathcal{C}} X^{(r+1)}_{\pi(i)},$$
where the key sum $\bigoplus_{\mathcal{C}} K^{(r \bmod 2)}_{(i-1)/2} = 0$ cancels out for an even-sized set $\mathcal{C}$. We refer to these two nibbles $X_L = S(X^{(r+1)}_{\pi(i-1)})$ and $X_R = X^{(r+1)}_{\pi(i)}$ as the left and right target nibbles, respectively. For both nibbles $X_L$, $X_R$ and each of their $2^4$ possible values $\bigoplus_{\mathcal{C}} X_L$, $\bigoplus_{\mathcal{C}} X_R$, we want to construct a list of key candidates that generates this value. By matching these two lists based on the value (and potentially consistency checks between the two partial key candidates), we obtain all combined key guesses that satisfy the integral property $\bigoplus_{\mathcal{C}} X^{(r)}_i = 0$. For a 4-bit property and a 128-bit involved key, we expect a list of $2^{124}$ remaining key candidates that we can then check by brute-force evaluation. Note that we could also take additional integral conditions into account before brute-forcing (in our distinguishers, we have 9 rather than just 4 zero-sum bits); however, this would not substantially improve the overall attack complexity, so for simplicity, we focus on just one target nibble with its two branches $X_L$ and $X_R$ in the following.
To recover each of these branches, we need to compute the targeted nibbles as a function of the ciphertext and key nibbles. Ideally, if we can represent the target nibble as a function $X = F(\tilde{K} \oplus \tilde{C})$ that depends only on the Xor of a partial $k$-bit ciphertext $\tilde{C}$ and key $\tilde{K}$, then we can recover $\bigoplus_{\mathcal{C}} X$ with a complexity of $O(k \cdot 2^k)$ using the FFT technique for integral attacks [TA14], plus the initial cost of $|\mathcal{C}|$ queries. As the FFT approach works with Boolean functions, we actually need to repeat this 4 times in parallel for the 4 bits of the target nibble; in the following, $f(\cdot) : \mathbb{F}_2^k \to \mathbb{F}_2$ denotes one of these coordinate functions of $F$. More specifically, the FFT approach can compute $u_{\tilde{K}} = \sum_{\tilde{C} \in \mathcal{C}} f(\tilde{C} \oplus \tilde{K})$ for all $\tilde{K}$ with the help of the Fast Walsh-Hadamard Transform (FWHT) and two $2^k$-dimensional vectors $v$ (encoding $f$) and $w$ (encoding the set $\mathcal{C}$):
$$v_i = f(i), \qquad w_i = \big|\{\tilde{C} \in \mathcal{C} : \tilde{C} = i\}\big| \bmod 2.$$
Here, $w_i \in \{0, 1\}$ counts the number of appearances of the partial ciphertext value $i$ in $\mathcal{C}$, where we first eliminate all duplicate appearances as they cancel out in $\bigoplus_{\mathcal{C}} f(\cdot)$. The results $u_{\tilde{K}}$ are computed using the $2^k$-dimensional Walsh matrix $H_{2^k}$ as
$$u = \frac{1}{2^k} H_{2^k} \big( (H_{2^k} v) \odot (H_{2^k} w) \big),$$
where $\odot$ denotes the element-wise product.
Dependency DAG for WARP. To find the function $F(\cdot)$ and thus evaluate the cost of recovering each of these branches, we construct a directed acyclic graph (DAG), the dependency DAG, to compute the targeted nibbles as a function of the ciphertext and key nibbles. Each node of the DAG represents a nibble of the internal state or key of WARP. A directed edge from one node (parent) to another (child) indicates that the parent node depends on the child node. Here, we can take advantage of the properties of WARP to compress the DAG and thus the input bitsize $k$ of $F$.
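Before turning to the construction of the DAG, the FFT-based evaluation of $u_{\tilde{K}}$ for all key guesses can be made concrete with a small sketch. It is a generic illustration of the Xor-convolution via the FWHT under stated assumptions, not the authors' implementation; `fwht` and `integral_sums` are hypothetical names, and the truth table of $f$ and the ciphertext list are supplied by the caller.

```python
def fwht(vec):
    """Unnormalized fast Walsh-Hadamard transform over the integers."""
    out = list(vec)
    h = 1
    while h < len(out):
        for start in range(0, len(out), 2 * h):
            for j in range(start, start + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    return out


def integral_sums(f_table, ciphertexts):
    """For every key guess K, return the XOR over all C of f(C ^ K).

    f_table: truth table of f with 2^k entries (vector v in the text).
    ciphertexts: partial k-bit ciphertext values, given as integers.
    """
    size = len(f_table)
    w = [0] * size
    for c in ciphertexts:
        w[c] ^= 1  # duplicate values cancel in the XOR sum
    transformed = [a * b for a, b in zip(fwht(f_table), fwht(w))]
    u = fwht(transformed)  # equals 2^k times the integer sums u_K
    return [(val // size) & 1 for val in u]  # reduce modulo 2
```

The three transforms cost on the order of $k \cdot 2^k$ additions each, compared with $|\mathcal{C}| \cdot 2^k$ evaluations of $f$ for the straightforward approach; the final reduction modulo 2 turns the integer sums into the desired Xor sums.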
• Nodes in WARP have one of the following types:
  - Temporary (T): any internal state nibble $X^{(r)}_i$ starts as a temporary node and can be expanded into one of the following types based on the cipher definition. In particular, a node $X^{(r)}_i$ turns into an X-node either if $i$ is odd and $r \le R - 1$, spawning children for the left and right cells $X^{(r+1)}_{\pi(i-1)}$, $X^{(r+1)}_{\pi(i)}$ and the key $K^{(r \bmod 2)}_{(i-1)/2}$; or similarly if $i$ is even and $r \le R - 2$, via $X^{(r+1)}_{\pi(i)}$.
  - Xor (X): a nibble that is computed as the Xor of its children.
  - Key (K): either a key nibble $K^{(r)}_i$ or a synthetic key nibble corresponding to the Xor of several key nibbles.
  - Ciphertext (C): either a ciphertext nibble or a synthetic ciphertext nibble combining several ciphertext nibbles […]. In particular, a temporary node […] and a K-node.
• Edges in WARP always link children with a parent of type X, so the parent is computed as the Xor-sum of (a function of) its children. The dependency can be of two types, representing the function applied to the child before Xoring:
  - Identity (I) uses the child directly.
  - S-box (S) applies the S-box $S$ to the child.
When constructing the DAG, starting from the target nibble as the root node, each node is first created as a temporary T-node and then expanded based on the definition of the cipher as described above. Each node is created based on a given nibble position and with a specified parent and edge type, with the following additional rules:
• If the parent of an X-node is another X-node, it is merged with its parent upon creation, and all the child node's children are directly attached to its parent.
• If the nibble is a key or ciphertext nibble, and the parent node already has a child (sibling) of type K or C, respectively, then no new node is created; instead, the sibling is converted to a synthetic nibble and merged with the given nibble.
  - If the sibling previously had multiple parents, it is split into two nodes and only the copy for the target parent is updated with the new cell.
  - Conversely, if the updated synthetic nibble now equals a previously created synthetic nibble, these two are merged into one node by adding the parent to the previously created nibble.
• If the nibble already exists as a node, this node is reused, i.e., it has multiple parents.
We now want to point out some special properties of the WARP dependency DAG. Unlike typical Feistel constructions, WARP does not add the round keys at the beginning of each F-function, but at the end, together with the Feistel-Xor in an X-node. Additionally, the permutation after the F-function step will move each right (odd-indexed) branch to a left (even-indexed) and then back to a right branch, corresponding to another X-node. These X-nodes will be merged, and thus the corresponding keys will also be merged into synthetic key nibbles. This continues recursively until reaching the ciphertext. As a result, all involved keys in a path will be merged into a synthetic key nibble that will end up as a sibling to a synthetic ciphertext node. In other words, the resulting function $F$ will be of the form $F(\tilde{K} \oplus \tilde{C})$ with the synthetic key nibbles $\tilde{K}$ and synthetic ciphertext nibbles $\tilde{C}$. If we consider the diffusion of the target nibble, then the number of distinct synthetic ciphertext nibbles equals the number of activated nibbles before the last round, in $X^{(R-1)}$ (based on the definition of C-nodes). The number of synthetic key nibbles and thus the value $k$ determining the complexity may be higher, as some C-nodes may appear as siblings to multiple K-nodes. Still, we observed that the number of K-nodes in our construction is typically lower than a naive count of involved key nibbles. Additional optimizations can take the rank of the equivalent key into account, i.e., the rank of the matrix $M$ such that $\tilde{K} = M \cdot K$. We discuss this in more detail for the specific dependency DAGs in our attack in the following.

Key-Recovery Attack on 32 Rounds
Based on the previous discussion, the number of key-recovery rounds is essentially upper-bounded by the number of rounds required for full diffusion. WARP generally achieves full diffusion after 10 rounds; the following additional factors impact the number of rounds we can attack:
• Full diffusion in WARP is after 10 rounds for even-indexed branches, but 9 rounds for odd-indexed branches.
• For the MitM attack, the limit is full diffusion of the left branch $X_L$, which is odd-indexed; but we gain the additional first round, where the target nibble is split into $X_L$ and $X_R$.
• We require that diffusion is not full before the last round, thus gaining one round: in the last round, we can work with synthetic ciphertext nibbles.
Based on these limitations, we can hope to add no more than 10 rounds. Considering the potential target nibbles in the distinguishers of Equations 2 and 3, 10 rounds activate between 29 and 31 of the 32 output nibbles in the left branch $X_L$, but depend on all 128 bits of key material. The resulting complexity of FFT key recovery would thus probably be above the generic brute-force complexity, depending on the assumed conversion factor of the complexity of additions versus cipher evaluations and other details. We thus focus on 9 rounds appended to the 23-round distinguisher, or 32 rounds (out of 41) overall.
In the distinguisher of Equation 2 illustrated in Figure 3, we can choose one of two target nibbles, $X^{(23)}_5$ or $X^{(23)}_{15}$, but both require the same amount of key material. We focus on $X^{(23)}_5$ for simplicity. Overall, the target nibble $X^{(23)}_5$ appears to depend on all ciphertext nibbles and all key nibbles. Applying our dependency algorithm to the right and left branches of the MitM approach in Figure 4, we obtain the dependency DAGs in Figure 5 and Figure 6, respectively. We can now use them to separately recover $\bigoplus_{\mathcal{C}} X_R = \bigoplus_{\mathcal{C}} X^{(24)}_{12}$ (right) and $\bigoplus_{\mathcal{C}} X_L = \bigoplus_{\mathcal{C}} S(X^{(24)}_1)$ (left) bit-by-bit under each key guess and then merge the results to obtain $2^{128-4} = 2^{124}$ candidates for the full key. Among these, we can brute-force for the correct key.

Right branch:
$\bigoplus_{\mathcal{C}} X_R = \bigoplus_{\mathcal{C}} X^{(24)}_{12} = \bigoplus_{\mathcal{C}} F_R(\tilde{C}_R, \tilde{K}_R)$. According to the dependency DAG illustrated in Figure 5, $X^{(24)}_{12}$ can be recovered as a function $F_R(\tilde{K}_R, \tilde{C}_R)$ using 84 bits (21 nibbles) of key information and 80 bits (20 distinct nibbles) of partial ciphertext, of which one appears twice. In a straightforward analysis based on Figure 4, on the other hand, the nibble $X_R$ appears to depend on 27 distinct key nibbles and 26 ciphertext nibbles. Here, the key information corresponds to an equivalent key $\tilde{K}_R$ computed as a linear function with full rank of the original key. Similarly, the ciphertext information corresponds to an equivalent ciphertext $\tilde{C}_R$ whose nibbles are nonlinear combinations of the original ciphertext nibbles. We will recover the equivalent key based on the equivalent ciphertexts.
x     0 1 2 3 4 5 6 7 8 9 a b c d e f
S(x)  c a d 3 e b f 7 8 9 1 5 0 2 4 6

Figure 2: Main variables in our CP model for block ciphers.
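The values $u_{\tilde{K}}$ are obtained using integer additions and can be converted to the desired integral sum $\bigoplus_{\tilde{C} \in \mathcal{C}} f(\tilde{C} \oplus \tilde{K})$ by reduction modulo 2 [TA14].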
properties are similar, but not identical: $X^{(23)}_{15}$ activates fewer nibbles before the last round and thus creates fewer synthetic ciphertext nodes (24 for $X$

Figure 4: Key recovery for 9 rounds of WARP after the 23-round distinguisher of Figure 3.

Table 1: Distinguishers and key-recovery attacks on round-reduced WARP. Gen. integral refers to generalized integral properties where the sum is taken over a function of the ciphertext bits rather than directly over the bits.
Table 3 shows the lookup table of WARP's S-box S, while Table 4 lists its Algebraic Normal Form (ANF).We refer to the WARP specification [BBI + 20] for full details.

Table 3: 4-bit S-box S of WARP.

Table 4: Algebraic Normal Form (ANF) of the WARP S-box S.

Table 6: Practical integral distinguishers for 10 to 15 rounds of WARP.