Automatic Search of Rectangle Attacks on Feistel Ciphers: Application to WARP

Abstract. In this paper we present a boomerang analysis of WARP, a recently proposed Generalized Feistel Network with extremely compact hardware implementations. We start by looking for boomerang characteristics that directly take into account the boomerang switch effects, showing how to adapt the automated tool of Delaune et al. to the case of Feistel ciphers, and we discuss several improvements that keep the execution time reasonable. This technique returns a 23-round distinguisher of probability $2^{-124}$, which is the best distinguisher presented on WARP so far. We then look for an attack by adding the key recovery phase to our model and obtain a 26-round rectangle attack with time and data complexities of $2^{115.9}$ and $2^{120.6}$ respectively, again the best result presented so far. Incidentally, our analysis discloses how an attacker can take advantage of the position of the key addition (put after the S-box application to avoid complementation properties), which in our case improves the time complexity by a factor of $2^{75}$ in comparison to a variant with the key addition positioned before. Note that our findings do not threaten the security of the cipher, which iterates 41 rounds.


Introduction
Boomerang distinguishers [Wag99] were introduced at FSE'99 as a variant of differential distinguishers [BS91] taking advantage of the existence of short differentials of high probability. In its simplest version, the attacker sees the cipher $E$ as the composition of two subciphers ($E = E_1 \circ E_0$) and makes use of a differential for each part. While it was first thought that these two differentials could be selected freely, subsequent advances such as [Mur11] showed that their interdependence must be treated carefully, as incompatibilities or better-than-expected probabilities might occur.
As a result, searching for the best boomerang distinguisher does not simply reduce to finding two differentials of high probability, and thus emerged a need for automated tools that take into account the possible events in the middle, later formalized by the BCT [CHP+18] (for SPNs) and FBCT [BHL+20] (for Feistels) theories. Two techniques have been proposed recently to address this issue. In [HBS21], the authors proposed to give as input to a MILP model the size of the middle part (where dependencies happen) and to take into account one type of dependency (the so-called ladder switch [BK09]). A more precise approximation of the probability of the middle rounds was then obtained with the BCT framework or experimentally. A second technique, proposed in [DDV20], directly takes into account all the possible middle dependencies and does not require the attacker to specify the size of the middle part.
While many papers first look for the best distinguisher and then turn it into an attack, better results might be obtained by taking into account the impact of the distinguisher on the key recovery phase. An example of this was given in [ZDJ19] when building a rectangle attack on Deoxys, and further discussions were provided in [QDW+21] with results on Skinny. The latter presents an automated tool that takes the model provided in [DDV20] and adds relations to include the dominating factors of the key-recovery phase, so that the resulting model directly looks for an optimization of the attack as a whole.
Our contribution. In this work, we study the security of the recently published block cipher WARP [BBI+20] against boomerang techniques. To do so, we start by showing how to adapt the model developed in [DDV20] to the case of Feistel ciphers, since the original tool was developed for SPN ciphers in general and for Skinny in particular. Since the execution time of the simple model is exponential in the number of rounds, covering more than 20 rounds of WARP is out of reach. We thus propose several techniques to speed up the model and to guide it to the solutions. By counting different solutions with the same input and output differences, we were able to find a 23-round distinguisher that covers 2 more rounds than the best result to date.
Second, we show how to extend this model to search for rectangle attacks, following the method developed in [QDW+21]. This extra step ensures that both the key recovery part and the distinguisher are optimized together to reach a (close to) optimal attack as a whole. Finally, we describe a 26-round attack on WARP, again reaching the best result to date. Our analysis shows that the designers' choice of positioning the key addition after the S-box in the Feistel round function (which is justified by the need to avoid complementation properties) works in favour of the attacker.
Our results on WARP, together with a comparison with previous works, are summarized in Table 1. The code of our tool is available at: https://gitlab.inria.fr/lrouquet/boomerang-warp-fse-23

Outline. We start by recalling some preliminaries in Section 2, which include the specification of WARP, a reminder on boomerang attacks and a brief overview of the existing techniques to automatically find them. Section 3 is dedicated to the description of our model searching for boomerang distinguishers and to the discussion of our result on 23 rounds, which we can easily extend by 2 rounds. Our techniques to improve the execution time of the model are also presented there. Section 4 shows how to turn the previous model into one searching for rectangle attacks and in particular how the position of the key addition turns out to be favourable to the attacker, leading us to a 26-round attack.

Specification of WARP
WARP [BBI+20] is a lightweight block cipher that was presented at SAC 2020 by Banik et al. The main objective of the designers was to propose a cipher that could be used as a direct replacement of AES-128 (thus with a 128-bit block and key) but that would be lighter in terms of hardware footprint. This challenge was met with flying colours, as evidenced by the impressive reported number of around 800 Gate Equivalents (GEs) for a serialized circuit of WARP.
Description. The cipher follows a variant of a Type-2 Generalized Feistel Network (GFN) using 32 branches of 4 bits each. Special care was taken in the selection of the 32-branch permutation. In detail, the 128-bit internal state is split over 32 branches of 4 bits. At the input of round $r$, the values of the 32 nibbles are denoted $X[r, 0]$ to $X[r, 31]$. They go through five elementary mappings in each (full) round, as depicted in Figure 1. Each nibble with an even index $X[r, 2i]$ is modified by the $F$ function, which consists of the application of a 4-bit S-box (denoted SBOX in the following, and given in Table 2) followed by a round key addition. The result is then XORed to $X[r, 2i + 1]$, a constant is added to $X[r, 1]$ and $X[r, 3]$, and finally the 32 branches are shuffled by the $\pi$ permutation given in Table 3. Since the values of the round constants have no impact on our analysis, we refer the reader to the specification [BBI+20].
The key schedule is linear and relies on a 128-bit master key seen as the concatenation of two 64-bit keys: $K = K_0 || K_1$. Each half is used alternately as the round key, starting with the 16 nibbles of $K_0$ that are used in the first round.

Security. The designers of WARP claimed single-key security and did not claim any security in related-key and known/chosen-key settings. They provided a rather comprehensive security analysis of their design, with a discussion of differential, linear, impossible differential, integral, meet-in-the-middle and invariant subspace attacks. The longest distinguisher they mentioned is a 21-round impossible differential distinguisher.
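The round structure and the alternating key schedule described above can be sketched as follows. This is only an illustration under stated assumptions: `TOY_SBOX` and `TOY_PERM` are placeholders (they are not the values of Table 2 and Table 3), the function names are hypothetical, and the shuffle convention (branch `j` moves to position `perm[j]`) is assumed rather than taken from the specification.

```python
def warp_like_round(state, round_key, sbox, perm, rc=(0, 0)):
    """One round of a 32-branch Type-2 GFN operating on 4-bit nibbles."""
    assert len(state) == 32 and len(round_key) == 16
    x = list(state)
    for i in range(16):
        # F function: S-box first, round key added afterwards,
        # and the result is XORed into the odd neighbour branch.
        x[2 * i + 1] ^= sbox[x[2 * i]] ^ round_key[i]
    # Round constants are added to branches 1 and 3.
    x[1] ^= rc[0]
    x[3] ^= rc[1]
    # Branch shuffle (assumed convention: branch j moves to perm[j]).
    out = [0] * 32
    for j in range(32):
        out[perm[j]] = x[j]
    return out


def round_key_sequence(k0, k1, rounds):
    """K0 is used in the first round, then the two halves alternate."""
    return [k0 if r % 2 == 0 else k1 for r in range(rounds)]


# Placeholder components, for illustration only.
TOY_SBOX = [(5 * v + 3) % 16 for v in range(16)]
TOY_PERM = [(j + 1) % 32 for j in range(32)]
```

Passing the S-box and the permutation as arguments keeps the sketch independent of the concrete tables, which can be plugged in from the specification [BBI+20].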
The constant addition in blue (-) plays no role when searching for differential properties, and the round key addition in green (-) can be ignored when considering the single key scenario.
To the best of our knowledge, two external cryptanalyses have been reported to date, both studying differential-based attacks. In the article [KY20] by Kumar and Yadav, a 21-round differential attack is presented (based on a 16-round differential characteristic), with a time and data complexity of $2^{113}$. Concurrently with our work, a 23-round differential attack and a 24-round rectangle attack were reported by Teh and Biryukov in [TB21].

Boomerang Attacks
The boomerang attack is a variant of the differential attack that was introduced by David Wagner in 1999 [Wag99]. The boomerang distinguishers at their basis are defined by two differences $\alpha$ and $\delta$ chosen so that the probability of the following equality is higher for the (round-reduced) cipher $E$ than for a random permutation:
$$E^{-1}(E(M) \oplus \delta) \oplus E^{-1}(E(M \oplus \alpha) \oplus \delta) = \alpha. \qquad (1)$$
In their basic form, boomerang distinguishers are built by rewriting $E$ as the composition of two sub-ciphers, $E = E_1 \circ E_0$, and by finding a good differential for each part. If we denote by $p$ the probability of the differential over $E_0$, that is $p = \Pr(\alpha \xrightarrow{E_0} \beta)$, and by $q$ the probability of the second differential over $E_1$ ($q = \Pr(\delta \xrightarrow{E_1^{-1}} \gamma)$), a first approximation gives that the probability of Equation (1) is close to $p^2 q^2$. Kelsey et al. [KKS01] and Biham et al. [BDK01] independently introduced a chosen plaintext-only version of the boomerang distinguisher, which they respectively named the amplified and the rectangle technique. This variant relies on the same rewriting of the cipher and on the same differentials as previously and consists in observing twice the difference $\delta$ in the quartets at the output of the cipher. The distinguisher is expected to have a probability of $p^2 q^2 2^{-n}$, while it would be $2^{-2n}$ for an $n$-bit random permutation.
As shown for instance in the analysis of Sean Murphy [Mur11], the naive approximation of the probability of a boomerang distinguisher might turn out to be wrong due to an incompatibility between the upper and the lower differentials. To solve this problem, Dunkelman et al. introduced the sandwich attack [DKS10], which adds a middle part $E_m$ in the rewriting of $E$ to isolate and study separately the rounds where the two differentials are interdependent. This middle part is called the boomerang switch [BK09]; if we denote by $r$ the probability that $E_m$ connects the upper and the lower trails, the final probability of the distinguisher becomes $p^2 q^2 r$.
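The probability estimates recalled above are easy to manipulate in the $\log_2$ domain. The following sketch is only a bookkeeping aid: the function names are hypothetical and the numerical values used in the usage note are made up for illustration.

```python
def boomerang_log2(p_log2, q_log2, r_log2=0.0):
    """log2 of p^2 * q^2 * r (sandwich estimate; r = 1 gives p^2 q^2)."""
    return 2 * p_log2 + 2 * q_log2 + r_log2


def rectangle_log2(p_log2, q_log2, n, r_log2=0.0):
    """log2 of p^2 * q^2 * r * 2^-n for an n-bit block (rectangle variant)."""
    return boomerang_log2(p_log2, q_log2, r_log2) - n


def beats_random(p_log2, q_log2, n, r_log2=0.0):
    """True when p^2 q^2 r > 2^-n, i.e. the boomerang returns more often
    than for a random permutation (equivalently, the rectangle estimate
    p^2 q^2 r 2^-n exceeds 2^-2n)."""
    return boomerang_log2(p_log2, q_log2, r_log2) > -n
```

For instance, with the hypothetical values $p = 2^{-10}$, $q = 2^{-12}$ and $r = 2^{-3}$, `boomerang_log2(-10, -12, -3)` evaluates to -47, which is usable on a 64-bit block since $2^{-47} > 2^{-64}$.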

Computing the middle part probability.
A recent line of work showed how to approximate the value of $r$ with the use of various tables. The first to be introduced is the BCT [CHP+18], developed by Cid et al. to deal with 1-round boomerang switches in the case of SPN ciphers. In this paper we focus on the Feistel case and thus recall the FBCT, the FBDT and the FBET, introduced in [BHL+20]. The notation is recalled in Figure 2.
The table used to compute the 1-round probability depends on which input and output differences of the S-box are fixed: for instance, the FBCT is chosen when only the inputs $\Delta_i$ and $\nabla_o$ are fixed (see Figure 2).
Figure 2: View of the parameters of the tables: $\Delta_i$ is the input difference and $\delta$ is the output difference of $S$ when looking at the difference between states 1 and 2. $\nabla_o$ is the input difference of the same S-box $S$ when looking at the difference between states 1 and 3, and $\alpha$ is its output difference. We focus on the case where the differences are the same on parallel sides.
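As an illustration of the case where only $\Delta_i$ and $\nabla_o$ are fixed, the sketch below computes an FBCT, assuming the usual definition from [BHL+20]: the entry at $(\Delta_i, \nabla_o)$ counts the inputs $x$ such that $S(x) \oplus S(x \oplus \Delta_i) \oplus S(x \oplus \nabla_o) \oplus S(x \oplus \Delta_i \oplus \nabla_o) = 0$. `EXAMPLE_SBOX` is an arbitrary 4-bit permutation chosen for the example; it is not taken from Table 2.

```python
def fbct(sbox):
    """Feistel Boomerang Connectivity Table of an S-box (assumed definition)."""
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for delta_i in range(n):
        for nabla_o in range(n):
            table[delta_i][nabla_o] = sum(
                1
                for x in range(n)
                if sbox[x]
                ^ sbox[x ^ delta_i]
                ^ sbox[x ^ nabla_o]
                ^ sbox[x ^ delta_i ^ nabla_o]
                == 0
            )
    return table


# Arbitrary 4-bit permutation used only for illustration.
EXAMPLE_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
EXAMPLE_FBCT = fbct(EXAMPLE_SBOX)
```

Whatever the S-box, the first row, the first column and the diagonal of this table are equal to 16 and the table is symmetric, which gives a quick sanity check of an implementation.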
From the distinguisher to the attack. Once an efficient boomerang distinguisher is found, there exist several techniques to extend it to a key recovery attack, as summarized for instance in [DQSW21]. In this article we focus on the technique devised by Zhao et al.
The parameters on which the complexities of an attack depend are shown on the right in Figure 3. The key recovery works by adding a few rounds before and after the $N_d$ distinguisher rounds $E_d$. We denote by $E_b$ the $N_b$ prepended rounds and by $E_f$ the $N_f$ appended rounds. The attacker extends backward with probability one the input difference $\alpha$, obtaining a truncated difference $\alpha'$ with $r_b$ possibly active bits and $n - r_b$ inactive bits. In the same way, $\delta$ is extended forward with probability one over the $N_f$ rounds, giving a truncated difference $\delta'$ with $n - r_f$ inactive bits.
The idea to recover some key material is to make a guess on key bits appearing in $E_b$ and $E_f$ in order to be able to count how many times the distinguishing property happens, the correct key material being amongst the ones with a large number of hits. In the following description, we denote by $m_b$ the number of key bits in $E_b$ and by $m_f$ the number of key bits in $E_f$. The detail of the attack procedure devised by Zhao et al. [ZDM+20] is as follows, where $s$ is the expected number of right pairs: structures of $2^{r_b}$ plaintexts each, and store them with their associated plaintexts.
(c) Insert $S$ into a hash table $H$ indexed by the $n - r_f$ bits that are inactive in $\delta'$. Each collision defines a quartet $(C_1, C_2, C_3, C_4)$.
(d) Use these quartets to determine the correct $m_f$ key bits. The time complexity of this stage is denoted .
Depending on the parameters, two factors might be dominating the time complexity; either the cost of stage 2.(b) or the cost of the last stage.Their time complexity is respectively 2 Since stage 2.(b) does partial encryptions over E b , µ can be approximated by while corresponds to the cost of gradually decrypting rounds to check the validity of a key guess, so we decide to approximate it by 1 s .
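The collision step (c) recalled above can be illustrated with a dictionary. The sketch below rests on an assumption about the data layout that is not spelled out here: the set $S$ is taken to be a list of ciphertext pairs $(C_1, C_2)$, and two pairs collide when they agree on the inactive bits of $\delta'$ in both components. The function name and the mask argument are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_quartets(cipher_pairs, inactive_mask):
    """Group ciphertext pairs by their inactive bits and form quartets.

    cipher_pairs  : list of (C1, C2) integers (assumed layout of the set S)
    inactive_mask : integer mask selecting the n - r_f inactive bits
    """
    table = defaultdict(list)
    for c1, c2 in cipher_pairs:
        # Index of the hash table: inactive bits of both ciphertexts.
        table[(c1 & inactive_mask, c2 & inactive_mask)].append((c1, c2))
    quartets = []
    for bucket in table.values():
        # Every collision inside a bucket defines one candidate quartet.
        for (c1, c2), (c3, c4) in combinations(bucket, 2):
            quartets.append((c1, c2, c3, c4))
    return quartets
```

With the toy input `[(0x12, 0x34), (0x1F, 0x3A), (0x22, 0x34)]` and mask `0xF0`, only the first two pairs share an index, so a single quartet is produced.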
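Success probability. We use the formula devised in [Sel08] for differential cryptanalysis (and later used in the context of rectangle attacks) to evaluate the probability of finding the correct key:
$$P_s = \Phi\left(\frac{\sqrt{s S_N} - \Phi^{-1}(1 - 2^{-h})}{\sqrt{S_N + 1}}\right),$$
where $\Phi$ is the cumulative distribution function of the standard normal distribution, $S_N$ is the signal-to-noise ratio, equal to $p^2 q^2 r / 2^{-n}$, and $h$ is the advantage.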
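The success probability can be evaluated numerically with the standard library. The sketch below assumes the usual form of the formula of [Sel08], $P_s = \Phi\big((\sqrt{s S_N} - \Phi^{-1}(1 - 2^{-h})) / \sqrt{S_N + 1}\big)$; the function name and the parameter values in the usage note are illustrative only.

```python
from math import sqrt
from statistics import NormalDist


def success_probability(s, signal_to_noise, advantage_bits):
    """Success probability for s expected right pairs, signal-to-noise
    ratio S_N and an advantage of h bits (assumed form of the formula)."""
    std = NormalDist()
    # Phi^{-1}(1 - 2^-h) = -Phi^{-1}(2^-h); the right-hand side avoids
    # rounding 1 - 2^-h to exactly 1.0 when h is large.
    quantile = -std.inv_cdf(2.0 ** -advantage_bits)
    z = (sqrt(s * signal_to_noise) - quantile) / sqrt(signal_to_noise + 1)
    return std.cdf(z)
```

As an example of use, a distinguisher with $p^2 q^2 r = 2^{-124}$ on a 128-bit block gives $S_N = 2^{4}$, so one would call `success_probability(s, 16, h)` for the chosen values of `s` and `h`; increasing `s` increases the returned probability.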

Delaune et al.'s Model
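The model of Delaune et al. [DDV20] proceeds in two steps: a Step 1 search for truncated boomerangs, followed by a Step 2 search that tries to instantiate those truncated boomerangs with concrete nibble differences so that the distinguisher has the highest possible probability. In this article, we use the same notation and similar steps.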
A boomerang distinguisher uses two differential trails. One is called the upper trail and determines $\alpha$, the input difference of the distinguisher. The other one is called the lower trail and determines $\delta$, the output difference of the distinguisher (see Figure 3). In the model proposed in [DDV20] the division as a sandwich is not made but the upper and lower trails are searched on all the rounds. In what follows we denote by $\delta X[r, i]$ the nibble difference at the input of an S-box and by $\delta SB[r, i]$ the corresponding output difference of the S-box.
The boomerang model of [DDV20] uses six variables for each S-box in its Step 1: 3 variables relate to the upper trail (in the encryption direction) whereas the 3 others relate to the lower trail (in the decryption direction). These variables are used to select the proper boomerang transition tables and are defined as follows. $isActive_{Xup}[r, i]$ (resp. $isActive_{Xlo}[r, i]$) is a Boolean variable that indicates if the S-box is active in the upper (resp. lower) trail. $free_{Xup}[r, i]$ (resp. $free_{Xlo}[r, i]$) is a Boolean variable that indicates if the nibble difference $\delta X[r, i]$ is free of conditions, that is, can take any value with a uniform probability in the upper (resp. lower) trail. $free_{SBup}[r, i]$ is a Boolean variable that indicates if the nibble difference $\delta SB[r, i]$ can take any value with a uniform probability in the upper trail as an output of the S-box. $free_{SBlo}[r, i]$ is a Boolean variable that indicates if the nibble difference $\delta SB[r, i]$ can take any value with a uniform probability in the lower trail. Note that the $free_{SBup}[r, i]$ variable represents the state of the variable after the S-box in the encryption direction, so $free_{SBlo}[r, i]$ can be seen as the input state of the S-box in the decryption direction.
Several constraints describe the relations between these variables, starting with the one modelling the propagation of the free states through the S-boxes: if a variable is free before an S-box, it is also free after the S-box.Since the propagation is done in the opposite direction for the lower trail, the implication is in the other direction for the lo variables.
The second rule ensures that if an S-box output is free then the S-box input must be non-zero.Again the lower trail is reversed since it represents the decryption direction.
The third rule ensures that we can compute the probability of the S-box by setting a minimum number of parameters.
Finally, for any linear operation there is a constraint stating that if any input variable is free then all the output variables on which it depends are also free.
Given this set of constraints, the solver is going to choose the best truncated trail among the ones with a valid propagation of differences, where the quality of a trail is measured by the best probability it might reach. Namely, given the state of each S-box, one can determine which table (DDT, BCT, etc.) should be used to compute its probability, and based on this the best probability of the Step 1 solution is obtained by assuming that the best transition in the table is met. Once the best solution for Step 1 is found, it is given as input to the Step 2 model, which looks for a concrete instance of the upper and lower trails, again with the objective of reaching the best possible probability.

Automatic Search of Truncated Boomerang Distinguishers for Feistel Ciphers
This section describes how to build an automated tool that searches for truncated boomerang distinguishers for Feistel ciphers. Our method follows the idea developed by Delaune et al. for Skinny in [DDV20] but makes the required adjustments to fit the Feistel structure.
Note that the F function is never inverted and that the only difference comes from the direction in which the XOR is computed.
Required changes. The model presented in [DDV20] for SPN ciphers treats the S-boxes of the lower and upper trails differently to take into account the direction in which they are computed. Given the specific property of Feistel ciphers (that are their own inverse) and as illustrated in Figure 4, our model does not have to make this distinction, so we end up with the same constraints for the S-boxes in the upper and in the lower trail: For the same reason we change the second constraint as follows: Knowing which input and output differences are fixed for every S-box allows us to select the correct table from Definition 1 to compute the associated boomerang probability. For instance, if the two input differences $\Delta_i$ and $\nabla_o$ are fixed while $\alpha$ and $\delta$ are free parameters, the required table is $FBCT(\Delta_i, \nabla_o)$.
For the model, it corresponds to the case where the input value of $X_{up}$ and the input value of $X_{lo}$ are fixed, that is, where $free_{Xup}$ and $free_{Xlo}$ are assigned to false. $\alpha$ and $\delta$ being free means that $free_{SBup}$ and $free_{SBlo}$ are equal to true, so we end up with constraint (4), which indicates when the FBCT table is required to compute a 1-round probability.
Other tables are built in the same way and we obtain constraints (7) to (11) of Model 1 that select the correct table according to which variables are fixed or not.
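The S-box rules and the FBCT selection case discussed above can be written as plain Boolean predicates. This is a hedged sketch and not the model itself: the class and function names are hypothetical, the two rules are transcribed from the prose (a free input implies a free output, and a free output implies an active S-box, identically for both trails), and only the FBCT case is covered since it is the only one fully described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrailState:
    """Step 1 abstraction of one S-box in one trail (upper or lower)."""
    is_active: bool  # the S-box input difference is non-zero
    free_x: bool     # the input difference can take any value
    free_sb: bool    # the output difference can take any value


def respects_sbox_rules(t):
    """Same rules for the upper and the lower trail (Feistel case)."""
    propagates = (not t.free_x) or t.free_sb  # free input => free output
    nonzero = (not t.free_sb) or t.is_active  # free output => active S-box
    return propagates and nonzero


def uses_fbct(up, lo):
    """FBCT case: both inputs fixed, both outputs free."""
    return (
        respects_sbox_rules(up)
        and respects_sbox_rules(lo)
        and not up.free_x
        and not lo.free_x
        and up.free_sb
        and lo.free_sb
    )
```

In a SAT or CP model the same implications become clauses over the six Boolean variables attached to each S-box; the predicates above are only convenient for checking a candidate assignment by hand.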
To ensure that the probability of each S-box can be computed using one of the Feistel boomerang transitions, we add the following constraint: While S-boxes are treated in the same way in the upper and lower trail, special care has to be taken to correctly propagate knowledge through the XOR operations. In the upper trail (Figure 4a) we have the following equation: $A' = F(B) \oplus A$, while for the lower trail (Figure 4b) we have: $A = F(B) \oplus A'$. This leads to the following distinction in the constraints: Constraints (2) to (5) are the core mechanisms of the boomerang model for Feistel ciphers. They must be applied on every S-box transition. Constraint (6) must be used on the parts of the state that are XORed together.
Resulting model. The complete model is provided in Model 1. Its first half is dedicated to the selection of the correct boomerang table. The second part starts with constraint (12), which ensures that the trails are active (i.e. that there is at least one difference in $\alpha$ and $\delta$). Constraints (13) to (16) define the propagation from one round to the other, while the block of constraints (17) corresponds to the constraints (2), (3) and (5) explained at the beginning of this section and models the S-box transition. The model ends with the objective (18), given here in its naive form, which can be simplified as we now discuss.

Improvements
Weighted sum simplification. Given a model looking for differential characteristics, an upper bound of the probability is obtained by multiplying the number of active S-boxes found during Step 1 by the $\log_2$ of the maximum probability of the transition of an active S-box, that is $UB = 2^{-P_{DDT} \times \#SB}$ (with $P_{DDT}$ taken in absolute value).
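The bound above only needs the best non-trivial DDT transition of the S-box and the number of active S-boxes. The sketch below computes both for an arbitrary S-box; the function names are hypothetical and `EXAMPLE_SBOX` is an arbitrary 4-bit permutation used for illustration, not the S-box of Table 2.

```python
from math import log2


def ddt(sbox):
    """Difference Distribution Table of an S-box."""
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for a in range(n):
        for x in range(n):
            table[a][sbox[x] ^ sbox[x ^ a]] += 1
    return table


def max_transition_weight(sbox):
    """-log2 of the best transition probability for a non-zero input difference."""
    n = len(sbox)
    table = ddt(sbox)
    best = max(table[a][b] for a in range(1, n) for b in range(n))
    return -log2(best / n)


def step1_upper_bound_log2(sbox, active_sboxes):
    """log2 of UB = 2^(-P_DDT * #SB)."""
    return -max_transition_weight(sbox) * active_sboxes


# Arbitrary 4-bit permutation used only for illustration.
EXAMPLE_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
```

For a boomerang model the same computation would be repeated for each table that can be selected, which yields the per-table maxima entering the weighted sum.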
Similarly, for a model looking for boomerang distinguishers, the upper bound has to take into account the various tables that are possible (the ones that can be selected in the first half of Model 1) and for each of them their maximum probability, denoted $P_{DDT}$, $P_{DDT^2}$, ..., $P_{FBDT}$. The objective (and consequently the bound) thus corresponds to a weighted sum, as shown in (18), Model 1.

Model 1: Model searching for truncated boomerangs on WARP, part 1/2: table selection.
Even if the semantics remain the same, reordering this weighted sum may have a huge impact on the resolution time. The first simplification that can be done corresponds to cases where a table has a maximum probability of 1. In such a setting, the table can simply be ignored during Step 1. This happens for the FBCT of WARP. The second simplification occurs when different tables have the same maximum probability, in which case they can be grouped by their respective maximum probabilities. For WARP, such an equality happens for the DDT, the FBDT and the FBET, which have the same maximum, denoted $P_{isTable}$. Also, the $DDT^2$ transitions can be handled by counting them twice as much as the DDT ones in the sum. Thus, the obj function can be simplified as follows: minimize obj : all rounds include $E_b$, $E_d$ and $E_f$ rounds of Figure 3 while distinguisher rounds only include $E_d$ rounds.
Model 1: Model searching for truncated boomerangs on WARP, part 2/2. $\pi_{even}$ and $\pi_{odd}$ correspond to the subparts of the $\pi$ permutation for even or odd inputs only. BR is the number of branches of the cipher, so is equal to 32 in the case of WARP.
In addition to this, since there is a single maximum probability for all the tables (except for the FBCT removed previously), we can rewrite the weighted sum as: Finally, once the objective function is simplified we can use the Quine-McCluskey algorithm to create a minimized Boolean predicate isTable and use it in the weighted sum:

Incremental search. In our model, the optimal probability is non-increasing in the number of rounds $r$ (equivalently, the minimized objective is non-decreasing): if we do not add an active S-box we will have the same optimal probability, while if we do add one S-box the probability will be equal (if the maximum possible probability of the table is 1) or lower (if the maximum possible probability is strictly less than 1). We can use this information to lower bound the objective function for $r + 1$ rounds when we know the optimal probability for $r$ rounds: This observation gives the model additional information about the minimum bound, which makes it stop the search earlier and therefore saves execution time.
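The incremental search can be organised as a simple loop that feeds the optimum found for $r$ rounds as a lower bound to the search on $r + 1$ rounds. In the sketch below, `solve` is a hypothetical callback standing for a call to the Step 1 solver; its signature is an assumption made for the example.

```python
def incremental_search(solve, max_rounds):
    """Run the search for 1..max_rounds rounds, reusing previous optima.

    solve(rounds, lower_bound) is a hypothetical callback that returns the
    optimal (minimized) objective for the given number of rounds, knowing
    that this optimum cannot be smaller than lower_bound.
    """
    best = {}
    bound = 0
    for rounds in range(1, max_rounds + 1):
        bound = solve(rounds, bound)
        best[rounds] = bound
    return best
```

A solver that receives such a bound can stop as soon as it finds a solution whose objective equals the bound, since no better solution can exist.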
Forcing the pattern of the solution. The previous model makes it possible to find distinguishers for up to 20 rounds of WARP (in both Step 1 and Step 2), but more rounds are out of reach as the computation time grows exponentially in the number of rounds.
However, we found that all the optimal solutions returned for $N_d = 15$ to $N_d = 20$ have a specific pattern of the form 1-1-0-1-1, that is, they contain a sequence of 5 rounds with 1 active S-box in the first round, 1 active S-box in the second round, 0 in the third one, and so on.
While we cannot formally prove that this pattern is going to appear in the optimal solutions for 21 rounds and more, we believe that there are high chances that it does, so we decided to add a Step 1 constraint that forces the solutions to follow this specific pattern.This assumption seems reasonable as we did not observe a break in the probability chart.
Formally, it gives (note that we do not fix the position where the pattern appears):
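The pattern condition can be checked on a candidate solution as follows. This is a sketch of the check only, not of the Step 1 constraint itself: it assumes the solution is summarised by the number of active S-boxes in each round, and the function name is hypothetical.

```python
PATTERN = (1, 1, 0, 1, 1)


def follows_pattern(active_per_round, pattern=PATTERN):
    """True if the per-round counts of active S-boxes contain the pattern
    as a window of consecutive rounds, at any position."""
    k = len(pattern)
    return any(
        tuple(active_per_round[i:i + k]) == pattern
        for i in range(len(active_per_round) - k + 1)
    )
```

In the solver, the same condition is a disjunction over all possible starting rounds, which is why the position of the pattern does not need to be fixed in advance.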

Instantiating the Truncated Boomerang Distinguishers
As mentioned before, we decompose our analysis into two steps. The first one (described above) implements the search of truncated boomerang distinguishers and is written in Picat SAT [ZK16], the SAT compiler in the Picat system. Each S-box of each round is associated with 6 bits: 3 for the upper trail and 3 for the lower trail. They indicate if an S-box is active or not, if the S-box input is free or not and if the S-box output is free or not.
The second step looks for concrete instantiations of the previous truncated solutions. It is written in the open source Java constraint programming library Choco [PFL16]. This step is also inspired by that of [DDV20].

CP Model
The Constraint Programming model of Step 2 takes as input the results of Step 1 to know the general shape of the distinguisher, in particular which nibbles are inactive. To transform a truncated solution into a concrete one we need to assign values to the nibbles. For each pair of nibble abstractions $(isActive_X, free_X)$ we create a variable $\delta X$ whose domain depends on the Step 1 solution: In the same way, $\delta SB$ variables are created depending on the value of the pair $(isActive_X, free_{SB})$. As the free variables can take any value from the nibble domain and are not constrained by the model we ignore them in Step 2.
One table constraint is created for each of the tables appearing in the probability computation (DDT, $DDT^2$, FBCT, FBDT and FBET), and we also make one to handle the XOR over nibbles. In addition to indicating the valid transitions, each table constraint also contains a third variable corresponding to the absolute value of the base-2 logarithm of the transition probability. The truncated solutions output by Step 1 completely define which table is used. For example if we have: ), which corresponds to a DDT transition in the upper trail, we add the constraint: The objective function is then the following sum:
Combining the two Steps.

Clusters
Once the optimal solution has been found for Step 1 and Step 2 (this solution is hereafter denoted $\langle S1_{ref}, S2_{ref} \rangle$), the goal is to obtain a better approximation of the actual probability of the boomerang distinguisher by considering clusters. Indeed, the solution returned by Step 2 has most of its S-box transitions fixed, while the only differences that matter when considering a boomerang distinguisher are the input and output differences ($\alpha$ and $\delta$ in Figure 3).
To get closer to this actual probability, we start by generating multiple Step 1 solutions that have their truncated differences in the first round and in the last round equal to the ones of $S1_{ref}$. The objective is to take into account many solutions that are all different from one another. We need to be careful about what being different means in our context, as the situation is a bit more subtle than for a differential attack (for which the only two possible S-box statuses are "active" and "inactive").
In our model, we have the special case of the free S-box inputs that can take any value uniformly.If we focus on one particular S-box, the case of a fixed active input difference can be seen as contained in this one, so we must not count these cases as two independent ones.To be on the safe side, we choose to consider that two Step 1 solutions are different if at least one of their S-boxes is not free and inactive in one while it is not free and active in the other.
We thus search for the Step 2 solutions corresponding to these, with the additional condition that the nibble differences in the first and last rounds are the ones of $S2_{ref}$.
We sum over the different values of $obj_2$ in a variable called OBJ. To avoid counting solutions with too low probabilities we leave out the solutions of probability lower than $2^{-10} \times p(S1_{ref})$ for both Step 1 and Step 2. Still, the large number of solutions forces us to set a limit on the number of solutions enumerated in Step 2 for a given Step 1 solution: we set this limit to $2^{20}$.
To check the validity of our approach, we wanted to compare the result of the simple model and of the model with the clusters with what can be experimentally observed. To do so, we decided to pick a Feistel cipher with a smaller block size than WARP, with the hope that the clustering effect would be easier and faster to observe. We selected the 64-bit block cipher TWINE [SMMK13] reduced to 12 rounds. We first computed its experimental probability by fixing the differences $\alpha$ and $\delta$ and counting how many times the boomerang comes back. For the considered example, we obtained a probability close to $2^{-25.8}$ when testing $2^{29}$ plaintexts with $2^{4}$ keys in an experiment implemented in Rust. The corresponding optimal Step 1 solution has a probability of $2^{-26}$ and is instantiated in Step 2 with a trail of probability $2^{-26}$. When aggregating solutions with the same input and output differences we obtain an approximation of the distinguisher probability of $2^{-25.15}$, which is close to the experimental result, albeit slightly exceeding it.
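The aggregation over a cluster amounts to summing the probabilities of the retained trails. The sketch below assumes each Step 2 solution is summarised by its weight (the absolute value of the $\log_2$ of its probability); the function name and default arguments are hypothetical, with the cutoff of 10 bits and the enumeration limit of $2^{20}$ taken from the text.

```python
from math import log2


def cluster_probability_log2(weights, ref_weight, cutoff_bits=10, limit=2**20):
    """log2 of the summed probability of a cluster of trails.

    weights     : weights of the enumerated Step 2 solutions
    ref_weight  : weight of the reference solution
    cutoff_bits : solutions below 2^-cutoff_bits times the reference
                  probability are left out
    limit       : maximum number of enumerated solutions considered
    """
    kept = [w for w in weights[:limit] if w <= ref_weight + cutoff_bits]
    # At least one solution (the reference one) is expected to be kept;
    # an empty cluster would make log2 fail.
    return log2(sum(2.0 ** -w for w in kept))
```

For example, three trails of weights 26, 26 and 27 sum to $2.5 \times 2^{-26}$, that is, roughly $2^{-24.7}$, while a trail of weight 40 would be discarded by the cutoff when the reference weight is 26.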

Implementation details. Step 1 is written in MiniZinc and runs on Picat [ZK16], which uses the Lingeling solver [Bie11] under the hood. Step 2 is written in Choco [PFL16] version 4.10.6, which is a dedicated framework for Constraint Programming. We found 20 instances (from 3 rounds up to 22 rounds) with a probability better than $2^{-128}$. They all took less than 48 hours to solve.
The black line corresponds to the probability $2^{-128}$.
Without taking into account the clusters, the longest distinguisher that can be obtained is a 22-round boomerang of probability $2^{-120}$. By summing up several boomerang trails inside one 23-round solution we are able to build a distinguisher of 23 rounds with probability $2^{-124}$. By exploiting the position of the key addition, it can easily be extended to a 25-round distinguisher, thanks to the simple trick that we now present.
The distinguisher is depicted in Appendix A. Its 32 nibbles of input and output differences are given by:
α = 57 00 00 07 00 00 57 57 07 57 00 07 00 00 57 00
δ = 70 05 00 70 05 00 70 70 00 00 70 70 00 00 00 05
(note that for simplicity we kept the last round permutation in the figures and here). To exploit this distinguisher, an attacker would ask for the encryption of a large number of pairs $M_1, M_2$ verifying $M_1 \oplus M_2 = \alpha$, and build two new ciphertexts by computing $C_3 = E(M_1) \oplus \delta$ and $C_4 = E(M_2) \oplus \delta$. She would then ask for the corresponding plaintexts $M_3$ and $M_4$ and check if $M_3 \oplus M_4 = \alpha$.
Since the round keys are added after the application of the S-boxes, an attacker can compute the difference entering the second round of the upper trail, and similarly the difference at the input of the S-boxes of the penultimate round of the lower trail. This easily leads to an extension of two rounds of any boomerang distinguisher. The attacker starts by picking a random message $M_1$ and computes $M_2$ according to the difference she wants to observe one round later. For instance, it would give the beginning of $M_2$ to be and the boomerang returns if $M_3$ and $M_4$ verify
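The way such a distinguisher is exercised can be sketched on a toy cipher. The code below is not WARP: it uses a hypothetical 16-bit, two-branch Feistel with an arbitrary round function, and only illustrates the procedure (encrypt a pair with difference $\alpha$, shift both ciphertexts by $\delta$, decrypt, and test whether the plaintext difference is $\alpha$ again).

```python
import random


def toy_f(x, k):
    """Arbitrary 8-bit round function (illustration only)."""
    return (((x ^ k) * 37) ^ (x >> 3)) & 0xFF


def toy_encrypt(block, keys):
    left, right = block >> 8, block & 0xFF
    for k in keys:
        left, right = right, left ^ toy_f(right, k)
    return (left << 8) | right


def toy_decrypt(block, keys):
    left, right = block >> 8, block & 0xFF
    for k in reversed(keys):
        left, right = right ^ toy_f(left, k), left
    return (left << 8) | right


def boomerang_returns(m1, alpha, delta, keys):
    """One boomerang query: does the quartet come back with difference alpha?"""
    m2 = m1 ^ alpha
    c3 = toy_encrypt(m1, keys) ^ delta
    c4 = toy_encrypt(m2, keys) ^ delta
    m3 = toy_decrypt(c3, keys)
    m4 = toy_decrypt(c4, keys)
    return (m3 ^ m4) == alpha


def estimate_return_rate(alpha, delta, keys, trials, seed=0):
    """Fraction of random plaintexts for which the boomerang returns."""
    rng = random.Random(seed)
    hits = sum(
        boomerang_returns(rng.getrandbits(16), alpha, delta, keys)
        for _ in range(trials)
    )
    return hits / trials
```

With $\delta = 0$ the shifted ciphertexts are unchanged, so the boomerang always returns; this degenerate case is a convenient sanity check before running the experiment with the differences of an actual distinguisher.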

Automatic Search of Rectangle Attacks
As already discussed in [ZDJ19] in the case of Deoxys-BC, the best rectangle distinguishers do not always lead to the best attacks, and choosing a sub-optimal (in terms of probability) distinguisher might allow to cover more rounds in the key recovery phase, and then to attack a bigger version of the cipher.
Following this idea, Lingyue Qin et al. proposed an automatic model that directly searches for an attack [QDW + 21] by taking into account the dominating factors of the key-recovery step.The model simply minimizes the time complexity of the attack instead of maximizing the probability of the distinguisher, while making sure that the data complexity does not exceed the full codebook.The number of rounds on which is run the model is gradually increased until the returned time complexity exceeds the cost of an exhaustive search of the secret key.This technique turned effective as it leads to improved attacks on the SPN ciphers Skinny and ForkSkinny [QDW + 21].
In this section we study how to apply a similar idea to find good attack parameters for WARP, and show that when considering the attack technique introduced by Zhao et al. in [ZDM+20] there are at least two possible improvements in comparison to a variant of WARP with the key addition positioned before the S-box. The first one is the reduction of the value of $m_b$ (the number of key bits that have to be guessed in the upper rounds), and the second one is the potential growth of the number of filtering bits, that is of $n - r_f$. These two improvements are crucial since the two predominating factors of the time complexity of the attack of

Taking Advantage of the Structure of WARP
Reduction of $m_b$. To understand the first point, we look at a simple example that considers $N_b = 3$ rounds of key recovery prepended to the distinguisher, where α, the top difference of the boomerang distinguisher, has only two active nibbles, in positions 1 and 4 (see the bottom of Figure 6).
To determine the value of $r_b$ (the number of active bits in the plaintext structures), an attacker starts by propagating the difference α backwards to know which nibbles might be active and which are inactive for sure. This process is rather straightforward, and in our example it returns $r_b = 32$ active nibbles (denoted in green in Figure 6).
Figure 6: Determining the required key bits to apply [ZDM+20] over 3 rounds.
The next step is the determination of the key bits that are required to compute $M_2$ from $M_1$. In the description given in [ZDM+20], the attacker starts from $M_1$, partially computes the state at the input of the boomerang distinguisher so that she can add α to it, and decrypts the result to get $M_2$. Put differently, it can be seen as guessing the necessary key bits to uniquely determine the difference to add to $M_1$ to get $M_2$, knowing that after $N_b$ rounds the internal states differ by α, which can also be seen as being able to uniquely propagate α backwards.
Consequently, the attacker starts from α, and takes note of the information needed: in round 2, the exact difference at the output of the second S-box has to be uniquely determined, so the input value of this S-box is needed. This is denoted by the dashed lines in Figure 6. Knowing this value in round 2 implies that the two inputs of the xor number 6 of round 1 are known, which in particular forces a guess on the key nibble added after the S-box. The rest of the backward propagation is processed similarly, and we obtain that 4 nibbles of key are required in total. In the case where round keys are added before the S-boxes, this computation would have returned a total of 10 nibbles of key (we consider here that the round keys are independent).
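The reason why fewer key nibbles are needed can be checked on a single F function. The sketch below compares the output difference of F when the key is added after the S-box with the case where it is added before. The S-box table is assumed to be the 4-bit S-box of WARP as we recall it from [BBI+20]; the argument only relies on it being non-linear, and the function names are ours.

```python
# Assumed 4-bit S-box (believed to be the one used in WARP).
SBOX = [0xC, 0xA, 0xD, 0x3, 0xE, 0xB, 0xF, 0x7,
        0x8, 0x9, 0x1, 0x5, 0x0, 0x2, 0x4, 0x6]


def out_diff_key_after(x: int, d: int, k: int) -> int:
    """Output difference of F(x) = S(x) ^ k for the input pair (x, x ^ d)."""
    return (SBOX[x] ^ k) ^ (SBOX[x ^ d] ^ k)


def out_diff_key_before(x: int, d: int, k: int) -> int:
    """Output difference of F(x) = S(x ^ k) for the input pair (x, x ^ d)."""
    return SBOX[x ^ k] ^ SBOX[x ^ d ^ k]


def possible_diffs(f, x: int, d: int) -> set:
    """Set of output differences obtained over all 16 key nibbles."""
    return {f(x, d, k) for k in range(16)}


if __name__ == "__main__":
    x, d = 0x0, 0x1
    print("key after S-box :", sorted(possible_diffs(out_diff_key_after, x, d)))
    print("key before S-box:", sorted(possible_diffs(out_diff_key_before, x, d)))
```

With the key added after the S-box, the key cancels in the output difference, so knowing the value entering the S-box is enough. With the key added before, the same difference depends on the key nibble, which therefore has to be guessed.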

Improved filtering process.
For some specific shapes of output difference δ, the attacker is able to increase the number of bits on which she looks for collisions in step 2(c) of the attack. An example of this is given in Figure 7 with $N_f = 2$: by counting the number of active nibbles at the output, we obtain $r_f = 21 \times 4 = 84$, which means that the hash table is used to look for collisions over $2 \times (128 - 84) = 88$ bits.
In our example, the F functions number 4 to 9, 12 and 13 in the last round all have a similar difference pattern where the input of the S-box (X[r, 2i]) is active while the right part (X[r, 2i + 1]) is not.
The inactivity of $X[r, 2i + 1]$ can be translated into the equality
$F(C[2i]) \oplus C[2i + 1] = F(C'[2i]) \oplus C'[2i + 1]$,
where $C$ and $C'$ denote the two ciphertexts being compared. Given that $F$ is the application of an S-box followed by a sub-key addition, it can be simplified into
$S(C[2i]) \oplus C[2i + 1] = S(C'[2i]) \oplus C'[2i + 1]$,
which does not depend on a secret value. Thus, the idea is to simply add as index the value of $S(C[2i]) \oplus C[2i + 1]$ when building the $H$ table of step 2(c), in addition to the value of the bits where the difference is expected to be 0 in the ciphertexts. In our example, it means that we are colliding on 32 additional bits, and thus that the filter for quartets is of size $2 \times (128 - 52) = 152$ bits.
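The key-independence of this extra index can be checked with a small experiment. The sketch below models only the last round, without the nibble permutation, as $C[2i] = X[2i]$ and $C[2i+1] = X[2i+1] \oplus S(X[2i]) \oplus K[i]$. The S-box table is assumed to be that of WARP, and all function names are illustrative.

```python
import random

# Assumed 4-bit S-box (believed to be the one used in WARP).
SBOX = [0xC, 0xA, 0xD, 0x3, 0xE, 0xB, 0xF, 0x7,
        0x8, 0x9, 0x1, 0x5, 0x0, 0x2, 0x4, 0x6]


def last_round(state, round_key):
    """Last round on 32 nibbles, key added after the S-box, no permutation."""
    out = list(state)
    for i in range(len(state) // 2):
        out[2 * i + 1] = state[2 * i + 1] ^ SBOX[state[2 * i]] ^ round_key[i]
    return out


def filter_index(ciphertext, i):
    """Key-independent value used as additional index in the hash table."""
    return SBOX[ciphertext[2 * i]] ^ ciphertext[2 * i + 1]


def quartet_filter_bits(n, r_f, extra_bits_per_pair=0):
    """Size in bits of the filter on quartets (two ciphertext pairs)."""
    return 2 * (n - r_f + extra_bits_per_pair)


def demo(seed=0, i=4):
    """Two states with X[2i] active and X[2i+1] inactive share the index."""
    rng = random.Random(seed)
    key = [rng.randrange(16) for _ in range(16)]
    x1 = [rng.randrange(16) for _ in range(32)]
    x2 = list(x1)
    x2[2 * i] ^= rng.randrange(1, 16)  # non-zero difference on the S-box input
    c1, c2 = last_round(x1, key), last_round(x2, key)
    return filter_index(c1, i) == filter_index(c2, i)
```

Calling `quartet_filter_bits(128, 84)` and `quartet_filter_bits(128, 84, 32)` gives back the two filter sizes of the example, 88 and 152 bits.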

Model for Searching a Rectangle Attack
In this subsection, we briefly go over the main characteristics of the model searching for the rectangle attack. The detailed model is given in Model 2 in Appendix C. It takes as input the number of rounds covered by the distinguisher and by the prepended and appended rounds of key recovery (respectively $N_d$, $N_b$ and $N_f$) and returns the complexities of the best attack that can be found.
The intermediate values that are needed to evaluate the time and data complexity of the attack are $m_b$, $r_f$, and the probability of the distinguisher (previously denoted $p^2 q^2 r$). The latter is determined with the same constraints as in the model searching for the best distinguisher, and we simply add constraints to model the additional $N_b$ and $N_f$ rounds.
• $r_b$ and $r_f$ are computed by propagating with probability one the difference at the input and at the output of the boomerang distinguisher, see constraints (30) and (34). To take into account the above-described trick on the filtering process, we define $r_f$ as the number of active nibbles entering the last round.
• The data complexity is computed so that $s$ right quartets are found, see constraint (27). To make sure that it does not exceed what is available, one may add a constraint stating that it is at most $2^n$, the size of the full codebook.
• As it is impossible to compute the cluster at this stage, we introduce a variable cluster gain, which is set by the attacker to represent the expected gain obtained with clusters. Its value is precisely computed afterwards, once a solution to the model is obtained.
• The time complexity is computed as the maximum between the two most expensive stages detailed in Section 2.2, see constraint (27). Again, one may add a constraint saying that the resulting time complexity has to be smaller than the cost of an exhaustive search of the key.
• Constraint (29) makes the link between the known variables and the guessed key variables.
• We take into account the simple key schedule of WARP to precisely compute the value of $m_b$. We start by determining the states that have to be known in value (denoted known in the model, constraints (31), (32) and (33)) and then link them to the keys and to $m_b$, taking into account the key schedule (constraints (28) and (29)).
All the values (except t, which is related to the distinguisher probability, and cluster gain, which is first only approximated) can be computed during Step 1. As a result, all the constraints are computed in Step 1 and only the constraints that involve 2t or t are modeled in Step 2.

Results: A 26-round Attack on WARP
We apply the previous model to search for rectangle attacks on WARP with various values for the parameters $N_b$, $N_d$ and $N_f$. As the execution time rapidly increases with the number of rounds, we added another constraint (proposed in [DDV20]) which consists in using the bounds obtained for differential distinguishers. The idea is that the upper trail (resp. lower trail) cannot be better than a differential trail, i.e. the number of active S-boxes in the upper trail (resp. lower trail) cannot be lower than the minimal number of active S-boxes of a differential trail (optimal diff). To implement this idea for WARP we use the lower bound on the number of active S-boxes computed in [BBI+20].
The best attack we found covers 26 rounds of WARP, based on a distinguisher over $N_d = 22$ rounds, with $N_b = 1$ round added before and $N_f = 3$ rounds added after. The model took 6 days to solve this instance, and returned the following values: $m_b = 0$, $r_b = 72$, $r_f = 60$ and a distinguisher probability of $2^{-128}$. This search was made by assuming that $s$ is equal to 4 and that the value of the cluster gains would be the ones given in Table 4, so in the case of a 22-round distinguisher equal to a factor of $2^{12}$.
We next ran the cluster search on the 22-round distinguisher. We obtained a probability approximation of $2^{-111.2}$ (so with a cluster gain a bit larger than what was expected), resulting in the associated attack having a data complexity of $2^{120.6}$ messages, and a time complexity of a little less than $2^{116}$ encryptions when following the key recovery method introduced by Zhao et al.
The success probability of the attack is equal to 97.67% (using the formula given in [Sel08]) and this is the best attack reported so far on WARP.
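The reported data complexity can be cross-checked with a few lines of arithmetic. The relation used below, $D = \sqrt{s} \cdot 2^{n/2} / \sqrt{p^2 q^2 r}$, is an assumption on our side: it is the usual estimate for rectangle attacks and it reproduces the figure given above, but the exact expression used by the model is the one of constraint (27), which is not restated here.

```python
import math


def log2_data_complexity(n: int, s: float, log2_prob: float) -> float:
    """log2 of sqrt(s) * 2^(n/2) / sqrt(P), where P = 2^log2_prob.

    n          block size in bits
    s          expected number of right quartets
    log2_prob  log2 of the boomerang probability p^2 q^2 r
    """
    return 0.5 * math.log2(s) + n / 2 - 0.5 * log2_prob


if __name__ == "__main__":
    # Parameters of the 26-round attack: n = 128, s = 4, P = 2^-111.2.
    print(round(log2_data_complexity(128, 4, -111.2), 1))
```

With $n = 128$, $s = 4$ and a probability of $2^{-111.2}$, the three terms are $1 + 64 + 55.6$, that is $120.6$.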

On the Impact of the Key Addition Position
We now briefly discuss a variant of WARP with a round key addition made before the S-box application, and consider an attack using the same 22-round distinguisher, and the same number of rounds $N_b$ and $N_f$ added before and after the distinguisher. The different round structure changes the value of $m_b$ and $r_f$, as the improvements discussed in Section 4.1 cannot be applied anymore. The value of $m_b$ increases from 0 to 32, while $r_f$ is now equal to 88 instead of 60. The data complexity of an attack with such parameters would still be $2^{120.6}$, but the dominating factor of the time complexity becomes $2^{191.2}$.
This example shows the importance of the key addition position and of the techniques discussed in Section 4.1, which save a factor of $2^{75.3}$ in the time complexity (from $2^{191.2}$ down to $2^{115.9}$).

Conclusion
In this article, we propose the adaptation of two recent techniques to the case of Feistel ciphers to find boomerang distinguishers and rectangle attacks. Our analysis reveals a 23-round distinguisher and a 26-round attack on WARP, beating by 2 rounds the recent results of [TB21]. Our code is public and can be used as a basis to attack other Feistel ciphers, and we demonstrate its versatility by providing results for TWINE and LBlock-s (see Appendix D).
Secondarily, while studying WARP we show how to take advantage of the key addition position to reduce the complexity of the attack. In our specific case, this design decision reduces the time complexity of the attack by a factor of $2^{75}$ in comparison to a variant of WARP that would have the key addition positioned before the S-box (and thus would have the complementation property).

D Application of our Technique to TWINE and LBlock-s
To illustrate the flexibility of our tool, this section reports the results obtained when applying it to two well-known Feistel ciphers, TWINE and LBlock-s. TWINE [SMMK13] is a 64-bit block cipher with a Type-II GFN structure, and LBlock-s is used in the authenticated encryption scheme LAC [ZWW+14] submitted to the CAESAR competition. LBlock-s is a simplified version of the original cipher LBlock [WZ11] which uses only one S-box instead of the 8 original ones and admits 16 rounds or 32 rounds according to where it is used in LAC. It is also a 64-bit cipher and it can also be represented as a Type-II GFN, as shown in [SN14]. This is the representation that we used for our models. Then, for those two ciphers, we apply our method for computing the boomerang clusters, and the results are summed up in Table 5.

In [BHL+20], the authors used C code to experimentally compute the probability of the 8 middle rounds of the boomerang distinguisher, while a single trail was used for the top and the bottom parts of the sandwich. The slightly improved value obtained with our new method shows that the 8-round boomerang switch does not capture everything, and that other trails contribute to the boomerang.

Figure 3: Sandwich distinguisher (left) and setting for an attack, including the key recovery (right).

2. For each possible value of the $m_b$ key bits:
(a) Initialize $2^{m_f}$ key counters.
(b) Partially encrypt each plaintext $M_1$ of each structure using the guessed $m_b$ key bits up to the beginning of $E_d$. Add α to the computed value and decrypt it up to the plaintext, to obtain $M_2$. Construct the set $S$ (of size $y \cdot 2^{r_b}$) given by:

Figure 4: Encryption (4a) and decryption (4b) procedure of a classical Feistel cipher. Note that the F function is never inverted and that the only difference comes from the direction in which the XOR is computed.
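The property stated in this caption can be illustrated on a toy two-branch Feistel cipher. The code below is a generic sketch and not the WARP round function: the word size, the round function `toy_f` and the round keys are arbitrary choices made for the example, and `toy_f` is deliberately not invertible.

```python
def feistel_round(left, right, round_key, f):
    """Forward round: (L, R) -> (R, L ^ F(R, k))."""
    return right, left ^ f(right, round_key)


def feistel_round_inverse(left, right, round_key, f):
    """Backward round: F is evaluated in the forward direction again."""
    return right ^ f(left, round_key), left


def encrypt(left, right, round_keys, f):
    for k in round_keys:
        left, right = feistel_round(left, right, k, f)
    return left, right


def decrypt(left, right, round_keys, f):
    for k in reversed(round_keys):
        left, right = feistel_round_inverse(left, right, k, f)
    return left, right


def toy_f(x, k):
    """Arbitrary non-invertible 16-bit round function (AND loses information)."""
    return ((x & k) * 0x9E37 + 1) & 0xFFFF


if __name__ == "__main__":
    keys = [0x1234, 0xBEEF, 0x0F0F, 0xCAFE]
    c = encrypt(0x0123, 0x4567, keys, toy_f)
    print(decrypt(*c, keys, toy_f) == (0x0123, 0x4567))
```

Since decryption only ever calls F in the forward direction, an attacker who knows the input of F in a round can recompute its output, which is what the backward propagation in the key recovery relies on.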

Figure 7: Example of difference propagation over $N_f = 2$ rounds.

Table 1: Complexities of the existing results on WARP. Note that for all the distinguishers presented here we can add 2 rounds for free, see Section 3.4. ID = Impossible Differential, DC = Differential Characteristic.
permutation π in order to optimize both the diffusion and the number of active S-boxes in a differential or linear trail.The cipher iterates 41 rounds, where the final round misses π.
Definition 1 (FBCT, FBDT and FBET [BHL+20]). Let $S$ be a function from $\mathbb{F}_2^n$ to itself, and $(\Delta_i, \delta, \nabla_o, \alpha)$ be elements of $(\mathbb{F}_2^n)^4$. The Feistel Boomerang Connectivity Table (FBCT), Feistel Boomerang Difference Table (FBDT) and Feistel Boomerang Extended Table (FBET) of $S$ are given by:

Step 1 is composed of two different strategies: Step1-Opt, which searches for the truncated boomerang with the best objective obj, and Step1-Next, which enumerates one by one the solutions that reach this minimum obj. The best obj value is an upper bound (UB) that cannot always be reached, as Step 1 is an abstraction (some truncated solutions may not have concrete instances). The lower bound (LB) is initialized to 0 since it is the lowest possible value. For a given number of rounds, we first run Step1-Opt to find UB, and we next interleave Step1-Next with Step 2 to obtain a concrete boomerang with the best probability. Once done, we update LB with this new value and repeat the process for all the Step 1 optimal solutions. If a Step 2 solution is returned, it means that the solution has a better probability than the given LB, so we update LB and continue the search. If no Step 1 solution is found, it means that we have already seen all the Step 1 solutions that can match UB, so we degrade UB and continue the search with Step1-Opt. If LB = UB, we have found the best solution available. Note that the model generates many solutions in Step 1, and most of the time we stop the search when LB = UB instead of enumerating all possible Step 1 solutions.
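Coming back to Definition 1, the first of the three tables is easy to compute exhaustively for a 4-bit S-box. The sketch below uses the FBCT as it is usually stated following [BHL+20], namely the number of inputs $x$ such that $S(x) \oplus S(x \oplus \Delta_i) \oplus S(x \oplus \nabla_o) \oplus S(x \oplus \Delta_i \oplus \nabla_o) = 0$. The S-box table is assumed to be that of WARP, and the FBDT and FBET are left out.

```python
# Assumed 4-bit S-box (believed to be the one used in WARP).
SBOX = [0xC, 0xA, 0xD, 0x3, 0xE, 0xB, 0xF, 0x7,
        0x8, 0x9, 0x1, 0x5, 0x0, 0x2, 0x4, 0x6]


def fbct(sbox):
    """Feistel Boomerang Connectivity Table, indexed as table[delta_i][nabla_o]."""
    size = len(sbox)
    table = [[0] * size for _ in range(size)]
    for d_in in range(size):
        for d_out in range(size):
            table[d_in][d_out] = sum(
                1 for x in range(size)
                if sbox[x] ^ sbox[x ^ d_in] ^ sbox[x ^ d_out]
                ^ sbox[x ^ d_in ^ d_out] == 0
            )
    return table


if __name__ == "__main__":
    for row in fbct(SBOX):
        print(" ".join(f"{v:2d}" for v in row))
```

Whatever the S-box, the first row, the first column and the diagonal of the table are equal to $2^n$ and the table is symmetric, which gives a quick sanity check of an implementation.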

Table 4: Distinguishers found after 2 days when summing up boomerang characteristics in the same cluster for the best Step 1 solutions.
We choose the Picat solver to solve Step 1 as it is a SAT solver, and so is especially suited to problems on Boolean formulae. Previous works like [LDLS21] have shown that Picat performs well on multiple Step 1 models. Since Step 2 contains a lot of table constraints, CP solvers appear to be better suited to it. The experiments are run on an Ubuntu 18.04.5 LTS x86_64 virtual machine with an Intel Xeon Gold 5118 processor and 32 GiB of RAM. The requirements are: Java 10.0.12 OpenJDK, Gradle 6.8, MiniZinc 2.5.5, Picat 3.1.2 and Choco 4.10.6. Each instance is run on a single thread.
Dong, and Keting Jia. New related-tweakey boomerang and rectangle attacks on Deoxys-BC including BDT effect. IACR Trans. Symm.

Table 5: Summary of the results for computing the best boomerang clusters for TWINE and LBlock-s.