Towards Key-recovery-attack Friendly Distinguishers: Application to GIFT-128

When analyzing a block cipher, the first step is to search for valid distinguishers, for example, differential trails in differential cryptanalysis and linear trails in linear cryptanalysis. A distinguisher is advantageous if it can be utilized to attack more rounds and the number of key bits involved in the key-recovery process is small, as this leads to a long attack with a low complexity. In this article, we propose a two-step strategy to search for such advantageous distinguishers. This strategy is inspired by the intuition that if a differential is advantageous only when some properties are satisfied, then we can predefine constraints describing these properties and search for differentials only within the resulting small set.


Introduction
Differential cryptanalysis was proposed by Biham and Shamir [BS91] and linear cryptanalysis by Matsui [Mat94]. These are the two most famous methods for analyzing block ciphers. For both methods, the first step is to find valid distinguishers: differential trails for differential cryptanalysis and linear trails for linear cryptanalysis. Given a distinguisher Input → Output, the cryptanalyst extends some rounds backward from the Input and forward from the Output. A key-recovery attack is then executed by guessing the key bits involved in the extended rounds. Usually, more extended rounds and fewer involved key bits are desired, as they lead to a longer attack and a lower complexity, respectively. Both expectations are affected by the distinguisher's Input and Output.
To search for differential and linear trails efficiently, several automatic methods have been introduced, and they have facilitated many fruitful works [KLT15, AK18, SS14, FWG+16, AST+17]. Among them, the MILP-based method and the SMT/SAT-based method are the most widely used. These methods are easy to implement and perform well when the search space is not too large. However, when searching for long trails, they may be inefficient because the search space is too large. Moreover, the underlying solvers, such as Gurobi [Gur] for MILP and STP [STP] for SMT/SAT, are used as black boxes, so we cannot iteratively make adjustments according to their outputs. Besides these newer methods, Matsui's branch-and-bound algorithm is perhaps the best-known method for searching differential and linear trails. It also has two sides. On the one hand, it employs a depth-first search with pruning and is guaranteed to return all best trails for any initial value. On the other hand, the cryptanalyst needs to know what a good initial value might be before the search starts.
In another direction, the lightweight block cipher GIFT was designed by Banik et al. [BPP+17] and comes in two versions: GIFT-64 and GIFT-128. Both have a 128-bit key and inherit the design framework of PRESENT [BKL+07], while correcting its weakness against linear cryptanalysis. Specifically, through a dedicated selection of the Sbox and the linear layer, GIFT avoids single-active-bit transitions over two consecutive rounds in both differential and linear trails, which prevents the very effective linear hull attacks. GIFT gains efficiency in various respects, e.g., a much smaller hardware implementation, faster encryption and better resistance against known attacks. Moreover, the hardware cost of the GIFT Sbox is smaller than that of the PRESENT Sbox and its key schedule is also much simpler, which makes it more lightweight. In addition, in the round-based hardware implementation, the area of GIFT is even smaller than that of recently proposed lightweight block ciphers such as SKINNY. Hence, it is important to understand the security level of GIFT, and many evaluation results have been published. In the GIFT document [BPP+17], the designers claimed a 9-round differential with probability $2^{-44.415}$ and a 9-round linear distinguisher with probability $2^{-49.997}$. At CT-RSA 2019, Zhu et al. [ZDY18] gave the first third-party cryptanalysis of GIFT, including a 19-round and a 22-round key-recovery attack on GIFT-64 and GIFT-128, respectively. Sasaki [Sas18] improved the meet-in-the-middle (MitM) attack on 15-round GIFT-64. Zhou et al. [ZZDX19] gave the minimum number of differential/linear active Sboxes for up to 16/15 rounds and found the best differential/linear characteristics for up to 15/13 rounds of GIFT-64. Li et al. [LWZZ19] reduced the search time to 4 seconds to obtain a 9-round GIFT-64 differential trail with probability $2^{-42}$. For GIFT-128, they found a 21-round differential trail with probability $2^{-126.415}$.
Because the output of the 21-round differential trail has too many active bits, they used its last 20 rounds as the distinguisher to attack 26-round GIFT-128. Later, the 26-round attack was improved by Ji et al. [JZZD20]. In [JZD19], Ji et al. improved Matsui's algorithm with three new methods. They determined the highest probability of differential trails of GIFT-128 for up to 19 rounds and of linear trails for up to 10 rounds, and presented a 19-round differential trail and a 10-round linear trail, both with the highest probability. They also reported a 20-round GIFT-128 differential trail with probability $2^{-121.415}$ and a 21-round one with probability $2^{-126.415}$. In [LLL+19], the authors found another 21-round differential trail of GIFT-128 with probability $2^{-126.415}$. All the above results are in the single-key setting. In the related-key setting, Liu and Sasaki [LS19] gave a 23-round and a 21-round boomerang attack on GIFT-64 and GIFT-128, respectively. Chen et al. [CWZ] gave a 23-round related-key rectangle attack, and Zhao et al. [ZDM+19] improved it to a 24-round attack. Ji et al. [JZZD20] improved the related-key attack on GIFT-128 to 23 rounds. In the GIFT document, the designers made no security claim in the related-key setting. In [LWZZ19], the authors study how the construction of MILP models influences their solution, and give good results on PRESENT, GIFT-64 and GIFT-128. However, the exact relationship is still unclear. Two articles, [HV18] and [Sas18], study the dependence between key bits. In [HV18], the algorithm can be used to estimate the key-dependent correlation distribution of a linear approximation to facilitate advanced linear attacks, and also to search for a large number of trails by converting differential/linear trails into paths in a multistage graph. However, that article gives no results on GIFT-128.
In [Sas18], the algorithm studying the dependence between key bits is used to facilitate an advanced meet-in-the-middle attack on GIFT-64.

Our Contributions
We propose a two-step strategy for searching advantageous distinguishers, which lead to long attacks with a small number of involved key bits. The overall concept is inspired by the observation that, when mounting an attack, the two expectations, 1) more rounds are extended around the distinguisher and 2) fewer key bits are involved in the key-recovery process, are both determined by the distinguisher's Input and Output. In the first step, we collect Input (Output) values in a set called the InitialSet, which must satisfy two conditions: 1) a distinguisher with an input (output) from the InitialSet can be extended by many rounds at the top (the bottom), leading to a long attack; 2) the number of involved key bits in the extended rounds is small, leading to a low attack complexity. In the second step, we search only for distinguishers whose input and output values are from the InitialSet. This provides two benefits: the search space is reduced, and an efficient attack can be mounted once a distinguisher is found. As a first application, we use the strategy to search for differential and linear trails of GIFT-128 and give cryptanalysis results on GIFT-128 and two GIFT-128-based proposals: SUNDAE-GIFT and GIFT-COFB. In more detail, we achieve the following:
a. We utilize the MILP technique and revisit Matsui's branch-and-bound algorithm to implement a two-step strategy of searching for advantageous differential and linear distinguishers. In the first step, we construct MILP models describing the difference (linear mask) propagation in a block cipher's round function and marking the involved key bits. This step outputs the InitialSet, which includes all possible values of the Input (Output) for which the most rounds can be added at the top (the bottom) of a distinguisher while the fewest key bits are involved. In the second step, we revisit Matsui's algorithm to search for advantageous distinguishers that lead to efficient attacks. The initial value of this step is chosen only from the InitialSet.
These two steps make full use of the strengths of both the MILP method and Matsui's algorithm, while avoiding their weaknesses. The search space of the first step is usually small, so the MILP method finds the solutions very efficiently. In the second step, the problem that Matsui's algorithm needs good initial values is solved by searching only in the smaller space limited by the InitialSet.
b. We apply our strategy to analyze GIFT-128 and search for both differential and linear trails. For differential cryptanalysis, we find a 20-round differential that can be used to attack 27-round GIFT-128. This is the first 27-round attack on GIFT-128, and it covers one more round than the other results. Although some 21-round differentials are found in other works, none of them leads to an attack as strong as the one based on our 20-round differential. This also supports the validity of our search strategy. For linear cryptanalysis, we find a 17-round linear hull and give the first linear key-recovery attack on GIFT-128, which covers 22 rounds.
c. We mount linear cryptanalysis on the two most notable GIFT-based proposals: GIFT-COFB and SUNDAE-GIFT. For GIFT-COFB, we analyze the security of its 15-round GIFT-128 version; for SUNDAE-GIFT, we analyze the security of its 16-round GIFT-128 version.

Remarks.
In [ZSCH18], the authors show how to tweak the objective functions of MILP models to find better trails, with some constraints derived from the bounding condition of Matsui's algorithm. The key difference between that work and ours is that there the MILP technique is used to search for the trails themselves, whereas in our strategy the MILP technique is used to search for the InitialSet and Matsui's algorithm is used to search for specific trails.
Table 4: Notations used throughout this paper.
∆X : the difference in the state X
X[j, ···, k] : the j-th bit, ···, the k-th bit of state X; note that X[0] is the LSB of X
X[j ∼ k] : the successive bits from the j-th bit to the k-th bit of state X
≫ i : an i-bit right rotation within a 16-bit word

GIFT-128
GIFT [BPP+17] is a lightweight block cipher proposed by Banik et al. at CHES 2017. Similar to PRESENT, GIFT adopts an SPN structure with an Sbox layer and a bitwise permutation layer. The authors define two versions, GIFT-64 and GIFT-128, according to the block size. Both versions have a 128-bit key, and they have 28 and 40 rounds, respectively. Since our paper is mainly about GIFT-128, we omit the description of GIFT-64. Table 4 gives some notations used throughout this paper. In each round function, three operations are performed in sequence, i.e., SubCells, PermBits and AddRoundKey:
• SubCells : Apply 32 4-bit Sboxes in parallel to every nibble of the internal state of GIFT-128. The Sbox is given in Table 5.
• PermBits : Apply the bitwise permutation of GIFT-128 to the internal state; see [BPP+17] for the permutation.
• AddRoundKey : The round key RK is 64 bits long and is generated from the key state. The round key is XORed into the state $b_{127} \ldots b_0$ as specified in [BPP+17]. A single bit "1" and a 6-bit constant C are XORed into the internal state $b_{127} \ldots b_0$ at positions 127, 23, 19, 15, 11, 7 and 3, respectively.
The key schedule. The 128-bit master key is initialized as $K = k_7 \| k_6 \| \cdots \| k_0$, where $|k_i| = 16$. For GIFT-128, the round key is $RK = U \| V = k_5 \| k_4 \| k_1 \| k_0$. The key state is updated in every round; for the update rule and more details of GIFT-128, we refer the reader to [BPP+17].
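To make the key-bit bookkeeping concrete, the following minimal sketch tracks which master-key bits enter each round key. The round-key extraction (k5, k4, k1, k0) is taken from the text; the key-state update rule used here (the words shift down by two positions, with k1 ≫ 2 and k0 ≫ 12 fed back at the top) is our recollection of the public GIFT specification and should be checked against [BPP+17]. All function names are illustrative.

```python
# Sketch: which master-key bits appear in each GIFT-128 round key.
# ASSUMPTION: the update rule below is recalled from the GIFT specification
# [BPP+17]; only the extraction RK = k5 || k4 || k1 || k0 is stated in the text.

def rot_right(word, r):
    """Rotate a 16-entry word (index 0 = LSB) right by r positions."""
    return [word[(i + r) % 16] for i in range(16)]

def initial_key_state():
    """Words k0..k7; every entry is the index of a master-key bit (0..127)."""
    return [[16 * w + b for b in range(16)] for w in range(8)]

def update(k):
    """Assumed update: k7||...||k0 <- (k1 >>> 2)||(k0 >>> 12)||k7||...||k2."""
    return [k[2], k[3], k[4], k[5], k[6], k[7],
            rot_right(k[0], 12), rot_right(k[1], 2)]

def round_key_bits(rnd):
    """Set of master-key bit indices used by the round key of round rnd (1-based)."""
    k = initial_key_state()
    for _ in range(rnd - 1):
        k = update(k)
    return set(k[5] + k[4] + k[1] + k[0])

rk1, rk2, rk27 = round_key_bits(1), round_key_bits(2), round_key_bits(27)
print(rk1 == rk27, rk1.isdisjoint(rk2))
```

Under these assumptions, round keys whose indices have the same parity draw on the same 64 master-key bits, which is why key bits guessed in one extended round can be reused in another.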

The Strategy for Searching Differential Trails
Our strategy can be used to search for both differential trails and linear trails. To ease the presentation, we first introduce the process of searching for differential trails. After that, we give the process of searching for linear trails and list the differences between the two processes. When mounting a differential key-recovery attack after finding a valid differential ∆in → ∆out, the cryptanalyst usually has two expectations:
1. The number of key bits involved in the key-recovery process is small, as this leads to a lower attack complexity.
2. More rounds can be extended at the top and the bottom of the differential without activating all the bits of the plaintext and ciphertext, as this leads to a longer attack.
A differential is advantageous if both expectations are met, and the values of ∆in and ∆out play a decisive role. Motivated by this, we propose a two-step strategy that concentrates only on searching for advantageous differentials. The first step is to find the values of ∆in and ∆out that satisfy the above two expectations. All such values of ∆in and ∆out are stored in a set called the InitialSet. The second step is then to search for differentials whose input and output differences are only from the InitialSet. This provides two important benefits: 1) the search space is greatly reduced, as the potential distinguisher's input and output differences are limited to the InitialSet; 2) the attack can cover more rounds with a lower complexity once an advantageous differential is found.
The first step utilizes the MILP techniques and the second step is a revisit of Matsui's branch-and-bound search algorithm. We give a detailed introduction of these two steps in the following.

The MILP Model Searching for the InitialSet
After finding a valid differential, we try to extend some rounds at its top and bottom and check the state to ensure that not all bits are activated. In this step, what matters is usually the state's activeness, not the specific difference value. For example, if an r-round GIFT-128 differential with output difference ∆out is found and ∆out has one active bit, the input of the first added round has only one active Sbox. Suppose the Sbox's input difference is 0001. According to the difference distribution table (DDT), it can propagate to 8 output differences: {0101, 0110, 1000, 1001, 1010, 1011, 1100, 1111}. We mark all 4 output bits as uncertain bits, since their activeness differs among the possible output differences.
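The DDT example can be reproduced with a few lines of code. The sketch below assumes the GIFT Sbox table as we recall it from [BPP+17] (it is Table 5 of the paper, which is not reproduced here); the helper names are illustrative.

```python
# Sketch: DDT row of the GIFT Sbox for input difference 0001, and the bits
# whose activeness is not the same in every reachable output difference.
# ASSUMPTION: GIFT_SBOX is the table recalled from [BPP+17].

GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]

def ddt_row(sbox, din):
    """Map each reachable output difference to its number of input pairs."""
    counts = {}
    for x in range(16):
        dout = sbox[x] ^ sbox[x ^ din]
        counts[dout] = counts.get(dout, 0) + 1
    return counts

def uncertain_bits(douts):
    """Bit mask of positions that are active in some, but not all, outputs."""
    always, ever = 0xF, 0x0
    for d in douts:
        always &= d
        ever |= d
    return ever & ~always & 0xF

row = ddt_row(GIFT_SBOX, 0b0001)
print(sorted(format(d, "04b") for d in row))
print(format(uncertain_bits(row), "04b"))
```

With this table, the reachable set coincides with the eight differences listed above, and all four output bits come out as uncertain.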
Constraints describing the activeness of the Sbox's input and output difference. We use 0 to denote an inactive bit and 1 to denote an active bit or an uncertain bit. The activeness of the input and output difference can then be represented as 8-bit points, and the propagation rule is as follows: all 4 output bits are 1 as long as the input has at least one active bit, and all 4 output bits are 0 otherwise. We use 8 boolean variables to denote the activeness of the Sbox's input and output difference bits. The rule is expressed by linear inequalities over these variables, and there are 20 inequalities for each Sbox in total.
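One natural system of 20 inequalities that realizes this rule is $y_j - x_i \ge 0$ for all $0 \le i, j \le 3$ (16 inequalities) together with $x_0 + x_1 + x_2 + x_3 - y_j \ge 0$ for $0 \le j \le 3$ (4 inequalities), where $x_i$ and $y_j$ are hypothetical names for the input and output activeness variables. This is our own reconstruction, chosen because it matches the stated count, and not necessarily the authors' exact system. The sketch below checks by brute force that its 0/1 solutions are exactly the points allowed by the rule.

```python
# Sketch: brute-force check of a candidate inequality system for the Sbox
# activeness rule. ASSUMPTION: the 20 inequalities are our reconstruction.
from itertools import product

def sbox_activeness_inequalities():
    """Coefficient vectors c over (x0..x3, y0..y3), each meaning c . v >= 0."""
    ineqs = []
    for j in range(4):
        for i in range(4):
            c = [0] * 8
            c[4 + j], c[i] = 1, -1        # y_j - x_i >= 0
            ineqs.append(c)
    for j in range(4):
        c = [1, 1, 1, 1, 0, 0, 0, 0]
        c[4 + j] = -1                     # x_0 + x_1 + x_2 + x_3 - y_j >= 0
        ineqs.append(c)
    return ineqs

def feasible_points(ineqs):
    """All 0/1 points of dimension 8 that satisfy every inequality."""
    return {p for p in product((0, 1), repeat=8)
            if all(sum(a * v for a, v in zip(c, p)) >= 0 for c in ineqs)}

def rule_points():
    """Points allowed by the rule: outputs all 1 iff some input bit is 1."""
    pts = set()
    for x in product((0, 1), repeat=4):
        y = (1, 1, 1, 1) if any(x) else (0, 0, 0, 0)
        pts.add(x + y)
    return pts

INEQS = sbox_activeness_inequalities()
print(len(INEQS), feasible_points(INEQS) == rule_points())
```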
Constraints describing the forward round function. We use 128 boolean variables $x_r[i]$, $0 \le i \le 127$, to describe the activeness of the input state of the r-th extended round. Since the PermBits operation is a linear bitwise permutation, no extra variables are needed to describe it: the output bits of an Sbox in the r-th round are identified, through PermBits, with the corresponding variables of the next round. We construct inequalities describing all relations between the states of two consecutive rounds.
Other constraints. When extending at the bottom of a differential, the differential's output difference, denoted by $x_0$, should have at least one active bit, and the output difference of the last extended round, denoted by $x_r$, should have at least one inactive bit. These two constraints can be described by the following two inequalities:
$\sum_{i=0}^{127} x_0[i] \ge 1$ and $\sum_{i=0}^{127} x_r[i] \le 127$.
This completes the MILP model describing the state's activeness in the r rounds added at the bottom of the distinguisher. We start by solving the model with r = 1; if the r-round model is feasible, we construct the (r + 1)-round model and check whether it is feasible, and so on. The feasible model with the largest r tells us that at most r rounds can be extended at the bottom of a differential. Since we only want to know whether the model is feasible, the objective function is optional.
The MILP model describing the rounds extended at the top is almost the same as above, so we do not repeat the detailed description. We only point out the one difference when extending backward at the top of the distinguisher: the number of active bits in the last added round should be less than 128, to avoid a full-codebook attack.
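For a fixed starting activeness pattern, the forward feasibility check can be imitated without a solver by propagating activeness directly. The sketch below is a plain simulation, not the MILP model itself: it handles the forward (bottom) direction for one starting pattern, whereas the model searches over all patterns. The closed formula for the GIFT-128 bit permutation is our recollection of [BPP+17] and is an assumption to be verified; the function names are illustrative.

```python
# Sketch: activeness propagation through SubCells and PermBits, used to count
# how many rounds can be appended before all 128 bits become active.
# ASSUMPTION: perm128 is the GIFT-128 bit permutation as recalled from [BPP+17].

def perm128(i):
    """Position to which bit i is moved by PermBits."""
    return 4 * (i // 16) + 32 * ((3 * ((i % 16) // 4) + (i % 4)) % 4) + (i % 4)

def extend_one_round(active):
    """An active Sbox makes its 4 output bits active/uncertain, then permute."""
    after_sbox = set()
    for i in active:
        nibble = i // 4
        after_sbox.update(range(4 * nibble, 4 * nibble + 4))
    return {perm128(i) for i in after_sbox}

def activeness_profile(start, rounds):
    """Number of active/uncertain bits after each of the given rounds."""
    state, sizes = set(start), []
    for _ in range(rounds):
        state = extend_one_round(state)
        sizes.append(len(state))
    return sizes

def max_extension(start):
    """Largest r such that the state after r rounds still has an inactive bit."""
    if not start:
        raise ValueError("the starting difference needs at least one active bit")
    state, r = set(start), 0
    while True:
        state = extend_one_round(state)
        if len(state) == 128:
            return r
        r += 1

print(activeness_profile({0}, 4), max_extension({0}))
```

Under these assumptions, a single active bit grows to 4, 16, 64 and then 128 active or uncertain bits, so this pattern can be extended by three rounds.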
Adding the involved key bits. We now know how many rounds can be extended at the top and at the bottom of the differential; we denote the number of extended rounds by R. Next, we find which input and output difference values can be extended by R rounds while the number of involved key bits in these R rounds is small. These constraints are easy to construct.
For each extended round, we add 64 boolean variables, one per round-key bit, marking whether that key bit is involved. In the decryption direction, the round keys are XORed into the state after PermBits. However, in the key-recovery process, each round key is transformed into an equivalent key RK' (which differs from RK only in the positions of the key bits) that is XORed into the state right after the Sbox layer. Hence, the constraints on RK' are similar to those in the encryption direction. Note that there are dependencies between the round keys: two round-key bits share the same variable if they are derived from the same master key bit.

The objective function
Since this new MILP model is meant to find input and output difference values that lead to a small number of involved key bits, the objective function minimizes the number of involved key bits, i.e., the sum of the key-bit variables. Using this model, we obtain the least number of involved key bits and all corresponding input and output difference values. We store all these difference values in the InitialSet.

A Revisit of Matsui's Algorithm Searching for Advantageous Differentials
In this step, we try to find the longest differentials with high probability whose input and output differences are from the InitialSet. The search process, shown in Algorithm 1, is a revisit of Matsui's branch-and-bound algorithm, which adopts a depth-first strategy with pruning. This method is guaranteed to return all best trails for any initial value. As the advantageous initial values are already obtained in the first step and stored in the InitialSet, we can expect to find advantageous differentials. We start the second step with the optimal elements of the InitialSet, i.e., those for which the largest number of rounds can be extended and the fewest key bits are involved, and set them as the input and output values of the potential differential trails. However, for a given input and output value, we cannot determine whether a valid distinguisher exists before running the search of the second step. Our strategy is therefore to start the search with the optimal choices from the InitialSet and, if no advantageous distinguisher is found, to continue with the sub-optimal ones until some valid distinguishers are found. This strategy steers the search towards distinguishers that lead to the best key-recovery attacks.
For GIFT-128, every round function has 32 Sboxes, so the search space is very large when no constraint on the number of active Sboxes is set, as the algorithm needs to traverse all possible differentials. For this reason, we set the upper bound on the number of active Sboxes in each round to 4. We also set the lower bound on the probability of the differential characteristics to be recorded to $2^{-128}$. When the search covers r rounds, it outputs the qualified results. Note that r is decided somewhat experimentally: for example, we already know that the longest known GIFT-128 differential trails have fewer than 22 rounds, so we run the search for r = 21, 20, 19, ... until we find a valid trail to launch the attack.
Algorithm 1: The Search Algorithm
Procedure 1: Initialization
1. Initialize t as the upper bound on the number of active Sboxes in each round, r as the number of searched rounds and prob as the lower bound on the probability of qualified differential trails.
2. Choose the input difference $\Delta X_0$ from the InitialSet and set the initial probability to $p_0 = 1$.
Procedure 2: Recursive Search (search round i, i ≥ 1)
4. For each (i − 1)-round differential trail, we get the output difference $\Delta X_{i-1}$ and the corresponding probability $p_{i-1}$. For each Sbox in round i, try all of its possible output differences. Check the overall propagation probability p and the number of active Sboxes $t_e$. Continue to the next Sbox only when p ≥ prob and $t_e \le t$.
5. Get a qualified $\Delta X_i$, where $p_e$ is the propagation probability from $\Delta X_{i-1}$ to $\Delta X_i$, so that $p_i = p_{i-1} \cdot p_e$.
6. If i < r, go to search round i + 1.
7. If i = r and $p_i \ge$ prob, go to Procedure 3.
Procedure 3: Record the differential in the format $(\Delta X_0, \Delta X_r, p_r)$.
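The recursive search can be sketched as follows. This is a simplified illustration rather than the implementation used in the paper: it prunes only with the two thresholds t and prob, omits the bounds on the remaining rounds that Matsui's algorithm would add, and does not fix the output difference. The Sbox table and the closed formula for the bit permutation are recalled from [BPP+17] and should be treated as assumptions. Probabilities are handled as weights, i.e., −log2 of the probability.

```python
# Sketch: depth-first differential trail search with per-round Sbox bound t
# and a global weight bound. ASSUMPTIONS: GIFT_SBOX and perm128 are recalled
# from [BPP+17]; the pruning is simpler than in Matsui's algorithm.
import math

GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]

def perm128(i):
    return 4 * (i // 16) + 32 * ((3 * ((i % 16) // 4) + (i % 4)) % 4) + (i % 4)

def ddt_weights(sbox):
    """table[din] maps each reachable dout to its weight -log2(probability)."""
    table = []
    for din in range(16):
        counts = {}
        for x in range(16):
            dout = sbox[x] ^ sbox[x ^ din]
            counts[dout] = counts.get(dout, 0) + 1
        table.append({d: -math.log2(c / 16) for d, c in counts.items()})
    return table

DDT_W = ddt_weights(GIFT_SBOX)

def perm_bits(state):
    """Apply PermBits to a 128-bit difference stored as an integer (bit 0 = LSB)."""
    out = 0
    for i in range(128):
        if (state >> i) & 1:
            out |= 1 << perm128(i)
    return out

def one_round(diff, t, budget):
    """Yield (next difference, round weight) with at most t active Sboxes."""
    active = [(j, (diff >> (4 * j)) & 0xF) for j in range(32)
              if (diff >> (4 * j)) & 0xF]
    if len(active) > t:
        return
    def rec(k, state, w):
        if w > budget:                     # prune: probability already too low
            return
        if k == len(active):
            yield perm_bits(state), w
            return
        j, din = active[k]
        for dout, wt in DDT_W[din].items():
            yield from rec(k + 1, state | (dout << (4 * j)), w + wt)
    yield from rec(0, 0, 0.0)

def search(d0, rounds, t, max_weight):
    """Return records (input difference, output difference, total weight)."""
    results = []
    def dfs(diff, i, w):
        if i == rounds:
            results.append((d0, diff, w))
            return
        for nxt, wr in one_round(diff, t, max_weight - w):
            dfs(nxt, i + 1, w + wr)
    dfs(d0, 0, 0.0)
    return results

one_round_trails = search(0x1, 1, 4, 128.0)
two_round_trails = search(0x1, 2, 4, 10.0)
best_two_round_weight = min(w for _, _, w in two_round_trails)
print(len(one_round_trails), best_two_round_weight)
```

In the two-step strategy, `d0` would be taken from the InitialSet, and only records whose output difference is also in the InitialSet would be kept.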

Remarks.
There are other options for performing the second stage, such as MILP-based or SAT-based differential search. However, according to the differential search experience on GIFT-128 [ZDY18, JZD19, JZZD20], when searching over many rounds (e.g., 19, 20 or 21 rounds) with the same bound on the number of active S-boxes in each round, the branch-and-bound method is more efficient than the automatic-tool-based methods. Hence, for the attack on GIFT-128, we mainly use Matsui's method to perform the second stage.
For GIFT-COFB [BCI+] and SUNDAE-GIFT [BBP+], due to the data limitation, only short trails with high probabilities can be used to perform a key-recovery attack. The search space is then small, and the tool-based model is also efficient. Hence, to search for those trails, we can use a tool-based method in place of Matsui's method.
On the balance between the first stage and the second stage. The first stage is very fast. It only outputs candidate initial values for the second stage. In the first stage, we collect those initial values that do not fully activate the plaintext and ciphertext when 4 rounds and 3 rounds are appended at the top and bottom of the distinguisher. Then, we sort those initial values by the number of key bits involved in the extended rounds and store them in the InitialSet. In the second stage, we use large computing resources (a Dell PowerEdge with about 64 cores on our platform) to enumerate the initial values in the InitialSet, starting from the best one, until a distinguisher is found. That is, we start the second step with the best elements of the InitialSet and, if this fails, continue searching with the second-best elements. For other ciphers, the situation may be different.
On the output of the second stage. The second stage outputs distinguishers conforming to certain initial values from the InitialSet, with probability larger than $2^{-n}$. However, we cannot guarantee that those distinguishers can be used to perform key-recovery attacks by extending 4 rounds and 3 rounds at the top and the bottom of the distinguisher. Recall that the constraints in the first stage only require that the difference bits of the plaintext and ciphertext are not fully activated; the candidate initial values in the InitialSet are then sorted by the number of key bits involved in the extended rounds. Hence, the distinguishers output by the second stage may not lead to a valid attack because of other factors: for example, the probability of the distinguisher may be too low to work, or the number of active bits after appending 4 rounds and 3 rounds may be too large to use. If this happens, we may either tweak the distinguisher by peeling off one round to increase the probability, or extend fewer rounds at the top or the bottom to obtain fewer active bits in the plaintext or ciphertext. Therefore, our model may miss its target in certain situations. Nevertheless, the distinguishers found in the second stage under the guidance of the first stage are likely to work, as shown by the 27-round key-recovery attack on GIFT-128 in Section 5. Even when some tweaks have to be applied to the given distinguisher, it still preserves some advantages in the key-recovery attack. For example, if we have to extend fewer rounds at the top and bottom (e.g., 3 and 2 rounds), the number of active bits is likely to be small, since the input and output states are not yet fully activated even when 4 rounds and 3 rounds are extended at the top and bottom.

The Strategy for Searching Linear Trails
The two-step strategy can also be used to search for advantageous linear trails. A linear key-recovery attack is executed by first finding good linear trails as the distinguisher and then adding some rounds at its top and bottom, which is very similar to a differential key-recovery attack. The overall procedure is therefore very similar to that used for searching differentials in Section 3.
Because the interplay of the Sbox and the linear layer in GIFT is well crafted to resist linear cryptanalysis, we cannot find long linear trails when t (the upper bound on the number of linear active Sboxes in each round in Matsui's branch-and-bound algorithm) is not larger than 4. We therefore set t to 5 and the lower bound on the probability of qualified linear trails to $2^{-128}$.
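For the linear case, the Sbox-level quantity that replaces the DDT is the linear approximation table (LAT). The sketch below computes it for the GIFT Sbox; the table is again recalled from [BPP+17] and is an assumption, and the convention used (number of agreeing inputs minus 8) is one common choice among several.

```python
# Sketch: linear approximation table of a 4-bit Sbox.
# ASSUMPTION: GIFT_SBOX is the table recalled from [BPP+17].
# Entry [a][b] = #{x : <a, x> = <b, S(x)>} - 8, so the correlation is entry / 8.

GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]

def parity(v):
    """Parity of the bits of v, i.e. the inner product after masking."""
    return bin(v).count("1") & 1

def lat(sbox):
    """LAT indexed by input mask a (row) and output mask b (column)."""
    return [[sum(1 for x in range(16)
                 if parity(a & x) == parity(b & sbox[x])) - 8
             for b in range(16)]
            for a in range(16)]

LAT = lat(GIFT_SBOX)
for a in range(16):
    print(" ".join(f"{v:2d}" for v in LAT[a]))
```

A mask pair (a, b) with a nonzero entry is a usable transition through one Sbox; a search over linear masks then walks these transitions in the same way the differential search walks the DDT.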

Differential Cryptanalysis of GIFT-128
Following the strategy in Section 3, we first compute the InitialSet and find that at most 4 rounds can be extended at the top and at most 3 rounds at the bottom of a distinguisher. The least number of involved key bits is 62, and we find 2436 optimal elements of the InitialSet. For the best solutions, there are only one or two active bits in the input or the output. However, we cannot find long valid differentials with them in the second step. Instead, we find some differentials with the second-best solutions. We list two of them in Table 6. In addition, we make some other observations. When the length of the potential distinguisher and the number of extended rounds are fixed, the number of involved key bits is the same regardless of the starting round index of the distinguisher. For example, the least number of involved key bits of a potential 27-round attack based on a 20-round distinguisher, extended by 4 rounds at the top and 3 rounds at the bottom, is 62, and it is the same when the attack occupies rounds 0 ∼ 27, 1 ∼ 28, 2 ∼ 29, 3 ∼ 30 or 4 ∼ 31. Moreover, we also find that three rounds can be extended at the bottom as long as the output is only inactive in the first half or the second half bits.
We finally use some 20-round advantageous differentials with probability $2^{-121.83}$ to attack 27-round GIFT-128. The data complexity is $2^{124.83}$, the time complexity of the key-recovery process is about $2^{123.53}$, and the memory complexity is $2^{80}$ bits.

The 27-round Differential Key-recovery Attack
Based on the first 20-round differential in Table 7, we attack 27-round GIFT-128 by extending 4 rounds at the top and 3 rounds at the bottom. The details of the whole attack are shown in Table 9, where the inactive bits of the state are marked with a dedicated symbol for better readability. The attack procedure is as follows.
Data collection.
1. In GIFT, there is no whitening key at the beginning, so we can construct structures at $X^P_1$, before the first round key is involved. We fix $X^P_1[63 \sim 0]$ to a constant and traverse all values of $X^P_1[127 \sim 64]$ to form one structure. Each structure has $2^{64}$ elements, providing $2^{64 \times 2 - 1} = 2^{127}$ pairs.
2. Constructing $2^t$ structures, we get $2^{127+t}$ message pairs.
3. For each message, we obtain the plaintext P by applying the PermBits$^{-1}$ and SubCells$^{-1}$ operations. After that, we obtain the corresponding ciphertext by encrypting the plaintext.

Key Recovery.
In the key-recovery process, we only keep the pairs whose ciphertext difference conforms to the difference pattern of ∆C shown in Table 9. ∆C has 64 inactive bits, which gives a 64-bit filter, so about $2^{t+127-64} = 2^{t+63}$ pairs remain. The key bits involved in the key-recovery process are given in Table 10. To simplify the description of the key-guessing procedure, we move the subkeys involved in the last two rounds to the position between the SubCells and PermBits operations. Next, we give the detailed procedure of counting right pairs using a guess-and-filter approach.
(a) Guess the involved key bits in $RK_1$ and $RK_{27}$.
According to the key schedule, when i ≡ j mod 2, $RK_i$ and $RK_j$ consist of the same 64 bits. Thus, $RK_{27}$ contains the same 64 master key bits as $RK_1$. As shown in Table 10, $k_5^9$ is also involved in the 27-th round. The value of $X^P_{26}$ can be deduced by a PermBits$^{-1}$ operation on the ciphertext. By guessing the value of $k_0^9$, we can apply an Sbox$^{-1}$ operation and get the value of $X_{27}[91 \sim 88]$. Discard the pairs that do not satisfy $\Delta X_{27}[91, 89] = 00$. This also gives a 2-bit filter, and about $2^{t+58}$ pairs remain.
Overall, 3 key bits are guessed and the filter probability is $2^{-5}$.
Similarly, another 5-bit filter is obtained by guessing the values of $k_5^8$, $k_1^8$ and $k_0^8$. After this, the other 14 active Sboxes are all 4-to-2 transformations. As above, we apply an Sbox operation by guessing two key bits in the first round and then an Sbox$^{-1}$ operation by guessing one key bit in the 27-th round. Each step gives a 4-bit filter. We repeat the process until all involved key bits in $RK_{27}$ and $RK_1$ are guessed.
For the other involved key bits in $RK_2$, the pairs are filtered with probability $2^{-3}$ after guessing every two key bits. For the other involved key bits in $RK_{26}$, the pairs are filtered with probability $2^{-2}$ after guessing every two key bits. This procedure is performed 8 times until all involved bits in $RK_2$ and $RK_{26}$ are guessed. About $2^{t-42-7} = 2^{t-49}$ pairs remain.
(d) As shown in Table 10, $k_4^2$, $k_4^0$, $k_0^{10}$, $k_0^8$ are already guessed in $RK_{27}$, and $k_5^{14}$, $k_5^{12}$ are already guessed in $RK_1$. We only need to guess the values of $k_0^2$ and $k_0^0$ to deduce the value of $X^S_{25}[7 \sim 4]$. After an Sbox$^{-1}$ operation, we discard the pairs whose difference $\Delta X_{25}[7 \sim 4]$ does not conform to 0001. We repeat a similar process four times and discard the pairs that do not conform to the output difference of the differential. Each process gives a 4-bit filter, so about $2^{t-49-4 \times 4} = 2^{t-65}$ pairs remain.
Table 11: Time complexity in each step (columns: Step, #Pairs, #Keys, Time (1/16 round), Probability). Note that after guessing the keys involved in an S-box, we have to use 2 S-box operations to compute the partial internal values for P and P', which is about $\frac{2}{32} = \frac{1}{16}$ of a one-round operation.
Complexity. The detailed time complexity estimation is shown in Table 11. For the wrong key guesses, $2^{t-49-4 \times 4} = 2^{t-65}$ pairs remain, while for the right key guess, about $2^{t+127-64-121.83} = 2^{t-58.83}$ pairs remain. We set t ≈ 60.83, so about $2^2$ pairs remain for the right key guess and $2^{-4.17}$ pairs remain for each wrong key guess. The data complexity is $2^{60.83+64} = 2^{124.83}$, and the time complexity of the key-recovery process is dominated by step (a), which is equal to $\frac{1}{16} \times 2^{60.83+66.7} \approx 2^{123.53}$. The memory complexity for storing the guessed key bits is $2^{80}$ bits.
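The exponents in this estimate can be re-derived mechanically. The short sketch below only replays the arithmetic stated above, working with base-2 logarithms; the variable names are ours.

```python
# Sketch: replay the complexity arithmetic of the 27-round attack in log2.
# All constants are taken from the text; nothing here is measured.

t = 60.83                                   # log2 of the number of structures
log_pairs_per_structure = 64 * 2 - 1        # 2^64 elements give 2^127 pairs
log_diff_probability = -121.83              # 20-round differential
log_pairs_after_ciphertext_filter = t + log_pairs_per_structure - 64

log_right_pairs = log_pairs_after_ciphertext_filter + log_diff_probability
log_wrong_pairs = t - 49 - 4 * 4            # pairs surviving for a wrong guess
log_data = t + 64                           # 2^t structures of 2^64 texts
log_time = (60.83 + 66.7) - 4               # step (a), times 1/16 = 2^-4

print(f"right pairs  2^{log_right_pairs:.2f}")
print(f"wrong pairs  2^{log_wrong_pairs:.2f}")
print(f"data         2^{log_data:.2f}")
print(f"time         2^{log_time:.2f}")
```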

Linear Cryptanalysis of GIFT-128
For GIFT-128, different from differential cryptanalysis, in our two-stage model to search linear hulls, the second stage that applies Matsui's algorithm is very time consuming due to the fast diffusion of the linear mask. In our computing resource (Dell PowerEdge with about 64 cores ), we have to restrict the round number of the second stage as 15 rounds. In total, we need about 45 days run the second stage to output a sound 15-round linear hull. However, the time consumption of first stage is negligible. In the InitialSet, the best initial values only involve 56-bit key in the rounds extended at the top and bottom of the linear hull. The second-best solutions involve 76-bit keys. Finally, with the second-best initial values, we find some good 15-round linear hulls as shown in Table 12.
However, based on the 15-round linear hulls, we cannot directly launch an attack by extending 4 rounds at the top and 3 rounds at the bottom. Note that in our two-stage model, the constraints of the first stage only require that the plaintext and ciphertext states are not fully activated; the corresponding initial values (input-output of the distinguisher) in the InitialSet are then sorted by the number of involved key bits. Hence, those distinguishers are chosen towards a better key-recovery attack, but they cannot be guaranteed to yield a valid attack. For example, when extending 4 rounds at the top and 3 rounds at the bottom of the 15-round distinguishers, the bits of plaintext and ciphertext are not fully activated, but the number of active bits is too large to perform a valid linear attack (the time complexity is higher than that of exhaustive search). When the extension by 4 rounds and 3 rounds fails, we have to peel off several rounds at the top or the bottom to gain more inactive bits until a valid attack is obtained. Since the initial values in the InitialSet do not reach full active bits with 4 rounds (top) and 3 rounds (bottom) extended, they are likely to have fewer active bits when extending fewer rounds (e.g., 3 rounds at the top and 2 rounds at the bottom) than those not in the InitialSet. Hence, the InitialSet is actually a guide to key-recovery-attack-friendly distinguishers.
In addition, due to the 15-round limitation in the second stage, the probabilities of the linear hulls obtained are usually high. Hence, we extend the 15-round linear hull by one round at the top and one round at the bottom to get a 17-round linear hull with probability $2^{-115}$: (0000000000000000000000000a010000) → (00001000000000000000400000000000).
The 17-round linear hull includes 2 trails given in Table 13. Of course, one can search the 17-round linear hull directly in the second stage if the computing resource is large enough.

The 22-round Linear Hull Attack
Based on the 17-round linear hull, we mount a 22-round linear hull attack on GIFT-128. The whole attack details are shown in Table 14. The attack procedure is as follows.
The attack procedure.
The attack process is shown in Table 18 and the involved key bits are listed in Table 19 in Appendix C.
1. Denote the number of the needed plaintext-ciphertext pairs as $N$. For each of the plaintext-ciphertext pairs, do the following steps.
2. The corresponding values of $X^P_1$ and $X^K_{15}$ can be deduced from each plaintext-ciphertext pair. Compress the $N$ samples into $2^{96}$ counters according to the value of (X This requires about $2^{96} \cdot 2^2 + 2^{94} \cdot 2^3 = 2^{98} + 2^{97}$ Sboxes, and $2^{92}$ counters remain.
4. There are 8 key bits shared by $RK_1$ and $RK_{15}$, i.e., $k_5^{3\sim0}$ and $k_1^{11\sim8}$. Each key bit corresponds to one active Sbox in the first round. Repeat a process similar to Step 3 eight times to guess all involved key bits in $RK_1$ and 16 key bits in $RK_{15}$.
After that, there are still 16 unknown key bits in $RK_{15}$. Repeatedly guess each pair of key bits that corresponds to the same Sbox and perform one Sbox$^{-1}$ operation until all key bits in $RK_{15}$ are known.
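The guess-and-compress idea used in the steps above can be illustrated on a single S-box. The sketch below is a toy example, not the authors' implementation: the function names are hypothetical, the counters cover only one 4-bit S-box input, and it assumes that the two round-key bits of a nibble are XORed into bit positions 1 and 2, so the candidate guesses are 0, 2, 4 and 6. The S-box table is the GIFT S-box 1a4c6f392db7508e.

```python
# GIFT S-box, written out from the hex string 1a4c6f392db7508e.
GIFT_SBOX = [0x1, 0xA, 0x4, 0xC, 0x6, 0xF, 0x3, 0x9,
             0x2, 0xD, 0xB, 0x7, 0x5, 0x0, 0x8, 0xE]

# Assumption: per nibble, the two key bits are XORed into bit positions 1 and 2.
KEY_CANDIDATES = [0b000, 0b010, 0b100, 0b110]


def parity(x):
    """Parity (XOR of all bits) of a non-negative integer."""
    return bin(x).count("1") & 1


def compress_counters(counters, key_guess, out_mask):
    """Fold 16 counters, indexed by the 4-bit value before key addition,
    into 2 counters indexed by the parity of the masked S-box output."""
    folded = [0, 0]
    for x, count in enumerate(counters):
        y = GIFT_SBOX[x ^ key_guess]
        folded[parity(y & out_mask)] += count
    return folded


def best_key_guess(counters, out_mask, candidates=KEY_CANDIDATES):
    """Return the guess whose folded counters are most imbalanced."""
    def imbalance(key_guess):
        c0, c1 = compress_counters(counters, key_guess, out_mask)
        return abs(c0 - c1)
    return max(candidates, key=imbalance)


# Cost quoted in the text for one compression step, counted in S-box operations.
step_cost = 2**96 * 2**2 + 2**94 * 2**3
```

Each compression costs roughly (number of counters) × (number of key guesses) S-box evaluations and shrinks the counter table, which is the shape of the $2^{96} \cdot 2^2$ term in the text.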

Linear Cryptanalysis on SUNDAE-GIFT
In our cryptanalysis of SUNDAE-GIFT, we focus on the message encryption step shown in the dashed box in Figure 2. Since security is claimed only up to the birthday bound, the data complexity of any cryptanalysis of SUNDAE-GIFT should also be less than $2^{n/2} = 2^{64}$. Hence, the analysis result in Section 6 is invalid here. In the data collection phase, we collect T ,
Key recovery attack. Based on a 10-round linear trail 000000000000000000000000a008a002 → 00000000000000000044000000220000 with probability $2^{-56}$, we analyze the security of the 16-round GIFT-128-based version of SUNDAE-GIFT. Due to the data usage limit, we can only add 4 rounds at the top and 2 rounds at the bottom. The situation is similar to the 22-round linear attack on GIFT-128. The trail is shown in Table 20 in Appendix C.
During the key-recovery process, the linear mask of $X^P_1$ has 64 active bits and the linear mask of the ciphertext has 32 active bits. The memory complexity is dominated by the $2^{64+32} = 2^{96}$ counters, which take about $2^{96}$-bit space. Similarly to the process in Section 6, we guess the key bits involved in each Sbox operation, compress the messages into new counters, and choose the key with the largest bias as the right key.
Complexity. The data complexity is $c/\epsilon^2 = 2^{60}$ when $c$ is set to 4. The time complexity is about $2^{100.2}$ Sbox operations, which is equivalent to $2^{91.2}$ 16-round encryptions. The memory complexity is $2^{96}$-bit space.
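The conversion from Sbox operations to 16-round encryptions, and the comparison with the birthday bound, can be rechecked as follows. This is a sketch with hypothetical names; the only assumption beyond the text is that one GIFT-128 round contains 32 S-boxes (128-bit state, 4-bit S-boxes).

```python
import math

ROUNDS = 16            # attacked rounds of the GIFT-128-based SUNDAE-GIFT
SBOXES_PER_ROUND = 32  # 128-bit state split into 4-bit S-boxes


def sundae_gift_exponents(log2_sbox_ops=100.2, log2_data=60, block_bits=128):
    """Return log2 values for time (in encryptions), memory, and the data limit."""
    sboxes_per_encryption = ROUNDS * SBOXES_PER_ROUND  # 512 = 2^9
    return {
        "time_encryptions": log2_sbox_ops - math.log2(sboxes_per_encryption),
        "memory_counters": 64 + 32,  # active bits of X^P_1 plus active ciphertext bits
        "data": log2_data,
        "birthday_bound": block_bits // 2,
    }


if __name__ == "__main__":
    values = sundae_gift_exponents()
    print(values)
    print("within data limit:", values["data"] < values["birthday_bound"])
```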
Comment. Due to the data usage limitation of GIFT-COFB and SUNDAE-GIFT, we cannot mount long differential attacks on these two proposals. In addition, for GIFT-COFB, there is a secret value $2^a \cdot 3^i \cdot L$ that varies across blocks. For the linear attack, however, we can ignore this value by using a zero linear mask for this part.

Conclusion
In this article, we propose a two-step strategy to search for advantageous distinguishers, which can lead to long attacks with few involved key bits. The first step reduces the search space and provides advantageous initial values for the second step. The strategy combines the advantages of the MILP based method and Matsui's branch-and-bound algorithm and can be used to search for both differential trails and linear trails. As a first application, the strategy is used to analyze GIFT-128, yielding a 27-round differential attack and a 22-round linear hull attack on GIFT-128. The differential attack covers one more round than the previous result. Moreover, we give evaluation results on two GIFT-128-based proposals: SUNDAE-GIFT and GIFT-COFB.

A GIFT-COFB
GIFT-COFB instantiates the COFB (Combined FeedBack) block cipher based AEAD mode with the GIFT-128 block cipher. This proposal primarily focuses on the hardware implementation size.
Recommended Parameter Choice. In GIFT-COFB, the underlying block cipher is the only parameter. The block cipher can be chosen according to the following recommendations.
1. $n$: Length of the block cipher state in bits. The recommended choice is $n = 128$.
2. $\tau$: Length of the tag in bits. The recommended choice is $\tau = 128$.
3. $E_K$: The recommended choice of $E_K$ is the block cipher GIFT-128.
Input and Output Data. To encrypt a message M with associated data A and nonce N , one needs to provide the information given below.
• A nonce $N \in \{0,1\}^{128}$. This can include the counter to make the nonce non-repeating.
It generates the following output data:
To decrypt (with verification) a ciphertext-tag pair $(C, T)$ with associated data $A$ and nonce $N$, one needs to provide the information given below.

B SUNDAE-GIFT
The encryption of SUNDAE takes a 128-bit key $K$, associated data $A \in \{0,1\}^*$ and a message $M \in \{0,1\}^*$ as input. The designers define four variants of SUNDAE-GIFT depending on 4 different nonce lengths, as shown in Table 17. For simplicity, the nonce $N$ and associated data $A$ are regarded as $A \leftarrow N \| A$. SUNDAE outputs a ciphertext $C \in \{0,1\}^{n+|M|}$, where the first $n$ bits are the tag $T$. The encryption algorithm shown in Figure 4 is composed of five steps:
1. Apply the GIFT-128 block cipher $E_K$ to the initial block $B$ to produce a chaining value $V$.
2. Processing associated data. If the associated data A is empty, skip this step. Otherwise, partition A into 128-bit blocks (the last block may be a partial block). The blocks are processed as shown in Figure 4, and the last padded block is processed differently by multiplying m before applying the GIFT-128 block cipher.
3. Processing message. The message M is processed in a similar way to the step of Processing associated data.
4. Extracting tag. As shown in Figure 4, the chaining value is output as the tag T .
5. Encrypting message. The message blocks are encrypted as shown in Figure 4.