Boomerang Connectivity Table Revisited: Application to SKINNY and AES

Abstract. The boomerang attack is a variant of differential cryptanalysis which regards a block cipher $E$ as the composition of two sub-ciphers, i.e., $E = E_1 \circ E_0$, and which constructs distinguishers for $E$ with probability $p^2q^2$ by combining differential trails for $E_0$ and $E_1$ with probability $p$ and $q$ respectively. However, the validity of this attack relies on the dependency between the two differential trails. Murphy has shown cases where probabilities calculated by $p^2q^2$ turn out to be zero, while techniques such as boomerang switches proposed by Biryukov and Khovratovich give rise to probabilities greater than $p^2q^2$. To formalize such dependency and obtain a more accurate estimation of the probability of the distinguisher, Dunkelman et al. proposed the sandwich framework that regards $E$ as $\tilde{E}_1 \circ E_m \circ \tilde{E}_0$, where the dependency between the two differential trails is handled by a careful analysis of the probability of the middle part $E_m$. Recently, Cid et al. proposed the Boomerang Connectivity Table (BCT) which unifies the previous switch techniques and incompatibility together and evaluates the probability of $E_m$ theoretically when $E_m$ is composed of a single S-box layer. In this paper, we revisit the BCT and propose a generalized framework which is able to identify the actual boundaries of $E_m$ which contains dependency of the two differential trails and systematically evaluate the probability of $E_m$ with any number of rounds. To demonstrate the power of this new framework, we apply it to two block ciphers SKINNY and AES. In the application to SKINNY, the probabilities of four boomerang distinguishers are re-evaluated. It turns out that $E_m$ involves 5 or 6 rounds and the probabilities of the full distinguishers are much higher than previously evaluated. In the application to AES, the new framework is used to exclude incompatibility and find high probability distinguishers of AES-128 under the related-subkey setting. As a result, a 6-round distinguisher with probability $2^{-109.42}$ is constructed. Lastly, we discuss the relation between the dependency of two differential trails in boomerang distinguishers and the properties of components of the cipher.


Introduction
Differential cryptanalysis, proposed by Biham and Shamir [BS93], is one of the most powerful approaches to assess the security of block ciphers. The basic idea is to exploit non-random pairs of input and output differences of the cipher, i.e., high probability differentials. In many cases, it is hard or impossible to find long differentials. In such cases, the boomerang attack [Wag99], which was proposed as an extension of differential cryptanalysis, may be applied to combine short differentials with high probabilities to get a long one.
In boomerang attacks, a cipher $E$ is regarded as the composition of two sub-ciphers $E_0$ and $E_1$, i.e., $E = E_1 \circ E_0$. Suppose there exists a differential $\alpha \to \beta$ of $E_0$ with probability $p$ and a differential $\gamma \to \delta$ of $E_1$ with probability $q$. Under the assumption that the two differentials are independent, the boomerang attack exploits the high probability of the following differential property:
$$\Pr[E^{-1}(E(x) \oplus \delta) \oplus E^{-1}(E(x \oplus \alpha) \oplus \delta) = \alpha] = p^2q^2.$$
As illustrated in the left part of Figure 1, two plaintexts with difference $\alpha$ are encrypted and the resulting ciphertexts are then XORed with $\delta$ to generate two new ciphertexts. These two new ciphertexts are then decrypted to give two new plaintexts. If the difference between the two new plaintexts is also $\alpha$, it is said that the boomerang returns and the two pairs of plaintexts form a right quartet. According to [Wag99], if $(pq)^{-2} < 2^n$, where $n$ is the block size, then $E$ can be distinguished from an ideal cipher with a complexity corresponding to $(pq)^{-2}$ adaptive chosen plaintext/ciphertext queries.
Later, refinements on the boomerang attack were proposed. Particularly, Kelsey et al. [KKS00] developed amplified boomerangs which are pure chosen-plaintext attacks. In amplified boomerang attacks, the probability of finding a right quartet is $2^{-n}p^2q^2$ while for a random permutation the expected probability is $2^{-2n}$. In [BDK01], Biham et al. proposed the rectangle attack which allows any value of $\beta$ and $\gamma$ to occur as long as $\beta \neq \gamma$. As a result, the probability of generating a right quartet increases to $2^{-n}\hat{p}^2\hat{q}^2$, where $\hat{p} = \sqrt{\sum_i \Pr^2(\alpha \to \beta_i)}$ and $\hat{q} = \sqrt{\sum_j \Pr^2(\gamma_j \to \delta)}$.
In applications of boomerang attacks to concrete block ciphers, such as [Wag99, BDK01, ALLW14], attackers typically aim to find differential trails with high probability and then combine them to form long boomerang distinguishers. However, the dependency between the two differential trails highly affects the probability of the boomerang distinguisher. As pointed out by Murphy in [Mur11], there exist cases where the probabilities formulated by $p^2q^2$ are highly inaccurate. He showed that in some cases of S-box based ciphers, two independently chosen differential trails are incompatible, making the boomerang never return, and in other cases, the dependency leads to a higher probability than $p^2q^2$.
Further, Biryukov and Khovratovich made an improvement on exploiting the positive dependency of boomerang distinguishers, which was named boomerang switch [BK09]. The idea was to optimize the transition between the differential trails of $E_0$ and $E_1$ in order to minimize the overall complexity of the boomerang distinguisher. In [BK09], three types of switches were proposed. Instead of decomposing a cipher into rounds by default, the ladder switch decomposes the cipher regarding smaller operations, like columns and bytes, which may lead to better distinguishers. The S-box switch refers to the case where both differential trails activate the same S-box with identical input and output differences, so that the probability of this S-box counts only once for the boomerang distinguisher. The Feistel switch, also noted in [BDK05], stands for a free middle round in the boomerang distinguisher for a Feistel cipher.
The above cases of dependency were later covered and unified in the sandwich attack proposed by Dunkelman et al. [DKS10, DKS14], which is depicted in the middle part of Figure 1. It regards $E$ as $E = \tilde{E}_1 \circ E_m \circ \tilde{E}_0$ instead, where the middle part $E_m$ specifically handles the dependency and contains a relatively small number of rounds. If the probability of generating a right quartet for $E_m$ is $r$, then the probability of the whole boomerang distinguisher is
$$\tilde{p}^2\tilde{q}^2 r,$$
where $\tilde{p}$ (resp. $\tilde{q}$) is the probability of the differential of $\tilde{E}_0$ (resp. $\tilde{E}_1$). Let $(x_1, x_2, x_3, x_4)$ and $(y_1, y_2, y_3, y_4)$ be input and output quartet values for $E_m$, where $y_i = E_m(x_i)$. Suppose the differential trail for $\tilde{E}_0$ (resp. $\tilde{E}_1$) ends (resp. starts) with difference $\beta$ (resp. $\gamma$), i.e., $x_1 \oplus x_2 = x_3 \oplus x_4 = \beta$ and $y_1 \oplus y_3 = y_2 \oplus y_4 = \gamma$. Then, $r$ was formally defined as:
$$r = \Pr[x_3 \oplus x_4 = \beta \mid (x_1 \oplus x_2 = \beta) \wedge (y_1 \oplus y_3 = \gamma) \wedge (y_2 \oplus y_4 = \gamma)].$$
In [DKS10, DKS14], the probability $r$ of $E_m$ was evaluated by experiments.
Recently in [CHP+18], the issue of dependency in boomerang distinguishers was revisited, and a tool named Boomerang Connectivity Table (BCT) was proposed, which calculates $r$ theoretically when $E_m$ is composed of a single S-box layer, as shown in the right part of Figure 1. More importantly, the previous observations on the S-box, including the ladder switch and the S-box switch as well as the incompatibility, can be well explained by the BCT, which gives new insights into boomerang attacks and provides a new point of view for designing a good S-box. As a follow-up, Boura and Canteaut [BC18] gave a thorough analysis of BCT properties of some important families of S-boxes.
Although the introductory paper of BCT [CHP+18] handles well the dependency of two differential trails in boomerang distinguishers when $E_m$ is of one S-box layer, the following questions may be asked naturally.
• How to decide the actual boundaries of $E_m$ which contains dependency of the two differential trails in boomerang distinguishers?
• How to calculate $r$ when $E_m$ contains multiple rounds?
Answers to these questions would be of great importance for evaluating the probability of boomerang distinguishers. Only when the probability of the boomerang distinguisher is accurately computed can we evaluate the exact resistance of a cipher against boomerang attacks.
Our contributions. This paper gives the first solution to the above questions by proposing a generalized framework of BCT. Specifically, our new framework is able to not only find the actual boundaries of $E_m$ which contains dependency of two differential trails in the setting of boomerang attacks, but also systematically calculate the probability $r$ of $E_m$ with any number of rounds. With the issues of $E_m$ settled, the probability of the full distinguisher of $E = \tilde{E}_1 \circ E_m \circ \tilde{E}_0$ can then be closely modeled by $\tilde{p}^2\tilde{q}^2 r$.
To achieve this, we start with the basic formula of BCT and then extend it to general cases. Specifically, new formulas are developed for all possible cases with the help of a new concept named crossing difference, which refers to the difference propagated from the other differential trail of the boomerang distinguisher. With the crossing difference, the middle part $E_m$ can be well described. First, the boundaries of $E_m$ are delineated by the round where the crossing differences become random. Second, the probability $r$ of $E_m$ depends on the distribution of the crossing difference. Finally, the case considered in [CHP+18] where $E_m$ is of one S-box layer maps to the case here where the crossing differences are fixed.
To demonstrate the power of our generalized framework, we apply it to SKINNY [BJK+16] and AES [DR02], which are two typical block ciphers using weak and strong round functions respectively. In the case of SKINNY, we re-evaluate the probabilities of the four boomerang distinguishers proposed in [LGS17] and the results are summarized in Table 1. As shown in Table 1, the lengths of $E_m$ for these distinguishers are 5 or 6 rounds. The corresponding probabilities $r$ are computed and confirmed by experiments. Adjacent to $E_m$ there are some passive rounds for all these distinguishers, so the probability remains $r$ with these passive rounds included. The increased numbers of rounds obtained by adding these passive rounds are displayed in parentheses. The probabilities of the full boomerang distinguishers are then computed accordingly as $\tilde{p}^2\tilde{q}^2 r$, which turn out to be much higher than the probabilities $\hat{p}^2\hat{q}^2$ given in [LGS17]. In the case of AES, we propose a 6-round related-subkey boomerang distinguisher of probability $2^{-109.42}$ by combining two 3-round differential trails. In this case, $E_m$ is of two rounds. Our framework is then used to exclude incompatibility and optimize $\tilde{p}^2\tilde{q}^2 r$ by selecting a good combination.
Lastly, we discuss the relation between the dependency of two differential trails in boomerang distinguishers and the properties of the round function. It is deduced from our generalized framework that the length of $E_m$ is mainly determined by the diffusion effect of the linear layer, and the probability $r$ is strongly affected by differential properties of the non-linear layer.
Concurrently, Wang and Peyrin [WP19] studied the effect of BCT in multiple rounds, based on which an improved attack on AES-256 was proposed in the related-key setting.
Organization. The rest of the paper is organized as follows. Section 2 provides preliminaries of boomerang attacks and previous works on BCT. Our generalized framework of BCT is presented in Section 3. Section 4 applies the new framework to SKINNY. Section 5 extends the application to AES. We then discuss in Section 6 the relation between the dependency occurring in boomerang distinguishers and the properties of the cipher. Finally, Section 7 concludes the paper.

Preliminaries
This section gives a clearer picture of the boomerang attack and reviews the previous works on the boomerang connectivity table. In addition, notations used throughout this paper are also introduced.

Framework of Boomerang Attacks
The boomerang attack, proposed by David Wagner [Wag99], treats a block cipher $E$ as the composition of two sub-ciphers $E_0$ and $E_1$, for which there exist short differentials $\alpha \to \beta$ and $\gamma \to \delta$ of probabilities $p$ and $q$ respectively. The two differentials are then combined in a chosen plaintext and ciphertext attack setting to construct a long boomerang distinguisher, as shown in Figure 1. Later, the basic boomerang attack was extended to the related-key setting and was formulated in [BDK05] by using four related-key oracles.
Let $E_K(P)$ and $E_K^{-1}(C)$ denote the encryption of $P$ and the decryption of $C$ under a key $K$, respectively. Suppose $\Delta K$, $\nabla K$ are the master key differences of the differentials. Then the boomerang framework in the related-key setting works as follows.
2. Repeat the following steps many times.
In step 2(e), if $P_3 \oplus P_4 = \alpha$ holds, then a right quartet $(P_1, P_2, P_3, P_4)$ is found such that $P_1 \oplus P_2 = P_3 \oplus P_4 = \alpha$ and $C_1 \oplus C_3 = C_2 \oplus C_4 = \delta$. This happens with probability $p^2q^2$ under the assumption that the two differentials are independent.
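The related-key boomerang loop can be sketched as follows. This is only an illustration under stated assumptions: the key relations $K_2 = K_1 \oplus \Delta K$, $K_3 = K_1 \oplus \nabla K$, $K_4 = K_1 \oplus \Delta K \oplus \nabla K$ follow the usual four-oracle formulation, the function names are hypothetical, and the toy cipher $E_K(P) = P \oplus K$ is chosen only because the boomerang provably returns for it.

```python
import random

def boomerang_returns(enc, dec, k1, delta_k, nabla_k, alpha, delta, p1):
    """Run one related-key boomerang trial; True if the boomerang returns."""
    # Four related keys (assumed four-oracle formulation).
    k2 = k1 ^ delta_k
    k3 = k1 ^ nabla_k
    k4 = k1 ^ delta_k ^ nabla_k
    # Plaintext pair with difference alpha, encrypted under K1 and K2.
    p2 = p1 ^ alpha
    c1, c2 = enc(k1, p1), enc(k2, p2)
    # Shift both ciphertexts by delta and decrypt under K3 and K4.
    c3, c4 = c1 ^ delta, c2 ^ delta
    p3, p4 = dec(k3, c3), dec(k4, c4)
    # Final check: does the new plaintext pair have difference alpha again?
    return (p3 ^ p4) == alpha

def count_right_quartets(enc, dec, k1, delta_k, nabla_k, alpha, delta,
                         trials, n_bits, seed=0):
    """Count right quartets over `trials` random plaintexts."""
    rng = random.Random(seed)
    return sum(
        boomerang_returns(enc, dec, k1, delta_k, nabla_k, alpha, delta,
                          rng.getrandbits(n_bits))
        for _ in range(trials)
    )

# Toy linear "cipher" (hypothetical, for illustration only).
def toy_enc(k, p):
    return p ^ k

def toy_dec(k, c):
    return c ^ k

hits = count_right_quartets(toy_enc, toy_dec, 0x3A, 0x05, 0x11,
                            0x0F, 0xF0, trials=100, n_bits=8)
```

For a real cipher, `enc` and `dec` would be replaced by the keyed encryption and decryption oracles, and the fraction `hits / trials` would estimate the probability of the distinguisher.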

Boomerang Connectivity Table
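We introduce here the definitions and propositions related to the boomerang connectivity table. Let $S$ be an $n$-bit S-box. The difference distribution table (DDT) of $S$ is defined as
$$\mathrm{DDT}(\alpha, \beta) = \#\{x \in \{0,1\}^n \mid S(x) \oplus S(x \oplus \alpha) = \beta\}.$$
The differential uniformity of $S$ is the highest value in the DDT except for the first row and the first column. For an invertible $S$, the boomerang connectivity table [CHP+18] is defined as
$$\mathrm{BCT}(\alpha, \beta) = \#\{x \in \{0,1\}^n \mid S^{-1}(S(x) \oplus \beta) \oplus S^{-1}(S(x \oplus \alpha) \oplus \beta) = \alpha\}. \qquad (1)$$
We also use the sets $X_{\mathrm{DDT}}(\alpha, \beta) = \{x \in \{0,1\}^n \mid S(x) \oplus S(x \oplus \alpha) = \beta\}$ and $Y_{\mathrm{DDT}}(\alpha, \beta) = \{S(x) \mid x \in X_{\mathrm{DDT}}(\alpha, \beta)\}$.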
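As a concrete illustration, the following sketch computes the DDT and BCT of a small S-box directly from their standard definitions: $\mathrm{DDT}(\alpha,\beta)$ counts the inputs $x$ with $S(x) \oplus S(x \oplus \alpha) = \beta$, and $\mathrm{BCT}(\alpha,\beta)$ counts the inputs $x$ for which the one-S-box boomerang returns. The 4-bit permutation `SBOX` is an arbitrary choice made for this example and is not taken from the text.

```python
# Arbitrary 4-bit permutation, used for illustration only.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
N = 4
SIZE = 1 << N

# Inverse S-box, needed for the BCT.
INV = [0] * SIZE
for x, y in enumerate(SBOX):
    INV[y] = x

def compute_ddt():
    """DDT[a][b] = #{x : S(x) ^ S(x ^ a) = b}."""
    table = [[0] * SIZE for _ in range(SIZE)]
    for a in range(SIZE):
        for x in range(SIZE):
            table[a][SBOX[x] ^ SBOX[x ^ a]] += 1
    return table

def compute_bct():
    """BCT[a][b] = #{x : S^-1(S(x) ^ b) ^ S^-1(S(x ^ a) ^ b) = a}."""
    table = [[0] * SIZE for _ in range(SIZE)]
    for a in range(SIZE):
        for b in range(SIZE):
            table[a][b] = sum(
                1 for x in range(SIZE)
                if INV[SBOX[x] ^ b] ^ INV[SBOX[x ^ a] ^ b] == a
            )
    return table

DDT = compute_ddt()
BCT = compute_bct()

# Differential uniformity: largest entry outside the first row and column.
diff_uniformity = max(DDT[a][b] for a in range(1, SIZE) for b in range(1, SIZE))
```

Dividing a BCT entry by $2^n$ gives the probability that a one-S-box boomerang with input difference $\alpha$ and output shift $\beta$ returns.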

Notations
We treat the block cipher as $E = E_1 \circ E_0$ where there exist differential trails of $E_0$ and $E_1$ with probabilities $p$ and $q$ respectively. Let $E_m$ denote the middle part of the cipher which contains dependency of the two differential trails. The probability of $E_m$ of generating a right quartet is denoted by $r$. We let $\tilde{E}_0 \leftarrow E_0 \setminus E_m$, i.e., the front rounds of $E_0$ that do not contain dependency, and $\tilde{E}_1 \leftarrow E_1 \setminus E_m$, i.e., the rear rounds of $E_1$ that do not contain dependency. With $E_m$ clearly defined, we treat $E = \tilde{E}_1 \circ E_m \circ \tilde{E}_0$ so that the probability of generating a right quartet (also called the probability of the boomerang distinguisher) can be computed precisely as $\tilde{p}^2\tilde{q}^2 r$, where $\tilde{p}$ (resp. $\tilde{q}$) is the probability of the differential trail of $\tilde{E}_0$ (resp. $\tilde{E}_1$). We denote the number of rounds in $E$ by $|E|$.

Generalized Framework of BCT
In this section, through a new explanation of BCT, we extend the previous analysis in [CHP+18] for $E_m$ with only one S-box layer to the case where $E_m$ contains multiple rounds and we also show how to decide the boundaries of $E_m$.
In the beginning, we treat the block cipher as $E = E_1 \circ E_0$.

New Explanation
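We first consider the $E_m$ with only one S-box layer at the connecting point of $E_0$ and $E_1$, as shown in Figure 2(1). For such $E_m$, the differences $\alpha$, $\beta$ are specified by the differential trails. By re-expressing Eq. 1 as
$$\mathrm{BCT}(\alpha, \beta) = \sum_{\gamma} \#\{y_1 \in Y_{\mathrm{DDT}}(\alpha, \gamma) \mid y_1 \oplus \beta \in Y_{\mathrm{DDT}}(\alpha, \gamma)\}, \qquad (2)$$
it is known that only those $\gamma$ are possible for which $Y_{\mathrm{DDT}}(\alpha, \gamma) \cap (Y_{\mathrm{DDT}}(\alpha, \gamma) \oplus \beta)$ is not empty, i.e., there exists $y_1 \in Y_{\mathrm{DDT}}(\alpha, \gamma)$ such that $y_1 \oplus \beta$ also belongs to $Y_{\mathrm{DDT}}(\alpha, \gamma)$, as depicted in Figure 2(1). If $y_1 \oplus \beta \in Y_{\mathrm{DDT}}(\alpha, \gamma)$, we will always have $x_3 \oplus x_4 = \alpha$; otherwise, the boomerang never returns. Therefore, the probability of $E_m$ for generating a right quartet is
$$r = \frac{\mathrm{BCT}(\alpha, \beta)}{2^n} = \sum_{\gamma} \frac{\#\big(Y_{\mathrm{DDT}}(\alpha, \gamma) \cap (Y_{\mathrm{DDT}}(\alpha, \gamma) \oplus \beta)\big)}{2^n}. \qquad (3)$$
Similar results can be obtained as follows with $X_{\mathrm{DDT}}$ due to symmetry:
$$r = \frac{\mathrm{BCT}(\alpha, \beta)}{2^n} = \sum_{\gamma} \frac{\#\big(X_{\mathrm{DDT}}(\gamma, \beta) \cap (X_{\mathrm{DDT}}(\gamma, \beta) \oplus \alpha)\big)}{2^n}. \qquad (4)$$
Even though Eq. 3 and 4 look more complex than Eq. 1, they are helpful when we consider $E_m$ of multiple rounds. In fact, the dependency of two differential trails may penetrate into multiple rounds. Next, we extend the analysis to $E_m$ with multiple layers of S-boxes around the connecting point of $E_0$ and $E_1$, find the boundaries of $E_m$, and evaluate its probability.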
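The decomposition of the BCT over intermediate differences can be checked exhaustively on a small S-box. The sketch below assumes that $X_{\mathrm{DDT}}(\gamma,\beta)$ is the set of inputs $x$ with $S(x) \oplus S(x \oplus \gamma) = \beta$ and that $Y_{\mathrm{DDT}}(\alpha,\gamma)$ is the set of outputs $S(x)$ with $S(x) \oplus S(x \oplus \alpha) = \gamma$; the 4-bit permutation is again an arbitrary example.

```python
# Arbitrary 4-bit permutation, used for illustration only.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
SIZE = len(SBOX)
INV = [0] * SIZE
for x, y in enumerate(SBOX):
    INV[y] = x

def x_ddt(gamma, beta):
    """Inputs x whose pair (x, x ^ gamma) has output difference beta."""
    return {x for x in range(SIZE) if SBOX[x] ^ SBOX[x ^ gamma] == beta}

def y_ddt(alpha, gamma):
    """Outputs S(x) whose pair (x, x ^ alpha) has output difference gamma."""
    return {SBOX[x] for x in x_ddt(alpha, gamma)}

def bct(alpha, beta):
    """Direct BCT entry from its definition."""
    return sum(1 for x in range(SIZE)
               if INV[SBOX[x] ^ beta] ^ INV[SBOX[x ^ alpha] ^ beta] == alpha)

def count_via_y(alpha, beta):
    """Sum over gamma of #(Y ∩ (Y ^ beta)) with Y = Y_DDT(alpha, gamma)."""
    total = 0
    for gamma in range(SIZE):
        ys = y_ddt(alpha, gamma)
        total += len(ys & {y ^ beta for y in ys})
    return total

def count_via_x(alpha, beta):
    """Sum over gamma of #(X ∩ (X ^ alpha)) with X = X_DDT(gamma, beta)."""
    total = 0
    for gamma in range(SIZE):
        xs = x_ddt(gamma, beta)
        total += len(xs & {x ^ alpha for x in xs})
    return total

consistent = all(
    bct(a, b) == count_via_y(a, b) == count_via_x(a, b)
    for a in range(SIZE) for b in range(SIZE)
)
```

Both sums reproduce the BCT entry because each input $x$ that makes the boomerang return is counted exactly once, under the intermediate difference $\gamma$ that its pair actually follows.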

Generalization
Now, we consider S-boxes in general cases which are not necessarily located at the connecting point of $E_0$ and $E_1$. We observe that for S-boxes away from the connecting point, the differences $\alpha$ in Eq. 4 or $\beta$ in Eq. 3 (to be defined as crossing differences) may not be fixed but follow some distributions. Our intuition is to take these distributions into account, which turn out to be a key factor for evaluating the dependency of two differential trails in a boomerang distinguisher.
Suppose the input and output differences of S-boxes in the two differential trails are given. We use $\tilde{r}$ to denote the probability of getting a right quartet that follows exact differential trails. In fact, the differences in between may have many choices. The actual probability $r$ is composed of the probabilities $\tilde{r}$ corresponding to all possible intermediate differences and hence $r$ is usually greater than or equal to any single $\tilde{r}$. As an analogy to the clustering effect of differentials, we call this the clustering effect of $E_m$.
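Active S-boxes in $E_0$. Let us consider an active S-box in the upper differential trail of $E_0$. This situation is illustrated in Figures 2(2) and 3(1). Suppose the input and output differences $\alpha$, $\gamma$ of this S-box are specified by the upper differential trail. $\beta$ is the difference propagated from the lower differential trail and is called the lower crossing difference, as depicted in Figure 2(2). The value of $\beta$ may not be fixed. From Eq. 2, it can be seen that only when $Y_{\mathrm{DDT}}(\alpha, \gamma) \cap (Y_{\mathrm{DDT}}(\alpha, \gamma) \oplus \beta)$ is not empty will the boomerang return. That is, both $y_1$ and $y_3 = y_1 \oplus \beta$ should belong to $Y_{\mathrm{DDT}}(\alpha, \gamma)$ (see Figure 2(2)). If the distribution of the lower crossing difference $\beta$ is independent of the upper differential trail, i.e., the value of $\beta$ is not affected by the upper differential trail as showcased in Figure 3(1), then the probability $\tilde{r}$ that the boomerang returns when the output difference of this S-box is $\gamma$ is
$$\tilde{r} = \sum_{b} \Pr(\beta = b) \cdot \frac{\#\big(Y_{\mathrm{DDT}}(\alpha, \gamma) \cap (Y_{\mathrm{DDT}}(\alpha, \gamma) \oplus b)\big)}{2^n}. \qquad (5)$$
When we take all possible output differences $\gamma$ of this S-box into account, we have
$$r = \sum_{\gamma} \tilde{r}. \qquad (6)$$
Particularly, if the lower crossing difference $\beta$ is constant, then $\tilde{r} = \#\big(Y_{\mathrm{DDT}}(\alpha, \gamma) \cap (Y_{\mathrm{DDT}}(\alpha, \gamma) \oplus \beta)\big)/2^n$ and $r = \sum_{\gamma} \tilde{r} = \mathrm{BCT}(\alpha, \beta)/2^n$ is exactly the same as Eq. 3. If $\beta$ is always 0, $\tilde{r} = \mathrm{DDT}(\alpha, \gamma)/2^n$ and $r = 1$.
If the lower crossing difference $\beta$ is uniformly distributed, i.e., $\Pr(\beta = b) = 2^{-n}$ for any $b \in \{0,1\}^n$, then $\tilde{r} = (\mathrm{DDT}(\alpha, \gamma)/2^n)^2$, which becomes identical to the computation by $p^2q^2$ in the classical boomerang attack.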
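The three special cases for an active S-box in $E_0$ (constant, zero, and uniform lower crossing difference) can be checked numerically. The sketch below assumes that the per-trail probability is the weighted sum, over values $b$ of the crossing difference, of $\#(Y_{\mathrm{DDT}}(\alpha,\gamma) \cap (Y_{\mathrm{DDT}}(\alpha,\gamma) \oplus b))/2^n$, with the crossing difference independent of the upper trail. The S-box and the helper names are illustrative choices.

```python
from fractions import Fraction

# Arbitrary 4-bit permutation, used for illustration only.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
SIZE = len(SBOX)
INV = [0] * SIZE
for x, y in enumerate(SBOX):
    INV[y] = x

def ddt(alpha, gamma):
    return sum(1 for x in range(SIZE) if SBOX[x] ^ SBOX[x ^ alpha] == gamma)

def bct(alpha, beta):
    return sum(1 for x in range(SIZE)
               if INV[SBOX[x] ^ beta] ^ INV[SBOX[x ^ alpha] ^ beta] == alpha)

def y_ddt(alpha, gamma):
    return {SBOX[x] for x in range(SIZE)
            if SBOX[x] ^ SBOX[x ^ alpha] == gamma}

def r_tilde(alpha, gamma, dist):
    """Probability that the boomerang returns through alpha -> gamma.

    `dist` maps each value b of the lower crossing difference to Pr(beta = b).
    """
    ys = y_ddt(alpha, gamma)
    return sum(pr * Fraction(len(ys & {y ^ b for y in ys}), SIZE)
               for b, pr in dist.items())

def r_total(alpha, dist):
    """Clustered probability: sum of r_tilde over all output differences."""
    return sum(r_tilde(alpha, g, dist) for g in range(SIZE))

ALPHA = 0x1
zero = {0: Fraction(1)}
constant = {0x7: Fraction(1)}
uniform = {b: Fraction(1, SIZE) for b in range(SIZE)}
```

With exact fractions, `r_total(ALPHA, zero)` equals 1, `r_total(ALPHA, constant)` equals the BCT entry divided by $2^n$, and `r_tilde(ALPHA, g, uniform)` equals the squared DDT probability for every `g`.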
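Active S-boxes in $E_1$. For an active S-box in the lower differential trail of $E_1$, similar results can be obtained. Suppose the output and input differences $\beta$, $\gamma$ of this S-box are specified by the lower differential trail, as shown in Figures 2(3) and 3(2). In this case, $\alpha$ is the difference propagated from the upper differential trail and is called the upper crossing difference. The value of $\alpha$ may not be fixed. From Eq. 4, it can be seen that only when $X_{\mathrm{DDT}}(\gamma, \beta) \cap (X_{\mathrm{DDT}}(\gamma, \beta) \oplus \alpha)$ is not empty will the boomerang return. That is, both $x_1$ and $x_2 = x_1 \oplus \alpha$ should belong to $X_{\mathrm{DDT}}(\gamma, \beta)$ (see Figure 2(3)). If the distribution of the upper crossing difference $\alpha$ is independent of the lower differential trail, i.e., the value of $\alpha$ is not affected by the lower differential trail as showcased in Figure 3(2), then the probability $\tilde{r}$ that the boomerang returns when the input difference of this S-box is $\gamma$ is
$$\tilde{r} = \sum_{a} \Pr(\alpha = a) \cdot \frac{\#\big(X_{\mathrm{DDT}}(\gamma, \beta) \cap (X_{\mathrm{DDT}}(\gamma, \beta) \oplus a)\big)}{2^n}. \qquad (7)$$
When we take all possible input differences $\gamma$ of this S-box into account, we have
$$r = \sum_{\gamma} \tilde{r}. \qquad (8)$$
Particularly, if the upper crossing difference $\alpha$ is constant, then $\tilde{r} = \#\big(X_{\mathrm{DDT}}(\gamma, \beta) \cap (X_{\mathrm{DDT}}(\gamma, \beta) \oplus \alpha)\big)/2^n$ and $r = \sum_{\gamma} \tilde{r} = \mathrm{BCT}(\alpha, \beta)/2^n$ is exactly the same as Eq. 4. If $\alpha$ is always 0, $\tilde{r} = \mathrm{DDT}(\gamma, \beta)/2^n$ and $r = 1$.
If the upper crossing difference $\alpha$ is uniformly distributed, i.e., $\Pr(\alpha = a) = 2^{-n}$ for any $a \in \{0,1\}^n$, then $\tilde{r} = (\mathrm{DDT}(\gamma, \beta)/2^n)^2$, which becomes identical to the computation by $p^2q^2$ in the classical boomerang attack.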

Interrelated active S-boxes. It is possible that an active S-box A in $E_0$ and an active S-box B in $E_1$ affect each other, as showcased in Figure 3(3). Suppose the differential of S-box A is $\alpha \to \gamma$, according to the upper differential trail. Similarly, the differential of S-box B is $\gamma' \to \beta'$, according to the lower differential trail. The interrelation here refers to two things. One is that the upper crossing difference $\alpha'$ of S-box B is propagated from S-box A, and the other is that the lower crossing difference $\beta$ of S-box A is propagated from S-box B. To calculate the probability, we further introduce
$$\tilde{r} = \Pr\big[\,y_1,\ y_1 \oplus \beta \in Y_{\mathrm{DDT}}(\alpha, \gamma) \ \wedge\ x_1,\ x_1 \oplus \alpha' \in X_{\mathrm{DDT}}(\gamma', \beta')\,\big], \qquad (9)$$
where $(y_1, y_2, y_3, y_4)$ is the output quartet of S-box A and $(x_1, x_2, x_3, x_4)$ is the input quartet of S-box B. The above formula means that both the condition for S-box A that $y_1$ and $y_3 = y_1 \oplus \beta$ belong to $Y_{\mathrm{DDT}}(\alpha, \gamma)$ and the condition for S-box B that $x_1$ and $x_2 = x_1 \oplus \alpha'$ belong to $X_{\mathrm{DDT}}(\gamma', \beta')$ should be satisfied simultaneously.
When we consider all possible output differences $\gamma$ of S-box A and all possible input differences $\gamma'$ of S-box B, we have
$$r = \sum_{\gamma} \sum_{\gamma'} \tilde{r}. \qquad (10)$$
If more than two active S-boxes affect each other, a similar analysis can be performed to calculate the probability $r$. Such examples can be found in Section 4.3.
Boundaries of $E_m$. From the above analysis, it can be deduced that the upper boundary of $E_m$ is delineated by the round where the lower crossing differences for its active S-boxes are distributed (almost) uniformly. Also, the lower boundary of $E_m$ is marked by the round where the upper crossing differences for its active S-boxes are distributed (almost) uniformly. Due to this, the length of $E_m$ heavily depends on the diffusion properties of the cipher, which will be exemplified by the application to SKINNY and AES in the following two sections.

Algorithm for Evaluating r
Given two differential trails over $E_0$ and $E_1$ respectively, we are to find the middle part $E_m$ that contains dependency and evaluate its probability $r$ for generating a right quartet. Once this is done, the probability of the full boomerang distinguisher of $E = \tilde{E}_1 \circ E_m \circ \tilde{E}_0$ can be closely modeled as $\tilde{p}^2\tilde{q}^2 r$, where $\tilde{p}$ (resp. $\tilde{q}$) is the probability of the differential trail over $\tilde{E}_0$ (resp. $\tilde{E}_1$).
We take the following steps to find the boundaries of $E_m$ and calculate the probability $r$.
1. Extend the upper differential trail forwards with probability 1; also, extend the lower differential trail backwards with probability 1. With the extensions, it can be told whether an active S-box of the differential trail is affected by the other differential trail or not.
2. Initialize $E_m$ with the last round of $E_0$ and the first round of $E_1$.
In steps 4 and 5 of the algorithm, the upper boundary and the lower boundary are determined respectively, and these two steps can be swapped. If the returned $r$ is 0, it means the two differential trails are incompatible. The time complexity of the algorithm depends on the properties of the cipher and the two differential trails of the boomerang distinguisher. We will discuss it in more detail in Section 6.

Application to SKINNY
In [LGS17], Liu et al. mounted related-tweakey rectangle attacks against SKINNY. The attacks evaluated the probability of the distinguishers by taking into account the amplified probability but did not consider the dependency of two differential trails. In [CHP+18], which introduced the BCT, the authors accurately evaluated the probability of generating the right quartet for two middle rounds by applying the BCT.
In this section, we revisit this issue by applying the generalized framework of BCT to SKINNY. With the generalized framework of BCT, we are able to identify the actual boundaries of $E_m$ and accurately calculate the probability $r$ for $E_m$. Most notably, accurate evaluations of the probability of full distinguishers of SKINNY become possible. The results show that there exists dependency in 5 or 6 middle rounds, which makes the real probability much higher than previously evaluated.
In the remainder of this section, we first give a brief description of SKINNY, followed by a review of the boomerang distinguishers proposed in [LGS17], for which we show how the generalized framework of BCT helps to evaluate the probability $r$. Finally, an analysis of the results is given.

Description of SKINNY
SKINNY [BJK+16] is a family of lightweight block ciphers which adopt the substitution-permutation network and elements of the TWEAKEY framework [JNP14]. Members of SKINNY are denoted by SKINNY-$n$-$t$, where $n \in \{64, 128\}$ is the block size and $t \in \{n, 2n, 3n\}$ is the tweakey size. The internal states of SKINNY are represented as $4 \times 4$ arrays of cells with each cell being a nibble in case of $n = 64$ bits and a byte in case of $n = 128$ bits. The tweakey state is seen as a group of $z$ $4 \times 4$ arrays, where $z = t/n$. The arrays are marked as $TK1$, $(TK1, TK2)$ and $(TK1, TK2, TK3)$ for $z = 1, 2, 3$ respectively.
Encryption. SKINNY iterates a round function for $N_r$ rounds and each round consists of the following five steps.

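1. SubCells (SC) - An S-box is applied to each cell of the internal state.
2. AddConstants (AC) - Round constants are XORed into the internal state.
3. AddRoundTweakey (ART) - The first two rows of the internal state absorb the first two rows of $TK$, where $TK = \bigoplus_{i=1}^{z} TKi$. The tweakey states $TKi$ are then updated by a tweakey scheduling algorithm.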
4. ShiftRows (SR) - Each cell in row $j$ is rotated to the right by $j$ cells.
5. MixColumns (MC) - Each column of the internal state is multiplied by matrix $M$. The inverse MixColumns operation employs $M^{-1}$ instead. Note that the branch number of MC is only 2.
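The two linear steps can be sketched as follows. The binary matrix `M` is not given in the text above; the value used here is recalled from the SKINNY specification and should be checked against it before any real use. The state is a list of four rows of integer cells, so the same code applies to nibbles and bytes.

```python
# MixColumns matrix of SKINNY as recalled from its specification
# (an assumption of this sketch, not stated in the text above).
M = [
    [1, 0, 1, 1],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 0],
]

def shift_rows(state):
    """Rotate row j of a 4x4 state to the right by j cells."""
    return [row[-j:] + row[:-j] if j else row[:] for j, row in enumerate(state)]

def mix_column(col):
    """Multiply one column by the binary matrix M (addition is XOR)."""
    out = []
    for m_row in M:
        acc = 0
        for coeff, cell in zip(m_row, col):
            if coeff:
                acc ^= cell
        out.append(acc)
    return out

def mix_columns(state):
    """Apply mix_column to each of the four columns of the state."""
    cols = [mix_column([state[r][c] for r in range(4)]) for c in range(4)]
    return [[cols[c][r] for c in range(4)] for r in range(4)]

def active_cells(cells):
    """Number of non-zero cells in a column."""
    return sum(1 for cell in cells if cell != 0)

# One active input cell in row 1 yields one active output cell,
# which illustrates that the branch number is only 2.
example_in = [0, 0x5, 0, 0]
example_out = mix_column(example_in)
```

Such sparse transitions are what allow a crossing difference to stay concentrated on a few cells for several rounds.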
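Tweakey Scheduling Algorithm. The tweakey schedule of SKINNY is a linear algorithm. The $t$-bit tweakey is first loaded into $z$ $4 \times 4$ tweakey states. After each ART step, the tweakey states are updated as follows: a permutation is applied to the cell positions of all tweakey arrays, and then each cell in the first two rows of $TK2$ and $TK3$ is updated by an LFSR.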

Previous Boomerang Attacks
In [LGS17], Liu et al. proposed boomerang distinguishers for SKINNY-$n$-$2n$ and SKINNY-$n$-$3n$ by connecting two short differential trails $\alpha \to \beta$ and $\gamma \to \delta$. For completeness, the differential trails are copied to Tables 4, 7, and 8. The probabilities of these boomerang distinguishers are evaluated by $\hat{p}^2\hat{q}^2$, where $\hat{p} = \sqrt{\sum_i \Pr^2(\alpha \to \beta_i)}$ and $\hat{q} = \sqrt{\sum_j \Pr^2(\gamma_j \to \delta)}$. Since the probabilities are too small to be verified experimentally, the authors of [LGS17] verified two middle rounds, the last round of $E_0$ and the first round of $E_1$, to exclude incompatibility between the trails.
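The amplified probability can be illustrated with a small helper. This sketch assumes the rectangle-attack convention in which the amplified probability is the square root of the sum of squared differential probabilities over all intermediate differences; the function name and the sample values are hypothetical.

```python
import math

def amplified_probability(trail_probs):
    """Return sqrt(sum of squared probabilities) over intermediate differences."""
    return math.sqrt(sum(p * p for p in trail_probs))

# A single trail: the amplified probability is just its probability.
single = amplified_probability([0.5])

# Two equally likely intermediate differences of probability 2^-2 each.
clustered = amplified_probability([2 ** -2, 2 ** -2])
```

With several intermediate differences the amplified probability exceeds that of any single trail, which is why rectangle attacks allow all of them.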
For boomerang distinguishers, two things are of concern. The first is the compatibility of the two differential trails. If the differential trails are compatible, then the second concern is the exact probability. In [CHP+18], the probabilities of the two middle rounds of these distinguishers were evaluated with the BCT. However, the problem of accurately evaluating the probability of a full distinguisher remains unsolved. Next, we will show that the generalized framework of BCT provides us with the first solution to this problem.
For convenience, we use the row index and the column index to describe the position of a cell. Different colors are used to show the differential propagation of certain cells. For example, the cell at (2,2) after SC and AC of R1 is outlined in dark green, meaning its lower crossing difference propagates through the dark green cells from the lower differential trail. Similarly, the cell at (3,3) before SC and AC of R4 is outlined in purple, meaning its upper crossing difference propagates through the purple cells from the upper differential trail.
Probability of $E_m$ with R2 and R3. According to the upper differential trail, only two cells of the input of R2 are active and the input differences are both $\beta = \mathtt{0x01}$. For the lower differential trail, four cells in the first round (R3 in the figure) are active and their output differences are 0x03, 0x20, 0x20 and 0x20. From the differential trails and the extensions, it is known that the lower crossing differences of the two active S-boxes in R2 of the upper trail are always 0, while the upper crossing differences of the active S-boxes at (0,0) and (3,0) in R3 of the lower trail are $\alpha$ (as depicted in Figure 4), which is non-zero and propagated from the active S-box at (0,0) in R2 of the upper trail. Note that in the context of $E_m$ with only R2 and R3, the propagation of $\beta \to \alpha$ is independent of the lower trail. The probability of this $E_m$ is then obtained by applying the generalized BCT.
Probability of $E_m$ with R1, R2 and R3. Now $E_m$ starts from R1. In this case, the analysis of the lower differential trail remains the same while the analysis of the upper differential trail changes, compared with the $E_m$ with only R2 and R3. In the upper differential trail, only the S-box at (2,2) of R1 is active and has input difference 0x55. However, its output difference $\beta$ has multiple choices and might not be 0x01. Therefore the S-box at (2,0) of R2 might be active. In R2, the lower crossing difference for the S-box at (2,0) is the difference in blue that propagates from the lower differential trail, and this difference is independent of the upper differential trail. Another affected active S-box of the upper differential trail is the active S-box in R1. Specifically, its lower crossing difference is the difference in green propagated from S-box (2,1) in R3 of the lower differential trail through two rounds backwards. The probability of this $E_m$ is then computed by Eq. 6.
Probability of $E_m$ with R1, R2, R3 and R4. We do not prepend more rounds from $E_0$ to $E_m$ since the three rounds ahead are fully passive and, after propagating the lower differential trail by three more rounds backwards, the crossing differences can be seen as uniform. Then we try to append more rounds from $E_1$. First, we append R4 to $E_m$. The only active S-box in R4 is located at (3,3) and its upper crossing difference propagates through the purple cells from the upper differential trail. We can see that the upper and lower differential trails strongly interrelate. However, compared with the previous situation where $E_m$ is composed of R1, R2 and R3, the only extra effect is that the upper crossing difference in purple affects S-box (3,3) of R3. Still, we can calculate $r$ by Eq. 9 and obtain $2^{-9.54}$.
Probability of $E_m$ with R1, R2, R3, R4 and R5. In fact, the upper crossing differences at R5 become so random that we can neglect the dependency in R5. However, due to the weak diffusion of MC, the difference of the only active S-box in R4 of the lower trail does not diffuse to more cells. Actually, the output differences $\eta$ of the active S-box in R4 on the two faces of the boomerang do not have to be identical. Taking this into account, we calculate the probability of the five-round $E_m$.
Now we have identified the middle part $E_m$ of the boomerang distinguisher of SKINNY-128-256. The $E_m$ has 5 rounds and its probability of generating a right quartet is $r = 2^{-11.45}$. The probabilities of the intermediate $E_m$ with 2 to 4 rounds and the final $E_m$ with 5 rounds are confirmed by experiments. By adding three passive rounds to the front and four passive rounds to the rear, we obtain a 12-round boomerang distinguisher of the same probability, namely $2^{-11.45}$. For the full 18-round distinguisher, the probability of the first four rounds is $\tilde{p} = 2^{-25.19}$ by a simple calculation considering the clustering effect of differentials. The probability of the last two rounds is $\tilde{q} = 2^{-8}$ (no clustering effect). Therefore, the probability of the full distinguisher is $\tilde{p}^2\tilde{q}^2 r = 2^{-77.83}$, which is much higher than the $2^{-103.84}$ calculated in [LGS17]. For other versions of SKINNY, a similar analysis can be done and we summarize the results as follows.
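The final probability of the 18-round distinguisher is simple arithmetic in the exponent. The sketch below works with base-2 logarithms and uses only the figures quoted above; the function name is hypothetical.

```python
def log2_boomerang_probability(log2_p, log2_q, log2_r):
    """log2 of p^2 * q^2 * r for the sandwich decomposition."""
    return 2 * log2_p + 2 * log2_q + log2_r

# Figures quoted for SKINNY-128-256: p = 2^-25.19, q = 2^-8, r = 2^-11.45.
full = log2_boomerang_probability(-25.19, -8.0, -11.45)

# Gain over the earlier estimate 2^-103.84 from [LGS17], in bits.
gain_bits = full - (-103.84)
```

Here `full` evaluates to about -77.83 and `gain_bits` to about 26.01, so the re-evaluated probability is roughly $2^{26}$ times higher than the earlier estimate.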

Results
The results for all four versions of SKINNY-$n$-$2n$ and SKINNY-$n$-$3n$ are summarized in Table 1, where the fourth column presents the probabilities $r$ of $E_m$ computed by the algorithm in Section 3.3. We carry out experiments on $E_m$ and the experimental probabilities are $2^{-12.95}$, $2^{-11.37}$, $2^{-10.51}$ and $2^{-9.89}$, which are close to the probabilities in the fourth column. The computation of $r$ for the four versions is practical and takes 0.38, 189.26, 0.11 and 23.16 seconds respectively on a desktop. The source codes for calculating and verifying the probabilities $r$ for $E_m$ are available online.
The sixth column gives the probabilities of the full boomerang distinguishers. It can be seen that the re-evaluated probabilities of the full distinguishers are much higher than the probabilities $\hat{p}^2\hat{q}^2$ evaluated before without considering the dependency of the two differential trails [LGS17]. Notably, the complexity of the full 17-round distinguisher of SKINNY-64-128 is $2^{29.78}$, which is practical. Indeed, 9 right quartets are found among $11 \times 2^{29}$ quartets by an experiment, while one could expect a right one in $2^{48.72}$ quartets according to [LGS17]. This big gap shows that the issue of dependency cannot be neglected in boomerang attacks.
Additionally, we have two interesting observations on the results. One observation is that the probabilities of the boomerang distinguishers of SKINNY in Table 1 are much higher than the probabilities of the differential trails of the same number of rounds. In Table 5, we copy the lower bounds on the number of active S-boxes in SKINNY under the related-tweakey setting from [BJK+16]. For example, the minimal number of active S-boxes of 9-round SKINNY-$n$-$2n$ is 9, which means the probability of optimal differential trails could not be higher than $2^{-18}$. Actually, the probability of the optimal differential trails of SKINNY-64-128 is $2^{-20}$, as studied in [LGS17]. The differential trail in Table 7 is an example of the optimal differential trails. This probability can be increased to $2^{-18}$ by considering the clustering effect. On the contrary, the boomerang distinguisher of SKINNY-64-128 with 6 up to 13 rounds has a much higher probability of $2^{-12.96}$. The other observation on Table 1 is that the probability of the 17-round distinguisher of SKINNY-128-384 is slightly higher than the probability of the 17-round distinguisher of SKINNY-64-192. Even though an 8-bit S-box is used in the big versions of SKINNY, its optimal differential probability is $2^{-2}$, which is the same as the optimal differential probability of the 4-bit S-box used in the small versions. Therefore, the probability of a boomerang distinguisher of the big versions is not necessarily lower than the probability of distinguishers of the small versions.

Application to AES
In [BK09], Biryukov et al. presented boomerang attacks on full AES-192 and AES-256 under the related-key setting, specifically the related-subkey setting. Their attacks were based on high-probability boomerang distinguishers constructed by applying the so-called boomerang switches. However, boomerang attacks or distinguishers of AES-128 were not covered in [BK09]. One reason might be that the boomerang switches do not work for AES-128, whose differential trails have much denser differences than those of AES-192 and AES-256.
In this section, we briefly review the specification of AES-128 and then search for differential trails of AES-128 under the related-key setting. By choosing a pair of 3-round differential trails according to the generalized framework of BCT, we construct a 6-round boomerang distinguisher. The probability of the boomerang distinguisher is $2^{-109.42}$ under the related-subkey setting.

Description of AES
The Advanced Encryption Standard (AES) [DR02] is an iterated block cipher which encrypts 128-bit plaintexts with secret keys of 128, 192, or 256 bits. In this paper, we focus on AES-128, which iterates 10 rounds using a 128-bit key. The internal state of AES can be represented as a 4 × 4 array of bytes. The round function consists of four basic steps as follows.
1. SubBytes (SB) - An 8-bit S-box is applied to each byte of the internal state. The details of the S-box can be found in [DR02].
2. ShiftRows (SR) - Each cell in row j is rotated to the left by j cells.
3. MixColumns (MC) - Each column of the internal state is multiplied by a Maximum Distance Separable (MDS) matrix over $\mathbb{F}_{2^8}$.
4. AddRoundKey (AK) - A round key is XORed with the internal state.
At the very beginning of the encryption, an additional whitening key addition is performed, and the last round does not contain MC.
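The key schedule of AES-128 generates the round keys used in each of the rounds. The 128-bit master key can be seen as four 32-bit words $(W_0, W_1, W_2, W_3)$. $W_i$ for $i \geq 4$ is computed as follows, and each round key takes four consecutive words.

$$W_i = \begin{cases} W_{i-4} \oplus \mathrm{SB}(W_{i-1} \lll 8) \oplus \mathrm{Rcon}_{i/4} & \text{if } i \equiv 0 \pmod 4, \\ W_{i-4} \oplus W_{i-1} & \text{otherwise,} \end{cases}$$

where SB applies the S-box to each byte of the word, $\lll 8$ is a left rotation by one byte, and $\mathrm{Rcon}$ denotes the round constants.

Property of AES S-box. The best differential probability of the AES S-box is $2^{-6}$. Given any non-zero input difference α, there exists exactly one output difference β such that DDT(α, β) = 4 and $2^7 - 2$ output differences β such that DDT(α, β) = 2, and vice versa. Its boomerang uniformity is 6.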
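These S-box properties can be reproduced directly. The following is a minimal sketch, not the authors' code: it builds the AES S-box from its standard definition (inversion in $\mathbb{F}_{2^8}$ followed by the affine map), computes the DDT, and computes the BCT using the definition BCT(α, β) = #{x : S⁻¹(S(x) ⊕ β) ⊕ S⁻¹(S(x ⊕ α) ⊕ β) = α}. All function names are illustrative.

```python
from collections import Counter, defaultdict

def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (0x11B)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def aes_sbox() -> list:
    """AES S-box: field inversion (0 maps to 0) followed by the affine map."""
    inv = [0] * 256
    for x in range(1, 256):
        inv[x] = next(y for y in range(1, 256) if gf_mul(x, y) == 1)
    sbox = []
    for x in range(256):
        b = inv[x]
        s = b
        for k in range(1, 5):  # XOR of the four left rotations of b
            s ^= ((b << k) | (b >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

def ddt(sbox: list) -> list:
    """DDT[a][b] = #{x : S(x) ^ S(x ^ a) = b}."""
    table = [[0] * 256 for _ in range(256)]
    for a in range(256):
        for x in range(256):
            table[a][sbox[x] ^ sbox[x ^ a]] += 1
    return table

def bct(sbox: list) -> list:
    """BCT[a][b] = #{x : T(x) ^ T(x ^ a) = a} with T(x) = S^-1(S(x) ^ b).

    The condition is equivalent to T(x) ^ x = T(x ^ a) ^ (x ^ a), so inputs
    are grouped by T(x) ^ x and every ordered pair in a group is counted.
    """
    inv = [0] * 256
    for x, y in enumerate(sbox):
        inv[y] = x
    table = [[0] * 256 for _ in range(256)]
    for b in range(256):
        groups = defaultdict(list)
        for x in range(256):
            groups[inv[sbox[x] ^ b] ^ x].append(x)
        for xs in groups.values():
            for x in xs:
                for y in xs:
                    table[x ^ y][b] += 1
    return table

SBOX = aes_sbox()
DDT = ddt(SBOX)
BCT = bct(SBOX)
diff_uniformity = max(DDT[a][b] for a in range(1, 256) for b in range(256))
row_profiles = {tuple(sorted(Counter(DDT[a]).items())) for a in range(1, 256)}
boomerang_uniformity = max(BCT[a][b] for a in range(1, 256) for b in range(1, 256))
print(diff_uniformity, row_profiles, boomerang_uniformity)
```

Grouping inputs by T(x) ⊕ x avoids the naive triple loop over (α, β, x), which keeps the BCT computation fast even in pure Python.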

Search for Differential Trails
In the literature, several methods have been proposed to search for differential trails of AES under the related-key setting, such as [BN10, GMS16, SGL+17, GLMS18]. The methods in [GMS16, GLMS18] employ Constraint Programming (CP) and perform well for all versions of AES. Therefore, we adopt the CP-based methods for searching differential trails of AES-128.
As pointed out in [GLMS18], the minimal number of active S-boxes in three rounds of AES-128 under the related-key setting is 5. When we increase the number of rounds to four, the minimal number of active S-boxes becomes 12. Based on this, we aim to construct 6-round boomerang distinguishers of AES-128 from 3-round differential trails.

Boomerang Distinguisher
There exist only two 3-round differential trails with 5 active S-boxes, and both have probability $2^{-31}$. However, these two differential trails turn out to be incompatible, i.e., the dependent part $E_m$ of the two differential trails cannot generate a right quartet. How hard is it to find a pair of compatible differential trails? Based on the properties of the AES S-box, we give a rough estimation as follows.
For an active S-box in $E_0$ (resp. $E_1$) whose lower (resp. upper) crossing difference is non-zero and fixed, the differences are compatible (the BCT entry is greater than 0) with probability close to $2^{-1}$. For a pair of interrelated active S-boxes, the differences are compatible (the probability in Eq. 9 is greater than 0) with probability close to $2^{-2}$. Therefore, the sparser the differences are, the more easily we can get a pair of compatible differential trails.
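As a back-of-the-envelope illustration of this estimate, the sketch below multiplies the per-S-box compatibility probabilities. Treating the S-boxes as independent is our simplifying assumption, and the function name and the example counts are hypothetical; they are not taken from the trails in the paper.

```python
def log2_compatibility_estimate(single_active: int, interrelated_pairs: int) -> float:
    """Rough log2 probability that a pair of trails is compatible.

    single_active:      active S-boxes in E0 or E1 with a fixed non-zero
                        crossing difference (about 2^-1 each)
    interrelated_pairs: pairs of interrelated active S-boxes (about 2^-2 each)
    Independence between S-boxes is assumed (heuristic).
    """
    return -1.0 * single_active - 2.0 * interrelated_pairs

# Hypothetical example: 4 single active S-boxes and 1 interrelated pair.
estimate = log2_compatibility_estimate(4, 1)
candidates_needed = 2.0 ** -estimate  # expected number of trail pairs to try
print(estimate, candidates_needed)
```

Under this heuristic, every additional S-box in the dependent part roughly doubles or quadruples the number of candidate trail pairs that must be examined, which is why sparser trails are easier to combine.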
We then search for more 3-round differential trails by allowing 6 active S-boxes and obtain 18 differential trails with probability $2^{-36}$, $2^{-37}$ or $2^{-38}$. From these 3-round differential trails, we search for a compatible combination using the generalized framework of BCT. The best one we find is composed of a 3-round upper differential trail of probability $2^{-31}$ and a 3-round lower differential trail of probability $2^{-37}$, as shown in Table 6.
The probability of the 2-round $E_m$ is verified by experiments.
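To put these trail probabilities in context, the sketch below evaluates the sandwich formula $p^2 q^2 r$ in the log domain. The function name is ours. With $r = 1$ and the full 3-round trail probabilities it reduces to the classical $p^2 q^2$ estimate that ignores dependency; the probability reported for the 6-round distinguisher comes instead from the dependency-aware evaluation, whose individual factors are not reproduced here.

```python
def log2_sandwich_probability(log2_p: float, log2_q: float, log2_r: float = 0.0) -> float:
    """log2 of p^2 * q^2 * r for the sandwich E = E1~ o Em o E0~.

    log2_p, log2_q: log2 probabilities of the upper and lower differentials
    log2_r:         log2 probability of the middle part Em (0.0 means r = 1)
    """
    return 2.0 * log2_p + 2.0 * log2_q + log2_r

# Classical estimate for trails of probability 2^-31 and 2^-37,
# treating them as independent (r = 1).
classical = log2_sandwich_probability(-31, -37)
print(classical)
```

Working with base-2 logarithms keeps the values readable and avoids floating-point underflow for very small probabilities.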

Discussions
To showcase our generalized framework of BCT, we applied it to SKINNY and AES. Both are SPN block ciphers and share similar round functions. However, AES and SKINNY are typical examples of block ciphers with very strong and very weak round functions, respectively. Specifically, the AES S-box is differentially 4-uniform (6-uniform for BCT) and the AES MC has a branch number of 5. In contrast, SKINNY's 8-bit S-box is differentially 64-uniform (256-uniform for BCT) and the branch number of its MC is only 2.
Together with the analysis of the two block ciphers, we summarize two general properties of the dependent part $E_m$ of the boomerang distinguisher.
Property 1 The length of $E_m$ is mainly determined by the diffusion effect of the linear layer, even though it is also influenced by the density of differences of the trails. Note that AES takes 2 rounds to diffuse an active byte to the full state, while SKINNY takes 6 rounds to have the same effect. Compared with the 2-round $E_m$ of AES, the $E_m$ of SKINNY is quite long and can be 6 rounds, which can be seen from the analysis in Sections 4 and 5.
Property 2 The probability of $E_m$ is strongly affected by the DDT and BCT of the S-box. For example, when we replace SKINNY-128-256's S-box in Figure 4 with the AES S-box, the probabilities of $E_m$ with two and three middle rounds decrease from $2^{-1.75}$ and $2^{-6.06}$ to $2^{-15.87}$ and $2^{-31.67}$, respectively.
As can be seen, these properties are consistent with the common criteria for designing symmetric-key primitives.
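The diffusion claim for AES in Property 1 can be illustrated with a truncated (active/inactive) model of the state. The sketch below is a simplification under stated assumptions: it tracks only which bytes are active, applies ShiftRows as described earlier (row j rotated left by j), and assumes MixColumns turns any column containing an active byte into a fully active column, ignoring possible cancellations. SKINNY is not modeled, since its linear layer is not specified in this section.

```python
def aes_truncated_round(active: set) -> set:
    """Propagate a set of active (row, col) positions through SR and MC."""
    # ShiftRows: the cell at (row, col) moves to (row, col - row mod 4).
    shifted = {(r, (c - r) % 4) for r, c in active}
    # MixColumns (assumed generic): an active column becomes fully active.
    active_cols = {c for _, c in shifted}
    return {(r, c) for c in active_cols for r in range(4)}

def rounds_to_full_diffusion(start, max_rounds: int = 10):
    """Number of rounds until all 16 bytes are active, or None if not reached."""
    active = {start}
    for n in range(1, max_rounds + 1):
        active = aes_truncated_round(active)
        if len(active) == 16:
            return n
    return None

# Every starting position reaches the full state after the same number of rounds.
results = {rounds_to_full_diffusion((r, c)) for r in range(4) for c in range(4)}
print(results)
```

In this model a single active byte spreads to one full column after the first round and to all four columns after the second, which matches the 2-round figure quoted for AES.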
The time complexity of calculating the probability $r$ of $E_m$ mainly depends on the length of $E_m$ and the S-box used in the cipher. Specifically, for a short $E_m$ with small or weak S-boxes, calculating $r$ is efficient, while for a long $E_m$ with large and strong S-boxes, calculating $r$ can be time-consuming, with a time complexity that might exceed $2^{35}$.

Concluding Remarks
In this paper, we revisited the boomerang connectivity table (BCT) and provided a generalized framework of BCT which systematically handles the dependency of two differential trails in boomerang distinguishers. In particular, our framework not only identifies the actual boundaries of the dependent part $E_m$ of the boomerang distinguisher, but also calculates the probability $r$ of $E_m$ for generating a right quartet. With our generalized framework of BCT, the sandwich $E = \tilde{E}_1 \circ E_m \circ \tilde{E}_0$ now closely models the boomerang distinguisher with probability $p^2 q^2 r$, where $p$ (resp. $q$) is the probability of the differential of $\tilde{E}_0$ (resp. $\tilde{E}_1$).
The power of the generalized framework of BCT was demonstrated by the application to SKINNY and AES. In the application to SKINNY, the probabilities of four boomerang distinguishers of SKINNY were accurately computed for the first time, showing that the actual probabilities are much higher than those previously computed by the formula $p^2 q^2$. In the application to AES, a 6-round related-subkey boomerang distinguisher was constructed with the generalized framework of BCT.
We also discussed the general relation between the dependency of two differential trails in a boomerang distinguisher and the properties of the components of the cipher, and showed that the dependency is strongly influenced by both the diffusion property of the linear layer and differential properties of the non-linear layer.

Figure 3: Toy examples of (1) active S-boxes in $E_0$, (2) active S-boxes in $E_1$ and (3) interrelated active S-boxes, where 'S' denotes an n-bit S-box, 'L' denotes the linear layer and the red (resp. blue) arrows stand for extensions of the upper (resp. lower) differential trails with probability 1.
3. [...] $E_m$ with one more round,
   (a) Check whether the lower crossing differences for this newly added round are distributed uniformly or not. If yes, peel off the first round of $E_m$ and go to step 4.
   (b) Go to step 3.
4. Append one more round to $E_m$,
   (a) Check whether the upper crossing differences for the newly added round are distributed uniformly or not. If yes, peel off the last round of $E_m$ and go to step 5.
   (b) Go to step 4.
5. Calculate $r$ using the formulas in Section 3.2.

Figure 5: The two middle rounds of the 6-round boomerang distinguisher of AES-128

Table 1: Probabilities of the boomerang distinguishers of SKINNY, where $|E_m|$ denotes the number of rounds $E_m$ contains.

Table 4: Differential trails of SKINNY-128-256, where each non-zero cell is given in hexadecimal.

Table 5: Lower bounds on the number of active S-boxes in SKINNY under the related-tweakey setting [BJK+16].