New Properties of the Double Boomerang Connectivity Table

The double boomerang connectivity table (DBCT) is a table proposed recently to capture the behavior of two consecutive S-boxes in boomerang attacks. In this paper, we observe an interesting property of the DBCT of an S-box: for two consecutive S-boxes, the ladder switch and the S-box switch account for most cases, and for some S-boxes only the S-box switch and the ladder switch are possible. This property implies an additional criterion for S-boxes to resist boomerang attacks and provides a new direction for evaluating an S-box. Using an extension of the DBCT, we verify that some boomerang distinguishers of TweAES and Deoxys are flawed. On the other hand, inspired by the property, we put forward a formula for estimating boomerang cluster probabilities. Furthermore, we introduce the first model to search for boomerang distinguishers with good cluster probabilities. Applying the model to CRAFT, we obtain 9-round and 10-round boomerang distinguishers with higher probabilities than those of previous works.


Introduction
Differential cryptanalysis, proposed by Biham and Shamir [BS91], is one of the most powerful techniques to assess the security of block ciphers. The main idea is to search for non-random pairs of input and output differences of the cipher that hold with high probability. In many cases, it is hard to find long differentials. To overcome this restriction, Wagner introduced the boomerang attack in [Wag99], which is a development of differential cryptanalysis. The main idea of boomerang attacks is to combine two short differentials with high probabilities to get a long one. In boomerang attacks, a cipher $E$ is regarded as the composition of two sub-ciphers, i.e., $E = E_1 \circ E_0$. Suppose there exist two differentials $\Delta_1 \to \Delta_2$ for $E_0$ and $\nabla_2 \to \nabla_3$ for $E_1$ with probabilities $p$ and $q$, respectively. Then, assuming the two differentials are independent, this yields a boomerang distinguisher of probability $p^2 q^2$.

Boomerang Attack
In [Wag99], Wagner proposed the boomerang attack, which is grounded in the idea that combining two short differentials may lead to a good long one. On the left of Fig. 1, where a block cipher $E$ is treated as the composition of two sub-ciphers $E_0$ and $E_1$, we suppose there exist two short differentials $\Delta_1 \to \Delta_2$ and $\nabla_3 \to \nabla_2$ with high probabilities $p$ and $q$. Under the assumption that the two differentials are independent, the probability of the boomerang distinguisher of $E$ is $p^2 q^2$.

The sandwich attack, proposed by Dunkelman et al. in [DKS10,DKS14], is an improvement to the boomerang attack. Instead of assuming the two trails are independent, it takes into account the dependency between the two differentials and handles it in a middle part $E_m$, as shown on the right of Fig. 1. Thus, the sandwich attack regards the cipher $E$ as the composition of three sub-ciphers $E = E_1 \circ E_m \circ E_0$, where $E_m$ usually contains a small number of rounds. If the probability of a boomerang coming back over $E_m$ for random inputs $x$ is

$$\Pr\big(E_m^{-1}(E_m(x) \oplus \nabla_2) \oplus E_m^{-1}(E_m(x \oplus \Delta_2) \oplus \nabla_2) = \Delta_2\big) = r,$$

then the probability of the whole boomerang distinguisher is

$$\Pr\big(E^{-1}(E(P) \oplus \nabla_3) \oplus E^{-1}(E(P \oplus \Delta_1) \oplus \nabla_3) = \Delta_1\big) = P_{E_0} \cdot P_{E_m} \cdot P_{E_1} = p^2 \cdot r \cdot q^2.$$
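The return probability $r$ over the middle part can be computed exactly when $E_m$ is tiny. The following is a minimal sketch, not the authors' code: it assumes a toy $E_m$ that consists of a single 4-bit S-box (PRESENT's S-box is used purely as a stand-in), and the function names `middle_return_probability` and `sandwich_probability` are hypothetical.

```python
# PRESENT's 4-bit S-box, used here only as a small stand-in for E_m (an assumption).
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]
INV = [0] * 16
for x, y in enumerate(SBOX):
    INV[y] = x


def middle_return_probability(delta2, nabla2):
    """Exact r for the toy E_m: fraction of inputs x whose boomerang returns."""
    hits = 0
    for x in range(16):
        x3 = INV[SBOX[x] ^ nabla2]           # E_m^{-1}(E_m(x) xor nabla2)
        x4 = INV[SBOX[x ^ delta2] ^ nabla2]  # E_m^{-1}(E_m(x xor delta2) xor nabla2)
        hits += (x3 ^ x4) == delta2
    return hits / 16


def sandwich_probability(p, q, r):
    """p^2 * r * q^2, the probability of the whole distinguisher."""
    return p * p * r * q * q


if __name__ == "__main__":
    for d, n in [(0x1, 0x1), (0x9, 0x4), (0x0, 0x7)]:
        print(hex(d), hex(n), middle_return_probability(d, n))
```

For a one-S-box $E_m$ this count is exactly a BCT entry divided by 16; a zero input or output difference always returns, which is the ladder switch in its simplest form.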

Previous Methods to Search for Boomerang Distinguishers
To find boomerang distinguishers, the classical approach is to search for two short differential characteristics with high probability and then combine them. In [CHP+17], Cid et al. proposed an MILP model for searching boomerang distinguishers on Deoxys, which employs the ladder switch in the combination. Later, Liu et al. gave a more generic MILP model for the block cipher GIFT in [LS19]. Note that in these two works the target cipher is divided into three parts.

Remark. All previous works search for boomerang distinguishers in two steps. The first step finds good truncated characteristics and the second step searches for good instantiations following the obtained truncated characteristics. The clustering effect is significant, especially for word-oriented block ciphers. At present, all previous works consider the clustering effect only after actual characteristics are obtained in the second step. As far as we know, no method is available in the literature to reflect the clustering effect in the first step.

New Properties of Double Boomerang Connectivity Table
In this section, we define the Double Boomerang Connectivity Table (DBCT) and present new properties of it. Then we discuss the extensions of DBCT.

Double Boomerang Connectivity Table (DBCT)
The DBCT as defined below captures the properties of two S-boxes in a row in boomerang attacks, as depicted in Fig. 3 (left).
Definition 6. Let $S$ be a function from $\mathbb{F}_2^n$ to $\mathbb{F}_2^n$. The double boomerang connectivity table (DBCT) is defined as
Like the differential uniformity, a new uniformity can be defined similarly.
Definition 7 (Double Boomerang Uniformity). The double boomerang uniformity of $S$ is the largest value in the DBCT except for the first row and the first column, i.e., $\max_{\alpha_1 \neq 0,\ \beta_3 \neq 0} \mathrm{DBCT}(\alpha_1, \beta_3)$.
Note that $\mathrm{DBCT}(\alpha_1, \beta_3)$ can be represented as the sum of two parts:
(Footnote 4: The DBCT was first introduced in [HBS21] and defined in an algorithmic way. The DBCT notation used in this paper is the same as in [HBS21], but we use a more succinct definition.)
= 0, there are 704, 738, 620, 608, 608, 735, and 735 nonzero values in total, respectively. Considering $\alpha_2 \neq \beta_2$, there are only 72, 36, 0, 0, 0, 0, and 0 nonzero values, respectively. We observe that nonzero values of $\mathrm{dbct}(\alpha_1, \alpha_2, \beta_2, \beta_3)$ occur mainly when $\alpha_2 = \beta_2$. This means that in most cases the UBCT and LBCT used for computing the DBCT degenerate to the DDT, as shown in Equation (1). Thus, the entries of the DBCT can be lower-bounded by a value computed from DDT entries, as formalized in Property 1.
Definition 8 (Hard S-box). Let $S$ be a function from $\mathbb{F}_2^n$ to $\mathbb{F}_2^n$. $S$ is hard if the following holds. For $\forall \alpha_1, \beta_3 \neq 0$,
Remark. A cipher employing hard S-boxes allows only two typical switch effects in two consecutive S-boxes, i.e., the S-box switch and the ladder switch. In other words, a right quartet $(x_1, x_2, x_3, x_4)$ for the two consecutive S-boxes is always composed of two pairs of the same value, i.e., $x_1 = x_4$, $x_2 = x_3$, as illustrated in Fig. 3 (right).
Example 1. A good example of a hard S-box is PRESENT's S-box. Table 1 and Table 2 display PRESENT's DBCT and DDT. It can be seen that for all $i, j \neq 0$, the entry at position $(i, j)$ in the DBCT equals the dot product of the $i$-th row and the $j$-th column of the DDT.
. In this case, the intersection between the two cosets is empty.
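The tables discussed in this subsection can be explored with a short script. The sketch below is illustrative only and rests on two assumptions that are not spelled out in the surviving text: that the DBCT is the sum over $(\alpha_2, \beta_2)$ of UBCT$(\alpha_1, \alpha_2, \beta_2)$ · LBCT$(\alpha_2, \beta_2, \beta_3)$, following the algorithmic description attributed to [HBS21], and that "hard" means every entry with $\alpha_1, \beta_3 \neq 0$ equals the DDT row-by-column product described in Example 1. All function names are hypothetical.

```python
PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def inverse(sbox):
    inv = [0] * len(sbox)
    for x, y in enumerate(sbox):
        inv[y] = x
    return inv


def ddt(sbox):
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for a in range(n):
        for x in range(n):
            table[a][sbox[x] ^ sbox[x ^ a]] += 1
    return table


def ubct(sbox):
    """UBCT[a1][a2][b2]: x with S(x)^S(x^a1) = a2 whose boomerang (shift b2) returns."""
    n, inv = len(sbox), inverse(sbox)
    table = [[[0] * n for _ in range(n)] for _ in range(n)]
    for a1 in range(n):
        for x in range(n):
            a2 = sbox[x] ^ sbox[x ^ a1]
            for b2 in range(n):
                if inv[sbox[x] ^ b2] ^ inv[sbox[x ^ a1] ^ b2] == a1:
                    table[a1][a2][b2] += 1
    return table


def lbct(sbox):
    """LBCT[a2][b2][b3]: x with x ^ S^-1(S(x)^b3) = b2 whose boomerang returns."""
    n, inv = len(sbox), inverse(sbox)
    table = [[[0] * n for _ in range(n)] for _ in range(n)]
    for a2 in range(n):
        for b3 in range(n):
            for x in range(n):
                x3 = inv[sbox[x] ^ b3]
                x4 = inv[sbox[x ^ a2] ^ b3]
                if x3 ^ x4 == a2:
                    table[a2][x ^ x3][b3] += 1
    return table


def dbct(sbox):
    # Assumed definition: sum of UBCT * LBCT over the two middle differences.
    n, upper, lower = len(sbox), ubct(sbox), lbct(sbox)
    return [[sum(upper[a1][a2][b2] * lower[a2][b2][b3]
                 for a2 in range(n) for b2 in range(n))
             for b3 in range(n)] for a1 in range(n)]


def ddt_product(sbox):
    """Row-by-column product of the DDT with itself (the alpha_2 = beta_2 part)."""
    n, d = len(sbox), ddt(sbox)
    return [[sum(d[a][k] * d[k][b] for k in range(n)) for b in range(n)]
            for a in range(n)]


def double_boomerang_uniformity(sbox):
    n, table = len(sbox), dbct(sbox)
    return max(table[a][b] for a in range(1, n) for b in range(1, n))


def is_hard(sbox):
    # Assumed reading of Definition 8: no contribution from alpha_2 != beta_2.
    n, table, prod = len(sbox), dbct(sbox), ddt_product(sbox)
    return all(table[a][b] == prod[a][b] for a in range(1, n) for b in range(1, n))


if __name__ == "__main__":
    print("double boomerang uniformity:", double_boomerang_uniformity(PRESENT))
    print("hard under the assumed definition:", is_hard(PRESENT))
```

Under these assumptions, the first row and column equal $2^{2n}$ and every entry is at least the DDT product, which matches the lower-bound statement of Property 1. Example 1 states that PRESENT's S-box is hard, so `is_hard(PRESENT)` should agree with it if the assumed definitions match the paper's.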

Extensions
One may wonder under what circumstances the DBCT is applicable, i.e., there are two active S-boxes in a row. Basically, the linear layer of the round function needs to be simple so that the output difference of one S-box may exactly be the input difference of another S-box. Indeed, this may happen when the linear layer can be represented with a binary matrix. A natural question would be: what if the linear layer is extremely simple or complex?
In this subsection, we first discuss extensions of DBCT in the case where the linear layer is extremely simple. In this case, multiple active S-boxes in a row are possible. Then we discuss the extension in the case where the linear layer is relatively complex.

Multiple S-boxes. In the case of $t > 2$ consecutive S-boxes, we can likewise define a similar table, called the $t$-BCT. If the S-box is hard, then we have the following proposition for the $t$-BCT.
Proposition 2. Let $S$ be a function from $\mathbb{F}_2^n$ to $\mathbb{F}_2^n$. If $S$ is hard, then for $\forall \alpha, \beta \in \mathbb{F}_2^n \setminus \{0\}$ and $t > 2$,
The proof of Proposition 2 is postponed to Appendix A.1.
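For a hard S-box, Example 1 describes the two-S-box table as a DDT row-by-column product. A natural reading of the $t$-S-box case, stated here as an assumption because the formula of Proposition 2 is not reproduced above, is a chain of $t$ DDT factors summed over all intermediate differences, i.e. the $t$-th power of the DDT viewed as a matrix. The sketch below computes that chain; the name `ddt_chain` is hypothetical and PRESENT's S-box is used only as an example.

```python
PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def ddt(sbox):
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for a in range(n):
        for x in range(n):
            table[a][sbox[x] ^ sbox[x ^ a]] += 1
    return table


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def ddt_chain(sbox, t):
    """Entry [alpha][beta] sums products of t DDT entries over all middle differences."""
    d = ddt(sbox)
    result = d
    for _ in range(t - 1):
        result = mat_mul(result, d)
    return result


if __name__ == "__main__":
    chain = ddt_chain(PRESENT, 3)
    print(max(chain[a][b] for a in range(1, 16) for b in range(1, 16)))
```

Each row of the result sums to $2^{nt}$, since every DDT row sums to $2^n$; this gives a quick sanity check on the chain.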
The AES S-box is an 8-bit S-box, and thus the size of its DBCT is 256 × 256. In the DBCT 0,0 of the AES S-box, all entries for zero input difference (the first row) and zero output difference (the first column) are 65536 owing to the ladder switch effect. For the other    Table 3. We also list the basic DBCT for the AES S-box in Table 3. Note that, the DBCT is related to the linear layer and the S-box. For the basic DBCT with simple XOR operations, the AES S-box is hard without zero entries.
with complex linear layer, most entries are zero.

Revisiting Boomerang Attacks on CRAFT, TweAES and Deoxys-BC
In this section, we revisit some existing boomerang distinguishers of CRAFT, TweAES and Deoxys-BC, respectively. Through the boomerang distinguisher of CRAFT, we demonstrate how DBCT uniformity and hard S-box matter. For the boomerang distinguishers of TweAES and Deoxys-BC, inspired by the property of AES S-boxes with a complex linear layer in between, we verify that two published boomerang distinguishers are flawed using extended DBCT.

Effect of Different S-boxes for Boomerang Distinguishers
CRAFT is a lightweight tweakable block cipher introduced by Beierle et al. [BLMR19] at FSE 2019. For more details of the cipher, please refer to Appendix B.1 and [BLMR19]. In [HBS21], a 13-round boomerang distinguisher of CRAFT is presented, with 7 rounds in the middle to handle the dependency. We redraw the 7-round middle part $E_m$ in Fig. 6, where White cells are zero differences, Yellow cells are nonzero differences and Green cells are unknown differences. The input difference of the upper characteristic is ∆ = [0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] and the output difference of the lower characteristic is ∇ = [0, 0, 0, 0, 0, a, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]. Note that {A, ..., H} are the states for the upper characteristic and {a, ..., h} are the states for the lower characteristic. The symbols follow those in the original work; for details, refer to [HBS21]. A detailed analysis in [HBS21] showed that the boomerang distinguisher of the 7-round $E_m$ involves four

Table 4 compares the uniformity of DDT, BCT and DBCT for different S-boxes, lists whether these S-boxes are hard or not, and gives the probabilities for the 7-round distinguisher under different S-boxes. From Table 4, we have the following observations.
• Even though CRAFT's S-box and PRESENT's S-box share the same DDT and BCT uniformity, the probability of the 7-round distinguisher differs for these two S-boxes. A possible reason for this is that they have different DBCT uniformity.
• Although QARMA's S-box has better BCT uniformity than PRESENT's S-box, it results in a higher probability. We note QARMA's S-box has a higher DBCT uniformity.
• PRESENT's S-box, LBlock-s0 and LBlock-s1 share the same DDT, BCT, and DBCT uniformity; they also lead to almost the same probability for the 7-round distinguisher.
• The S-boxes of MIBS and TWINE have small BCT and small DBCT uniformity; at the same time, the probabilities of the 7-round distinguisher are low and stable for different input and output differences (∆, ∇).
These observations indicate that, apart from the uniformity of the BCT and DDT, the uniformity of the DBCT is a new criterion for evaluating the resistance of an S-box against boomerang attacks. Therefore, the DBCT uniformity should be used together with the BCT uniformity to better evaluate an S-box against the boomerang attack.
As the DBCT is equivalent to the S-box switch in most cases, i.e., a quartet is formed by two pairs of the same value, it is interesting to see what happens when we force the ciphertexts to form such quartets. It is expected that the probability will increase. An experiment on the 7-round boomerang distinguisher of CRAFT confirms this, as shown in Appendix C.1.

Flawed Boomerang Distinguisher of TweAES and Deoxys-BC
For the S-box of AES, the DBCT with a complex linear layer has too many zero values, which easily invalidates boomerang characteristics. Inspired by this, we focus on TweAES and Deoxys-BC and revisit some boomerang distinguishers, finding them flawed. It can be seen that only the third, the fourth and the last rounds are critical to the probability of the distinguisher. Since the differential propagation of the last round is simple, we mainly detail the first six rounds, and Table 11 gives their setting. We recompute the probability of the two middle rounds and find it to be 0 rather than $2^{-75}$ as reported in [CDJ+20]. This means the probability of the full distinguisher is not $2^{-75}$ multiplied by the probability $2^{-48}$ of the last round.

Recompute the Probability of the Boomerang Distinguisher in [CDJ+20]
To compute the probability $P_{E_m}$ of the two middle rounds, we divide the state into four columns and calculate them separately. Without loss of generality, we compute the probability $P_1$ of the second column in detail, as shown in Fig. 7. It is easy to compute $P_1$ by trying all possible $\alpha$, $\beta'_1$, and $\beta'_2$. We then obtain $P_1 = 0$. Hence, the differential propagation for the two middle rounds is impossible, i.e., $P_{E_m} = 0$. Our computation shows that the probability of the 7-round boomerang distinguisher proposed in [CDJ+20] is not correct.

Deoxys-BC in the model RTK2, where RTK1 denotes single-key attacks on any variant with at least 128 bits of tweak and RTK2 denotes single-key attacks on Deoxys-BC-384 with 256 bits of tweak, or related-key attacks on Deoxys-BC-256. For more details of the attack, please refer to [BL22]. We recompute the probability for the middle part of the cipher in the two boomerang attacks and find it to be 0 rather than the high probabilities reported.

8-round boomerang attack in the model RTK1.
We compute the probability for the differential transition over the red boxes in the three middle rounds, as illustrated in Fig. 8. The detailed formula for computing the probability is
We verify that, no matter what $\beta_2$ is, the differential transition from 01 to $\beta_2$ over two S-box layers is incompatible, as $\forall \beta_2 \in \mathbb{F}_2^8 \setminus \{0\}$
So the probability $P$ must be 0, which means the characteristic over the three middle rounds is incompatible.

10-round boomerang attack in the model RTK2.
We compute the probability for the differential transition over the red boxes in the two middle rounds, as depicted in Fig. 9.
For $\forall \alpha \in \mathbb{F}_2^8 \setminus \{0\}$, the probability is
By trying all possible $\alpha$, $\alpha_1$ and $\beta'_1$, we get a zero probability. Therefore, the middle part of the 10-round attack is also incompatible.

Discussion
Even though the basic DBCT cannot be directly applied to the boomerang distinguishers of TweAES and Deoxys-BC, by employing the extensions we confirm that the interactions between two S-box layers matter and should be treated carefully. The source code for computing the probabilities in subsection 4.2 is available at https://www.jianguoyun.com/p/DTV20E4QiPTMChiVlNQEIAA.

MILP Model to Search for Boomerangs with Cluster Probability
It is shown in [SQH19] that the probability of a boomerang distinguisher of SKINNY increases significantly, from $2^{-103.84}$ to $2^{-77.83}$, when the clustering effect is considered. Later, better boomerang distinguishers of SKINNY were proposed by exploiting the clustering effect in [DDV20,HBS21]. Generally, the search for boomerang distinguishers proceeds in two steps. The first is to search for good truncated boomerang characteristics with the fewest active S-boxes, and the second is to search for the best instantiations. Although the clustering effect is very significant for word-oriented block ciphers, it is hard to take it into account properly in these two steps. In fact, in the previous works [DDV20,HBS21], multiple boomerang characteristics are counted only when a good boomerang characteristic is given.
In other words, multiple characteristics are searched under the fixed input and the output difference of a given boomerang characteristic. Due to the limitation of computing and storage capacity, there is no guarantee that the search will lead to boomerang clusters with sufficiently good probability.
To partially overcome the drawbacks, we propose a new strategy to search for boomerang distinguishers. Note that, for a boomerang distinguisher, only the input difference of the upper characteristic and the output difference of the lower characteristics are fixed while the difference of the intermediate state can vary. This motivates us to borrow the methods for calculating the probability of truncated differentials and provide a formula for estimating the probability of a boomerang cluster. In particular, Property 1 is used to simplify the computation of the probability of the middle part E m . In this section, this formula is presented by taking a boomerang distinguisher of the block cipher CRAFT as an example. With this formula, we then propose a new MILP model to search for truncated characteristics with good cluster probability as the objective. The efficiency of the formula and the new model is demonstrated by its application to CRAFT, where better 9-round and 10-round boomerang distinguishers are obtained.

Formula for the Probability of Boomerang Clusters
The existing strategy for searching for good boomerang distinguishers is to search for a single boomerang characteristic with minimal active S-boxes as the objective at first. Our basic idea is that if we could replace the objective function with the cluster probability, it is more likely to obtain good boomerang clusters.
In the following, we formulate the boomerang cluster probability for SPN ciphers via an example of CRAFT under a common assumption used in truncated differential cryptanalysis and then show how to model the probability of clusters with MILP. Note that we consider SPN ciphers with n parallel S-boxes of s bits each in the nonlinear layer.

The previous formula for the probability of boomerang clusters
We assume $E_m$ contains dependency, i.e., the differential probability of its active S-boxes cannot be evaluated by the DDT alone. We denote two DDTs by UDDT2 and LDDT2, where U and L indicate whether the DDT belongs to the upper path or the lower path, and the probability transition formulas are $P_{UDDT2}(\alpha_1, \alpha_2) = (P_{DDT}(\alpha_1, \alpha_2))^2$ and $P_{LDDT2}(\beta_1, \beta_2) = (P_{DDT}(\beta_1, \beta_2))^2$, where $(\alpha_1, \alpha_2)$ is the pair of input and output differences of the S-box in the upper path and $(\beta_1, \beta_2)$ is the pair of input and output differences of the S-box in the lower path. Taking CRAFT as an example, we give the previous formula for the probability of boomerang clusters.
Considering the clusters, the final formula of the probability for $E$ is $p^2 \cdot q^2 \cdot r$.

Our new formula for the probability of boomerang clusters
Next, taking CRAFT as an example, we propose a high-level procedure to generate a formula that approximates the probability of its best boomerang cluster without focusing on a single characteristic.
In [MSAK99], Moriai et al. proposed a method to calculate the truncated differential probability for word-oriented SPN block ciphers. Typically, for two $s$-bit cells $a$ and $b$ which are independently and uniformly distributed on $\mathbb{F}_{2^s} \setminus \{0\}$, the probability distribution of $a \oplus b$ is $\Pr[a \oplus b = 0] = \frac{1}{2^s - 1}$ and $\Pr[a \oplus b \neq 0] = \frac{2^s - 2}{2^s - 1}$. In the boomerang clusters, the intermediate differences can take many possible values as in the truncated differentials, so the above probability distribution also applies here. For $E_0$ (resp. $E_1$), only UDDT2 (resp. LDDT2) is used for computing the probability. Actually, the probability for $E_0$ (resp. $E_1$) is equivalent to the differential probability given a fixed input (resp. output) difference. Thus we can directly compute the probability using the same method as in truncated differential analysis.
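The cancellation probability used in this kind of truncated analysis is easy to confirm by exhaustive enumeration. The following sketch (the helper name `xor_distribution` is hypothetical) enumerates all pairs of nonzero $s$-bit cells and returns the exact distribution of their XOR.

```python
from fractions import Fraction
from itertools import product


def xor_distribution(s):
    """Exact distribution of a ^ b for a, b independent and uniform on nonzero s-bit values."""
    cells = range(1, 2 ** s)
    counts = [0] * (2 ** s)
    for a, b in product(cells, repeat=2):
        counts[a ^ b] += 1
    total = (2 ** s - 1) ** 2
    return [Fraction(c, total) for c in counts]


if __name__ == "__main__":
    dist = xor_distribution(4)
    print("Pr[a ^ b = 0]  =", dist[0])
    print("Pr[a ^ b != 0] =", sum(dist[1:]))
```

For 4-bit cells, as in CRAFT, the zero case has probability 1/15, which is the per-condition factor that appears in the estimates of this section.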
1. Inspired by the idea of truncated differentials, we transform the computation of $p$ and $q$ into counting the equality conditions of XOR operations, which gives the estimates $\tilde{p}$ and $\tilde{q}$. As shown on the left of Fig. 11, 3 cells need to be 0 and the last cell needs to take the fixed value, so the probability is $\tilde{p} = \left(\frac{1}{15}\right)^4 = 2^{-15.63}$ on average if each difference is distributed uniformly on $\mathbb{F}_{2^s} \setminus \{0\}$. Given a specific $\Delta$ as above, the probability $\tilde{p}$ can be further adjusted to $\tilde{p} = \frac{1}{(2^2)^2 \cdot 15^2} = 2^{-11.81}$ on average by taking into account DDT(0xa, *) = 4. Similarly, the probability $\tilde{q}$ is $2^{-11.81}$ on average, for $\forall h_5 \neq 0$.
2. Computing $r$ is complex. For example, in Equation 2, there are 12 variables to be traversed, which is very computationally intensive. We convert $r$ to a form containing only the DDT and further simplify the evaluation using the idea of truncated differentials. We obtain its lower bound by simplifying the computation using Property 1,
$$r = \sum_{B_9, C_{12}, f_{12}, g_9} P_{DDT}(A_5, B_9) \cdot \Pr(B_9 \xleftarrow{4\,\mathrm{DDT}} f_{12}) \cdot P_{DDT}(B_9, C_{12}) \cdot \Pr(C_{12} \xleftarrow{3\,\mathrm{DDT}} f_{12}) \cdot \Pr(C_{12} \xrightarrow{3\,\mathrm{DDT}} f_{12}) \cdot P_{DDT}(f_{12}, g_9) \cdot \Pr(C_{12} \xrightarrow{4\,\mathrm{DDT}} g_9) \cdot P_{DDT}(g_9, h_5),$$
where $B_9 = b_9$, $C_{12} = c_{12}$. Because of the nature of the DBCT, the formula reduces to a model involving only the DDT, and the probability can be calculated using the technique of truncated differential evaluation. As shown in the middle of Fig. 11, 2 cell-wise conditions are consumed in $f_{12}$, 1 cell condition is consumed in $g_9$ and 1 cell condition is consumed in $h_5$. Essentially, because there are 4 UBCT · LBCT pairs, 4 connections are established and 4 cell conditions are consumed. Therefore, the probability of $E_m$ is $\tilde{r} = \left(\frac{1}{15}\right)^4 = 2^{-15.63}$ on average for any nonzero $A_5, h_5 \in \mathbb{F}_{2^4}$.
3. With the above techniques, the calculation of the probability of $E$ becomes very simple: $\widetilde{\Pr} = \sum_{\Delta_1, \nabla_1 \neq 0} \tilde{p}^2 \cdot \tilde{q}^2 \cdot \tilde{r} = 15^2 \cdot 2^{-11.81 \cdot 2} \cdot 2^{-11.81 \cdot 2} \cdot 2^{-15.63} = 2^{-55.06}$.
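The arithmetic in the three steps above can be re-derived in a few lines. This is only a numerical check of the exponents quoted in the text, with variable names chosen for illustration; the adjusted $\tilde{p}$ is taken as two factors of $2^{-2}$ and two factors of $1/15$, which is the combination that reproduces the stated $2^{-11.81}$.

```python
from math import log2

S = 4                 # cell size of CRAFT in bits
NONZERO = 2 ** S - 1  # number of nonzero cell values, 15

# Step 1: four cell conditions, each with probability 1/15.
log_p_uniform = -4 * log2(NONZERO)
# Step 1, adjusted: two conditions at 2^-2 and two conditions at 1/15.
log_p_adjusted = -2 * 2 - 2 * log2(NONZERO)
log_q_adjusted = log_p_adjusted
# Step 2: four cell conditions consumed in E_m.
log_r = -4 * log2(NONZERO)
# Step 3: sum over 15 * 15 nonzero pairs of p^2 * q^2 * r.
log_total = 2 * log2(NONZERO) + 2 * log_p_adjusted + 2 * log_q_adjusted + log_r

if __name__ == "__main__":
    print(round(log_p_uniform, 2), round(log_p_adjusted, 2), round(log_total, 2))
```

Working with unrounded exponents gives a total within about 0.01 of the quoted $2^{-55.06}$; the small gap comes from the text multiplying already rounded values.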

Figure 11: The difference propagation of $E_0$ (left), the difference propagation of $E_m$ (middle) and the difference propagation of $E_1$ (right)

It can be inferred from the DBCT and the technique borrowed from truncated differential analysis that the stronger the S-box is, the better our computation approximates the actual probability. We then replace CRAFT's S-box with other S-boxes and compute the probability under all possible input and output differences (∆, ∇). The results are summarized in Table 5. It can be seen that the probability given by our formula is closer to the actual optimal probability when a stronger S-box is used.
Remark 1. Note that our formula is valid only if the characteristics are the same in both faces of the boomerang at $E_m$. In general, the two faces of the boomerang could have completely different differential characteristics. Because we study the property of the DBCT for identical characteristics in both faces, we do not take this into account.
Now we give the general formula for estimating the probability of the best boomerang cluster under certain active patterns.
Probability in E 0 /E 1 . Suppose E 0 covers the first r 0 rounds, E 1 consists of the last r 1 rounds. For ∀∆, ∆ 1 , ∇ 1 , ∇ ̸ = 0, the probability are P E0 (∆ ⇄ ∆ 1 ) =p 2 and P E1 (∇ 1 ⇄ ∇) =q 2 on average, i.e.,p = 2 −s·c0 · 1 where c 0 and c 1 are the number of cells which need to be zero from uniformity and s is the cell size. For the sake of simplicity, we substitute 2 −s for 1 2 s −1 .
Probability in E m . Suppose E m is composed of the middle r m rounds. For ∀∆ 1 , ∇ 1 ̸ = 0 the probability is P Em (∆ 1 ⇄ ∇ 1 ) =r on average and where c m is the condition consumed in E m . Usually, the characteristic of E m is complex and c m could not be determined easily. We make simplifications using Property 1 and then directly evaluate the probability by counting the number of conditions consumed. In the following, we discuss the computation of c m in different situations.
• → : As with $E_0$ and $E_1$, the difference in $E_m$ is not constrained to propagate with probability 1. When one cell needs to become zero (White) from uniformity (Green) in the upper or lower path, the probability has to be multiplied by $2^{-s}$. Equivalently, it consumes 1 condition.
• UDDT2 and LDDT2: UDDT2 is independent of the lower path. When we process an S-box where UDDT2 applies, the probability has to be multiplied by $\sum_{\beta \in \mathbb{F}_{2^s}} P_{DDT}(\alpha, \beta)^2 \geq 2^{-s}$ for $\forall \alpha \in \mathbb{F}_{2^s}$. Equivalently, one UDDT2 consumes 1 cell condition. Similarly, LDDT2 is independent of the upper path, and one LDDT2 consumes 1 condition.
• UBCT · EBCT$^m$ · LBCT: While EBCT may or may not exist, UBCT and LBCT must appear in pairs (otherwise, the pair degenerates into a BCT). Due to Property 1 and Definition 8, for UBCT · EBCT$^m$ · LBCT with $m \geq 0$, the effect is almost an S-box switch. Thus, to satisfy an S-box switch, the probability is $2^{-s}$ on average. This is the same as in the truncated differential technique and is equivalent to consuming 1 condition for one pair of UBCT and LBCT.
Thus, the number of conditions consumed in $E_m$ is the sum of the number of cells which need to be zero from uniformity, the number of UDDT2 and LDDT2, the number of UBCT · EBCT$^m$ · LBCT and the number of BCT.
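The bound used for UDDT2 in the list above, namely that the sum of squared DDT probabilities over all output differences is at least $2^{-s}$, can be checked directly for a concrete S-box. The sketch below uses PRESENT's S-box purely as an example and a hypothetical helper `uddt2_weight`; it illustrates the inequality and is not part of the authors' model.

```python
from fractions import Fraction

PRESENT = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
           0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def ddt(sbox):
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for a in range(n):
        for x in range(n):
            table[a][sbox[x] ^ sbox[x ^ a]] += 1
    return table


def uddt2_weight(sbox, alpha):
    """Sum over beta of P_DDT(alpha, beta)^2 for a fixed input difference alpha."""
    n = len(sbox)
    return sum(Fraction(count, n) ** 2 for count in ddt(sbox)[alpha])


if __name__ == "__main__":
    for alpha in range(1, 16):
        print(hex(alpha), uddt2_weight(PRESENT, alpha))
```

The inequality follows from the Cauchy-Schwarz inequality, since the probabilities in one DDT row sum to 1 over at most $2^s$ output differences, so counting one UDDT2 as one condition is a conservative simplification.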
Probability in E. The probability of a boomerang distinguisher of E is: where c ′ 0 is the number of UDDT2 in the upper boundary round and c ′ 1 is the number of LDDT2 in the lower boundary round.
Remark 2. In the above formula, all tables consume the same number of conditions. However, given an exact S-box, the consumption of condition for different tables may differ from 1 and a proper coefficient can be used to have a more accurate estimation.
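The counting rule of this subsection, where each consumed condition contributes a factor of about $2^{-s}$, and the per-table coefficients suggested in Remark 2 can be combined in a small helper. This is a sketch under stated assumptions: the function name, the dictionary keys and the example coefficient are hypothetical, and the estimate is simply $-s$ times the weighted number of conditions.

```python
def estimate_log2_probability(condition_counts, s, coefficients=None):
    """Return log2 of the estimate 2^(-s * weighted number of conditions).

    condition_counts maps a label (e.g. a table type) to how many times it occurs;
    coefficients optionally rescales the cost of one occurrence (default 1.0).
    """
    coefficients = coefficients or {}
    total = sum(coefficients.get(name, 1.0) * count
                for name, count in condition_counts.items())
    return -s * total


if __name__ == "__main__":
    counts = {"zero_cells": 2, "uddt2_lddt2": 1, "ubct_lbct": 1}
    print(estimate_log2_probability(counts, s=4))
    # Hypothetical coefficient: treat one UBCT/LBCT pair as half a condition.
    print(estimate_log2_probability(counts, s=4, coefficients={"ubct_lbct": 0.5}))
```

Keeping the counts per table type, rather than a single total, makes it straightforward to plug in S-box-specific coefficients later without changing how the conditions are collected.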

MILP Model to Search for Boomerangs with Good Cluster Probabilities
In this subsection, we give our MILP model for searching boomerangs, which takes the number of conditions consumed as the objective function. This model can be used alone to obtain good truncated boomerangs. Particularly, good boomerang clusters can be found if we instantiate the input and output differences for the obtained truncated ones.

Notions.
We consider $E$, a classical SPN cipher whose round function is composed of cell-level operations. Let $E$ have $N_r$ rounds and an $n$-cell state.

1. We use two binary variables to encode whether the difference of a cell is free or controlled and whether its difference value is known or unknown. A free difference can take any (nonzero) value uniformly, while a controlled difference cannot. Notably, a White cell is controlled, a Green cell is free and a Yellow cell is indeterminate. More specifically,
(0, 1): the difference is nonzero and controlled;
(1, 0): the difference is nonzero and free;
(1, 1): the difference is unknown and free.

2. For different tables, the definitions are:

Modeling of the Attribute Propagation through SubBytes(S-RULE).
The SubBytes operation does not change the activeness of a cell, but would change its difference from free to controlled, i.e.,

Modeling of the Upper Boundary and Lower Boundary.
The target cipher $E$ is segmented into three parts automatically, such that the overall probability is maximized. We use two sets of variables
Remark 3. For the upper characteristic, the S-RULE, the upper boundary tag$_u$, and the XOR-RULE propagate forward from the $i$-th round to the $(i+1)$-th round. For the lower characteristic, in turn, the S-RULE, the lower boundary tag$_l$, and the XOR-RULE propagate backward from the $(i+1)$-th round to the $i$-th round.
Objective Function. According to Equation 3, the objective to minimize is the number of conditions consumed for $E$: We use the boundary tags tag$_u$ and tag$_l$ to automatically segment $E$ into $E_0$, $E_m$ and $E_1$, and use the variables $c_u$ and $c_l$ to identify the conditions consumed in XOR operations. Then the objective function is

Discussion
Similar to [DDV20], our model has the advantage of handling dependencies in the middle rounds automatically without specifying E m in advance. Besides, our model has two remarkable features as follows.
1. It incorporates Property 1 and Proposition 1 of the DBCT so as to evaluate the probability of $E_m$ more accurately. Specifically, UBCT · EBCT$^t$ · LBCT, which involves $t + 2$ active S-boxes, actually consumes only about one condition, i.e., contributes a probability of about $2^{-s}$. Therefore, our model reflects the probability of $E_m$ more accurately than just counting the number of active S-boxes of $E_m$, as has been done in previous works [DDV20,HBS21].
2. The clustering effect in both $E_0$ and $E_1$ is also well considered. We use the variables tag$_u$/tag$_l$ to mark the boundaries of $E_m$ so that the technique borrowed from truncated differential analysis can be applied to take into account the clustering effect in $E_0$ and $E_1$.
As a result, our model is more likely to offer a good boomerang cluster when the input and output differences are instantiated, which will be exemplified in the next subsection. The basic idea of modelling clusters' probability, which transforms calculating the probability to simply recording the condition consumed, can be generalized to other attacks for word-oriented block ciphers, such as the boomeyong attack [RSP21], the mixture differential cryptanalysis [Gra18], and the retracing boomerang attack [DKRS20], which embedded yoyo within a boomerang.

Boomerang Clusters by Applying the New Modeling
For a specific block cipher, the first step is to use our model to get a good boomerang cluster with truncated input and output differences together with the corresponding approximate probability. The second step is to instantiate the input and output differences and obtain the exact probability by experiments or computations if possible.
We apply the new model to CRAFT and obtain boomerang distinguishers of 6-14 rounds, including a new 9-round and a new 10-round boomerang distinguisher with higher probabilities than the ones presented in [HBS21]. Fig. 12 depicts the 9-round and the 10-round boomerang distinguishers, where the 10-round boomerang distinguisher is obtained by appending one round to the 9-round boomerang distinguisher. They have the same 6-round $E_m$, which is determined automatically by the MILP model. The input and output differences in the 9-round distinguisher are chosen as follows: The estimated probability by our method is: The experimental probability is about $2^{-12.95}$, which is higher than the probability $2^{-14.50}$ of the 9-round boomerang distinguisher in [HBS21]. The input and output differences of the 10-round boomerang distinguisher are as follows: ∆ = 0x000a00aa0000000a, ∇ = 0x0000a00000000a00.
And the estimated probability by our method is: The experimental probability is approximately $2^{-16.40}$, which is higher than the probability $2^{-18.17}$ of the 10-round boomerang distinguisher in [HBS21]. Our source code is provided at https://drive.google.com/file/d/1DIExHZpL0rbv9h1Ma0JrCXC0b3QpQMqR/view?usp=sharing.

Conclusion
In this paper, we observe an interesting property of the DBCT: in boomerang attacks, the ladder switch and the S-box switch constitute most cases for two consecutive S-boxes, and all cases for certain S-boxes. The meaning of this observation is at least twofold. From the point of view of cryptanalysis, when there is strong dependency between the two differential trails (this is the case for many lightweight ciphers, such as CRAFT), the DBCT helps to capture the dependency easily, and when the S-box is hard, the treatment of dependency can be simplified further, which was not revealed in previous works. For hard S-boxes with a complex linear layer, the property of the extension of the DBCT also shows that the interactions between two S-box layers matter and should be treated carefully to avoid proposing flawed boomerang distinguishers. From the point of view of designers, for a cipher using a lightweight linear layer, one needs to pay more attention to the DBCT uniformity when choosing the S-box.

Due to $\alpha_2 = \beta_2$, we get $\alpha = \beta'$ and $\alpha_3 = \beta_3$. Thus Similarly, it can be extended directly to the case with more consecutive S-boxes, i.e.,

A.2 Proof of objective function
= c m − X.

B.1 Specification of CRAFT
CRAFT is a lightweight tweakable block cipher which was introduced by Beierle et al. [BLMR19] at FSE 2019. CRAFT supports a 64-bit message, a 128-bit key and a 64-bit tweak, and its round function is composed of involutory operations. The round function is shown in Fig. 13 and its operations are listed as follows:
• MixColumns (MC): The MC layer is the multiplication of the internal state by the following binary matrix:
• AddRoundConstants (ARC): A constant is XORed to cells 4 and 5 of the state.
• Sbox(SB): CRAFT uses a 4-bit involutory S-box, the detail is given in Table 9.
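The three operations listed above can be sketched in Python as follows. This is only an illustrative sketch: the binary matrix `MC` and the table `SBOX` do not appear in the text above and are reproduced from our reading of the CRAFT specification [BLMR19], so they should be checked against that paper and Table 9. The function names, the flat 16-nibble state layout (cell index = 4·row + column) and the parameters `a` and `b` for the round constants are our own hypothetical choices.

```python
# ASSUMPTION: MC and SBOX are taken from our reading of the CRAFT
# specification [BLMR19]; verify them against the original paper and Table 9.
MC = [
    [1, 0, 1, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]
SBOX = [0xC, 0xA, 0xD, 0x3, 0xE, 0xB, 0xF, 0x7,
        0x8, 0x9, 0x1, 0x5, 0x0, 0x2, 0x4, 0x6]


def mix_columns(state):
    """Multiply every column of the 4x4 nibble state by MC over GF(2)."""
    out = [0] * 16
    for col in range(4):
        for row in range(4):
            acc = 0
            for k in range(4):
                if MC[row][k]:
                    acc ^= state[4 * k + col]
            out[4 * row + col] = acc
    return out


def add_round_constants(state, a, b):
    """XOR a 4-bit constant a into cell 4 and a 3-bit constant b into cell 5."""
    out = list(state)
    out[4] ^= a & 0xF
    out[5] ^= b & 0x7
    return out


def sub_cells(state):
    """Apply the 4-bit S-box to each of the 16 cells."""
    return [SBOX[x] for x in state]
```

Since the round function is composed of involutory operations, applying any of these functions twice (with the same constants for `add_round_constants`) should return the original state, which gives a quick sanity check on the tables.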

B.2 Specification of TweAES
The tweakable block cipher TweAES is one of the underlying primitives of the Authenticated Encryption with Associated Data (AEAD) scheme ESTATE [CDJ+20], which is a second-round candidate of the NIST Lightweight Cryptography Standardization project. It is tweaked from AES-128 [DR02] and takes as input a 4-bit tweak, a 128-bit key and a 128-bit block. Its round function has five operations, which are identical to those of AES except for AddTweak. Next, we briefly describe the round function of TweAES.
• ShiftRows: The bytes in the i-th row are cyclically shifted by i places to the left.
• MixColumns: Multiply each column with an invertible MDS matrix.
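As an illustration of the two operations listed above, the following Python sketch implements ShiftRows and MixColumns on a state stored as four rows of four bytes. The matrix circ(2, 3, 1, 1) and the reduction polynomial $x^8 + x^4 + x^3 + x + 1$ are those of standard AES [DR02]; the function names and the row-major state layout are our own choices and are not part of the TweAES specification.

```python
def xtime(b):
    """Multiply a byte by 2 in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    b <<= 1
    return (b ^ 0x11B) & 0xFF if b & 0x100 else b


def shift_rows(state):
    """Rotate row i of the state (a list of 4 rows of 4 bytes) left by i."""
    return [row[i:] + row[:i] for i, row in enumerate(state)]


def mix_single_column(col):
    """Multiply one 4-byte column by the AES MDS matrix circ(2, 3, 1, 1)."""
    a0, a1, a2, a3 = col
    return [
        xtime(a0) ^ (xtime(a1) ^ a1) ^ a2 ^ a3,
        a0 ^ xtime(a1) ^ (xtime(a2) ^ a2) ^ a3,
        a0 ^ a1 ^ xtime(a2) ^ (xtime(a3) ^ a3),
        (xtime(a0) ^ a0) ^ a1 ^ a2 ^ xtime(a3),
    ]


def mix_columns(state):
    """Apply mix_single_column to each of the four columns of the state."""
    cols = [mix_single_column([state[r][c] for r in range(4)]) for c in range(4)]
    return [[cols[c][r] for c in range(4)] for r in range(4)]
```

A single active byte entering `mix_columns` spreads to all four bytes of its column, which is the diffusion property that makes the interaction between two consecutive S-box layers relevant for boomerang analysis.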

B.3 Specification of Deoxys-BC
Deoxys-BC is an AES-based tweakable block cipher [JNPS16], based on the tweakey framework [JNP14]. The Deoxys authenticated encryption scheme makes use of two versions of the cipher as its internal primitive: Deoxys-BC-256 and Deoxys-BC-384. Both versions are ad-hoc 128-bit tweakable block ciphers which, besides the two standard inputs, a plaintext P (or a ciphertext C) and a key K, also take an additional input called a tweak T. The concatenation of the key and tweak states is called the tweakey state. For Deoxys-BC-256 the tweakey size is 256 bits.
Deoxys-BC is an AES-like design, i.e., it is an iterative substitution-permutation network (SPN) that transforms the initial plaintext (viewed as a 4 × 4 matrix of bytes) using the AES round function. The main differences from AES are the number of rounds and the round subkeys used in each round. Deoxys-BC-256 has 14 rounds.
Similarly to the AES, one round of Deoxys-BC has the following four transformations applied to the internal state in the order specified below:
• AddRoundTweakey - XOR the 128-bit round subtweakey to the internal state.
• SubBytes - Apply the 8-bit AES S-box to each of the 16 bytes of the internal state.
• MixColumns - Multiply the internal state by the 4 × 4 constant MDS matrix of AES.
After the last round, a final AddRoundTweakey operation is performed to produce the ciphertext.
We denote the concatenation of the key K and the tweak T as KT, i.e., $KT = K \| T$. The tweakey state is then divided into 128-bit words. More precisely, in Deoxys-BC-256 the size of KT is 256 bits, with the first (most significant) 128 bits of KT being denoted $W_2$; the second word is denoted by $W_1$. Finally, we denote by $STK_i$ the 128-bit subtweakey that is added to the state at round i during the AddRoundTweakey operation. For Deoxys-BC-256, a subtweakey is defined as

The 128-bit words $TK^1_i$, $TK^2_i$ are outputs produced by a special tweakey schedule algorithm, initialised with $TK^1_0 = W_1$ and $TK^2_0 = W_2$ for Deoxys-BC-256. The tweakey schedule algorithm is defined as
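To make the structure of the tweakey schedule concrete, the following Python sketch generates subtweakeys for Deoxys-BC-256. It is written under stated assumptions and is not a reference implementation: the byte permutation `H_PERM`, the indexing convention used in `h`, the bit-level update `lfsr2`, and the rule $STK_i = TK^1_i \oplus TK^2_i \oplus RC_i$ reflect our recollection of the Deoxys specification [JNPS16] and are not given in the text above. The round constants are not reproduced here; they are passed in through the hypothetical parameter `round_constants`.

```python
# ASSUMPTION: H_PERM, the indexing convention in h(), and lfsr2() follow our
# recollection of the Deoxys specification [JNPS16]; verify before relying on them.
H_PERM = [1, 6, 11, 12, 5, 10, 15, 0, 9, 14, 3, 4, 13, 2, 7, 8]


def h(tk):
    """Byte permutation: output byte i is input byte H_PERM[i] (assumed convention)."""
    return [tk[H_PERM[i]] for i in range(16)]


def lfsr2(byte):
    """Map the bits (x7, ..., x0) of a byte to (x6, ..., x0, x7 ^ x5)."""
    return ((byte << 1) & 0xFF) | (((byte >> 7) ^ (byte >> 5)) & 1)


def subtweakeys_256(w1, w2, round_constants):
    """Yield one 16-byte subtweakey per supplied round constant.

    w1, w2: 16-byte lists used as TK1_0 and TK2_0.
    round_constants: iterable of 16-byte lists (hypothetical parameter).
    """
    tk1, tk2 = list(w1), list(w2)
    for rc in round_constants:
        yield [a ^ b ^ c for a, b, c in zip(tk1, tk2, rc)]
        tk1 = h(tk1)                          # TK1 is only permuted
        tk2 = h([lfsr2(b) for b in tk2])      # TK2 is updated bytewise, then permuted
```

Because both `h` and `lfsr2` are linear bijections, a difference injected in the tweakey propagates deterministically through all subtweakeys, which is what related-tweakey boomerang trails on Deoxys-BC exploit.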

C.1 A Note on the Boomerang Attack with Swapped Ciphertexts
Recall that the probability of the 7-round boomerang distinguisher of CRAFT involves four DBCTs. As the DBCT is equivalent to the S-box switch in most cases, i.e., a quartet is formed by two pairs of the same value, we check how the probability changes when new pairs of ciphertexts are generated by swapping certain cells of the obtained pairs of ciphertexts, as the attacker does in the yoyo attack. We reuse the 7-round boomerang distinguisher of CRAFT and test three kinds of S-boxes, namely the S-boxes of CRAFT, PRESENT and TWINE. Note that the latter two are hard S-boxes. We then consider three pairs of boomerang attacks as follows.
Case A: Standard boomerang distinguisher with the exact input and output differences $(\Delta_{in}, \Delta_{out})$ that allow the highest probability.

Note that Case B is a variant of the standard boomerang attack which allows a higher probability at the cost of a lower signal-to-noise ratio. Case B has been used in attacks against AES and Deoxys-BC [Sas18]. Case C' is in fact the yoyo attack. We conducted an experiment and the probabilities are summarized in Table 10, from which we make two observations.
• For each pair of cases like (A, A'), the probability is increased with swapped ciphertexts, and the increase is more significant for PRESENT's S-box and TWINE's S-box. This is reasonable, as these two kinds of S-boxes are hard and thus only allow the S-box switch for the DBCT.
• The probabilities in Case B and C (or B' and C') are very close for PRESENT's S-box and TWINE's S-box, both of which have good BCT and DBCT uniformity. That is, there is no special input difference $\Delta_{in}$ leading to a much higher probability than others, and truncated ones are good enough. This suggests that searching for truncated boomerang distinguishers with good probability is an idea worth trying. We try this idea in Section 5.

Table 11: Setting of the first six rounds of the boomerang distinguisher of TweAES

Round   State difference before SubBytes     Tweakey difference
1       $\Delta_1$ = 0x1100110000000000      $\Delta TK_1$ = 0x1100110000000000
7       $\nabla_7$ = 0x0000000000000000      $\nabla TK_7$ = 0x0011001100000000

00 00 00 00 01 00 00 00 00 00 00 00 00 00 00 00
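The BCT uniformity referred to in the observations of Appendix C.1 can be computed directly from an S-box table. The Python sketch below builds the DDT and the BCT, using the standard BCT definition that counts the x satisfying $S^{-1}(S(x) \oplus \nabla) \oplus S^{-1}(S(x \oplus \Delta) \oplus \nabla) = \Delta$, and reports the largest entry over non-zero differences. The three S-box tables are assumptions: they are not listed in the text above and are reproduced from the respective cipher specifications as we recall them. The names `ddt`, `bct` and `uniformity` are our own. The DBCT is not computed here.

```python
# ASSUMPTION: S-box tables reproduced from the cipher specifications as we
# recall them; check them against the original papers before use.
SBOXES = {
    "CRAFT":   [0xC, 0xA, 0xD, 0x3, 0xE, 0xB, 0xF, 0x7,
                0x8, 0x9, 0x1, 0x5, 0x0, 0x2, 0x4, 0x6],
    "PRESENT": [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2],
    "TWINE":   [0xC, 0x0, 0xF, 0xA, 0x2, 0xB, 0x9, 0x5,
                0x8, 0x3, 0xD, 0x7, 0x1, 0xE, 0x6, 0x4],
}


def ddt(sbox):
    """Difference distribution table: ddt[d_in][d_out] counts matching x."""
    n = len(sbox)
    table = [[0] * n for _ in range(n)]
    for x in range(n):
        for d_in in range(n):
            table[d_in][sbox[x] ^ sbox[x ^ d_in]] += 1
    return table


def bct(sbox):
    """Boomerang connectivity table: bct[d_in][d_out] counts returning x."""
    n = len(sbox)
    inv = [0] * n
    for x, y in enumerate(sbox):
        inv[y] = x
    table = [[0] * n for _ in range(n)]
    for d_in in range(n):
        for d_out in range(n):
            table[d_in][d_out] = sum(
                1 for x in range(n)
                if inv[sbox[x] ^ d_out] ^ inv[sbox[x ^ d_in] ^ d_out] == d_in
            )
    return table


def uniformity(table):
    """Largest entry over non-zero input and output differences."""
    n = len(table)
    return max(table[a][b] for a in range(1, n) for b in range(1, n))


if __name__ == "__main__":
    for name, s in SBOXES.items():
        print(name, "DDT:", uniformity(ddt(s)), "BCT:", uniformity(bct(s)))
```

Every BCT entry is at least the corresponding DDT entry, so a BCT uniformity equal to the DDT uniformity indicates that the S-box gives the boomerang no advantage beyond ordinary differentials in a single layer.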