Finding Bit-Based Division Property for Ciphers with Complex Linear Layers

. The bit-based division property (BDP) is the most eﬀective technique for ﬁnding integral characteristics of symmetric ciphers. Recently, automatic search tools have become one of the most popular approaches to evaluating the security of designs against many attacks. Constraint-aided automatic tools for the BDP have been applied to many ciphers with simple linear layers like bit-permutation. Constructing models of complex linear layers accurately and eﬃciently remains hard. A straightforward method proposed by Sun et al. (called the S method), decomposes a complex linear layer into basic operations like COPY and XOR , then models them one by one. However, this method can easily insert invalid division trails into the solution pool, which results in a quicker loss of the balanced property than the cipher itself would. In order to solve this problem, Zhang and Rijmen propose the ZR method to link every valid trail with an invertible sub-matrix of the matrix corresponding to the linear layer, and then generate linear inequalities to represent all the invertible sub-matrices. Unfortunately, the ZR method is only applicable to invertible binary matrices (deﬁned in Deﬁnition 3).


Introduction
The division property, proposed as a generalized integral property at Eurocrypt 2015 [Tod15b], has been applied to many symmetric ciphers.It has improved many previous integral distinguishers for block ciphers, and a remarkable application of the division property was that, for the first time, it broke the full MISTY1 [Mat97] at Crypto 2015 [Tod15a].Furthermore, the division property enhanced significantly the cube attacks which are the most powerful attacks for stream ciphers and those providing authenticated encryption properties, by eliminating the biggest limit of the classical cube attacks -that only practical-size cubes could be chosen [TIHM17, WHT + 18, HIJ + 19, WHG + 19].
Since its proposal, the division property has been further investigated for more applications.The original division property (word-based division property) in [Tod15b] propagates through the successive rounds of a cipher by capturing some information resulting from the algebraic degree of the round function.However, since it treats the round function at the word level, by its nature some propagation information through it cannot be captured.In order to further consider the Boolean properties of the Sboxes, Boura and Canteaut [BC16] gave a more precise description for the division property at Crypto 2016.The bit-based division property (BDP), which was proposed at FSE 2016 [TM16], treats the components of the target primitive at the bit level so that more information in the structures can be used.Compared with the word-based division property, the BDP is more likely to find better integral characteristics.
However, it is tedious to trace the BDP by taking a commonly used programming language such as C, because the time and memory complexities of such programs are usually estimated as O(2 n ) where n is the block size.This is the exact reason why integral distinguishers based on the BDP could only be found in [TM16] for SIMON-32 [BSS + 15] and SIMECK-32 [YZS + 15] where the block size of the ciphers is limited to 32 bits.In order to overcome this bottleneck, Xiang et al. transformed the searching for the BDP to Mixed Integral Linear Programming (MILP) problems at Asiacrypt 2016 [XZBL16], which had been commonly applied to search for differential and linear characteristics [MWGP11, SHW + 14, FWG + 16, CLCW19].As a result, the BDP of ciphers with the block sizes much larger than 32 bits can be obtained efficiently, based on which improved integral attacks for many ciphers can be achieved [SWW20, FTIM17, WGR18, SWLW18].Since then, MILP-aided automatic search techniques have become the dominant tools for finding the BDP.Later, some automatic search tools aided by Boolean satisfiability problem (SAT) [Coo71] and Satisfiability Modulo Theories (SMT) [BSST09] are also built to find the BDP [SWW17,HW19] .
Symmetric primitives need to be non-linear, and they are composed of non-linear and linear components.It is common to base a cipher on Sboxes to facilitate the non-linear/confusion property.Diffusion properties can typically be achieved by linear operations including simple ones such as bit-permutation (e.g. the linear layer of PRESENT [BKL + 07]) or complex ones such as an MDS matrix (e.g. the MixColumns operation of the Advanced Encryption Standard (AES) [DR02]).The automatic search models for the division property including the MILP model and SMT/SAT model describe these components as constraints and call the corresponding solvers to find whether there is a solution to the model.If the result returned by the solver shows there is no solution, then balanced properties are satisfied.Components such as Sboxes, and basic operations COPY, AND, and XOR have been modeled well in [XZBL16].Thanks to these, the BDP propagations for ciphers with a bit-permutation linear layer can be easily handled.However, the problem of how to model the complex linear layers, e.g. the multiplication with an MDS matrix has remained.So far two methods have been proposed to solve this problem.S method.Any (complex) linear operation can be decomposed into a sequence of basic operations of COPY and XOR.The core of the S method is to model these corresponding basic operations with some auxiliary variables instead of modeling the complex linear layer directly.As for the basic operations, they have been handled very well in [XZBL16].
The obvious advantage of the S method is that it can be applied to all those kinds of complex linear layers.However, the crucial limitations of the S method is that it does not consider the cancellation between terms so it will bring in some invalid division trails, which will make the unit vectors appear in advance.In other words, the S method might make us miss the best integral property.
Another problem of the S method is that sometimes it cannot efficiently find the BDP for some ciphers.For instance, in terms of the 6-round LED cipher, with 52 active bits as the input, the S method cannot judge if a balanced property exists [SWW20] even after 24 hours.Why this is so slow is unknown, because it may depend on the internals of the MILP solvers it calls.ZR method.An invertible linear layer L : F n 2 → F n 2 can be represented as a matrix M ∈ F n×n 2 .In [ZR19], Zhang and Rijmen constructed a one-to-one relation between BDP of the matrix M and the invertibility of the sub-matrices of M .Then they generated a system of inequalities to describe the BDP through these sub-matrices and called MILP solvers to judge the validity of the trails.Therefore, the ZR method can detect every valid division trail accurately.
Unfortunately, the Achilles heel of the ZR method is that it can only be applied to a so-called binary matrix (defined in Definition 3).Some simple examples for the binary matrix are the MixColumn(s)1 matrices used in SKINNY [BJK + 16] and Midori [BBI + 15].As far as we know, non-binary linear layers cannot be evaluated accurately using BDP because in the ZR method the huge number of linear inequalities to describe all the sub-matrices of the matrix in the linear layer will overwhelm the capacity of any MILP solver.However, the security of some popular designs such as the AES [DR02] and ISO standard CLEFIA [SSA + 07] depends on non-binary mappings.Therefore, it is important to build a new automatic model to efficiently describe the BDP of the non-binary linear layers.

Our Contributions
In this paper, we construct a new model to describe the BDP which is significantly superior to the previous ZR or the S methods.Compared with these two methods, our new model has three advantages as follows.
Applicable to non-binary matrices of a large size.Unlike the ZR method, our approach does not build a huge system of linear inequalities representing all the sub-matrices of the matrix, but it enables us to judge the validity of a candidate division trail by checking the invertibility of its corresponding sub-matrix.Therefore, the computing scale of our model can be tackled by many constraint-based solvers, which makes our method practical.
One of our applications is to model the MDS matrices which cannot be handled by the ZR method.In the ZR method, if one wants to model the MDS matrix of the AES MixColumns operation, one has to generate (2 32 − 1) linear inequalities to represent the BDP of it, which makes it infeasible.As a result, there is no guarantee that no key-independent integral distinguishers for the 5-round AES exist.However, the 5-round AES is an important primitive which has been used in many new designs [WP14].By taking our method, for the first time, the BDP of the AES matrix can be accurately traced and we prove that there are indeed no 5-round key-independent distinguishers for the AES based on the BDP.Although our result does not break the previous records for the AES key-independent distinguisher, we answer the open question whether a longer integral distinguisher can be found if we can accurately search for the BDP of its MDS matrix.Our new method provides more confidence in the security of the 5-round AES and designs based on it.
Applicable to non-invertible matrices.In our model, we remove the condition from the ZR method that only invertible matrices can be considered.We search for the 5-round key-dependent integral distinguisher proposed by Sun et AES , and x 0 , x 1 and x 2 take all the possible values, i.e., x takes all the 2 24 values.The four bytes of the output y = (y 0 , y 1 , y 2 , y 3 ) T must have at least one byte y i taking 2 8 values.It requires that the first two bytes of the input are always equal.To check this property by using the BDP, we could prepare a shrunk matrix T and x takes all the 2 24 values, then the output is just y = (y 0 , y 1 , y 2 , y 3 ) T .In the BDP model for M shrunk , suppose the input vector is u = (ff,ff,ff), i.e. the division property of the input is D 24 u , then the output vector 2 8 should have at least one v i such that v i = ff.This property cannot be checked by the ZR method, because it is infeasible to generate a huge number of linear inequalities to present all the invertible sub-matrices.Nor can it be detected by the S method, because the S method will insert invalid division trails that contain vectors with all four bytes less than ff.Any such vectors result in a quicker loss of the balanced property than the AES itself should.By using our method, this property can be obtained.We reproduce the 5-round integral distinguisher in [SLG + 16] and also construct a similar 4-round integral distinguisher with 2 24 data complexity.More details are shown in Section 5.1.
Find more accurate BDP.We follow the principle of the ZR method that for a nonbinary matrix every valid division trail is mapped to an invertible sub-matrix, and manage to efficiently model this relation in our new model.Unlike the S method, we get rid of the problem of inserting invalid division trails to the search models, and we are more likely to achieve better results.Since the ZR method can find more accurate division properties for SKINNY [BJK + 16] and Midori [BBI + 15] than the S method, and our method is based on a more generalized theory than the ZR method, we can find those BDP which cannot be found by the S method.Another example to show we have the potential to get more accurate propagation of the BDP is the 5-round AES key-dependent integral distinguishers which obviously cannot be found by the S method as mentioned above.
Possibly find better BDP.For a given number of rounds, our new model can find BDP results with less time and memory, which means it has the potential to find BDP of more rounds or with more balanced output bits.With the help of our new model, we improve the previous BDP results for several ciphers.
Firstly, we apply the new model to LED, which is an SPN cipher with a heavy MDS matrix as part of its linear layers.With the previous S method, we cannot obtain the accurate BDP results.For the first time, we show that LED has a 7-round integral distinguisher, which is the longest to date.
MISTY1 [Mat97] cipher was broken by Todo using integral attack [Tod15a] at Crypto 2015.The 6-round integral distinguisher used in [Tod15a] was found by the word-based division property with 63 active bits out of 64 bits, which implied that the integral attacks required almost the whole codebook [Tod15a,BK16].So finding integral distinguishers with more rounds or less data is meaningful to reduce the complexity for the entire attack.As is commonly believed, the BDP can find better integral distinguishers that have either more rounds or more balanced output bits or that require less data complexity than the word-based division property.However, the only result for MISTY1 found by the BDP till now in [EKKT18] only reaches three rounds2 .Thus it is valuable to check the security of MISTY1 by the BDP.For the first time, we give a BDP result for 6-round MISTY1.Furthermore, with 62 active bits at the input, we can get a 6-round distinguisher with 32 bits balanced 3 .
Moreover, we give a 10-round distinguisher for CLEFIA [SSA + 07].Although this result does not break the previous record, we answer the open question whether longer integral distinguishers can be found because we can search for the accurate BDP of them for the first time.At last, we also give the longest BDP of 7-round Camellia [AIK + 00] with F L/F L −1 functions 4 .We list all our new BDP results obtained in Table 1.Since we input CVC files to the STP solver, the stopping rules are updated manually.To give the total running time, we set the sum of all the balanced bits we obtained from the manual stop phase as 1. ‡ The running time for 5-round result is less than the 4-round one, because we use a speed-up strategy in the 5-round model.† With 2 51 data, they did not find any balanced bit; with 2 52 data, their model did not return any result.So for 6-round LED, the S method needs at least 52 active bits.We also implement the S method ourselves, and no results were obtained even after 24 hours.* No previous BDP results.

Organization of this paper
Sect. 2 briefly recalls the automatic search problems of the BDP.In Sect. 3 we introduce the previous S and ZR methods and point out their limitations.In Sect.4, we generalize Zhang and Rijmen's theory by removing the conditions and propose a new method which can be applied to complex linear layers.Next, we show some applications of our new method in Sect. 5. We give some suggestions in choosing automatic tools when evaluating ciphers against the BDP based integral attacks in Sect.6.

Notations
Throughout this paper, we use blackened italic lowercase letters to represent n-bit vectors, e.g.u ∈ F n 2 .The i-th element of u is expressed as u i (u 0 is the msb) and the Hamming weight wt(u) is calculated as wt(u) = n−1 i=0 u i .For any u and v, we define u v if u i v i for any i.The inner product of two n-bit vectors u and v is defined as u For a matrix M = (m i,j ) n×n , we use the notation M (i, j) to represent the element of M located at the i-th row and j-th column.M (i, * ) stands for the i-th row and M ( * , j) stands for the j-th column of M .Given two vectors u and v, we use M u,v to represent a sub-matrix of M , s.t.
For a matrix M ∈ F s×s 2 m , we can always transform it to M ∈ F ms×ms where M and M are equivalent except that they are defined over different linear spaces.Throughout this paper, we call M the primitive matrix of M as in [SWW20].

(Bit-Based) Division Property and Automatic Search Models
At Eurocrypt 2015, the division property [Tod15b] was proposed as a generalization of the integral property.The division property was originally defined at the word level and it traced information resulting from the algebraic degree of the round function which cannot be captured by the classical integral attack.With the help of the division property, better integral distinguishers for many cryptographic primitives have been detected.
The bit-based division property [TM16] was proposed to investigate the division property at the bit level.As a result, more rounds of integral characteristics have been found with this new technique.For example, the integral distinguishers of SIMON32 have been improved from 10 to 14 rounds5 .We give the definition of the BDP as follows.

Definition 1 (Bit-Based Division Property [TM16]
).Let X be a multiset whose elements belong to F n 2 .Let K be a set whose elements are n-bit bit vectors.When the multiset X has the division property D n K , it fulfills the following conditions for any u ∈ F n 2 : The round functions of many cryptographic primitives are often composed of basic bitwise operations like COPY, XOR and AND, combined with an Sbox layer followed by a linear layer.When these operations are applied to the elements in X, transformations of the division property should also be made following the propagation rules for COPY, XOR and AND which have been proved in [TM16, XZBL16]6 .

Rule 1 (COPY). ([TM16]
) Let the COPY operation create y = (y 0 , y 1 ) ∈ F 2 × F 2 from x ∈ F 2 as y 0 = x and y 1 = x.Assume the input multiset has the division property D 1 k , then the corresponding output multiset has the division property D 2 (i,k−i) where 0 ≤ i ≤ k.
Since we consider the BPD, the input multiset division property D 1 k must have 0 ≤ k ≤ 1.If k = 0, the output multiset has the division property D 1 (0,0) ; otherwise, the output multiset has the division property D 2 (0,1)(1,0) .We denote the division property propagation of the COPY operation as x COPY − −− → (y 0 , y 1 ).

Rule 2 (XOR). ([TM16]) Let the XOR operation create the output y
) , then the corresponding output multiset has the division property ), the division property propagation will abort.We denote the division property propagation of XOR operation as (x 0 , x 1 ) When evaluating the BDP of a cipher, the attackers only need to determine the division property of the chosen plaintexts, denoted by D 1 n K0 .Then, after r-round encryption, the division property of the output ciphertexts, denoted by D 1 n Kr , can be deduced according to the round function and the propagation rules.More specifically, the attackers determine an index set I = {i 0 , i 1 , . . ., i |I|−1 } ⊂ {0, 1, . . ., n − 1} of the bit indices of the plaintext and prepare 2 |I| chosen plaintexts where the variables indexed by I take all possible values.The division property of such chosen plaintexts is D 1 n k , where k i = 1 if i ∈ I and k i = 0 otherwise.Then, the propagation of the division property from D 1 n k is evaluated as where D Ki is the BDP after the i-th round propagation.If the division property K r does not contain a unit vector e i , then the i-th bit of the r-round ciphertexts is balanced.

Propagation of BDP in Automatic Search Model.
Finding the propagation of BDP is tedious because the size of K i increases rapidly.At Asiacrypt 2016, Xiang et al. showed that the propagation can be efficiently evaluated by using MILP [XZBL16].Firstly, they introduced the division trail as follows.
Definition 2 (Division Trail [XZBL16]).Consider the propagation of the division property Moreover, for any vector k i+1 ∈ K i+1 , there must exist a vector k i ∈ K i such that i i can propagate to k i+1 by the propagation rule of the BDP for the current operation.Furthermore, for (k Thanks to the introduction of the division trail, the propagation of set K i was transformed into the propagation of the division trails.Let E k be the target r-round iterated cipher.Then, if there is no division trail 0 we know that there is no unit vector e i in K r , i.e., the i-th bit is balanced.So once all the unit vectors appear in K r+1 , there will be no balanced bits at the end of the (r + 1)-th round of E k , and the maximum number of rounds that integral distinguisher based on BDP can cover is r rounds.
With the help of division trail, finding the BDP is transformed into a problem of finding a division trail ended at a unit vector.To find the division trail, we need to construct the search model by describing all operations of E k by propagation models and adding the initial and stopping rules to the plaintext and ciphertext, respectively.For plaintext, we assign 1 to those active plaintext bits and 0 to the constant bits.For the ciphertext, we first let the sum of all bits be 1 and solve the model.If the result returned by the solver is that the model is infeasible, then no division trail can lead to a unit vector and all the ciphertext bits are balanced.If the result is feasible, there is at least one unit vector.Check the solution of the feasible model and find out the unit vector, then we could know which bit is imbalanced.Next, let the sum of remaining bits be 1 and get back to solve the updated model.Repeat the steps until we find all the possible unit vectors.A common framework of automatic search for BDP is given in Appendix A.

Two of Previous Methods Modeling Linear Layers
For ciphers with a bit-permutation linear layer like PRESENT, RECTANGLE, etc, after the nonlinear layer, there is no cost for the BDP.This paper focuses on ciphers having a non-bit-permutation in their linear layer, for instance, SKINNY, Midori and the AES, which have been considered by the S method [SWW20] and the ZR method [ZR19].

S Method
The idea of the S method [SWW20] is to represent a matrix M ∈ F s×s 2 m at the the bit level.Given the irreducible polynomial of the field F 2 m where the multiplications operate, the representation of the matrix over F 2 is unique, which we call the primitive matrix of M and is denoted by M = (m i,j ) n×n where m i,j ∈ F 2 and n = m × s.In order to describe the multiplication with the primitive matrix, auxiliary binary variables t i,j (0 ≤ i, j ≤ n−1) are introduced to represent XOR and COPY operations derived from multiplying with it.Then the Mix-Columns operation y = M x, where x = (x 0 , x 1 , . . ., x n−1 ) T and y = (y 0 , y 1 , . . ., y n−1 ) T , can be modeled as x j COPY − −− → (t 0,j , t 1,j , . . ., t (n−1),j ) and (t i,0 , t i,1 , . . ., t i,(n−1) ) XOR − − → y i .However, the S method brings in some invalid trails to K. We give Example 1 in Appendix B as an illustration to show the problem.From this example, we know that the S method cannot handle the cancellation phenomenon between terms, so it will introduce some invalid trails generating a unit vector that breaks the balanced property of the output bits.
Furthermore, the S method is not efficient for some ciphers, it takes too much time and memory to evaluate the BDP.For example, it cannot give the BDP results for 6 or more rounds for LED cipher because the memory requirements exceed the limit of most computers.We will give more details in Subsect.5.2.

ZR Method
Zhang and Rijmen [ZR19] proposed the ZR method to convert the verification of a division trail through a linear layer into checking whether the corresponding sub-matrix is invertible or not.In other words, for a linear layer, every candidate division trail is one-to-one mapped to a sub-matrix of its corresponding matrix.As a result, the ZR method can solve the term cancellation problem.We recall their main result in Theorem 1.

Theorem 1 ([ZR19]). Let M be the n × n primitive matrix of an invertible linear transformation and u
From Theorem 1, the validity of the division trail (1, 1, 0) M − → (1, 1, 0) can be determined by the invertibility of the matrix M v,u .Apparently, this matrix is not invertible.Therefore, u M − → v is not a valid division trail.The cancellation problem for this simple example resulted from the S method can be solved by the ZR method.
In [ZR19], Zhang and Rijmen tried to use a set of inequalities, denoted by L, to describe all the invertible sub-matrices for the linear layer M .They claimed that their technique can be applied to ciphers with so-called "binary linear layers".We redefine it as follows.
Definition 3 (Binary Matrix).Suppose for a matrix M = (m i,j ) s×s ∈ F s×s 2 m , we represent the element m i,j in M as a polynomial in the extension field F 2 m F[x]/(f ), where f is the irreducible polynomial over F 2 with degree m, then we call M a binary matrix if all such polynomials in M can only be 0 or 1.Otherwise, M is called a non-binary matrix.
The matrix which is used in the MixColumns of SKINNY falls into this category while the matrix for the MixColumns operation of the AES is a non-binary matrix, because Given such a binary matrix M = (m i,j ) ∈ F s×s 2 m , and denote n = m × s, we can represent M in its primitive form as M = (m i,j ) n×n .According to Theorem 1, one needs to determine all its l-order invertible sub-matrices for 1 l n and to describe them with a set of linear inequalities as proposed by [ZR19].In order to describe all the invertible sub-matrices of M , one first divides M 's row indices into the following m cosets In each coset, there are s elements.Since the rows in different cosets have no common nonzero entries in the same column, which is the key feature of a binary matrix, one can take into account the XOR operation of rows in each coset separately.Assume u = , then for t (1 t < s) coordinates v i 's with indices in the same coset, without loss of generality we take the i-th coset S i = {i, i + m, . . ., i + (s − 1)m} as an example, we construct inequalities to describe the corresponding t-th order invertible sub-matrix as where i 0 , i 1 , . . ., i t−1 ∈ S i .For t = s, we could simply use equations is j=i0 For each coset like S i , we have s t inequalities and equations, where 1 t s.There are m such cosets in total, so the total number of inequalities to represent the binary linear layer is All solutions to each of these inequalities cover a large class of invertible sub-matrices.Thus, all the inequalities covering all the l-th order for 1 ≤ l ≤ n sub-matrices of M are sufficient to describe all the invertible sub-matrices of the linear layer matrix M .We refer the readers to [ZR19] for more details about the construction of L.
For some binary linear layers such as the linear matrices of the block cipher SKINNY and Midori, m and s in Equation ( 3) is usually small, so #L is not very big.For example, in order to describe the matrix of SKINNY with 64-bit block size, they only need 4 × (2 4 − 1) = 60 inequalities7 .
Limitations of the ZR method.However, for non-binary matrices, if we consider them in the similar way of using cosets, the key feature for the binary matrices stated above that the rows in different cosets have no common nonzero entries in the same column does not hold.Then when we need to check the invertibility of all the l-th order sub-matrices, where 1 ≤ l ≤ n, we have n l inequalities to mark all the l-th order invertible matrices8 .Finally, the number of inequalities #L should be computed in formula It appears that #L becomes too large when n is big, where n = m × s.For example, when considering M AES , we have m = 8 and s = 4, so the total number of inequalities used to describe all its valid division trails will be 2 8×4 − 1 = 2 32 − 1.First of all, it is extremely hard to generate such a huge number of equations.Secondly, such a large number of equations is usually not supported by any MILP solvers.As a result, the ZR method is not applicable to many of the ciphers that have non-binary matrices as part of their linear layers.

Our Model for General Linear Layers
Since many symmetric primitives base the security on the non-linear components (usually Sboxes) and complex linear layers, and the BDP propagation for the Sboxes has been well modeled [XZBL16, BC16, SWW17], it is important to find a method that can accurately and efficiently describe the BDP propagation for non-binary linear layers.However, as discussed in Sect.3, the S method cannot always accurately model the BDP of a matrix, meanwhile the ZR method is practically infeasible for non-binary matrices with a large size.In this section, we propose a new method that can trace the BDP propagation accurately and efficiently for any type of matrices.
Main Idea.Suppose that u and v are the input and output vectors of the linear layer M .From the ZR method, if and only if u M − → v is valid, then M v,u is invertible.Therefore, given the constraints that M v,u should be invertible, all the (u, v) satisfying them are valid.Here we first outline these constraints in Proposition 1 in order to give a high-level overview of our new model.The reason for setting them will be discussed in Subsect.4.1.
Proposition 1.For a primitive matrix M ∈ F n×n 2 , a division trail (u, v) 9 is valid if and only if (u, v) meets the following constraints where E is a n × n identity matrix and M expand v,u ∈ F n×n 2 is an auxiliary matrix with n 2 elements.
According to Proposition 1, for M ∈ F n×n 2 we need to generate totally n 2 (for all 0 i, j n − 1) 4-degree constraints with n 2 auxiliary variables representing M expand n×n .Note the constraints in Proposition 1 are all fourth degree.As the name implies, MILP focuses on the linear programming.As far as we know, although some popular MILP solvers such as Gurobi can handle the quadratic constraints, higher degree constraints are far beyond their capability.Thus, the constraints in Proposition 1 can only be processed by SAT/SMT solvers.The proof of Proposition 1 is given in Subsect.4.1.

Derive Constraints for Invertible Sub-Matrices
Determining whether M v,u is invertible or not is equivalent to determining whether has a solution, where E wt(v)×wt(v) is a wt(v) × wt(v) identity matrix (we no longer mark the size if it is not ambiguous).Note u and v are only used to indicate the sub-matrix and the specific values of them are not required.Then, we construct the constraints that M v,u is invertible by two steps introduced as follows.

Step 1: Compute the Expanded Matrix of M v,u
The essential difference between our method and the ZR method is that we do not need to know all the sub-matrices of M .Instead, M v,u varies dynamically according to the current candidates of the division trails, thus M v,u has a variable size.In many languages, however, the size is commonly required explicitly when declaring a new variable.To overcome this obstacle, we define the expanded matrix M expand v,u for M v,u .
Definition 4 (Expanded Matrix).Given a primitive matrix M ∈ F n×n 2 and one of its sub-matrix M v,u , the expanded matrix has the same size with M which can be known in advance.At the same time, it contains all the information about M v,u .Now we can declare M and M expand v,u (in the input language of the STP solver) as an ARRAY variable, and denote it as ARRAY(index, value)10 .The first parameter index is the concatenation of the row and column indices (i, j), and the second parameter value is M (i, j) or M expand v,u (i, j).

Step 2: Check the Invertibility of M v,u
In Step 1, M expand v,u is defined with a fixed size and it covers M v,u .It is possible to add constraints on M expand v,u to ensure that M v,u should be invertible.This can be guaranteed by the following theorem.).Correspondingly, E v is a matrix as follows,

. Then a sub-matrix M v,u is invertible if and only if there exists a matrix
From the condition of Theorem 2, we know that and E v , we can get the following equation Equivalently, we obtain the following non-trivial equations Since the second equation of Equation ( 5) always has a trivial solution X 0,1 = 0, then whether Equation (4) has solutions can totally be decided by the first equation of Equation (5).In other words, if Equation (4) has solutions, then M v,u X 0,0 = E has solutions; otherwise, M v,u X 0,0 = E has no solution.
Finally, if and only if Equation (4) has solutions, M v,u is invertible.X 0,0 is just the inverse matrix of M v,u .
Through Theorem 2, we can check the invertibility of M v,u by checking whether and E v are fixed and known in advance, which enable us to define and model constraints on them in an automatic search tool.

The Compact Automatic Search Algorithm for M
In this subsection, we show how to condense the theory in previous subsection into a compact algorithm.Firstly, we introduce two useful observations that help to generate the matrices M expand v,u and E v .

Observation 1. According to Definition 4, M expand v,u
can be generated by the following formula, Observation 2. The matrix E v in Theorem 2 can be generated by the following formula, Where E is the wt(v) × wt(v) identity matrix.
Now we can construct our new model in a system of compact constraints.Firstly, we let the hamming weight of v and u be equal.
Secondly, we allocate n × n auxiliary variables which represent the matrix = E v can be written as the following n 2 constraints, (k, j), for 0 i, j n − 1.
Algorithm 1 shows how to construct the constraints for a linear layer M with (u, v) as its input and output variables.

Removing the Invertible Condition from Theorem 1 of ZR Method
In [ZR19], it is believed that only invertible matrices can be handled by ZR method.However, for complement, we show this condition can be removed from Theorem 1, i.e., no matter if M is invertible or not, wt(u) = wt(v) is always satisfied as long as u → v is valid.Without the invertible condition, we introduce a more general Theorem 3 than Theorem 1. Theorem 3. Let M be the p × q primitive matrix of a linear transformation.For

valid division trail of the linear layer M if and only if M v,u is invertible.
Proof.Recall the proof of Theorem 1 in [ZR19], the condition that M is invertible is only used to prove wt(u) = wt(v) if u M − → v is valid.Therefore, to prove our Theorem 3, we only need to prove that although M is not invertible, wt(u) = wt(v) is still satisfied if u M − → v is valid.The remaining proof is the same as that of Theorem 1 in [ZR19].
SAT/SMT solvers to handle our new model, thus we take SAT/SMT tool Cryptominisat11 (version 5.6.5) and STP12 (version 2.1.2) to conduct our experiments.A brief introduction is given in Appendix C to explain how to model the components of ciphers based on STP.All the experiments for our applications are conducted on a work station with Intel(R) Xeon(R) CPU E5-2620 0 @2.00GHz, 128GB memory, 64bit Ubuntu 16.04.4LTS.The source codes are available in https://gitee.com/hukaisdu/BDP_for_ComplexLinear.git.
Notations for integral property.We introduce the notations that we are going to use to present the results of our applications.Let Λ be a collection of state vectors X = (x 0 , . . ., x 2 n −1 ) where • ?: if the sum of all x i in Λ cannot be predicted, X is called unknown When considering them at the bit level -i.e.let x i ∈ F 2 (m = 1), we use lower case letters instead of uppercase letters, that is a represents an active bit, b a balance one, c a constant bit and ?an unknown bit.For example, "aaac" for a nibble means that only the least significant bit is constant, all the others are active.Similarly, "???b" means that only the least significant bit is balanced, while the rest are unknown.
In the remaining of this paper, we arrange the input and output state of our distinguisher in a 4 × 4 matrix for the AES and LED.For Feistel ciphers such as MISTY1, CLEFIA and Camellia, the states are also arranged in a matrix in the row-major order.For example, if we write the 64-bit state of MISTY1 state into an n × m matrix, then the m bits in the first row are the leftmost m bits of MISTY1 state.Details are given in Figure 3, 4 and 5 in Appendix D.

Applications to the AES
The AES is the most widely used block cipher.It follows a construction known as the substitution-permutation network (SPN).The AES has a fixed block size of 128 bits, a key size of 128, 192 or 256 bits, and a number of 10, 12 or 14 rounds respectively.The 128-bit state is arranged in a 4 × 4 grid where each byte represents an element from F 2 8 with an underlying polynomial for field multiplication.Initially there is a whitening key addition (AddRoundKey) to the state.Then each internal round of the AES is composed of four operations: SubBytes (SB), ShiftRows (SR), MixColumns (MC) and AddRoundKey (AK).In the final round of each variant of the AES, the MC is missing.
Key-independent integral distinguishers cover 3 or 4 rounds of the AES.Any integral property that surpasses 4 rounds is interesting because it brings more insights to the security of the AES.In [SLG + 16] and [HCGW18], two integral variants are introduced for 5 rounds of the AES, but they both depend on the value of one key byte and are therefore called key-dependent integral distinguishers.So far, no 5-round key-independent integral distinguishers have been reported.The 5-round key-dependent distinguisher in [SLG + 16] exploited the structure of the MDS matrix, which falls exactly into the research scope of our new technique, in particular of the generalized Theorem 3.
At Crypto 2016, Sun et al. [SLG + 16] introduced the first 5-round integral distinguisher, based on the fact that M AES has two equal elements "1" in each column.Assume x = (x 0 , x 0 , x 2 , x 3 ) T is the input to M −1 AES , where x i takes all possible 2 8 values, and y = (y 0 , y 1 , y 2 , y 3 ) T is the output.Then x = M AES × y can be represented as which implies x 0 = 2y 0 ⊕3y 1 ⊕y 2 ⊕y 3 = y 0 ⊕2y 1 ⊕3y 2 ⊕y 3 .Then we have 3y 0 ⊕y 1 ⊕2y 2 = 0, i.e., y 0 , y 1 and y 2 are linearly dependent and the dimension of (y 0 , y 1 , y 2 ) is at most 2. Since the dimension of the input x is 3, we conclude that y 3 is independent of (y 0 , y 1 , y 2 ), i.e., the number of possible values for y 3 is 2 8 and the number of all possible values for (y 0 , y 1 , y 2 ) is 2 16 .In other words, after the inverse of the MC, y 3 must be an active byte, i.e., A. Note M −1 AES does not have two equal values in one column, so the integral distinguishers are only valid for the AES inverse.To verify the property of M AES by our method, we first re-write the inverse operation of the AES as where x 0 , x 1 and x 2 are all active bytes.Note in the first round, we apply the shrunk matrix M shrunk to the 3 bytes in the first column and apply the matrix of the inverse of the MC to the remaining 3 columns (see the bottom part of Figure 1).

4-round key-dependent integral distinguishers.
As a result, with these 3 active bytes, we construct the plaintexts to the first round of the AES inverse, then by appending 3 rounds of the AES inverse we could build a 4-round integral distinguisher.For constructing this distinguisher, in total we need 2 3×8 = 2 24 chosen plaintexts, and all the output bits are balanced.By our method in Sect.4, we obtain a 4-round integral distinguisher after around 68 minutes.
5-round key-dependent integral distinguishers with simplified constraints for MC.Similarly, with 3 active bytes in one column and 12 active bytes in the other 3 columns, i.e., 2 (3+12)×8 = 2 120 chosen plaintexts we should be able to construct a 5-round integral distinguisher by appending 4 rounds of the AES inverse.To check if there exists such a 5-round integral distinguisher with 120 active bits, the automatic solver might take a very long time before it gives a result.To speed up the process of checking the existence of the 5-round distinguishers, we simplify the propagation rules for the MC in the last four rounds, by only asserting a constraint to the matrix that the hamming weight of the input variable is always equal to the one of the output variable.In fact, compared to the constraints set to the last 3 round in the 4-round distinguishers, these constraints are much loose, which means more solutions/trails will be derived by the solver from this simplified model.Therefore, for the 5 rounds AES inverse, if there is no solution to this simplified model (i.e. a balanced property is satisfied), there will also be no solution to the original model.Finally, our method shows that the BDP model of the 5-round AES inverse has no solution after about 50 minutes, thus we obtain the 5-round key-dependent integral distinguishers13 .Since M shrunk is an non-binary matrix of size 32 × 24 at the bit level, as is pointed out in Sect.3.2, the ZR method cannot process it because of the huge computations for generating linear inequalities.What's more, the S method inserts invalid division trails after this specific shrunk matrix, which will result in the failure to find the two types of integral distinguishers obtained by our method.Actually, we also implemented the S method, and the result shows that S method can guarantee the number of all possible values for (y 0 , y 1 , y 2 , y 3 ) after the shrunk matrix M shrunk is 2 24 , but cannot guarantee that y 3 takes all possible 2 8 values.In other words, this property of the AES MDS matrix cannot be made use of by the S method to find the 5-round key-dependent distinguishers in [SLG + 16].Therefore only our model can capture the BDP propagation feature through the shrunk matrix M shrunk accurately and efficiently.We hope our method provides a useful tool for exploring security analysis in the AES and AES-like designs.
More discussions about the security of the AES against integral attacks.As is wellknown, the AES has 4-round key-independent integral distinguishers with 2 32 chosen plaintexts [DKR97,KW02] while it is still an open problem for 5 rounds.However, the 5-round AES is an important primitive which has been used in many new designs [WP14], thus it is important to answer this public question.
It is commonly believed that the BDP is the most powerful tool for searching for integral distinguishers.Unfortunately, no accurate BDP results targeting the key-independent integral are known for the 5-round AES.The reason is that, as we discussed above, neither the S or ZR method is suitable to trace the accurate BDP of the AES MDS matrix.By taking our method, for the first time the BDP for the AES matrix can be accurately traced.We experimentally prove that the 5-round AES has no key-independent integral distinguishers.Although it does not break the records, we answer the open question that 5-round AES has no key-independent integral property.Our new method provides more confidence for the security of the 5-round AES and designs based on it.

Applications to LED
LED [GPPR11] is a 64-bit block cipher that can handle key sizes from 64 bits up to 128 bits.Since our distinguisher works for both key sizes, we generally denote them as LED in this paper.Similar to the AES, four operations are applied to each round: AddConstant (AC), SubCell (SC), ShiftRows (SR) and MixColumnsSerial (MC).Four of these internal rounds are called one step, and are followed by a AddRoundKey (AK) operation.Since AC and AK in our automatic search model do not influence the BDP, we do not consider them here.We set 64-bit variables X (i) , Y (i) and Z (i) to represent the variable before SC, SR and MC in the i-th round respectively.We add constraints to bits of X (i) and Y (i) using the model for Sbox (Model 3 in Appendix C) and to bits of Z (i) and X (i+1) using our new model for the linear layer.Finally, we get the entire automatic search model for LED.
In [SWW20], the S method found that for 6-round LED, 51 active bits would not lead to integral distinguishers.However, they cannot obtain the results for 6 rounds of LED with 52 active bits because of the huge memory and time requirements.Therefore, if there exist a BDP result for 6-round LED with 52 active bits is still a public question.By using our method, after about 15 minutes we find an integral distinguisher for 6-round LED with 52 active bits.The 6-round BDP we found is given below Further more, by choosing a proper initial BDP, we can even extend the distinguisher by one more round, i.e, we find 7-round integral distinguishers for the first time.The integral distinguisher with 127 active bits in the plaintexts and full balanced bits in the ciphertexts, is given as below

Applications to MISTY1
MISTY1 MISTY1 cipher was broken by Todo [Tod15a] using integral attacks at Crypto 2015.The 6-round integral distinguisher used in [Tod15a] was found by the word-based division property with 63 active bits.However, the integral attack requires almost the whole codebook [Tod15a,BK16].Finding integral distinguishers with more rounds or less data are meaningful to reduce the complexity for the entire attack.As is commonly believed, the BDP can find better integral distinguishers than word-based DP, which have either more rounds, or more balanced bits or less data complexity.However, the only result found by the BDP in [EKKT18] can reach maximal three rounds of MISTY1.Thus it is valuable to check the security of MISTY1 by the BDP.
As can be seen, MISTY1 does not have a classical complex linear layer such as the MC operation of AES or LED, but it consists of COPY, XOR and Sbox only.To apply our methods to MISTY1, operations in every red dash line rectangle in Figure 3 (b)(c)(d) in Appendix D are regarded as a matrices, then we can describe them in a similar way to the MC of the AES or LED.X (i) is used to represent the input to the i-th round of MISTY1 and Y (i) stands for the output of F L. Variables used to describe the division trails such as A (i) ∼ J (i) are shown in Figure 3 in Appendix D. As a result, our new model is quite efficient and can find much longer BDP than 3 rounds.With 63 active bits where the most significant bit is set as constant, we find the same 6-round integral distinguishers as the one in [Tod15a] and prove that no BDP results having more rounds or more balanced bits exist with the same data14 .With 62 active bits where the two most significant bits are set as constant, the results show that the 32 bits in the right branch are balanced.
The new integral characteristic with 2 62 chosen plaintexts is shown as follows  4 in Appendix D).The MDS matrix in the left branch is called M 0 while the one on the right branch is called M 1 .
For our automatic search model, as shown in Figure 4 in Appendix D, we use 128-bit variables X (i) to represent the input to the i-th round, and 64-bit variables A (i) , B (i)  and C (i) to represent the input to the four Sboxes, input to M 0 , output of M 0 in the left branch of the i-th round, respectively; at the same time, D (i) , E (i) and F (i) stands for the input to four Sboxes, input to M 1 , output of M 1 on the right branch of i-th round, respectively.Matrices M 0 and M 1 are described with our new model.
From our experiments, a 10-round BDP is obtained, which is the same as the one by word-based division property [SWW17].Although no new integral distinguishers are found, it is still meaningful to re-evaluate the security of CLEFIA against the BDP since the BDP has more potential to find stronger integral distinguishers.This is the first time that one can evaluate the BDP for CLEFIA.The 10-round BDP is give as for the 128-bit version and 24 for the other two versions.Our distinguisher works on all the three versions of Camellia, so we simply denote them as Camellia.The round function of Camellia is a permutation and takes a layer of eight 8-bit Sboxes and a word-level linear layer.The structure of one round Camellia is given in Figure 5 in Appendix D.

Applications to Camellia
In [Tod15b], the word-based division property found 6-round integral distinguishers but no results of the BDP are available until now.We construct the BDP search model for 7-round Camellia with a F L/F L −1 layer using our technique.The seven rounds of Camellia include one layer of F L/F L −1 located after the first six rounds (6 rounds → F L/F L −1 → 1 round).For the F L/F L −1 functions, we treat them as we do for the F L functions in MISTY1 (Figure 3(b) in Appendix D), i.e. we regard them as the pure linear mapping, which can be represented in the bit matrices.The position of the diffusion layer is included in the red dash line rectangle in Figure 5.The result is shown as follows, ? ???????
Since the linear layer in the round function of Camellia is actually a binary linear mapping (Definition 3), the ZR method would find the same result as ours.

Conclusions and Discussions
In this paper, we provide an improvement on the existing automatic tools searching for the BDP of complex linear layers.We remove restriction for the ZR method that it applies only to the invertible binary linear layers.Furthermore, the weakness that the S method may ignore some balanced property for ciphers with a complex linear layer has also been overcome by our method.With our new method, more accurate BDP for many block ciphers with a general linear layer such as LED, the AES, MISTY1 can be obtained within reasonable time, which was not possible by the previous methods.
Although our method has many advantages over the previous ones, it also has some limitations.For a primitive matrix of size n × n, the number of our constraints is n 2 , while for the S method it is only 2n, which means for ciphers with larger linear layer such as LowMC [ARS + 15], the number of constraints in our new model increases much faster than the S method.As a result, our method does not fit ciphers with over large (n 32) matrices.Also note that the constraints we construct in this paper are fourth degree constraints (Proposition 1) which makes our model limited to SAT solvers.Other constraint-based solvers like MILP also play a very important role in searching BDP-based integrals, however MILP solvers can only process linear constraints (Gurobi can handle quadratic constraints).How to implement our model using MILP solvers or similar ones, will be a future work.Now we would like to give some suggestions to the designers who need to evaluate their designs against the BDP-based integral attacks.If their design takes a binary matrix as one component of linear layer, the priority is to consider the ZR method and our method.Suppose it is defined in F s×s 2 m .We know by the ZR method, at most m × (2 s − 1) inequalities are introduced to describe its BDP, so if s is in a reasonable range, we can get an accurate BDP quickly enough.If the cipher takes a large (e.g., n = m × s 32) non-binary matrix as a linear layer, the number of constraints that our new method needs to include in the model will increase sharply.On the other hand, though the results returned by the S method may not be accurate, i.e., some balanced bits of the output could be ignored by the S method, it may still give us feedback on some useful BDP.Therefore, in these circumstances, we recommend the S method for these ciphers if it can return us a BDP result within an acceptable time.For other linear layers, i.e., non-binary matrices or non-invertible matrices with a moderate size (e.g., n 64), our new method becomes the optimal choice, since it can find the accurate BDP results with a competitive efficiency.Assume the input and output of M are x = (x 0 , x 1 , x 2 ) T and y = (y 0 , y 1 , y 2 ) T respectively, then we have y = M x.We transform the representation of this multiplication to a vectorial Boolean form as      Note that x 0 appears in the first and second equations, so according to the propagation rule of COPY 1, we need to apply COPY operation to x 0 .We introduce two binary variables t 0 and t 1 to represent its output, i.e., x 0 COPY − −− → (t 0 , t 1 ).
x 2 only appears once in y i 's, thus COPY does not apply to it.Then we model XOR operations with the help of binary variables t i 's as (t 0 , t 2 ) XOR − − → y 0 , (t 1 , t 3 ) XOR − − → y 1 , x 2 = y 2 .

C.2 Models of Operations with SMT/SAT
Since we use the STP solver to search for the BDP, we introduce the SMT/SAT models describing components such as Copy, XOR and Sbox.
Model 1 (COPY [SWW17] ).Denote (a) In this paper, we introduce a new method to describe the bit-based division property propagation of an Sbox.To make it more concrete, we take the division trail table of the PRESENT Sbox [BKL + 07], shown in Table 2, as an example.

Algorithm 1 :
Generate constraints for automatic search model of M Input: Model variables: u, v, M ∈ F n×n 2 and an identity matrix E Output: A set C containing all the constraints for M Allocate C = ∅; // ARRAY(2n, 1) means the array is indexed by 2n bits and the value is 1 bit Allocate ARRAY(2n, 1) variable M expand v,u ; Auxiliary variables Improvement of Our New Method Compared to the ZR Method.Both ZR method and ours can determine valid division trails accurately.However, the ZR method needs to compute all the invertible sub-matrices for a given linear layer M .For an n × n M , it has as many as Π n i=1 n i 2 sub-matrices.It is impossible to check all of them and model the invertible ones by automatic search tools if M has no particular structure, e.g., binary.In terms of our new method, checking the invertibility of M v,u is transformed to finding if M expand v,u M expand v,u = E v has solutions.Since the size of M expand v,u and E v are known in advance, it enables us to model them by an automatic search tool.

Figure 1 :
Figure 1: The top part is the first round of the integral distinguisher in [SLG + 16] appended by the 4 rounds AES −1 .The bottom part is the beginning of our new search model appended by the 4 rounds AES −1 .The operations in an AES −1 round are AK −1 → SB −1 → SR −1 → MC −1 .
[Mat97] operates on 64-bit blocks and requires a 128-bit key.It iterates an 8-round Feistel structure built on a 32-bit round function F O, which is itself a 3-round Feistel construction called the MISTY structure having a 16-bit non-linear function F I. F I consists of a similar 3-round unbalanced MISTY structure with a 7-bit and two 9-bit Sboxes called S 7 and S 9 .An additional component, two 32-bit functions F L are inserted to both 32-bit halves before F O function every two rounds of the cipher.The F L function is a simple transformation.Secret key material is mixed with message in both F O and F I function.Since it does not affect our distinguishers, we do not include details about it here.The structure of MISTY1, F O, F L and F I functions are shown in Figure3in Appendix D.

Algorithm 2 :
The framework of automatic search for the BDP of E R k[XZBL16] Input: I represents the index set of the active bits in the plaintext Output: S represents the index set of the balanced bits in the ciphertext // C is the constraint set for the model.a r i is the division property variable at the i-th bit of the r-th round.Allocate C = ∅; for each operation f of E R k do C ← constraints for f ; for i = 0; i < n; i = i + 1 do if i ∈ I then C ← a 0 i //evaluate the BDP based on the constraints Allocate S = {0, 1, . . ., n − 1}; Try to find solutions for the constraints in C; while There is a solution do Find i s.t. a R i = 1 in the solution; S = S remove − −−−− → i; C ← a R i = 0; Update the model and resolve; return S; Example 1. Suppose the linear layer is a matrix

Table 1 :
Division property results for LED, the AES, MISTY1, CLEFIA and Camellia.

127 64 99min Subsect. 5.5
Now we re-consider Example 1 in Appendix B by the ZR method, to see if it can handle the cancellation influence which the S method cannot.Since the candidate division trail is one of the valid division trails of the linear transform M if and only if M v,u is invertible.
u is invertible if and only if M v,u has an inverse matrix.Without loss of generality, we assume M v,u is located in the top-left corner of M (and M expand ISO standard cipher CLEFIA [SSA + 07] is a 128-bit block cipher supporting a key length of 128, 192, and 256 bits.The corresponding rounds are 18, 22 and 26 for 128, 192 and 256 key bits, respectively.The round function follows a 4-branch Type-2 general Feistel structure.Two parallel F functions are used in every round and each F function consists of four Sboxes and one MDS matrix (Figure Camellia is a 128-bit block cipher designed by Aoki et al. in 2000 [AIK + 00].It is a Feistel-like construction where two key-dependent layers F L and F L −1 are applied every 6 rounds to each branch.There exist three different versions of the cipher, which are Camellia-128, -196 and -256, depending on the key size used.The number of rounds is 18