Weak Keys in Reduced AEGIS and Tiaoxin

AEGIS-128 and Tiaoxin are two AES-based primitives submitted to the CAESAR competition. Among them, AEGIS-128 has been selected in the final portfolio for high-performance applications, while Tiaoxin is a third-round candidate. Although both primitives adopt a stream cipher based design, they are quite different from the well-known bit-oriented stream ciphers like Trivium and the Grain family. Their common feature lies in the round update function, where the state is divided into several 128-bit words and each word has the option to pass through an AES round or not. Surprisingly, during the 6-year CAESAR competition, no third-party cryptanalysis of the initialization phase was published for either primitive. Due to the similarities between the two primitives, we are motivated to investigate whether there is a common way to evaluate the security of their initialization phases. Our technical contribution is to write the expressions of the internal states in terms of the nonce and the key by treating a 128-bit word as a unit and then carefully study how to simplify these expressions by adding proper conditions. As a result, we found that there are several groups of weak keys with $2^{96}$ keys each in 5-round AEGIS-128 and 8-round Tiaoxin, which allows us to construct integral distinguishers with time complexity $2^{32}$ and data complexity $2^{32}$. Based on the distinguisher, the time complexity to recover the weak key is $2^{72}$ for 5-round AEGIS-128. However, the weak key recovery attack on 8-round Tiaoxin requires a weak constant occurring with probability $2^{-32}$. We expect that this work can advance the understanding of designs similar to AEGIS and Tiaoxin.


Introduction
Strong diffusion and confusion are two principles for designing secure symmetric-key primitives. Undoubtedly, for almost all symmetric-key primitives, attackers cannot write the exact boolean expressions of the output bits in terms of the input bits. To address this problem, the cube attack [DS09] was invented to capture partial information about the boolean expressions of the output bits. In particular, with the evolution of the technique called division property [Tod15, TM16], there exist automatic tools [XZBL16, WHG+19, HLM+20, HSWW20] to search for the desired partial information about the boolean expressions. An evident advantage of using the division property with the automatic tools is that attackers can find integral distinguishers [KW02] relatively easily. However, it seems that a naive implementation can only find integral distinguishers holding for all secret keys.
When it comes to the weak-key setting, without knowing the weak keys in advance, the naive implementation of the division property will obviously fail to find integral distinguishers that do not hold for all keys. Identifying a set of weak keys is nontrivial, as there are different perspectives on what a weak key should look like and how it influences the attack. This can be seen from the development of the invariant subspace attack [LAAZ11, LMR15, TLS16, GJN+16, Bey18], a popular attack on lightweight symmetric-key primitives in the weak-key setting. In general, the time complexity of a reasonable distinguisher or key-recovery attack in the weak-key setting should be smaller than the number of weak keys. This work focuses on the integral attack [KW02] on round-reduced AEGIS-128 and Tiaoxin in the weak-key setting. AEGIS-128 [WP13] and Tiaoxin [Nik] are two authenticated encryption (AE) schemes adopting a stream cipher based design. Their common feature lies in the round update function, where the internal state is divided into several 128-bit words and each word has the option to pass through an AES round or not. For AEGIS-128, all the state words independently pass through the AES round function in a one-round update, while for Tiaoxin only some of the state words do. Designing the round update function in this way allows parallel calls to AES-NI, a set of instructions for the AES round function designed by Intel. Consequently, the round update functions of both primitives are rather efficient even with several AES rounds. Notably, Jean and Nikolić generalized the way to construct such round update functions at FSE 2016 [JN16], which has attracted the interest of the community.
To reduce the time needed to process a 128-bit message block, only one round update is used to compute each ciphertext block after the initialization phase and the processing of the associated data. To ensure the security of the encryption phase, the state sizes of both AEGIS-128 and Tiaoxin are large and the output is computed with a quadratic boolean function of several state words, which prevents attackers from recovering the whole secret internal state from the output faster than an exhaustive key search. Moreover, generating the output in this way also makes reversing the round update function impossible without guessing many state words.
For this way of computing the output, i.e. the keystream, Minaud pointed out soon after the publication of AEGIS that there exists a linear bias in the keystream [Min14]. In particular, AEGIS-256 was shown to be insecure against this statistical attack. Later, this idea was applied to MORUS at ASIACRYPT 2018 [AEL+18], and a model to automatically search for linear biases in the keystream was proposed at CRYPTO 2019 [SSS+19]. Such a model was then adapted to re-evaluate the keystream of AEGIS [ENP19]. Although a better linear bias was found for AEGIS-256, both AEGIS-128 and AEGIS-128L remain secure in their keystream generation phase [ENP19].
For AEGIS-128 and Tiaoxin, when the associated data is empty, the phase to process the associated data is skipped and the attacker can immediately observe information about the internal state in the encryption phase. As only a one-round update is used to process each message block in the encryption phase, the security of the initialization phase of both primitives dominates the security of the whole AE scheme. Although the designers made an initial study of the initialization phase, only the resistance against the differential attack was investigated. For AEGIS-128, the designers claimed that there are 50 AES rounds in the initialization and a difference in the controllable input will pass through more than 10 AES rounds [WP13]. For Tiaoxin, the designers claimed that 6 rounds are sufficient to resist the differential attack by counting the number of active S-boxes [Nik].
The designers only took the conventional differential attack into account, and the feasibility of applying the well-known integral distinguisher on 4-round AES was not discussed. Surprisingly, there has been no third-party cryptanalysis of the initialization phase of either primitive until now. To fill this gap, we are motivated to investigate the possibility of using the well-known integral distinguisher on 4-round AES [DR02] to analyze their initialization phases.
Our contributions. Since the state words are independently processed via the AES round function and the diffusion between the state words is rather weak in the round update function, we find it more suitable and feasible to write the expressions of the internal states in terms of the input by treating a 128-bit word rather than a bit as a unit. The reason to consider the integral distinguisher is also simple: it allows us to study the integral property of each term in the expression independently, which fits very well with the way the output is generated in both AEGIS-128 and Tiaoxin. Other attacks, in contrast, may rely on the interaction between the terms.
To study the integral property of the output, it is essential to study some unusual integral properties that never appear in real AES but do appear in AEGIS-128 and Tiaoxin. Specifically, we will prove that for some unusual combinations of the AES round function, in the multiset of the outputs for a certain structure of inputs, every value appears an even number of times. Without noticing this property, one may add redundant conditions to obtain the integral distinguishers on reduced AEGIS-128 and Tiaoxin or even fail to find them.
After writing the expressions, analyzing them and adding proper conditions to simplify them are crucial steps to identify the integral distinguishers. It is common in symmetric-key cryptanalysis to add conditions to control the propagation of variables, which has led to the powerful collision attacks on the MD-SHA hash family [WLF+05, WY05, WYY05], the conditional differential attack [BB93, KMN10] and the conditional cube attack [HWX+16], although in those works the conditions are carefully derived at the bit level. In our attack, the conditions are derived at the byte level and they are hidden in the expressions. It can also be found that the conditions on the key occur for very different reasons in AEGIS-128 and Tiaoxin.
Benefiting from writing the expressions, we observed the feasibility of mounting a key-recovery attack by introducing a new variable to represent the output of one AES round for a certain 128-bit word, which is equivalent to appending a round for key recovery. For Tiaoxin in particular, there seems to be one useless round.
As a result, we identified several sets of weak keys with $2^{96}$ keys each for 5-round AEGIS-128 and 8-round Tiaoxin. For each set of weak keys, the time complexity and data complexity to construct the integral distinguisher are both $2^{32}$. The weak key recovery for 5-round AEGIS-128 requires $2^{32}$ data, $2^{25}$ memory and $2^{72}$ time. For 8-round Tiaoxin, the weak key recovery requires a weak constant occurring with probability $2^{-32}$. In addition, the size of each set of weak keys in 8-round Tiaoxin is reduced to $2^{72}$ in the key-recovery attack. If a weak constant is used, the key-recovery attack on 8-round Tiaoxin requires $2^{32}$ data, $2^{24}$ memory and $2^{48}$ time. The results are summarized in Table 1.

Table 1: The results of the analysis of reduced AEGIS-128 and Tiaoxin in the weak-key setting, where M/T/D represent the memory/time/data complexity, respectively. The column named "Size" represents the size of the set of weak keys. The column named "Constant" represents the requirement on the constant. "-" represents negligible.

Organization. The paper is organized as follows. In Section 2, we introduce the necessary notation and the specification of the initialization phases of AEGIS-128 and Tiaoxin. Some new integral properties of the AES round function are then discussed in Section 3. Section 4 and Section 5 detail how to derive the distinguishers and key-recovery attacks for 5-round AEGIS-128 and 8-round Tiaoxin, respectively. We then summarize the attacks on AEGIS-128 and Tiaoxin and discuss the usage of the division property in Section 6. Finally, the paper is concluded in Section 7.

Preliminaries
In this section, we explain the notations in this paper and briefly describe the initialization phase of AEGIS-128 and Tiaoxin, respectively.

Notation
1. SB, SR, MC and AC represent the SubBytes, ShiftRows, MixColumns and Constant Addition operations defined in AES, respectively.
2. $MC^{-1}(X)$ represents the application of the inverse of the MixColumns operation of AES to a 128-bit value X.
5. $A^r(X)$ represents r applications of the function A to X.
6. $R^r(X)$ represents r applications of the function R to X.
7. S(x) represents the output of the S-box of AES when the input is $x \in \mathbb{F}_2^8$.
8. x • y represents the product of $x \in \mathbb{F}_2^8$ and $y \in \mathbb{F}_2^8$, where the operation • works in the field $GF(2^8)$ as in MC of AES.
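To illustrate the operation • from the last item, here is a minimal sketch of byte multiplication in $GF(2^8)$. It assumes the reduction polynomial $x^8 + x^4 + x^3 + x + 1$ (0x11B) of the AES specification; the function name `gf256_mul` is ours and does not appear in the paper.

```python
def gf256_mul(x: int, y: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while y:
        if y & 1:        # add (XOR) the current multiple of x
            result ^= x
        x <<= 1          # multiply x by the field element 2
        if x & 0x100:    # reduce once the degree reaches 8
            x ^= 0x11B
        y >>= 1
    return result


# Worked example from the AES specification: {57} * {83} = {c1}.
print(hex(gf256_mul(0x57, 0x83)))
```

MixColumns only needs the multipliers 1, 2 and 3, so in practice the loop runs at most twice per product.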
In addition, we also introduce some related notations to describe the AES state. Specifically, the AES state is organized as a 4 × 4 two-dimensional array. If the AES state is denoted by s, then s[i][j] represents the byte located in the i-th row and j-th column, as shown in Figure 1. Moreover, to group several bytes of the AES state, we introduce the following notations:
where the indices are considered modulo 4. Further, we group several groups as follows:

The Initialization Phase of AEGIS-128
AEGIS-128 is an authenticated encryption (AE) scheme composed of four phases: initialization, processing the associated data, encryption and finalization. This work focuses only on the initialization phase.
The state of AEGIS-128 consists of five 128-bit words. For simplicity, we denote the initial state by $(X_0^0, X_0^1, X_0^2, X_0^3, X_0^4)$. Similar to block ciphers, there is a round update function to update the AEGIS-128 state and we denote the state after r rounds by $(X_r^0, X_r^1, X_r^2, X_r^3, X_r^4)$. The inputs of the initialization phase are a 128-bit nonce N and a 128-bit key K. There are also two 128-bit constants $(C_0, C_1)$ defined in AEGIS-128. First, the state is initialized in the following way:
Then, the state will be updated for 10 rounds, i.e. computed until $(X_{10}^0, \ldots, X_{10}^4)$. When r > 1 is odd, $X_r^0$ is updated as follows:
When r > 0 is even, $X_r^0$ is updated as follows:
The remaining state words are updated in the same way in each round:
where 1 ≤ r ≤ 10 and 0 ≤ i ≤ 3. When the associated data is empty, from the ciphertext, the attacker is able to know the output
When the initialization phase is reduced to r rounds, we denote the output by θ 0 , as specified below:

The Initialization Phase of Tiaoxin
The Tiaoxin state is composed of thirteen 128-bit words. For convenience, we denote the initial state of Tiaoxin by
Similarly, the inputs of the initialization phase consist of a 128-bit nonce N and a 128-bit key K. There are also two 128-bit constants $(Z_0, Z_1)$ defined in Tiaoxin. First, the initial state is filled in the following way:
Then, the same round update function is iterated for 15 rounds. Denote the state after r rounds of update by
Then, the round function can be formalized as follows:
When the associated data is empty and the initialization phase is reduced to r rounds, the earliest ciphertext is output only after one more round update, so the known output $(\mu_0, \mu_1)$ can be specified as follows:
Therefore, the initialization phase of Tiaoxin is equivalent to 16 rounds.

Integral Properties for the AES Round Function
In the proposal of AES, the designers presented an efficient integral distinguisher for 3-round AES, and it is still the basis of the best key-recovery attack on 6-round AES in the single-key setting. For completeness, the 3-round integral distinguisher is depicted in Figure 2, where A denotes that the corresponding byte takes all $2^8$ possible values, C denotes that the corresponding byte takes a constant value and B denotes that the corresponding byte is balanced, i.e. its sum is zero. Formally, the following relation holds:
It is well known that the 3-round distinguisher can be trivially extended to a 4-round one. Specifically, if s[D(0)] traverses all $2^{32}$ possible values and the remaining bytes of s are assigned random constants, the sum of all the outputs after 4-round AES must be zero, i.e.
$$\bigoplus_{s[D(0)] \in \mathbb{F}_2^{32}} R^4(s) = 0. \qquad (1)$$
It is simple to explain why the above equation holds. Specifically, for such an input set of s, the first column of R(s) must also take all $2^{32}$ possible values and the remaining columns are still constants. Therefore, the set of R(s) can be divided into $2^{24}$ different subsets according to the value of (R(s
Obviously, for each subset, if it passes through 3-round AES further, the sum of the outputs must be zero according to the 3-round integral distinguisher. As the sum of the outputs for each subset is zero, the sum of all the outputs must be zero.
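The 3-round building block of this argument is small enough to check directly. The sketch below is our own illustration, not the authors' code: it assumes the standard AES S-box and MixColumns matrix, models round keys as random constants, and uses the helper names `aes_round` and `integral_sum`, which are ours. It activates the single byte s[0][0] and XORs the outputs after a chosen number of rounds.

```python
import random


def gmul(a, b):
    """Multiplication in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def build_sbox():
    """AES S-box: field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for k in range(1, 5):
            s ^= ((inv << k) | (inv >> (8 - k))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = build_sbox()


def aes_round(state, key):
    """SB, SR, MC and a constant addition; state[i][j] is row i, column j."""
    sb = [[SBOX[b] for b in row] for row in state]
    sr = [[sb[i][(j + i) % 4] for j in range(4)] for i in range(4)]
    out = [[0] * 4 for _ in range(4)]
    for j in range(4):
        col = [sr[i][j] for i in range(4)]
        for i in range(4):
            out[i][j] = (gmul(2, col[i]) ^ gmul(3, col[(i + 1) % 4])
                         ^ col[(i + 2) % 4] ^ col[(i + 3) % 4] ^ key[i][j])
    return out


def integral_sum(rounds, rng):
    """XOR of the outputs when s[0][0] takes all 2^8 values (pattern A)."""
    base = [[rng.randrange(256) for _ in range(4)] for _ in range(4)]
    keys = [[[rng.randrange(256) for _ in range(4)] for _ in range(4)]
            for _ in range(rounds)]
    total = [[0] * 4 for _ in range(4)]
    for a in range(256):
        s = [row[:] for row in base]
        s[0][0] = a
        for key in keys:
            s = aes_round(s, key)
        total = [[total[i][j] ^ s[i][j] for j in range(4)] for i in range(4)]
    return total


# Every byte is balanced (pattern B) after three rounds.
print(integral_sum(3, random.Random(1)))
```

Checking the 4-round version in the same way would require all $2^{32}$ values of s[D(0)], which is why only the 3-round case is run here.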

New Integral Properties for the AES Round Function
Due to the special structure of AEGIS and Tiaoxin, some complex integral properties occur that never occur in real AES. We emphasize that the proof of these properties is non-intuitive, as we are faced with a multiset of values which cannot be accurately captured by A, C or B. Specifically, the multiset has the feature that every value appears an even number of times. Indeed, such a feature has been used before to break the SASAS scheme in whitebox cryptography [BS01].
Property 1. Given 3 arbitrary 128-bit values $c_0$, $c_1$ and $c_2$, for an arbitrary function $f: \mathbb{F}_2^{128} \to \mathbb{F}_2^l$, the following property must hold:
(2)
When s only varies at s[0][0], the expressions of $s'$ can be written as follows according to the first column of the matrix in MC:
where $x_i$, $y_i$ and c[i][j] (j = 0, 0 ≤ i ≤ 3) are constants depending on $(c_0, c_1)$ and the constant part of s. Therefore, when $c_0$
In this case, it obviously holds that
Denote the value of s
In other words, the same value of $s'$ must appear an even number of times and hence the same value of $f(s' \oplus c_2)$ must appear an even number of times. Consequently, we have
which completes the proof.
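The counting step that closes this proof can be checked on a single byte. The sketch below is an illustration under our own assumptions: it takes the varying part of the state to have the form g(α ⊕ a) ⊕ g(α ⊕ b) for byte constants a and b, where the table `g` is an arbitrary stand-in for the S-box followed by a constant multiplication. Since α and α ⊕ a ⊕ b yield the same value, every value occurs an even number of times, so the XOR of any function over the multiset vanishes. All names are ours.

```python
import random
from collections import Counter


def paired_multiset(g, a, b):
    """Values g[x ^ a] ^ g[x ^ b] for all byte values x."""
    return [g[x ^ a] ^ g[x ^ b] for x in range(256)]


def xor_sum(values, f):
    """XOR of f(v) over a multiset of values."""
    acc = 0
    for v in values:
        acc ^= f(v)
    return acc


rng = random.Random(2024)
g = list(range(256))
rng.shuffle(g)                      # arbitrary byte permutation
f_table = [rng.getrandbits(32) for _ in range(256)]   # arbitrary function f

values = paired_multiset(g, 0x3A, 0xC5)
counts = Counter(values)

# Every value has even multiplicity, hence the XOR of f over the set is 0.
print(all(n % 2 == 0 for n in counts.values()))
print(xor_sum(values, lambda v: f_table[v]))
```

The argument never uses that `g` is a permutation; any byte-to-byte map gives the same pairing.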
Based on Property 1, there will be a special integral distinguisher for n + 1 rounds of the AES round function, i.e.
However, such an integral distinguisher will never appear in real AES.
In addition, due to the symmetry of the AES round function, Property 1 can be generalised. Specifically, for an arbitrary choice of (i, j) where 0 ≤ i ≤ 3 and 0 ≤ j ≤ 3, the following property must hold:
(3)
Property 2. Given 3 arbitrary 128-bit values $c_0$, $c_1$ and $c_2$, the following property must hold:
Proof. The basic idea to prove this property is similar to that used in the proof of Property 1. Let t = R(s) and then our aim is to study the property of
and the remaining bytes of s are all C, there will be
For the first column of u, it can be deduced that
where i,j are constants depending on $(c_0, c_1)$ and the constant part of t.
If the value of u is denoted by $u_\alpha$ when t[0][0] = α, there must be
, the same value of u[Col(0)] will appear an even number of times. Similarly, we can write the expressions for the bytes of u located in the remaining three columns. Due to the symmetry of the AES round function, the same conclusion can be derived. Specifically, if $c_0$
the same value of u[Col(4 − i)] will appear an even number of times, where the indices are considered modulo 4. Therefore, it can be derived that either
) must appear an even number of times. For both cases, there must be
As both SR and MC are linear operations, we thus have
which completes the proof.
The difference between Property 1 and Property 2 should be emphasized. Specifically, in the proof of Property 1, we view the whole AES state as a unit and we derive that the same value of the AES state will appear an even number of times. However, in the proof of Property 2, we view each byte of the AES state as a unit and derive that the same value of each byte will appear an even number of times. It is not difficult to derive that we lose the integral property for
In addition, it is simple to generalise Property 2. Specifically, for an arbitrary choice of (i, j) where 0 ≤ i ≤ 3 and 0 ≤ j ≤ 3, the following property must hold:
(5)
Property 3. Given 3 arbitrary 128-bit values $c_0$, $c_1$ and $c_2$, for an arbitrary function $f: \mathbb{F}_2^{128} \to \mathbb{F}_2^l$, the following property must hold:
where
According to Property 3, there exists an integral property for n + 2 rounds of the AES round function, as specified below:
Similarly, due to the symmetry of the AES round function, Property 3 can be generalised. Specifically, for an arbitrary i satisfying 0 ≤ i ≤ 3, there must be
The conditional integral property. However, for an arbitrary choice of the four 128-bit values $(c_0, c_1, c_2, c_3)$, there is no deterministic property for the sum
To obtain a deterministic property, some additional conditions can be added. Specifically, from the proof of Property 3, when
, combined with the integral distinguisher for 4-round AES, there must be
Apart from the above properties, it is necessary to prove some additional integral properties to increase the accuracy of our integral distinguishers.
Property 4. Given an arbitrary 128-bit value $c_0$, for any i satisfying 0 ≤ i ≤ 3, the following property must hold:
Proof. Due to the symmetry of the AES round function, as in all the above proofs, we only need to prove
Firstly, we focus on the first three columns of $R(s) \wedge R(R(s) \oplus c_0)$. As s[Col(0)] takes all the $2^{32}$ possible values, the set of s can be divided into $2^{24}$ different subsets according to
). In this way, for each subset of s, R(s)[Col(0, 1, 2)] will be a fixed constant. For each such subset of s, each byte of $R(R(s) \oplus c_0)$ will take all the $2^8$ possible values. Therefore, for each such subset, we have
Similarly, it can be simply derived that
Consequently,
which completes the proof.
Property 5. Given an arbitrary 128-bit value $c_0$, for any i satisfying 0 ≤ i ≤ 3, the following property must hold:
Proof. As in all the above proofs, it is sufficient to prove
Consider the case when only s[0][0] takes all the $2^8$ possible values while the remaining bytes of s are constants. For such a set of inputs, except $(R^2(s) \oplus s)[0][0]$, each byte of $R^2(s) \oplus s$ will independently take all the $2^8$ possible values. In other words, when s[D(0)] takes all the $2^{32}$ possible values, the same value of $(R^2(s) \oplus s)[i][j]$ with (i, j) ≠ (0, 0) will appear an even number of times as there are $2^{24}$ different values of (s ) and $2^{24}$ is an even number. From a different perspective, consider the set of inputs where only s[1][1] takes all the $2^8$ possible values. In this case, except $(R^2(s) \oplus s)[1][1]$, each byte of $R^2(s) \oplus s$ will independently take all the $2^8$ possible values. For similar reasons, the same value of $(R^2(s) \oplus s)[i][j]$ with (i, j) ≠ (1, 1) will appear an even number of times when s[D(0)] traverses all the $2^{32}$ possible values.
Combining both cases, when s[D(0)] takes all the $2^{32}$ possible values, the same value of each byte of $(R^2(s) \oplus s)$ will appear an even number of times. However, it should be emphasized that the same value of $R^2(s) \oplus s$ as a whole will not necessarily appear an even number of times. As a result, we have
which completes the proof.

Cryptanalysis of 5-Round AEGIS-128
In this section, we describe how to identify the weak keys in 5-round AEGIS-128 by tracing the expressions of the internal states in terms of the initial state. As the AEGIS-128 state is composed of five 128-bit words and the round update function treats each word as a unit, we are motivated to write the expressions of the internal states in terms of the initial state by treating a 128-bit word rather than 1 bit as a unit. In this way, writing the exact expressions is no longer difficult, whereas it is almost impossible at the bit level.

Writing the Expressions of 5-Round AEGIS-128
To understand how the weak keys influence the expressions, it is essential to know the original expressions for an arbitrary key. For simplicity, when writing the expressions, we omit the constants and only focus on how the nonce evolves as the state is updated. Consequently, in the following, A(N) may represent A(N ⊕ c) where c is a constant depending on the key and the constant part of the initial state. In addition, when a state word does not depend on the nonce N, it is simply written as 0. This makes the expressions more explicit and readable.
The initial state of AEGIS-128 is defined as below:
Therefore, the expressions of $(X_1^0, \ldots, X_1^4)$ can be written as follows:
Similarly, we can write the expressions of $(X_r^0, \ldots, X_r^4)$ for 2 ≤ r ≤ 5. When r = 2, we have
When r = 3, the expressions are
When r = 4, there will be
Finally, when r = 5, we have
According to the output of AEGIS-128, if targeting 5 AEGIS-128 initialization rounds, the attacker can only know
It seems that the simple integral distinguisher for 4-round AES can be directly applied to 5-round AEGIS-128 as $X_5^0$ does not influence the output. However, the logical AND operation between $X_5^2$ and $X_5^3$ makes this impossible. Therefore, we are motivated to investigate whether it is possible to simplify the expressions of $X_5^2$ and $X_5^3$ by adding proper conditions. Indeed, similar ideas are commonly used in symmetric-key cryptanalysis, where additional conditions are added to slow down the propagation of variables, although most of them are deduced by carefully tracing the influence of a certain bit condition. However, it seems difficult to analyze constructions based on AES at the bit level. Thus, we will study how to add conditions at the byte level, as this is more compatible with the AES specification.
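The word-level bookkeeping described in this subsection can be mimicked with a few lines of symbolic code. The sketch below is our own illustration, not the authors' tooling. It assumes the AEGIS-128 initialization as we recall it from the specification [WP13]: initial state $(K \oplus N, C_1, C_0, K \oplus C_0, K \oplus C_1)$, the word K absorbed in odd rounds and $K \oplus N$ in even rounds. The assignment of the names $C_0$ and $C_1$ to the two constants is therefore an assumption and may differ from the paper's convention. Each word is a set of XOR terms, so equal terms cancel, and A is kept as an opaque symbol.

```python
def xor(*words):
    """XOR of word-level expressions; identical terms cancel."""
    out = frozenset()
    for w in words:
        out = out ^ w
    return out


def A(word):
    """One AES round on a whole 128-bit word, kept as an opaque term."""
    return frozenset({("A", word)})


def sym(name):
    return frozenset({name})


def show(word):
    """Readable form of an expression, with terms in sorted order."""
    parts = [t if isinstance(t, str) else "A(" + show(t[1]) + ")" for t in word]
    return " ^ ".join(sorted(parts)) if parts else "0"


def aegis128_symbolic(rounds):
    """Word-level expressions of the state after the given number of rounds."""
    K, N, C0, C1 = sym("K"), sym("N"), sym("C0"), sym("C1")
    # Assumed initial state and message schedule (see the lead-in).
    X = [xor(K, N), C1, C0, xor(K, C0), xor(K, C1)]
    for r in range(1, rounds + 1):
        m = K if r % 2 == 1 else xor(K, N)
        X = [xor(A(X[4]), X[0], m)] + [xor(A(X[i]), X[i + 1]) for i in range(4)]
    return X


state = aegis128_symbolic(2)
for i, word in enumerate(state):
    print(f"X^{i}_2 =", show(word))
```

Under these assumptions the second word after two rounds contains A(K ^ N) ^ A(A(C1 ^ K) ^ N), the combination that the conditions in the next subsection are designed to turn into a constant, and the nonce cancels out of the first word.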

Adding Conditions To Simplify the Expressions
To carefully investigate how the conditions affect the expressions, it is necessary to write the exact expressions of the internal states in terms of the nonce, the key and the constant part of the initial state. In the following, we expand on how the conditions are derived and how the expressions are simplified.
Similarly, the initial state is defined as below:
Our aim is to write the expressions of $(X_r^0, \ldots, X_r^4)$ for 1 ≤ r ≤ 5 by involving all the information. In the process, we will continuously introduce new variables $C_i$ (i > 1) to represent constant values in order to reduce the length of the expressions. To save space, we will not repeat the definitions of these new variables.
When r = 1, we have
When r = 2, we have
When r = 3, we have
From the expressions of $X_3^1$ and $X_3^2$, it can be found that both of them contain the expression $A(K \oplus N) \oplus A(N \oplus A(K \oplus C_1))$. From the proof of Property 3, when the following condition holds:
$A(K \oplus N) \oplus A(N \oplus A(K \oplus C_1))$ will take a constant value if only N[D(0)] varies. Therefore, in the following, we only consider the expressions when N[D(0)] takes all the $2^{32}$ possible values while the remaining bytes of N take random constant values. In this case, when Equation 14 holds, $A(K \oplus N) \oplus A(N \oplus A(K \oplus C_1))$ is constant. Hence, we define that
In this way, the expressions of $(X_3^0, X_3^1, X_3^2, X_3^3, X_3^4)$ can be further simplified, as shown below:
From the simplified expressions, we further observed that we could introduce an intermediate variable T to represent $C_0 \oplus A(K \oplus N)$, i.e.
In this way, the expressions can be further simplified, as shown below:
It will also become clear later that introducing the intermediate variable T is crucial to understand the key-recovery attack on 5-round AEGIS-128. Consequently, when r = 4, there will be
Finally, we consider the case r = 5 and there will be
For the output θ 0 , we have
It should be emphasized that the values of T and N are related according to Equation 15. As N[D(0)] takes all the $2^{32}$ possible values and the remaining bytes of N are constants, T[Col(0)] will also traverse all the $2^{32}$ possible values while the remaining bytes of T are constants. As a result, the values of T can be further divided into $2^{24}$ different subsets according to the value of (T ), though how to divide them depends on the secret key.
Different from the expression of $X_5^1$, the expressions of $X_5^2$ and $X_5^3$ do not contain the variable N after introducing the variable T. Therefore, there is no need to relate T and N when studying the integral property of $X_5^2 \wedge X_5^3$. In other words, we simply treat T as irrelevant to N, and T[Col(0)] will take all the $2^{32}$ possible values. Therefore, we can focus on the case when only T[0][0] is A while the remaining bytes of T are C. In this case, $X_5^2$[Col(1, 2, 3)] are constants, so that the integral property of $(X_5^2 \wedge X_5^3)$[Col(1, 2, 3)] indeed only depends on the integral property of
Indeed, based on Property 4, there will be
For $X_5^1$, it is simple to derive that
However, if considering the inverse of the AES round function, we can find that when only
Finally, we are only left with the integral property of $X_5^4$. Similarly, as its expression does not contain N, we simply consider the case when only T[0][0] is A while the remaining bytes of T are C. From the integral property of 3-round AES, we have
According to Property 2, we have
Hence, there will be
Combining Equation 17, Equation 19 and Equation 20, we thus have
which will be the basis of our key-recovery attack.
Combining Equation 18, Equation 19 and Equation 20, we directly obtain a distinguisher for 5-round AEGIS-128 with time complexity and data complexity $2^{32}$, as shown below:

The Key-Recovery Attack on 5-Round AEGIS-128
Based on the above analysis, we can design a weak key recovery attack. First of all, a weak key should satisfy the following condition:
Notice that after guessing
After the table $KT_0$ is constructed, the correct value of K[D(0)] can be simply recovered in the following way:
Step 1: Construct a set of size $2^8$ for T by traversing T[0][0] while each T[i][j] with (i, j) ≠ (0, 0) is set to a random constant.
Step 2: For each candidate of K[D(0)] in $KT_0$, construct the set of size $2^8$ for N. Specifically, compute the $2^8$ different values for
(1 ≤ i ≤ 3), they are set as random constants. Encrypt all $2^8$ different values of N with 5-round AEGIS-128 and collect the corresponding $2^8$ outputs θ 0 . If
then the current candidate for K[D(0)] is the correct value and exit. Otherwise, consider the next candidate for K[D(0)] and repeat.
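The candidate filter in Step 2 can be sketched generically as follows. Everything here is a placeholder rather than the authors' procedure: `nonces_for` stands for the candidate-dependent map from the set of T to the $2^8$ nonces, `encrypt` for a 5-round AEGIS-128 oracle returning the 16-byte output in column-major order, and the acceptance test (zero XOR sum on columns 1 to 3) is our reading of the event named in the remark at the end of this subsection. The toy oracle only exercises the control flow.

```python
def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def find_candidate(candidates, nonces_for, encrypt):
    """Return the first candidate whose outputs sum to zero on columns 1..3."""
    for cand in candidates:
        acc = bytes(16)
        for nonce in nonces_for(cand):
            acc = xor_bytes(acc, encrypt(nonce))
        # Column-major layout (an assumption): bytes 4..15 are columns 1, 2, 3.
        if acc[4:] == bytes(12):
            return cand
    return None


# --- Toy stand-ins, purely hypothetical, to exercise the loop ---
SECRET = 5


def toy_nonces(cand):
    return [(cand, x) for x in range(256)]


def toy_encrypt(nonce):
    cand, x = nonce
    out = bytearray(16)
    out[0] = x                      # column 0 carries no constraint
    if cand != SECRET and x == 0:
        out[5] = 1                  # wrong guesses break the zero sum
    return bytes(out)


print(find_candidate(range(8), toy_nonces, toy_encrypt))
```

With $2^{24}$ candidates and $2^8$ queries per candidate, the loop performs at most $2^{32}$ encryptions.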

Complexity evaluation. For each candidate for K[D(0)], it is necessary to compute $2^8$ different N. As there are $2^{24}$ candidates, the data complexity is upper bounded by $2^{32}$. The time complexity is also upper bounded by $2^{32}$ encryptions.

Experiments. The experiments are performed on the small-scale AES [CMR05]. For both the distinguishing attack and the key-recovery attack, experiments show that they succeed with probability 1. For a random key, the attacks always fail.
Efficiently recovering the weak key. Notice that a weak key satisfies
After the above procedure, K[D(0)] is known. Therefore, there are 96 unknown key bits left. As K[D(0)] is known, we can independently guess
. Consequently, we can collect $2^{24}$ candidates for each of K[D(1)], K[D(2)] and K[D(3)]. In total, there are $2^{24 \times 3} = 2^{72}$ candidates for the 128-bit key. Therefore, the weak key can be recovered with time complexity $2^{72}$, which is $2^{96-72} = 2^{24}$ times faster than an exhaustive search.
Combined with the procedure to recover K[D(0)], the time complexity, data complexity and memory complexity to recover the weak key are $2^{72}$, $2^{32}$ and $2^{25}$, respectively.
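The counts quoted above follow from simple arithmetic, which the following sketch reproduces. The variable names are ours, and the memory figure is not derived here because its derivation is not spelled out in the text.

```python
from math import log2

# Recovering K[D(0)]: 2^24 candidates, 2^8 chosen nonces per candidate.
candidates_d0 = 2 ** 24
nonces_per_candidate = 2 ** 8
data = candidates_d0 * nonces_per_candidate

# Remaining key bytes: 2^24 candidates for each of K[D(1)], K[D(2)], K[D(3)].
remaining_candidates = (2 ** 24) ** 3

# Exhaustive search over one set of weak keys costs 2^96 trials.
weak_keys = 2 ** 96
speedup = weak_keys // remaining_candidates

print(log2(data), log2(remaining_candidates), log2(speedup))
```

The three printed exponents correspond to the data complexity, the time complexity and the gain over exhaustive search within one set of weak keys.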
Remark. When a key is not a weak key, it is expected that in the procedure to recover K[D(0)] the event θ 0 [Col(1, 2, 3)] = 0 will not happen during the $2^{24}$ tests. Indeed, by imposing different conditions
(1 ≤ i ≤ 3), we can determine different sets of weak keys and they can be recovered in a similar way. In conclusion, there are 4 different sets of weak keys with $2^{96}$ keys each. For each set of weak keys, the time complexity and data complexity to recover the correct one are $2^{72}$ and $2^{32}$, respectively.

Failing in Attacking 5-Round AEGIS-128L
Since it seems that the above method could be applied to 5-round AEGIS-128L, we are motivated to study whether this is actually feasible. However, due to a more clever way of generating the output from the state, a similar attack on 5-round AEGIS-128L does not work.
To explain this, we can carefully study the expressions of the internal states.
Similarly, the AEGIS-128L state is composed of eight 128-bit words and we denote the initial state by $(S_0^0, \ldots, S_0^7)$. The inputs of the initialization phase of AEGIS-128L are the same as those in AEGIS-128. According to the specification of AEGIS-128L, the initial state of AEGIS-128L is defined as follows:
The round update function is:
where $(S_r^0, \ldots, S_r^7)$ denotes the state after r rounds of update. When writing the expressions of $(S_r^0, \ldots, S_r^7)$ for 1 ≤ r ≤ 5, similar to our way of analyzing AEGIS-128, we directly introduce new variables $P_i$ (i ≥ 0) to represent the constant part of the expression.
When r = 2, there will be
When r = 3, there will be
As r increases, the expressions become more and more complex. Thus, we first consider the output of 5-round AEGIS-128L, as shown below:
It can be found from the expression of $S_3^3$ that it is impossible to eliminate $A(A(A(N \oplus K) \oplus C_1) \oplus P_1)$, in which N has passed through 3 AES rounds. Hence, in the expression of $S_5^5$, N will pass through 5 AES rounds and it is impossible to reduce this number by adding proper conditions. Similarly, it is impossible to eliminate $A(A(A(N \oplus K) \oplus P_3) \oplus P_4)$, in which N has also passed through 3 AES rounds. Therefore, in the expression of $S_5^1$, N will pass through 5 AES rounds and it is again impossible to reduce this number by adding proper conditions. As the conventional integral distinguisher on 3-round AES cannot be adapted to 5 rounds, the successful attack on 5-round AEGIS-128 cannot be applied to 5-round AEGIS-128L.
Only for interest, we find that the quadratic parts $S^2_5$ and $S^3_5$ can be simplified by adding proper conditions, the expressions of which can be written as follows:

When $r = 5$, we have where When $r = 6$, we have Then, the expressions can be updated, as shown below: When $r = 7$, we have

Analyzing the Output of 8-Round Tiaoxin
As the current expressions are sufficiently complex, we will not write the expressions for the case $r = 8$. Since our aim is to analyze 8-round Tiaoxin, we need to focus on the output, as shown below: In the following analysis, we assume that $Q[D(0)]$ traverses all the $2^{32}$ possible values and $Q[D(i)]$ ($i \neq 0$) is assigned a random constant.
As $U^1_8 = A(U^0_7)$ and $N$ has passed through 4 AES rounds in the expression of $U^0_7$, we only focus on the integral property of $\mu_0$. First, consider the quadratic part, as listed below: Therefore, for the assumed input pattern of $Q$, $Y^3_8[\mathrm{Col}(1, 2, 3)]$ and $W^3_8[\mathrm{Col}(1, 2, 3)]$ are always constants. In other words, it can be derived that As the algebraic degree of one AES round is 7, from the perspective of the algebraic degree, we indeed have

Then, it is necessary to analyze $W^1_8 = A(W^0_7)$. As according to Property 3, it can be known that the same value of will appear an even number of times for the assumed input pattern of $Q$. Hence, it can be derived that the same value of $W^1_8[\mathrm{Col}(1, 2, 3)]$ will appear an even number of times. In other words,

Next, we are required to analyze $U^0_8 = A(U^2_7) \oplus U^0_7 \oplus Z_0$. For better understanding, the integral properties of $U^0_7$ and $A(U^2_7)$ will be discussed separately. From the expression of $U^0_7$, it is easy to know that For the integral property of Then, the set of $Q$ can be divided into $2^{24}$ different subsets according to the value of ), though the division depends on the key. In this way, for each subset of $Q$, it can be deduced that Therefore, it can be further derived that Consequently, we have For $U^2_7$, similar to the analysis of $W^1_8$, it can be known that the same value of $U^2_7[\mathrm{Col}(1, 2, 3)]$ will appear an even number of times. As a result, we have Therefore, we can deduce that

Finally, it is essential to analyze the integral property of $U^2_8 = U^1_7$. However, we find that is uncertain even though a very similar form will have an integral property which has been discussed for $W^1_8$. According to a similar conditional integral property as specified in Equation 10 constant for the assumed input pattern of $Q$. Consequently, we only need to evaluate $A(A^2(Q \oplus Z_4) \oplus Q)$ and obviously there will be Indeed, based on Property 5, there will be The condition $Z_3[D(0)] = Z_4[D(0)]$ corresponds to a set of weak keys satisfying The condition $Z_2[D(0)] = Z_4[D(0)]$ corresponds to a set of weak keys satisfying When a weak key is used, there must be

However, due to the influence of the integral property of $U^0_8$ as specified in Equation 29, we will lose the integral property for the output $\mu_0$ without applying the linear transform $MC^{-1}$ to $\mu_0$. As $MC^{-1}$ is a linear transform, based on Equation 27, Equation 28 and Equation 33, it can be simply derived that Hence, when a key satisfies any of the three conditions specified in Equation 30, Equation 31 and Equation 32, combined with Equation 29, an integral property for $\mu_0$ can be derived, as shown below: where $(i, j) \in \{(0, 1), (0, 2), (0, 3), (1, 1), (1, 2), (2, 1), (2, 3), (3, 2), (3, 3)\}$.
The distinguisher and weak keys. From the above dedicated manual analysis, there are 9 balanced bytes in $MC^{-1}(\mu_0)$ when $Q$ takes the assumed input pattern. As $Q = A(N)$, $N$ can be computed from $Q$ without the knowledge of the key $K$. Therefore, to make $Q$ take the assumed input pattern, we simply assign values to $Q$ such that they form the assumed input pattern and then compute the corresponding set of $N$. In the distinguishing phase, we simply encrypt the set of $N$ and observe the integral property of $MC^{-1}(\mu_0)$. Therefore, the time complexity and data complexity to distinguish 8-round Tiaoxin are both $2^{32}$ when a weak key is used. For weak keys, there are 3 sets with $2^{96}$ keys each. Similarly, if the conditions are imposed on different diagonals, there will be in total $3 \times 4 = 12$ sets of weak keys with $2^{96}$ keys each.
Remark. It can be observed that what dominates the integral property of $\mu_0$ is the dedicated analysis of $U^0_8$ and $U^2_8$. From the evaluation of $U^0_8$, we learned that we need to evaluate $MC^{-1}(\mu_0)$ rather than $\mu_0$. From the evaluation of $U^2_8$, we learned that we need to add proper conditions in order to obtain an integral property. In addition, from our process to write the expressions, we found that we could arbitrarily choose an input pattern for $A(N)$ rather than $N$, which is equivalent to obtaining a free round and implies that one round is useless in Tiaoxin.
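The free round rests on the fact that $Q = A(N)$ can be inverted without the key. The following minimal Python sketch illustrates this, assuming $A$ is the keyless AES round MixColumns ∘ ShiftRows ∘ SubBytes on a column-major state and that $D(0)$ denotes the byte positions $(i, i)$; both are assumptions made here for illustration, and the helper names (`aes_round_inv`, `nonce_for`) are hypothetical.

```python
# Sketch: invert one keyless AES round A = MixColumns o ShiftRows o SubBytes.
# The state is a list of 16 bytes in column-major order: state[4*c + r].

def gmul(a, b):
    """Multiply a and b in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def make_sbox():
    """Build the AES S-box: field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for i in range(1, 5):
            s ^= ((inv << i) | (inv >> (8 - i))) & 0xFF  # rotate left by i
        sbox.append(s ^ 0x63)
    return sbox

SBOX = make_sbox()
INV_SBOX = [0] * 256
for i, v in enumerate(SBOX):
    INV_SBOX[v] = i

MC = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
INV_MC = [[14, 11, 13, 9], [9, 14, 11, 13], [13, 9, 14, 11], [11, 13, 9, 14]]

def mix(state, matrix):
    """Multiply every column of the state by the given 4x4 matrix."""
    out = [0] * 16
    for c in range(4):
        for r in range(4):
            acc = 0
            for k in range(4):
                acc ^= gmul(matrix[r][k], state[4 * c + k])
            out[4 * c + r] = acc
    return out

def aes_round(state):
    """A(x): SubBytes, then ShiftRows (row r rotated left by r), then MixColumns."""
    s = [SBOX[b] for b in state]
    s = [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]
    return mix(s, MC)

def aes_round_inv(state):
    """A^{-1}(x): undo MixColumns, ShiftRows and SubBytes in that order."""
    s = mix(state, INV_MC)
    s = [s[4 * ((c - r) % 4) + r] for c in range(4) for r in range(4)]
    return [INV_SBOX[b] for b in s]

def nonce_for(q_diag, q_const):
    """Set the 4 bytes of diagonal 0 of Q (assumed positions (i, i)) and return N."""
    q = list(q_const)
    for i, b in enumerate(q_diag):
        q[4 * i + i] = b
    return aes_round_inv(q)
```

In a distinguishing experiment one would call `nonce_for` for every choice of the four diagonal bytes while keeping `q_const` fixed, and query the cipher on the resulting nonces.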

Feasibility of the Key-Recovery Attacks
As our key-recovery attack on 5-round AEGIS-128 succeeds, it is natural to ask whether it is also feasible to recover the weak key efficiently for 8-round Tiaoxin. One main reason why we can mount a key-recovery attack on 5-round AEGIS-128 is that an intermediate variable $T = A(N \oplus K)$ is introduced and the integral property of the output can be determined when only one byte of $T$ is A and the remaining bytes are C. As a result, we need to evaluate whether the same strategy can be applied to 8-round Tiaoxin.
First of all, we need to understand why the 8-round distinguisher requires $2^{32}$ data. One main reason arises in the evaluation of $U^0_8 = A(U^2_7) \oplus U^0_7 \oplus Z_0$, as $Q$ will pass through 4 AES rounds both in $A(U^2_7)$ and $U^0_7$. However, different expressions in terms of $Q$ appear in $U^0_7$ and $U^2_7$, which are $A^2(Q \oplus Z_3)$ and $A^2(Q \oplus Z_4)$. Although there are three choices of the conditions that the weak keys should satisfy, if choosing the condition that $Z_3[D(0)] = Z_4[D(0)]$, the expressions of $U^2_7$ and $U^1_7$ can be significantly simplified if only $Q[D(0)]$ varies, as shown below: where In this way, only in the expression of $U^0_7$, $Q$ will pass through 4 AES rounds. Thus, it is necessary to introduce a new variable $G$ to represent $A(Q \oplus Z_4)$ in order to mount a key-recovery attack, i.e.
In this way, the expression of $U^0_7$ can be updated as follows: Therefore, based on the relation between $Q$ and $G$, it can be derived that According to the expression of $U^2_7$, we have As a result, we have For the quadratic part $Y^3_8 \wedge W^3_8$, it remains that

Weak Constants. For $U^2_8 = U^1_7$, if $Z_2$ is treated as independent of $Z_4$, we immediately lose the integral property for $U^2_8$. As a result, we try to further reduce the size of the set of weak keys by adding the condition $Z_4$ In this way, the key should satisfy the following conditions: It should be noted that when [0] will be A and the remaining bytes will be C. What we want to emphasize is that there is no need to add a stronger condition like One may also find that there are 4 possible ways to add conditions on $Z_4$ and $Z_2$ in order to make that only . Therefore, we only focus on the way to choose conditions such that there are no additional conditions on the constant $Z_0$.
Let K = A(K) and we will have which implies that the number of weak keys becomes $2^{24 \times 3} = 2^{72}$. Once the key satisfies the above conditions, when only $G[0][0]$ is A and the remaining bytes of $G$ are all C, we can know that only $A(Q \oplus Z_2)[0][0]$ is A and the remaining bytes of $A(Q \oplus Z_2)$ are all C. Therefore, according to the expression of $U^1_7$, we can know that For $W^1_8 = A(W^0_7)$, we also lose its integral property. However, if adding the condition ) is a constant and we denote it by $Z_{16}$. In other words, when the following condition holds, $W^0_7$ can be written as As a result, However, combined with the condition on the key, the condition specified in Equation 37 implies that In other words, if the constants $Z_0$ and $Z_1$ satisfy Equation 38 and the key satisfies

Feasibility for 8-round Tiaoxin. As the above analysis shows, to mount a key-recovery attack on 8-round Tiaoxin, the size of the set of weak keys is reduced to $2^{72}$ and there are 32 bit conditions on the constants $(Z_0, Z_1)$. The constants used in Tiaoxin do not satisfy Equation 38, and thus the above key-recovery attack cannot be applied to 8-round Tiaoxin. However, it serves as a warning that in designs like Tiaoxin, the round constants should be carefully chosen.

Discussions
The basic idea behind the attacks on 5-round AEGIS-128 and 8-round Tiaoxin is simple: utilize the conventional integral distinguisher for 4-round AES. However, the feasibility of this simple idea needs to be highlighted, as it advances the understanding of AEGIS-128 and Tiaoxin.
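For readers less familiar with integral distinguishers, the sketch below checks the classical property experimentally on a scaled-down instance: with one active byte (A) and 15 constant bytes (C), the XOR-sum of the states after 3 full AES rounds is zero in every byte. This is the well-known $2^8$-data version; the 4-round variant used in this paper activates a whole diagonal and needs $2^{32}$ inputs, which is too large to run here. The round keys, the base state and the function names are arbitrary choices made for illustration.

```python
import random

def gmul(a, b):
    """Multiply a and b in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def make_sbox():
    """Build the AES S-box: field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for i in range(1, 5):
            s ^= ((inv << i) | (inv >> (8 - i))) & 0xFF  # rotate left by i
        sbox.append(s ^ 0x63)
    return sbox

SBOX = make_sbox()
MC = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]

def aes_round(state, key):
    """SubBytes, ShiftRows, MixColumns, then XOR of a 16-byte round key."""
    s = [SBOX[b] for b in state]
    s = [s[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]
    out = [0] * 16
    for c in range(4):
        for r in range(4):
            for k in range(4):
                out[4 * c + r] ^= gmul(MC[r][k], s[4 * c + k])
    return [x ^ k for x, k in zip(out, key)]

def integral_sum(rounds, seed=1):
    """XOR-sum of the outputs over all 256 values of one active input byte."""
    rng = random.Random(seed)
    keys = [[rng.randrange(256) for _ in range(16)] for _ in range(rounds)]
    base = [rng.randrange(256) for _ in range(16)]
    acc = [0] * 16
    for v in range(256):
        state = list(base)
        state[0] = v  # the single active byte; all other bytes stay constant
        for key in keys:
            state = aes_round(state, key)
        acc = [a ^ b for a, b in zip(acc, state)]
    return acc
```

Calling `integral_sum(3)` with different seeds returns the all-zero state, whereas adding further rounds generally destroys the property; this is the reason the analysis tracks how many AES rounds the nonce passes through in each state word.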
Feasibility for 5-round AEGIS-128. For 5-round AEGIS-128, the feasibility relies heavily on the fact that $X^0_5$, where $N$ will pass through 5 AES rounds, is not involved in the output. For the distinguishing attack, it is essential to consider the weak key. However, the weak key is not obvious and is hidden in the expressions of the internal states. Therefore, our way to write and analyze the expressions plays an important role in identifying the weak keys. For the key-recovery attack, the feasibility is also owed to the analysis of the expressions, as we find that it is possible to introduce a variable $T$ to replace $A(K \oplus N)$, which directly makes the variable $N$ disappear in almost all expressions. In other words, introducing $T$ is equivalent to appending a round for key recovery. While appending several rounds before a distinguisher for key recovery is common in the analysis of block ciphers, it is non-intuitive for AEGIS-128.
Feasibility for 8-round Tiaoxin. It is common in the cryptanalysis of symmetric-key primitives to add conditions to reduce the algebraic degree. Therefore, the condition on the keys for 5-round AEGIS-128 is still traceable if more attention is paid. However, the condition on the key for 8-round Tiaoxin appears for a very different reason, which we believe is hard to detect without writing and analyzing the expressions of the internal states. In particular, from our process to write the expressions until the 6th round, we find that the expressions can be represented in terms of $A(N)$ rather than $N$. By replacing $A(N)$ with a new variable $Q$, instead of considering $N$ itself, we can directly study $Q$, as computing $N$ from $Q$ requires no secret knowledge, which implies that there is 1 useless round in Tiaoxin. To mount a key-recovery attack, we have to add more conditions, as there are several different terms like $A(Q \oplus \nu_i)$, where $\nu_i$ represents a different 128-bit constant for different $i$. However, by carefully studying the expressions, we find it still feasible to mount a key-recovery attack when a weak constant is used, which occurs with probability $2^{-32}$. In summary, a lot of useful information related to attacks is hidden in the expressions and it is necessary to perform dedicated analysis of them.

On the Usage of Division Property
The bit-based division property (BDP) [TM16] is a popular tool to search for integral distinguishers, especially when equipped with automatic tools [XZBL16]. However, without identifying the weak keys with our methods, a naive implementation that does not take the conditions on the key into account will fail to find the integral distinguishers for 5-round AEGIS-128 and 8-round Tiaoxin, because these distinguishers do not hold when a non-weak key is used. Especially for Tiaoxin, the input pattern of $N$ is unstructured, as we in fact consider a structured pattern of $A(N)$. Without noticing this fact, it is impossible to find the distinguisher with data complexity $2^{32}$ with a naive implementation of BDP.
Only for interest, we also tested whether the conventional bit-based division property (CBDP) [TM16] can detect the phenomenon that the same value appears an even number of times in a multiset. By CBDP, we mean that the integral property is evaluated based on whether there exists a feasible desired division trail [XZBL16].
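The reason an even multiplicity matters is that it forces a zero XOR-sum no matter which function is applied afterwards. The toy Python sketch below illustrates this on 8-bit values with $g(s) = f(s) \oplus f(s \oplus c)$ for a nonzero $c$: since $g(s) = g(s \oplus c)$, the inputs pair up and every value of $g$ occurs an even number of times. This is a deliberately simplified stand-in and not Property 3 itself; the random tables `f` and `h` and the function name are assumptions made for illustration.

```python
import random
from collections import Counter

def even_multiplicity_demo(seed=0, bits=8):
    """Return the value counts of g(s) = f(s) ^ f(s ^ c) and the XOR-sum of h(g(s))."""
    rng = random.Random(seed)
    size = 1 << bits
    f = [rng.randrange(size) for _ in range(size)]  # arbitrary inner map
    h = [rng.randrange(size) for _ in range(size)]  # arbitrary outer map
    c = rng.randrange(1, size)                      # nonzero difference
    values = [f[s] ^ f[s ^ c] for s in range(size)]
    counts = Counter(values)
    total = 0
    for v in values:
        total ^= h[v]  # each value contributes h(v) once per occurrence
    return counts, total
```

Because `h` is arbitrary, the zero sum cannot be explained by a degree argument alone, which is the kind of situation examined in the CBDP experiment discussed next.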
We implemented CBDP to evaluate the integral property for It has been proved in Property 3 that the sum must be zero. However, it is evaluated as unknown with CBDP. It should be mentioned that the sum is correctly predicted, which is obviously reasonable as

To explain why CBDP failed, we will construct a special example. Specifically, consider a mapping $\varphi(v_0, v_1, v_2, \kappa_0, \kappa_1, \kappa_2, \kappa_3, \kappa_4, \kappa_5) : \mathbb{F}_2^9 \to \mathbb{F}_2^3$ and denote the three output bits by $(\iota_0, \iota_1, \iota_2)$, as defined below: Obviously, for an arbitrary choice of $(\kappa_0, \kappa_1, \kappa_2, \kappa_3, \kappa_4, \kappa_5)$, there must be It is certain that CBDP can predict these integral properties.
Then, the question becomes whether CBDP can capture it. To show that CBDP cannot capture it, it is sufficient to find a feasible division trail ending with "1". Therefore, we consider a quadratic boolean function $g(\iota_0, \iota_1, \iota_2) = \iota_2 \wedge (\iota_0 \oplus \iota_1)$. The circuit to calculate $g$ is depicted in Figure 3. Based on the propagation rules, we can deduce by hand a feasible division trail ending with "1" as illustrated in Figure 3, thus revealing that CBDP cannot predict the sum of $g$. Consequently, this is evidence that CBDP is unable to capture the property that the same value appears an even number of times in a multiset. Although the polynomials are known in this special example and can indeed be simplified, i.e. the circuit will change after simplification, it is too complex to write the boolean expression of an output bit for a cryptographic primitive, and therefore the evaluation is in practice based on the (non-simplified) circuit to compute the output. This is why we directly study the circuit rather than the accurate simplified expression of $\iota_2 \wedge (\iota_0 \oplus \iota_1)$ in this special example.

A potential method to address this problem is to use the recent ideas of [WHG + 19, HLM + 20, HSWW20], which count the number of division trails. When the number is even, the sum is treated as zero, which fits very well with our theoretical analysis, as we prove that the same value must appear an even number of times in a multiset. However, due to the influence of the matrix multiplication of AES, as revealed in [HLLT20], the number of trails will explode when the number of matrix multiplications increases. Thus, it is questionable whether the solver can enumerate all the feasible solutions in practical time, especially when evaluating $\bigoplus_{s[D(0)] \in \mathbb{F}_2^{32}} R^n(R^2(s \oplus c_0) \oplus R^2(s \oplus c_1))$ for large $n$, which underlines the importance of proving Property 3.
Combining all the above discussions, it seems plausible that the distinguishers for 5-round AEGIS-128 and 8-round Tiaoxin with data complexity $2^{32}$ in the weak-key setting were not found during the long CAESAR competition. Moreover, the feasibility of appending additional rounds for key recovery is deeply hidden in the expressions of the internal states.

Conclusion
By expressing the internal states in terms of the input state words, we observed the possibility to adapt the well-known integral distinguisher on reduced AES to AEGIS-128 and Tiaoxin in the weak-key setting. With dedicated analysis of these expressions, the sets of weak keys are eventually identified, which are used to simplify the quadratic part of the output for AEGIS-128 and to turn a probabilistic integral property into a deterministic one for Tiaoxin, respectively. To make the derived integral distinguishers theoretically correct, we have proved some integral properties for some unusual combinations of the AES round function, which easily occur in constructions like AEGIS and Tiaoxin but never occur in real AES. To efficiently recover the weak key, we introduce a new variable related to the key to represent the output of the AES round function for a certain 128-bit word and then study the updated expressions. Such a way is almost equivalent to appending rounds for key recovery before a distinguisher, and the feasibility relies heavily on the careful analysis of the updated expressions. Consequently, distinguishing and key-recovery attacks on 5-round AEGIS-128 are achieved in the weak-key setting. For 8-round Tiaoxin, we could only construct the distinguisher, while the key-recovery attack requires the usage of a weak constant occurring with probability $2^{-32}$. This is the first third-party cryptanalysis of the initialization phase for both AEGIS-128 and Tiaoxin and all the attacks reach half of the total number of rounds. Based on our analysis, it seems that attacks on constructions like AEGIS-128 and Tiaoxin in the weak-key setting have more potential.
Although Equation 40 is very similar to Equation 6 and Equation 12, we are unable to find a similar way to prove it. Although it is trivial to deduce from Property 3 that (41)

Let $\tau = R^2(s) \oplus R^2(s \oplus c_1) \oplus s \oplus c_0$. From the proof of Property 3, it can be found that the expression of $\tau[0][0]$ can be written as follows: where is a constant depending on $c_0$, $c_1$ and the constant part of $s$. For the small-scale AES round function, we performed an exhaustive search. Specifically, for each value of $c_1[D(0)]$, of which there are $2^{16}$ in total for the small-scale AES, we traversed all the $2^{16}$ possible values of $s[D(0)]$ and collected the corresponding set of $\tau[0][0]$. It is found that the same value of $\tau[0][0]$ in the set always appears an even number of times. Then, the expressions for $\tau[i][i]$ ($1 \le i \le 3$) can also be written. After performing a similar exhaustive search for each $\tau[i][i]$, it is also observed that the same value in the computed set always appears an even number of times, thus explaining why the whole $W^1_8$ is balanced in our experiments. However, when we change to the real AES round function, the exhaustive search is infeasible. Nevertheless, to disprove a statement, it suffices to find a counter-example.
In the experiments, we randomly chose a value for c 1 [D(0)] and and evaluated the sum of the set of S(τ [0][0]) when s[D(0)] traverses all the 2 32 possible values.It is found the sum is always non-zero, thus disproving Equation 41.Similar experiments were also performed to independently evaluate the sum of S(τ [i][i]) (1 ≤ i ≤ 3).

Figure 1: The AES state