Finding Collisions against 4-Round SHA-3-384 in Practical Time

Abstract. The Keccak sponge function family, designed by Bertoni et al. in 2007, was selected by the U.S. National Institute of Standards and Technology (NIST) in 2012 as the next generation of Secure Hash Algorithm (SHA-3). Due to its theoretical and practical importance, cryptanalysis of SHA-3 has attracted a lot of attention. Currently, the most powerful collision attack on SHA-3 is the linearisation technique of Guo et al. However, this technique is infeasible for variants with a smaller input space, such as SHA-3-384. In this work we improve upon previous results by utilising three ideas which were not used in previous works on collision attacks against SHA-3. First, we use 2-block messages instead of 1-block messages, to reduce constraints and increase flexibility in our solutions. Second, we reduce the connectivity problem to a satisfiability (SAT) problem, instead of applying the linearisation technique. Finally, we propose an efficient deduce-and-sieve algorithm on the basis of two new non-random properties of the Keccak non-linear layer. The resulting collision-finding algorithm on 4-round SHA-3-384 has a practical time complexity of $2^{59.64}$ (and a memory complexity of $2^{45.94}$). This greatly improves upon the best known collision attack so far: Dinur et al. achieved an impractical $2^{147}$ time complexity. Our attack does not threaten the security margin of the SHA-3 hash function. However, the tools developed in this paper could be used to analyse other cryptographic primitives as well as to develop new and faster SAT solvers.


Introduction
Cryptographic hash functions are unkeyed primitives that accept an arbitrarily long input message and produce a fixed-length output hash value, or digest for short. Since Diffie and Hellman [DH76] suggested signing a cryptographic hash value of a message rather than the message itself, hash functions have become extremely useful in various cryptographic protocols: authentication (e.g., HMAC [BCK96]), password protection, commitment schemes, key exchange protocols, etc. Hence, the need for a secure and efficient hash function is great, both for real-life applications and as a component of more complex constructions.
The first de-facto cryptographic hash function was MD5 [Riv92], developed by Rivest to fix a few issues with its predecessor MD4. Later, the US National Institute of Standards and Technology (NIST) published the SHA standard [NIS93]. Two years later, SHA was updated into what was later named SHA-1 [NIS95] to prevent some attacks that were not disclosed to the public (but were later rediscovered by Chabaud and Joux [CJ98]). With the need for output sizes larger than SHA-1's 160 bits, NIST published a new family of hash functions, called SHA-2, with output sizes of 224-512 bits [NIS02].
In 2005, Wang et al. [WLF+05, WY05, WYY05] broke several cryptographic hash functions. These fundamental works demonstrated the way to attack most of the existing hash functions using several techniques and ideas: using modular differences (i.e., using both an XOR difference and an additive difference); using multi-block collisions (i.e., collisions that span over several blocks, an idea independently discovered in [BCJ15]); and introducing the message modification technique (a method to tweak a pair of messages conforming to some differential characteristic up to a certain round, so that it satisfies the characteristic for more rounds).
These advances, along with results on the Merkle-Damgård hash function [Dam89,Mer89], which is the design all previously mentioned hash functions followed, led NIST to start a cryptographic competition for the selection of a new hash function standard. The process started in 2008, and in 2015 Keccak [BDPA] was published as the new SHA-3 standard [NIS15].
Keccak [BDPA], designed by Bertoni et al., is a sponge construction. It has a 1600-bit state which is updated by XORing message blocks into the state. The size of a message block (the bitrate r) depends on the required output size: the capacity of the sponge function should be twice as large as the output size, and the remaining r bits of the state are XORed with the message block. Then, a 24-round permutation, Keccak-f, is applied to the state and another block is absorbed into the state, until the last block is absorbed. Finally, the internal state is updated again using Keccak-f, and some bits of the internal state are revealed as the output.
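The relation between output size, capacity and bitrate described above can be tabulated with a few lines of Python. This is only an illustrative sketch; the function name `sponge_parameters` is ours and does not come from the Keccak specification.

```python
STATE_BITS = 1600  # width of the Keccak-f[1600] state


def sponge_parameters(digest_bits):
    """Return (rate, capacity) in bits for a SHA-3 variant with the given digest size."""
    capacity = 2 * digest_bits      # capacity is twice the output size
    rate = STATE_BITS - capacity    # the remaining bits absorb the message block
    return rate, capacity


for n in (224, 256, 384, 512):
    rate, capacity = sponge_parameters(n)
    print(f"SHA-3-{n}: rate = {rate} bits, capacity = {capacity} bits")
```

For SHA-3-384 this gives a bitrate of 832 bits, which is the block length used throughout the attack.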
The Keccak sponge function can be deployed in different modes, namely, keyed mode and unkeyed mode. Since the publication of Keccak in 2008, the analysis of both keyed mode and unkeyed mode Keccak has attracted considerable attention. For the keyed Keccak, a cube-like attack proposed by Dinur et al. [DMP+15] and a conditional cube attack proposed by Huang et al. [HWX+17] are the most powerful tools for analysing primitives based on the Keccak sponge function.
The purpose of a collision attack on a hash function H is to find a pair of distinct messages M and M′ such that H(M) = H(M′). Finding a colliding pair should be computationally difficult for a secure hash function. In 2011, Naya-Plasencia et al. reported a collision attack on 2-round Keccak-512, among several other practical attacks on the Keccak hash function [NRM11]. In 2012, Dinur et al. proposed practical collision attacks on 4-round Keccak-224/256 [DDS12]: they combined a 1-round connector with a 3-round low-weight characteristic by algebraic techniques. In 2013, the same authors constructed practical collision attacks on 3-round Keccak-384 and Keccak-512 and theoretical attacks on 4-round Keccak-384 and 5-round Keccak-256 using generalised internal differentials [DDS13].
Currently, the most powerful tool for building a practical collision attack against the SHA-3 hash function is the linearisation technique [QSLG17, SLG17, GLL+20]. In [QSLG17], Qiao et al. followed the framework proposed by Dinur et al. in [DDS12] and extended the previous 1-round connector by one more round. In that work, the authors developed a novel algebraic technique to linearise all S-boxes in the first round. Song et al. [SLG17] developed a new non-full linearisation technique to save degrees of freedom in the attack. Using this new technique, they launched several practical collision attacks on the Keccak family, such as 5-round SHA-3-224 and 5-round SHA-3-256. Guo et al. recapped the two results [QSLG17, SLG17] in [GLL+20]. There was no further development after that. However, that technique cannot be directly applied to variants with a smaller input space, such as SHA-3-384 and SHA-3-512. This is because the appended conditions consume many degrees of freedom, more than variants with a smaller input space can provide.
Morawiecki et al. [MS13] applied a SAT solver to the analysis of an unkeyed mode modified Keccak, using different parameters than the recommended ones, to find the preimage of a hash value up to 3 rounds. However, the authors did not consider Keccak's algebraic structure in their work, which reduces the SAT solver's power. In our collision attack, we combine algebraic non-random characteristics with a SAT solver to make full use of its efficiency.
Our Contributions. Our work extends previous results [DDS12] on finding collisions in SHA-3: a 3-round differential characteristic that leads to a collision is used in rounds 2-4, whereas a connecting phase is used in the first round to lead the input message pair into the input difference of the differential characteristic.
Inspired by the collision attacks proposed by Boissier et al. in [BNR21] and the preimage attacks against Keccak-224/256 proposed in [LS19] by Li et al., we use more than a single block in the colliding message pair. Unlike the inner collision attacks against smaller Keccak variants in [BNR21], our technique can work on the outer part of a Keccak default variant. Namely, we noticed that good input differences often impose conditions on the input that cannot be satisfied (as they are in the capacity part of the state). By first finding a message pair that satisfies these conditions, we lift this restriction. This step increases the flexibility in choosing a differential characteristic. In addition, we use the first block to set some capacity bits to values that help the connectivity step.
In addition, we introduce another two techniques into collision attacks on Keccak. Our second contribution is to replace the linearisation connection phase that was used before in [GLL+20] with a SAT-connection phase. This idea is inspired by the dedicated collision attack against SHA-1 with the aid of a SAT solver, proposed by Stevens et al. in [SBK+17]. Namely, we use SAT solvers to find message values that satisfy the required difference conditions, while previous works [GLL+20, QSLG17, SLG17] did the connection from the input difference of the characteristic to the message conditions using linearisation. Again, there are two advantages to this approach. First, we gain greater flexibility in choosing the differential characteristic, as now we can "connect" to a wider range of input differences. Second, non-linear conditions which are useful in finding collisions (i.e., fixing intermediate bits to some values) are much easier to satisfy using this sort of tool.
The third contribution is the introduction of detection and sieving tools. They complete internal states more efficiently than applying a SAT solver directly on non-linear problems. This reduces the number of unknowns and simplifies relations, making SAT solvers more efficient by orders of magnitude. We introduce a Truncated Difference Transform Table: for a given truncated differential transition, the table stores the possible differential transitions. That is, if a truncated differential is followed, the table allows one to efficiently find the actual bit differences that were involved in the transition. We also introduce a Fixed Value Distribution Table, a precomputed table used to efficiently identify values that correspond to certain truncated difference transitions (just as the original work of [BS93] also stored in the difference distribution table the values that correspond to each transition). Using these two tools enables, for each pair, the deduction of information needed to satisfy the differential characteristic.
We combine these ideas and produce the first practical attack that can find collisions in 4-round SHA-3-384. The expected running time of this attack is below $2^{60}$ (we remind the reader that the SHA-1 collision found by [SBK+17] used about $2^{63}$ computations). While we implemented the attack and verified it, our best result at the moment is a 4-bit semi-free internal collision, which is, to date, the best known semi-free collision against SHA-3-384. We compare our result with previous results of collision attacks on SHA-3 in Table 1. Our attack can also be extended to a chosen-prefix collision attack [SLdW07]. In this situation, the attacker's task is then to find a collision while starting from a random difference in the internal state (due to the prefix pair that is not controlled at all by the attacker). Chosen-prefix collision attacks are more difficult to mount but are stronger attacks, more relevant to practice, because the chosen prefixes can be arbitrary meaningful texts.
Organisation of the paper. The rest of the paper is organised as follows. In Section 2, we describe the SHA-3 hash function and properties of the Keccak round function. In Section 3, we revisit the collision attack proposed by Guo et al. The new framework of our attack is illustrated in Section 4. The methods of constructing the differential characteristic and generating the first blocks are stated in Section 5. The SAT-connection phase is described in Section 6. In Section 7, experimental results of our attack are given. Finally, we conclude the paper in Section 8.

SHA-3 hash function
The Keccak algorithm. In this section we describe the Keccak hash function in its default version. We refer the reader to [BDPA,NIS15] for the complete Keccak specification.
Initially, the state is filled with zeroes and the message is split into r-bit blocks. There are two phases in the Keccak hash function. In the absorbing phase, the next r-bit message block is XORed with the first r-bit segment of the state and then the state is processed by an internal permutation that consists of 24 rounds. After all the blocks are absorbed, the squeezing phase begins. In the squeezing phase, Keccak-n iteratively returns the first r bits of the state as output, applying the internal permutation between successive outputs, until an n-bit digest is produced.
The operations ρ and π implement a bit-level permutation of the state. Let us denote this combined permutation by σ = π ∘ ρ, which forms a mapping on integers {0, 1, ..., 1599} such that σ(i) is the new position of the i-th bit in the state after applying π ∘ ρ. We denote by L the first three linear operations θ, ρ and π, which we call a half round. We rewrite the expression of L in Equation 1:
$B[i] = A[\sigma^{-1}(i)] \oplus col[\phi_1(\sigma^{-1}(i))] \oplus col[\phi_2(\sigma^{-1}(i))]$. (1)
In Equation 1, A is the input state of L while B is the output state. $col[\phi_1(\sigma^{-1}(i))]$ and $col[\phi_2(\sigma^{-1}(i))]$ are the sums of the five bits in the $\phi_1(\sigma^{-1}(i))$-th and $\phi_2(\sigma^{-1}(i))$-th columns, respectively. The sum of the five bits in one column is called a column sum.
Padding rule. The Keccak hash function uses a multi-rate padding rule. By this rule, the original message M is appended with a single bit 1 followed by the minimum number of 0 bits and a single 1 bit such that the length of the resulting message is a multiple of the bitrate r. Specifically, the resulting padded message is $\overline{M} = M \| 10^*1$.
In the four Keccak variants adopted by the SHA-3 standard, the message is first appended with '01', then the padding rule is applied. Namely, the resulting padded message is $\overline{M} = M \| 0110^*1$.
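To make the absorbing phase, the round function and the SHA-3 padding concrete, the following is a minimal pure-Python sketch of SHA-3-384 on byte strings. It is written for readability, not speed. The `rounds` parameter is our own addition for experimenting with round-reduced variants, and we assume (this is not stated above) that a reduced variant keeps the first `rounds` rounds of the permutation; with the default of 24 rounds the output can be compared against `hashlib.sha3_384`.

```python
import hashlib

MASK = (1 << 64) - 1


def rol64(a, n):
    """Rotate a 64-bit lane left by n positions."""
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK


def keccak_f(state, rounds=24):
    """Apply the first `rounds` rounds of Keccak-f[1600] to a 200-byte state."""
    # Lane (x, y) occupies bytes 8*(x+5y) .. 8*(x+5y)+7, little-endian.
    lanes = [[int.from_bytes(state[8 * (x + 5 * y):8 * (x + 5 * y) + 8], "little")
              for y in range(5)] for x in range(5)]
    lfsr = 1  # generates the round-constant bits for iota
    for _ in range(rounds):
        # theta: add two neighbouring column sums to every bit
        C = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
             for x in range(5)]
        D = [C[(x + 4) % 5] ^ rol64(C[(x + 1) % 5], 1) for x in range(5)]
        lanes = [[lanes[x][y] ^ D[x] for y in range(5)] for x in range(5)]
        # rho and pi: rotate lanes and move them to new positions
        x, y = 1, 0
        current = lanes[x][y]
        for t in range(24):
            x, y = y, (2 * x + 3 * y) % 5
            current, lanes[x][y] = lanes[x][y], rol64(current, (t + 1) * (t + 2) // 2)
        # chi: the only non-linear step, applied to each 5-bit row
        for y in range(5):
            row = [lanes[x][y] for x in range(5)]
            for x in range(5):
                lanes[x][y] = row[x] ^ (~row[(x + 1) % 5] & MASK & row[(x + 2) % 5])
        # iota: add the round constant to lane (0, 0)
        for j in range(7):
            lfsr = ((lfsr << 1) ^ ((lfsr >> 7) * 0x71)) % 256
            if lfsr & 2:
                lanes[0][0] ^= 1 << ((1 << j) - 1)
    out = bytearray(200)
    for x in range(5):
        for y in range(5):
            out[8 * (x + 5 * y):8 * (x + 5 * y) + 8] = lanes[x][y].to_bytes(8, "little")
    return out


def sha3_384(message, rounds=24):
    """SHA-3-384 of a byte string, optionally with a round-reduced permutation."""
    rate = 832 // 8  # bitrate r = 832 bits = 104 bytes
    # Suffix '01' followed by pad10*1; for byte-aligned input this is 0x06 ... 0x80.
    padded = bytearray(message) + b"\x06"
    padded += b"\x00" * (-len(padded) % rate)
    padded[-1] ^= 0x80
    state = bytearray(200)
    for offset in range(0, len(padded), rate):
        for i in range(rate):
            state[i] ^= padded[offset + i]   # absorb one block into the outer part
        state = keccak_f(state, rounds)
    return bytes(state[:48])                 # 384 bits fit in a single squeeze


if __name__ == "__main__":
    print(sha3_384(b"abc").hex() == hashlib.sha3_384(b"abc").hexdigest())
    print(sha3_384(b"abc", rounds=4).hex())
```

Calling `sha3_384(msg, rounds=4)` gives a 4-round variant under the assumption stated above; differential attacks of the kind discussed here do not depend on the round constants.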

Properties of the Keccak round function
In this section we show five properties of the Keccak round function. The first property is called the column parity kernel (CP-kernel) equation: for states in which all columns have even parity, θ is the identity [BDPA]. This property has been widely used in cryptanalysis of Keccak. E.g., the attacks in [HWX+17] use it to control the diffusion of cube variables.

Property 1. (CP-kernel Equation)
For every i-th and j-th bits in the same column of the state A we have $A[i] \oplus A[j] = B[\sigma(i)] \oplus B[\sigma(j)]$, where A and B are the input and output states of L, respectively, and 0 ≤ i, j < 1600, i ≠ j.
Property 1 can be easily verified through Equation 1. As the operations in the first half round are all linear, the equality also holds for differences of corresponding bits.
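The CP-kernel behaviour can be checked experimentally on the lane representation of the state. The sketch below implements only θ (ρ and π merely move bits, so they carry the relation over to the positions σ(i) and σ(j)); the helper names and the fixed random seed are our own choices.

```python
import random

MASK = (1 << 64) - 1


def rol64(a, n):
    n %= 64
    return ((a << n) | (a >> (64 - n))) & MASK


def theta(lanes):
    """Apply theta to a state given as lanes[x][y], each a 64-bit integer."""
    C = [lanes[x][0] ^ lanes[x][1] ^ lanes[x][2] ^ lanes[x][3] ^ lanes[x][4]
         for x in range(5)]
    D = [C[(x + 4) % 5] ^ rol64(C[(x + 1) % 5], 1) for x in range(5)]
    return [[lanes[x][y] ^ D[x] for y in range(5)] for x in range(5)]


def same_column_xors_preserved(A, B):
    """Check A[i] xor A[j] == B[i] xor B[j] for all bit pairs sharing a column."""
    # Two bits share a column when they have the same x and z; XORing whole
    # lanes with equal x compares all 64 values of z at once.
    return all(A[x][y1] ^ A[x][y2] == B[x][y1] ^ B[x][y2]
               for x in range(5) for y1 in range(5) for y2 in range(y1 + 1, 5))


rng = random.Random(2024)
A = [[rng.getrandbits(64) for _ in range(5)] for _ in range(5)]
print(same_column_xors_preserved(A, theta(A)))

# A state in the CP-kernel: force every column to even parity via the y = 4 lane.
K = [[rng.getrandbits(64) for _ in range(5)] for _ in range(5)]
for x in range(5):
    K[x][4] = K[x][0] ^ K[x][1] ^ K[x][2] ^ K[x][3]
print(theta(K) == K)
```

Both checks succeed because θ adds the same two column sums to every bit of a given column, so the sums cancel in the XOR of two such bits and vanish entirely when all column parities are even.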
Before we present four differential properties of the non-linear operation χ, we first recall the definition of the difference distribution table (DDT) of χ [BS93]. The operation χ is applied to each row of the state independently, and can be regarded as an S-box. In differential cryptanalysis, proposed by Biham and Shamir in [BS93], the DDT of an S-box counts the number of cases where the input difference of a pair is a and the output difference is b. In our case, for an input difference $a \in \mathbb{F}_2^5$ and an output difference $b \in \mathbb{F}_2^5$, the entry δ(a, b) of the DDT of the Keccak S-box S is $\delta(a, b) = |\{x \in \mathbb{F}_2^5 : S(x) \oplus S(x \oplus a) = b\}|$.
We summarise all cases with special output differences in Table 2 when the output differences in Property 3 and Property 4 are shifted to the right.
If only one input bit of the Keccak S-box is known, two output differences of the S-box become linear. Let us show only the case when the least significant input bit is given in Property 5. The other cases are shown in Appendix A. Suppose the two 5-bit inputs of the Keccak S-box are $x_4x_3x_2x_1x_0$ and $x'_4x'_3x'_2x'_1x'_0$. The corresponding outputs are $y_4y_3y_2y_1y_0$ and $y'_4y'_3y'_2y'_1y'_0$. Property 5 directly follows from the algebraic relation between the input and the output of χ. When $x_1$ and $x'_1$ take different values, the expressions of $\delta_{out}[0]$ and $\delta_{out}[4]$ are linearised as shown in Table 3.
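The DDT of the Keccak S-box is small enough to compute exhaustively. The sketch below assumes that bit i of a 5-bit integer encodes $x_i$ and uses the standard row formula $y_i = x_i \oplus (\neg x_{i+1} \wedge x_{i+2})$ with indices modulo 5. It also checks two facts that the attack relies on: the inputs compatible with a given transition form an affine subspace, and fixing $x_1 = 0$, $x'_1 = 1$ makes $\delta_{out}[0]$ and $\delta_{out}[4]$ linear. The linear expressions in the last function are derived directly from the χ formula for this one case and are not copied from Table 3.

```python
def chi(x):
    """Keccak S-box on a 5-bit row; bit i of the integer is x_i (our convention)."""
    bits = [(x >> i) & 1 for i in range(5)]
    out = 0
    for i in range(5):
        out |= (bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
    return out


def ddt():
    """table[a][b] = number of x with chi(x) xor chi(x xor a) == b."""
    table = [[0] * 32 for _ in range(32)]
    for a in range(32):
        for x in range(32):
            table[a][chi(x) ^ chi(x ^ a)] += 1
    return table


def solutions(a, b):
    """All inputs x that follow the differential transition a -> b."""
    return [x for x in range(32) if chi(x) ^ chi(x ^ a) == b]


def is_affine(points):
    """True if the set is an affine subspace of GF(2)^5."""
    members, base = set(points), points[0]
    return all((x ^ y ^ base) in members for x in members for y in members)


def linear_when_x1_differs():
    """With x_1 = 0 and x'_1 = 1, two output difference bits are linear."""
    for x in range(32):
        for xp in range(32):
            if (x >> 1) & 1 != 0 or (xp >> 1) & 1 != 1:
                continue
            d_in, d_out = x ^ xp, chi(x) ^ chi(xp)
            # delta_out[0] = delta_in[0] + x_2
            if (d_out & 1) != ((d_in & 1) ^ ((x >> 2) & 1)):
                return False
            # delta_out[4] = delta_in[4] + x'_0 + 1
            if ((d_out >> 4) & 1) != (((d_in >> 4) & 1) ^ (xp & 1) ^ 1):
                return False
    return True


T = ddt()
nonzero = sorted({T[a][b] for a in range(1, 32) for b in range(32)} - {0})
print("nonzero DDT entries for a != 0:", nonzero)
print("all solution sets affine:",
      all(is_affine(solutions(a, b))
          for a in range(32) for b in range(32) if T[a][b]))
print("linearised outputs:", linear_when_x1_differs())
```

Since χ has algebraic degree 2, the map x ↦ S(x) ⊕ S(x ⊕ a) is affine for every fixed a, which is why every nonzero entry is a power of two.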

Table 3: Conditions and the corresponding linear expressions of $\delta_{out}[0]$ and $\delta_{out}[4]$.

Guo et al.'s collision attacks on SHA-3
In this section we revisit the most dedicated existing collision attacks against the SHA-3 hash function. They were constructed by Guo et al. utilising an algebraic and differential hybrid method [GLL+20], which follows the 1-round connector technique proposed by Dinur et al. in [DDS12]. The framework of the attack against SHA-3-n is shown in Figure 2. Given an $n_2$-round high-probability differential characteristic $\Delta S_I \rightarrow \Delta S_O$ with the first n bits of the output difference $\Delta S_O$ as zeros, the attack consists of two stages. In the first stage the adversary applies an $n_1$-round connector by linearising the first $n_1$ rounds. Thus the adversary obtains message pairs whose state difference after the first $n_1$ rounds equals $\Delta S_I$, where $\Delta S_I$ is the input difference of the differential characteristic. In the second stage, the adversary finds a colliding pair following the $n_2$-round differential characteristic by searching through pairs of messages obtained in the first stage. The main drawback of this approach is that the linearisation technique in the first stage adds bit conditions in order to linearise the first $n_1$ rounds, thus consuming many degrees of freedom. As the input space of SHA-3-384 is too small to provide sufficient degrees of freedom, extra bit conditions may cause contradictions with restrictions on the initial values in the capacity part, thus making the linearisation technique infeasible.

A new framework for a collision attack against 4-round SHA-3-384
In this section we introduce a new framework for a collision attack that overcomes drawbacks in the techniques of both Dinur et al. and Guo et al. There are three stages in our attack, namely the 1st block generation stage, the 1-round SAT-based connector stage, and the collision searching stage, as depicted in Figure 3. Before we overview the three stages, we introduce some notations, definitions, and parameters.
The 3-round differential characteristic depicted in Figure 3 is given in Table 4. The '?' in Table 4 means that the corresponding nibble is unknown. The probability of the characteristic is $2^{-42}$. The method of constructing the characteristic is discussed in Section 5.2. The differential transition of the i-th round in the characteristic is denoted by $\alpha_i \xrightarrow{L} \beta_i \xrightarrow{\chi} \alpha_{i+1}$. Let $\alpha_0$ be the input difference of the second block after adding message blocks as shown in Figure 3. Next, we overview the three stages.

1st block generation stage
In this stage the adversary generates prefix pairs fulfilling required conditions on the corresponding chaining values. The method of deriving these conditions is given in Section 5.1. Instead of working on single-block messages as in the previous technique, we turn to two-block messages, similar to the attacks proposed by Wang et al. against MD-like hash functions [WLF+05, WY05, WYY05]. This helps to reduce the impact of the initial values in the capacity part on the 1-round connector. The Keccak permutation on a prefix is regarded as a pseudorandom number generator (PRNG), which provides pseudorandom chaining values in our attack. Once a corresponding 1-round connector fails, the adversary just generates another random prefix pair $(M_1, M'_1)$ fulfilling the required conditions over the chaining values. In comparison, Guo et al.'s adversary would have to search for a new differential characteristic with a special form, which is a hard and time-consuming process.

1-round SAT-based connector stage
We develop a new 1-round SAT-based connector that replaces Guo et al.'s linearisation technique, and removes the constraint on the size of the input space. In this stage, for each prefix pair generated in the first stage, the adversary searches for a suffix pair which connects the chaining values with a preset 1-round input difference $\alpha_1$. This is the connectivity problem: given a prefix pair $(M_1, M'_1)$, find a suffix pair $(M_2, M'_2)$ such that the difference of the two states after the first round of the second block equals $\alpha_1$ (Equation 2).
The connectivity problem can be reduced to a satisfiability (SAT) problem and solved by a SAT solver. However, solving the connectivity problem using a SAT solver for every prefix pair generated in the first stage is still time consuming. Instead, we develop a preliminary deduce-and-sieve algorithm that filters prefix pairs based on their differential properties.
In the deduce-and-sieve algorithm, the adversary rejects a prefix pair $(M_1, M'_1)$ if there exists no differential transition from $\alpha_0$ to $\alpha_1$, where the last 1600 − 828 = 772 bits of $\alpha_0$ are the sum of the chaining values. Thus, the adversary can efficiently dismiss most prefix pairs that have no solution for the corresponding connectivity problems.
Then, the adversary applies the SAT solver to solve the connectivity problem for the remaining pairs. For a prefix pair $(M_1, M'_1)$, if there exists a suffix pair $(\widetilde{M}_2, \widetilde{M}'_2)$ such that Equation 2 holds, the SAT solver returns the corresponding suffix pair, which is called a suffix seed pair; otherwise, the SAT solver rejects the prefix pair. The 1-round SAT-based connector stage is described in more detail in Section 6.
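To illustrate what reducing a connectivity condition to SAT looks like, the toy sketch below handles a single 5-bit Keccak S-box rather than the full first round. It encodes "y = χ(x), y′ = χ(x′), x ⊕ x′ = a, y ⊕ y′ = b" as CNF clauses and solves them with a tiny DPLL search. The variable numbering, the truth-table encoding and the hand-written solver are all our own illustrative choices; they stand in for a real SAT solver and are not the encoding used in the attack.

```python
from itertools import product


def chi(x):
    bits = [(x >> i) & 1 for i in range(5)]
    return sum((bits[i] ^ ((bits[(i + 1) % 5] ^ 1) & bits[(i + 2) % 5])) << i
               for i in range(5))


def var(block, i):
    """CNF variable (1-based) for bit i of block 0..3, meaning x, x', y, y'."""
    return 1 + 5 * block + i


def chi_clauses(inp, out):
    """CNF for out = chi(inp): forbid every inconsistent local assignment."""
    clauses = []
    for i in range(5):
        vs = [var(out, i), var(inp, i), var(inp, (i + 1) % 5), var(inp, (i + 2) % 5)]
        for y, a, b, c in product((0, 1), repeat=4):
            if y != a ^ ((b ^ 1) & c):
                clauses.append([-v if bit else v for v, bit in zip(vs, (y, a, b, c))])
    return clauses


def xor_clauses(u, v, diff):
    """CNF for (u xor v) == diff."""
    return [[u, v], [-u, -v]] if diff else [[u, -v], [-u, v]]


def dpll(clauses, model):
    """Minimal DPLL with unit propagation; returns a satisfying model or None."""
    simplified = []
    for clause in clauses:
        if any(model.get(abs(lit)) == (lit > 0) for lit in clause):
            continue                      # clause already satisfied
        rest = [lit for lit in clause if abs(lit) not in model]
        if not rest:
            return None                   # clause falsified: conflict
        simplified.append(rest)
    if not simplified:
        return model
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    lit = unit if unit is not None else simplified[0][0]
    for value in ([lit > 0] if unit is not None else [True, False]):
        result = dpll(simplified, {**model, abs(lit): value})
        if result is not None:
            return result
    return None


def connect_sbox(a, b):
    """Find (x, x') with x xor x' = a and chi(x) xor chi(x') = b, or None."""
    clauses = chi_clauses(0, 2) + chi_clauses(1, 3)
    for i in range(5):
        clauses += xor_clauses(var(0, i), var(1, i), (a >> i) & 1)
        clauses += xor_clauses(var(2, i), var(3, i), (b >> i) & 1)
    model = dpll(clauses, {})
    if model is None:
        return None
    # Variables left unassigned are unconstrained, so False is a valid choice.
    decode = lambda block: sum(int(model.get(var(block, i), False)) << i
                               for i in range(5))
    return decode(0), decode(1)


print(connect_sbox(0b00001, 0b01001))  # a possible transition
print(connect_sbox(0b00001, 0b00000))  # impossible: chi is a permutation
```

In the full attack the same idea is applied to all 320 S-boxes of the first round simultaneously, together with the linear layer and the fixed chaining values, which is why a dedicated SAT solver and the preliminary sieving are needed.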

Collision searching stage
The method in the collision searching stage follows Guo et al.'s work: once the adversary obtains a prefix pair $(M_1, M'_1)$ and a suffix seed pair $(\widetilde{M}_2, \widetilde{M}'_2)$, the adversary aims to find a suffix pair $(M_2, M'_2)$ following the differential characteristic depicted in Figure 3. All solutions for a corresponding connectivity problem form an affine subspace. Next, we explain how to derive that subspace of suffixes $M_2$; the same derivation applies to the subspace of suffixes $M'_2$. We will discuss later that searching the affine subspace of $M_2$ for the colliding pair is equivalent to searching the affine subspace of $M'_2$.
Given the prefix pair and the suffix seed pair, the input difference of the operation χ, denoted as $\beta_0$, can be deduced by computing $\beta_0 = L(\alpha_0)$, where $\alpha_0$ is the XOR of the two input states of the second block. By Property 2, given $\beta_0$ and $\alpha_1$, all the linear equations on the input affine subspaces of active S-boxes in the first round can be derived and expressed as
$A_1 \cdot x = b_1$, (3)
where x is the input state of χ in the first round and $A_1$ is a block-diagonal matrix in which each diagonal block together with the corresponding constants in $b_1$ forms the equations for one active S-box. Additional constraints that x needs to fulfil are that, given $M_1$, the chaining values are fixed:
$A_2 \cdot x = b_2$, (4)
where $A_2$ is a submatrix of $L^{-1}$ and $b_2$ is the vector of those fixed chaining values computed from $f(M_1 \| 0)$. Thus, x is in an affine subspace, which is equivalent to $M_2$ being in an affine subspace. The adversary combines and solves Equation 3 and Equation 4 and obtains all solutions to the connectivity problem. The adversary then exhaustively searches for the colliding pair that follows the 3-round differential characteristic depicted in Figure 3.
Searching the affine subspace of $M_2$, denoted as $W_1$, for the colliding pair is equivalent to searching the affine subspace of $M'_2$, denoted as $W_2$. As discussed above, the χ input state derived from $M'_2$ satisfies a system of linear equations consisting of two parts. The first part, Equation 5, has a form similar to Equation 3. The second part, Equation 6, has the form of Equation 4, where $A_2$ is a submatrix of $L^{-1}$ and $b_2$ is the vector of those fixed chaining values computed from $f(M'_1 \| 0)$. We will show next that $x \oplus \beta_0$ fulfils both Equation 5 and Equation 6. As $x \in W_1$, the pair $(x, x \oplus \beta_0)$ should follow the differential characteristic $\beta_0 \rightarrow \alpha_1$, which indicates that $x \oplus \beta_0$ fulfils Equation 5. As L is linear, the following equation holds: $L^{-1}(x \oplus \beta_0) = L^{-1}(x) \oplus L^{-1}(\beta_0) = L^{-1}(x) \oplus \alpha_0$. As $L^{-1}(x)$ is the internal input state of the second block, the least significant bits of $L^{-1}(x \oplus \beta_0)$ are indeed the fixed chaining values computed from $f(M'_1 \| 0)$. Therefore, $x \oplus \beta_0$ fulfils both Equation 5 and Equation 6. It can be concluded that searching the affine subspace of $M_2$ for the colliding pair is equivalent to searching the affine subspace of $M'_2$.
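Combining and solving two linear systems such as Equation 3 and Equation 4 amounts to Gaussian elimination over GF(2), which yields one particular solution and a basis of the homogeneous solutions. The following is a minimal sketch on a toy system; the bitmask representation and the function names are our own, and a real instance would have 1600 unknowns.

```python
def solve_gf2(equations, n):
    """Solve a linear system over GF(2) in n unknowns.

    equations: iterable of (mask, rhs); bit j of mask is the coefficient of x_j.
    Returns (particular, basis) describing the affine solution space,
    or None if the system is inconsistent.
    """
    pivots = {}  # pivot column -> (mask, rhs), kept in reduced row echelon form
    for mask, rhs in equations:
        for col, (pmask, prhs) in pivots.items():
            if (mask >> col) & 1:          # eliminate known pivot columns
                mask ^= pmask
                rhs ^= prhs
        if mask == 0:
            if rhs:
                return None                # 0 = 1: no solution
            continue                       # redundant equation
        col = mask.bit_length() - 1        # new pivot column
        for c, (pmask, prhs) in list(pivots.items()):
            if (pmask >> col) & 1:         # keep the older rows reduced
                pivots[c] = (pmask ^ mask, prhs ^ rhs)
        pivots[col] = (mask, rhs)
    # Particular solution: all free variables set to 0.
    particular = 0
    for col, (_, rhs) in pivots.items():
        particular |= rhs << col
    # One basis vector of the homogeneous solutions per free variable.
    basis = []
    for free in range(n):
        if free in pivots:
            continue
        vec = 1 << free
        for col, (pmask, _) in pivots.items():
            vec |= ((pmask >> free) & 1) << col
        basis.append(vec)
    return particular, basis


def enumerate_space(particular, basis):
    """Yield every element of the affine subspace particular + span(basis)."""
    for choice in range(1 << len(basis)):
        x = particular
        for k, vec in enumerate(basis):
            if (choice >> k) & 1:
                x ^= vec
        yield x


# Toy system in 4 unknowns: x0 + x1 = 1 and x1 + x2 + x3 = 0.
system = [(0b0011, 1), (0b1110, 0)]
particular, basis = solve_gf2(system, 4)
print(len(basis), sorted(enumerate_space(particular, basis)))
```

The dimension of the resulting subspace (the length of `basis`) is the quantity that Criterion 1 in Section 5.2 requires to be large enough for the exhaustive search to succeed.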

Algorithm 1 Deriving Linear Conditions
Input: $\alpha_1$
Output: Set of linear conditions $S_A$
1: Compute the output difference for each S-box from $\alpha_1$.
2: Initialise the system of equations E.
   ($\delta_{out}$ is the output difference of the S-box.)
6: if $\delta_{out} = 0$ then
7:     for each bit in the S-box do
8:         Add the corresponding equation $\beta_0[i_1] = 0$ to both E and $S_0$.

Constructing a 3-round differential characteristic
In this section, we first introduce the deduction of conditions on chaining values given an input difference. Afterwards, we discuss two criteria when constructing a differential characteristic. With the differential characteristic, we explain the method of generating prefix pairs satisfying conditions on corresponding chaining values.

Requirements on the chaining values
For the connectivity problem to have at least one solution (at least one pair of compatible suffixes), chaining values must follow certain linear conditions. We demonstrate how to derive these conditions on the chaining values. With $\alpha_1$, the adversary obtains the output difference of each S-box. There are three types of output differences from which the adversary can derive conditions on $\beta_0$. The other output differences do not yield conditions on $\beta_0$. These cases are listed as follows:
• Type-I Output Difference: The output difference of an S-box is zero when it is inactive.
The adversary can derive conditions from the three types of output differences by applying linear algebra. The procedure is shown in Algorithm 1. From Line 6 to Line 12, the adversary obtains the three types of output differences from $\alpha_1$ and writes the corresponding conditions on the input differences according to Property 3 and Property 4. Then, the adversary transforms the system of equations E in terms of $\alpha_0$ according to the θ operation and reduces E to its row echelon form. At last, the adversary checks each equation in E and obtains linear conditions on chaining values.

How to construct a 3-round differential characteristic
The 3-round differential characteristic in our attack adapts the second characteristic in [GLL+20, Table 9]. The last two rounds of their characteristic are used as the last two rounds of our characteristic, from the third round to the fourth round of our attack. We slightly change the output difference of their characteristic to make the first 384 bits have a zero difference. Thus, the probability of the last two rounds of the characteristic is $2^{-16}$ instead of $2^{-15}$ in the original one.
We extend the 2-round backward characteristic by one extra round. When $\beta_1$ is fixed, the 3-round differential characteristic is determined. We choose $\beta_1$ according to two criteria as follows:
• Criterion 1: The affine subspace in the collision searching stage should be sufficiently large to find a collision pair.
• Criterion 2: The number of conditions on the chaining values should not be too large.
If Criterion 1 is not fulfilled, the affine subspace defined by Equation 3 and Equation 4 is so small that the probability that a collision pair is obtained in the third stage becomes negligible.
If the characteristic does not follow Criterion 2, the procedure of generating the first message blocks becomes infeasible in practice. To keep our attack practical, the differential characteristic should satisfy Criterion 2.
The difference $\alpha_2$ has 8 active S-boxes in the second round. From the DDT of the S-box, the probability of a nonzero differential transition is at least $2^{-4}$. Thus, the probability of our 3-round differential characteristic is no less than $(2^{-4})^8 \cdot 2^{-16} = 2^{-48}$. The dimension of the affine subspace in the collision searching stage should be larger than 48. The probability of the first round transition should not be smaller than $2^{-828+48} = 2^{-780}$. As the average probability of a nonzero differential transition in the DDT is $2^{-3}$, there should be no more than 780/3 = 260 active S-boxes in the first round to satisfy Criterion 1.
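The bound on the number of active S-boxes follows from simple exponent arithmetic, which the short sketch below reproduces in terms of base-2 logarithms. The variable names are ours; the numbers are the ones quoted in the paragraph above.

```python
active_sboxes_round2 = 8      # active S-boxes of the second round
worst_case_log2 = 4           # each nonzero DDT transition has probability >= 2^-4
last_two_rounds_log2 = 16     # the last two rounds hold with probability 2^-16
free_message_bits = 828       # message bits available in the second block
average_log2 = 3              # average nonzero transition probability is 2^-3

# Lower bound on the characteristic probability: (2^-4)^8 * 2^-16 = 2^-48.
characteristic_log2 = active_sboxes_round2 * worst_case_log2 + last_two_rounds_log2

# Budget left for the first-round transition: 2^-(828 - 48) = 2^-780.
first_round_budget_log2 = free_message_bits - characteristic_log2

# With an average cost of 3 bits per active S-box: at most 780 / 3 = 260.
max_active_sboxes_round1 = first_round_budget_log2 // average_log2

print(characteristic_log2, first_round_budget_log2, max_active_sboxes_round1)
```

The characteristic in Table 4 has 228 active S-boxes in the first round, which is below this limit of 260.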
As for Criterion 2, we set the threshold for the number of conditions as 50. We use a hash table to generate prefix pairs, as discussed in Section 5.3. When the number of conditions is too large, the memory consumption when generating the first message blocks is infeasible.
To extend the 2-round characteristic, the adversary picks a $\beta_1$ at random, which is compatible with $\alpha_2$ from the second characteristic in [GLL+20, Table 9]. Then, the adversary computes $\alpha_1$ as $L^{-1}(\beta_1)$. Given $\alpha_1$, the adversary obtains the number of active S-boxes in the first round. With Algorithm 1, the adversary deduces conditions on the chaining values. If the two criteria are satisfied, the adversary outputs a 3-round characteristic; otherwise, the adversary picks another compatible $\beta_1$ and continues the procedure.
In our differential characteristic presented in Table 4, there are 228 active S-boxes in the first round. Applying Algorithm 1 on the differential characteristic, there are 39 conditions on the chaining values. These conditions are listed in Appendix B. The probability of the differential characteristic is $2^{-42}$.
We note that the differential characteristic searching process needs to be run only once, and in our experiments its running time was polynomial. Furthermore, the differential characteristic we found may not be optimal. Finding the optimal differential characteristic is thus an open problem.

Algorithm 2 Generating Prefix Pairs
Initialise an array Counter of length $2^{39}$ with zeros.
5: for each integer $i \in [0, 2^n)$ do
6:     Randomly pick a message M of 832 bits and compute the value string c.

Generating prefix pairs fulfilling the requirements
We explain the generation procedure of prefix pairs fulfilling the 39 conditions in Table 9. In our approach, we use a hash table to trade off memory for time and data complexities. The memory consumption in this procedure is mainly from a hash table indexed by 39 bits. Before we describe the procedure, we introduce some additional definitions.
We define a constant XOR as a binary string $c_{38} c_{37} \cdots c_1 c_0$, where $c_i$ is the sum in the i-th condition, 0 ≤ i ≤ 38. In our case, the constant XOR is 0x7c00000000 from the conditions in Table 9. The conditions are the sums of differences in certain bit positions of the input state of the second block. To check whether a given prefix pair $(M_1, M'_1)$ satisfies the conditions, the adversary first computes, for each prefix, the sums of binary values in these positions, which we call the value string and record as a 39-bit string. Then, the adversary sums up the two value strings and checks whether the result equals the constant XOR value. If so, the adversary obtains a prefix pair fulfilling all conditions; otherwise, the pair is discarded.
The procedure of generating prefix pairs satisfying the conditions is shown in Algorithm 2. First, the adversary generates $2^n$ first-block messages M of length 832 bits and computes the corresponding value strings c. The adversary places M into the c-th row of the hash table. Thereafter, the adversary searches through the hash table for prefix pairs that satisfy the constraints (Lines 8 to 12).
Algorithm 2 generates around $2^n \cdot 2^{n-1} \cdot 2^{-39} = 2^{2n-40}$ pairs. The time and data complexity is $2^n$. The memory consumption is mainly from the hash table, which is also $2^n$. The value of n is experimentally discussed in Section 7.
It should be noted that the prefixes generated in Algorithm 2 can be extended to messages of length $832 n_b$ bits, where $n_b$ is the number of blocks and $n_b \geq 1$. The procedure of generating multi-block prefix pairs is shown in Algorithm 16 of Appendix C. As shown in Algorithm 16, the adversary starts from arbitrarily chosen messages $(P, P')$ as part of the prefixes in the first $(n_b - 1)$ blocks. The remaining block is picked randomly to fulfil the linear conditions on the chaining values. Thus, our attack can be extended to a chosen-prefix collision attack like the works of [SLdW07, LP19].
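The hash-table idea behind Algorithm 2 can be illustrated at toy scale. In the sketch below the number of conditions is reduced from 39 to a hypothetical K = 10, the constant XOR is an arbitrary toy value, and the value string is replaced by a stand-in (the low K bits of a SHA-3-384 digest) because the real conditions of Table 9 are not reproduced here. Messages are taken from a counter rather than drawn at random so that the run is reproducible. Only the bucket-and-match structure is meant to mirror the procedure described above.

```python
import hashlib
from collections import defaultdict

K = 10                    # toy number of conditions (the attack uses 39)
CONSTANT_XOR = 0x2A5      # toy constant XOR, K bits wide (hypothetical value)


def value_string(message, k=K):
    """Stand-in for the k-bit value string of a prefix (hypothetical)."""
    digest = hashlib.sha3_384(message).digest()
    return int.from_bytes(digest[:8], "little") & ((1 << k) - 1)


def generate_prefix_pairs(messages, k=K, constant_xor=CONSTANT_XOR):
    """Return all unordered pairs whose value strings XOR to constant_xor."""
    table = defaultdict(list)             # hash table indexed by the value string
    for m in messages:
        table[value_string(m, k)].append(m)
    pairs = []
    for c, bucket in table.items():
        partner = c ^ constant_xor
        if partner not in table:
            continue
        if partner == c:                  # only possible when constant_xor == 0
            pairs += [(bucket[i], bucket[j])
                      for i in range(len(bucket)) for j in range(i + 1, len(bucket))]
        elif c < partner:                 # visit each pair of buckets once
            pairs += [(m, mp) for m in bucket for mp in table[partner]]
    return pairs


n = 9
messages = [i.to_bytes(104, "little") for i in range(1 << n)]   # 832-bit blocks
pairs = generate_prefix_pairs(messages)
print(len(pairs), "pairs found; about", 2 ** (2 * n - 1 - K), "expected")
```

With $2^n$ messages and K conditions the expected number of pairs is about $2^{2n-1-K}$, which for K = 39 gives the $2^{2n-40}$ stated above.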

1-round SAT-based connector
We develop a new 1-round SAT-based connector to solve the connectivity problem in an efficient way. The connector includes two phases. First, we use a deduce-and-sieve algorithm to filter prefix pairs generated by Algorithm 2. Then, for each remaining prefix pair, the connectivity problem is solved by applying a SAT solver.
Algorithm 3 Initial Phase of the Deduce-and-sieve Algorithm
The second block is controlled by the adversary.

Deduce-and-sieve algorithm
In the deduce-and-sieve algorithm, we assume that for a prefix pair (M 1 , M ′ 1 ) there exists a suffix pair (M 2 , M ′ 2 ) in the connectivity problem. It indicates that in some S-boxes, input differences for Type-I and Type-II output differences should be of a special form. Thus, some bit differences of β 0 are supposed to be fixed, which are then recorded in the sets S 0 and S 1 , see Algorithm 1.
There are two phases in the deduce-and-sieve algorithm. In the difference phase, given a prefix pair (M 1 , M ′ 1 ) the adversary deduces new bit differences and checks whether a contradiction has been reached, in which case the prefix pair is discarded. The value phase helps to sieve prefix pairs more efficiently, as the filtering rate of the difference phase is low. In the value phase, more bit values can be deduced from the algebraic properties of Keccak's S-box. If new bit differences are obtained from new bit values, the adversary returns to the difference phase to seek a contradiction and to discard the prefix pair.
In the initial phase of the deduce-and-sieve algorithm (Algorithm 3), the adversary computes the chaining values for a prefix pair (M 1 , M ′ 1 ) and stores the values in two vectors A and A ′ of length 1600, respectively. As the bit values in vectors A and A ′ can be either known or unknown, two extra vectors A S , A ′ S , called indicator vectors, record whether the bit value in a corresponding position is known. To be more specific, for 0 ≤ i < 1600, if and only if the i-th bit in A is known then A S [i] = 1. The adversary then computes the bit difference α 0 [i], where 828 ≤ i < 1600, and sets the corresponding bit differences of β 0 as a constant vector. Two indicator vectors α S 0 and β S 0 record whether the bit difference in a corresponding position is known.
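As a rough illustration of this data layout, the following Python sketch builds the indicator vectors and the known part of $\alpha_0$ from two chaining-value vectors. The function name `initial_phase` and the use of plain Python lists are assumptions made for the example; the handling of $\beta_0$ (whose fixed bits come from the sets $S_0$ and $S_1$) is left out because those sets are not given in this section.

```python
STATE_BITS = 1600
FIRST_KNOWN = 828  # first bit position treated as known in the text


def initial_phase(A, A_prime):
    """Return (A_S, A_prime_S, alpha0, alpha0_S) for one prefix pair.

    A and A_prime are lists of 1600 bits. Entries below FIRST_KNOWN are
    ignored, since they belong to the still-unknown second block.
    """
    A_S = [1 if i >= FIRST_KNOWN else 0 for i in range(STATE_BITS)]
    A_prime_S = list(A_S)
    alpha0 = [0] * STATE_BITS      # bit differences (valid where known)
    alpha0_S = [0] * STATE_BITS    # 1 if the difference bit is known
    for i in range(FIRST_KNOWN, STATE_BITS):
        alpha0[i] = A[i] ^ A_prime[i]
        alpha0_S[i] = 1
    return A_S, A_prime_S, alpha0, alpha0_S


# Example: two chaining values that differ in bit 1000 only.
A = [0] * STATE_BITS
A_prime = [0] * STATE_BITS
A_prime[1000] = 1
A_S, A_prime_S, alpha0, alpha0_S = initial_phase(A, A_prime)
```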
In the deduce-and-sieve algorithm, we use the vector Σ of length 320 to record sums of five bit differences in each column. In the beginning, each entry of the indicator vector Σ S is initialised as 0 denoting an unknown state.

The difference phase
In the difference phase of the deduce-and-sieve algorithm, for a given prefix pair (M 1 , M ′ 1 ), bit differences of α 0 in the chaining value part can be obtained. As we assume that there exists a solution in the connectivity problem for the prefix pair (M 1 , M ′ 1 ), some bit differences of β 0 should be certain values from the sets S 0 and S 1 , deduced by Algorithm 1.
With the CP-kernel equations and the expression of L, the adversary can deduce new bit differences from $\alpha_0$ and $\beta_0$. We investigate the differential transition of the Keccak S-box and develop a tool called the Truncated Difference Transform Table (TDTT), which is inspired by truncated differential cryptanalysis, proposed by Knudsen in [Knu94]. With the TDTT of the Keccak S-box, a contradiction may be reached for the prefix pair $(M_1, M'_1)$ when there is no compatible differential transition from the input difference of the first round $\alpha_0$ to the output difference $\alpha_1$. Before we define the TDTT of an S-box, we first introduce truncated differences in Definition 2.

Definition 2. An $n$-bit difference is called a truncated difference if only $m$ bits of it are known, where $m < n$.
We use a 2n-bit integer a||b, where a, b ∈ F n 2 , to represent a truncated difference. Let (a n−1 , · · · , a 0 ) and (b n−1 , · · · , b 0 ) be the binary representations of a and b, respectively. For each i where 0 ≤ i < n, if b i is known, a i = 1; otherwise, a i = 0. In regular truncated differences, for each i where 0 ≤ i < n, a i ≥ b i . Otherwise, the truncated difference is called irregular.
Differences covered by a regular truncated difference $a||b$, where $a, b \in \mathbb{F}_2^n$, are $\{d \in \mathbb{F}_2^n \mid d_i = b_{n+i} \text{ if } a_i = 1,\ 0 \le i < n\}$. A difference $d \in \mathbb{F}_2^n$ being covered by a truncated difference $\Delta \in \mathbb{F}_2^{2n}$ is denoted by $d \preceq \Delta$. We observe that with an output difference of the Keccak S-box and a corresponding truncated input difference, more bits of the input difference can be fixed. For example, suppose that the output difference of an S-box is 0x1 and the truncated input difference is $0||0$, where none of the input bit differences is known. Property 3 indicates that the least significant input bit difference should be 1. Then, the truncated input difference of the S-box can be updated to $00001||00001$ in binary representation. The complete transforming behaviour of the differences of an S-box can be described by its TDTT:

Definition 3. Given a truncated input difference $\Delta^T_{in}$ and an output difference $\Delta_{out}$, the entry $\mathrm{TDTT}(\Delta^T_{in}, \Delta_{out})$ of the S-box's TDTT is:

$$\mathrm{TDTT}(\Delta^T_{in}, \Delta_{out}) = \begin{cases} \Delta^{T'}_{in}, & \text{if more bits of the input difference can be derived} \\ \Delta^{T}_{in}, & \text{if no more bits can be derived} \end{cases}$$

where $\Delta^{T'}_{in}$ is the new truncated input difference, $\Delta^T_{in}, \Delta^{T'}_{in} \in \mathbb{F}_2^{2n}$ and $\Delta_{out} \in \mathbb{F}_2^n$. In the case of Property 3, it can be observed that $\mathrm{TDTT}(0||0, 1)$ is $\mathrm{0x1}||\mathrm{0x1}$.
The TDTT of an S-box can be constructed from its DDT, as shown in Algorithm 4. For a regular truncated difference $a \in \mathbb{F}_2^{2n}$ and an output difference $b \in \mathbb{F}_2^n$, the adversary finds all differences $\Delta_{in} \in \mathbb{F}_2^n$ covered by $a$ such that the differential transition $\Delta_{in} \to b$ is compatible (Line 4 in Algorithm 4). If no such $\Delta_{in}$ exists, TDTT(a, b) = null. Otherwise, the adversary finds the new truncated difference $T$ covering all such $\Delta_{in}$ and sets TDTT(a, b) = T (Lines 8 to 14 in Algorithm 4). For an irregular truncated difference $a \in \mathbb{F}_2^{2n}$, the adversary labels the entire row of the TDTT as null, as shown in Line 16 of Algorithm 4.
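The construction can be sketched in Python for the 5-bit Keccak S-box. This is an illustration under stated assumptions rather than a transcription of Algorithm 4: a truncated difference is passed as two 5-bit integers `known` (the mask $a$) and `value` (the bits $b$) instead of one $2n$-bit integer, bit $i$ of an integer holds the $i$-th row bit, and `None` plays the role of null. The names `chi_row` and `tdtt_entry` are illustrative.

```python
def chi_row(x):
    """Keccak chi on one 5-bit row; bit i of x holds a_i."""
    y = 0
    for i in range(5):
        a0 = (x >> i) & 1
        a1 = (x >> ((i + 1) % 5)) & 1
        a2 = (x >> ((i + 2) % 5)) & 1
        y |= (a0 ^ ((a1 ^ 1) & a2)) << i
    return y


# DDT[din][dout] counts the inputs x with chi(x) ^ chi(x ^ din) == dout.
DDT = [[0] * 32 for _ in range(32)]
for din in range(32):
    for x in range(32):
        DDT[din][chi_row(x) ^ chi_row(x ^ din)] += 1


def tdtt_entry(known, value, dout):
    """Return the refined (known, value) pair, or None for null."""
    if value & ~known & 0x1F:
        return None                      # irregular truncated difference
    # Covered input differences that can actually produce dout.
    compatible = [d for d in range(32)
                  if (d & known) == value and DDT[d][dout] > 0]
    if not compatible:
        return None                      # contradiction: no transition
    always_one, always_zero = 0x1F, 0x1F
    for d in compatible:
        always_one &= d                  # bits set in every candidate
        always_zero &= ~d                # bits clear in every candidate
    return always_one | always_zero, always_one
```

With nothing known about the input difference and output difference 0x1, `tdtt_entry(0, 0, 1)` yields `(1, 1)`, which matches the entry $\mathrm{0x1}||\mathrm{0x1}$ quoted in the text.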

Algorithm 5 Deducing New Differences in the i-th Column
Compute the indices of the five bits in the $i$-th column as $i_0, i_1, \cdots, i_4$.

Deducing new differences from the expression of L. New bit differences can be derived from the expression of L. Applying Equation 1, the bit difference $\beta_0[i]$ can be expressed as:

$$\beta_0[i] = \alpha_0[\sigma^{-1}(i)] \oplus \Sigma[\phi_1(\sigma^{-1}(i))] \oplus \Sigma[\phi_2(\sigma^{-1}(i))] \qquad (8)$$

where $\Sigma[\phi_1(\sigma^{-1}(i))]$ and $\Sigma[\phi_2(\sigma^{-1}(i))]$ are the sums of the five bits in the $\phi_1(\sigma^{-1}(i))$-th and $\phi_2(\sigma^{-1}(i))$-th columns, respectively, and $0 \le i < 1600$. If only one variable in Equation 8 is unknown, its value can be deduced. Before we show the technique of deducing new differences by applying Equation 8, we introduce the method of computing the column sum. We classify the situations of the column sum into three cases. The first case is that the column sum is known. The second is that the column sum becomes known with one more known bit. The third is that at least two more bits of information are needed to derive the column sum. In order to compute the column sum, we use another variable called the sum state that encodes the three cases: 1 for the first case, 2 for the second case, and 0 for the third case.
Representatives of the cases when the sum state is 1 are shown in Figure 5. For example, as shown in Figure 5 ( are known. Applying the CP-kernel equation, the column sum can be computed as

Representatives of the cases when the sum state is 2 are shown in Figure 6. In these three representatives, the sum of four bits of one column can be derived with the CP-kernel equation. The index of the remaining bit in the column, defined as the marked bit, is recorded. The difference of the marked bit may be deduced in a later step.

Algorithm 6 Computing the i-th Column Sum
Compute the indices of the five bits in the $i$-th column as $i_0, i_1, \cdots, i_4$.
3: if the five bits match one of the representatives in Figure 5

The procedure for computing the $i$-th column sum is shown in Algorithm 6. The adversary finds the states of the 10 related bits from $\alpha_0$ and $\beta_0$. If the states of the 10 related bits match one of the representatives in Figure 5, the column sum is known. The adversary computes the sum and updates the sum state $\Sigma^S[i]$ to 1. If the states of the 10 related bits match one of the representatives in Figure 6, where one bit difference is missing, the adversary computes the sum of the corresponding blue bits. The adversary then records the sum and the index of the marked bit and updates the sum state $\Sigma^S[i]$ to 2. If the situation does not match any of the representatives in Figure 5 or Figure 6, the adversary sets the corresponding sum state to 0.
Given the column sums, the adversary can obtain new bit differences from the expression of θ. To be more specific, if only one variable in Equation 8 is unknown, its value can be easily deduced. The procedure is shown in Algorithm 7. There are 5 situations in which new differences can be deduced. In the two situations shown in Figure 6, marked bits are obtained and the corresponding column sums are updated.
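The single-unknown rule can be sketched generically. The Python function below is an illustration, not Algorithm 7: it takes the terms of any XOR relation whose terms sum to zero (for Equation 8, the bit of $\beta_0$, the bit of $\alpha_0$ and the two column sums), with `None` marking an unknown term. The name `deduce_from_xor_relation` and the returned status strings are assumptions made for the example.

```python
def deduce_from_xor_relation(terms):
    """Apply the single-unknown rule to terms that must XOR to zero.

    terms: list of 0, 1 or None (unknown).
    Returns one of:
      ("deduced", (index, bit))  exactly one unknown, now determined
      ("contradiction", None)    all known, but the XOR is 1
      ("consistent", None)       all known and the XOR is 0
      ("undetermined", None)     two or more unknowns remain
    """
    unknown = [k for k, t in enumerate(terms) if t is None]
    parity = 0
    for t in terms:
        if t is not None:
            parity ^= t
    if not unknown:
        return ("contradiction", None) if parity else ("consistent", None)
    if len(unknown) == 1:
        # The missing term must cancel the parity of the known ones.
        return ("deduced", (unknown[0], parity))
    return ("undetermined", None)
```

For instance, `deduce_from_xor_relation([1, 0, None, 1])` returns `("deduced", (2, 0))`, since the three known terms already XOR to zero.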
Sieving prefix pairs with TDTT. The sieving procedure applying the TDTT of the Keccak S-box is shown in Algorithm 8. With β 0 and β S 0 , the adversary can deduce the truncated input difference for each S-box. Then, the adversary discards prefix pairs with no solutions in the connectivity problem according to the TDTT. For pairs that cannot be discarded, the adversary may obtain new bit differences of β 0 from the TDTT. Once a bit difference is obtained, the adversary checks the related CP-kernel equations to deduce new bit differences in α 0 and β 0 , as shown in Lines 13-15 in Algorithm 8.
Let us summarise the difference phase of the deduce-and-sieve algorithm in Algorithm 9. The adversary first initialises the difference phase by deducing bit differences through checking the CP-kernel equations. Then, she updates column sums and deduces new bit differences from the expression of L. Finally, the adversary checks the TDTT of the Keccak S-box and decides whether a certain prefix pair (M 1 , M ′ 1 ) should be discarded. If the pair should not be discarded, the adversary computes the number of new deduced bit differences from Lines 8-13 in Algorithm 9. If new bit differences are deduced, the adversary goes back to Line 8; otherwise, she accepts the prefix pair.

The value phase
In the value phase, the adversary uses another algebraic property of the Keccak round function to deduce new input bit values of χ in the first round. New values can be deduced using a new tool called the Fixed Value Distribution Table (FVDT) and by applying the CP-kernel equations.

Algorithm 7 Deducing New Differences from the Expression of L
1: procedure LinearTrans($\alpha_0, \beta_0, \alpha_0^S, \beta_0^S, \Sigma, \Sigma^S$, MarkedBit)
2: for each integer $i \in [0, 1600)$ do

The FVDT is developed from the observation that, with a truncated input difference and an output difference of the Keccak S-box, the bit values in some positions are constants. Applying the FVDT of the Keccak S-box, the adversary can obtain new input bit values of χ. With these new values and the chaining values, the adversary applies the CP-kernel equations to deduce additional input bits of χ. With the new values, the adversary can derive new bit differences from the expression of χ. Then, she can continue with the difference phase.

Fixed Value Distribution Table (FVDT). Before we delve into the details of the FVDT of an $n$-bit S-box, we first define the solution set of a truncated input difference $\Delta^T_{in}$ and an output difference $\Delta_{out}$ as follows:
Definition 4. The solution set of a truncated input difference ∆ T in and an output difference ∆ out is where ∆ T in ∈ F 2n 2 , ∆ in ∈ F n 2 and ∆ out ∈ F n 2 .
The solution set of the truncated input difference ∆ T in and the output difference ∆ out is a generalisation of the solution set of the input difference ∆ in and the output difference ∆ out of regular DDTs. From Definition 4, it is a union of solution sets of the input difference ∆ in and the output difference ∆ out , where ∆ in ⪯ ∆ T in . We observe that for the truncated input difference ∆ T in and the output difference ∆ out , some bits of the pairs in the solution set S T (∆ T in , ∆ out ) may be constants. For example,

Based on this observation, we develop a useful tool called the Fixed Value Distribution Table. If the adversary can obtain fixed values in some bit positions with $\Delta^T_{in}$ and $\Delta_{out}$, we use a $2n$-bit integer $a||b$, called a fixed point, to record the constant values and their positions, where $a, b \in \mathbb{F}_2^n$. If the $i$-th bit in the S-box is a constant, $a_i = 1$ and $b_i$ is set to the fixed value; otherwise, $a_i = 0$ and $b_i = 0$, where $0 \le i < n$ and $a_i$ and $b_i$ are the $i$-th bits of $a$ and $b$, respectively. We define the FVDT of an S-box as follows:

Definition 5. Given a truncated input difference $\Delta^T_{in}$ and an output difference $\Delta_{out}$, the entry $\mathrm{FVDT}(\Delta^T_{in}, \Delta_{out})$ of the S-box's FVDT is: where $\Delta^T_{in}, v \in \mathbb{F}_2^{2n}$, $\Delta_{out} \in \mathbb{F}_2^n$ and $v$ is the fixed point with respect to $\Delta^T_{in}$ and $\Delta_{out}$.

The adversary uses the FVDT of the Keccak S-box to initialise the value phase. The process is shown in Algorithm 10.
Deducing new values from CP-kernel equations. Similar to the difference phase, new values can be deduced from the CP-kernel equations. For each column, the adversary simply calls CPKernel($A, B, A^S, B^S, i$) and CPKernel($A', B', A'^S, B'^S, i$), corresponding to the two prefixes $M_1$ and $M'_1$, where $i$ is the index of the column and $0 \le i < 320$. It should be noted that more operations, including the column sum technique of the difference phase (even on a fraction of the columns), can be applied similarly to deduce more bit values in the value phase. This might improve the filtering rate of the deduce-and-sieve algorithm, but it would also increase its complexity. How to balance the complexity and the filtering rate of the deduce-and-sieve algorithm is an open problem.
Deducing new bit differences from bit values. New bit differences can be deduced from bit values obtained in the value phase. For example, if the adversary finds that the input

Algorithm 9 Difference Phase of the Deduce-and-sieve Algorithm
for each integer $i \in [0, 320)$ do
return a = the number of newly deduced difference bits in $\alpha_0$ and $\beta_0$

The procedure of deducing new bit differences is done by checking the cases in Table 3, as shown in Algorithm 15 in Appendix A. The value phase of the deduce-and-sieve algorithm is shown in Algorithm 11.
The deduce-and-sieve algorithm is shown in Algorithm 12. Given a prefix pair, the adversary first runs the difference phase. If there is a contradiction, the adversary discards the pair (Line 11); otherwise, the adversary starts the value phase (Line 7). If new bit differences are deduced in the value phase, the adversary runs the difference phase again; otherwise, she accepts the prefix pair (Line 9). As the Keccak state is finite (1600 bits), the deduce-and-sieve algorithm terminates after a finite number of steps, once no new bit differences can be obtained. In this way, the deduce-and-sieve algorithm has a practical complexity, which is discussed further in Section 7.
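The alternation between the two phases can be summarised by a short Python skeleton. It only mirrors the control flow described above; `difference_phase` and `value_phase` are hypothetical callables standing in for Algorithms 9 and 11, and their return conventions (a boolean for "no contradiction", and a count of newly deduced bit differences) are assumptions made for this sketch.

```python
def deduce_and_sieve(state, difference_phase, value_phase):
    """Return True to accept the prefix pair, False to discard it.

    difference_phase(state) -> False if a contradiction is reached.
    value_phase(state)      -> number of newly deduced bit differences.
    """
    while True:
        if not difference_phase(state):
            return False          # contradiction: discard the pair
        if value_phase(state) == 0:
            return True           # nothing new was deduced: accept
        # Otherwise new differences were found; rerun the difference phase.


# Toy stubs: the value phase finds something new twice, then nothing.
def _always_consistent(state):
    return True


def _countdown(state):
    state["left"] -= 1
    return 1 if state["left"] > 0 else 0


accepted = deduce_and_sieve({"left": 3}, _always_consistent, _countdown)
```

The loop ends because each non-zero return of the value phase fixes at least one of finitely many unknown bit differences.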

SAT
Some of the generated prefix pairs are filtered out by the deduce-and-sieve algorithm. The connectivity problems of the remaining prefix pairs are decided using a SAT solver called CryptoMiniSAT [SNC09]. Recent versions of CryptoMiniSAT accept XOR clauses as input to describe a SAT problem. Hence, there is no need to convert the XORs in the Keccak round function into conjunctive normal form (CNF), as was done in [MS13].
The procedure of converting the connectivity problem of a prefix pair $(M_1, M'_1)$ into a SAT problem is shown in Algorithm 17 in Appendix D. It can be seen from Algorithm 17 that, in order to convert the connectivity problem into a SAT problem, the adversary just needs to assign the chaining values as initial values and derive expressions for the output differences of the first round. For example, the non-linear term $v_0 = (v_1 + 1) v_2$ in χ can be converted into the CNF clauses $(\neg v_0 \vee \neg v_1) \wedge (\neg v_0 \vee v_2) \wedge (v_0 \vee v_1 \vee \neg v_2)$, where $v_0$, $v_1$ and $v_2$ are internal variables (see Lines 28-30, 33-35).
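As a sanity check on this kind of encoding, the short Python sketch below enumerates all assignments and confirms that the standard three-clause encoding of $v_0 = (v_1 + 1) v_2$, namely $(\neg v_0 \vee \neg v_1) \wedge (\neg v_0 \vee v_2) \wedge (v_0 \vee v_1 \vee \neg v_2)$, is satisfied exactly by the assignments consistent with the χ term. The clause set is the textbook encoding of an AND gate with one negated input and is assumed here; the function names are illustrative.

```python
from itertools import product


def chi_term(v1, v2):
    """Non-linear term of chi over GF(2): (v1 + 1) * v2."""
    return (v1 ^ 1) & v2


def cnf_holds(v0, v1, v2):
    """Three-clause CNF encoding of v0 = (v1 + 1) * v2."""
    return bool(((not v0) or (not v1))
                and ((not v0) or v2)
                and (v0 or v1 or (not v2)))


# Assignments (v0, v1, v2) accepted by the CNF ...
satisfying = [bits for bits in product((0, 1), repeat=3) if cnf_holds(*bits)]
# ... and assignments in which v0 really equals the chi term.
expected = sorted((chi_term(v1, v2), v1, v2)
                  for v1, v2 in product((0, 1), repeat=2))
```

Both lists contain the same four assignments, so the clauses neither exclude a valid evaluation nor admit an invalid one.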

Algorithm 10 Initialising the Value Phase
for each S-box do
3: Deduce the output difference $\Delta_{out}$ from $\alpha_1$.
4: Deduce the truncated input difference $\Delta^T_{in}$ from $\beta_0$ and $\beta_0^S$. Find the indices of the five bits in the S-box as $i_0, i_1, \cdots, i_4$.
10: for each integer $j \in [0, 5)$ do
11: if $v_{j+5} = 1$ then ▷ $v_j$ is the $j$-th bit of $v$.

Algorithm 11 Value Phase of the Deduce-and-sieve Algorithm

Experiments and complexity analysis
We verified our work by implementing the attack and analysing its complexity. We generated $2^{41.3}$ prefix pairs fulfilling the conditions in Table 9 for one iteration. Most of these prefix pairs were filtered out by the deduce-and-sieve algorithm. According to our experiments, the filtering rate is $2^{-19.42}$. Thus, about $2^{21.88}$ prefix pairs remained after applying our deduce-and-sieve algorithm. If we run the deduce-and-sieve algorithm without the value phase, the filtering rate is only $2^{-13.55}$. Hence, the value phase improves the efficiency of the deduce-and-sieve algorithm by a factor of $2^{5.87}$.
It is also interesting to note that when the capacity increases, the filtering rate of the deduce-and-sieve algorithm decreases sharply. For example, if the capacity is increased by 16 bits to 788 bits, the filtering rate of the deduce-and-sieve algorithm is $2^{-26.1}$. Thus, it is difficult to investigate the properties of the remaining pairs in a statistical manner when the input space is even smaller. To build a collision attack against SHA-3-512, an improved deduce-and-sieve algorithm is needed.
The average running time of the deduce-and-sieve algorithm is 1.22 × 10 −5 s for a prefix pair on a single core of Intel(R) Xeon(R) CPU E5-2609 0 @ 2.40GHz. If we apply the SAT solver CryptoMiniSAT to determine the connectivity problem instead of using our deduce-and-sieve algorithm, the average running time of the SAT solver for every prefix pair is 0.31s on the same platform. In conclusion, our approach outperforms the SAT solver by a factor of 2.54 × 10 4 on this special type of SAT problems.
if flag then
8: if flag = 0 then
9: return 1 ▷ Accept the prefix pair
return 0 ▷ Discard the prefix pair

According to the software performance figures, approximately $2^{21}$ calls of the 24-round Keccak permutation can be executed in 1 second on a single core of an Intel(R) Sandy Bridge(R) Core i5-2400 @ 3.10GHz. This indicates that one execution of 4-round SHA-3-384 takes $4/24 \times 2^{-21} = 2^{-23.58}$ s. Thus, according to our experiments, running our deduce-and-sieve algorithm on one prefix pair is approximately equivalent to $1.22 \times 10^{-5} / 2^{-23.58} = 2^{7.25}$ SHA-3-384 operations. The average running time of the SAT solver for each prefix pair remaining after the deduce-and-sieve algorithm is 3.93s on the same platform, which is equivalent to $3.93 / 2^{-23.58} = 2^{25.55}$ SHA-3-384 operations.
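The unit conversion in this paragraph can be reproduced with a few lines of Python. The sketch only re-evaluates the figures quoted in the text; the variable names are illustrative, and small deviations in the second decimal place come from rounding in the quoted values.

```python
from math import log2

keccak_f_per_second = 2 ** 21                 # 24-round permutations per second
t_4round = (4 / 24) / keccak_f_per_second     # seconds per 4-round evaluation

log_t_4round = log2(t_4round)                 # about -23.58
sieve_cost = log2(1.22e-5 / t_4round)         # deduce-and-sieve per prefix pair
sat_cost = log2(3.93 / t_4round)              # SAT solver per remaining pair

print(f"4-round evaluation: 2^{log_t_4round:.2f} s")
print(f"deduce-and-sieve:   2^{sieve_cost:.2f} evaluations per pair")
print(f"SAT solver:         2^{sat_cost:.2f} evaluations per pair")
```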
We use statistical methods to analyse the data complexity. Deriving the probability that there exists a solution for a connectivity problem in a non-statistical manner is an open problem. We define a semi-free $n$-bit internal collision attack as an attack in which the adversary is assumed to be able to modify $n$ bits of the chaining values for each suffix message, where $n > 0$. The corresponding connectivity problem is called a semi-free $n$-bit internal connectivity problem. In our experiments, there are 11.07 suffix seed pairs on average in each iteration for constructing semi-free 14-bit internal collision attacks.
To analyse the complexity of our attack, we show an important property of the connectivity problems in Observation 1.
Observation 1. The probability that a semi-free $n$-bit internal connectivity problem is still satisfiable as a semi-free $(n-1)$-bit internal connectivity problem, denoted as $p_n$, is approximately $\frac{1}{2}$, where $n \ge 1$.

Observation 1 is discovered through experiments. We estimate $p_n$ by computing the ratio of suffix seed pairs that are still compatible for constructing a semi-free $(n-1)$-bit internal collision attack among the seed pairs for semi-free $n$-bit internal collision attacks. The ratio is denoted as $\hat{p}_n$. The procedure of estimating $p_n$ for each $n \in [1, 14]$ is shown in Algorithm 13. For each $n \in [1, 14]$, we randomly generate 25600 internal state pairs $(m, m')$ to the second block such that $R(m) \oplus R(m') = \alpha_1$. To be more specific, we randomly pick $z \in \mathbb{F}_2^{1600}$ and obtain an internal state pair to the second block, which is $(m, m') = (R^{-1}(z), R^{-1}(z \oplus \alpha_1))$. If we assume that the last $1600 - (828 + n) = 772 - n$ bits of $(m, m')$ are the chaining values, $(m, m')$ is a suffix seed pair for constructing a semi-free $n$-bit internal collision attack. In Lines 6 to 8 of Algorithm 13, we modify the $(827 + n)$-th bits of $m$ and $m'$ and denote the new pair as $(m_1, m'_1)$. We call the SAT solver to check whether $(m_1, m'_1)$ is a suffix seed pair for a semi-free $(n-1)$-bit internal collision attack. Finally, Algorithm 13 returns the ratio $\hat{p}_n$. The experimental results are shown in Table 5. It can be seen that each $\hat{p}_n$ is close to 0.5, where $n \in [1, 14]$.
Algorithm 13
Compute an internal input pair to the second block $(m, m') = (R^{-1}(z), R^{-1}(z \oplus \alpha_1))$
5: for each integer $j \in [0, 4)$ do
7: Add the $(827 + n)$-th bit of $m_1$ with the first bit of $j$
8: Add the $(827 + n)$-th bit of $m'_1$ with the second bit of $j$
9: Call CryptoMiniSAT to determine whether $(m_1, m'_1)$ is a suffix seed pair for a semi-free $(n-1)$-bit internal collision attack
10: if $(m_1, m'_1)$ is a seed pair then
11: ctr = ctr + 1
return $\hat{p}_n$ = ctr/(4 × 256000)

It follows from Observation 1 that to build a real collision attack, we need to collect $2^{14}$ suffix seed pairs for the semi-free 14-bit internal collision attack. Therefore, we need to generate $2^{41.3} \cdot 2^{14} / 11.07 = 2^{51.83}$ prefix pairs. To generate these pairs, the adversary applies the hash table technique in Algorithm 2. As mentioned in Section 5.3, the time, data and memory complexity of the first-block generation stage should be $2^{45.92}$. The whole procedure of the attack is summarised in Algorithm 14.
In the 1-round SAT-based connector stage, the adversary applies the deduce-and-sieve algorithm to filter the $2^{51.83}$ prefix pairs. The time complexity is $2^{7.25} \cdot 2^{51.83} = 2^{59.1}$ and the memory complexity is negligible. Then, the adversary solves the connectivity problems for the remaining $2^{51.83} \cdot 2^{-19.42} = 2^{32.41}$ prefix pairs by applying the SAT solver. The time complexity is $2^{32.41} \cdot 2^{25.55} = 2^{57.96}$. According to our experiments, the memory cost of applying the SAT solver is also negligible. Thus, the complexity of the second stage is $2^{59.1} + 2^{57.96} \approx 2^{59.64}$.
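The bookkeeping behind these figures can be rechecked with a short Python sketch. It only recombines numbers quoted in the text (the per-iteration pair count, the average of 11.07 seed pairs, the filtering rate and the per-pair costs); the variable names are illustrative and all quantities are base-2 logarithms unless noted.

```python
from math import log2

# Number of prefix pairs needed: 2^41.3 per iteration, 2^14 / 11.07 iterations.
log_pairs = 41.3 + 14 - log2(11.07)           # about 51.83
# Algorithm 2 yields 2^(2n - 40) pairs from 2^n messages.
n = (log_pairs + 40) / 2                      # about 45.92

log_sieve = 7.25 + 51.83                      # deduce-and-sieve on every pair
log_remaining = 51.83 - 19.42                 # pairs surviving the sieve
log_sat = log_remaining + 25.55               # SAT solver on the survivors
log_stage2 = log2(2 ** 59.1 + 2 ** 57.96)     # sum of the two reported costs

print(f"prefix pairs 2^{log_pairs:.2f}, first-block messages 2^{n:.2f}")
print(f"sieve 2^{log_sieve:.2f}, SAT 2^{log_sat:.2f}, stage 2 2^{log_stage2:.2f}")
```

The sieve term evaluates to $2^{59.08}$, which the text rounds to $2^{59.1}$ before adding the SAT term.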
In the collision searching stage, the adversary solves the system of linear equations combining Equation 3 and Equation 4 with a prefix pair and a suffix seed pair obtained from the previous stage; the time complexity of this step is negligible. Then, the adversary searches the solutions of the linear equations for a suffix pair following the last 3-round differential characteristic of probability $2^{-42}$. In our experiments, the solution space of the linear equations typically has dimension 100, which is sufficiently large to find a colliding pair. Thus, the complexity of this stage is $2^{42}$.
The time complexity of our collision attack is determined by the complexity of the second stage, which is $2^{59.64}$. The memory and data complexity are both $2^{45.92}$. Recall that the second stage includes two phases: applying the deduce-and-sieve algorithm to filter prefix pairs, and solving the remaining connectivity problems with SAT solvers. Finding the optimal filtering rate of the deduce-and-sieve algorithm, so as to balance the complexity of the two phases by choosing a proper $\alpha_0$, is also an open problem.
As the size of available memory we have is insufficient to generate $2^{51.83}$ prefix pairs in one run utilising a large hash core·hours. Recall that we can find 11.07 suffix seed pairs in each iteration for semi-free 14-bit internal collision attacks. To find a collision pair of 4-round SHA-3-384, approximately $2^{14}/11.07 \approx 1480$ iterations are needed, which amounts to $1.63 \times 10^7$ core·hours. To be more specific, the total running time is around 7.3 years on our platform. So far, we have run 106 iterations on our platform. In these iterations, the best result is a suffix seed pair for constructing a semi-free 4-bit internal collision attack, which is consistent with our estimation. We show our message pairs for a semi-free 4-bit internal collision in Table 6. We also show the suffix seed pairs in Table 7.

Conclusions
In this paper we describe a practical collision attack on 4-round SHA-3-384. Our attack outperforms the previous collision attack, of complexity $2^{147}$, proposed by Dinur et al. in [DDS13]. Currently, our result includes a semi-free 4-bit internal collision, but an adversary with slightly higher computing power can find a collision in practical time.
Although this work does not threaten the security of the full SHA-3 hash function, our results may be applied to analyse other sponge-based hash functions. The two cryptanalytic tools that we introduced in this work, namely the Truncated Difference Transform Table and the Fixed Value Distribution Table, can be helpful in detecting non-random behaviour of S-boxes. These tools may be useful in analyses of other primitives with the sponge construction, for example, Keccak with smaller states [BDPA], Xoodyak [DHP + 20], Gimli [BKL + 17], etc. The tools may also be useful for future designs of new secure non-linear layers in symmetric primitives.
With the deduce-and-sieve algorithm developed in this work, most unsatisfiable cases in a class of SAT problems can be identified more efficiently than by calling a SAT solver directly. The deduce-and-sieve algorithm may help to enhance the performance of a SAT solver for certain classes of SAT problems.
A Linear expressions of the output differences of χ with different conditions
Table 3
13: Table 3
15: return a = the number of newly deduced difference bits in $\beta_0$.

C Generating Multi-Block Prefix Pairs
Algorithm 16 Generating $n_b$-Block Prefix Pairs
Input: Chosen $(n_b - 1)$-Block Prefixes $(P, P')$
Output: Set of $n_b$-Block Prefix Pairs $S_P$
1: Constant XOR Σ = 0x7e00000000
2: $S_P = \emptyset$
3: Initialise two arrays, $Cntr_A$ and $Cntr_B$, of length $2^{39}$ with zeros.
4: for each integer $i \in [0, 2^n)$ do

D Converting the Connectivity Problem into a SAT problem
Algorithm 17 Converting a connectivity problem into a SAT problem   for each integer j ∈ [0, 5) do