Automating Collision Attacks on RIPEMD-160

As an ISO/IEC standard, the hash function RIPEMD-160 has been used to generate the Bitcoin address with SHA-256. However, due to the complex double-branch structure of RIPEMD-160, the best collision attack only reaches 36 out of 80 steps of RIPEMD-160, and the best semi-free-start (SFS) collision attack only reaches 40 steps. To improve the 36-step collision attack proposed at EUROCRYPT 2023, we explored the possibility of using different message differences to increase the number of attacked steps, and we finally identified one choice allowing a 40-step collision attack. To find the corresponding 40-step differential characteristic, we re-implement the MILP-based method to search for signed differential characteristics with SAT/SMT. As a result, we can find a colliding message pair for 40-step RIPEMD-160 in practical time, which significantly improves the best collision attack on RIPEMD-160. For the best SFS collision attack published at ToSC 2019, we observe that the bottleneck is the probability of the right-branch differential characteristics, as they are fully uncontrolled in the message modification. To address this issue, we utilize our SAT/SMT-based tool to search for high-probability differential characteristics for the right branch. Consequently, we can mount successful SFS collision attacks on 41, 42 and 43 steps of RIPEMD-160, thus significantly improving the SFS collision attacks. In addition, we also searched for a 44-step differential characteristic, but the differential probability is too low to allow a meaningful SFS collision attack.


Introduction
As components of cryptographic primitives, hash functions are important for building secure systems. Generally, a hash function takes an arbitrarily long message as input and outputs a fixed-length hash value of $n$ bits. Three fundamental security properties of a hash function are collision resistance, preimage resistance and second-preimage resistance. Since 2005, many hash functions in the MD-SHA hash family have been broken, including MD4 [WLF+05], MD5 [WY05], SHA-0 [WYY05b, BCJ+05], SHA-1 [WYY05a, LP20, SBK+17, LP19] and RIPEMD-128 [LP13]. However, the security of RIPEMD-160 and SHA-2 has not been compromised. In particular, RIPEMD-160 is an ISO/IEC standard that is now used together with SHA-256 to generate the Bitcoin address. In this sense, further studying the security of RIPEMD-160 is meaningful.
RIPEMD-160 has a complex double-branch structure, which has slowed the progress of collision attacks. The first collision attack on RIPEMD-160, presented at ASIACRYPT 2017 [LMW17], only reached 30 steps with a time complexity of $2^{70}$. Subsequently, at CRYPTO 2019 [LDM+19a], two different collision attack frameworks were proposed, namely the dense-left-and-sparse-right (DLSR) framework and the sparse-left-and-dense-right (SLDR) framework. Based on the DLSR framework, practical 30/31-step collision attacks and a theoretical 34-step collision attack were achieved for the first time. At EUROCRYPT 2023 [LWS+23], a new strategy to choose the message differences was proposed, based on which a collision attack on 36-step RIPEMD-160 was achieved.
For the SFS collision attack on RIPEMD-160, the first SFS security analysis was presented at ISC 2012 [MNSS12], including practical examples of SFS near-collisions for 48 steps and SFS collisions for 36 steps, where the two attacks start from an intermediate step.
The first major improvement was achieved at ASIACRYPT 2013 [MPS+13], where the authors presented two results: a 42-step SFS collision attack starting from an intermediate step with a time complexity of $2^{75.5}$, and a 36-step SFS collision attack starting from the first step with a time complexity of $2^{70.4}$. Then, a 48-step SFS collision attack starting from an intermediate step was presented at ToSC 2017 [WSL17]. Moreover, at ASIACRYPT 2017 [LMW17], the complexity of the 36-step SFS collision attack in [MPS+13] was further improved to $2^{55.1}$. Another major step forward was made at ToSC 2019 [LDM+19b], where the first practical SFS collision attack on 36/37-step RIPEMD-160 starting from the first step was achieved, and the best attack could reach 40 steps with a time complexity of $2^{74.6}$. Note that the overall time complexity of the SFS collision attacks in [LDM+19b] is almost entirely dominated by the probability of the right-branch differential characteristics. However, these right-branch differential characteristics were deduced by hand, and it is unknown whether they are optimal. Thus, it becomes important to study this problem in order to further increase the number of attacked steps.
To mount (SFS) collision attacks on RIPEMD-160, or more generally on the MD-SHA hash family, it is essential to first search for signed differential characteristics. While this problem has been efficiently solved with the guess-and-determine technique [CR06, MNS11, MNS12, MNS13, MPS+13, EMS14, DEM15], the corresponding tools are not open-source. This motivated the authors of [LWS+23] to create a new MILP-based tool that is both open-source and easy to use. In particular, all the details needed to write this MILP-based tool are provided in [LWS+23], which makes it easy to re-implement it in other frameworks such as SAT/SMT.
To increase the diversity of automatic tools, we re-implement the MILP-based tool proposed at EUROCRYPT 2023 [LWS+23] with SAT/SMT. This SAT/SMT-based tool is used in all our attacks, and it performs very well when searching for RIPEMD-160 differential characteristics. We do not treat this as a main contribution, but it enriches the pool of available tools.
Our contributions. The contributions of this paper are summarized below:
1. We provide new insight into the collision attack on RIPEMD-160. Specifically, we are able to propose the first practical colliding message pair for 40-step RIPEMD-160, improving the previously best theoretical collision attack at EUROCRYPT 2023 [LWS+23] by 4 steps.
2. To improve the SFS attacks on RIPEMD-160, we utilize our SAT/SMT-based tool to find the sparsest differential characteristics for the right branch. In this way, we are able to find high-probability differential characteristics for the right branch, which allows us to address the above-mentioned issue and improve the attacks.
3. Based on the newly found 41/42/43-step differential characteristics, we obtain the first SFS collision attacks on 41, 42 and 43 steps of RIPEMD-160 with time complexities of $2^{59.7}$, $2^{67.3}$ and $2^{74.8}$, respectively. This is the first time that SFS collision attacks starting from the first step cover more than half of the 80 steps of RIPEMD-160.
We also attempted to attack 44-step RIPEMD-160, but the probability of the right-branch differential characteristic is too low to allow a successful SFS attack in the classical setting.
Our results for RIPEMD-160 are summarized in Table 1. The source code to find the differential characteristics for the (SFS) collision attacks on RIPEMD-160 is available at https://github.com/Peace9911/ripemd160_attack.git
Organization. This paper is organized as follows. The notation and description of RIPEMD-160 are given in Section 2. Then, we revisit the MILP-based method to search for signed differential characteristics for RIPEMD-160 in Section 3. Next, we describe the collision attack on 40-step RIPEMD-160 in Section 4. In Section 5, we show how to improve the SFS collision attack with newly discovered differential characteristics. Finally, the paper is concluded in Section 6.

Notations
For a better understanding of this paper, we introduce the following notations.
1. ⊞ and ⊟ represent modular addition and modular subtraction on 32 bits, respectively.

4. $x[i]$ denotes the $i$-th bit of $x$, where $x[0]$ is the least significant bit.
5. $\Delta x$ denotes the signed difference between $x'$ and $x$.
6. $\phi^l_j$ and $\phi^r_j$ represent the 32-bit Boolean functions of the left and right branches in round $j$, respectively.
7. $K^l_j$ and $K^r_j$ represent the constants used in the left and right branches in round $j$, respectively.
8. $s^l_i$ and $s^r_i$ represent the rotation constants for the left and right branches at step $i$, respectively.
9. $\pi^l_i$ and $\pi^r_i$ represent the index of the message word used in the left and right branches at step $i$, respectively.

Description of RIPEMD-160
For our collision attacks on RIPEMD-160, we aim to find two message blocks $(M_0, M_1)$ and $(M_0, M_1')$ such that
$$H(H(CV_0, M_0), M_1) = H(H(CV_0, M_0), M_1'),$$
where $M_1 \neq M_1'$ and $CV_0$ is a prefixed constant. For the SFS collision for RIPEMD-160, we need to find a pair $(M, M')$ with $M \neq M'$ satisfying
$$H(CV, M) = H(CV, M'),$$
where $CV$ can be any 160-bit constant.

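On the compression function $H(CV, M)$. The compression function $H$ consists of 80 steps, divided into 5 rounds of 16 steps each in both branches. The input $M$ is composed of 16 message words $(m_0, m_1, \ldots, m_{15})$ and $CV$ is divided into five 32-bit words $(h_0, \ldots, h_4)$. In particular, the initial internal states of both branches are derived from $CV$ as
$$X_{-5} = Y_{-5} = h_0^{\ggg 10},\quad X_{-4} = Y_{-4} = h_4^{\ggg 10},\quad X_{-3} = Y_{-3} = h_3^{\ggg 10},\quad X_{-2} = Y_{-2} = h_2,\quad X_{-1} = Y_{-1} = h_1.$$
For the prefixed constant $CV_0$, the corresponding $(h_0, \ldots, h_4)$ are as follows:
$$h_0 = \texttt{0x67452301},\quad h_1 = \texttt{0xefcdab89},\quad h_2 = \texttt{0x98badcfe},\quad h_3 = \texttt{0x10325476},\quad h_4 = \texttt{0xc3d2e1f0}.$$
The step function of RIPEMD-160 at step $i$ is shown below:
$$LQ_i = X_{i-5}^{\lll 10} \boxplus \phi^l_j(X_{i-1}, X_{i-2}, X_{i-3}^{\lll 10}) \boxplus m_{\pi^l_i} \boxplus K^l_j,\qquad X_i = X_{i-4}^{\lll 10} \boxplus (LQ_i)^{\lll s^l_i},$$
$$RQ_i = Y_{i-5}^{\lll 10} \boxplus \phi^r_j(Y_{i-1}, Y_{i-2}, Y_{i-3}^{\lll 10}) \boxplus m_{\pi^r_i} \boxplus K^r_j,\qquad Y_i = Y_{i-4}^{\lll 10} \boxplus (RQ_i)^{\lll s^r_i},$$
where $0 \le i \le 79$ and $j = \lfloor i/16 \rfloor \in \{0, 1, 2, 3, 4\}$. The details of the Boolean functions and round constants for RIPEMD-160 are displayed in Table 2. The other parameters can be found in the specification [DBP96].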
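As a minimal sketch of the step function in the $(X_i, LQ_i)$ notation used throughout this paper, the following Python code evaluates one left-branch step and maps a chaining value to the initial internal states. It assumes the standard RIPEMD-160 step function from the specification [DBP96]; the helper names (`rol`, `ror`, `left_step`, `init_states`) are hypothetical and introduced only for illustration.

```python
MASK = 0xFFFFFFFF  # all words are 32 bits


def rol(x, s):
    """Rotate a 32-bit word left by s positions."""
    s %= 32
    return ((x << s) | (x >> (32 - s))) & MASK


def ror(x, s):
    """Rotate a 32-bit word right by s positions."""
    return rol(x, (32 - s) % 32)


def xor3(x, y, z):
    """Boolean function of the first left round: x XOR y XOR z."""
    return x ^ y ^ z


def left_step(x5, x4, x3, x2, x1, m, k, s, f):
    """Inputs are X_{i-5}, ..., X_{i-1}; returns (LQ_i, X_i)."""
    lq = (rol(x5, 10) + f(x1, x2, rol(x3, 10)) + m + k) & MASK
    x = (rol(x4, 10) + rol(lq, s)) & MASK
    return lq, x


def init_states(h):
    """Map (h0, ..., h4) to (X_{-5}, X_{-4}, X_{-3}, X_{-2}, X_{-1})."""
    h0, h1, h2, h3, h4 = h
    return [ror(h0, 10), ror(h4, 10), ror(h3, 10), h2, h1]


# Standard initial value of RIPEMD-160.
IV = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)
X = init_states(IV)
# First left step with an all-zero message word (illustrative input only);
# the first left round uses the constant 0 and the rotation 11.
lq0, x0 = left_step(*X, m=0, k=0, s=11, f=xor3)
print(hex(lq0), hex(x0))
```

The right branch has the same shape with $(Y_i, RQ_i)$ and its own Boolean functions, constants, rotations and message word order.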

Finding RIPEMD-160 Differential Characteristics
Recently, at EUROCRYPT 2023 [LWS+23], a novel MILP-based method to search for signed differential characteristics was proposed. The main motivation behind that work [LWS+23] is to create an easy-to-use and open-source tool for the MD-SHA hash family. To increase the diversity of such open-source tools, we re-implement the MILP-based method with SAT/SMT, i.e. all constraints are described in conjunctive normal form (CNF) rather than as linear inequalities. The implementation details are omitted from this paper as they generally follow the pseudo-code given for the MILP-based method in [LWS+23].
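To illustrate what describing a constraint in CNF can look like, the sketch below uses one generic textbook construction: every assignment that violates a small Boolean relation is excluded by one clause. This is not necessarily how the authors generate their clauses, and the helper names `relation_to_cnf` and `satisfies` are hypothetical.

```python
from itertools import product


def relation_to_cnf(n_vars, is_valid):
    """Return clauses (lists of signed, 1-based literals) that exclude
    every assignment rejected by is_valid."""
    clauses = []
    for bits in product((0, 1), repeat=n_vars):
        if not is_valid(bits):
            # This clause is false exactly on the invalid assignment `bits`.
            clauses.append([-(i + 1) if b else (i + 1)
                            for i, b in enumerate(bits)])
    return clauses


def satisfies(bits, clauses):
    """Check whether an assignment satisfies all clauses."""
    return all(
        any((lit > 0) == bool(bits[abs(lit) - 1]) for lit in clause)
        for clause in clauses
    )


# Example relation on (x, y, z): z = x XOR y.
xor_cnf = relation_to_cnf(3, lambda b: b[2] == b[0] ^ b[1])
for clause in xor_cnf:
    print(clause)
```

The resulting clause set is correct but not minimal; for relations with many variables, a logic minimizer is usually applied before the clauses are handed to a solver.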

The Automatic Method in [LWS+23]
For completeness, we first briefly revisit the technique in [LWS+23] to search for RIPEMD-160 differential characteristics with MILP. Our new implementation with SAT/SMT follows the same idea. Specifically, the step function of RIPEMD-160 has the following form:
$$d_{i+5} = d_{i+1}^{\lll 10} \boxplus \left(F(d_{i+4}, d_{i+3}, d_{i+2}^{\lll 10}) \boxplus d_i^{\lll 10} \boxplus m \boxplus c\right)^{\lll s}, \qquad (1)$$
where $(d_i, \ldots, d_{i+5}, m)$ are all 32-bit variables, $c$ is a 32-bit constant, $s \in [0, 31]$ is an integer and $F$ is a Boolean function.
Denote the signed difference of $(d_i, \ldots, d_{i+5}, m)$ by $(\Delta d_i, \ldots, \Delta d_{i+5}, \Delta m)$. Then, each of $(\Delta d_i, \ldots, \Delta d_{i+5}, \Delta m)$ can be represented with a vector of size 32. In this sense, it is only required to study the following step function because the rotation $(\lll 10)$ only affects the order of variables:
$$a_5 = a_1 \boxplus \left(F(a_4, a_3, a_2) \boxplus a_0 \boxplus m \boxplus c\right)^{\lll s}. \qquad (2)$$
With some intermediate 32-bit variables $(b_0, \ldots, b_5)$, Equation 2 can be further decomposed into a sequence of elementary operations. In [LWS+23], the authors described how to model the signed difference transitions through the step function, i.e. how to use constraints to describe the propagation $(\Delta a_0, \ldots, \Delta a_4, \Delta m) \rightarrow \Delta a_5$.
In particular, the model can be briefly summarized as follows:
• Model the deterministic signed difference addition $\Delta z = \Delta x \boxplus \Delta y$. Specifically, although there are many possible $\Delta z$ for a given $(\Delta x, \Delta y)$, only one valid $\Delta z$ is considered. This is based on the features of the step function of the MD-SHA hash family, and it also follows the way such a differential characteristic is deduced by hand.
• Model the signed difference transitions for the Boolean function $F$, i.e. $(\Delta a_4, \Delta a_3, \Delta a_2) \rightarrow \Delta b_1$. This is captured by the so-called fast filtering model for $F$ in [LWS+23].
• Model the signed difference transitions for $\Delta z = 0 \boxplus \Delta z'$, which is called modelling the expansion of the modular difference. In other words, for a given $\Delta z'$, compute all possible $\Delta z$ that correspond to the same modular difference.
• Model the update $a_5 = a_1 \boxplus b_3^{\lll s}$. The authors of [LWS+23] introduced two different ways to model it, i.e. the first strategy and the second strategy, such that the model can handle as many cases as possible.
However, simply using the above models is insufficient because contradictions easily occur, especially in the Boolean function. Hence, they introduced the so-called monitoring variables, which can be used to monitor whether contradictions occur in the difference transitions through the Boolean functions over different steps. Roughly speaking, by using three additional variables $(a_4, a_3, a_2)$ and constructing another model that only captures the relations between $(\Delta a_4, \Delta a_3, \Delta a_2, \Delta b_1)$ and $(a_4, a_3, a_2)$, it is possible to detect the contradictions in the Boolean functions over different steps. In [LWS+23], the model for $F$ that involves $(a_4, a_3, a_2)$ is called the full model for $F$.
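The role of the bit values $(a_4, a_3, a_2)$ in the transitions through $F$ can be seen in a small brute-force sketch for a single bit position. The symbol convention below (`'='` for equal bits, `'n'` for $(x, x') = (0, 1)$, `'u'` for $(x, x') = (1, 0)$) is an assumption made for this example, and the function names are hypothetical; this is not the fast filtering model or the full model of [LWS+23], only an enumeration of what such models have to capture.

```python
from itertools import product

# Assumed convention: '=' means x == x', 'n' means (x, x') = (0, 1),
# 'u' means (x, x') = (1, 0).
PAIRS = {'=': [(0, 0), (1, 1)], 'n': [(0, 1)], 'u': [(1, 0)]}


def symbol(v, v_prime):
    """Signed difference symbol of a single bit pair."""
    if v == v_prime:
        return '='
    return 'n' if (v, v_prime) == (0, 1) else 'u'


def transitions(f, dx, dy, dz):
    """Map each reachable output symbol to the bit values (x, y, z)
    of the first message that produce it."""
    result = {}
    for (x, xp), (y, yp), (z, zp) in product(PAIRS[dx], PAIRS[dy], PAIRS[dz]):
        out = symbol(f(x, y, z) & 1, f(xp, yp, zp) & 1)
        result.setdefault(out, []).append((x, y, z))
    return result


def xor3(x, y, z):
    return x ^ y ^ z


def ite(x, y, z):
    """Single-bit 'if x then y else z', i.e. (x AND y) OR (NOT x AND z)."""
    return (x & y) | ((1 - x) & z)


# With a difference only in y, the output difference depends on the value of x.
print(transitions(ite, '=', 'n', '='))
# For XOR, two input differences cancel regardless of the bit values.
print(transitions(xor3, 'n', 'n', '='))
```

In the first call, the output difference `'n'` requires $x = 1$ while the output `'='` requires $x = 0$. If two steps that share the same state bit demand both values, the characteristic is contradictory, which is the kind of situation the monitoring variables are meant to detect.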
Another place where contradictions occur is the operation $a_5 = a_1 \boxplus b_3^{\lll s}$, especially when the conditions on $(a_5, a_1)$ are dense. This is a special operation in RIPEMD-160 and makes it more difficult to find valid signed differential characteristics. Detecting the contradictions in this operation is somewhat involved, and we refer interested readers to [LWS+23] for more details.
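The expansion of a modular difference mentioned in the model summary can be made concrete with a small exhaustive sketch. It represents a signed difference as a vector of digits in $\{-1, 0, 1\}$ (least significant digit first) and lists all vectors with the same modular difference on a toy word size. The function name `expansions` and the 4-bit word size are choices made for this example only; the actual tool encodes this relation as constraints instead of enumerating.

```python
from itertools import product


def expansions(delta, n):
    """All signed-digit vectors (LSB first, digits in {-1, 0, 1}) whose
    value is congruent to delta modulo 2^n."""
    found = []
    for digits in product((-1, 0, 1), repeat=n):
        value = sum(d * (1 << i) for i, d in enumerate(digits))
        if (value - delta) % (1 << n) == 0:
            found.append(digits)
    return found


# The modular difference +1 on 4-bit words has several signed representations,
# obtained by letting the carry propagate further and further.
for digits in expansions(1, 4):
    print(digits)
```

Each additional nonzero digit corresponds to additional bit conditions, which is why sparse expansions are preferred when the goal is a high-probability characteristic.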

New Collision Attacks on RIPEMD-160
With the automatic tool at hand, we first show how to use it to significantly improve the collision attacks on RIPEMD-160. In particular, the currently best collision attack [LWS+23] only reaches 36 out of 80 steps of RIPEMD-160, and it has a time complexity of $2^{64.5}$. In what follows, we show a practical collision attack on 40-step RIPEMD-160 and give the corresponding colliding message pair. This is the first practical violation of the collision resistance of RIPEMD-160 reduced to half of its 80 steps. Note that the currently best SFS collision attack on RIPEMD-160 only reaches 40 steps with a time complexity of $2^{74.6}$ [LDM+19b]. Consequently, our new collision attack also updates the best SFS collision attack on RIPEMD-160.

Choosing New Message Differences
Our new collision attack relies on a new way to choose the message differences. First, we revisit the collision attack on 36-step RIPEMD-160 in [LWS+23], and we generalize their way of constructing the 36-step differential characteristic. As shown in Figure 1, there are 3 places to construct local collisions:
• the first local collision spans from $X_{i_0}$ to $X_{i_1}$ where $i_0 < i_1 < 16$;
• the second local collision spans from $X_{i_2}$ to $X_{i_3}$ where $16 < i_2 < i_3 < 32$;
• the third local collision spans from $Y_{i_4}$ to $Y_{i_5}$ where $0 < i_4 < i_5 < 32$.
Figure 1: The pattern of the RIPEMD-160 differential characteristic.
In [LWS+23], the differences are injected in $(m_0, m_6, m_9)$. Since $(m_0, m_6, m_9)$ are not used to update the internal states $X_i$ and $Y_i$ where $32 \le i \le 35$, a 36-step collision-generating differential characteristic can possibly be constructed by injecting message differences in these 3 message words. For this way of constructing a differential characteristic, the authors of [LWS+23] also proposed an efficient message modification technique. In brief, the cost to fulfill all differential conditions on $(X_i)_{0 \le i \le 9}$ and $(Y_i)_{0 \le i \le 12}$ can be amortized, and the degrees of freedom in $(Y_{13}, Y_{15})$ can be utilized to fulfill the remaining uncontrolled differential conditions. Roughly speaking, the number of differential conditions on $(X_i)_{16 \le i \le 35}$ and $(Y_i)_{16 \le i \le 35}$ dominates the whole time complexity of the attack. Based on the above analysis, if we aim to mount a collision attack based on a differential characteristic of a similar shape, we need to ensure that
• the differential characteristic of the second local collision is as sparse as possible;
• the message words used to inject differences are first used to update the internal states $(X_i, Y_i)$ with $i \ge 32$ as late as possible.
Our strategy for the second local collision. To mount a collision attack on $r + 1$ ($r \ge 36$) steps of RIPEMD-160, we first utilize our SAT/SMT-based tool to find a better choice of the message differences. Specifically, we can build a very simple model to search for the differential characteristic of the second local collision where the message words used to inject differences are not allowed to update $X_i$ and $Y_i$ where $32 \le i \le r$. In this way, we find 3 possible ways to construct the second local collision, as shown in Table 3. Although the results indicate that a 42-step collision attack is possible, we could not find an efficient message modification for the corresponding differential characteristic due to a large number of differential conditions in the second local collision. Hence, we target the 40-step collision attack by injecting message differences in $(m_0, m_2, m_{11}, m_{12})$.
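The requirement that the difference-carrying message words are not used from step 32 onwards for as long as possible can be checked directly against the message word order of RIPEMD-160. The sketch below hard-codes the word order of the first three rounds of both branches as given in the specification [DBP96]; the function name `attackable_steps` is hypothetical. It only checks this necessary condition and says nothing about whether a suitable differential characteristic or message modification exists.

```python
# Message word order of steps 0..47 in the left (PI_L) and right (PI_R)
# branch, taken from the RIPEMD-160 specification.
PI_L = [
    0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
    7, 4, 13, 1, 10, 6, 15, 3, 12, 0, 9, 5, 2, 14, 11, 8,
    3, 10, 14, 4, 9, 15, 8, 1, 2, 7, 0, 6, 13, 11, 5, 12,
]
PI_R = [
    5, 14, 7, 0, 9, 2, 11, 4, 13, 6, 15, 8, 1, 10, 3, 12,
    6, 11, 3, 7, 0, 13, 5, 10, 14, 15, 8, 12, 4, 9, 1, 2,
    15, 5, 1, 3, 7, 14, 6, 9, 11, 8, 12, 2, 10, 0, 4, 13,
]


def attackable_steps(diff_words, start=32):
    """Number of steps that can be covered before one of the message words
    in diff_words is used again at some step >= start in either branch."""
    for i in range(start, len(PI_L)):
        if PI_L[i] in diff_words or PI_R[i] in diff_words:
            return i
    return len(PI_L)


print(attackable_steps({0, 6, 9}))        # choice used in [LWS+23]
print(attackable_steps({0, 2, 11, 12}))   # choice used in this paper
```

For the two choices discussed in the text, the function returns 36 and 40, matching the number of attacked steps.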

Finding the 40-Step Differential Characteristic
To ensure the second local collision and a minimal number of differential conditions on it, $(\delta m_0, \delta m_2, \delta m_{11}, \delta m_{12})$ should satisfy relations that depend on an index $i$, where $0 \le i \le 31$.
To further optimize the whole time complexity of the collision attack, i.e., to make $\sum_{i=15}^{31} H(\Delta Y_i)$ as small as possible as well, we eventually determined the message differences after several trials. With these message differences, we give a high-level description of how to utilize our tool to search for the corresponding 40-step differential characteristic:
Step 1: Find a valid solution of $(\Delta X_i)_{0 \le i \le 12}$ to ensure $(\delta X_i = 0)_{8 \le i \le 12}$, and minimize $\sum_{i=0}^{12} H(\Delta X_i)$.
Step 2: Find a valid solution of $(\Delta Y_i)_{11 \le i \le 31}$.
Step 3: Choose a sparse differential characteristic manually for $(\Delta Y_i)_{3 \le i \le 5}$ and fix it.
To improve the efficiency of the message modification technique, we have tried three strategies for Step 2, as detailed below:
Strategy 1: Directly find a valid solution of $(\Delta Y_i)_{11 \le i \le 31}$ to ensure $(\delta Y_i = 0)_{25 \le i \le 31}$, and minimize $\sum_{i=11}^{31} H(\Delta Y_i)$.
Strategy 2: First, find a valid solution of $(\Delta Y_i)_{16 \le i \le 31}$ such that $(\delta Y_i = 0)_{25 \le i \le 31}$, and minimize $\sum_{i=16}^{31} H(\Delta Y_i)$. Then, find a valid solution of $(\Delta Y_i)_{11 \le i \le 15}$ to connect $(\Delta Y_i)_{16 \le i \le 31}$, and minimize $\sum_{i=11}^{15} H(\Delta Y_i)$.
Strategy 3: First, find a valid solution of $(\Delta Y_i)_{15 \le i \le 31}$ such that $(\delta Y_i = 0)_{25 \le i \le 31}$, and minimize $\sum_{i=15}^{31} H(\Delta Y_i)$. Then, find a valid solution of $(\Delta Y_i)_{11 \le i \le 14}$ to connect $(\Delta Y_i)_{15 \le i \le 31}$, and minimize $\sum_{i=11}^{14} H(\Delta Y_i)$.
We found that Strategy 3 is the most beneficial. The corresponding 40-step differential characteristic is displayed in Table 4. Some extra conditions for the differential characteristic are shown in Table 5.

Finding Conforming Message Pairs
For our 40-step collision attack, two message blocks $(M_0, M_1)$ will be used. Specifically, our goal is to find a tuple $(M_0, M_1, M_1')$ such that $H(CV_1, M_1) = H(CV_1, M_1')$ with $CV_1 = H(CV_0, M_0)$. This is mainly because our 40-step differential characteristic imposes some conditions on $CV_1$. The general procedure to find the conforming message pair for the 40-step differential characteristic is summarized as follows:
Step 1: Find a valid $M_0$ such that the conditions on $CV_1$ hold, i.e., the conditions on $(X_{-5}, X_{-4}, X_{-3}, X_{-2}, X_{-1})$ in the 40-step differential characteristic hold.
Step 2: Similar to [MZ06], use a SAT/SMT model to describe the value transitions for RIPEMD-160. By adding the differential conditions on the internal states to the model, we can then find a valid solution of $(X_i)_{0 \le i \le 9}$ and $(Y_i)_{0 \le i \le 12}$ satisfying all the corresponding conditions by solving the model. For convenience, this solution is called the starting point for the collision attack.
Step 3: Reuse the degrees of freedom of $(Y_{10}, Y_{11})$ to generate more starting points. Specifically, although $(Y_{10}, Y_{11})$ have been fixed at Step 2, we can traverse all their possible values, recompute $(X_8, X_9, Y_{12})$, and check whether the conditions on them hold. In this way, we reduce the workload of the SAT/SMT solver for generating many such solutions at Steps 1 and 2. For each starting point, move to the next step. Return to Step 1 if all starting points are used up.
Step 4: Traverse all possible values of $Y_{13}$, compute $Y_{14}$ and check the differential conditions on $(Y_{14}, LQ_{10}, LQ_{11})$. If they hold, move to the next step.
Step 5: Traverse all possible values of $Y_{15}$ and compute the corresponding $m_{12}$. Then, all message words $(m_i)_{0 \le i \le 15}$ are fixed. Check the remaining uncontrolled differential conditions. If all of them hold, a colliding message pair is found and the procedure terminates. Otherwise, move to Step 4.
Table 5: Some extra conditions for the 40-step differential characteristic (conditions on $LQ_i$ and $RQ_i$).
Based on the above procedure, we found the first colliding message pair for 40-step RIPEMD-160, as shown in Table 6.The whole procedure takes about 16 hours with 115 threads.A theoretic analysis of the time complexity is given below.

Table 6: The colliding message pair $(M_0, M_1)$ and $(M_0, M_1')$ for 40 steps of RIPEMD-160
M_0    4b1de304 f52a5a3e bbd7d814 6454a1d6 a5571007 6c4151f5 8970f768 32c48fd1
       54c428ea 113b00cf 3db1bb85 1d2b2de6 89157118 89157118 d22f990b 6db9f321
M_1    0a179ed0 582e9fee 8c68cd3d 0d120a6e de43af57 df2e7a6f 2b40967e df302947
       ee7f066f d7b7707d 9f1cc8a9 eaecfcb8 0b449f1a ec058b69 996ee0d2 994ef6b1
M_1'   0a159ed0 582e9fee 8c48cd3d 0d120a6e de43af57 df2e7a6f 2b40967e df302947
       ee7f066f d7b7707d 9f1cc8a9 eaecfd38 0b451f1a ec058b69 996ee0d2 994ef6b1
hash   a76b7982 e39826f9 52eb6b63 6b48ecdd 4ddca6c5

Complexity evaluation. There are only 4 bit conditions on $(X_i)_{-5 \le i \le -1}$ and hence Step 1 takes about $2^4$ time. Furthermore, the cost to find a starting point by simply using the SAT/SMT solver is equivalent to about $2^{32.6}$ calls of RIPEMD-160. We found in total 100 such starting points with the SAT/SMT solver. Then, for each such starting point, we reuse the degrees of freedom in $(Y_{10}, Y_{11})$ to generate more starting points. We randomly chose 80 out of the 100 starting points and generated in total about 1000 starting points with the degrees of freedom in $(Y_{10}, Y_{11})$ in a few minutes on a single core. Based on each of these starting points, we further utilize the degrees of freedom in $(Y_{13}, Y_{15})$ to satisfy the remaining uncontrolled differential conditions. Note that the time complexity to fulfill the conditions on $(Y_{14}, LQ_{10}, LQ_{11})$ can be amortized because there are sufficiently many free bits in $Y_{15}$. The total time complexity is almost entirely dominated by the conditions on $(X_i, Y_i)$ where $i \ge 16$, which hold with a probability of about $2^{-49.9}$. Theoretically, the time complexity of our attack is $2^{49.9}$.
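As a quick consistency check of Table 6, the following sketch compares $M_1$ and $M_1'$ word by word and reports where they differ. The modular difference is computed here as $m_i' \boxminus m_i$, which is a convention chosen for this example; the variable names are hypothetical.

```python
MASK = 0xFFFFFFFF

# Second message blocks of the colliding pair, copied from Table 6.
M1 = ("0a179ed0 582e9fee 8c68cd3d 0d120a6e de43af57 df2e7a6f 2b40967e df302947 "
      "ee7f066f d7b7707d 9f1cc8a9 eaecfcb8 0b449f1a ec058b69 996ee0d2 994ef6b1").split()
M1P = ("0a159ed0 582e9fee 8c48cd3d 0d120a6e de43af57 df2e7a6f 2b40967e df302947 "
       "ee7f066f d7b7707d 9f1cc8a9 eaecfd38 0b451f1a ec058b69 996ee0d2 994ef6b1").split()

# word index -> (XOR difference, modular difference m' - m mod 2^32)
diffs = {}
for i, (a, b) in enumerate(zip(M1, M1P)):
    x, xp = int(a, 16), int(b, 16)
    if x != xp:
        diffs[i] = (x ^ xp, (xp - x) & MASK)

for i, (xor_diff, mod_diff) in sorted(diffs.items()):
    print(f"m_{i}: xor = {xor_diff:08x}, modular = {mod_diff:08x}")
```

The two blocks differ exactly in the words $(m_0, m_2, m_{11}, m_{12})$, in line with the chosen message differences.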

Improved SFS Collision Attacks on RIPEMD-160
After improving the collision attacks on RIPEMD-160 by 4 steps, we investigate whether this automatic tool can also be used to improve the SFS collision attack on RIPEMD-160. In particular, we aim to improve the SFS collision attack published at ToSC 2019 [LDM+19b], where the authors could only attack at most 40 steps of RIPEMD-160 with their technique.

Finding New Differential Characteristic for SFS Collision Attacks
Our improved SFS collision attack on RIPEMD-160 still follows the attack framework proposed in [LDM + 19b], as shown in Figure 2.
In this framework, the message difference is only injected at the message word $m_{12}$, and the right-branch differential characteristic should be as sparse as possible because the time complexity of the SFS collision attack is almost entirely dominated by its probability. However, in the previous SFS collision attacks on RIPEMD-160 [LDM+19b], the right-branch differential characteristics were deduced by hand, and it is unknown whether they are optimal. In particular, in their 40-step SFS collision attack, the probability of the right-branch differential characteristic is $2^{-74.6}$, which makes it infeasible to extend the attack to more steps. Intuitively, if better right-branch differential characteristics can be found, the SFS collision attack can be improved. Hence, we investigate whether such differential characteristics can be found with the new automatic tool in order to increase the number of attacked steps, as it is relatively easy to solve optimization problems with these tools.
In the following, we give the details of how we use the automatic tool to find left/right-branch differential characteristics for $t + 1$ steps of RIPEMD-160 where $t \ge 40$.
Finding right-branch differential characteristics. Finding $(\Delta Y_{15}, \ldots, \Delta Y_t)$ is simply done with the SAT/SMT-based tool. Specifically, for each possible $t$, we set the objective function as minimizing $\sum_{i=15}^{t} H(\Delta Y_i)$. The authors of [LDM+19b] pointed out that the conditions on the right branch influence the whole time complexity. To ensure a valid attack, the probability of the right-branch differential characteristic should be higher than $2^{-80}$. Experimental results indicate that the minimal values of $\sum_{i=15}^{t} H(\Delta Y_i)$ are 15, 18 and 20 for $t = 40$, $t = 41$ and $t = 42$, respectively, and the corresponding right-branch differential characteristics hold with probability higher than $2^{-80}$. However, when $t = 43$, the probability of the right-branch differential characteristic is too low to allow a meaningful SFS collision attack.
Finding left-branch differential characteristics. After determining the right-branch differential characteristics, we need to find the corresponding left-branch differential characteristics such that the differences $(\Delta X_{t-5}, \ldots, \Delta X_t)$ cancel the differences $(\Delta Y_{t-5}, \ldots, \Delta Y_t)$ to allow an SFS collision attack on $t + 1$ steps of RIPEMD-160. For this purpose, we first find the solutions of $(\Delta X_{12}, \ldots, \Delta X_{20})$ and $(\Delta X_{35}, \ldots, \Delta X_t)$, respectively, and make them as sparse as possible. This can be achieved by setting the objective functions of the SAT/SMT-based tool as minimizing $\sum_{i=12}^{20} H(\Delta X_i)$ and $\sum_{i=35}^{t} H(\Delta X_i)$, respectively.

The General Message Modification Technique
We mainly use a strategy similar to the one proposed in [LDM+19b] to perform the message modification, as the shape of the 41/42/43-step differential characteristics is almost the same as that of the 40-step one in [LDM+19b].
Specifically, it consists of two phases, and a graphical illustration of the strategy is given in Figure 3.
Phase 1: Find a valid solution of $(X_{12}, \ldots, X_{40})$. For convenience, this solution is called a starting point for the SFS collision attack. For this starting point, all message words except $m_7$ are fixed. We present partial information of the message expansion in Figure 4.
Phase 2: Verify the remaining uncontrolled parts by exhausting all valid values of $X_{11}$. The underlying reason is that after $X_{11}$ is fixed, $m_7$ will be fixed for each starting point. This phase can be made more efficient via an early-abort strategy. Repeat this phase with another starting point if all valid values of $X_{11}$ are used.
The details of Phase 1 and Phase 2 are specified below.
Efficiently generating more starting points. We observe that in the 41/42/43-step differential characteristics, the conditions on $(X_{24}, \ldots, X_{38})$ are dense. With this observation in mind, we can generate an initial starting point as follows. For better understanding, we refer the readers to Figure 4 when reading this part.
Step 1: Find a solution for $(X_{24}, \ldots, X_{38})$ such that all the conditions on them hold. This can be similarly done with the model to describe the value transitions. After this step, the message words $(m_{11}, m_8, m_3, m_{10}, m_{14}, m_9, m_{15})$ are fixed.
Step 2: Then, we utilize the available degrees of freedom in $(m_2, m_5)$ to fulfill the conditions on $(X_{23}, X_{22}, X_{21})$. There are only a few conditions on $(X_{23}, X_{22}, X_{21})$ and hence there are many possible choices of $(m_2, m_5)$.
Step 3: Next, we use $(m_0, m_{12})$ to fulfill the conditions on $(X_{20}, X_{19}, X_{18}, X_{17})$, and there are also many possible values of $(m_0, m_{12})$ due to the sparsity in these 4 states.
Step 4: Then, we use $m_6$ to fulfill the conditions on $(X_{16}, X_{15})$, and use $m_1$ to fulfill the conditions on $(X_{14}, X_{39}, X_{40})$.
Step 5: Finally, we use $m_{13}$ to fulfill the conditions on $(X_{13}, X_{12})$.
The main reason to give such a detailed procedure to find the initial starting point is to better understand the available number of initial starting points, which will be important for the 43-step attack. In particular, it will become clear from what follows that we are interested in the possible number of solutions for $(X_{15}, \ldots, X_{38})$ because we can efficiently generate new starting points from each of them. Due to the sparsity in $(X_{15}, \ldots, X_{23})$, i.e. the sufficiently many available degrees of freedom in $(m_2, m_5, m_0, m_{12}, m_6)$, we can expect to generate many solutions of $(X_{15}, \ldots, X_{38})$ with the above method, and this number will be much larger than $2^{32}$ by simply counting the conditions on $(X_{23}, X_{22}, X_{21})$. After the initial starting point is generated, with the technique in [LDM+19b], we can generate more starting points from it in a much more efficient way:
Step 1: Keep $(X_{15}, X_{16}, X_{17})$ unchanged. Randomly choose a valid value of $(X_{13}, X_{14})$ and recompute $X_{12}$ by inverting the step function at step 17:
$$X_{12} = \left((X_{17} \boxminus X_{13}^{\lll 10})^{\ggg s^l_{17}} \boxminus \phi^l_1(X_{16}, X_{15}, X_{14}^{\lll 10}) \boxminus m_4 \boxminus K^l_1\right)^{\ggg 10}.$$
Then, check the conditions on $(X_{12}, LQ_{16}, LQ_{17}, LQ_{18})$. If they do not hold, randomly choose a new valid value of $(X_{13}, X_{14})$ and repeat until they hold.
Step 2: Modify $m_{13}$ and $m_1$ to keep $X_{18}$ and $X_{19}$ unchanged:
$$m_{13} = (X_{18} \boxminus X_{14}^{\lll 10})^{\ggg s^l_{18}} \boxminus X_{13}^{\lll 10} \boxminus \phi^l_1(X_{17}, X_{16}, X_{15}^{\lll 10}) \boxminus K^l_1,$$
$$m_1 = (X_{19} \boxminus X_{15}^{\lll 10})^{\ggg s^l_{19}} \boxminus X_{14}^{\lll 10} \boxminus \phi^l_1(X_{18}, X_{17}, X_{16}^{\lll 10}) \boxminus K^l_1.$$
In this way, $(X_{15}, \ldots, X_{38})$ are kept the same as in the initial starting point. However, $(X_{39}, X_{40})$ have to be updated as $X_{39}$ is computed from $m_1$. Therefore, we need to further check whether the conditions on $(X_{39}, X_{40})$ hold. If not, we return to Step 1 and use a different value of $(X_{13}, X_{14})$.
Verifying the uncontrolled part. For Phase 2, we utilize the available degrees of freedom in $m_7$ to fulfill the remaining uncontrolled conditions. The details are as follows:
Step 1: Assume that there are $n_1$ bit conditions on $X_{11}$. Then we can exhaust $2^{32-n_1}$ possible values of $X_{11}$ in total. For each possible value of $X_{11}$, compute $m_7$ by inverting the step function at step 16:
$$m_7 = (X_{16} \boxminus X_{12}^{\lll 10})^{\ggg s^l_{16}} \boxminus X_{11}^{\lll 10} \boxminus \phi^l_1(X_{15}, X_{14}, X_{13}^{\lll 10}) \boxminus K^l_1.$$
In this way, all message words are fixed. Then, we first verify the remaining uncontrolled conditions on the left branch via the early-abort strategy. Specifically, we first check the conditions on $(X_{41}, \ldots, X_t)$ since $X_{41}$ is updated with $m_7$. Then, compute backward until $X_8$ and check the conditions on $(X_{11}, X_{10}, LQ_{12}, LQ_{13}, LQ_{14}, LQ_{15})$. If these conditions hold, move to the next step. Otherwise, choose another possible value for $X_{11}$ and repeat.
Step 2: Compute backwards to obtain $(X_{-5}, \ldots, X_{-1})$. Then, compute all the internal states on the right branch. If the conditions on the right branch do not hold, move to Step 1. Otherwise, an SFS collision is found.
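Both phases repeatedly invert the step function: once six consecutive internal states are fixed, the message word of that step is determined. The sketch below shows this inversion for a left-branch step, assuming the standard RIPEMD-160 step function; the function names and the state values are hypothetical, and the constant and rotation in the usage are those of the second left round and of step 16 as recalled from the specification [DBP96], so they should be treated as illustrative.

```python
MASK = 0xFFFFFFFF


def rol(x, s):
    """Rotate a 32-bit word left by s positions."""
    s %= 32
    return ((x << s) | (x >> (32 - s))) & MASK


def ror(x, s):
    """Rotate a 32-bit word right by s positions."""
    return rol(x, (32 - s) % 32)


def ite(x, y, z):
    """Bitwise (x AND y) OR (NOT x AND z) on 32-bit words."""
    return ((x & y) | (~x & z)) & MASK


def step(x5, x4, x3, x2, x1, m, k, s, f):
    """Forward step: X_i from X_{i-5}, ..., X_{i-1} and the message word m."""
    lq = (rol(x5, 10) + f(x1, x2, rol(x3, 10)) + m + k) & MASK
    return (rol(x4, 10) + rol(lq, s)) & MASK


def recover_word(x5, x4, x3, x2, x1, x0, k, s, f):
    """Message word m such that step(x5, ..., x1, m, k, s, f) == x0."""
    lq = ror((x0 - rol(x4, 10)) & MASK, s)
    return (lq - rol(x5, 10) - f(x1, x2, rol(x3, 10)) - k) & MASK


# Hypothetical values for (X_11, ..., X_16); in Phase 2, X_11 is the state
# being traversed while X_12, ..., X_16 come from the starting point.
x11, x12, x13, x14, x15, x16 = (
    0x01234567, 0x89ABCDEF, 0x0F1E2D3C, 0x4B5A6978, 0xDEADBEEF, 0x13579BDF)
K, S = 0x5A827999, 7  # illustrative round constant and rotation
m7 = recover_word(x11, x12, x13, x14, x15, x16, K, S, ite)
assert step(x11, x12, x13, x14, x15, m7, K, S, ite) == x16
print(hex(m7))
```

Because the inversion costs about one step evaluation, traversing all valid values of a single state word is cheap compared with the probabilistic conditions that follow.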

Differences between this work and [LDM+19b]
The general idea of this work is the same as that of [LDM+19b]. However, the right-branch differential characteristics are not optimal in [LDM+19b] as they are deduced by hand, i.e. by experience. We addressed this issue by using the recently proposed MILP/SAT/SMT-based tools to search for optimal differential characteristics for the right branch. As the number of attacked steps increases, the general message modification also slightly differs from [LDM+19b]. Specifically, to efficiently generate more starting points, we additionally need to check the conditions on $(X_{39}, X_{40})$. When verifying the remaining uncontrolled conditions on the left branch, we also additionally need to check the conditions on $(X_{41}, \ldots, X_t)$. It is not obvious whether these 2 extra probabilistic parts affect the whole time complexity, and hence they should be carefully analyzed.

Evaluating the Time Complexity
First, we emphasize that generating the initial starting point can be finished in practical time.In our attacks, it is expected that only a few initial starting points are sufficient because we can generate many more starting points from one initial starting point in an efficient way.Hence, the time complexity to generate the initial starting points is negligible.Second, we evaluate the cost to generate new starting points from the initial starting point.In this procedure, we will exhaust all possible values of (X 13 , X 14 ) and then check the conditions on (X 12 , LQ 16 , LQ 17 , LQ 18 , X 39 , X 40 , LQ 39 , LQ 40 ).when t > 40.When t = 40, we only need to check the following conditions (X 12 , LQ 16 , LQ 17 , LQ 18 , X 39 , LQ 39 , LQ 40 ) since we only need to ensure the modular difference of X 40 in this case.Denote the probability of these conditions by 2 −p1 and the number of bit conditions on (X 13 , X 14 ) by n 2 .Then, we can expect to generate 2 64−n2−p1 new starting points from the initial starting point.The time complexity to generating each new starting point is 2 p1 .Third, we estimate the cost to verify the remaining uncontrolled conditions on the left branch.As already stated, we denote the number of conditions on X 11 by n 1 and there will be 2 32−n1 possible values for X 11 .For each starting point, we first verify the conditions on (X 11 , X 10 , LQ 15 , LQ 14 , LQ 13 , LQ 12 , X 41 , . . ., X t−1 , LQ 41 , . . ., LQ t ).
Note that we only need to ensure the modular difference of $X_t$, and therefore we only care about whether $LQ_t$ satisfies its condition. Denote the probability of these conditions by $2^{-p_2}$. In this way, for each starting point, we expect to find $2^{32-n_1-p_2}$ values of $X_{11}$ such that all the conditions on the left branch hold. The cost to find each such solution is then estimated as $2^{p_2}$ evaluations of $4 + (t - 41 + 1) = t - 36$ steps of the step function.
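The bookkeeping above can be tallied in a few lines. The following is a minimal sketch in log2 scale; the function name and the sample parameter values are hypothetical and are not taken from the tables of this paper, and the step count assumes $t \geq 41$.

```python
def left_branch_counts(t, n1, n2, p1, p2):
    """Tally the left-branch estimates from the text, in log2 scale.

    t      : number of attacked steps (the step count below assumes t >= 41)
    n1, n2 : number of bit conditions on X_11 and on (X_13, X_14)
    p1, p2 : -log2 of the probabilities of the two sets of checked conditions
    """
    return {
        # Exhausting (X_13, X_14) yields 2^(64 - n2 - p1) new starting points.
        "log2_new_starting_points": 64 - n2 - p1,
        # Each new starting point costs about 2^p1 trials.
        "log2_cost_per_starting_point": p1,
        # Each starting point yields 2^(32 - n1 - p2) valid values of X_11.
        "log2_x11_solutions": 32 - n1 - p2,
        # Each valid X_11 costs about 2^p2 trials ...
        "log2_trials_per_solution": p2,
        # ... of 4 + (t - 41 + 1) = t - 36 steps of the step function.
        "steps_per_trial": 4 + (t - 41 + 1),
    }


# Hypothetical parameters, chosen only to illustrate the arithmetic.
counts = left_branch_counts(t=42, n1=10, n2=20, p1=8.0, p2=6.0)
```

With these illustrative values, each trial evaluates $42 - 36 = 6$ steps of the step function.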
Finally, we need to verify the conditions on the right branch. Denote the probability of the right-branch differential characteristic by $2^{-p_3}$. In this way, we need in total $2^{p_3-(32-n_1-p_2)}$ starting points. This indicates that we need $T_1 = \max\{1, 2^{p_3-(32-n_1-p_2)-(64-n_2-p_1)}\}$ initial starting points. Denote the time complexity to generate one initial starting point by $T_s$. The overall time complexity of the attack is then estimated as

Application to 41/42/43-Step Differential Characteristics
Apart from the conditions specified in Table 7, Table 8 and Table 9, we further list some other conditions that affect the overall time complexity.
On 41-step RIPEMD-160. As discussed above, we list the conditions in

On 42-step RIPEMD-160. Similarly, we list in Table 12 the extra conditions that affect the performance of SFS collision attacks and are not present in Table 8. The conditions on all $(LQ_i, RQ_i)$ can be found in Table 16. Consequently, we need to generate $T_1 = 2$ initial starting points, and hence the overall time complexity of the 42-step SFS collision attack is about $2^{67.3}$.

Since
it implies that we need to generate $T_1 = 2^{11}$ initial starting points, and hence the overall time complexity of the 43-step SFS collision attack is about $2^{74.8}$. As already mentioned, we can generate many more than $2^{11}$ different initial starting points, so this is not a problem for our 43-step attack.
On the initial starting points. As already mentioned, the initial starting points can be found efficiently. As evidence, we provide the initial starting points for the 41/42/43-step SFS collision attacks in Table 14. Indeed, in our experiments, we could generate one initial starting point in 30 seconds. Hence, the time complexity $T_1 \cdot T_s$ is negligible given that we only need a few initial starting points.
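To put the cost $T_1 \cdot T_s$ into perspective, the sketch below evaluates $T_1 = \max\{1, 2^{p_3-(32-n_1-p_2)-(64-n_2-p_1)}\}$ in log2 scale and converts it into wall-clock time. It assumes that initial starting points are generated sequentially at the reported rate of 30 seconds each; the function names and the sample exponents are hypothetical and are not the parameters of the actual characteristics.

```python
def log2_initial_starting_points(n1, n2, p1, p2, p3):
    """Return log2(T_1) for T_1 = max{1, 2^(p3 - (32-n1-p2) - (64-n2-p1))}."""
    exponent = p3 - (32 - n1 - p2) - (64 - n2 - p1)
    # max{1, 2^e} in linear scale equals max{0, e} in log2 scale.
    return max(0.0, exponent)


def generation_hours(log2_t1, seconds_per_point=30.0):
    """Wall-clock hours to generate 2^log2_t1 initial starting points.

    Assumption: points are generated one after another on a single machine.
    """
    return (2.0 ** log2_t1) * seconds_per_point / 3600.0


# Hypothetical exponents chosen so that T_1 = 2^11, as in the 43-step case.
log2_t1 = log2_initial_starting_points(n1=10, n2=20, p1=8, p2=6, p3=63)
hours_43 = generation_hours(11)
```

Under this assumption, $2^{11}$ initial starting points take $2^{11} \cdot 30 = 61440$ seconds, i.e. about 17 hours, which is consistent with treating $T_1 \cdot T_s$ as negligible compared with the main search.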

Conclusion
With the automatic SAT/SMT-based tools, we have significantly improved the (SFS) collision attacks on RIPEMD-160. In particular, we found a practical colliding message pair for 40-step RIPEMD-160 for the first time, which practically breaks half of the full-round RIPEMD-160 more than 20 years after its publication. For the SFS collision attack, by searching for new right-branch differential characteristics with minimal Hamming weight, we improved the best attack by 3 steps. It is interesting to investigate whether the (SFS) collision attacks on RIPEMD-160 can be further improved with strategies different from [LDM+19b, LWS+23], since we seem to have reached the limit of the strategies proposed in these two papers. In addition, it may be possible to mount a valid SFS collision attack on 44-step RIPEMD-160 in the quantum setting based on our 44-step differential characteristic. However, this is not our focus, as the most important step in a dedicated quantum collision attack is still to search for a high-probability differential characteristic, and we have addressed this issue in this work.

Figure 2: SFS collision attack framework

$\sum_{i=15}^{43} H(\Delta Y_i)$ is 22, and the corresponding differential characteristic holds with a probability smaller than $2^{-80}$. Therefore, we could possibly perform SFS collision attacks on 41, 42 and 43 steps of RIPEMD-160 under the attack framework [LDM+19b]. This is because the final time complexity is affected by several factors, which should be analyzed more carefully.

Figure 4: Partial information of the message expansion of RIPEMD-160

Table 1: Summary of attacks on RIPEMD-160. *An attack starts at an intermediate step.

two 512-bit message blocks, respectively.
11. $X_i$ and $Y_i$ represent the 32-bit internal state of the left and right branches updated during step $i$ for compressing $M$, respectively.
12. $LQ_i$ and $RQ_i$ represent the 32-bit temporary states of the left and right branches updated in step $i$ for compressing $M$, respectively.
13. The Hamming weight of the signed difference $\Delta x$ is denoted by $H(\Delta x)$, and $H(\Delta x)$ is the number of indices $i$ such that $\Delta x[i] \in \{n, u\}$ [LWS+23].
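The Hamming weight of a signed difference can be illustrated with a short sketch. The helper names are hypothetical, and the sketch assumes the common convention that `n` denotes the bit pair $(x[i], x'[i]) = (0, 1)$, `u` denotes $(1, 0)$, and `=` denotes equal bits; the weight itself does not depend on which of the two signs is called `n`.

```python
def signed_difference(x, x_prime, width=32):
    """Return the signed difference as a string, most significant bit first.

    Assumed convention: 'n' for (x[i], x'[i]) = (0, 1), 'u' for (1, 0),
    and '=' when the two bits agree.
    """
    symbols = []
    for i in reversed(range(width)):
        a = (x >> i) & 1
        b = (x_prime >> i) & 1
        if a == b:
            symbols.append("=")
        elif a == 0:
            symbols.append("n")
        else:
            symbols.append("u")
    return "".join(symbols)


def hamming_weight(delta):
    """H(delta): number of positions i with delta[i] in {n, u}."""
    return sum(1 for s in delta if s in ("n", "u"))


# Toy 4-bit example: x = 0101, x' = 0011 differ in bits 1 and 2.
delta = signed_difference(0b0101, 0b0011, width=4)
weight = hamming_weight(delta)
```

For the toy pair above, exactly two bit positions differ, so the weight is 2; equivalently, $H(\Delta x)$ equals the number of set bits of $x \oplus x'$.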

Table 2: Boolean functions and round constants in RIPEMD-160

Table 3: Three ways to construct the second local collision

Table 12: Extra conditions influencing the attack for the 42-step differential characteristic

Table 13: Extra conditions influencing the attack for the 43-step differential characteristic

Table 15: Some extra conditions for the 41-step differential characteristic. Conditions on $LQ_i$ and $RQ_i$: $(LQ_i \boxplus in^l_i) \lll s^l_i = LQ_i$

Table 16: Some extra conditions for the 42-step differential characteristic. Conditions on $LQ_i$ and $RQ_i$: $(LQ_i \boxplus in^l_i) \lll s^l_i = LQ_i$

Table 17: Some extra conditions for the 43-step differential characteristic. Conditions on $LQ_i$ and $RQ_i$: $(LQ_i \boxplus in^l_i) \lll s^l_i = LQ_i$