Improved MITM Cryptanalysis on Streebog

At ASIACRYPT 2012, Sasaki et al. introduced the guess-and-determine approach to extend the meet-in-the-middle (MITM) preimage attack. At CRYPTO 2021, Dong et al. proposed a technique to derive the solution spaces of nonlinearly constrained neutral words in the MITM preimage attack. In this paper, we combine these two techniques to further improve the MITM preimage attacks. Based on the previous MILP-based automatic tools for MITM attacks, we introduce new constraints arising from the combination of guess-and-determine and nonlinearly constrained neutral words to build a new automatic model. As a proof of work, we apply it to the Russian national standard hash function Streebog, which is also an ISO standard. We find the first 8.5-round preimage attack on the Streebog-512 compression function and the first 7.5-round preimage attack on the Streebog-256 compression function. In addition, we give an 8.5-round preimage attack on the Streebog-512 hash function. Our attacks extend the best previous attacks by one round. We also improve the time complexity of the 7.5-round preimage attack on the Streebog-512 hash function and the 6.5-round preimage attack on the Streebog-256 hash function.


Introduction
The cryptographic hash function is one of the fundamental building blocks in modern cryptography. It is a mathematical algorithm that takes a message of arbitrary length and outputs a bit string of fixed length. Hash functions play important roles in modern cryptography and have been used in many important applications, such as authentication, digital signatures, and message integrity. For hash functions, collision resistance, preimage resistance and second-preimage resistance form the three main security requirements.
The Meet-in-the-Middle (MITM) approach was first introduced by Diffie and Hellman [DH77] in 1977 to attack DES. The MITM attack has long received attention in the key-recovery scenario, but it has only more recently been applied to preimage attacks [AS09b, AMM09, SA08]. Since then, many MITM preimage attacks on various hash functions or their round-reduced variants have been proposed, including MD4 [GLRW10], MD5 [SA09], Tiger [GLRW10, WS10], SHA-0 [AS09a], SHA-1 [AS09a, EFK15, KK12], SHA-2 [AGM+09], HAVAL [SA08, GSY15], BLAKE [EFK15], RIPEMD [WSK+11], HAS-160 [HKS10], Streebog [AY14, MLHL15b, ZWW13, MLHL14], Whirlpool [SWWW12], Grøstl [WFW+12], and AES hashing modes [Sas11, WFW+12, BDG+19, BDG+21]. Meanwhile, many techniques have been proposed to enhance the MITM attacks on hash functions, such as splice-and-cut [AS09b], initial structure [SA09], (indirect-)partial matching [AS09b, SA09], biclique [BKR11], sieve-in-the-middle [CNV13], and match-box [FM15].
The core of a MITM preimage attack on a hash function is generally a MITM preimage attack on its compression function. In the attack, the compression function is divided into two sub-functions so that one portion of the input message bits affects only one sub-function and another portion affects only the other, which allows attackers to mount the MITM attack. The sub-function computed forward is named the forward chunk and the sub-function computed backward is named the backward chunk. The bits affecting only one chunk are called neutral words.
At EUROCRYPT 2021, Bao et al. [BDG+21] built an MILP-based automatic tool for MITM preimage attacks and applied it to AES hashing modes and Haraka v2 [KLMR16]. Later on, Bao et al. [BGST21] improved the model by introducing the technique of guess-and-determine and applied it to Whirlpool and Grøstl. At CRYPTO 2021, Dong et al. [DHS+21] extended the automatic model to MITM key-recovery attacks and collision attacks. In 2022, Schrottenloher and Stevens [SS22] studied a simpler MILP modeling that finds both classical and quantum attacks on a broad class of cryptographic permutations. Besides, another automatic tool was introduced by Derbez and Fouque [DF16] for MITM and DS-MITM attacks [DS08, DKS10, DFJ13, DF16] on block ciphers. That tool is not based on MILP and was not used to attack hash functions.
Streebog [ISO18] is a cryptographic hash function defined in the Russian national standard GOST R 34.11-2012 [GOS12]. It was created to replace the old GOST R 34.11-94 hash function [GOSan], which was theoretically broken in 2008 [MPR08a, MPR+08b]. The hash function is widely used in Russia, and it is also published as RFC 6986 [DD13] by the IETF and standardized in ISO/IEC 10118-3:2018 [ISO18]. Streebog is an iterated hash function that uses the HAIFA framework [BD07] as its domain extension algorithm. It consists of two members, Streebog-256 and Streebog-512, which output 256-bit and 512-bit hash digests, respectively. Streebog-256 uses a different initial state than Streebog-512 and truncates the output hash, but is otherwise identical. The compression function operates in Miyaguchi-Preneel (MP) mode with an AES-like block cipher. The internal state is represented as an 8 × 8 matrix of bytes and is updated 12 times with the round function, followed by an XOR operation with a whitening key.
In the past few years, several cryptanalysis results on Streebog have been reported, including preimage attacks, second preimage attacks, and collision attacks. Wang et al. [WYW13] focused on the compression function and gave collision attacks on the 4.5-, 5.5-, 7.5-, and 9.5-round compression function of Streebog by using the rebound attack [MRST09]. In 2013, Zou et al. [ZWW13] presented collision attacks on the 5-round Streebog-256 and Streebog-512 hash functions with the Super-Sbox technique [GP10, LMR+09] and the multi-collision technique [Jou04]. Additionally, they constructed a preimage attack on the 6-round Streebog-512 hash function by combining the guess-and-determine MITM attack [SWWW12] with multi-collision. At AFRICACRYPT 2014, AlTawy and Youssef [AY14] also proposed a preimage attack on 6-round Streebog-512. At ACNS 2014, Ma et al. [MLHL14] improved the preimage attacks on the 6-round Streebog-512 hash function, and they presented collision attacks on 6.5-round Streebog-256 and 7.5-round Streebog-512. In addition, they constructed a distinguisher on 9.5-round Streebog using the limited-birthday distinguisher [IPS13].
We list the notations below.
• $S^{ENC}$: starting state in the encryption data path (contains $n$ $w$-bit cells)
• $S^{KSA}$: starting state in the key schedule data path (contains $n$ $w$-bit cells)
• $E^+/E^-$: ending state of the forward/backward computation
• $\lambda^-$: the initial degrees of freedom for the backward chunk
• DoM: the degrees of matching
• $l^-$: the number of constraints on the neutral words for the backward chunk
• $\mathrm{DoF}^+$: $\mathrm{DoF}^+ = \lambda^+ - l^+$, the degrees of freedom for the forward chunk
• $\mathrm{DoF}^-$: $\mathrm{DoF}^- = \lambda^- - l^-$, the degrees of freedom for the backward chunk
The computation from $(S^{ENC}, S^{KSA})$ leading to $E^+$ is the forward computation, and the computation from $(S^{ENC}, S^{KSA})$ leading to $E^-$ is the backward computation. The cells of $(S^{ENC}, S^{KSA})$ are partitioned into different subsets with different meanings. Besides, certain values are fixed to an arbitrary constant, and the neutral words for the forward and backward computation paths fulfill the systems of equations referred to as Equation (1) and Equation (2), respectively. The computations for deriving $E^+[M^+]$ and $E^-[M^-]$ can be carried out independently.
Usually, Equations (1) and (2) are linear equations (i.e., the neutral words are linearly constrained) in previous MITM preimage attacks [Sas11, SWWW12]. Therefore, the attackers can solve the linear equations to derive the solution spaces for the neutral words with ease. However, Dong et al. [DHS+21] showed that the neutral words can also be nonlinearly constrained, and they work with the solution spaces induced by Equations (1) and (2). If there are $2^{w\cdot(\lambda^+ - l^+)}$ and $2^{w\cdot(\lambda^- - l^-)}$ solutions of Equations (1) and (2), respectively, then $\mathrm{DoF}^+ = \lambda^+ - l^+$ and $\mathrm{DoF}^- = \lambda^- - l^-$ are the degrees of freedom for the forward and backward computations. For different $\alpha$, $c^+$ and $c^-$, the above process can be repeated many times, and each repetition is called one MITM episode.

MITM Attack with Guess-and-Determine and Linearly Constrained Neutral Words
The guess-and-determine approach was introduced by Sasaki et al. [SWWW12] to extend the MITM preimage attack on Whirlpool. In their attack, some cells may be guessed to be Blue/Red in different states in the forward/backward computation. To explain, we introduce some new notations:
• $Y^{ENC}_+/Y^{KSA}_+$: the set of cells guessed to be Blue for the encryption/key schedule path
• $Y^{ENC}_-/Y^{KSA}_-$: the set of cells guessed to be Red for the encryption/key schedule path
• $\sigma^+/\sigma^-$: the number of cells guessed to be Blue/Red
In Sasaki et al.'s attack [SWWW12], the neutral words are linearly constrained, i.e., Equations (1) and (2) are linear, so the solution spaces of the neutral words can be easily obtained. Their MITM preimage attack is shown in Algorithm 1.

Algorithm 1: Sasaki et al.'s MITM preimage attack with guess-and-determine
Get the solution of $(S^{ENC}[B^{ENC}], S^{KSA}[B^{KSA}])$ by solving Equation (1) and store the values in a table $T_1$.
Get the solution of $(S^{ENC}[R^{ENC}], S^{KSA}[R^{KSA}])$ by solving Equation (2) and store the values in a table $T_2$.

Complexity. From Line 6 to Line 16 of Algorithm 1, we test $2^{w\cdot(\mathrm{DoF}^+ + \mathrm{DoF}^- + \sigma^+ + \sigma^-)}$ messages and expect $2^{w\cdot(\mathrm{DoF}^+ + \mathrm{DoF}^- + \sigma^+ + \sigma^- - m)}$ of them to pass the $m$-cell filter. We need to verify the correctness of these partial matchings. In Line 13, the probability that the guessed cells in the forward and backward computations are correct is $2^{-w\cdot(\sigma^+ + \sigma^-)}$. Hence, there will be $2^{w\cdot(\mathrm{DoF}^+ + \mathrm{DoF}^- - m)}$ valid partial matchings that pass the check of Line 13. Suppose we are finding a preimage of the $h$-cell target; the overall time complexity is determined by these quantities. In the attack, we need to store the tables $T_1$, $T_2$ and $L$, which determine the memory complexity.
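The three counts in the complexity paragraph above can be tabulated with a small helper. This is only a sketch: the function and argument names are hypothetical, and the numeric parameters in the example are toy values chosen for illustration, not parameters of any attack in this paper.

```python
def episode_exponents(w, dof_fwd, dof_bwd, sigma_fwd, sigma_bwd, m):
    """Base-2 exponents of the three counts named in the complexity paragraph.

    Argument names are hypothetical; they mirror w, DoF+, DoF-, sigma+,
    sigma- and the m-cell filter from the text.
    """
    # Number of messages tested in one episode.
    tested = w * (dof_fwd + dof_bwd + sigma_fwd + sigma_bwd)
    # An m-cell filter keeps a 2^(-w*m) fraction of them.
    after_filter = tested - w * m
    # Guessed cells are correct with probability 2^(-w*(sigma+ + sigma-)).
    valid = after_filter - w * (sigma_fwd + sigma_bwd)
    return {"tested": tested, "after_filter": after_filter, "valid": valid}


# Toy parameters for illustration only.
counts = episode_exponents(w=8, dof_fwd=2, dof_bwd=1, sigma_fwd=0, sigma_bwd=1, m=2)
print(counts)
```

The last exponent equals $w\cdot(\mathrm{DoF}^+ + \mathrm{DoF}^- - m)$, which is the number of valid partial matchings stated in the text.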

MITM Attack with Guess-and-Determine and Nonlinearly Constrained Neutral Words
If the neutral words are nonlinearly constrained, i.e., Equations (1) and (2) are nonlinear, it will be difficult to get the solution spaces of the neutral words by solving the nonlinear equations directly. At CRYPTO 2021, Dong et al. [DHS+21] introduced a table-based method to compute the solution spaces of the neutral words. However, Dong et al. did not consider the case when guess-and-determine is included in the MITM attack. In this section, we propose a unified MITM model combining nonlinearly constrained neutral words and guess-and-determine.
Since guess-and-determine is introduced in the MITM attacks, the guessed cells may be involved in the $l^+$ constraint functions of Equation (1). In order to compute their values, we need to know not only the values of the cells in the starting states, but also the values of the guessed cells in the computation path. So we define the $l^+$ functions accordingly, and similarly for the backward chunk; the neutral words for the forward and backward computations are then constrained by the resulting systems of equations.
Algorithm 2: Computing the solution spaces of the neutral words with guess-and-determine
Firstly, Algorithm 2 is given to combine the nonlinearly constrained neutral words and guess-and-determine. Algorithm 2 obtains the solution spaces of the neutral words for all $c^+$ and $c^-$ together with each guess of the guessed cells, and its memory complexity is $(2^w)^{\lambda^+ + \sigma^+} + (2^w)^{\lambda^- + \sigma^-}$. Then, we apply Algorithm 2 to the unified MITM preimage attack in Algorithm 3.
Algorithm 3: The MITM preimage attack with nonlinearly constrained neutral words and guess-and-determine
Complexity. From Line 7 to Line 20 of Algorithm 3, we test $2^{w\cdot(\mathrm{DoF}^+ + \mathrm{DoF}^- + \sigma^+ + \sigma^-)}$ messages and expect $2^{w\cdot(\mathrm{DoF}^+ + \mathrm{DoF}^- + \sigma^+ + \sigma^- - m)}$ of them to pass the $m$-cell filter. In Line 16, we need to verify the correctness of these partial matchings. The probability that the guessed cells are correct is $2^{-w\cdot(\sigma^+ + \sigma^-)}$, so there will be $2^{w\cdot(\mathrm{DoF}^+ + \mathrm{DoF}^- - m)}$ valid partial matchings that pass the correctness test. Suppose we are going to find a preimage of the $h$-cell target. Then there are about $2^{w\cdot(\mathrm{DoF}^+ + \mathrm{DoF}^- - h)}$ preimages passing the check at Line 19 for each episode, so we need to repeat the process at least $2^{w\cdot(h - (\mathrm{DoF}^+ + \mathrm{DoF}^-))}$ times to produce one preimage. The time complexity to perform one MITM episode is given by Equation (6).
Depending on the number of available degrees of freedom, the loop at Line 1 of Algorithm 3 does not necessarily need to try all values for all the Gray cells. We assume the size of $G$ in Line 1 of Algorithm 3 is $|G| = (2^w)^x$; then $x = h - (\lambda^+ + \lambda^-)$. Hence, we consider two situations depending on $\lambda^+ + \lambda^-$.
• $\lambda^+ + \lambda^- \ge h$: In this case, we set $x = 0$, so $|G| = 1$. At Line 3 and Line 4 of Algorithm 3, we only need to traverse $(2^w)^{h - (\mathrm{DoF}^+ + \mathrm{DoF}^-)}$ values of $(c^+, c^-)$, where $h - (\mathrm{DoF}^+ + \mathrm{DoF}^-) \le l^+ + l^-$ due to $\lambda^+ + \lambda^- \ge h$, to find the preimage. Then, together with Equation (6), the overall time complexity is given by Equation (7).
• $\lambda^+ + \lambda^- < h$: In this case, $x = h - (\lambda^+ + \lambda^-)$, and we need to build $2^x$ $V$ and $U$ in Line 2 of Algorithm 3. Hence, the overall complexity is given by Equation (8).
Moreover, the memory complexity is about the same for both situations.
Firstly, the $i$th cell of a state $S$ is encoded by a pair of 0-1 variables $(x^S_i, y^S_i)$ according to the following rule:
• Gray, $(x^S_i, y^S_i) = (1, 1)$: a predefined constant, known in both forward and backward chunks.
• Blue, $(x^S_i, y^S_i) = (1, 0)$: dependent on Gray cells and neutral words for the forward chunk, known for the forward chunk but unknown for the backward chunk.
• Red, $(x^S_i, y^S_i) = (0, 1)$: dependent on Gray cells and neutral words for the backward chunk, known for the backward chunk but unknown for the forward chunk.
• White, $(x^S_i, y^S_i) = (0, 0)$: dependent on neutral words for both forward and backward computations, unknown for both forward and backward chunks.
For the starting states, we introduce variables $\alpha_i$ and $\beta_i$ for each cell of $(S^{ENC}, S^{KSA})$, where $\alpha_i = 1$ if and only if the cell is Blue and $\beta_i = 1$ if and only if the cell is Red. Therefore, we can compute the initial degrees of freedom for forward and backward chunks by $\lambda^+ = \sum_i \alpha_i$ and $\lambda^- = \sum_i \beta_i$. For the ending states, we assume the matching only happens at MixRows in the actual attacks on Streebog. For each pair of rows of $E^+$ and $E^-$, we introduce a variable $m_i$ to indicate the degree of matching in row $i$, which is constrained by the numbers of cells of the relevant colors. The total degrees of matching can be computed by $\mathrm{DoM} = \sum_{i=0}^{7} m_i$. For more details, we refer to [BDG+21].
Then we build attribute propagation rules for each operation of the attacked hash function and record the consumption of the degrees of freedom. The process of adding constraints on neutral words consumes the degrees of freedom of neutral words. We assume the accumulated consumed degrees of freedom of the forward and backward chunks are $l^+$ and $l^-$, respectively. We can compute the remaining degrees of freedom for the forward and backward chunks by $\mathrm{DoF}^+ = \lambda^+ - l^+$ and $\mathrm{DoF}^- = \lambda^- - l^-$. The rules XOR-RULE and MC-RULE introduced in [BDG+21] are used to build the rules of AddRoundKey and MixColumns of AES-like hashing. For more details of these rules, see Section B. In the MILP model of attacking Streebog, we can use XOR-RULE to build the rules of AddRoundKey and MC-RULE to build the rules of MixRows. In addition, we can easily build the rules of Transposition because it just permutes the color scheme of the input state. As for SubBytes, we can ignore it because it does not change the color of the input state.
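The color encoding and the counting of initial degrees of freedom can be sketched as follows. This is a plain-Python illustration, not the MILP model itself; the class and function names are hypothetical, and it assumes that $\alpha_i$ marks Blue cells and $\beta_i$ marks Red cells in the starting states. The cell counts in the example are the ones quoted later for the 8.5-round attack ($\lambda^+ = 36$, $\lambda^- = 16 + 4 = 20$); the order of cells in the lists is arbitrary.

```python
from enum import Enum


class Color(Enum):
    """Cell colors encoded as the 0-1 pair (x, y) described in the text."""
    GRAY = (1, 1)   # known in both chunks
    BLUE = (1, 0)   # known in the forward chunk only
    RED = (0, 1)    # known in the backward chunk only
    WHITE = (0, 0)  # unknown in both chunks


def initial_dof(start_cells):
    """Count Blue and Red cells of the starting states (assumed alpha/beta roles)."""
    lam_fwd = sum(1 for c in start_cells if c is Color.BLUE)
    lam_bwd = sum(1 for c in start_cells if c is Color.RED)
    return lam_fwd, lam_bwd


def transposition(state):
    """Transposition only permutes the color pattern of an 8 x 8 state."""
    return [list(row) for row in zip(*state)]


# Starting states of the 8.5-round attack, by cell count only (layout is arbitrary).
w3 = [Color.BLUE] * 36 + [Color.RED] * 4 + [Color.GRAY] * 24
k5 = [Color.RED] * 16 + [Color.GRAY] * 48
lam_fwd, lam_bwd = initial_dof(w3 + k5)
print(lam_fwd, lam_bwd)
```

SubBytes needs no function here because, as stated above, it leaves every cell's color unchanged.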
In addition, we need to build some constraints to get the values of $\sigma^+$ and $\sigma^-$, which are the numbers of guessed cells in the forward and backward chunks. In general, guess-and-determine is used before the diffusion operations because one unknown cell in the input of a diffusion operation may make many cells in the output unknown. Taking MixColumns as an example, we assume the input state and output state of MixColumns are $S_{in}$ and $S_{out}$. We introduce another state $\tilde{S}_{in}$ and let MixColumns link $\tilde{S}_{in}$ and $S_{out}$. Then we introduce an operation named Guess to link $S_{in}$ and $\tilde{S}_{in}$, as shown in Figure 2.
In the forward chunk, we build the rule named GUESS$^+$-RULE for the Guess operation. Concretely, the GUESS$^+$-RULE keeps the cell unchanged for three of the four input colors, while for the remaining color it either keeps the cell unchanged or changes it into a guessed cell. We introduce a variable $\gamma^+_i$ for each cell of the state, where $\gamma^+_i = 1$ if and only if the cell is changed by the guess. The GUESS$^+$-RULE is shown in Figure 3(a). Then we need to convert the GUESS$^+$-RULE to linear inequalities to get the constraints: the rule restricts $(x^{S_{in}}_i, y^{S_{in}}_i, x^{\tilde{S}_{in}}_i, y^{\tilde{S}_{in}}_i, \gamma^+_i)$ to a subset of $\mathbb{F}_2^5$, which can be described by a system of linear inequalities by using the convex hull computation method [SHW+14]. Similarly, we can build the rule named GUESS$^-$-RULE in the backward chunk. As shown in Figure 3(b), GUESS$^-$-RULE keeps the cell unchanged for three of the four colors of $S_{in}$, while for the remaining color it either keeps the cell unchanged or changes it into a guessed cell. We also introduce a variable $\gamma^-_i$ for each cell of the state, where $\gamma^-_i = 1$ if and only if the cell is changed by the guess. We use the same method to convert it to linear inequalities.
Figure 3: The rule of Guess in forward and backward chunks
To make the guessed cells easy to distinguish, we use one dedicated color to represent the guessed cells of $\tilde{S}_{in}$ in both the forward and backward chunks. Therefore, $\tilde{S}_{in}[i]$ takes this color if $\gamma^+_i = 1$ in the forward chunk or $\gamma^-_i = 1$ in the backward chunk. In addition, we can compute the numbers of guessed cells in the forward and backward chunks by $\sigma^+ = \sum_i \gamma^+_i$ and $\sigma^- = \sum_i \gamma^-_i$. Finally, since the time complexity is given by Equations (7) and (8), we introduce an auxiliary variable $v_{obj}$ and impose the corresponding constraints on it. Our objective is to maximize the value of $v_{obj}$. Besides, additional constraints should be added to the model according to the value of $\lambda^+ + \lambda^-$.
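The set of points in $\mathbb{F}_2^5$ allowed by a Guess rule can be enumerated directly, which is the input that a convex hull computation would then turn into linear inequalities. The sketch below rests on an assumption that the extracted text does not confirm: the guessable input color is taken to be White, turning into Blue in the forward chunk and into Red in the backward chunk. All names are hypothetical.

```python
GRAY, BLUE, RED, WHITE = (1, 1), (1, 0), (0, 1), (0, 0)


def guess_rule_points(guess_from, guess_to):
    """Allowed tuples (x_in, y_in, x_out, y_out, gamma) of a Guess rule.

    ASSUMPTION: exactly one input color (guess_from) may be changed into
    guess_to, with gamma = 1; every color may also pass through unchanged
    with gamma = 0.
    """
    points = []
    for color in (GRAY, BLUE, RED, WHITE):
        points.append(color + color + (0,))         # cell kept as is
        if color == guess_from:
            points.append(color + guess_to + (1,))  # cell is guessed
    return points


# Assumed rules: White -> Blue (forward), White -> Red (backward).
forward_points = guess_rule_points(WHITE, BLUE)
backward_points = guess_rule_points(WHITE, RED)


def sigma(chosen_points):
    """Number of guessed cells: the sum of the gamma coordinates."""
    return sum(p[4] for p in chosen_points)


print(forward_points)
print(len(forward_points), "of", 2 ** 5, "points are allowed")
```

Under this assumption each rule admits 5 of the 32 points, so the inequalities produced by the convex hull step only need to cut off the other 27.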
Let $ini_r$, $ini_k$ and $match_r$ denote the round numbers of $S^{ENC}$, $S^{KSA}$ and $E^+$, respectively. For searching $N$-round attacks, we enumerate all possible combinations of $ini_r$, $ini_k$ and $match_r$, where $0 \le ini_r < N$, $0 \le ini_k < N$ and $0 \le match_r < N$, and generate an MILP model for each $(ini_r, ini_k, match_r)$. Then we use the MILP solver Gurobi to search for the optimal attack for each MILP model. Once a solution is found, we can draw it in a figure according to the values of the pair variables of each cell.
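The enumeration of model configurations described above is a plain Cartesian product. The following sketch only lists the triples; building and solving the MILP model for each triple is outside its scope, and the function name is hypothetical.

```python
from itertools import product


def model_configurations(n_rounds):
    """All (ini_r, ini_k, match_r) triples with each index in [0, n_rounds)."""
    return list(product(range(n_rounds), repeat=3))


# Illustrative value of N; one MILP model would be generated per triple.
configs = model_configurations(8)
print(len(configs), configs[0], configs[-1])
```

The number of models grows as $N^3$, so each additional round increases the number of solver runs noticeably.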
Remark. Our model is different from the one in [BGST21]. Firstly, we exploit the similarity of the encryption and key-schedule data paths. We consider two situations where AddRoundKey is placed before or after MixColumns, which can be implemented by the indicator constraints in Gurobi as mentioned in [BGST21]. However, we did not use the "relaxed model" proposed by Bao et al. [BGST21]. The solution space of the "relaxed model" is larger than ours, so the optimal solution of their model should be better than ours. However, the search space of the "relaxed model" is too large and the corresponding MILP model cannot be solved in practical time. Therefore, they employ round-dependent modeling, symmetry and similarity techniques to reduce the search space. It seems that the solution space of the reduced model covers some MITM trails that our model does not, and at the same time misses some trails covered by our model.
Besides, they built the rule for the combination of MC and Guess: they introduced another variable for each cell to indicate whether the cell is guessed. Hence, they have to rewrite all the rules for each cell by considering the additional variable. In our model, we make MC and Guess totally separate by introducing a new operation Guess and an auxiliary state, which does not affect other rules. Then we just need to build the rule for Guess, which is simple and intuitive. The total size of our model is smaller. In addition, in comparison to the MILP model built in [BDG+21] without guess-and-determine, our method does not significantly increase the size of the MILP model or the time needed to solve it. We used Gurobi 9.0.3 to solve all the MILP models. It took about one week on a PC with Fedora Linux 30 and 128 GB memory to find the attacks on 8.5/7.5-round Streebog-512 and 7.5-round Streebog-256. As for the 6.5-round Streebog-256, it took only several hours to find the attack because the key is fixed in this model. The source code is provided at https://github.com/dongxiaoyang/streebog-mitm.

Application to Streebog
In this section, we give a brief description of Streebog, and then show our preimage attacks on round-reduced compression functions of Streebog-512 and Streebog-256.

Specifications of Streebog
Streebog is a family of two hash functions, Streebog-256 and Streebog-512. They both accept message blocks of 512 bits and output 256-bit and 512-bit hash digests, respectively. As shown in Figure 4, the input message $M$ is first padded to a multiple of 512 bits: the bit "1" is appended to the end of the message, followed by $512 - 1 - (|M| \bmod 512)$ zero bits, where $|M|$ denotes the length of the message. Then the padded message is divided into $t + 1$ 512-bit blocks $m_1 \| m_2 \| \cdots \| m_{t+1}$. The three variables $\Sigma$, $N$, $h_0$ are assigned to 0, 0 and IV, respectively. Secondly, each block $m_i$ ($1 \le i \le t + 1$) is processed iteratively. Finally, the output chaining value of the last message block $h_{t+1}$ goes through the output transformation. For Streebog-512, $H(M)$ is the hash digest. For Streebog-256, the hash digest is $\mathrm{MSB}_{256}(H(M))$, where $\mathrm{MSB}_{256}$ means the 256 most significant bits. For more details, we refer to the original paper [GOS12].
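The padding rule as described in this section can be sketched on bit strings. This is only an illustration of the stated formula; it works on Python strings of "0"/"1" characters and does not model the bit and byte ordering conventions of the standard. The function names are hypothetical.

```python
BLOCK_BITS = 512


def pad_bits(message_bits: str) -> str:
    """Append '1' and then 512 - 1 - (|M| mod 512) zero bits, as described above."""
    zeros = BLOCK_BITS - 1 - (len(message_bits) % BLOCK_BITS)
    return message_bits + "1" + "0" * zeros


def split_blocks(padded_bits: str) -> list:
    """Split a padded bit string into 512-bit blocks m_1, ..., m_{t+1}."""
    return [padded_bits[i:i + BLOCK_BITS]
            for i in range(0, len(padded_bits), BLOCK_BITS)]


padded = pad_bits("1011")
blocks = split_blocks(padded)
print(len(padded), len(blocks))
```

A message whose length is already a multiple of 512 bits still receives one full extra block, since the formula then appends "1" followed by 511 zero bits.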
In this section, we show the attack on the 8.5-round Streebog-512 compression function; the attack on 7.5 rounds is given in Appendix A. The preimage attack on the 8.5-round Streebog-512 compression function is shown in Figure 5, where the $K_i$ states represent the states in the key schedule path, $X_i$, $Y_i$, $Z_i$ and $W_i$ represent the states in the encryption path, and the "X" operation on the key schedule path means XORing a round-dependent constant.
The starting states are $W_3$ and $K_5$, and the ending states are $Z_6$ and $W_6$. In $W_3$, there are 36 Blue cells, 4 Red cells and 24 Gray cells. In $K_5$, there are 16 Red cells and 48 Gray cells. Therefore, the initial degrees of freedom for the forward and backward chunks are $\lambda^+ = 36$ and $\lambda^- = 16 + 4 = 20$, respectively. The matching happens between $Z_6$ and $W_6$, which forms a 16-cell filter. In addition, there are 12 guessed cells, which are marked in $Y_1$.

   
Firstly, we consider the reduction of degrees of freedom for the Blue cells. From $Y_3$ to $Z_2$, the constraints in Equation (12) are applied, where $(a_1, a_2, \cdots, a_{16})$ are constants marked in $Z_2$. These constraints ensure that the Blue cells of $Y_3$ have no impact on the first two columns of $Z_2$, so the first two columns of $Z_2$ only depend on the remaining cells in $K_3$ and $Y_3$. The constraints introduce a 16-cell reduction of degrees of freedom for Blue cells, so the remaining degrees of freedom for Blue cells is $\mathrm{DoF}^+ = \lambda^+ - l^+ = 36 - 16 = 20$. Then we call Algorithm 2 to compute and build the table $V$ which stores the solution spaces of Blue cells, i.e., for fixed Gray cells in $W_3$, traverse the Blue cells in $W_3$ to compute $a_i$ ($1 \le i \le 16$). Note that Algorithm 2 covers the generic case in which the guessed cells are also involved. However, for the attack in Figure 5, the guessed cells are not involved in the procedure of building table $V$, so we do not need to traverse the guessed cells.
Then we consider the reduction of degrees of freedom for the Red cells. From $Z_5$ to $W_5$, the first two columns of $W_5$ are constant. Hence, we have Equation (13) with constants $(b_1, b_2, \cdots, b_{16})$. By Algorithm 2, given fixed constant cells in the starting states $W_3$ and $K_5$, we traverse $\lambda^- = 4 + 16 = 20$ Red cells in $W_3$ and $K_5$ to compute the solution space of Red cells. In detail, we compute $K_4$ and $K_6$ from $K_5$. Then we compute the cells in $Y_4$ from $W_3$ and $K_4$, compute $W_4$, and then $Y_5$ and $Z_5$. Finally, we compute $b_i$ ($1 \le i \le 16$) with Equation (13). We then have $l^- = 16$. The remaining degrees of freedom for Red cells is $\mathrm{DoF}^- = \lambda^- - l^- = 20 - 16 = 4$. Therefore, we can call Algorithm 2 to build a table $U$ which stores the solution spaces of Red cells. Similarly, we do not need to traverse the guessed cells because they are not involved in the procedure of building $U$.
Algorithm 4: The MITM preimage attack on 8.5-round Streebog-512 compression function
1  Fix all cells of $K_5$ to 0 and arbitrary 16 cells of $W_3$ to 0.
2  for all 8 cells that are not fixed in $W_3$ do
3    Call Algorithm 2 to build $V$ and $U$.
     Compute forward to get the full state of $Z_6$ and store it in a table $L$.
The whole preimage attack on the Streebog-512 compression function is shown in Algorithm 4. We are going to find a 512-bit preimage. The states of the encryption data path and the key schedule path are both 512 bits, so we have enough degrees of freedom to find the preimage. Therefore, we can fix some cells of $K_5$ and $W_3$ to zero in the whole attack. Note that the guessed cells are only in $Y_1$ in the backward computation, so guessed cells for the forward computation do not appear in this attack.
We assume the computation in step 11 is 1 encryption, so there will be $2^{480}$ encryptions. In step 13, the computation is one encryption, but it is repeated $2^{384}$ times. Therefore, the overall time complexity of steps 10-13 is about $2^{481}$ encryptions.

Preimage Attack on Reduced Streebog-256's Compression Function
We find a preimage attack on the 7.5-round Streebog-256 compression function, which is shown in Figure 6. The starting states are $W_2$ and $K_4$, with $\lambda^+ = 30$ and $\lambda^- = 24 + 6 = 30$. From $Y_2$ to $Z_1$, 16 cells of degrees of freedom are consumed for Blue cells, so $\mathrm{DoF}^+ = 30 - 16 = 14$. From $Z_4$ to $W_4$, 24 cells of degrees of freedom are consumed for Red cells, so $\mathrm{DoF}^- = 30 - 24 = 6$. The matching point is between $Z_5$ and $W_5$ and we get a filter of $\mathrm{DoM} = 16$ cells. In addition, we guess 8 cells, which are marked in $Y_7$. Because the target is Streebog-256, the time complexity of exhaustive search to find a preimage is just $2^{256}$. If we use Algorithm 2 to build the tables $V$ and $U$, the total sizes of $V$ and $U$ are $2^{240}$ and $2^{304}$, which leads to a total time complexity higher than exhaustive search. However, Algorithm 2 only covers the generic case and can be adapted to the specific situation of each attack. In the attack on Streebog-256, we give a procedure (Algorithm 5) to build the table $V$. $(a_1, \cdots, a_{16})$ are constants, which are marked in $Z_1$ as shown in Figure 6.
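The table sizes quoted above follow from the memory expression of Algorithm 2, $(2^w)^{\lambda^+ + \sigma^+} + (2^w)^{\lambda^- + \sigma^-}$, with $w = 8$. The check below is a sketch under one reading of the text: the 8 guessed cells are counted on the backward side ($\sigma^+ = 0$, $\sigma^- = 8$), which is the assignment that reproduces $2^{240}$ and $2^{304}$. The function name is hypothetical.

```python
def table_size_exponents(w, lam_fwd, lam_bwd, sigma_fwd, sigma_bwd):
    """Base-2 exponents of the sizes of V and U built by the generic Algorithm 2."""
    v_exp = w * (lam_fwd + sigma_fwd)
    u_exp = w * (lam_bwd + sigma_bwd)
    return v_exp, u_exp


# Parameters of the 7.5-round Streebog-256 attack; the sigma split is assumed.
v_exp, u_exp = table_size_exponents(w=8, lam_fwd=30, lam_bwd=30,
                                    sigma_fwd=0, sigma_bwd=8)
EXHAUSTIVE_SEARCH_EXP = 256  # exhaustive preimage search on a 256-bit digest

print(v_exp, u_exp, u_exp > EXHAUSTIVE_SEARCH_EXP)
```

Since building $U$ alone would already cost more than $2^{256}$, the generic algorithm cannot be used as is, which motivates the tailored procedures that follow.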

5  Compute backward to get the values of cells in $Y_3$ and $K_3$.
Next we give Algorithm 6 to build the table $U$, which stores the solutions of Red cells. Note that in Algorithm 2, the guessed cells are considered when we build the table $U$. However, there are no guessed cells involved in the computation of $U$ for the attack in Figure 6, so we do not need to traverse the guessed cells in the process of building $U$.
Given the fixed constant cells in $W_2$, together with the constants $(b_1, b_2, \cdots, b_{24})$ marked in $W_4$, we traverse the 24 Red cells in $Z_4$ to compute the solution space of Red cells. In detail, we compute the Red cells of $K_5$ by Equation (15). Then we compute the Red cells of $K_4$ and $K_3$ from $K_5$, compute $Y_4$, and then $W_3$ and $Y_3$. Finally, we need to check whether the required equations for $i = 2, 3, 4, 5, 6, 7$ hold or not. The probability that the equations hold is about $2^{-144}$, so there are about $2^{48}$ elements in $U[c^-]$ for a given $c^-$. The memory to store $U$ is $2^{192}$. Finally, we give the MITM preimage attack on the 7.5-round Streebog-256 compression function in Algorithm 7. The time complexity is about $2^{209}$, and the memory complexity is bounded by the $2^{192}$ needed to store $U$.
Remark. In the attack on Streebog-256, we do not use Algorithm 2 to compute the solution spaces of neutral words because the time complexity would be greater than exhaustive search if we used Algorithm 2 directly. In the process of searching for attacks, we firstly
For Streebog-256, we give an improved preimage attack on the 6.5-round Streebog-256 hash function. We use a better preimage attack on the 6.5-round Streebog-256 compression function and then apply the method of Ma et al. [MLHL15b] to find a preimage attack on the 6.5-round hash function with lower time complexity. As shown in Figure 8, the attack consists of three phases.

A Preimage Attack on 7.5-round Streebog-512's Compression Function
We find a preimage attack on the 7.5-round Streebog-512 compression function, as shown in Figure 11. The starting states are $Y_3$ and $K_4$. The matching point is between $Z_5$ and $W_5$, and there are 24 matching cells, so $\mathrm{DoM} = 24$. In $Y_3$, there are 64 Blue cells. In $K_4$, there are 16 Red cells and 48 Gray cells, so $\lambda^+ = 64$ and $\lambda^- = 16$. In the backward chunk, there are 15 guessed cells, which are marked in $Y_1$.
From $Y_3$ to $Z_2$, the constraints in Equation (16) are applied, where $(c_1, c_2, \cdots, c_{40})$ are constants marked in $Z_2$. This consumes 40 cells of degrees of freedom for Blue cells, so $\mathrm{DoF}^+ = 64 - 40 = 24$. The degrees of freedom of the Red cells are not consumed in this attack. The constraints on the Blue cells are linear, so we can use Algorithm 1 to mount the MITM attack. The procedure is shown in Algorithm 10.

Automatic MITM Preimage Attacks
At EUROCRYPT 2021, Bao et al. [BDG+21] proposed an automatic method to search for MITM preimage attacks by using Mixed-Integer Linear Programming (MILP). At CRYPTO 2021, Dong et al. [DHS+21] extended the automatic model to MITM key-recovery and collision attacks. In [BGST21], Bao et al. enhanced the MILP model of the MITM preimage attack by introducing guess-and-determine [SWWW12], the relaxed model and the independent linear layer into the automatic tool. We build on their model to further introduce constraints for both the guess-and-determine technique and nonlinearly constrained neutral words. Although the model of Bao et al. [BGST21] already contained constraints for the guess-and-determine technique, we include guess-and-determine in our model in a simpler and more direct way.

Figure 2: Introduce Guess operation before MC

9   for all values in $U[c^-]$ do
10    Compute backward to get the first two columns of $W_6$ and search $L$ to find matching.
11    Use the matching pairs to compute and check if the guessed values $Y^{ENC}_-$ are correct.
12    if the guessed values $Y^{ENC}_-$ are correct then
13      Test the full preimage.
14      if the full preimage is found then
15        Output and stop.

At SAC 2014, Guo et al. [GJL+14] exploited the misuse of the counter in the HAIFA mode of Streebog and presented generic second preimage attacks on the full Streebog-512 hash function. At IWSEC 2015, Ma et al. [MLHL15b] proposed a 6.5-round preimage attack on Streebog-256 and a 7.5-round preimage attack on Streebog-512. At EUROCRYPT 2016, Biryukov et al. [BPU16] reverse-engineered the S-Box of Streebog and recovered two completely different decompositions of the S-Box. At FSE 2019, Perrin [Per19] identified a third decomposition of the S-Box and exposed a very strong algebraic structure.
At ASIACRYPT 2012, Sasaki et al. [SWWW12] introduced the guess-and-determine technique to improve the MITM preimage attack on Whirlpool. Since then, this technique has been applied to many hash functions, such as Grøstl [WFW+12], Streebog [AY14, MLHL15b, MLHL14], Whirlwind [MLHL15a], etc. At EUROCRYPT 2021, Bao et al. [BDG+21] built an MILP-based automatic tool for MITM preimage attacks. Later, Bao et al. [BGST21] proposed an improved automatic model for MITM preimage attacks, which takes the guess-and-determine technique into consideration.
At CRYPTO 2021, Dong et al. [DHS+21] discovered that the neutral words can be nonlinearly constrained, while the previous MITM attacks [Sas11, SWWW12] usually adopt linearly constrained neutral words whose solution spaces are calculated by solving linear equations. When the neutral words are nonlinearly constrained, one may have to calculate the solution spaces for the neutral words by solving a higher-order equation system, which is usually hard. To deal with the problem, Dong et al. [DHS+21] proposed a table-based technique to precompute the solution spaces before the MITM process instead of solving a nonlinear equation system directly. In this way, they succeeded in extending the initial structure and thus the total number of rounds covered by the MITM approach. However, in the MITM attack framework of Dong et al. [DHS+21], the guess-and-determine technique is missing.
Table 1: Summary of preimage attack results on Streebog
Therefore, it is meaningful to study the situation where guess-and-determine is used and the neutral words are nonlinearly constrained in MITM attacks. In this paper, we propose a new MITM preimage attack model by combining the guess-and-determine technique of Sasaki et al. [SWWW12] and the nonlinearly constrained neutral words of Dong et al. [DHS+21]. In addition, based on previous automatic tools [Sas18, BDG+21, DHS+21, BGST21] for MITM attacks, we introduce a new automatic model to search for optimal parameters for the updated MITM attack. As a proof of work, we apply the new techniques to the Streebog-256 and Streebog-512 hash functions. We find an 8.5-round preimage attack on the Streebog-512 compression function and a 7.5-round preimage attack on the Streebog-256 compression function. Then, we give a preimage attack on the 8.5-round Streebog-512 hash function with a method proposed by AlTawy et al. [AY14] to convert the preimage attack on the compression function into one on the hash function. In addition, we also improve the 7.5-round preimage attack on Streebog-512 and the 6.5-round preimage attack on Streebog-256. The summary of preimage attacks on Streebog is shown in Table 1.
Our Contributions. As shown in [DHS+21], nonlinearly constrained neutral words extend the initial structure considerably, and thereby extend the whole MITM attack. In fact, nonlinearly constrained neutral words describe a new way to build the initial structure. When this technique is put into the MILP model, it covers more possible MITM trails that may lead to better attacks.
Figure 1: A high-level overview of the MITM attacks [DHS+21]
A coloring system is introduced to visualize these subsets and the attack. The cells $(S^{ENC}[B^{ENC}], S^{KSA}[B^{KSA}])$, which are visualized by Blue cells, are the neutral words for the forward computation. The cells $(S^{ENC}[R^{ENC}], S^{KSA}[R^{KSA}])$, which are visualized by Red cells, are the neutral words for the backward computation. $\lambda^+$ and $\lambda^-$ are the numbers of Blue and Red cells in the starting states. The cells $S^{ENC}[G^{ENC}]$ and $S^{KSA}[G^{KSA}]$ are visualized as Gray cells. The matching is between $E^+$ and $E^-$.
Given $(S^{ENC}[G^{ENC}], S^{KSA}[G^{KSA}])$ and $c^+$, we first compute four rows of cells of $Z_2$ and four columns of cells of $Y_2$. Then we use Equation (14) to compute the last two unknown columns of cells of $Y_2$ and then compute the last two rows of cells of $Z_2$. Finally, we need to check whether the condition on row 6 of $W_2$ holds, so there are about $2^{112}$ elements in $V[c^+]$ on average.
Preimage Attack on Round-Reduced Streebog-256
Figure 10: 6.5-round preimage attack on Streebog-256 compression function
because $\lambda^+ = 30$ and it will be the lower limit of the whole attack. Therefore, we use a method similar to Algorithm 5 to compute the solution space of Blue cells, and the procedure is shown in Algorithm 8. The computation of Algorithm 8 between line 3
11  if the guessed values $Y^{ENC}_-$ are correct then
12    Test the full preimage.
13    if the full preimage is found then
14      Output and stop.

Conclusion
In [DHS+21], Dong et al. introduced the table-based technique to solve the problem of nonlinearly constrained neutral words in MITM preimage attacks. Based on their work, we further consider the more complex situation in which the guess-and-determine approach of Sasaki et al. [SWWW12] is used in the MITM preimage attacks. Moreover, based on previous automatic tools for MITM preimage attacks, we propose a new one that takes the two techniques into consideration. Finally, we extend the preimage attacks against Streebog-512 by one more round and also reduce the time complexity of the 7.5-round preimage attack on Streebog-512 and the 6.5-round preimage attack on Streebog-256.