Partitions in the S-Box of Streebog and Kuznyechik

Abstract. Streebog and Kuznyechik are the latest symmetric cryptographic primitives standardized by the Russian GOST. They share the same S-Box, $\pi$, whose design process was not described by its authors. In previous works, Biryukov, Perrin and Udovenko recovered two completely different decompositions of this S-Box. We revisit their results and identify a third decomposition of $\pi$. It is an instance of a fairly small family of permutations operating on $2m$ bits which we call TKlog and which is closely related to finite field logarithms. Its simplicity and the small number of components it uses lead us to claim that it has to be the structure intentionally used by the designers of Streebog and Kuznyechik. The $2m$-bit permutations of this type have a very strong algebraic structure: they map multiplicative cosets of the subfield $\mathrm{GF}(2^m)^*$ to additive cosets of $\mathrm{GF}(2^m)^*$. Furthermore, the function relating each multiplicative coset to the corresponding additive coset is always essentially the same. To the best of our knowledge, we are the first to expose this very strong algebraic structure. We also investigate other properties of the TKlog and show in particular that it can always be decomposed in a fashion similar to the first decomposition of Biryukov et al., thus explaining the relation between the two previous decompositions. It also means that it is always possible to implement a TKlog efficiently in hardware and that it always exhibits a visual pattern in its LAT similar to the one present in $\pi$. While we could not find attacks based on these new results, we discuss the impact of our work on the security of Streebog and Kuznyechik. To this end, we provide a new simpler representation of the linear layer of Streebog as a matrix multiplication in the exact same field as the one used to define $\pi$. We deduce that this matrix interacts in a non-trivial way with the partitions preserved by $\pi$.


Introduction
Many symmetric primitives rely on S-Boxes as their unique source of non-linearity, including the AES [AES01]. Such objects are small functions mapping $\mathbb{F}_2^n$ to $\mathbb{F}_2^m$ which are often specified via their look-up tables.
Their choice is crucial as both the security and the efficiency of the primitive depend heavily on their properties. For example, a low differential uniformity [Nyb94] implies a higher resilience against differential attacks [BS91a,BS91b]. On the other hand, the existence of a simple decomposition greatly helps with an efficient bitsliced or hardware implementation [LW14,CDL16]. Thus, algorithm designers are expected to provide a detailed explanation of their choice of S-Box. Each cipher that was published at a cryptography or security conference has provided such explanations.
There are two prominent S-Boxes for which this information has not been provided. The first is the so-called "F-table" of Skipjack [U.S98], a lightweight block cipher designed by the American National Security Agency (NSA). The second is $\pi$, the 8-bit permutation used by the Russian standard hash function (nicknamed Streebog [Fed12]) and block cipher (nicknamed Kuznyechik [Fed15]), as well as the first version of the CAESAR candidate STRIBOBr1 [Saa14] (which later changed its components for those of Whirlpool [SB15]).
While Streebog and Kuznyechik were first published as national standards in Russia (GOST), they have since been included in other standards. For instance, both have been included by the IETF as RFC 6986 [DD13] and RFC 7801 [Dol16] respectively. ISO/IEC is also in the process of adding Kuznyechik to their list of standard block ciphers, namely standard 18033-3.
S-Boxes have been found to be potential vehicles for the insertion of a backdoor in a symmetric algorithm. In 1997, Rijmen and Preneel suggested an S-Box generation strategy which ensured that a high probability linear transition existed [RP97]. The idea was that only the designer would be able to know about this linear approximation but this claim was later proven wrong [WBDY98]. Later, Paterson designed a backdoored variant of the DES with modified S-Boxes [Pat99]. His overall approach was recently refined by Bannier et al. [BBF16] to build a block cipher which preserves a partition of the plaintext space independently from the key.
In this context, cryptanalysts have tried to reverse-engineer the structure of poorly specified S-Boxes. The first such attempt occurred in the late 1970s, shortly after the publication of the DES: Hellman et al. identified some patterns in the S-Boxes of this block cipher [HMS+76]. Much more recently, Biryukov et al. devised new tools for this purpose. For example, a statistical analysis of the differential and linear properties allowed them to show that the S-Box of Skipjack displayed a higher resilience against linear attacks [Mat94] than expected [BP15].
More importantly in our case, they provided the first decomposition of the Russian S-Box, $\pi$, in [BPU16a,BPU16b]. The corresponding structure operates on two branches, much like a Feistel or Misty structure. It is however much more complex than both of them as it involves finite field multiplications in $\mathrm{GF}(2^4)$ and a multiplexer. Later, Perrin and Udovenko found discrete logarithm-based decompositions of this component [PU16]. They are very different from the previous decomposition but remain somewhat unsatisfactory due to the complex "arithmetic layer" they use. Their authors concluded that "We could not find sensible explanations for using a structure from any of our decompositions as an S-Box." In fact, the existence of these new structures raised more questions than it answered, although they "strengthen the idea that $\pi$ has a strong algebraic structure hardly compatible with the claims of randomness of the designers" [PU16]. In the end, Perrin and Udovenko conjectured the existence of a "master decomposition" of which the decompositions of both [BPU16a] and [PU16] would be mere side effects.
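Our Contribution. We show that the intuition in [PU16] was correct and present what we claim to be said "master decomposition". It holds that $\pi$, the S-Box used by the last two Russian standards, operates as follows:
$$
\begin{aligned}
\pi(0) &= \kappa(0), \\
\pi\big(\alpha^{17j}\big) &= \kappa(16 - j), && 1 \le j \le 15, \\
\pi\big(\alpha^{i + 17j}\big) &= \kappa(16 - i) \oplus \big(\alpha^{17}\big)^{s(j)}, && 0 < i \le 16,\ 0 \le j < 15,
\end{aligned}
$$
where $\alpha$ is a root of the primitive polynomial defining the finite field $\mathrm{GF}(2^8)$, where $s$ is a permutation of $\mathbb{Z}/15\mathbb{Z}$, and where $\kappa : \mathbb{F}_2^4 \to \mathrm{GF}(2^8)$ is an affine function such that any element $x \in \mathrm{GF}(2^8)$ can be written as $x = x_4 \oplus \kappa(x_\kappa) \oplus \kappa(0)$ with $x_4 \in \mathrm{GF}(2^4)$ and $x_\kappa \in \mathbb{F}_2^4$. We also use "+" and "−" to denote integer addition and subtraction, and "⊕" denotes the addition in the finite field.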
We generalize this new structure by allowing $\kappa$ and $s$ to be picked in appropriate sets and call the resulting class of permutations TKlog. This new structure allows us to easily explain a particular property of $\pi$: it maps the partition of $\mathrm{GF}(2^8)$ into multiplicative cosets of $\mathrm{GF}(2^4)^*$ to the partition of $\mathrm{GF}(2^8)$ into additive cosets of $\mathrm{GF}(2^4)^*$. Furthermore, the restriction of $\pi$ to each independent multiplicative coset is always the same simple function. Thus, not only does it map a simple partition of $\mathrm{GF}(2^8)$ to another one, it does so in a very straightforward way.
We also prove that such permutations can always be written in a fashion similar to the decomposition of [BPU16a] so our new decomposition provides the missing link between the first decomposition of [BPU16a] and the logarithm-based decomposition of [PU16]. Using counting arguments, we show that the size of the set of the TKlogs operating on 8 bits and the number of affine 8-bit permutations are of comparable magnitudes, meaning that the probability that a random permutation is a TKlog instance is negligible. We therefore claim that the presence of this structure in $\pi$ is a deliberate choice by its designers. Using experimental arguments, we propose a simple generation algorithm of which $\pi$ would be a typical output.
Finally, we remark that the linear layer of Streebog is an MDS matrix with coefficients in $\mathrm{GF}(2^8)$ where the primitive polynomial used to define the representation of its elements is actually the same as the one used to define $\pi$ as a TKlog. Thus, the cosets interacting with $\pi$ also interact with the linear layer of Streebog. We provide a first discussion of the consequences of the new structure in $\pi$ in terms of security but leave their exploitation in a cryptanalysis as an open problem.
Outline. Section 2 recalls the necessary background, both in terms of mathematics and in terms of previous results. The TKlog, its relationship with $\pi$ and its partition-preserving property are presented in Section 3. We show that the TKlog is the "missing link" between [BPU16a] and [PU16] and list several consequences of this fact in Section 4. Then, we investigate the consequences of the fact that $\pi$ is a TKlog for the higher level primitives themselves in Section 5. We also present a new representation of the linear layer of Streebog which can be of independent interest. Finally, Section 6 concludes this paper.

Notations and Basic Definitions
Finite Fields. There exists, up to isomorphisms, a unique finite field with $2^n$ elements which we denote $\mathrm{GF}(2^n)$. We use "⊕" to denote the addition in the field and $x \odot y$ or $xy$ to denote the product of $x, y \in \mathrm{GF}(2^n)$. For all $x \in \mathrm{GF}(2^n)$, it holds that $x^{2^n} \oplus x = 0$.
Binary Strings as Ring Elements. The field $\mathrm{GF}(2^n)$ can be identified with $\mathbb{F}_2[X]/(p)$ for some irreducible polynomial $p$ of degree $n$. If $\alpha$ is a root of $p$ then we can represent all the elements of $\mathrm{GF}(2^n)$ as $\sum_{i=0}^{n-1} x_i \alpha^i$, where $x_i \in \mathbb{F}_2$. A binary string $(x_0, ..., x_{n-1}) \in \mathbb{F}_2^n$ is therefore naturally interpreted as $\sum_{i=0}^{n-1} x_i \alpha^i \in \mathrm{GF}(2^n)$ in much the same way that it can also be interpreted as $\sum_{i=0}^{n-1} x_i 2^i \in \mathbb{Z}/2^n\mathbb{Z}$. In this case, the binary representation of $x \oplus y$ for $x, y \in \mathrm{GF}(2^n)$ is indeed the XOR of the binary representations of $x$ and $y$.
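Logarithms. For all $x \in \mathrm{GF}(2^n)^*$, the logarithm $\log_\alpha(x)$ is the integer of $\mathbb{Z}/(2^n - 1)\mathbb{Z}$ such that $\alpha^{\log_\alpha(x)} = x$. Such a function is not a permutation because it is not defined in 0. This problem can be solved in different ways. In [HN10], Hakala and Nyberg study the function which we denote $\log^{\mathrm{HN}}_\alpha$ while Feng et al. in [FLY09] introduced another variant, a special case of which is a permutation of $\mathrm{GF}(2^n)$ and which we call $\log^{\mathrm{FLY}}_\alpha$. These two functions map $\mathrm{GF}(2^n)$ to $\mathbb{Z}/2^n\mathbb{Z}$ and are defined by
$$
\log^{\mathrm{HN}}_\alpha : \begin{cases} 0 \mapsto 2^n - 1, \\ \alpha^k \mapsto k & \text{for } 0 \le k < 2^n - 1, \end{cases}
\qquad
\log^{\mathrm{FLY}}_\alpha : \begin{cases} 0 \mapsto 0, \\ \alpha^k \mapsto k & \text{for } 1 \le k \le 2^n - 1. \end{cases}
$$
In other words, $\log^{\mathrm{FLY}}_\alpha$ is a variant of $\log^{\mathrm{HN}}_\alpha$ where the outputs of 0 and 1 are swapped. We remark that the built-in discrete logarithm in SAGE [Dev17] actually implements $\log^{\mathrm{FLY}}_\alpha$ on $\mathrm{GF}(2^n)^*$. Another variant of the logarithm called "pseudo-logarithm" was introduced in [PU16] when investigating $\pi$ but we will not use it here.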
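The two variants differ only on the inputs 0 and 1, which a few lines of Python make concrete. This is a minimal sketch and not the paper's SAGE code: it fixes $n = 8$ and assumes the field is built from $X^8 + X^4 + X^3 + X^2 + 1$ (the polynomial called $p_{\min}$ later in the text), and the function names are ours.

```python
# Sketch of the two logarithm variants over GF(2^8).
# Assumption: the field polynomial is X^8 + X^4 + X^3 + X^2 + 1 (0x11D).
POLY = 0x11D
N = 8

def build_log_table(poly=POLY, n=N):
    """Return {alpha^k: k} for 0 <= k < 2^n - 1, where alpha is the class of X."""
    table, a = {}, 1
    for k in range(2**n - 1):
        table[a] = k
        a <<= 1                 # multiply by alpha
        if a >> n:              # reduce modulo the field polynomial
            a ^= poly
    return table

LOG = build_log_table()

def log_hn(x):
    """Hakala-Nyberg variant: 0 is sent to 2^n - 1 and 1 = alpha^0 to 0."""
    return 2**N - 1 if x == 0 else LOG[x]

def log_fly(x):
    """FLY variant: same as log_hn except that the outputs of 0 and 1 are swapped."""
    if x == 0:
        return 0
    if x == 1:
        return 2**N - 1
    return LOG[x]
```

Both functions are bijections from the 256 field elements to $\{0, ..., 255\}$; only the FLY variant sends 0 to 0, which is what the TKlog construction relies on.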

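Boolean Functions. Let $F : \mathbb{F}_2^n \to \mathbb{F}_2^m$ be a function. Its Difference Distribution Table (DDT) has coefficients $\delta_F(a, b) = \#\{x \in \mathbb{F}_2^n : F(x \oplus a) \oplus F(x) = b\}$ and its Linear Approximation Table (LAT) has coefficients $\mathcal{W}_F(a, b) = \sum_{x \in \mathbb{F}_2^n} (-1)^{a \cdot x \oplus b \cdot F(x)}$.
The maximum value of $\delta_F(a, b)$ for $a \neq 0$ is the differential uniformity of $F$, and the maximum value of $|\mathcal{W}_F(a, b)|$ for $b \neq 0$ is its linearity. If $A$ and $B$ are affine permutations of $\mathbb{F}_2^n$ and $\mathbb{F}_2^m$ respectively, then $F$ is affine equivalent to $G = B \circ F \circ A$. Furthermore, let $L_A$ and $L_B$ be the linear parts of $A$ and $B$. Then the LAT of $G$ is, up to the sign of each coefficient:
$$
\mathcal{W}_G(a, b) = \pm\, \mathcal{W}_F\big((L_A^{-1})^T(a),\ L_B^T(b)\big).
$$
Affine-equivalence can be generalized into CCZ-Equivalence [CCZ98]: two functions $F, G : \mathbb{F}_2^n \to \mathbb{F}_2^m$ are CCZ-equivalent if their graphs $\{(x, F(x)) : x \in \mathbb{F}_2^n\}$ and $\{(x, G(x)) : x \in \mathbb{F}_2^n\}$ are affine-equivalent. This form of equivalence is known to preserve, among other things, the differential uniformity and the linearity.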

On the S-Box 𝜋
While the specifications of both Streebog and Kuznyechik have always been public, a complete design rationale has not been provided. In particular, their designers gave very little information about the design method of their S-Box. Its look-up table is given in Table 2 in Appendix 8. This lack of information prompted academics to try and reverse-engineer this S-Box. We present the two decompositions that were found by Biryukov, Perrin and Udovenko further below but first we summarize the information that the designers did provide.

From the Designers
At RusCrypto'13 [Shi13], Shishkin gave a talk presenting the design principles of their upcoming block cipher (Kuznyechik was standardized in 2015). While they considered using S-Boxes from a known class of good S-Boxes, their preferred design approach was different. It is summarized in the following (translated) quote from their slides.
[The properties of S-Boxes designed via a] Random search with a specified parameter restriction
• are not optimal when considering the aggregate of the values of the basic cryptographic properties
• do not have a pronounced analytical structure
In other words, randomly generated S-Boxes trade their non-optimal cryptographic properties (differential uniformity, etc.) for the absence of an analytical structure which could be used by an attacker. Further in their presentation, they state that the number of bit operations needed to implement the S-Box should be minimized so as to help with both hardware and vectorized implementations. Those design criteria make sense; in fact, many algorithms use S-Boxes chosen with a similar rationale such as, for example, CLEFIA [SSA+07]. However, up until the decomposition found by Biryukov et al. [BPU16a] (see Section 2.2.2) no efficient implementation strategy was known for $\pi$. Furthermore, this permutation was shown to be somewhat close to a finite field logarithm [PU16] (see Section 2.3), which seems at odds with the claimed lack of analytical structure.

TU-Decomposition
Because the design method of $\pi$ was not published, Biryukov et al. tried to look for additional design criteria or even for hidden structure using (and improving) techniques from [BP15]. Their results are presented in [BPU16a,BPU16b]. They showed that $\pi$ has a TU-decomposition, i.e. that it is affine-equivalent to a permutation $(x, y) \mapsto \big(T_y(x),\ U_{T_y(x)}(y)\big)$ where both $T_y$ and $U_x$ are 4-bit permutations for all $(x, y) \in (\mathbb{F}_2^4)^2$. They further provided a decomposition of both $T$ and $U$, so that the overall structure of $\pi$ is as described in Figure 1a where $\nu_0$, $\nu_1$ and $\sigma$ are 4-bit permutations, $\varphi$ is a 4-bit function such that $\varphi(x) \neq 0$ and $\mathcal{I}$ is the multiplicative inversion in $\mathrm{GF}(2^4)$. The multiplexer selects the output of $\nu_0$ if the right branch is equal to 0 and the output of $\nu_1$ otherwise. The last components are $\alpha$ and $\omega$, two 8-bit linear permutations which are linked by the relation of Equation (1).
Biryukov et al. started by composing $\pi$ with an 8-bit linear layer $L^*$ applied at both the input and the output. The fact that the same function is applied in both cases is a consequence of the relation in Equation (1). They recovered $L^*$ using visual patterns in the LAT of $\pi$.
Another remarkable property was noted in [PU16]: $\nu_0$ is affine-equivalent to a discrete logarithm in $\mathrm{GF}(2^4)$.

Discrete Logarithm
While investigating the S-Box of the Belarussian standard block cipher BelT [Bel11], Perrin and Udovenko found a completely different decomposition of $\pi$ [PU16]. Indeed, they showed that it had the structure summarized in Figure 1b, i.e. that it was the composition of:
• a "pseudo-logarithm", i.e. a permutation obtained by inverting a "pseudo-exponential" which is itself built as a sequence $[\lambda^0, \lambda^1, \lambda^2, ..., \lambda^{z-1}, 0, \lambda^z, ..., \lambda^{2^n-2}]$ for a generator $\lambda$ of $\mathrm{GF}(2^8)^*$ and some preimage $z$ for 0;
• a layer of modular arithmetic operations which they could not simplify,
• a 4-bit permutation $q'^{-1}$, and
• an 8-bit linear permutation $\omega'$.
This decomposition uses the primitive polynomial $p_{\min}(X) = X^8 \oplus X^4 \oplus X^3 \oplus X^2 \oplus 1$ to define the finite field used in its logarithm. It is the first primitive polynomial of degree 8 in the lexicographic order as can be seen e.g. in Table C of [LN97]. It is also the default polynomial used when building a finite field of size $2^8$ in SAGE [Dev17].
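The claim that this polynomial is the first primitive one of degree 8 can be checked by brute force. The sketch below is ours and rests on two assumptions: a polynomial is encoded as an integer whose bit $i$ is the coefficient of $X^i$, and "lexicographic order" is taken to mean increasing order of that integer. It uses the fact that $X$ has multiplicative order $2^n - 1$ modulo a degree-$n$ polynomial exactly when that polynomial is primitive.

```python
def order_of_x(poly, n):
    """Multiplicative order of X modulo `poly` (degree n), or None if X never returns to 1."""
    a, k = 2, 1                          # a = X^k
    while a != 1:
        a <<= 1                          # multiply by X
        if a >> n:                       # reduce modulo poly
            a ^= poly
        k += 1
        if k > 2**n:                     # give up: X is not a unit of the quotient ring
            return None
    return k

def first_primitive(n):
    """Smallest integer encoding of a primitive polynomial of degree n (n >= 2)."""
    # Only monic polynomials with constant term 1 can be primitive.
    for poly in range(2**n + 1, 2**(n + 1), 2):
        if order_of_x(poly, n) == 2**n - 1:
            return poly
    return None

P_MIN = 0x11D                            # X^8 + X^4 + X^3 + X^2 + 1
```

For instance, `first_primitive(8) == P_MIN` holds, while the AES polynomial `0x11B` is skipped because $X$ only has order 51 modulo it.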

Relations Between the Decompositions
These decompositions are functionally equivalent since they both correspond to the same permutation $\pi$, and yet they have little to nothing in common. When evaluating the TU-decomposition, the input first goes through a linear layer mapping $\mathbb{F}_2^8$ to $(\mathrm{GF}(2^4))^2$ and then undergoes $(x, y) \mapsto (x/y, y)$, except if $y = 0$. On the other hand, evaluating the log-based decomposition first requires using a variant of the discrete logarithm. What is the relation between these operations?
We thus find it surprising that both decompositions exist. Furthermore, none of them seems like a natural choice for building an S-Box. These observations led Perrin and Udovenko to the following conclusion [PU16]:
"we think it more likely that [the TU-decomposition and log-based decomposition are] a consequence of a strong algebraic structure used to design [$\pi$], probably one related to a finite field exponential. Still this "master decomposition", from which the others would be consequences, remains elusive. Unfortunately, unless the Russian secret service release their design strategy, their exact process is likely to remain a mystery, if nothing else because of the existence of alternative decompositions: which exists by design and which is a mere side-effect of this design?"
In the next section, we present what we believe to be this "master decomposition". It relies on a discrete logarithm, which relates it to the second decomposition, and it turns out that a TU-decomposition identical to the one of [BPU16a] is always possible for permutations with a similar structure. We argue later, in Section 5.1, that this new decomposition is likely to be the one intended by the designers.

The TKlog
In this section, we introduce a new type of permutation which we call TKlog, of which $\pi$ turns out to be a particular case. As Streebog and Kuznyechik were both designed by the "ТК-26", we used the letters "TK" to name this structure. It has log-like properties, though mapping $\mathrm{GF}(2^{2m})$ to itself rather than to $\mathbb{Z}/2^{2m}\mathbb{Z}$, hence the "log" part of the name. More precisely, it maps the partition of $\mathrm{GF}(2^{2m})$ into multiplicative cosets of $\mathrm{GF}(2^m)^*$ to its partition into additive cosets of $\mathrm{GF}(2^m)$ and its restriction to each multiplicative coset is essentially the same for all cosets. We define this structure in Section 3.1 and present the details of this partition-preserving property in Section 3.2.

The TKlog Permutation Structure
The TKlog. Let $\mathrm{GF}(2^{2m}) = \mathbb{F}_2[X]/(p)$ be a finite field of even degree defined by a primitive polynomial $p$. The multiplicative subgroup $\mathrm{GF}(2^{2m})^*$ is cyclic and generated by $\alpha$ which is such that $p(\alpha) = 0$. In this context, $\alpha^{2^m+1}$ is a generator of the multiplicative subgroup of the subfield $\mathrm{GF}(2^m)$.
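For reference, the expression given for the S-Box in the introduction generalizes to $2m$ bits as follows. This block is a reconstruction from the surrounding statements (the index ranges are those of the Separation Property stated further below and the notation $\mathcal{T}_{\kappa,s}$ is the one used in the corollaries); it should be read as a sketch rather than as a verbatim quote of the definition.

```latex
\begin{aligned}
\mathcal{T}_{\kappa,s}(0) &= \kappa(0), \\
\mathcal{T}_{\kappa,s}\big(\alpha^{(2^m+1)j}\big) &= \kappa(2^m - j),
    && 1 \le j \le 2^m - 1, \\
\mathcal{T}_{\kappa,s}\big(\alpha^{i + (2^m+1)j}\big) &= \kappa(2^m - i) \oplus \big(\alpha^{2^m+1}\big)^{s(j)},
    && 0 < i \le 2^m,\ 0 \le j < 2^m - 1.
\end{aligned}
```

Here $s$ is a permutation of $\mathbb{Z}/(2^m - 1)\mathbb{Z}$ and $\kappa : \mathbb{F}_2^m \to \mathrm{GF}(2^{2m})$ is an affine function whose image, together with $\mathrm{GF}(2^m)$, spans the whole field, i.e. every element can be written uniquely as $\kappa(u) \oplus v$ with $u \in \mathbb{F}_2^m$ and $v \in \mathrm{GF}(2^m)$. The three cases cover $1 + (2^m - 1) + 2^m(2^m - 1) = 2^{2m}$ inputs.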
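To convince ourselves that a TKlog instance as defined above is indeed a permutation, we define its functional inverse, TKexp. It uses $\Phi_\kappa : \mathrm{GF}(2^{2m}) \to \mathbb{F}_2^m \times \mathrm{GF}(2^m)$ which is the affine permutation such that $\Phi_\kappa(\kappa(u) \oplus v) = (u, v)$. The TKexp works as follows: writing $(u, v) = \Phi_\kappa(x)$,
$$
\mathrm{TKexp}(x) = \begin{cases}
0 & \text{if } u = 0 \text{ and } v = 0, \\
\alpha^{(2^m+1)(2^m - u)} & \text{if } u \neq 0 \text{ and } v = 0, \\
\alpha^{(2^m - u) + (2^m+1)\, s^{-1}(\log_{\alpha^{2^m+1}}(v))} & \text{if } v \neq 0.
\end{cases}
$$
Alternatively, it can be evaluated using Algorithm 4 (in Appendix 9).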
On the Subtraction. The integer subtraction in the input of $\kappa$ is made necessary by the fact that $i \in \{1, ..., 2^m\}$, so that the binary representation of $i$ does not always fit in $m$ bits. We therefore need a small function that maps $\{1, ..., 2^m\}$ to $\{0, ..., 2^m - 1\}$ since the case $i = 0$ is handled separately. A natural choice would obviously be $i \mapsto i - 1$ but, in that case, we would need a different function when $i = 0$. Indeed, since a FLY logarithm is used, we have $j \in \{1, ..., 2^m - 1\}$ when $i = 0$. Thus, we would need to compose $\kappa$ with two different functions depending on whether $i = 0$. On the other hand, the function $x \mapsto 2^m - x$ maps both $\{1, ..., 2^m\}$ to $\{0, ..., 2^m - 1\}$ and $\{1, ..., 2^m - 1\}$ to itself, meaning that it can be used in both cases. We could not find such a simple function when $\log^{\mathrm{HN}}_\alpha$ is used.
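The two range claims above are easy to confirm numerically. The following lines are a small illustrative check with names of our choosing, not part of the construction itself.

```python
def flip(x, m):
    """The map x -> 2^m - x used in front of kappa."""
    return 2**m - x

def image(f, domain):
    """Sorted image of `domain` under `f`."""
    return sorted(f(x) for x in domain)

m = 4
# Case i != 0: i ranges over {1, ..., 2^m} and must land in {0, ..., 2^m - 1}.
case_i = image(lambda x: flip(x, m), range(1, 2**m + 1))
# Case i == 0: j ranges over {1, ..., 2^m - 1}, which is mapped to itself.
case_j = image(lambda x: flip(x, m), range(1, 2**m))
# The "natural" candidate x -> x - 1 sends {1, ..., 2^m - 1} to {0, ..., 2^m - 2}
# instead, so it would collide with kappa(0), the image of the field element 0.
naive_j = image(lambda x: x - 1, range(1, 2**m))
```

With $m = 4$, `case_i` is the list of integers from 0 to 15, `case_j` the list from 1 to 15, and `naive_j` contains 0, which is why a second function would be needed in the case $i = 0$.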
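The Particular Case of $\pi$. The S-Box $\pi$ is a TKlog operating on 8 bits. For implementation purposes, we identify $\mathrm{GF}(2^8)$ with $\mathbb{F}_2^8$ using the method we described in Section 2.1. When written as a TKlog instance, $\pi$ uses the following components:
• the primitive polynomial $p(X) = X^8 \oplus X^4 \oplus X^3 \oplus X^2 \oplus 1$ and its root $\alpha$;
• an affine function $\kappa$ mapping $\mathbb{F}_2^4$ to $\mathbb{F}_2^8$ such that $\kappa(0) = \texttt{0xFC}$ with a linear part $\Lambda$ defined by $\Lambda(1) = \texttt{0x12}$, $\Lambda(2) = \texttt{0x26}$, $\Lambda(4) = \texttt{0x24}$, $\Lambda(8) = \texttt{0x30}$, where the numbers are written in hexadecimal form and where the linear function $\Lambda$ verifies $\Lambda(x \oplus y) = \Lambda(x) \oplus \Lambda(y)$;
• a permutation $s$ of $\mathbb{Z}/15\mathbb{Z}$ defined in Table 1.
The evaluation of $\pi$ using this structure is summarized in Algorithm 1. An implementation of $\pi$ based on this algorithm is given as a SAGE [Dev17] script in Appendix 12. It is noteworthy that the default logarithm function in SAGE is $\log^{\mathrm{FLY}}_\alpha$, which simplifies the implementation of the function.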
We actually obtained the TKlog structure by first decomposing $\pi$ and then generalizing the structure we found. A summary of our reverse-engineering process is given in Appendix 10.
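The evaluation described above can be sketched in plain Python, without SAGE. The constants below are assumptions in the sense that they are not spelled out in this part of the text: the images of the linear part $\Lambda$ and the table `S` standing for the permutation $s$ of Table 1 are quoted from memory of the published specification, and should be checked against Table 1 and the script of Appendix 12 before being relied upon.

```python
POLY = 0x11D                       # X^8 + X^4 + X^3 + X^2 + 1
KAPPA0 = 0xFC                      # kappa(0)
LAMBDA = [0x12, 0x26, 0x24, 0x30]  # assumed images of 1, 2, 4, 8 under Lambda
S = [0, 12, 9, 8, 7, 4, 14, 6, 5, 10, 2, 11, 1, 3, 13]  # assumed content of Table 1

# EXP[k] = alpha^k for 0 <= k < 255, where alpha is the class of X.
EXP = [1]
for _ in range(254):
    a = EXP[-1] << 1
    EXP.append(a ^ POLY if a & 0x100 else a)

# FLY logarithm: log(alpha^k) = k for 1 <= k <= 255, so that log(1) = 255.
LOG_FLY = {EXP[k]: k for k in range(1, 255)}
LOG_FLY[1] = 255

def kappa(u):
    """Affine map from 4-bit integers to bytes: kappa(0) xor Lambda(u)."""
    out = KAPPA0
    for bit in range(4):
        if (u >> bit) & 1:
            out ^= LAMBDA[bit]
    return out

def pi(x):
    """Evaluate the S-Box as a TKlog instance (sketch under the assumptions above)."""
    if x == 0:
        return kappa(0)
    l = LOG_FLY[x]                 # l is in {1, ..., 255}
    i, j = l % 17, l // 17
    if i == 0:                     # x = alpha^(17 j) lies in GF(2^4)^*, 1 <= j <= 15
        return kappa(16 - j)
    return kappa(16 - i) ^ EXP[17 * S[j]]   # here 0 <= j <= 14

SBOX = [pi(x) for x in range(256)]
```

Whatever the exact table, the construction yields a permutation as soon as `S` is a permutation of $\{0, ..., 14\}$ and the image of $\Lambda$ intersects $\mathrm{GF}(2^4)$ only in 0, which can be confirmed by checking that `sorted(SBOX)` is the list of all bytes.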

A Partition-Preserving Property
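Recall that $\mathrm{GF}(2^m)^*$ is the field of size $2^m$ minus 0 and that it is contained in $\mathrm{GF}(2^{2m})^*$. Let $\alpha$ be a multiplicative generator of $\mathrm{GF}(2^{2m})^*$ so that $\alpha^{2^m+1}$ is a multiplicative generator of $\mathrm{GF}(2^m)^*$. The field $\mathrm{GF}(2^{2m})$ can be written in two different ways using the multiplicative cosets of $\mathrm{GF}(2^m)^*$ on one hand and the additive cosets of $\mathrm{GF}(2^m)^*$ on the other.
• All the elements in $\mathrm{GF}(2^{2m})^*$ can be written as $\alpha^{i + (2^m+1)j}$ with $0 \le i \le 2^m$ and $0 \le j < 2^m - 1$, so that
$$\mathrm{GF}(2^{2m}) = \mathrm{GF}(2^m) \cup \Big( \bigcup_{i=1}^{2^m} \alpha^i \odot \mathrm{GF}(2^m)^* \Big).$$
• As $\mathrm{GF}(2^m)$ is a vector space of dimension $m$, there exists a vector space $V$ of elements of $\mathrm{GF}(2^{2m})$ of dimension $m$ such that $\mathrm{GF}(2^{2m})$ is the direct sum of $V$ and $\mathrm{GF}(2^m)$. In this case, we can write
$$\mathrm{GF}(2^{2m}) = V \cup \Big( \bigcup_{v \in V} v \oplus \mathrm{GF}(2^m)^* \Big).$$
Both $V$ and $\mathrm{GF}(2^m)$ are vector spaces of dimension $m$ and, in each decomposition, $2^m$ cosets of $\mathrm{GF}(2^m)^*$ are used. It is therefore possible to map these decompositions from one to the other. It is precisely what a TKlog does. More formally, the following theorem holds.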
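Theorem 1 (Partition-preserving property). Let $\mathcal{T}_{\kappa,s} : \mathrm{GF}(2^{2m}) \to \mathrm{GF}(2^{2m})$ be a valid TKlog instance. Then the following equalities are always true:
$$
\mathcal{T}_{\kappa,s}\big(\mathrm{GF}(2^m)^*\big) = \kappa\big(\mathbb{F}_2^m \setminus \{0\}\big), \qquad
\mathcal{T}_{\kappa,s}\big(\alpha^i \odot \mathrm{GF}(2^m)^*\big) = \kappa(2^m - i) \oplus \mathrm{GF}(2^m)^* \ \text{ for } 0 < i \le 2^m.
$$
Corollary 1 (Vector spaces to affine spaces). Let $\mathcal{T}_{\kappa,s} : \mathrm{GF}(2^{2m}) \to \mathrm{GF}(2^{2m})$ be a valid TKlog instance. Then it always holds that
$$
\mathcal{T}_{\kappa,s}\big(\mathrm{GF}(2^m)\big) = \kappa\big(\mathbb{F}_2^m\big), \qquad
\mathcal{T}_{\kappa,s}\big(\alpha^{2^m} \odot \mathrm{GF}(2^m)\big) = \kappa(0) \oplus \mathrm{GF}(2^m),
$$
where all the spaces involved in these equalities are of dimension $m$. Furthermore, $\mathrm{GF}(2^{2m}) = \{ x \oplus y : x \in \mathrm{GF}(2^m),\ y \in \alpha^{2^m} \odot \mathrm{GF}(2^m) \}$, so a TKlog maps two vector spaces of dimension $m$ spanning $\mathrm{GF}(2^{2m})$ to two affine spaces of dimension $m$ spanning $\mathrm{GF}(2^{2m})$.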
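As $\pi$ is a TKlog instance, Theorem 1 implies that it verifies the following set equalities
$$
\pi\big(\mathrm{GF}(2^4)^*\big) = \kappa\big(\mathbb{F}_2^4 \setminus \{0\}\big), \qquad
\pi\big(\alpha^i \odot \mathrm{GF}(2^4)^*\big) = \kappa(16 - i) \oplus \mathrm{GF}(2^4)^* \ \text{ for } 0 < i \le 16,
$$
and applying Corollary 1 yields $\pi(\mathrm{GF}(2^4)) = \kappa(\mathbb{F}_2^4)$ and $\pi(\alpha^{16} \odot \mathrm{GF}(2^4)) = \kappa(0) \oplus \mathrm{GF}(2^4)$. These equalities are summarized in Figure 2 where relationships between complete affine spaces are represented by dashed and thick arrows while those linking sets of size $2^m - 1$ are represented by plain thin ones.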
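These set equalities can be verified exhaustively on 8 bits. The sketch below rebuilds a TKlog from the same ingredients as before; as previously, the images of the linear part and the table `S` are assumptions recalled from the published specification rather than values given in this part of the text, and the partition check itself does not depend on their exact values as long as they define a valid instance.

```python
POLY = 0x11D
KAPPA0, LAMBDA = 0xFC, [0x12, 0x26, 0x24, 0x30]          # assumed affine map kappa
S = [0, 12, 9, 8, 7, 4, 14, 6, 5, 10, 2, 11, 1, 3, 13]   # assumed permutation s

EXP = [1]                                  # EXP[k] = alpha^k, 0 <= k < 255
for _ in range(254):
    a = EXP[-1] << 1
    EXP.append(a ^ POLY if a & 0x100 else a)
LOG = {v: k for k, v in enumerate(EXP)}

SUBFIELD_STAR = {EXP[17 * j] for j in range(15)}          # GF(2^4)^*

def kappa(u):
    out = KAPPA0
    for bit in range(4):
        if (u >> bit) & 1:
            out ^= LAMBDA[bit]
    return out

def tklog(x):
    if x == 0:
        return kappa(0)
    l = LOG[x] or 255                      # FLY convention: log(1) = 255
    i, j = l % 17, l // 17
    return kappa(16 - j) if i == 0 else kappa(16 - i) ^ EXP[17 * S[j]]

def check_partitions():
    """Check that each multiplicative coset is mapped onto the expected additive coset."""
    ok = {tklog(x) for x in SUBFIELD_STAR} == {kappa(u) for u in range(1, 16)}
    for i in range(1, 17):
        coset = {EXP[i + 17 * j] for j in range(15)}      # alpha^i * GF(2^4)^*
        target = {kappa(16 - i) ^ v for v in SUBFIELD_STAR}
        ok = ok and {tklog(x) for x in coset} == target
    return ok
```

Calling `check_partitions()` returns `True` for this instance. Replacing `S` by any other permutation of $\{0, ..., 14\}$ leaves the result unchanged, which illustrates that the partition-preserving property comes from the structure and not from the particular choice of $s$.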

The Simplicity of the TKlog Properties
The partitions dealt with in Theorem 1 have simple algebraic descriptions.Indeed, ,  ∈ GF(2 2 ) * are in the same additive coset of GF(2  ) * if and only if Tr  () = Tr  ().Then, we remark that Furthermore,  ∈ GF(2  ) * if and only if  2  −1 = 1.We deduce the following corollary of Theorem 1.
Corollary 2. Let T , : GF(2 2 ) → GF(2 2 ) be a valid TKlog instance.Then we always have that As a consequence, for any constant  ∈ GF(2 2 )∖{0, 1}, we have the following implication involving only linear equations: The interaction of a TKlog with these two partitions goes beyond mapping one to the other.Indeed, consider a more general structure corresponding to permutations  such that: where  0 and  1 are permutations of F  2 , and where   is a permutation of Z/(2  − 1)Z for all  ∈ Z/(2  + 1)Z.Any such  is a permutation with the exact partition-preserving property described in Theorem 1 but its contributions on GF(2  ) and on (F  2 ) each depend on both  and , even when we restrict ourselves to  > 0. It is not the case for TKlogs; these permutations are far simpler.
Lemma 1 (Separation Property). Let $\mathcal{T}_{\kappa,s} : \mathrm{GF}(2^{2m}) \to \mathrm{GF}(2^{2m})$ be a valid TKlog instance. Then, for any $i, j$ such that $0 < i \le 2^m$ and $0 \le j < 2^m - 1$, it holds that
$$\mathcal{T}_{\kappa,s}\big(\alpha^{i + (2^m+1)j}\big) = \kappa(2^m - i) \oplus \big(\alpha^{2^m+1}\big)^{s(j)},$$
so that the contribution of $i$ is restricted to $\kappa(\mathbb{F}_2^m)$ and that of $j$ is restricted to $\mathrm{GF}(2^m)$. In other words, a TKlog interacts with each multiplicative coset other than $\mathrm{GF}(2^m)^*$ in the exact same way, even though this property is in no way implied by Theorem 1.
We could not find any attack leveraging these surprising properties of $\pi$. However, we did find that these partitions interact in a non-trivial way with the linear layer of Streebog. We discuss the consequences of the presence of this structure in $\pi$ in Section 5.

The Missing Link
The TKlog structure is the missing link between the two previous decompositions of $\pi$. Its relationship with the logarithm-based decompositions of [PU16] is natural since both consist in a variant of the discrete logarithm followed by some arithmetic. The fact that $\pi$ has a TU-decomposition remains a priori surprising but, in Section 4.1, we show that it is always the case for TKlogs. We also list some of the consequences of this property in Section 4.2.

A TU-Decomposition Always Exists
TKlogs can always be expressed in a fashion very similar to the first decomposition of Biryukov et al. [BPU16a].In order to establish it, we first derive the following lemma.
• The permutation  captures the way the logarithm and the arithmetic operation  ↦ → 2  −  operate on / to return the correct input for : • The permutation  corresponds to the permutation  applied on : )︀ when  ̸ = 0 .
• The function  corresponds to the function  introduced because of Lemma 2. It is defined by where  ∈ F  2 is interpreted as an element of Z/(2  + 1)Z.Note that () ̸ = 0 for all .Using that (/) = 2  − , we have and, using Equation 3 as well, we obtain that Combining this result with the definition of , we obtain that )︁ () .
We can then write for any  ̸ = 0 that .
When  = 0, the term on the right of the XOR cancels out because (0) = 0. Therefore, when  = 0, we can write: which is the same expression as when  ̸ = 0 except that the input of  is changed from (/) to  ().As a consequence, it is possible to evaluate this function in two stages.

This process is exactly the one described in Algorithm 2, an observation which concludes the proof.
Ironically, the only "unsurprising" sub-component of $\pi$, namely the inverse function, is one of the reasons why we were able to reverse-engineer this S-Box in the first place. Had [the inversion] been replaced by a different (and possibly weaker!) S-Box, there would not have been any of the lines in the LAT which got our reverse-engineering started.
We can now see that, rather than luck, all of these events are direct consequences of the TKlog structure of $\pi$.
Incidentally, it provides an excellent method for spotting such a structure should someone else decide to use one and try to keep this fact hidden. In [PU16], Perrin and Udovenko suggested to plot the variance of the absolute value of the coefficients in each row/column of the LAT. They noticed that, in the case of $\pi$, there was a sharp drop for some columns. Because of Lemma 3, we can see that this pattern is an inherent property of the TKlog which will therefore always betray the presence of such a structure. Furthermore, should the TKlog instance be obfuscated by composing it with affine layers, this pattern would remain and in fact provide some information about the linear part of said affine layers.
Furthermore, since a TKlog always has a TU-decomposition, it will always have a vector space of dimension $n = 2m$ inside the set of the coordinates of the zeroes of its LAT [CP19]. This other pattern can also be detected and thus used to identify a TKlog instance as such.

CCZ-Equivalence. Because of Theorem 2, we know that a TKlog is always affine-equivalent to a permutation
where [ = 0] is a Boolean function mapping 0 to 1 and  ̸ = 0 to 0 and where [ As explained in [CP19], the existence of such a decomposition where   is a permutation for all  ∈ F  2 is equivalent to the possibility of a so-called -twist, an operation which preserves the CCZ-equivalence class but a priori does not preserve the AE-equivalence class.As a functional inversion also preserves the CCZ-Equivalence class, we deduce the following corollary of Theorem 2. Corollary 3. Let  and  be as defined above.Both the corresponding TKlog instance and its inverse are CCZ-equivalent to the function  : Proof.By definition (see [CP19]), the -twist maps a function  : . The corollary follows directly.
In the case of $\pi$, we generated the function which is CCZ-equivalent to it as specified in Corollary 3. Its lookup-table is provided in the Supplementary Material for the sake of completeness (see Table 3). It of course has the same differential and extended Walsh spectra as $\pi$ and, again like $\pi$, all of its coordinates have degree 7. However, it is not a permutation: 15 elements in its image have 3 preimages, 75 have 2 and the 61 remaining ones have 1.

New Information on the Russian Primitives
In this section, we discuss the consequences of the fact that $\pi$ is (up to a translation) a TKlog instance for the primitives using it. First, we argue that the presence of a TKlog structure must be a deliberate choice from the designers (Section 5.1). We then introduce a new representation of the binary matrix used in Streebog in Section 5.2 which we use to highlight some interactions between the partitions preserved by $\pi$ and this linear component. Finally, we discuss our findings and their consequences in Section 5.3.

The Likely Design Process of 𝜋
In light of our results, we can deduce some information about the design process of $\pi$. First, we establish that the number of TKlog instances is extremely small, meaning that the choice of this structure must have been deliberate. Then, using some experimental results, we obtain a design process which yields results extremely similar to $\pi$.
Our point with these estimates is to give an intuition of how small the number of TKlogs is. A random permutation generator returning an affine permutation would be assumed to deliberately generate such an object. In much the same way, the generation process that led the designers of Streebog to choose $\pi$ can be assumed to have deliberately returned a TKlog instance.
Claim. Given how small the number of TKlogs is, we are confident that the designers of $\pi$ deliberately chose to use this structure.
Experimental Results.How good are the differential and linear properties of TKlog instances compared to those expected from a random permutation?To answer this question, we build upon the analysis of the S-Box of Skipjack in [BP15] to introduce the following concepts.
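Definition 2 (Anomaly of an S-Box). Let $F : \mathbb{F}_2^n \to \mathbb{F}_2^n$ be a permutation, $u(F)$ be its differential uniformity, and $n_k(F)$ be the number of occurrences of $k$ in its DDT. The differential anomaly of $F$ is equal to
$$A^d(F) = -\log_2 \Pr\big[\, u(G) \le u(F) \ \text{and}\ n_{u(G)}(G) \le n_{u(F)}(F) \,\big],$$
where the probability is taken over all permutations $G$. If $\ell(F)$ is the linearity of $F$ and $n'_k(F)$ is the sum of the number of occurrences of $k$ and $-k$ in the LAT of $F$, then the linear anomaly of $F$ is equal to
$$A^\ell(F) = -\log_2 \Pr\big[\, \ell(G) \le \ell(F) \ \text{and}\ n'_{\ell(G)}(G) \le n'_{\ell(F)}(F) \,\big],$$
where the probability is taken over all permutations $G$.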
An S-Box with a differential anomaly close to 0 has differential properties close to those of a random S-Box or worse.The differential anomaly behaves as we would expect: when the differential uniformity decreases under its expected value, the anomaly increases.As it contains more information than the differential uniformity, it allows a comparison of S-Boxes for which this quantity is the same.From a cryptographic standpoint, the higher the anomaly the better as it means that the S-Box will provide a better security against differential attacks.The same can be said for the linear anomaly.
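Computing an anomaly requires the distribution of the DDT and LAT coefficients of random permutations, which is not reproduced here. The two quantities that the anomalies refine, namely the differential uniformity and the linearity, are however simple to compute from a look-up table. The sketch below is ours; it assumes the LAT is scaled as $\sum_x (-1)^{a \cdot x \oplus b \cdot F(x)}$ and uses a fast Walsh-Hadamard transform, and the 3-bit example table is the cube map over $\mathrm{GF}(2^3)$ built from $X^3 + X + 1$.

```python
def differential_uniformity(sbox):
    """Maximum DDT coefficient over all non-zero input differences."""
    size, best = len(sbox), 0
    for a in range(1, size):
        counts = [0] * size
        for x in range(size):
            counts[sbox[x ^ a] ^ sbox[x]] += 1
        best = max(best, max(counts))
    return best

def parity(v):
    """Parity of the Hamming weight of v, i.e. a scalar product over F_2."""
    return bin(v).count("1") & 1

def linearity(sbox):
    """Maximum absolute Walsh coefficient over all non-zero output masks."""
    size, best = len(sbox), 0
    for b in range(1, size):
        w = [1 - 2 * parity(b & y) for y in sbox]   # (-1)^(b . F(x))
        h = 1
        while h < size:                             # in-place Walsh-Hadamard transform
            for start in range(0, size, 2 * h):
                for k in range(start, start + h):
                    u, v = w[k], w[k + h]
                    w[k], w[k + h] = u + v, u - v
            h *= 2
        best = max(best, max(abs(c) for c in w))
    return best

IDENTITY_4 = list(range(16))          # worst case: a linear function
CUBE_3 = [0, 1, 3, 4, 5, 6, 7, 2]     # x -> x^3 over GF(2^3)
```

On these toy examples, the identity reaches the maximum possible values (16 and 16) while the cube map reaches 2 and 4. The same two functions can be applied to any 256-entry table to compare it with the values 8 and 56 discussed below.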
In [BP15], Biryukov and Perrin provided formulas for computing the differential and linear anomalies based on the statistical distribution of the DDT and LAT coefficients presented in [DR07].They also showed that the linear anomaly of the S-Box of Skipjack was equal to 55.4 so that this component could not have been generated randomly.
To try and gather more information about the design process of $\pi$, we generated $10^6$ random 8-bit TKlog instances. We plotted the differential and linear anomalies of each of them in a two dimensional graph given in Figure 3. Each instance corresponds to a light gray point; darker points are obtained when multiple instances have the same differential and linear anomalies. We also put the anomalies of $\pi$, $\log^{\mathrm{FLY}}_\alpha$ and $\log^{\mathrm{HN}}_\alpha$ in the same graph for comparison. As we can see, the differential and linear anomalies of $\pi$ are somewhat good but not exceptional compared to those of a random TKlog. More precisely, an 8-bit TKlog instance has both a differential and linear anomaly at least as high as that of $\pi$ with probability about $2^{-10.6}$ and it is not hard to obtain much better instances. They are also lower than those of both $\log^{\mathrm{FLY}}_\alpha$ and $\log^{\mathrm{HN}}_\alpha$. However, none of our random instances have a better differential uniformity or linearity than $\pi$ (including $\log^{\mathrm{FLY}}_\alpha$ and $\log^{\mathrm{HN}}_\alpha$). Furthermore, $\pi$ is in the area of Figure 3 containing most instances with the same differential uniformity and linearity. Thus, its anomalies are on par with those of a random TKlog instance with the same differential uniformity and linearity.

Design Process Outline.
In light of the experimental results above, we can see that the following design process would yield a result very similar to $\pi$.
1. Figure out that the best possible differential uniformity for an 8-bit TKlog instance is 8 and the best linearity is 56, for example via extensive computer simulations.
2. Pick a TKlog instance at random among those with said differential uniformity and linearity, without taking the anomaly into account.
This strategy is natural as long as there is a reason to impose the use of a TKlog, though we cannot think of one. Since both the differential and linear anomalies of $\pi$ are inferior to those of log FLY and of log HN, the purpose of the use of a TKlog in this case could not be an improvement of the cryptographic properties of the discrete logarithm. More importantly, the very strong algebraic properties of such components which we described in Section 3.2 would a priori invite caution; even more so in the case of Streebog. Indeed, as we explain below, its linear layer interacts non-trivially with the corresponding partitions.

On the Linear Layer of Streebog
The binary matrix corresponding to the L operation of Streebog is given in Figure 4, where a black pixel corresponds to 1 and a white one to 0. As we can see, it has a strong structure. In [KK13], Kazymyrov and Kazymyrova showed that it could be written as the composition of:
• a layer of 8-bit linear permutations ℓ which simply inverts the order of the bits in each byte,
• the multiplication by an 8 × 8 MDS matrix with coefficients in $\mathrm{GF}(2^8) = \mathbb{F}_2[X]/p_{\mathrm{KK}}(X)$, where $p_{\mathrm{KK}}(X) = X^8 \oplus X^6 \oplus X^5 \oplus X^4 \oplus 1$ is a primitive polynomial of degree 8, and
• the inverse of the layer ℓ.
We used a very direct approach to try and simplify this structure: by setting each byte in a row to 1 one after the other, multiplying it by L, and then writing its bytes as elements of $\mathrm{GF}(2^8) = \mathbb{F}_2[X]/p_{\min}(X)$, we generated the corresponding matrix $L_F$ with coefficients in this field. The polynomial used by Kazymyrov and Kazymyrova is the reciprocal of $p_{\min}$, i.e. $p_{\mathrm{KK}}(1/X) = p_{\min}(X)/X^8$. In hindsight, it was obvious that the reversal of the bit order at the byte level in their expression of the linear layer could be removed by considering this polynomial instead.
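The relation between the two polynomials, and the reason why the byte-level bit reversal disappears, can be checked numerically. The sketch below is ours and rests on one observation: reversing the bits of a byte $a$ corresponds to $X^7 \cdot a(1/X)$, so conjugating the multiplication by a constant $c$ modulo $p_{\mathrm{KK}}$ with the bit reversal gives the multiplication by $c(1/X)$ modulo $p_{\min}$. The helper names (`reciprocal`, `rev8`, `phi`) are hypothetical and do not come from [KK13].

```python
P_KK = 0x171   # X^8 + X^6 + X^5 + X^4 + 1
P_MIN = 0x11D  # X^8 + X^4 + X^3 + X^2 + 1
INV_X = 0x8E   # X^-1 modulo P_MIN, since X * (X^7 + X^3 + X^2 + X) = 1 there


def gf_mul(a, b, poly):
    """Multiply a and b in F_2[X]/poly (poly given as a 9-bit integer)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= poly
    return r


def reciprocal(poly, degree=8):
    """Return X^degree * poly(1/X), i.e. poly with its coefficients reversed."""
    bits = format(poly, "0{}b".format(degree + 1))
    return int(bits[::-1], 2)


def rev8(x):
    """Reverse the order of the 8 bits of a byte (the layer called l in the text)."""
    return int(format(x, "08b")[::-1], 2)


def phi(c):
    """Evaluate the polynomial c at X^-1 inside F_2[X]/P_MIN (Horner's rule)."""
    r = 0
    for bit in format(c, "08b"):  # most significant coefficient first
        r = gf_mul(r, INV_X, P_MIN) ^ int(bit)
    return r


PHI = [phi(c) for c in range(256)]


def conjugation_holds():
    """Check rev8(c * a mod P_KK) == phi(c) * rev8(a) mod P_MIN for all bytes."""
    return all(
        rev8(gf_mul(c, a, P_KK)) == gf_mul(PHI[c], rev8(a), P_MIN)
        for c in range(256)
        for a in range(256)
    )


print(hex(reciprocal(P_KK)), conjugation_holds())
```

The check runs over all 65536 pairs of bytes, so it is exhaustive for the field multiplication itself; it says nothing about the specific coefficients of the Streebog matrix, which are not reproduced here.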
In the end, if we let $X$ be the 8 × 8 matrix of elements of $\mathrm{GF}(2^8)$ corresponding to the internal state of Streebog and let P denote the transposition of $X$ (as is done in the specification of Streebog), then applying the whole linear part of the round function of Streebog can be written $X \mapsto P(X) \times L_F$, where "×" denotes the usual matrix multiplication.
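Additive to Multiplicative Cosets. The subfield $\mathrm{GF}(2^4)^*$ has a particular interaction with L. Indeed, consider a vector $v_i = [0, ..., 0, x, 0, ..., 0]$ of $\mathrm{GF}(2^8)^8$ whose only non-zero coordinate $x$ is at position $i$. Applying the matrix multiplication of Streebog to $v_i$ yields a vector whose $j$-th coordinate is $x \cdot (L_F)_{i,j}$, so that when $x$ takes all values in $\mathrm{GF}(2^4)^*$, each output coordinate takes all values in a multiplicative coset of $\mathrm{GF}(2^4)^*$: the linear layer maps the subfield to its multiplicative cosets. However, it is unclear what happens when multiple cells of the input vector are active.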
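The single-active-cell case can be illustrated with a few lines of code. This is a sketch under stated assumptions: the field is $\mathbb{F}_2[X]/p_{\min}(X)$, and the matrix row used below is an arbitrary placeholder made of non-zero bytes, not the actual coefficients of the Streebog matrix. All function and variable names are ours.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply a and b in GF(2^8) = F_2[X]/(X^8 + X^4 + X^3 + X^2 + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= poly
    return r


def subfield_star():
    """Non-zero elements x of GF(2^8) such that x^16 == x, i.e. GF(2^4)^*."""
    elements = set()
    for x in range(1, 256):
        y = x
        for _ in range(4):  # four squarings compute x^16
            y = gf_mul(y, y)
        if y == x:
            elements.add(x)
    return elements


SUBFIELD_STAR = subfield_star()

# The multiplicative cosets c * GF(2^4)^* for every non-zero c.
COSETS = {frozenset(gf_mul(c, s) for s in SUBFIELD_STAR) for c in range(1, 256)}

# Hypothetical matrix row: arbitrary non-zero placeholders, NOT Streebog's coefficients.
PLACEHOLDER_ROW = [0x02, 0x03, 0x1D, 0x4C, 0x98, 0x8E, 0xA7, 0x53]


def single_active_cell_image(x, row):
    """[0, ..., x, ..., 0] times a matrix whose row at the active position is `row`."""
    return [gf_mul(x, c) for c in row]


def output_cell_sets(row):
    """Set of values taken by each output cell when x ranges over GF(2^4)^*."""
    images = [single_active_cell_image(x, row) for x in SUBFIELD_STAR]
    return [frozenset(img[j] for img in images) for j in range(len(row))]


print(len(COSETS), all(s in COSETS for s in output_cell_sets(PLACEHOLDER_ROW)))
```

Each output cell sweeps exactly one of the 17 multiplicative cosets, whatever non-zero coefficients are used; the sketch does not address the case of several active cells.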
Kuznyechik. The linear layer of Kuznyechik is specified as an LFSR made of 16 cells, each of which is an element of $\mathrm{GF}(2^8)$, which is clocked 16 times. It can also be represented as a multiplication by a 16 × 16 matrix. However, the representation of the field elements uses a different polynomial, namely $p_{\mathrm{kuz}}(X) = X^8 \oplus X^7 \oplus X^6 \oplus X \oplus 1$. While $p_{\min}(X) = X^8 \oplus X^4 \oplus X^3 \oplus X^2 \oplus 1$ is the first primitive polynomial of degree 8 in lexicographic order, $p_{\mathrm{kuz}}$ is the last such polynomial of weight 5 (see Table C of [LN97]).
Unlike the matrix multiplication in Streebog, the one in Kuznyechik cannot be written as a matrix multiplication in $\mathbb{F}_2[X]/p_{\min}(X)$, so that the coset propagation described above for the hash function does not seem applicable to the block cipher.
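The position of the two polynomials among the primitive polynomials of degree 8 can be checked by brute force. The sketch below is ours: it encodes a polynomial as a 9-bit integer (so that increasing integers give the lexicographic order) and uses the fact that a degree-8 polynomial over $\mathbb{F}_2$ is primitive exactly when $X$ has multiplicative order 255 modulo it.

```python
def order_of_x(poly):
    """Multiplicative order of X modulo a degree-8 polynomial, or None if X never returns to 1."""
    x = 1
    for k in range(1, 256):
        x <<= 1
        if x & 0x100:
            x ^= poly
        if x == 1:
            return k
    return None


# Degree-8 polynomials are the integers 0x100..0x1FF; keep those for which X has order 255.
PRIMITIVE = [p for p in range(0x100, 0x200) if order_of_x(p) == 255]

# Primitive polynomials with exactly five non-zero coefficients.
WEIGHT_5 = [p for p in PRIMITIVE if bin(p).count("1") == 5]

print(hex(PRIMITIVE[0]), hex(WEIGHT_5[-1]))
```

Here `0x11D` encodes $p_{\min}$, `0x1C3` encodes $p_{\mathrm{kuz}}$ and `0x171` encodes $p_{\mathrm{KK}}$.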

Discussion
The consequences of the partition-preserving properties of $\pi$ and its non-trivial interaction with the linear layer are hard to assess.
In the literature, we can find other S-Boxes mapping cosets to cosets. For example, monomials map multiplicative cosets of the subfield to multiplicative cosets of the subfield: if $F : x \mapsto x^d$ is a permutation of $\mathrm{GF}(2^{2m})$, then $F(\gamma \cdot \mathrm{GF}(2^m)^*) = \gamma^d \cdot \mathrm{GF}(2^m)^*$ for any $\gamma$. If we remove their affine components, the S-Boxes of the AES [AES01] and Misty1 [Mat97] (among many others) exhibit this behaviour. Yet, despite their appearance in some very prominent targets, multiplicative cosets have never been used in symmetric cryptanalysis. Note that algorithm designers always compose the inverse with unrelated affine layers so as to break its algebraic structure. This conservative decision is likely to prevent the use of multiplicative cosets to attack these ciphers in practice. It is not the case for additive cosets. In fact, the authors of [BBF16] purposefully built an S-Box mapping additive cosets to additive cosets with the explicit purpose of using this pattern as a backdoor. They show that such a partition can be preserved if the linear layer is chosen carefully and can thus hold for an arbitrary number of rounds. In the cipher of Bannier et al., the key schedule can therefore be arbitrarily complex without hindering the backdoor. This property is not shared by the partition into multiplicative cosets.
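The behaviour of monomials on multiplicative cosets is easy to confirm exhaustively for $m = 4$. The following sketch is ours; the choice of the polynomial $X^8 \oplus X^4 \oplus X^3 \oplus X^2 \oplus 1$ is an assumption made for concreteness (the AES uses a different one), and the exponents 254 (the field inverse) and 7 are merely examples of permutation monomials.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply a and b in GF(2^8) = F_2[X]/poly."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= poly
    return r


def gf_pow(a, e):
    """Square-and-multiply exponentiation in GF(2^8)."""
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r


# GF(2^4)^* is the set of non-zero solutions of x^16 = x.
SUBFIELD_STAR = {x for x in range(1, 256) if gf_pow(x, 16) == x}


def monomial(d):
    """Look-up table of x -> x^d."""
    return [gf_pow(x, d) for x in range(256)]


def maps_cosets_to_cosets(d):
    """Check F(gamma * GF(2^4)^*) == gamma^d * GF(2^4)^* for every non-zero gamma."""
    table = monomial(d)
    for gamma in range(1, 256):
        image = {table[gf_mul(gamma, s)] for s in SUBFIELD_STAR}
        expected = {gf_mul(table[gamma], s) for s in SUBFIELD_STAR}
        if image != expected:
            return False
    return True


print(maps_cosets_to_cosets(254), maps_cosets_to_cosets(7))
```

Composing such a monomial with an unrelated affine layer, as done in the AES, changes the output sets and is not covered by this check.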
The only case (other than the TKlog) we can think of where a partition into cosets is mapped to a different partition is that of (plain) discrete logarithms. Indeed, a Hakala-Nyberg type of logarithm operating on $\mathrm{GF}(2^{2m})$, which maps $\alpha^{2^{2m}-1}$ to 0, always maps multiplicative cosets of the subfield to additive cosets of $\mathbb{Z}/(2^{2m} - 1)\mathbb{Z}$. In this case, the multiplication is in the finite field and the addition over the integers. Since these two operations are completely different, we deem it unlikely that a cryptanalysis is aided by this property.
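For $m = 4$, the coset-to-coset behaviour of a plain logarithm can be observed as follows. This is a sketch under assumptions that are ours: the field is $\mathbb{F}_2[X]/(X^8 \oplus X^4 \oplus X^3 \oplus X^2 \oplus 1)$, the generator is $\alpha = X$, and the logarithm sends 1 to 0; the image of 0 plays no role since 0 belongs to no multiplicative coset.

```python
def gf_mul(a, b, poly=0x11D):
    """Multiply a and b in GF(2^8) = F_2[X]/poly."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= poly
    return r


# EXP[k] = alpha^k for alpha = X (encoded as 2), 0 <= k < 255.
EXP = [1] * 255
for k in range(1, 255):
    EXP[k] = gf_mul(EXP[k - 1], 2)

# Discrete logarithm of the non-zero elements, with log(1) = 0.
LOG = {EXP[k]: k for k in range(255)}

# GF(2^4)^* is the subgroup generated by alpha^17.
SUBFIELD_STAR = {EXP[17 * j] for j in range(15)}


def log_image_of_coset(i):
    """Logarithms of the elements of the multiplicative coset alpha^i * GF(2^4)^*."""
    return {LOG[gf_mul(EXP[i], s)] for s in SUBFIELD_STAR}


for i in range(3):
    print(i, sorted(log_image_of_coset(i)))
```

Every element of the coset indexed by $i$ has a logarithm congruent to $i$ modulo 17, so the image is the additive coset $i + 17\mathbb{Z}$ inside $\mathbb{Z}/255\mathbb{Z}$.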
In the end, when looking at the impact of cosets on symmetric primitives, we have one of the following situations:
1. the partition into cosets cannot be iterated since the input partition and output partition are over completely different structures (case of the logarithm);
2. although the S-Box and linear layer are defined over similar structures, a small function was added with the explicit purpose of breaking this similarity (case of the AES and the affine permutation used in its S-Box); or
3. the S-Box and the linear layer were chosen with aligned structures that preserve the same partition so as to purposefully introduce a backdoor in a block cipher (backdoored cipher of [BBF16]).
Kuznyechik seems to be in the second situation. While the designers did not disclose their security analysis, it would make sense for them to choose the polynomial used to define the finite field in which the linear layer operates so as not to "align" it with the structure used to construct $\pi$.
However, Streebog falls in neither category. The input and output partitions are defined over the same structure (the finite field) so it is not in the first situation. The S-Box could have been composed with an affine layer breaking its relationship with $\mathrm{GF}(2^8)$ (like in the AES) or the linear layer could have been defined over a different finite field (like in Kuznyechik), but neither is the case, so it does not fall in the second category either. Still, while the linear layer is defined over the same structure as the partitions preserved by the S-Box, these partitions are different and it is unclear how they may interact with the matrix multiplication. It is therefore not obvious that Streebog fits into the third category and the following question remains open.
Open Problem 1. Is there a way to leverage the partition-preserving property of $\pi$ to mount an attack against Streebog?

Conclusion
We have extracted a new structure from $\pi$ which we claim to be the one originally intended by its designers. Its generalization, the TKlog, is obtained by composing a discrete logarithm with a simple layer of arithmetic. The TKlog explains both previous decompositions of $\pi$, thus providing the missing link between these two results.
The knowledge of this decomposition allowed us to explain a very specific partition-preserving property of $\pi$. Surprisingly, we also found a new expression of the linear layer of Streebog expressed in the same finite field as $\pi$. While we cannot leverage these properties to attack this hash function, we question the wisdom of this design choice. Indeed, when dealing with components defined over identical mathematical structures, academic designers break this alignment, e.g. by composing their S-Boxes with unrelated affine permutations. We are of the opinion that it would have been more cautious to do the same in Streebog.

Algorithms
An algorithm evaluating a TKlog is provided in Algorithm 3 and one evaluating its inverse, a TKexp, is given in Algorithm 4.
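As an illustration, here is a minimal Python sketch of a TKlog evaluation for $m = 4$. It is not a transcription of Algorithm 3. It assumes the following piecewise form, with $\alpha$ a root of $X^8 \oplus X^4 \oplus X^3 \oplus X^2 \oplus 1$: the input 0 is mapped to $\kappa(0)$, $\alpha^{17j}$ is mapped to $\kappa(16 - j)$ for $1 \le j \le 15$, and $\alpha^{i + 17j}$ is mapped to $\kappa(16 - i) \oplus (\alpha^{17})^{s(j)}$ for $0 < i < 17$. The constants below (the affine function $\kappa$ and the permutation $s$) are the ones we assume for $\pi$; none of them is stated in this section, so they should be checked against the tables of the paper before being relied upon.

```python
POLY = 0x11D  # assumed field polynomial X^8 + X^4 + X^3 + X^2 + 1

# Assumed parameters for pi (not given in this section):
S_PERM = [0, 12, 9, 8, 7, 4, 14, 6, 5, 10, 2, 11, 1, 3, 13]  # permutation s of Z/15Z
KAPPA_LINEAR = [0x12, 0x26, 0x24, 0x30]  # images of 1, 2, 4, 8 under the linear part of kappa
KAPPA_CONST = 0xFC                       # kappa(0)


def gf_mul(a, b, poly=POLY):
    """Multiply a and b in GF(2^8) = F_2[X]/poly."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= poly
    return r


def kappa(x):
    """Affine function from 4 bits to 8 bits built from the assumed constants."""
    r = KAPPA_CONST
    for bit, image in enumerate(KAPPA_LINEAR):
        if (x >> bit) & 1:
            r ^= image
    return r


def tklog_table():
    """Look-up table of the TKlog defined by (alpha, kappa, s) for m = 4."""
    exp = [1] * 255
    for k in range(1, 255):
        exp[k] = gf_mul(exp[k - 1], 2)       # exp[k] = alpha^k with alpha = X
    log = {exp[k]: k for k in range(1, 255)}
    log[1] = 255                             # write 1 as alpha^255 = alpha^(17 * 15)
    table = [0] * 256
    table[0] = kappa(0)
    for x in range(1, 256):
        i, j = log[x] % 17, log[x] // 17
        if i == 0:
            table[x] = kappa(16 - j)         # x lies in the subfield GF(2^4)^*
        else:
            table[x] = kappa(16 - i) ^ exp[17 * S_PERM[j]]
    return table


table = tklog_table()
print([hex(v) for v in table[:6]])
```

Under these assumptions the table starts with 252, 238, 221, 17, 207, 110, which agrees with the first entries of the S-Box shared by Streebog and Kuznyechik; only a comparison with the full look-up table would confirm the remaining entries.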
Algorithm 3: A TKlog permutation.

As we can see in Figure 5, if we apply $\pi$ to all the elements of one of the vector spaces considered, we obtain 16 elements out of which 15 are in a single affine space. Furthermore, as we recalled in Equation (1), the inverse of the function appearing there maps the set $\{(x, 0), \forall x \in \mathbb{F}_2^4\}$ to the set $\{(0, y), \forall y \in \mathbb{F}_2^4\}$. We then have that each of these affine spaces is a translate of the one indexed by 0, and that each of these vector spaces is somehow related to the one indexed by 0 using a finite field multiplication. We also noticed that these sets had a special relationship with the matrix multiplication used in Streebog.
Let L be the 64 × 64 binary matrix used in Streebog and let $[x, 0, ..., 0] \times L = [x'_0, ..., x'_7]$. If $x$ takes all values in one of these vector spaces, then each $x'_j$ takes all values in another one of them. This property holds regardless of the position of $x$ in the initial vector. Since we knew from the work of Kazymyrov and Kazymyrova [KK13] that L is somehow related to an MDS matrix with coefficients in $\mathrm{GF}(2^8)$, we deduced that these sets had to have a particular relation with this field.
These observations, in combination with the fact that $\pi$ is somehow related to a logarithm [PU16], gave us the intuition that the vector spaces and the affine spaces were in fact respectively the multiplicative and additive cosets of a unique vector space of dimension 4, which we quickly identified as the subfield.
This intuition allowed us to write a first very crude decomposition of $\pi$ which we improved iteratively by re-writing its subcomponents in progressively simpler ways. The final result of this long and tedious process was the decomposition of $\pi$ as a TKlog, which we then generalized.
Figure 1: The decompositions of $\pi$ in the literature (one panel shows the log-based decomposition).

Figure 3: The differential and linear anomalies of random 8-bit TKlog instances, $\pi$, log FLY and log HN.

Figure 5: The propagation of particular vector spaces through $\pi$.

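Functions. Let $F : \mathbb{F}_2^n \to \mathbb{F}_2^n$ be a function. The Linear Approximations Table (LAT) or Walsh transform of $F$ is the $2^n \times 2^n$ matrix $\mathcal{W}_F$ such that $\mathcal{W}_F[a, b] = \sum_{x \in \mathbb{F}_2^n} (-1)^{a \cdot x \oplus b \cdot F(x)}$; the maximum of $|\mathcal{W}_F[a, b]|$ over $b \neq 0$ is the linearity of $F$. The Difference Distribution Table (DDT) of $F$ is the $2^n \times 2^n$ matrix $\mathcal{D}_F$ such that $\mathcal{D}_F[a, b] = \#\{x \in \mathbb{F}_2^n, F(x \oplus a) \oplus F(x) = b\}$.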

Table 1: The look-up table of the permutation $s$ of $\mathbb{Z}/15\mathbb{Z}$.
)} must contain a representative of each equivalence class modulo $2^m + 1$, as the contrary would imply that some elements $i + (2^m + 1)j$ with $i \neq 0$ could not be written in the required form.

Theorem (Decomposition of the TKlog). The permutation $T_{\kappa,s}$ of $\mathrm{GF}(2^{2m})$ has a TU-decomposition involving three $m$-bit permutations and an $m$-bit function. It is given in Algorithm 2. The specification of the subcomponents is given in the proof of this theorem.

[WBDY98] Hongjun Wu, Feng Bao, Robert H. Deng, and Qin-Zhong Ye. Cryptanalysis of Rijmen-Preneel trapdoor ciphers. In Kazuo Ohta and Dingyi Pei, editors, Advances in Cryptology - ASIACRYPT'98, volume 1514 of Lecture Notes in Computer Science, pages 126-132. Springer, Heidelberg, October 1998.

The S-Box $\pi$ was only specified via its look-up table. It is provided in Table 2. A function CCZ-equivalent to $\pi$ is obtained from $\pi$ via Corollary 3. Its look-up table is provided in Table 3.

Table 3: The look-up table of a function CCZ-equivalent to $\pi$.