Classification of Balanced Quadratic Functions

S-boxes, typically the only nonlinear part of a block cipher, are the heart of symmetric cryptographic primitives. They significantly impact the cryptographic strength and the implementation characteristics of an algorithm. Due to their simplicity, quadratic vectorial Boolean functions are preferred when efficient implementations for a variety of applications are of concern. Many characteristics of a function stay invariant under affine equivalence. So far, all 6-bit Boolean functions, 3and 4-bit permutations have been classified up to affine equivalence. At FSE 2017, Bozoliv et al. presented the first classification of 5-bit quadratic permutations. In this work, we propose an adaptation of their work resulting in a highly efficient algorithm to classify n × m functions for n ≥ m. Our algorithm enables for the first time a complete classification of 6-bit quadratic permutations as well as all balanced quadratic functions for n ≤ 6. These functions can be valuable for new cryptographic algorithm designs with efficient multi-party computation or side-channel analysis resistance as goal. In addition, we provide a second tool for finding decompositions of length two. We demonstrate its use by decomposing existing higher degree S-boxes and constructing new S-boxes with good cryptographic and implementation properties.


Introduction
For a variety of applications, such as multi-party computation, homomorphic encryption and zero-knowledge proofs, linear operations are considered to have minimal cost.Nonlinear operations on the other hand cause a rapid growth of implementation requirements.Therefore, it becomes important to create cryptographically strong algorithms with minimal nonlinear components.A recent study in this direction called MiMC [AGR + 16], which is based on some relatively old observations [NK95], uses the simple quadratic function x 3 in different fields as the only nonlinear block of the algorithm.Another work that minimizes the number of multiplications is the LowMC design [ARS + 15], where a quadratic 3-bit permutation is used as the only nonlinear component of a Substitution-Permutation-Network (SPN).
We also see the importance of minimizing the nonlinear components in the field of secure implementations against side-channel analysis.Efforts to decompose the S-boxes of existing algorithms, such as the DES and AES S-boxes, into a minimum number of lower degree nonlinear components (AND-gates, field multiplications or other quadratic or cubic functions), have produced more than a handful of papers.Some of these decomposition tools are generic and work heuristically [CGP + 12, RV13, CRV14, CPRR15, GR16, PV16] whereas others focus on enumerating decompositions of all permutations for a certain size [BNN + 12, KNP13].In general, they all make it clear that there is a significant advantage in considering side-channel security during the design process and hence using low degree nonlinear components.As a reaction to this line of research, a variety of novel symmetric-key designs use simply a quadratic permutation [ABB + 14, BDP + 14, BDP + 15, DEMS15].Examples include Keccak [BDPA13], one instance of which is the new hash function standard, and several candidates of the CAESAR competition.Generating strong, higher degree S-boxes using quadratic functions has also been shown useful in [BGG + 16].These works demonstrate the relevance of our research, which focuses on enumerating quadratic n × m functions for n < 7.
A valuable tool for the analysis of vectorial Boolean functions, which are typically used as S-boxes, is the concept of affine equivalence (AE).AE allows the entire space of n × m functions to be classified into groups with the same cryptographic properties.These properties include the algebraic degree, the differential uniformity and the linearity of both the function and its possible inverse in addition to multiplicative complexity.Moreover, the randomness cost of a first-order masked implementation is also invariant within a class if countermeasures such as threshold implementations are used [Bil15].With similar concerns in mind, our research relies on this affine equivalence classification.

Classification of (Vectorial) Boolean Functions
The classification of Boolean functions dates back to the fifties [Gol59].The equivalence classes for functions of up to five inputs were identified by 1972 [BW72] and Maiorana [Mai91] was the first to classify all 6-bit Boolean functions in 1991.Fuller [Ful03] confirmed in 2003 that this classification was complete.
For vectorial Boolean functions, only n-bit permutations for n ≤ 4 have been completely classified so far [Can07, Saa11, BNN + 15].Most of these classifications use the affine equivalence (AE) tool introduced by Biryukov et al. in [BCBP03].This algorithm computes a representative of the affine equivalence class for any n-bit permutation.In [Can07], De Cannière classifies all 4-bit permutations by transversing a graph of permutations connected by single transpositions and reducing them to their affine equivalence class representative.As this method is unpractical for larger dimensions (n > 4), no classification of the complete space of 5-bit bijective permutations exists.A classification of the APN classes (which have the best cryptographic properties) does exist by Brinkmann et al. [BL08].The authors build a tree of LUTs of 5-bit permutations, in which each level of the tree specifies one more output of the function.The tree is pruned using on the one hand an APN filter function and on the other hand an affine equivalence filter, which is also based on the algorithm of [BCBP03].The quadratic 5-bit permutations have been classified by Bozilov et al. [BBS17].Their approach consists of two stages: First, they generate an exhaustive list of 5-bit permutations from quadratic ANF's.Then, they use the affine equivalence algorithm of Biryukov et al. [BCBP03] to find the affine representatives of all the candidates in this list.Eliminating the doubles results in 75 quadratic classes.This approach uses the AE algorithm ≈ 2 23 times, resulting in a runtime of a couple of hours, using 16 threads.Again, extending this approach to higher dimensions is not feasible.
Vectorial Boolean functions from n to m < n bits have been used as S-boxes as well (e.g. the 6 × 4 DES S-boxes), yet their classification has been largely ignored.They are also used in the construction of larger 8-bit S-boxes by Boss et al.

Decompositions of Higher-Degree Functions
The authors of [BNN + 12, KNP13] decompose all 4-bit permutations in order to provide efficient implementations against side-channel analysis.The decompositions in both works benefit from the affine equivalence classification of permutations.The main difference between them is that [BNN + 12] only focuses on decompositions using quadratic and cubic components.It is shown that not all cubic 4-bit permutations can be composed from quadratics.This work has been extended in [KNP13], in which decomposition of all permutations is enabled by including additions and compositions with non-bijective quadratic functions.The decompositions provided in both these papers have been proven to have the smallest length with the given structure.A possible decomposition for all 6 × 4 DES S-boxes jointly using 4-bit permutations is also provided as an output of the aforementioned research [BKNN15].
A complementary work which decomposes a function into other quadratic and cubic functions is [CPRR15].This work starts from a randomly chosen low-degree function.They iteratively enlarge their set of functions using addition and composition.Finally, the generated set of functions is used to get a decomposition for a target function.This approach is not unlike the logic minimization technique of [BP10].The tool is heuristic and the decompositions provided do not necessarily have the smallest length.The theoretical lower bounds are not necessarily achieved for a randomly selected function decomposition for bigger sizes.However, it performs well for small functions.

Our Contribution
In this work, we explore the extension of Biryukov's AE algorithm to non-bijective n × m functions with m < n and analyse its performance.We propose an algorithm that does not only classify all n-bit permutations, but also all balanced n × m-bit functions for m ≤ n.Our complexity is significantly lower than that of previous algorithms known to date.This allows us to generate all quadratic vectorial Boolean functions with five inputs in merely six minutes, which makes the search for even 6-bit quadratic functions feasible.We also provide the cryptographic properties of these functions and their inverses if possible.
Our work focuses on quadratic functions, since they tend to have low area requirements in hardware, especially for masked implementations.We also introduce a tool for finding length-two quadratic decompositions of higher degree permutations and we use it to decompose the 5-bit AB and APN permutations.Furthermore, we find a set of high quality 5-bit permutations of degree 4 with small decomposition length that can be efficiently implemented.
Our list of quadratic 6-bit permutations is an important step towards decomposing the only known 6-bit APN permutation class as an alternative to [PUB16].

Preliminaries
We consider an n × m (vectorial) Boolean function F (x) = y from F n 2 to F m 2 .The bits of x and the coordinate functions of F are denoted by small letter subscripts, i.e. x = (x 0 , . . ., x n−1 ) where x i ∈ F 2 and F (x) = (f 0 (x), . . ., f m−1 (x)) where f i (x) is from F n 2 to F 2 .We use '•' to denote the composition of two or more functions, e.g.
We use |.| and • for absolute value and inner product respectively.

(Vectorial) Boolean Function Properties
In this paper, we focus on balanced vectorial Boolean functions F (x) = y, i.e. each output y ∈ F m 2 is equiprobable for all inputs x ∈ F n 2 .When n = m, F is thus bijective and typically called an n-bit permutation.
A Boolean function f : F n 2 → F 2 can be uniquely represented by its algebraic normal form (ANF) The algebraic degree of f is The algebraic degree of a function F = (f 0 , f 1 , . . ., f m−1 ) is simply the largest degree of its coordinate functions, i.e.Degr(F ) = max 0≤i<m Degr(f i ).
Definition 1 (Component [NK95]).The components of a vectorial Boolean function F are the nonzero linear combinations β • F of the coordinate functions of F , with β ∈ F m 2 \ {0}.
Definition 2 (DDT [BS90,Nyb93]).We define the Difference Distribution Table (DDT) δ F of F with its entries The differential uniformity Diff(F ) is the largest value in the DDT for α = 0, β: An n-bit permutation F is said to be almost perfect nonlinear (APN) if ∀α = 0, β ∈ F n 2 , the DDT element δ F (α, β) is equal to either 0 or 2. The DDT frequency distribution or differential spectrum ∆ F of F is a histogram of the elements occuring in the DDT: Definition 3 (LAT and Walsh Spectrum [O'C94, CV94]).We define the Linear Approximation Table (LAT) λ F of F with its entries The Walsh spectrum of a Boolean function f : A function's LAT is directly related to its two-dimensional Walsh transform F (α, β) = as follows: Any column in a function's LAT (λ F (α, β) for β fixed) is thus the scaled Walsh spectrum of a component of F .The linearity Lin(F ) is the largest value in the LAT for β = 0, α: An n-bit permutation F is said to be almost bent (AB) if ∀β = 0, α ∈ F n 2 , the LAT element λ F (α, β) is equal to either 0 or ±2 (n−1)/2 .It is known that all AB permutations are also APN.The LAT frequency distribution Λ F of F is a histogram of the absolute values occuring in the LAT: Remark 1.In some works, the linearity is expressed in terms of the Walsh spectrum instead of the LAT table as The two definitions differ by a factor of two, i.e.L(F ) = 2 • Lin(F ).

Affine Equivalence
Functions with algebraic degree 1 are called affine.We use them to define affine equivalence relations that classify the space of all n × m functions.
Definition 4 (Extended Affine Equivalence [CCZ98]).Two n × m functions F 1 (x) and F 2 (x) are extended affine equivalent if and only if there exists a pair of n-bit and mbit invertible affine permutations A and B and an n × m linear mapping L such that The algebraic degree and DDT and LAT frequency distributions are invariant over extended affine equivalence.
Definition 5 (Affine Equivalence [CCZ98]).Two n × m functions F 1 (x) and F 2 (x) are affine equivalent (F 1 ∼ F 2 ) if and only if there exists a pair of n-bit and m-bit invertible affine permutations A and B such that Clearly, affine equivalent functions are always extended affine equivalent but not vice versa.Note that the affine equivalence relation also covers linear equivalence, where A and B are linear permutations (i.e.A(0) = B(0) = 0).Moreover, also affine equivalence preserves algebraic degree and DDT and LAT frequency distributions.In the case of Boolean functions (m = 1), affine equivalence and extended affine equivalence are the same.
It is common practice to take the lexicographically smallest function in an affine equivalence class as the representative, which we denote by R.An efficient algorithm for finding the affine equivalent (AE) representative of any n-bit permutation S was proposed by Biryukov et al. in [BCBP03].In short, it computes the linear representatives of S(x ⊕ a) ⊕ b for all a, b ∈ F n 2 and chooses the lexicographically smallest among them as affine equivalent representative.Since, we rely on this algorithm and modify it according to our needs, we provide a detailed description below.

Finding the Representative of a Permutation
This recursive algorithm described in [BCBP03] finds for a given permutation S the smallest affine equivalent R = B −1 • S • A by guessing some of the output values of the affine permutations A and B and determining the others using the linearity property.Throughout the algorithm, the numbers n A and n B record logarithmically for how many input values the outputs of A and B have been defined.For example, A(x) is defined for all x < 2 n A −1 .It is possible to fix A(0) (resp.B(0)) at the beginning of the algorithm which implies n A (resp.n B ) being initialized to 1.The number of defined values for R(x) is N R , i.e.R(x) will be defined for all x < N R .
The computation starts with x = y = 0 from the ForwardSweep described in Algorithm 1, which serves as the outer loop of the algorithm.The ForwardSweep enumerates all inputs x for which affine transformation A(x) has already been defined and determines the representative output y = R(x).Either there already exists an output y such that S • A(x) = B(y) or we choose y as the next smallest unused power of 2. When the ForwardSweep is complete, we continue with the BackwardSweep in Algorithm 2. Note that when n A = 0 (the very first iteration), there are no inputs to enumerate yet and the computation actually starts with a BackwardSweep.
At the start of Algorithm 2, x is typically a power of 2 which means A(x) cannot be determined from linear combinations and can be chosen freely.If the BackwardSweep is successful (i.e. it finds a suitable A(x) such that S • A(x) = B • R(x)), we recurse on the ForwardSweep.If the BackwardSweep fails, we need to guess A(x).This is for example the case in the very first iteration when n B = 0.
The Guess function is described by Algorithm 3. It fixes R(x) using Algorithm 4 to the smallest unused y and then loops over all available assignments of A(x).For each guess, we try recursion on the ForwardSweep.We need to try all because any guess can result in a lexicographically smaller representative R.
Algorithm 4 builds the representative R and only changes previously determined outputs if they are smaller than the current one.This whole procedure of finding the representative of an n-bit permutation is exemplified in Figure 1 for clarification.Note that even though the S-box we use and the one in [BCBP03] are the same, the representative we obtain is different since we focus on the lexicographically smallest one by assigning, for example, R(0) = 0.Moreover, for the same reason, the representative on the right side of Figure 1 is favored over the left side.

Finding the Representative of a Non-invertible Function
It has been suggested in [BCBP03] that the algorithm in Section 2.3 can be extended to find representatives for non-bijective functions S : F n 2 → F m 2 , but that this is only efficient when n − m is small.When S is not invertible (but still balanced), instead of one single solution to the equation S(x) = y, there are 2 n−m possible x candidates for each y.The additional complexity of enumerating these candidates during BackwardSweep grows larger as m decreases.Therefore, in [BCBP03] the total complexity of finding the affine representative for an n × m function where n > m is estimated as: In what follows, we describe an extension of the algorithm in Section 2.3 which has a non-monotonous complexity behavior as m decreases as can be observed in Figure 3.Note that Figure 3 depicts experimental runtimes whereas Figure 2 depicts an asymptotic complexity estimation.Their scales are thus very different and should not be compared in magnitude.Instead, we are considering only the difference in trends.For m = n, the algorithm is identical to [BCBP03].The runtimes are calculated using a random selection of 500 5 × m functions for each m.Note that since no pseudo-code is provided in [BCBP03]  and the description is very brief, we can not conclude whether this is due to a complexity estimation error or having a slightly different algorithm.Moreover, the real runtimes might approximate the asymptotic complexity better for n → ∞.
One of the changes caused by the non-invertability of S is that we can no longer compute the inverse S −1 and thus we cannot obtain x in Algorithm 2. We propose Algorithm 5 as an alternative in which we loop over all possible x for which S • A(x ) = B(y).
Another difference is in the assignment of y which is the smallest element in F m 2 that does not yet have a corresponding input x such that R(x) = y.Note that y decides the representative output R(x) in the BackwardSweep and Guess runs.The representative R of a balanced function S has the same output distribution as S, which implies each y = R(x) can only occur once in a bijective permutation.This is why Algorithm 2 immediately increments y after using it.In a non-bijective function on the other hand, y can be reused 2 n−m times.Algorithm 5 therefore does not immediately increase y after each BackwardSweep but only when it runs out of candidates x for which S • A(x ) = B(y).The complete procedure for finding the representative of a balanced non-injective function is illustrated in Figure 4.
This second feature actually makes the new algorithm very efficient in finding the smallest representative when n−m is not too large.Instead of guessing A(x), which implies a loop over approximately 2 n guesses, now the list of 2 n−m candidates x immediately gives us the guesses A(x ) that result in the smallest output value R(x).The more often we can reuse an output value y, the less often we need to guess.This can also be observed by comparing the examples in Figure 1 and 4. As a result, the algorithm to find a representative becomes more efficient for n × m functions with m < n.If m becomes very small, the complexity increases again since the enumeration of 2 n−m candidates, which is used also in [BCBP03], becomes the dominant factor.That the complexity first decreases and then increases with m corresponds to our initial observation in Figure 3.In this section, we first describe how all 5×m balanced quadratic functions can be classified iteratively using our algorithm.Even though all 5-bit quadratic Boolean functions and permutations have already been classified in [BW72] and [BBS17] respectively, this is the first time such an analysis is performed for m / ∈ {1, 5}.Moreover, we introduce novel optimizations using the (non-)linearity of the components to perform this classification much faster.We then compare the performances of finding all quadratic permutations using the method in [BBS17] with ours.

Naive Iteration
There exist 2 15 different 5-bit quadratic Boolean functions.Since we target balanced functions, we consider only the balanced 18 259 out of 2 15 as candidate coordinate functions f i : F 5 2 → F 2 .In iterative stages for m = 1 to 5, we systematically augment all balanced 5 × (m − 1) functions with these 18 259 candidates to form a set of 5 × m functions.We then use the adapted AE algorithm to reduce these functions to their affine equivalent representative.This reduction step is the key feature of the classification algorithm, since it not only provides us with all 5 × m representatives, but also significantly lowers the workload of the next stage.The search procedure is described by Algorithm 6.Table 1 shows the number of representatives we obtain for m = 1, . . ., 51 .Our results for m ∈ {1, 5} align with those from previous works and require 50 minutes of computation time, using 4 threads on a Linux machine with an Intel Core i5-6500 processor at 3.20GHz.The comparison of this timing alone with a couple of hours, using 16 threads given in [BBS17] shows the impact of using an iterative approach, made possible by the new AE algorithm of Section 3. Nevertheless, in this section we will describe two ways to further optimize the complexity.

Impact of the Order of Coordinate Functions on Efficiency
Optimizing our classification algorithm comes down to reducing as much as possible the number of functions to which the AE algorithm must be applied.Ideally, given two intermediate functions F 1 and F 2 which are affine equivalent (F 1 ∼ F 2 ), we only want to find the representative of one of them.Affine equivalence is not so easily detected in all cases.However, we can focus on a simpler case.If F 1 and F 2 have the same coordinate functions, but in a different order, then they are naturally affine equivalent.Hence, we will try to fix the order of the coordinate functions of each intermediate function F and in that way eliminate all functions that are equal to F up to a reordering of the coordinates.To do this, we need a property to base the ordering on.Consider the three Boolean quadratic function classes Q (5,1) 0 , Q (5,1) 1 and Q (5,1) 2 for which representative ANF's and nonzero Walsh coefficient distributions are provided in Table 2.
where ĝi (ω) is the Walsh spectrum of g i (x).
Since this property is unique for each class of Boolean functions, we will use it to fix the order of coordinate functions during the classification algorithm.Using Lemma 1, in each intermediate step m, we will only allow new coordinate functions f m for which the linearity is not smaller than the linearity of f m−1 .

Impact of Linear Components on Efficiency
The runtime for finding the affine representative of a function depends on the accuracy of guesses.That is, the algorithm searches the smallest representative for each guess of A(x).As a result, we notice that, the more nonlinear components the function has, the more dead ends the algorithm encounters and the more quickly it finishes.On the other hand, the more linear components the function exhibits, the more valid solutions for the affine transforms and thus the longer the algorithm needs to search through them.Therefore, the algorithm for finding the linear representative becomes less efficient as the number of linear components of the function increases.
In order to illustrate this significant difference, we choose five 5-bit representatives with a different number of linear components.We use the same class enumeration as in [BBS17] and represent the i th quadratic permutation with Q (5,5) i .From each class, we randomly choose 100 permutations and observe the average runtime of the AE algorithm.The results of this experiment are shown in Table 3.

Table 3: Average runtimes of the AE algorithm [BCBP03] for some 5-bit representatives
Class # Linear Components Av.Runtime (s.) Moreover, we further analyze the runtime of the AE algorithm by removing coordinates to derive n × m functions with less linear components.The result is illustrated in Figure 5.
We introduce the following definition of a linear extension in order to define our optimization for the classification.
Any balanced n × m function can be linearly extended with n − m linear components into a balanced n-bit permutation.Correspondingly, each balanced n-bit permutation with 2 n−m − 1 linear components can be generated as a linear extension of some balanced n × m function with zero linear components.We therefore initially eliminate all linear coordinate functions from our search, generating 5 × m functions with only nonlinear coordinates in each step.In the very last stage, we obtain a list of 5-bit bijections without linear components.Finally, we add to this list all the linear extensions of the 5 × m representatives found so far (for m = 1, . . ., 4) to also obtain the 5-bit bijections with 2 n−m − 1 linear components.This optimization increases the efficiency of the search in three ways.Firstly, it reduces the number of f i candidates inserted in each stage (|F| ).Secondly, it discards functions for which finding the AE representative is slow.Finally, it reduces the number of n × m representatives that each stage starts from (|R| ).

Performance Comparison
Table 4 summarizes the results of the optimized search that takes a mere 6 minutes, using 4 threads.This significant increase of performance enables us to classify for the first time also all 6-bit functions, which is described in Section 6.Note that the first column of Table 4 (m = 0) corresponds to the classes of affine 5 × i functions.The last column holds the total number of 5 × i functions, which is the sum of each row and corresponds to Table 1.
Each column starts with the number of "purely nonlinear" 5 × m representatives (only nonlinear coordinates).The rows below the diagonal hold the number of classes that result from linearly extending the classes in previous rows.The last row shows the number of linear components in the corresponding 5 × 5 bijections and is equal to 2 5−m − 1.We find 22 quadratic 5-bit equivalence classes without any linear components.Adding to this the linear extensions of smaller functions, we obtain all the 75 quadratic and the one affine 5-bit representatives.
Note that the number of classes obtained from linearly extending all 5 × m functions can be much smaller than the number of 5 × m classes itself (for example 23 93 for m = 4).This can be explained by the fact that linearly extending two extended affine but not affine equivalent functions can result in affine equivalent permutations (i.e. a collision in the linear extension).Consider for example the following two 5 × 3 functions that are It is straightforward to verify that linearly extending both functions with coordinate functions x 2 and x 3 results in two affine equivalent 5-bit permutations.

Decomposing and Generating Higher Degree Permutations
We now adapt our algorithm to (de)compose higher degree functions into/from quadratics.This leads to area efficient implementations especially in the context of side-channel countermeasures and masking, where the area grows exponentially with the degree of a function.Below we describe length-two decompositions and constructions of cryptographically interesting permutations.

Length-two Decomposition
We are trying to decompose a higher degree function H : If a quadratic decomposition of length two exists, then we can state that Suppose we fix R 2 (to one of the known n-bit representatives) and we want to find the representative R 1 and the affine permutation A such that (1) As with the classification of Boolean functions, we perform this search iteratively, starting from n × 1 Boolean functions f for which f • R 2 : F n 2 → F 2 is extended affine equivalent to a component function of H.We thus select the candidates for f using the following criteria: Starting from this list of candidates, we proceed in a similar manner as in Algorithm 6.We only slightly have to tweak this algorithm to obtain the decomposition Algorithm 7.Each time we augment an n × (m − 1) function with one of the Boolean function candidates Algorithm 7: decompositions of length two for all quadratic n-bit representatives R 2 do Initialize R = {0}, S = ∅ and m = 1; F ← all quadratic Boolean functions f satisfying above criteria (C1-C3); The quadratic function classification algorithm (Algorithm 6) is very efficient because it reduces the lists of intermediate functions to their affine equivalent representatives at each step m.However, we cannot do that in this case as this would change the affine transformation A in the decomposition (see Eqn. (1)).Let S 1 = B • R 1 • A be a candidate for which S 1 • R 2 ∼ H and let R 1 be its affine representative.Reducing S 1 to R 1 would discard the affine transformation A. In that case, we would only be able to decompose functions that are affine equivalent to the composition of two representatives: R 1 • R 2 .In other words, if there is another candidate S 1 affine equivalent to S 1 , we do not want to discard it as it will not necessarily result in affine equivalent compositions.
However, without any reductions in the intermediate steps of the algorithm, the search becomes very inefficient as the list of candidate functions grows exponentially.There is still a redundancy in our search because of the affine output transformation B that is included in S 1 .If S 1 is only left affine equivalent to S 1 , then their compositions are affine equivalent: We therefore adapt the AE algorithm to find the lexicographically smallest function R L 1 that is left affine equivalent to S 1 : R L the left affine representative of S 1 .The algorithm to find R L 1 is identical to finding the affine equivalent representative with the input affine transformation constrained to the identity function.This constraint removes the need for guesses and makes the algorithm very efficient.An example is shown in Figure 6.Algorithm 7 summarizes the resulting decomposition method.a 0 1 2 3 4 5 6 7 8 9 A B C D E F S(a) 1 B 9 C D 6 F 3 E 8 7 4 A 2 5 0 x 0 1 2 3 4 5 6 7 8 9 A B C D E F R L (x) 0 1 2 4 8 5 7

Almost Bent Permutations
There are five APN 5 × 5 permutations up to affine equivalence [BL08], all corresponding to a power map over F 2 5 .Two of these are quadratic and AB and correspond to classes Q (5,5) 74 (∼ x 3 ) and Q (5,5) 75 (∼ x 5 ).We demonstrate Algorithm 7 by decomposing the inverses of these classes (resp.x 11 and x 7 ), which are cubic and (naturally) also AB.The algorithm is very efficient in this case because the properties of AB's are so well defined: Firstly, all components of the cubic AB's have the same algebraic degree (=3).We also know that the DDT of an AB function contains only zeros and twos and its LAT contains only zeros and elements with absolute value 4. It immediately follows that also the Walsh transform of each coordinate function of the AB is equal to either 0 or ±8.
Moreover, when we look at all 5 × m subfunctions H , there is only one permitted differential spectrum and LAT frequency distribution for each m.It is indeed known that all coordinate functions of the AB are (extended) affine equivalent.
We enumerate all 75 candidates for R 2 and perform the search for S 1 = R 1 • A using Algorithm 7. When R 2 is the representative of classes Q (5,5) 1 to Q (5,5) 74 , the algorithm finds no 5-bit bijections that compose with R 2 to a cubic AB.The search only ends with non-empty R when we perform it with R 2 the representative of Q (5,5) 75 , which is itself the quadratic AB permutation x 5 .The resulting R 1 is equal to R 2 and their composition forms the AB class that holds the inverse of Q (5,5) 75 (corresponding to power map x 7 ).This decomposition is easily verified using power maps.Indeed, x 5 • x 5 • x 5 = x 125 = x 1 mod 31 .
Without the constraint that the AB needs to be cubic, we also find a decomposition for class Q (5,5) 75 74 .A length-two decomposition for the odd cubic AB permutations (x 11 ) is not found.Since the algorithm is exhaustive, this means it does not exist.Indeed, it is shown in [NNR19] that the shortest decomposition of x 11 has length three.

The Keccak χ Inverse
The nonlinear transformation χ used in the Keccak [BDPA13] sponge function family χ (Figure 7) is a quadratic 5-bit permutation from class Q (5,5) 68 with a cubic inverse.For the possibility of implementing an algorithm using χ −1 , we decompose this cubic permutation (see Table 6).The Keccak inverse does not have the same strong properties as the AB permutations.Each coordinate function is still cubic but the differential and linear properties are naturally weaker.Firstly, apart from zeros we find both ±4 and ±8 in the LAT.For the DDT, there are multiple differential spectra for the intermediate 5 × m sub functions.As explained above, we generate the list of possible DDT and LAT frequency distributions for each m = 1, . . ., 4 and feed this as input to the search algorithm.We filter out all intermediate functions F : F n 2 → F m 2 for which the DDT and LAT frequency distributions of F • R 2 do not occur in this list.
While the search finds many classes with the same cryptographic properties as the Keccak inverse, a decomposition of length 2 for χ −1 itself does not appear to exist.

Towards Higher-Degree Permutations
When it comes to choosing a nonlinear permutation for use in a cryptographic primitive, the designer will sooner go to those with higher degree as they provide more resilience against higher-order differential and algebraic attacks [DEM15].With masked implementations in mind, we thus want to find strong n-bit permutations with high algebraic degree for which a decomposition into quadratic blocks exists.Our decomposition algorithm can be used for this purpose.If instead of searching for specific functions H, we define a set of more general but strong criteria, we can use the algorithm 7 to generate a list of favorable permutations.In particular, we use the following criteria to perform a search for 5-bit permutations S with optimal algebraic degree and near-optimal cryptographic properties: -S is balanced -Degr(S) = 4 -Lin(S) ≤ 6 -Diff(S) ≤ 4 The first three criteria are easily translated for intermediate 5 × m functions (the bound on the LAT table stays the same).As we are not looking for a known class, this is more difficult for the bound on the DDT.We use the fact that the upperbound on the values in the DDT at most doubles every time we discard one output bit (see Theorem 1).This upperbound is not tight, but can be used to filter some of the unusable intermediate functions F .
Our search delivers 17 quartic affine equivalence classes with very good cryptographic properties, shown in Table 7.One of those is the APN class, which contains the permutation formed by the inversion x −1 in F 2 5 .Indeed, it was shown in [NNR19] that this permutation has decomposition length two.Each of these very strong 5-bit S-boxes have an efficient masked implementation, as they can be decomposed into only two quadratic components.This list is only a sample of the functions that can be found using this method.

Classifying × m Quadratic Functions
The efficiency of Algorithm 6 makes it feasible to extend the search for quadratic permutations to n = 6 bits for the first time.There are 2 21 different 6-bit quadratic Boolean functions, of which there are 914 004 nonlinear balanced ones.This is our list F of candidate coordinate functions f i : F 6 2 → F 2 .Generating all classes of 6 × m functions for m < 6 without linear components takes 8.5 hours on 24 cores.The total number of classes found for each m is shown in Table 8 2 .Tables 9 to 12 show histograms of the New cryptographic algorithms should be designed with resistance against side-channel attacks in mind.When it comes to choosing S-boxes, designers can use our classification to pick quadratic components and use our (de)composition tool to create cryptographically strong S-boxes with efficient masked implementations.After the classifications of 4and 5-bit permutations in previous works, this work expands the knowledge base on both classification and decomposition, bringing us one step closer to classifying 8-bit functions and decomposing the AES S-box using permutations instead of tower field or square-and-multiply approaches.

Figure 1 :
Figure 1: Example of finding the linear representative for a 4-bit bijective S

Figure 2
Figure 2 depicts how the predicted complexity (for fixed n = 5) increases monotonously as m decreases.In what follows, we describe an extension of the algorithm in Section 2.3 which has a non-monotonous complexity behavior as m decreases as can be observed in Figure3.Note that Figure3depicts experimental runtimes whereas Figure2depicts an asymptotic complexity estimation.Their scales are thus very different and should not be compared in magnitude.Instead, we are considering only the difference in trends.For m = n, the algorithm is identical to[BCBP03].The runtimes are calculated using a random selection of 500 5 × m functions for each m.Note that since no pseudo-code is provided in[BCBP03]

Algorithm 6 :
Generate Quadratic Functions Initialize R = {0}, S = ∅ and m = 1; Let F contain all balanced quadratic Boolean functions; while m < 5 do for all S = (S 1 , . . ., S m−1 ) ∈ R do for all candidates f ∈ F do if S = (S, f ) is balanced then S ← S ∪ {S }; end end end R ← ∅; for all S ∈ S do Find affine equivalent representative R of S; R ← R ∪ R; end Sort and eliminate doubles from R. S ← ∅; m ← m + 1; end

Figure 5 :
Figure 5: Actual runtimes observed for some 5-bit functions

Figure 6 :
Figure 6: Example for finding the left representative R L of S. Input transformation A is fixed to the identity function: A(x) = x

Figure 8 :
Figure 8: Number of quadratic n × n classes for growing n

Table 4 :
Number of affine equivalence classes for 5 × i functions for i = 1, . . ., 5 with 2 f ) is balanced and (∆ S •R2 , Λ S •R2 ) ∈ D then × m function F , we verify that the DDT and LAT of F • R 2 have the same frequency distributions as those of some function H : F n 2 → F m 2 , that has as its coordinate functions a subset of the components of H. Let L (F n 2 → F m 2 ) be the set of all balanced linear mappings from F n 2 to F m 2 .Then, we can describe H as L • H for someL ∈ L (F n 2 → F m 2 ).In order to eliminate false candidates as early as possible, we verify for each intermediate n × m candidate F if the composition F • R 2 can be affine equivalent to some subfunction H .In particular, we check that

Table 5 :
Look-up-tables for the even cubic AB function F and its decompositionF = S 1 •R 2 with S 1 ∼ R 2