Provable Security of SP Networks with Partial Non-Linear Layers

Motivated by the recent trend towards low multiplicative complexity blockciphers (e.g., Zorro, CHES 2013; LowMC, EUROCRYPT 2015; HADES, EUROCRYPT 2020; MALICIOUS, CRYPTO 2020), we study their underlying structure, partial SPNs, i.e., Substitution-Permutation Networks (SPNs) with parts of the substitution layer replaced by an identity mapping, and put forward the first provable security analysis for such partial SPNs built upon dedicated linear layers. For different instances of partial SPNs using MDS linear layers, we establish strong pseudorandom security as well as practical provable security against impossible differential attacks. By extending the well-established MDS code-based idea, we also propose the first principled design of linear layers that ensures optimal differential propagation. Our results formally confirm the conjecture that partial SPNs achieve the same security as normal SPNs while consuming less non-linearity, in a well-established


Introduction
Blockciphers are one of the most prominently used cryptographic primitives. The classical approaches to the design of blockciphers include Feistel networks and substitution-permutation networks (SPNs), with DES and AES as well-known examples. A Feistel round applies a domain-preserving function (sometimes non-invertible, as in DES) on half of the data, and then executes XOR and swap operations, see Fig. 1 (a). This can be generalized along multiple axes, e.g., employing other group operations instead of XOR, employing contracting or expanding round functions, and employing more than 2 data chunks to constitute the so-called multi-line Type-II generalized Feistel networks (see Fig. 1 (b)). An SPN round, on the other hand, consists of the parallel application of many instances of a small S-box on the "full" data (divided equally into many chunks), composed with a (typically linear) transformation $T$, see Fig. 1 (c). In fact, as stated in [KL15, Chapter 6.2], the design of modern blockciphers is dominated by (generalized) Feistel networks (including the Lai-Massey structure of the cipher IDEA [LM91], which may be viewed as a sophisticated variant of Feistel [YPL11]), and SP networks.

Our Contribution
We provide systematic analyses of partial SPNs, regarding strong pseudorandom permutation (SPRP) security, provable security against Impossible Differential (ID) and Zero-Correlation linear (ZC) attacks, and diffusion. Our results are as follows.
• In Sect. 3, we prove that a 5-round P-SPN with rate 1/2 is an SPRP, where the cost of $5w/2$ S-box calls is less than that of a normal (linear) SPN ($3w$ calls [DKS+17, CDK+18]). This P-SPN construction relies on an MDS linear layer that fulfills some additional requirements.
• In Sect. 4, we show that 4-round P-SPNs with rate at least 3/4 and MDS linear layers are secure against ID and ZC attacks. This saves one round compared to the AES-like structure, which needs 5 rounds for the same security [SLG+16].
• For P-SPNs with rate $r < 1/2$, $r^{-1} \in \mathbb{N}$, we propose the first principled linear layers constructed from MDS codes. Our proposal consists of $r^{-1} - 1$ different transformations and achieves a minimum security criterion, i.e., there is no $r^{-1}$-round differential with probability one. See Sect. 5 for details.
In all, our results (and the comparisons to existing results on AES-like SPNs) justify the soundness of P-SPNs: as approaches to constructing efficient blockciphers, P-SPNs could be comparable to, or even surpass, normal SPNs in some well-defined sense. We elaborate below.

Small-box cryptography, and SPRP security with rate 1/2
With a model recently put forward by Dodis et al. [DKS+17, CDK+18], i.e., modeling the S-boxes as small ideal primitives and the linear layers as efficient functions, it becomes possible to study the security of P-SPNs from a theoretical point of view. The S-boxes act as the only source of cryptographic hardness. This methodology was termed "small-box cryptography" by Dodis [Dod18], to highlight the deviation from the classical practice-oriented provable security based on large-domain primitives (e.g., based on the AES). In the past decades, various structures, including the standard SPNs [IK01, MV15, DSSL16] and the multi-line generalized Feistel networks (GFNs) [ZMI90, IK01, MV00, SM10, HR10, BFMT16], have been studied in this model, enabling comparisons. In light of this, assuming that each round calls a public random $n$-bit permutation as the S-box and a strong linear layer, we prove that a 5-round rate 1/2 P-SPN is a strong pseudorandom permutation (SPRP) up to $2^{n/2}$ queries, i.e., the classical birthday-bound security (like the Luby-Rackoff result [LR88]). To ensure this result, the linear layer must achieve stronger diffusion than a general MDS transformation, which matches intuition.
Our SPRP results on P-SPNs not only support their reliability (they are as reliable as the more common SPNs and GFNs), but also enable comparisons. For clarity, we list known wide SPRP constructions in Table 1. Here we focus on the so-called "linear structures" of Nandi [Nan15], in which block functions/S-boxes constitute the only source of non-linearity. It is easy to make a fair comparison between linear structures: relative multiplicative complexity (MC) is reflected by the total number of S-boxes, while relative AND Depth is reflected by the maximum number of S-boxes on any path from an input data chunk to an output chunk, see Table 1. Note that the classical blockcipher structures, GFNs and linear SPNs, are all linear structures. On the other hand, the structures CMC, EME and EME* were designed as wide SPRP encryption modes rather than blockcipher structures; indeed, CMC is sequential, as indicated by its huge AND Depth. Regarding classical blockcipher structures, the relative MC $5w/2$ of rate 1/2 P-SPNs is less than that of the normal linear SPN (which is $3w$), and this confirms the conjecture of less non-linearity. Also, rate 1/2 P-SPNs outperform the best GFNs definitively.
Table 1: Comparison to existing wide SPRP structures. The Rounds column presents the number of rounds sufficient for birthday-bound security, where $\lambda(w) = \lceil \log_2 1.44w \rceil$. For Type-II GFN (i.e., GFNs with $w/2$ block functions per round, see Fig. 1 (b)), note that $2\lambda(w) = 2\lceil \log_2 1.44w \rceil \ge 6$ when $w \ge 4$. Parameters in the MC and AND Depth columns are relative w.r.t. the S-box. The mode XLS [RR07] is not included due to attacks [Nan14, Nan15].
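The S-box counts quoted above can be tabulated with a few lines of Python. This is only a sketch: it assumes $\lambda(w) = \lceil \log_2 1.44w \rceil$ (the ceiling is our reading of the caption), and the function names `lam` and `relative_mc` are hypothetical helpers introduced for illustration.

```python
import math

def lam(w: int) -> int:
    # Assumed reading of the Table 1 caption: lambda(w) = ceil(log2(1.44 * w)).
    return math.ceil(math.log2(1.44 * w))

def relative_mc(w: int) -> dict:
    """Total number of S-box / block-function calls for an even width w."""
    return {
        "P-SPN, rate 1/2, 5 rounds": 5 * w // 2,
        "linear SPN, 3 rounds": 3 * w,
        # Type-II GFN: w/2 block functions per round, 2*lambda(w) rounds.
        "Type-II GFN, 2*lam(w) rounds": (w // 2) * 2 * lam(w),
    }

for w in (4, 8, 16):
    print(w, relative_mc(w))
```

For every even width the rate 1/2 P-SPN count $5w/2$ stays below the $3w$ of the 3-round linear SPN, and the gap to the Type-II GFN grows with $w$ because of the logarithmic round count.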
In detail, ID attacks were introduced in the 1990s [BBS99], and exhibit differentials with probability 0 to distinguish the cipher from random. ZC attacks were introduced in 2011 [BR14, BW12], and leverage linear hulls with correlation zero for distinguishing. Both have become major cryptanalysis techniques. For a dedicated iterated blockcipher, there always exist IDs and ZCs for any number of rounds with some keys. However, being effective only for a small set of weak keys, such distinguishers are useless. To remedy this and to retain generality, we follow [SLR+15, SLG+16] and concentrate on truncated IDs and ZCs on P-SPN structures $\mathcal{E}^{\text{P-SPN}}$. Such models capture IDs and ZCs that are independent of the secret keys as well as the concrete S-boxes, and we refer to Sect. 2.4 for formal definitions.
As a result, we prove that for P-SPN structures $\mathcal{E}^{\text{P-SPN}}$ with MDS linear layers and rate at least 3/4, there do not exist 4-round truncated ID distinguishers. In other words, no 4-round impossible differential exists in such P-SPNs unless the details of the S-boxes are taken into account. As a complement, we also show that 3-round IDs always exist as long as the rate is less than 1, thus 4 rounds is optimal. By the links between cryptanalytic techniques, security against ZC attacks is also established. These results give insight into the longest possible ID and ZC distinguishers on P-SPNs.
In [BDD+15, Sect. 6.2], it was conjectured that trading the amount of non-linearity for stronger linear layers mitigates "structural attacks", which in that particular context refers to ID, zero-correlation linear, and integral attacks. This is confirmed by our results, since 4-round AES-like structures do admit ID distinguishers. For AES-like structures, provable security against generic IDs is only achieved with ≥ 5 rounds [SLG+16], which is one more round than rate 3/4 MDS-based P-SPNs.
On the other hand, we stress that this does not mean P-SPNs are stronger than SPNs in general. Indeed, AES-like SPNs use composed linear layers that are much weaker than huge MDS transformations, and if the latter are adopted, [SLG+16, Theorem 2] implies that even 3 rounds are already sufficient for generic ID security. Still, it could indeed be beneficial to use stronger linear layers and fewer S-boxes.
We also attempted to obtain better provable bounds against differential and linear attacks. Yet, our conclusions are mostly negative, which reflects the difficulty of establishing such bounds by pencil and paper. This is in accordance with [BDD+15], in which an automated searching tool was developed for provable differential bounds. For the sake of space, we include these results in Appendix D (we do not view them as our main results).

Linear layers for small rate P-SPNs
Another important question is whether the P-SPN approach could be pushed towards low rates and what would be the corresponding design guideline for the linear layer(s). Typically, the design principle of linear layers is to ensure a maximum number of active S-boxes in differential/linear characteristics. It has been known that an MDS transformation $M$ achieves this in normal SPNs: the idea is to connect $\{x \,\|\, M \cdot x\}_{x \in \{0,1\}^{wn},\, x \neq 0}$ with a set of MDS codewords. However, this idea can only ensure properties of the differences in two consecutive rounds. For a P-SPN with rate $r$, $r^{-1} \in \mathbb{N}$, this appears insufficient: it has been noticed that for $r^{-1} - 1$ rounds, there always exist differential paths with probability 1 [BDD+15]. Hence, the very least requirement for a good linear layer is to ensure that no $r^{-1}$-round probability-1 differential path exists. But this requires addressing dependencies between the differences in $r^{-1}$ consecutive rounds, which seems quite intricate. Moreover, classical blockciphers typically employ the same linear layer in all rounds, and it is extremely difficult to identify a linear layer that ensures such complicated properties. Due to this gap, LowMC employed "independent random linear layers" to simplify the security analysis, and the designers have left dedicated linear layers with a solid theoretical foundation as an open problem [ARS+15, Conclusion].
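We address this question. Our idea is a natural extension of the above MDS idea: for rate $r$, we construct $r^{-1} - 1$ linear transformations $T_{M_1}, \ldots, T_{M_{r^{-1}-1}}$ such that the set of vectors $x \,\|\, \ldots \,\|\, \big(\prod_{i=r^{-1}-1}^{1} T_{M_i}\big) \cdot x$ is linked to a long MDS code. The MDS property ensures at least $(r^{-1} - 1)w + 1$ active chunks in $x \,\|\, \ldots \,\|\, \big(\prod_{i=r^{-1}-1}^{1} T_{M_i}\big) \cdot x$, which implies at least 1 active S-box in $r^{-1}$ rounds.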
Of course, the above proposals need refinement as well as more validation before being used in real blockciphers. Still, it is important to make this first step. In addition, this shows that, instead of using independent linear layers to simplify the situation, we can indeed use dependent ones and leverage the dependence for the security arguments. Further improved designs probably require optimized search algorithms or heavy coding-theoretic tools.

Related Work
A concurrent and independent work of Grassi, Rechberger, and Schofnegger (GRS) [GRS20] exhibited conditions on P-SPN linear layers that are sufficient and necessary for the existence of iterative subspace trails with probability 1. These in particular include truncated differential trails, which creates a strong resemblance between the GRS linear layers and ours. While both results imply the non-existence of "obvious" differential attacks on infinite rounds, we remark that regarding differential trails with no active S-boxes, our linear layers ensure stronger security than GRS, since
non-existence of $r^{-1}$-round probability-1 differential paths (our security goal)
$\Longrightarrow$ non-existence of infinite probability-1 differential paths
$\Longrightarrow$ non-existence of iterative probability-1 differential paths (one of GRS's goals).
Actually we identify sufficient conditions for the best possible differential security within 1/r rounds, which might be the first step towards lower bounds on the number of active S-boxes. In addition, we also provide a solid approach towards constructing a series of linear layers with desirable properties.
The advantages of GRS's work are as follows.
• First, GRS's proposal uses only a single linear permutation $T \in \mathbb{F}^{w \times w}$ that provides full diffusion after a finite number of rounds. This is simpler than our $r^{-1} - 1$ transformations. In particular, they showed that the MDS property is not needed for their goals.
• Second, GRS also studied preventing iterative truncated differentials with active S-boxes, which is an important issue not addressed by us.
In summary, the results of GRS and ours are somewhat incompatible and complementary. We are currently unable to extend our treatment to (the more practical case with) more than $r^{-1}$ rounds, while the GRS result does not ensure lower bounds on the number of active S-boxes. Both results could be starting points for future work.

Organization
We establish notations and models in Sect. 2. Then in Sect. 3, we study the SPRP security of rate 1/2 P-SPNs; in Sect. 4, we study the security of P-SPNs against generic IDs and ZCs; in Sect. 5, we present our extended MDS code-based linear layers. We finally conclude in Sect. 6.
For any positive integer $m$, we write $\mathcal{P}(m)$ for the set of permutations of $\{0,1\}^m$. We view $n$ as a cryptographic security parameter and let $\mathbb{F} := \mathrm{GF}(2^n)$, which is identified with $\{0,1\}^n$. The zero entry of $\mathbb{F}$ is denoted by $\mathsf{0}$ (the sans serif typestyle). Following the cryptographic convention, a $wn$-bit string $x \in \{0,1\}^{wn}$ is also viewed as a column vector in $\mathbb{F}^w$. Hence, $x^T$ is a row vector obtained by transposing $x$. Indeed, bit strings and column vectors are just two sides of the same coin. Throughout the remainder, depending on the context, the same notation, e.g., $x$, may refer to both a bit string and a column vector, without additional highlight. In the same vein, the concatenation $x \,\|\, y$ is also "semantically equivalent" to the column vector $\binom{x}{y}$.
In this respect, for $x \in \mathbb{F}^w$, we denote the $j$-th entry of $x$ (for $j \in \{1, \ldots, w\}$) by $x[j]$, write $x[a..b]$ for the sub-vector consisting of entries $a$ to $b$, and write $\mathrm{wt}(x)$ for the number of non-zero entries of $x$.
Let $T \in \mathbb{F}^{w \times w}$, then the branch number of $T$ (from the viewpoint of differential cryptanalysis) is defined as $\min_{x \in \mathbb{F}^w,\, x \neq 0} \{\mathrm{wt}(x) + \mathrm{wt}(T \cdot x)\}$. A matrix $T \in \mathbb{F}^{w \times w}$ reaching $w + 1$, the upper bound on such branch numbers, is called Maximum Distance Separable (MDS). MDS matrices have been widely used in modern blockciphers including the AES, since the ensured lower bounds on weights typically transform into bounds on the number of active S-boxes (i.e., S-boxes with non-zero input differences).
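The branch number definition can be checked by brute force on toy parameters. The sketch below is illustrative only: it assumes the field GF(2^4) with the irreducible polynomial $x^4 + x + 1$ and width $w = 2$, and the example matrices and helper names (`gf_mul`, `mat_vec`, `branch_number`) are our own choices rather than ones taken from the constructions studied here.

```python
from itertools import product

N = 4          # toy chunk size n (bits), an assumption for fast enumeration
POLY = 0x13    # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^N), reducing modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r

def mat_vec(T, x):
    """Return T * x for a square matrix T and a vector x over GF(2^N)."""
    out = []
    for row in T:
        acc = 0
        for t, xi in zip(row, x):
            acc ^= gf_mul(t, xi)
        out.append(acc)
    return out

def wt(x) -> int:
    """Number of non-zero entries (active chunks) of x."""
    return sum(1 for xi in x if xi != 0)

def branch_number(T) -> int:
    """min over non-zero x of wt(x) + wt(T * x), by exhaustive search."""
    w = len(T)
    return min(wt(x) + wt(mat_vec(T, x))
               for x in product(range(1 << N), repeat=w) if any(x))

print(branch_number([[1, 1], [1, 2]]))   # reaches w + 1 = 3, i.e., MDS
print(branch_number([[1, 0], [0, 1]]))   # identity: only 2
```

Exhaustive search is only feasible for such toy sizes; for realistic $n$ and $w$ one would instead check that all square submatrices are non-singular.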

P-SPN: SP Networks with Partial Non-linear Layers
To ease a comparison, we first recall the standard Substitution-Permutation Networks (SPNs). An SPN defines a keyed permutation via repeated invocation of three transformations: addition of a round key, blockwise computation of a public, cryptographic permutation called an "S-box", and application of a linear permutation. Formally, a $\lambda$-round SPN taking inputs of length $wn$, where $w \in \mathbb{N}$ is the width of the network, is defined by a distribution $\mathcal{K}$ over round keys $(k_0, \ldots, k_\lambda)$, S-boxes $S_1, \ldots, S_\lambda$, and linear permutations $T_1, \ldots, T_{\lambda-1}$. This is close to the practice of key-alternating ciphers such as the AES. Given round keys $(k_0, \ldots, k_\lambda) \in \mathcal{K}_0 \times \ldots \times \mathcal{K}_\lambda$ and input $x \in \{0,1\}^{wn}$, the computation of the SPN is described in Fig. 2 (left). One may also see Fig. 3 (left) for an illustration.
A partial SP-network P-SPN is very similar to an SPN, except that its S-box layer contains fewer than $w$ S-box evaluations, as shown in Fig. 2 (right). We call the proportion of S-box evaluations its rate. E.g., if each round consists of $w/2$ S-box evaluations, then the rate is $r = 1/2$. If $S_1, \ldots, S_\lambda$ are efficiently invertible and each $T_i$ is efficiently invertible, then both computations in Fig. 2 are reversible given the round keys $k_0, \ldots, k_\lambda$. Also see Fig. 3 (right) for an illustration.
A more general model is also possible, which essentially allows for non-linear permutations instead of the linear $T_1, \ldots, T_{\lambda-1}$. In this paper we only consider the above specific models using linear permutations, both for simplicity and for consistency with the very motivation of using P-SPNs (i.e., to reduce the amount of non-linearity). We refer to [CDK+18] for a complete discussion on the models.
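To make the round structure concrete, the following toy sketch evaluates a rate 1/2 P-SPN with $w = 2$ over GF(2^4). Several choices are assumptions made for illustration and are not fixed by the text above: the S-box is applied to the first chunks of the state, each round key is XORed after the linear layer, no linear layer follows the last round (matching the $\lambda - 1$ linear permutations), and all helper names are hypothetical.

```python
import random

N = 4          # toy chunk size n (bits)
POLY = 0x13    # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^N), reducing modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r

def linear_layer(T, x):
    """Return T * x for a w-by-w matrix T and a length-w vector x."""
    out = []
    for row in T:
        acc = 0
        for t, xi in zip(row, x):
            acc ^= gf_mul(t, xi)
        out.append(acc)
    return out

def pspn_encrypt(x, keys, sboxes, T, n_sboxes):
    """Toy P-SPN: only the first n_sboxes chunks pass through the S-box."""
    rounds = len(sboxes)
    u = [xi ^ ki for xi, ki in zip(x, keys[0])]        # whitening key k_0
    for i in range(rounds):
        S = sboxes[i]
        v = [S[c] if j < n_sboxes else c for j, c in enumerate(u)]
        if i < rounds - 1:                             # only T_1..T_{lambda-1}
            v = linear_layer(T, v)
        u = [vi ^ ki for vi, ki in zip(v, keys[i + 1])]
    return u

def toy_instance(w=2, rounds=5, seed=0):
    """Sample hypothetical round keys and random S-boxes."""
    rng = random.Random(seed)
    keys = [[rng.randrange(1 << N) for _ in range(w)] for _ in range(rounds + 1)]
    sboxes = []
    for _ in range(rounds):
        perm = list(range(1 << N))
        rng.shuffle(perm)
        sboxes.append(perm)
    return keys, sboxes

T = [[1, 1], [1, 2]]                  # invertible (in fact MDS) over GF(2^4)
keys, sboxes = toy_instance()
outputs = {tuple(pspn_encrypt([a, b], keys, sboxes, T, n_sboxes=1))
           for a in range(1 << N) for b in range(1 << N)}
print(len(outputs))                   # 256: the toy cipher is a permutation
```

Since every step (key addition, partial S-box layer, invertible linear layer) is a bijection, all 256 inputs map to distinct outputs, which is the reversibility property stated above.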

SPRP Security of P-SPNs, and the H-coefficient Technique
Following [DKS+17, CDK+18], we consider P-SPN constructions that are defined by linear permutations $\{T_i \in \mathbb{F}^{w \times w}\}_{i=1}^{\lambda-1}$ and a distribution $\mathcal{K}$, and that take oracle access to $\lambda$ public, random permutations $S = \{S_i : \{0,1\}^n \to \{0,1\}^n\}_{i=1}^{\lambda}$; we write this as $\text{P-SPN}^S_k$, where $k = (k_0, \ldots, k_\lambda)$. We then analyze security of the construction against unbounded-time attackers making a bounded number of queries to the construction and to $S$. Formally, we consider the ability of an adversary $D$ to distinguish two worlds: the "real world", in which it is given oracle access to $S$ and $\text{P-SPN}^S_k$ (for unknown keys $k$ sampled according to $\mathcal{K}$), and an "ideal world" in which it has access to $S$ and a random permutation $P : \{0,1\}^{wn} \to \{0,1\}^{wn}$. By default, we always allow $D$ to make forward and inverse queries to all its oracles (though we do not write this explicitly). With these, for a distinguisher $D$, we define its strong-PRP advantage against the construction $C$ as
$$\mathbf{Adv}^{\mathrm{sprp}}_{C}(D) := \Big| \Pr_{k \leftarrow \mathcal{K}}\big[D^{C^S_k,\, S} = 1\big] - \Pr\big[D^{P,\, S} = 1\big] \Big|, \qquad \mathbf{Adv}^{\mathrm{sprp}}_{C}(q_C, q_S) := \max_{D} \mathbf{Adv}^{\mathrm{sprp}}_{C}(D),$$
where the maximum is taken over all distinguishers that make at most $q_C$ queries to their left oracle and $q_S$ queries to their right oracles.
We use Patarin's H-coefficient technique [Pat09] to prove SPRP security of P-SPNs. We provide a quick overview of its main ingredients here. Our presentation borrows heavily from that of [CS14]. Fix a distinguisher $D$ that makes at most $q$ queries to its oracles. As in the security definition presented above, $D$'s aim is to distinguish between two worlds: a "real world" and an "ideal world". Assume wlog that $D$ is deterministic. The execution of $D$ defines a transcript that includes the sequence of queries and answers received from its oracles; $D$'s output is a deterministic function of its transcript. Thus, if $\mu, \nu$ denote the probability distributions on transcripts induced by the real and ideal worlds, respectively, then $D$'s distinguishing advantage is upper bounded by the statistical distance
$$\|\mu - \nu\| := \frac{1}{2} \sum_{\tau} |\mu(\tau) - \nu(\tau)|,$$
where the sum is taken over all possible transcripts $\tau$.
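Let $\mathcal{T}$ denote the set of all transcripts such that $\nu(\tau) > 0$ for all $\tau \in \mathcal{T}$. We look for a partition of $\mathcal{T}$ into two sets $\mathcal{T}_1$ and $\mathcal{T}_2$ of "good" and "bad" transcripts, respectively, along with a constant $\epsilon_1 \in [0, 1)$ such that
$$\tau \in \mathcal{T}_1 \implies \frac{\mu(\tau)}{\nu(\tau)} \ge 1 - \epsilon_1.$$
It is then possible to show (see [CS14] for details) that $\epsilon_1 + \nu(\mathcal{T}_2)$, where $\nu(\mathcal{T}_2)$ is the probability of obtaining a bad transcript in the ideal world, is an upper bound on the distinguisher's advantage.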
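A toy numeric illustration may help here. The sketch assumes the usual form of the H-coefficient lemma from [CS14]: if $\mu(\tau)/\nu(\tau) \ge 1 - \epsilon_1$ on all good transcripts, then the statistical distance between $\mu$ and $\nu$ is at most $\epsilon_1$ plus the ideal-world probability of a bad transcript. The four-element transcript space and the two distributions are made up purely for illustration.

```python
from fractions import Fraction

def statistical_distance(mu, nu):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(mu) | set(nu)
    return sum(abs(mu.get(t, 0) - nu.get(t, 0)) for t in support) / 2

def h_coefficient_bound(mu, nu, bad):
    """eps1 + nu(bad), with eps1 the smallest valid constant for the good set.

    Assumes nu(t) > 0 for every transcript t listed in nu.
    """
    good = [t for t in nu if t not in bad]
    eps1 = max([Fraction(0)] + [1 - mu.get(t, 0) / nu[t] for t in good])
    eps2 = sum(nu[t] for t in bad)
    return eps1 + eps2

# Hypothetical toy transcript space {a, b, c, d}.
nu = {t: Fraction(1, 4) for t in "abcd"}                       # ideal world
mu = {"a": Fraction(3, 8), "b": Fraction(1, 4),
      "c": Fraction(1, 4), "d": Fraction(1, 8)}                # real world

dist = statistical_distance(mu, nu)
print(dist, h_coefficient_bound(mu, nu, bad={"d"}))            # 1/8 1/4
print(dist, h_coefficient_bound(mu, nu, bad=set()))            # 1/8 1/2
```

Declaring the transcript `d` bad trades a large $\epsilon_1$ for a small bad-transcript probability; either partition yields a valid, though not tight, upper bound on the distance.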
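Let $\mathrm{sgn} : \mathbb{F} \to \{0, 1\}$ be defined as $\mathrm{sgn}(a) = 0$ if $a = \mathsf{0}$ and $\mathrm{sgn}(a) = 1$ otherwise. For a vector $x \in \mathbb{F}^w$, applying sgn entry-wise yields $\mathrm{sgn}(x) := (\mathrm{sgn}(x[1]), \ldots, \mathrm{sgn}(x[w]))$, which, in some sense, summarizes the "pattern" of the vector $x$.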
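Let $\alpha, x \in \{0,1\}^{wn}$, and let $\langle \alpha, x \rangle$ be the inner product between $\alpha$ and $x$. Then, given a function $G : \mathbb{F}^w \to \mathbb{F}^w$, the correlation cor of the linear approximation for an output mask $\alpha_2$ and an input mask $\alpha_1$ is defined by
$$\mathrm{cor}_G(\alpha_1, \alpha_2) := \frac{1}{2^{wn}} \sum_{x \in \mathbb{F}^w} (-1)^{\langle \alpha_1, x \rangle \oplus \langle \alpha_2, G(x) \rangle}.$$
If $\mathrm{cor}_G(\alpha_1, \alpha_2) \gg 2^{-wn}$, then $\alpha_1, \alpha_2$ constitute a good linear approximation of $G$ and can be used for linear cryptanalysis [Mat94]. On the other hand, if $\mathrm{cor}_G(\alpha_1, \alpha_2) = 0$, then $(\alpha_1 \to \alpha_2)$ is called a Zero Correlation (ZC) linear hull of $G$. Such linear approximations without any bias also enable distinguishing [BR14, BW12].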
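As a small worked example, the correlation of a linear approximation can be computed by direct enumeration. The sketch below assumes the standard normalization (the average of $(-1)^{\langle \alpha_1, x \rangle \oplus \langle \alpha_2, G(x) \rangle}$ over all inputs) and treats inputs as $m$-bit integers; the function names and the choice of the identity map as $G$ are illustrative assumptions.

```python
def parity(v: int) -> int:
    """Parity of the bits of v, i.e., an inner product over GF(2)."""
    return bin(v).count("1") & 1

def correlation(G, m: int, a_in: int, a_out: int) -> float:
    """Correlation of <a_in, x> xor <a_out, G(x)> over all m-bit inputs x."""
    total = 0
    for x in range(1 << m):
        bit = parity((a_in & x) ^ (a_out & G(x)))
        total += -1 if bit else 1
    return total / (1 << m)

identity = lambda x: x

# Matching masks: the approximation always holds.
print(correlation(identity, 4, 0b0011, 0b0011))   # 1.0
# Different masks: a zero-correlation approximation of the identity map.
print(correlation(identity, 4, 0b0011, 0b0101))   # 0.0
```

For the identity map every pair of distinct masks has correlation zero, which is the simplest instance of a zero-correlation linear hull.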

Structures and their Differential/Linear Properties
Cryptanalytic practice usually focuses on detecting IDs and ZC linear hulls that are independent of the concrete S-boxes and keys. Concretely, attacks try to determine whether there is a difference (mask) of an S-box or not, regardless of the value of this difference (mask). The model of structures was proposed by Sun et al. [SLR+15, SLG+16] to characterize the intuition of "being independent of the choices of S-boxes". Below we present [SLR+15, Definition 2] adapted to our notations.
Definition 1 (Structures). Let $f : \mathbb{F}^w \to \mathbb{F}^w$ be a cryptographic function defined upon bijective S-boxes on $\mathbb{F}$.
1. A structure $\mathcal{E}^f$ on $\mathbb{F}^w$ is defined as a set of functions $f'$ which are exactly the same as $f$ except that the S-boxes can take all possible bijective transformations on $\mathbb{F}$.

Let
In fact, truncated ID and ZC attacks against word-oriented blockciphers typically focus on ID and ZC distinguishers on the corresponding structures. Notable examples following this strategy include attacks against the AES [BR14, MDRMH10] and Camellia [BGW+14]. The structure-based approach is thus of some practical relevance, and has motivated research on provable security w.r.t. IDs/ZCs of structures. To our knowledge, this structure-based approach remains the only method to investigate provable security against ID and ZC attacks on general blockcipher constructions ("unconditional" ID/ZC security proofs are limited to certain blockciphers such as the AES [WJ18]).

Rate 1/2: Birthday SPRP Security at 5 Rounds
In this section, we focus on the SPRP security of P-SPNs with rate 1/2. For simplicity, we assume that the width $w$ is even. We will frequently write $M \in \mathbb{F}^{w \times w}$ in the block form of 4 submatrices in $\mathbb{F}^{w/2 \times w/2}$. For this, we follow the convention using u, b, l, r for upper, bottom, left, and right resp., i.e.,
$$M = \begin{pmatrix} M_{ul} & M_{ur} \\ M_{bl} & M_{br} \end{pmatrix}.$$
We use brackets, i.e., $(M^{-1})_{xx}$, $xx \in \{ul, ur, bl, br\}$, to distinguish submatrices of $M^{-1}$ (the inverse of $M$) from $M_{xx}^{-1}$, the inverse of $M_{xx}$. We will first introduce a useful operator on the linear transformation $T$ in Sect. 3.1. Then, in Sect. 3.2 we prove security for 5 rounds. Nandi's idea [Nan15] gives rise to a simple chosen-plaintext attack against 3 rounds. For completeness, we present a description adapted to our context in Appendix A.

A Useful Operator on the Linear Layer
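As per our convention, we view $u, v \in \mathbb{F}^w$ as column vectors. During the proof, we will need to derive the "second halves" $u_2 := u[w/2 + 1..w]$ and $v_2 := v[w/2 + 1..w]$ from the "first halves" $u_1 := u[1..w/2]$, $v_1 := v[1..w/2]$, and the equality $v = T \cdot u$. To this end, the equality in block form reads
$$v_1 = T_{ul} \cdot u_1 \oplus T_{ur} \cdot u_2, \qquad v_2 = T_{bl} \cdot u_1 \oplus T_{br} \cdot u_2.$$
By this, and assuming that $T_{ur}$ is invertible, we define an operator on $T$ as follows:
$$\widehat{T} := \begin{pmatrix} T_{ur}^{-1} \cdot T_{ul} & T_{ur}^{-1} \\ T_{bl} \oplus T_{br} \cdot T_{ur}^{-1} \cdot T_{ul} & T_{br} \cdot T_{ur}^{-1} \end{pmatrix}.$$
It can be seen that $u_2, v_2$ can be written as $u_1, v_1$ multiplied by $\widehat{T}$, i.e.,
$$\begin{pmatrix} u_2 \\ v_2 \end{pmatrix} = \widehat{T} \cdot \begin{pmatrix} u_1 \\ v_1 \end{pmatrix}.$$
This operator will be useful in both Sect. 3.2 and Sect. 4.
This implies the following interesting property.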
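The block-wise derivation can be sanity-checked numerically. The following sketch treats the simplest case $w = 2$ over GF(2^4), where each of the four blocks of $T$ is a single field element, solves $v = T \cdot u$ for $(u_2, v_2)$ in terms of $(u_1, v_1)$, and verifies the result exhaustively. The explicit formulas follow from the block equations under the assumption that $T_{ur}$ is invertible; the names `gf_inv`, `derived_operator`, and `check` are hypothetical, and the sketch is not meant to reproduce the exact notation of the operator used in the proofs.

```python
N = 4          # toy chunk size n (bits)
POLY = 0x13    # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a: int, b: int) -> int:
    """Multiply two elements of GF(2^N), reducing modulo POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N:
            a ^= POLY
    return r

def gf_inv(a: int) -> int:
    """Inverse of a non-zero field element, by exhaustive search (toy sizes)."""
    for b in range(1, 1 << N):
        if gf_mul(a, b) == 1:
            return b
    raise ZeroDivisionError("0 has no inverse")

def derived_operator(T):
    """For w = 2, the 2x2 matrix mapping (u1, v1) to (u2, v2) when v = T*u."""
    (t_ul, t_ur), (t_bl, t_br) = T
    inv_ur = gf_inv(t_ur)
    a = gf_mul(inv_ur, t_ul)                     # T_ur^{-1} * T_ul
    return [[a, inv_ur],
            [t_bl ^ gf_mul(t_br, a), gf_mul(t_br, inv_ur)]]

def check(T) -> bool:
    """Verify the derived relation for every input u = (u1, u2)."""
    H = derived_operator(T)
    for u1 in range(1 << N):
        for u2 in range(1 << N):
            v1 = gf_mul(T[0][0], u1) ^ gf_mul(T[0][1], u2)
            v2 = gf_mul(T[1][0], u1) ^ gf_mul(T[1][1], u2)
            if u2 != gf_mul(H[0][0], u1) ^ gf_mul(H[0][1], v1):
                return False
            if v2 != gf_mul(H[1][0], u1) ^ gf_mul(H[1][1], v1):
                return False
    return True

print(check([[2, 3], [1, 1]]))   # True
```

For general even $w$ the same identities hold with the scalars replaced by $w/2 \times w/2$ blocks, keeping the order of the matrix products as written.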

SPRP Security at 5 Rounds
We will prove security for 5-round P-SPNs built upon 5 "S-boxes"/random permutations $S = \{S_1, S_2, S_3, S_4, S_5\}$ and a single linear layer $T$; we denote this construction by C5 (Eq. (5)). Using a single linear layer simplifies both the construction and the notation. Recall from our convention that $T_{ul}, \ldots, (T^{-1})_{br}$ constitute the eight submatrices of $T$ and $T^{-1}$. In fact, $(T^{-1})_{ul}, \ldots, (T^{-1})_{br}$ can be derived from $T_{ul}, \ldots, T_{br}$, but the expressions are too complicated to use. We next characterize the properties of $T$ that are sufficient for security.

Definition 2 (Good Linear Layer for 5 Rounds). A matrix … $\ldots_{ur} \cdot (T^{-1})_{ur}$, and $T_{br} \cdot T_{ur}^{-1} \cdot (T^{-1})_{ur}$ are such that:
1. They contain no zero entries, and
2. Any column vector of the 6 induced matrices consists of $w/2$ distinct entries.
We remark that, as $T$ is MDS, the four submatrices $T_{ul}$, $T_{ur}$, $T_{bl}$ and $T_{br}$ are all MDS (and invertible). A natural question is whether such a strong $T$ exists at all. For this, we made an exhaustive search for $n = 8, 11$ and found some candidates: see Appendix B.
With such a good T , we have the following theorem on 5-round P-SPNs.
Theorem 1. Assume $w \ge 2$ and $q_S + wq_C/2 \le 2^n/2$. Let C5 be the 5-round, linear P-SPN structure defined in Eq. (5), with distribution $\mathcal{K}$ over keys $(k_0, \ldots, k_5)$. If $k_0$ and $k_5$ are uniformly distributed and the matrix $T$ fulfills Definition 2, then …
The remainder of this subsection is devoted to proving Theorem 1. The main flow follows the general paradigm of the H-coefficient technique. In detail, we first establish notations in subsect. 3.2.1. We then complete the two steps, namely defining and analyzing bad transcripts, and bounding the ratio $\mu(\tau)/\nu(\tau)$ for good transcripts, in subsect. 3.2.2 and 3.2.3, respectively. For clarity, the proofs of some of the lemmas are deferred to subsect. 3.3.
Remark 1. Rate 1/2 P-SPN may remind the reader of the Feistel network, which also applies the random round functions to a half of the data in each round. However, the two schemes significantly deviate in detail, and thus rate 1/2 P-SPN consumes one more round than Feistel (which needs 4 rounds) to allow for provable security, and the concrete proof approaches are also different. We refer the reader to Appendix C for a complete discussion.

Remark 2. The Misty network slightly resembles a rate 1/2 P-SPN with $w = 2$. As a Misty-R round has basically the same cryptographic strength as the inverse of a Misty-L round (see [Lee13]), below we focus on Misty-R. The "diffusion layer" of Misty-R, which maps $\binom{u_1}{u_2}$ to $\binom{u_2}{u_1 \oplus u_2}$, is much weaker than what Definition 2 requires. This matches the observation that Misty-R achieves faster diffusion in the forward direction than in the backward direction, and thus 5 Misty-R rounds are needed for SPRP security. In contrast, for a rate 1/2 P-SPN with a good linear layer and $w = 2$, actually 4 rounds could be secure, as sketched in Appendix C. In all, the linear layers we use are significantly stronger than Misty's and indeed help to achieve better security.

Proof setup
Fix a deterministic distinguisher $D$. Wlog, we assume $D$ makes exactly $q_C$ (non-redundant) forward/inverse queries to its left oracle, which is either $\mathrm{C5}^S_k$ or $P$, and exactly $q_S$ (non-redundant) forward/inverse queries to each of the oracles $S_i$ on its right side. We call a query from $D$ to its left oracle a construction query, and a query from $D$ to one of its right oracles an S-box query.
The interaction between $D$ and its oracles is recorded in the form of 6 lists of pairs $Q_C, Q_{S_1}, \ldots, Q_{S_5}$. The list $Q_C$ records the construction queries and responses of $D$ in chronological order, where the $i$-th pair $(x^{(i)}, y^{(i)})$ indicates that the $i$-th such query is either a forward construction query $x^{(i)}$ that was answered by $y^{(i)}$ or an inverse query $y^{(i)}$ that was answered by $x^{(i)}$. $Q_{S_1}, \ldots, Q_{S_5}$ are defined similarly with respect to queries to $S_1, \ldots, S_5$. Define $Q_S := (Q_{S_1}, \ldots, Q_{S_5})$. Note that $D$'s interaction with its oracles can be unambiguously reconstructed from these lists since $D$ is deterministic. For convenience, for $i \in \{1, 2, 3, 4, 5\}$ we define …
Following [CS14], we augment the transcript $(Q_C, Q_S)$ with a key value $k = (k_0, \ldots, k_5)$. In the real world, $k$ is the actual key used by the construction. In the ideal world, $k$ is a dummy key sampled independently from all other values according to the prescribed key distribution $\mathcal{K}$. Thus, a transcript $\tau$ has the final form $\tau = (Q_C, Q_S, k)$.

Bad transcripts
Let $\mathcal{T}$ be the set of all possible transcripts that can be generated by $D$ in the ideal world (note that this includes all transcripts that can be generated with nonzero probability in the real world). As in Sect. 2.2, let $\mu, \nu$ be the distributions over transcripts in the real and ideal worlds, respectively.
We define a set $\mathcal{T}_2 \subseteq \mathcal{T}$ of bad transcripts as follows: a transcript $\tau = (Q_C, Q_S, k)$ is bad if and only if one of the following events occurs:
1. There exist a pair $(x, y) \in Q_C$ and an index $i \in \{w/2+1, \ldots, w\}$ such that …
4. There exist two indices $i, \ell \in \{1, \ldots, q_C\}$ such that $\ell > i$, and:
• $(x^{(\ell)}, y^{(\ell)})$ was due to a forward query, and $y^{(\ell)}$ …
As in Sect. 2.2, $\mathcal{T}_1 := \mathcal{T} \setminus \mathcal{T}_2$ denotes the set of good transcripts.
To understand the conditions, consider a good transcript $\tau = (Q_C, Q_S, k)$ and let us note some of its properties (informally). First, since the 1st condition is not fulfilled, each construction query induces $w/2$ inputs to the 1st round S-box and $w/2$ inputs to the 5th round S-box, the outputs of which are not fixed by $Q_S$. Second, since neither the 2nd nor the 3rd condition is fulfilled, the inputs to the 1st round (5th round, resp.) S-box induced by the construction queries are distinct unless unavoidable. These ensure that the induced 2nd and 4th intermediate values are somewhat random and free from multiple forms of collisions. Finally, the last condition ensures some structural properties of the queries that will be crucial in the subsequent analysis (see subsect. 3.3.2, the proof of Lemma 3).
Let us then analyze the probabilities of the conditions in turn. Since, in the ideal world, the values $k_0, k_5$ are independent of $Q_C, Q_S$ and (individually) uniform in $\{0,1\}^{wn}$, it is easy to see that the probabilities of the first three events do not exceed $wq_Cq_S/2^n$, … For the 4th condition, consider the $\ell$-th construction query $(x^{(\ell)}, y^{(\ell)})$. When it is forward, in the ideal world it means $D$ issued $P(x^{(\ell)})$ to the $wn$-bit random permutation $P$ and received $y^{(\ell)}$, which is uniform in $2^{wn} - \ell + 1$ possibilities. Thus, when $\ell \le q_C \le 2^{wn}/2$, … A similar result follows when $(x^{(\ell)}, y^{(\ell)})$ is backward. A union bound thus yields …

Bounding the ratio µ(τ )/ν(τ )
Let $\Omega_X = (\mathcal{P}(n))^5 \times \mathcal{K}$ be the probability space underlying the real world, whose measure is the product of the uniform measure on $(\mathcal{P}(n))^5$ and the measure induced by the distribution $\mathcal{K}$ on keys. (Thus, each element of $\Omega_X$ is a tuple $(S, k)$ with $S = (S_1, \ldots, S_5)$, $S_1, \ldots, S_5 \in \mathcal{P}(n)$ and $k = (k_0, \ldots, k_5) \in \mathcal{K}$.) Also let $\Omega_Y = \mathcal{P}(wn) \times (\mathcal{P}(n))^5 \times \mathcal{K}$ be the probability space underlying the ideal world, whose measure is the product of the uniform measure on $\mathcal{P}(wn)$ with the measure on $\Omega_X$. Let $\tau = (Q^\tau_C, Q^\tau_S, k^\tau)$ be a transcript. We introduce four types of compatibility as follows.
• Third, a tuple of S-boxes $S^* \in (\mathcal{P}(n))^5$ is compatible with $\tau = (Q^\tau_C, Q^\tau_S, k)$, written $S^* \downarrow \tau$, if $(S^*, k) \in \Omega_X$ is compatible with $\tau$, where $k$ is the key value of the fixed transcript $\tau$.
• Last, we say that $(P^*, S^*) \in \mathcal{P}(wn) \times (\mathcal{P}(n))^5$ is compatible with $\tau$ …
For the rest of the proof we fix a transcript $\tau = (Q_C, Q_S, k) \in \mathcal{T}_1$. Since $\tau \in \mathcal{T}$, it is easy to see (cf. [CS14]) that
$$\mu(\tau) = \Pr[\omega \leftarrow \Omega_X : \omega \downarrow \tau], \qquad \nu(\tau) = \Pr[\omega \leftarrow \Omega_Y : \omega \downarrow \tau],$$
where the notation indicates that $\omega$ is sampled from the relevant probability space according to that space's probability measure. We bound $\mu(\tau)/\nu(\tau)$ by reasoning about the latter probabilities. In detail, with the third and fourth types of compatibility notions, the product structure of $\Omega_X, \Omega_Y$ implies … where $S^*$ and $(P^*, S^*)$ are sampled uniformly from $(\mathcal{P}(n))^5$ and $\mathcal{P}(wn) \times (\mathcal{P}(n))^5$, respectively. Thus, … By these, and by $|Q_C| = q_C$, $|Q_{S_1}| = \ldots = |Q_{S_5}| = q_S$, it is immediate that … To compute $\Pr_{S^*}[S^* \downarrow \tau]$ we start by writing
$$\Pr_{S^*}[S^* \downarrow \tau] = \Pr_{S^*}[S^* \downarrow (\emptyset, Q_S, k)] \cdot \Pr_{S^*}[S^* \downarrow (Q_C, Q_S, k) \mid S^* \downarrow (\emptyset, Q_S, k)].$$
To analyze $\Pr_{S^*}[S^* \downarrow (Q_C, Q_S, k) \mid S^* \downarrow (\emptyset, Q_S, k)]$, we proceed in two steps. First, based on $Q_C$ and the two outer S-boxes $S^*_1, S^*_5$, we derive the 2nd and 4th round intermediate values: these constitute a special transcript $Q_{mid}$ on the middle 3 rounds. We characterize conditions on $S^*_1, S^*_5$ that will ensure certain good properties in the derived $Q_{mid}$, which will ease the analysis. Then, in the second step, we analyze such "good" $Q_{mid}$ to yield the final bounds. Each of the two steps takes a paragraph as follows.
The outer 2 rounds. Given a tuple of S-boxes $S^*$, we let $\mathrm{Bad}(S^*)$ be a predicate of $S^*$ that holds if any of the following conditions is met: … There exist distinct pairs $(x, y), (x', y') \in Q_C$ and two indices $i, i' \in \{w/2 + 1, \ldots, w\}$ such that: … (B-1) captures the case that a 2nd round S-box input or a 4th round S-box output has been in $Q_S$, (B-2) captures collisions among the 2nd round S-box inputs and 4th round S-box outputs for a single construction query, while (B-3) captures various collisions between the 2nd round S-box inputs, resp. 4th round S-box outputs, from two distinct queries. Note that essentially, $\mathrm{Bad}(S^*)$ only concerns the randomness of the outer 2 S-boxes $S^*_1$ and $S^*_5$. For simplicity, define $\mathrm{Good}(S^*) := (S^* \downarrow Q_S) \wedge \neg \mathrm{Bad}(S^*)$. Then it holds … Hence, all that remains is to lower bound the two terms in the product of (8). We state the result below, and defer the proof to subsect. 3.3.1.

Lemma 2. When $q_S + w \le 2^n/2$, we have

Analyzing the 3 middle rounds. Our next step is to lower bound the term $\Pr_{S^*}[S^* \downarrow (Q_C, Q_S, k) \mid \mathrm{Good}(S^*)]$ from Eq. (8). Given $S^*$ for which $\mathrm{Good}(S^*)$ holds, for every $(x^{(i)}, y^{(i)}) \in Q_C$ we define $u^{(i)}$ in which the tuples follow exactly the same chronological order as in $Q_C$. Define and write $S^* \downarrow (Set, Q_S, k)$ for the event that "$C3^{S^*}(u_2) = v_4$ for every $(u_1, u_2, v_4, v_5)$ in the set $Set$". Then it can be seen

To bound Eq. (10), we divide $Q_{mid}$ into multiple sets according to collisions on the "second halves" $u_1[w/2 + 1..w]$ and $v_5[w/2 + 1..w]$, and consider the probability that $S^*$ is compatible with each set in turn. In detail, the sets are arranged according to the following rules:

Assume that $Q_{mid}$ is divided into $\alpha$ sets by the above rules, with $|Q_{m_\ell}| = \beta_\ell$. Then $\sum_{\ell=1}^{\alpha} \beta_\ell = q_C$, and

We now focus on analyzing the $\ell$-th set $Q_{m_\ell}$. Assume that The superscript $(\ell, i)$ indicates that it is the $i$-th tuple in this $\ell$-th set $Q_{m_\ell}$. For this index $\ell$, we define six sets ExtDom

The proof is deferred to Sect. 3.3.2.
Lemma 4. Consider the $\ell$-th set $Q_{m_\ell}$ and any two distinct u in $Q_{m_\ell}$. Then, there exist two indices $j_1, j_2 \in \{w/2+1, \dots, w\}$ such that, • when $Q_{m_\ell}$ is of Type-I: u

The proof is deferred to Sect. 3.3.3. With the help of these two lemmas, we are able to bound the probability that the randomness is compatible with the $\ell$-th set $Q_{m_\ell}$.

Proof of Lemma 3
W.l.o.g., consider the case of Type-I $Q_{m_\ell}$, as the other case is symmetric. Assume otherwise, and assume that tuple$_1$ = u $Q_{m_i}$ are two such tuples with the smallest indices $j_1, j_2$. W.l.o.g. assume $j_2 > j_1$, i.e., tuple$_2$ was later. Then tuple$_2$ was necessarily a forward query, as otherwise u .w] would contradict the goodness of $\tau$ (the 4th condition). By this and further by the 4th condition, $v_5^{(j_2)}$ is "new", and tuple$_2$ cannot be in any Type-II set $Q_{m_i}$, $i \le \ell - 1$. This means there exists a Type-I set $Q_{m_i}$, $i \le \ell - 1$, such that tuple$_2 \in Q_{m_i}$. By our rules, the tuples in the purported $Q_{m_\ell}$ should have been in $Q_{m_i}$, and thus $Q_{m_\ell}$ should not exist, reaching a contradiction.
By the above, for Type-I sets, the claims hold in all cases. Thus the claim.

Proof of Lemma 5
We distinguish two cases depending on the type of Q m .
For the case of $i_1 = i_2 \in \{1, \dots, \beta_\ell\}$, fix distinct $j_1, j_2 \in \{w/2 + 1, \dots, w\}$. Consider the condition u

The positive results are stated w.r.t. the idealized model of P-SPN structures E P-SPN (see Definition 1), i.e., they rely on the assumption that the impossible differentials (IDs) are independent of the S-boxes. Formally, this means $\Pr(\Delta_1 \xrightarrow{E_S} \Delta_2) > 0$ as long as $\chi(\Delta_1) = \chi(\Delta_2)$, where $E_S$ is a "(full) S-layer structure". Under this assumption, we have the main result of this section, i.e., the provable security of 4-round, rate 3/4 P-SPN structures E P-SPN using the same MDS linear layer $T$ in every round.
Theorem 3. When $w + 2 \le 2^n$, for the P-SPN structure E P-SPN built upon an MDS linear layer $T$ and with rate $r \ge 3/4$, there exist no 4-round truncated impossible differentials.
It can be seen that there exists a matrix $T'$ obtained by rearranging the rows and columns of $T$, such that

where $t_{1,1}, \dots, t_{w,1} \in F^{w-1}$ and $t_{1,2}, \dots, t_{w,2} \in F$. Note that $T'$ is MDS, since it is obtained by rearranging rows and columns of $T$. By Lemma 1, T is also MDS, meaning that $t_{1,2} \ne 0, \dots, t_{w,2} \ne 0$ (every entry of an MDS matrix is non-zero). Therefore, (1) To ensure . These plus the condition $\Delta_8[j_\beta] \ne 0$ exclude at most $w + 1$ values in total, and thus the claim.

Zero-Correlation Linear Security
The positive results regarding zero-correlation (ZC) attacks again rely on structures. Formally, this means $\mathrm{cor}_{E_S}(\alpha_1, \alpha_2) \ne 0$ as long as $\chi(\alpha_1) = \chi(\alpha_2)$. Under this idealized assumption, Sun et al. showed that the existence of an impossible differential in an SPN is equivalent to the existence of a zero-correlation linear hull in the "dual structure" of this SPN [SLG + 16]. However, the "dual structure" of P-SPNs has never been formalized. For simplicity, we establish the ZC security via Theorem 3.

Theorem 4. When $w + 2 \le 2^n$, for the P-SPN structure E P-SPN built upon an MDS linear layer $T$ and with rate $r \ge 3/4$, there exists no 4-round zero-correlation linear hull.
Proof. Assume that C4 is the 4-round rate $r$ P-SPN structure E P-SPN using an MDS $T$ as the linear layer. For any $\alpha_1, \alpha_2 \in F^w \setminus \{0^w\}$, we show that

where C4 is the rate $r$ P-SPN structure E P-SPN built upon the MDS linear layer $(T^{\mathsf{T}})^{-1}$ (since $T$ is MDS, $T^{\mathsf{T}}$ is also MDS and invertible). This implies the claim by Theorem 3.
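The MDS facts used in this proof can be checked on a toy instance. The following is a minimal sketch, not the paper's construction: it assumes the field GF(2^4) with reduction polynomial x^4 + x + 1 and a 4x4 Cauchy matrix as the example MDS layer, and every function name is hypothetical. It tests the MDS property by brute force (every square submatrix must be invertible) for $T$, $T^{\mathsf{T}}$ and $(T^{\mathsf{T}})^{-1}$.

```python
from itertools import combinations

N_BITS = 4
MOD = 0b10011  # x^4 + x + 1, an irreducible polynomial over GF(2) (toy choice)

def gf_mul(a, b):
    """Multiply two elements of GF(2^4) (shift-and-add with reduction)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> N_BITS:
            a ^= MOD
    return r

def gf_inv(a):
    """Multiplicative inverse by exhaustive search (the field has 16 elements)."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse")
    return next(b for b in range(1, 1 << N_BITS) if gf_mul(a, b) == 1)

def invert(m):
    """Gauss-Jordan inverse over GF(2^4); returns None if m is singular."""
    n = len(m)
    aug = [list(row) + [int(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        piv = next((r for r in range(col, n) if aug[r][col]), None)
        if piv is None:
            return None
        aug[col], aug[piv] = aug[piv], aug[col]
        scale = gf_inv(aug[col][col])
        aug[col] = [gf_mul(scale, v) for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                f = aug[r][col]
                aug[r] = [v ^ gf_mul(f, p) for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def transpose(m):
    return [list(col) for col in zip(*m)]

def is_mds(m):
    """A square matrix is MDS iff every square submatrix is non-singular."""
    n = len(m)
    for k in range(1, n + 1):
        for rows in combinations(range(n), k):
            for cols in combinations(range(n), k):
                sub = [[m[i][j] for j in cols] for i in rows]
                if invert(sub) is None:
                    return False
    return True

# Toy Cauchy matrix T[i][j] = 1 / (x_i + y_j) with disjoint index sets.
xs, ys = [0, 1, 2, 3], [4, 5, 6, 7]
T = [[gf_inv(x ^ y) for y in ys] for x in xs]
T_dual = invert(transpose(T))  # the layer (T^T)^{-1} appearing in the proof

print(is_mds(T), is_mds(transpose(T)), is_mds(T_dual))
```

The exhaustive submatrix test is exponential in $w$ and is only meant for such small sanity checks.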

Linear Layers for P-SPNs with Rate Below 1/2
We first establish a theorem regarding the differential propagation in such "sparse" P-SPNs. The construction of the linear layers will become clear during its proof. For conceptual convenience, in (and only in) this section we let $\rho = r^{-1}$, and write $1/\rho$ (instead of $r$) for the rate.
Theorem 5. For any integer $\rho$ such that $\rho w \le 2^n$, for rate $1/\rho$ P-SPNs, $\rho$ rounds are necessary and sufficient to ensure at least one active S-box during differential propagation.
Proof. Necessity. This appears to be folklore. Formally, assume that the linear layer used in the $i$-th round is $T_i$. Then, by construction, if there exists a $(\rho - 1)$-round differential characteristic with no active S-box, then there exist $\Delta_1, \dots, \Delta_{\rho-1} \in F^w$ such that:

-round characteristic with no active S-box. Note that the final round only contains a partial S-box layer, and thus the difference $\Delta_{\rho-1}$ is invariant.) We show that the above equations are equivalent to a linear equation system with

where The rightmost $w/\rho$ columns of $T_i$ are multiplied by $0^{w/\rho}$ and have no influence, and thus we simply refer to them by . The equations imply the following homogeneous system:

where $I$ is the identity matrix in F ]. Combining them yields the homogeneous system shown in Fig. 4. The system has $(\rho-1)^2 w/\rho$ unknowns, and its coefficient matrix has only $(\rho - 2)w$ rows. As $(\rho - 2) < (\rho-1)^2/\rho$, this system always has (approximately $2^{nw/\rho}$) non-zero solutions, and every such solution turns out to be a differential characteristic on $\rho - 1$ rounds with no active S-box.
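The counting step at the end of the necessity argument can be made concrete. The sketch below is only an illustration with hypothetical names: it assumes $\rho$ divides $w$ and takes the unknown and row counts exactly as stated in the proof. The gap between the two counts is $w/\rho$, which is where the estimate of about $2^{nw/\rho}$ solutions over a field of size $2^n$ comes from.

```python
def zero_active_sbox_system(rho, w, n):
    """Size of the homogeneous system from the necessity proof.

    Returns (unknowns, rows, free_dim, approx_solutions), where free_dim is a
    lower bound on the dimension of the solution space and approx_solutions
    is |F|^free_dim for a field F of size 2^n.
    Assumption of this sketch: rho divides w.
    """
    if w % rho != 0:
        raise ValueError("this sketch assumes that rho divides w")
    unknowns = (rho - 1) ** 2 * w // rho
    rows = (rho - 2) * w
    free_dim = unknowns - rows  # equals w // rho
    return unknowns, rows, free_dim, 2 ** (n * free_dim)

# Example parameters: n = 11, w = 32, rate 1/4.
unknowns, rows, free_dim, solutions = zero_active_sbox_system(4, 32, 11)
print(unknowns, rows, free_dim)  # 72 64 8
```

Since $\rho(\rho-2) = (\rho-1)^2 - 1$, the gap is positive for every $\rho \ge 2$, so the system is always underdetermined.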
Note that, while the above transformations T M 1 , T M 2 , . . . appear quite complicated, they are all MDS. To see this, consider x ∈ F w } constitute all the codewords of a small MDS code. Therefore, T M i is MDS.
Larger values of $n$ are certainly preferred, but such S-boxes seem to be more costly. As a remedy, we advocate using large-but-weak S-boxes, which significantly enlarges the design space. For example, 11-bit S-boxes with acceptable performance can be found in [BDMD + 20] or constructed via the SHA3 approach [BDPA11], 64-bit ARX S-boxes have recently been constructed [BBdS + 20], and power-based S-boxes over a non-binary field of size around $2^{255}$ were used in [GKK + 19]. As discussed in [BDMD + 20], some of these large S-boxes are even cheaper for relevant scenarios such as side-channel masking. As shown in Table 2, with $n = 11$, if we target a 352-bit P-SPN (i.e., $w = 32$), then linear layers for P-SPNs with rates ranging from 1/2 to 1/32 can be constructed. We omit the calculations for various other meaningful cases and only summarize some (im)possibilities in Table 2.
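As a quick sanity check of the example parameters, the sketch below evaluates only the precondition $\rho w \le 2^n$ of Theorem 5. The additional requirement that $\rho$ divides $w$ is an assumption of this sketch, and the function name is hypothetical. Table 2 may rest on further conditions that are not modeled here.

```python
def theorem5_condition(n, w, rho):
    """True if rho * w <= 2^n (Theorem 5) and rho divides w (sketch assumption)."""
    return w % rho == 0 and rho * w <= 2 ** n

n, w = 11, 32
block_bits = n * w  # 11-bit S-boxes, 32 words: 352-bit block
feasible = [rho for rho in range(2, w + 1) if theorem5_condition(n, w, rho)]
print(block_bits, feasible)  # 352 [2, 4, 8, 16, 32]
```

With 4-bit S-boxes and the same $w$, no $\rho \ge 2$ passes the check, which illustrates why larger $n$ enlarges the design space.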

Conclusion
We provide the first systematic provable security analysis of SP networks with partial non-linear layers (P-SPNs), regarding SPRP security and provable security against impossible differential and zero-correlation linear attacks. For P-SPNs with rate $r < 1/2$, $r^{-1} \in \mathbb{N}$, we also propose the first dedicated linear layers, which consist of $r^{-1} - 1$ different transformations and ensure at least one active S-box in $r^{-1}$ rounds. Our results justify P-SPNs as a sound approach that is comparable to, or even surpasses, normal SPNs in some well-defined sense.
We leave several open problems as follows.
Christian Rechberger, Markus Schofnegger, Qingju Wang, as well as the anonymous reviewers for their invaluable comments.
We have also found plenty of candidates for n = 11 and w up to 32, which are however omitted for the sake of space.
Facing the difficulty w.r.t. 4 rounds, we resort to more rounds for better readability. In fact, 6 rounds are needed to ensure that the $q_C$ construction queries induce $w q_C$ equations on two "fixed" middle rounds, i.e., in the 3rd and 4th rounds. Using a slightly more sophisticated idea as in Sect. 3.2, we tried to achieve $w q_C$ equations in the 2nd, 3rd, and 4th rounds depending on the properties of the construction queries in question. This enables a (more involved) proof with 5 rounds.

D Differential Security
As mentioned in the Introduction, our conclusions on provable security against differential attacks are mainly negative: some trivial security lower bounds are indeed tight.
First, recall that Theorem 5 shows that $\rho$ rounds are needed for at least 1 active S-box. One naturally asks whether non-trivial lower bounds on the number of active S-boxes can be proved. Unfortunately, we find this impossible.
Second, consider $r \ge 1/3$. Using MDS linear layers, it is easy to see that the number of active S-boxes in 2-round differential characteristics is at least $w + 1 - 2w(1 - r)$, which is tight. For 3-round characteristics, $w + 1 - 2w(1 - r) + wr = (3r - 1)w + 1$ is a trivial lower bound for the number of active S-boxes. Unfortunately, this is also tight.
Theorem 7. For rate $r$ P-SPNs, there always exist 3-round differential characteristics with at most $(3r - 1)w + 1$ active S-boxes, even if two linear layers $T_1$, $T_2$ are used.
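The two trivial bounds above are easy to evaluate for concrete parameters. The following sketch uses hypothetical helper names and exact rational arithmetic, and it confirms the simplification $w + 1 - 2w(1 - r) + wr = (3r - 1)w + 1$ for a few rates with $w = 32$ (an example value, not prescribed by the theorem).

```python
from fractions import Fraction

def two_round_bound(w, r):
    """Lower bound w + 1 - 2w(1 - r) on active S-boxes over 2 rounds."""
    return w + 1 - 2 * w * (1 - r)

def three_round_bound(w, r):
    """Trivial 3-round bound: the 2-round bound plus the wr S-boxes of one more round."""
    return two_round_bound(w, r) + w * r

w = 32
for r in (Fraction(1, 2), Fraction(3, 4), Fraction(1)):
    # The closed form from the text agrees with the sum of the two terms.
    assert three_round_bound(w, r) == (3 * r - 1) * w + 1
    print(r, two_round_bound(w, r), three_round_bound(w, r))
```

For $r = 1$ the 2-round value reduces to $w + 1$, the familiar bound for a full SPN with an MDS layer.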