On the Security of Sponge-type Authenticated Encryption Modes

The sponge duplex is a popular mode of operation for constructing authenticated encryption schemes. Its popularity is evident from the fact that around 25 of the 56 round 1 submissions to the ongoing NIST lightweight cryptography (LwC) standardization process are based on this mode. Among these, 14 sponge-type constructions were selected for the second round, which consists of 32 submissions. In this paper, we generalize the duplexing interface of the duplex mode into a construction that we call Transform-then-Permute. It encompasses Beetle as well as a new sponge-type mode, SpoC (both are round 2 submissions to NIST LwC). We show a tight security bound for Transform-then-Permute based on a $b$-bit permutation, which reduces to finding an exact estimate of the expected number of multi-chains (defined in this paper). As a corollary of our general result, the authenticated encryption advantage of Beetle and SpoC is about $T(D + r2^r)/2^b$, where $T$, $D$ and $r$ denote the number of offline queries (related to the time complexity of the attack), the number of construction queries (related to data complexity) and the rate of the construction (related to efficiency), respectively. Previously, the same bound had been proved for Beetle under the limitation that $T \ll \min\{2^r, 2^{b/2}\}$ (which compels one to choose a larger permutation with a higher rate). In the context of the NIST LwC requirements, SpoC based on a 192-bit permutation achieves the desired security with a 64-bit rate, which is not achieved by either duplex or Beetle (as per the previous analysis).


Introduction
The Sponge function was first proposed by Bertoni et al. at the ECRYPT Hash Workshop [7] as a mode of operation for variable-output-length hash functions. It received instant attention due to NIST's SHA-3 competition, which had several candidates based on the sponge paradigm. Most notably, JH [31] and Keccak [12] were among the five finalists, and Keccak became the eventual winner. In time, the Sponge mode found applications in message authentication [7,11], pseudorandom sequence generation [9], and the duplex mode [10] for authenticated encryption. In particular, the recently concluded CAESAR competition for the development of authenticated encryption (AE) schemes received a dozen sponge-based submissions. Ascon [18], a winner in the lightweight applications (resource-constrained environments) use case of the CAESAR competition, also uses the duplex mode of authenticated encryption.
The Sponge construction is also one of the go-to modes of operation for designing lightweight cryptographic schemes. This is evident from the design of hash functions such as Quark [2], PHOTON [20], and SPONGENT [13], and authenticated encryption schemes such as Ascon [18] and Beetle [14]. In fact, a majority of the submissions to the ongoing NIST lightweight cryptography standardization process are inspired by the Sponge paradigm.
At a very high level, Sponge-type constructions consist of a b-bit state, which is split into a c-bit inner state, called the capacity, and an r-bit outer state, called the rate, where b = c + r. Traditionally, in Sponge-like modes, data absorption and squeezing are done via the rate part, i.e. r bits at a time. SpoC [1], a round 2 submission to the NIST Lightweight Cryptography (LwC) standardization process, is a notable exception, where the absorption is done via the capacity part and the squeezing is done via the rate part. In [8], Bertoni et al. proved that the Sponge construction is indifferentiable from a random oracle with a birthday-type bound in the capacity. While it is well known that this bound is tight for hashing, for keyed applications of the Sponge, especially authenticated encryption schemes such as the duplex mode, it seems that the security could be significantly higher.

Existing Security Bounds for Sponge-type AE Schemes
Sponge-type authenticated encryption is mostly done via the duplex construction [10]. The duplex mode is a stateful construction that consists of an initialization interface and a duplexing interface. Initialization creates an initial state using the underlying permutation π, and each duplexing call to π absorbs and squeezes r bits of data. The security of Sponge-type AE modes can be represented and understood in terms of two parameters, namely the data complexity D (total number of initialization and duplexing calls to π) and the time complexity T (total number of direct calls to π). Initially, Bertoni et al. [10] proved that duplex is as strong as Sponge, i.e. secure up to $DT \ll 2^{c}$. Mennink et al. [25] introduced the full-state duplex and proved that this variant is secure up to $DT \ll 2^{\kappa}$, $D \ll 2^{c/2}$, where κ is the key size. Jovanovic et al. [21] proved privacy security up to $DT \ll 2^{b}$, $D \ll \min\{2^{b/2}, 2^{\kappa}\}$, $T \ll \min\{2^{b/2}, 2^{c-\log_2 r}, 2^{\kappa}\}$, and integrity security up to $DT \ll 2^{c}$, $D \ll \min\{2^{c/2}, 2^{\kappa}, 2^{\tau}\}$, $T \ll \min\{2^{b/2}, 2^{c-\log_2 r}, 2^{\kappa}\}$, where τ denotes the tag size. Note that the integrity security has the additional restriction that $D \ll 2^{c/2}$, where D is dominated by the decryption data complexity. Daemen et al. [17] gave a generalization of duplex that has built-in multi-user security. Very recently, a tight privacy analysis was provided in [22]. However, one of the dominating terms present in all of the existing integrity analyses of duplex authenticated encryption is $DT/2^{c}$. Moreover, a forgery attack with a matching bound is not known. A recent variant of the duplex mode, called the Beetle mode of operation [14], modifies the duplexing phase by introducing a combined feedback based absorption/squeezing, similar to the feedback paradigm of CoFB [15]. In [14], Chakraborti et al. showed that feedback based duplexing actually helps in improving the security bound, mainly by getting rid of the term $DT/2^{c}$.
They showed privacy security up to $DT \ll 2^{b}$, $D \ll 2^{b/2}$, $T \ll 2^{c}$, and integrity security up to $DT \ll 2^{b}$, $D \ll \min\{2^{b/2}, 2^{c-\log_2 r}, 2^{r}\}$, $T \ll \min\{2^{c-\log_2 r}, 2^{r}, 2^{b/2}\}$, with the assumptions that κ = c and τ = r.

Security of Sponge-type AE in Light of NIST LwC Requirements
In NIST's LwC call for submissions, it is mentioned that the primary AE version should have at least a 128-bit key, at least a 96-bit nonce, at least a 64-bit tag, data complexity $2^{50} - 1$ bytes, and time complexity $2^{112}$. In order to satisfy these requirements, a traditional duplex-based scheme must have a capacity of at least 160 bits. All sponge-type submissions to the NIST LwC standardization process use at least 192-bit capacity, except CLX [32], for which no security proof is available.
On the other hand, the known bound for Beetle imposes certain limitations on the state size and rate. Specifically, Beetle-based schemes require approximately 120-bit capacity and approximately 120-bit rate to achieve NIST LwC requirements. This means that we need a permutation of size at least 240 bits. In light of the ongoing NIST LwC standardization, it would be interesting to see whether these limitations can be relaxed for Beetle.

Our Contributions
In this paper, inspired by the NIST LwC requirements, we extend a long line of research on the security of Sponge-type AE schemes. We study a Sponge-type AEAD construction with a generalization of the feedback function used in the duplexing interface, which encompasses the feedback used in duplex, Beetle, SpoC, etc. We show that for a class of feedback functions, containing those of the Beetle and SpoC modes, optimal AE security is achieved. To be specific, we show that the AE security of this generalized construction is bounded by the adversary's ability to construct a special data structure, called a multi-chain. We also show a matching attack exploiting multi-chains. As a corollary, we give (1) an improved and tight bound for Beetle, and (2) a security proof validating the security claims of SpoC.
Notably, we show that both Beetle and SpoC achieve the NIST LwC requirements with just 128-bit capacity and ≥ 32-bit rate. In other words, they achieve the NIST LwC requirements with just a 160-bit state, which to the best of our knowledge is the smallest state size among all known sponge-like constructions that are proven to be secure.

Organization of the Paper
In section 2 we define different notations used in the paper. We give a brief description of the design and security models of AEAD. We also give a brief description of coefficient H technique [26,27]. In section 3 we state some multicollision results with proofs which are used in the paper. In section 4 we define what we call the multi-chain structure and give an upper bound on the expected number of multi-chains that can be formed by an adversary in a special case. In section 5 we study a Sponge-type AEAD construction called Transform-then-Permute with a generalization of the feedback function used in the duplexing interface. We give a tight security bound for the special case when the feedback function is invertible. We show that the generalization encompasses the feedback functions used in Sponge AE, Beetle, SpoC etc. Particularly, Beetle and SpoC modes fall under the class where the feedback function is invertible and hence for those modes optimal AEAD security is achieved. In section 6 using the multi-chain security game from section 4 we give a complete security proof of the AEAD security bound given in theorem 3. Finally, in section 7 we give some attack strategies to justify the tightness of our bound.

Preliminaries
Notational Setup: For $n \in \mathbb{N}$ and any bit string $x$ with $|x| \geq n$, $\lceil x \rceil_n$ (resp. $\lfloor x \rfloor_n$) denotes the most (resp. least) significant $n$ bits of $x$. For $n, k \in \mathbb{N}$ such that $n \geq k$, we define the falling factorial $(n)_k := n!/(n-k)! = n(n-1)\cdots(n-k+1)$.
For $q \in \mathbb{N}$, $x^q$ denotes the $q$-tuple $(x_1, x_2, \ldots, x_q)$. For $q \in \mathbb{N}$ and any set $\mathcal{X}$, $(\mathcal{X})_q$ denotes the set of all $q$-tuples with distinct elements from $\mathcal{X}$. Two distinct strings $a = a_1 \ldots a_m$ and $b = b_1 \ldots b_{m'}$ are said to have a common prefix of length $n \leq \min\{m, m'\}$ if $a_i = b_i$ for all $i \in (n]$, and $a_{n+1} \neq b_{n+1}$. For a finite set $\mathcal{X}$, $X \leftarrow_{\$} \mathcal{X}$ denotes the uniform sampling of $X$ from $\mathcal{X}$, independent of all previously sampled random variables. $X \leftarrow_{\mathrm{wor}} \mathcal{X}$ denotes uniform sampling of $X$ from $\mathcal{X}$ without replacement.
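The notation above can be made concrete with a few small helpers. This is only an illustrative sketch: the function names are hypothetical, bit strings are modeled as Python strings of '0'/'1', and the leftmost character is assumed to be the most significant bit.

```python
from math import perm

def msb(x, n):
    """Most significant n bits of the bit string x (leftmost n characters)."""
    if not 0 <= n <= len(x):
        raise ValueError("need 0 <= n <= |x|")
    return x[:n]

def lsb(x, n):
    """Least significant n bits of the bit string x (rightmost n characters)."""
    if not 0 <= n <= len(x):
        raise ValueError("need 0 <= n <= |x|")
    return x[len(x) - n:]

def falling_factorial(n, k):
    """(n)_k = n!/(n-k)! = n(n-1)...(n-k+1), defined for n >= k >= 0."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    return perm(n, k)

def common_prefix_length(a, b):
    """Length of the longest common prefix of two strings."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return n

print(msb("10110", 2), lsb("10110", 3), falling_factorial(5, 2))
```

For two distinct strings, `common_prefix_length` returns the unique n for which the first n symbols agree and the next symbol (if both strings have one) differs.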

Authenticated Encryption: Definition and Security Model
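Authenticated Encryption with Associated Data: An authenticated encryption scheme with associated data functionality, or AEAD in short, is a tuple of algorithms AE = (E, D), defined over the key space $\mathcal{K}$, nonce space $\mathcal{N}$, associated data space $\mathcal{A}$, message space $\mathcal{M}$, ciphertext space $\mathcal{C}$, and tag space $\mathcal{T}$, where $E : \mathcal{K} \times \mathcal{N} \times \mathcal{A} \times \mathcal{M} \to \mathcal{C} \times \mathcal{T}$ and $D : \mathcal{K} \times \mathcal{N} \times \mathcal{A} \times \mathcal{C} \times \mathcal{T} \to \mathcal{M} \cup \{\bot\}$. Here, E and D are called the encryption and decryption algorithms, respectively, of AE. Further, it is required that $D(K, N, A, E(K, N, A, M)) = M$ for any $(K, N, A, M) \in \mathcal{K} \times \mathcal{N} \times \mathcal{A} \times \mathcal{M}$. For every key $K \in \mathcal{K}$, we write $E_K(\cdot)$ and $D_K(\cdot)$ to denote $E(K, \cdot)$ and $D(K, \cdot)$, respectively. In this paper, we have $\mathcal{K}, \mathcal{N}, \mathcal{A}, \mathcal{M}, \mathcal{T} \subseteq \{0,1\}^{+}$ and $\mathcal{C} = \mathcal{M}$, so we use $\mathcal{M}$ instead of $\mathcal{C}$ wherever necessary.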
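AEAD Security in the Random Permutation Model: Let $\Pi \leftarrow_{\$} \mathrm{Perm}(b)$. Let Func denote the set of all functions from $\mathcal{N} \times \mathcal{A} \times \mathcal{M}$ to $\mathcal{M} \times \mathcal{T}$ such that for any input $(*, *, M)$ the output is of length $|M| + t$ for some predefined constant $t$, and let $\Gamma \leftarrow_{\$} \mathrm{Func}$. Let $\bot$ denote the degenerate function from $(\mathcal{N}, \mathcal{A}, \mathcal{M}, \mathcal{T})$ to $\{\bot\}$. For brevity, we denote the oracle corresponding to a function (like E, Π etc.) by that function itself. Bidirectional access to Π is denoted by the superscript ±.
Definition 1. Let $\mathsf{AE}^{\Pi}$ be an AEAD scheme, based on the random permutation Π, defined over $(\mathcal{K}, \mathcal{N}, \mathcal{A}, \mathcal{M}, \mathcal{T})$. The AEAD advantage of any nonce-respecting adversary $\mathscr{A}$ against $\mathsf{AE}^{\Pi}$ is defined as
$$\mathbf{Adv}^{\mathrm{aead}}_{\mathsf{AE}^{\Pi}}(\mathscr{A}) := \left| \Pr_{K \leftarrow_{\$} \mathcal{K},\, \Pi^{\pm}}\left[\mathscr{A}^{E_K, D_K, \Pi^{\pm}} = 1\right] - \Pr_{\Gamma,\, \Pi^{\pm}}\left[\mathscr{A}^{\Gamma, \bot, \Pi^{\pm}} = 1\right] \right|.$$
Here $\mathscr{A}^{E_K, D_K, \Pi^{\pm}}$ denotes $\mathscr{A}$'s response after its interaction with $E_K$, $D_K$, and $\Pi^{\pm}$. Similarly, $\mathscr{A}^{\Gamma, \bot, \Pi^{\pm}}$ denotes $\mathscr{A}$'s response after its interaction with Γ, ⊥, and $\Pi^{\pm}$.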
In this paper, we assume that the adversary is nonce-respecting, i.e. it never makes more than one encryption query with the same nonce. We further assume that the adversary is non-trivial, i.e. it never makes a duplicate query, and it never makes a query for which the response is already known due to some previous query. We use the following notations to parameterize the adversary's resources:
• $q_e$ and $q_d$ denote the number of queries to $E_K$ and $D_K$, respectively. $\sigma_e$ and $\sigma_d$ denote the total number of blocks of input (associated data and message) across all encryption and decryption queries, respectively, where (informally) the number of blocks per query is determined by the total number of primitive calls required to process the input (see Section 5.1 for the formal definition). We sometimes also write $q = q_e + q_d$ and $\sigma = \sigma_e + \sigma_d$ to denote the combined construction query resources, which can be interpreted as the online or data complexity D from Section 1.
• $q_f$ and $q_b$ denote the number of queries to $\Pi^{+}$ and $\Pi^{-}$, respectively. We sometimes also use $q_p = q_f + q_b$ to denote the combined primitive query resources, which can be interpreted as the offline or time complexity T from Section 1.
Any adversary that adheres to the above-mentioned resource constraints is called a $(q_p, q_e, q_d, \sigma_e, \sigma_d)$-adversary or simply a $(q_p, \sigma)$-adversary.

Coefficient H Technique
Consider a computationally unbounded and deterministic adversary $\mathscr{A}$ that tries to distinguish the real oracle, say $O_1$, from the ideal oracle, say $O_0$. We denote the query-response tuple of $\mathscr{A}$'s interaction with its oracle by a transcript ω. Sometimes, this may also include any additional information that the oracle chooses to reveal to the distinguisher at the end of the query-response phase of the game. We will consider this extended definition of transcript. We denote by $\Theta_1$ (resp. $\Theta_0$) the random transcript variable when $\mathscr{A}$ interacts with $O_1$ (resp. $O_0$). The probability of realizing a given transcript ω in the security game with an oracle O is known as the interpolation probability of ω with respect to O. Since $\mathscr{A}$ is deterministic, this probability depends only on the oracle O and the transcript ω. A transcript ω is said to be attainable if $\Pr[\Theta_0 = \omega] > 0$. In this paper, $O_1 = (E_K, D_K, \Pi^{\pm})$, $O_0 = (\Gamma, \bot, \Pi^{\pm})$, and the adversary is trying to distinguish $O_1$ from $O_0$ in the AEAD sense. Now we state a simple yet powerful tool due to Patarin [26], known as the coefficient H technique (or simply the H-technique).
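Theorem 1 (H-technique [26,27]). Let Ω be the set of all transcripts. For some $\epsilon_{\mathrm{bad}}, \epsilon_{\mathrm{ratio}} > 0$, suppose there is a set $\Omega_{\mathrm{bad}} \subseteq \Omega$ satisfying the following:
• $\Pr[\Theta_0 \in \Omega_{\mathrm{bad}}] \leq \epsilon_{\mathrm{bad}}$;
• for any $\omega \notin \Omega_{\mathrm{bad}}$, ω is attainable and $\Pr[\Theta_1 = \omega] / \Pr[\Theta_0 = \omega] \geq 1 - \epsilon_{\mathrm{ratio}}$.
Then for any adversary $\mathscr{A}$, we have the following bound on its AEAD distinguishing advantage:
$$\mathbf{Adv}^{\mathrm{aead}}(\mathscr{A}) \leq \epsilon_{\mathrm{bad}} + \epsilon_{\mathrm{ratio}}.$$
A proof of this theorem is available in multiple papers including [27,16,24].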
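To see how the two quantities in the H-technique combine, the following toy sketch checks the standard form of the bound (advantage at most the bad-transcript probability in the ideal world plus the worst-case ratio deficit on good transcripts) on a hand-made four-transcript example. The distributions, the bad set and all function names are hypothetical and chosen only for illustration; the statistical distance between the two transcript distributions is used as the best possible distinguishing advantage.

```python
from fractions import Fraction

def h_technique_terms(p_real, p_ideal, bad):
    """Return (eps_bad, eps_ratio) for a chosen set of bad transcripts.

    p_real / p_ideal map each transcript to its interpolation probability
    in the real / ideal world.
    """
    zero = Fraction(0)
    eps_bad = sum((p_ideal.get(w, zero) for w in bad), zero)
    eps_ratio = zero
    for w in set(p_real) | set(p_ideal):
        if w in bad:
            continue
        p0 = p_ideal.get(w, zero)
        if p0 == 0:
            raise ValueError("every good transcript must be attainable")
        eps_ratio = max(eps_ratio, 1 - p_real.get(w, zero) / p0)
    return eps_bad, eps_ratio

def statistical_distance(p, q):
    """Half the L1 distance between two finite distributions."""
    zero = Fraction(0)
    keys = set(p) | set(q)
    return sum((abs(p.get(k, zero) - q.get(k, zero)) for k in keys), zero) / 2

# Toy example: four transcripts, one of which is declared bad.
ideal = {w: Fraction(1, 4) for w in "abcd"}
real = {"a": Fraction(1, 2), "b": Fraction(1, 4),
        "c": Fraction(3, 16), "d": Fraction(1, 16)}
eps_bad, eps_ratio = h_technique_terms(real, ideal, bad={"d"})
print(statistical_distance(real, ideal), "<=", eps_bad + eps_ratio)
```

The bound is usually far from tight on such toy inputs; its value in the proofs of this paper is that the two terms can be estimated separately.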

Some Results on Multicollision
In this section we briefly revisit some useful results on the expected value of the maximum multicollision in a random sample. This problem has seen a lot of interest (see for instance [19,3,30,29]) in the context of the complexity of hash table probing. However, most of the results available in the literature are given in asymptotic form. We state some relevant results in a more concrete form, following similar proof strategies and probability calculations as before. Moreover, we also extend these results to samples which, although not uniform, have high entropy, close to uniform.

Expected Maximum Multicollision in a Uniform Random Sample
Let $X_1, \ldots, X_q \leftarrow_{\$} \mathcal{D}$ where $|\mathcal{D}| = N$ and $N \geq 2$. We denote the maximum multicollision random variable for the sample as $\mathsf{mc}_{q,N}$. More precisely, $\mathsf{mc}_{q,N} = \max_a |\{i : X_i = a\}|$.
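For any integer $\rho \geq 2$,
$$\Pr[\mathsf{mc}_{q,N} \geq \rho] \;\leq\; \sum_{a \in \mathcal{D}} \Pr\big[|\{i : X_i = a\}| \geq \rho\big] \;\leq\; N \cdot \frac{\binom{q}{\rho}}{N^{\rho}} \;\leq\; N \cdot \left(\frac{qe}{\rho N}\right)^{\rho}. \qquad (2)$$
We justify the inequalities in the following way: The first inequality is due to the union bound. If there are at least ρ indices for which $X_i$ takes value a, we can choose the first ρ indices in $\binom{q}{\rho}$ ways. This justifies the second inequality. The last inequality follows from the simple observation that $\binom{q}{\rho} \leq q^{\rho}/\rho! \leq (qe/\rho)^{\rho}$.
For any positive integer valued random variable Y bounded above by q, we define another random variable $Y'$ as $Y' := \rho - 1$ if $Y < \rho$, and $Y' := q$ otherwise. Clearly $Y \leq Y'$, and hence
$$\mathrm{Ex}[Y] \leq \mathrm{Ex}[Y'] \leq (\rho - 1) + q \cdot \Pr[Y \geq \rho].$$
Using Eq. (2) and the above relation, we can prove the following results for the expected value of the maximum multicollision. We write $\mathsf{mcoll}(q, N)$ to denote $\mathrm{Ex}[\mathsf{mc}_{q,N}]$. So from the above relation,
$$\mathsf{mcoll}(q, N) \leq (\rho - 1) + qN \cdot \left(\frac{qe}{\rho N}\right)^{\rho}$$
for all positive ρ. We use this relation to prove an upper bound on $\mathsf{mcoll}(q, N)$ by plugging in a suitable value for ρ.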
Proof. We first prove the result when q = N . A simple algebra shows that for n ≥ 2, When $q \geq nN$, we can group the samples into $\lceil q/(nN) \rceil$ groups, each of size exactly $nN$ (we can add more samples if required). This proves the result when $q \geq nN$.
Remark 1. Note that a bound similar to the one in Proposition 1 can be achieved in the case of non-uniform sampling. For example, suppose we sample $X_1, \ldots, X_q \leftarrow_{\mathrm{wor}} \{0,1\}^b$ and then define $Y_i = \lceil X_i \rceil_r$ for some $r < b$. In this case, for any distinct indices $i_1, \ldots, i_\rho$ and any $a \in \{0,1\}^r$, we have
$$\Pr[Y_{i_1} = a, \ldots, Y_{i_\rho} = a] = \frac{(2^{b-r})_{\rho}}{(2^{b})_{\rho}} \leq \frac{1}{2^{r\rho}}.$$
This can be easily justified, as we have to choose the remaining $b - r$ bits to be distinct (since $X_1, \ldots, X_q$ must be distinct). So, the same bound as given in Proposition 1 can be applied to this distribution.
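The expected maximum multicollision is easy to estimate empirically. The sketch below compares a Monte Carlo estimate with a union-bound style estimate of the form $(\rho - 1) + qN(qe/(\rho N))^{\rho}$, minimized over ρ. The exact shape of this estimate, the function names, the trial count and the seed are assumptions made for illustration and are not the concrete statement of Proposition 1.

```python
import random
from collections import Counter
from math import e

def max_multicollision(samples):
    """mc = max_a |{i : X_i = a}| for a finite sample."""
    return max(Counter(samples).values(), default=0)

def mcoll_upper_bound(q, N, rho):
    """(rho - 1) + q*N*(q*e/(rho*N))**rho, a union-bound style estimate."""
    return (rho - 1) + q * N * (q * e / (rho * N)) ** rho

def best_upper_bound(q, N):
    """Minimize the estimate over all integer rho in [2, q]."""
    if q < 2:
        raise ValueError("need q >= 2")
    return min(mcoll_upper_bound(q, N, rho) for rho in range(2, q + 1))

def empirical_mcoll(q, N, trials=200, seed=1):
    """Average maximum multicollision over independent uniform samples."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += max_multicollision(rng.randrange(N) for _ in range(q))
    return total / trials

q = N = 256
print("empirical:", empirical_mcoll(q, N))
print("estimate :", best_upper_bound(q, N))
```

For small parameters the estimate is loose by a constant factor, which is consistent with its role as an upper bound rather than an exact value.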

A Special Example of Non-uniform Random Sample
In this paper we consider the following non-uniform random samples. Let $x_1, \ldots, x_q$ be distinct and $y_1, \ldots, y_q$ be distinct $b$-bit strings. Let $\Pi$ denote the random permutation over $b$ bits, Notice that, since $\Pi$ is a random permutation, this probability is independent of the choice of $\{i_1, \ldots, i_\rho\}$ and $\{j_1, \ldots, j_\rho\}$. Hence, without loss of generality, we can assume that $i_k = j_k = k$. Let $N := 2^b$. We also assume $a = 0^b$, since otherwise we can consider $\Pi'(x) = \Pi(x) \oplus a$, which is also a random permutation, and consider Note that the $y_i$'s are clearly distinct. So the problem reduces to bounding We say that $c^\rho$ is valid if $c_i = x_j$ if and only if $c_j = y_i$. The set of all such valid tuples is denoted as $V$. For any valid $c^\rho$, define $S$ : On the other hand, if $c^\rho$ is not valid then the above probability is zero. Let $V_s$ be the set of all valid tuples for which $|S| = s$.
If |S| = 2ρ − k, then we must have exactly k many pairs (i 1 , j 1 ), . . . (i k , j k ) such that c i = x j . Now the number of ways this k-many pairs can be chosen is bounded by ρ 2k . The remaining ρ − k many c i 's can be chosen in [q] is of size q 2 , we denote the maximum multicollision random variable for the sample as mc q 2 ,N . Then we have by a similar analysis as in the previous section, We write mcoll (q 2 , N ) to denote Ex mc q 2 ,N . So from the above relation, Now for q 2 ≥ N n 2 we can group them into n 2 q 2 N samples each of size exactly N n 2 (we can add more samples if required). This would prove the bounds.

Multi-chain Security Game
In this section we consider a new security game, which we call the multi-chain security game. In this game, an adversary A interacts with a random permutation and its inverse. Its goal is to construct multiple walks having the same labels. We first describe some notation required to define the security game. Let L be a linear function over b bits. Given such a list we define a labeled directed graph G L L over the set of vertices range(L) ⊆ {0, 1} b as follows:

The Multi-Chain Structure
We can similarly extend this to a label walk W from a node w 0 to w k as We simply denote it as w 0 Here k is the length of the walk. We simply denote the directed graph G L L by G L wherever the linear function L is understood from the context.

Definition 2.
Let L be a fixed linear function over b bits. Let r, τ ≤ b be some parameters. We say that a set of labeled walks {W 1 , . . . , W p } forms a multi-chain with a label x : We also say that the multi-chain is of length k. The maximum size of the set of multi-chain of length k (with some label x) is denoted as W k . Thus, for a fixed linear function L, W k is completely determined by L. Now we describe how the list L is being generated through an interaction of an adversary A and a random permutation.

The Multi-Chain Advantage
Consider an adversary A interacting with a b-bit random permutation Π ± . Suppose, the adversary A makes at most t many interactions with Π ± . Let (x i , dir i ) denote ith query where x i ∈ {0, 1} b and dir i is either + or − (representing forward or inverse query). If dir i = +, it gets response y i as Π(x i ), else the response y i is set as ) which only stores the information about the random permutation. For the sake of simplicity we assume that adversary makes no redundant queries and so all u 1 , . . . u t are distinct and v 1 , . . . , v t are distinct. For a linear function L consider the directed graph G θ . For any k, we have already defined W k . Now we define the maximum multi-chain advantage as

Bounding µ t for Invertible L Functions
In this section, we derive concrete bounds for µ t under a special assumption that the underlying feedback function is invertible.

Proof of Theorem 2
We first make the following observation which is straightforward as L is invertible.
We now describe some notations related to multi-chain W k .
1. Let $W_{\mathrm{fwd},a}$ denote the size of the set $\{i : \mathrm{dir}_i = +, \lceil v_i \rceil_{\tau} = a\}$, and let $W_{\mathrm{fwd}} := \max_a W_{\mathrm{fwd},a}$. This denotes the maximum multicollision among the τ most significant bits of forward query responses.
2. Similarly, we define the multi-collision for backward query responses as follows: Let W bck,a denote the size of the set {i : dir i = −, v i r = a} and max a W bck,a is denoted as W bck .
3. In addition to the multicollisions in forward only and backward only queries, we consider multicollisions due to both forward and backward queries. Let W mitm,a denote size of the set

Lemma 1. For all possible interactions, we have
Proof. We can divide the set of multi-chains into three sets: Forward-only chains: Each chain is constructed by Π queries only. By definition, the size of such multi-chain is at most W fwd .
Backward-only chains: Each chain is constructed by Π − queries only. By definition, the size of such multi-chain is at most W bck .
Forward-backward chains: The multi-chain consists of at least one chain that uses both Π and Π − queries. Let us denote the size of such multi-chain by W fwd-bck k .
Then, we must have This can be easily argued by the pigeonhole principle, given Observation 1. The argument works as follows: For each individual chain W i , we have at least one index j ∈ [k] such that −), (−, +)}. Note that it is possible for the i-th chain to exist in multiple buckets. More importantly, it will exist in at least one bucket. As there are k buckets and w chains, by the pigeonhole principle, there must be one bucket j ∈ [k] that holds at least w/k chains.
Now we complete the proof of Theorem 2. Observe that $W_{\mathrm{fwd}}$ and $W_{\mathrm{bck}}$ are the random variables corresponding to the maximum multicollision in a truncated random permutation sample of size $t$, which corresponds to Remark 1 of Subsection 3.1. Further, $W_{\mathrm{mitm}}$ is the random variable corresponding to the maximum multicollision in a sum-of-random-permutation sample of size $t^2$, i.e., the distribution of Subsection 3.2. Now, using linearity of expectation, we have

Related work
In [23] Mennink analyzed the Key-prediction security of Keyed Sponge using a special type of data structure which is close to but different from our multi-chain structure. Here we give a brief overview of Mennink's work in our notations and describe how our structure is different from the structure considered by him.
We can similarly extend this to a label walk W from a node w 0 to w k as We simply denote it as w 0 . . . , x k ). Here k is the length of the walk. The set yield c,k (L) consists of all possible labels x such that there exists a k-length walk of the form 0 b x → w k in the graph G L . Consider the graph, G L . The configuration of a walk from w 0 to w k is defined as a tuple The use of tools like multi-collision and the similarity in the data structure of [23] with our multi-chain structure can be misleading. Here we try to discuss the difference between them and show that the underlying motivation behind both the problems are philosophically as different as possible.
Note that using the multi-chain structure, we try to bound the number of different walks with the same label and distinct starting points, whereas yield c,k (L) is the number of different walks with the same starting point, namely $0^b$, and distinct labels. Hence the multi-chain structure deals with a different problem than yield c,k (L). A notable change in our work is that we deal with the multicollision of a sum of two permutation calls (we call it the meet-in-the-middle multicollision, see the definition of $W_{\mathrm{mitm}}$). This computation is not as straightforward as the usual computation of the expectation of a multicollision (see Section 3.2).

Transform-then-Permute Construction
In this section we describe Transform-then-Permute (or TtP in short), which generalizes the duplexing method used in sponge AEAD and encompasses many other constructions such as Beetle, SpoC, etc.

Parameters and Components
We first describe some parameters of our wide family of AEAD algorithms.
1. State-size: The underlying primitive of the construction is a b-bit public permutation. We call b the state size of the permutation.
3. Nonce-size: In this paper we consider fixed-size nonces. Let ν denote the size of the nonce. In addition to parsing $N \| A$, we also parse a message or ciphertext $Z$ as $(Z_1, \ldots, Z_m) \xleftarrow{r} Z$ into $m$ blocks of size $r$, where $m = \lceil |Z|/r \rceil$. We define $t := a + m$ to be the total number of blocks corresponding to an input query of the form $(N, A, Z)$.
Domain Separation: To every pair of nonnegative integers $(|A|, |Z|)$ with $a = a(|A|)$, $m = \lceil |Z|/r \rceil$, and for every $0 \leq i \leq a + m$, we associate a small integer $\delta_i$ where We collect all these δ values through the following function: $\mathrm{DS}(|A|, |Z|) = (\delta_0, \delta_1, \ldots, \delta_{a+m})$.
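The block parsing described above can be sketched as follows. This is a minimal illustration in which bit strings are Python strings of '0'/'1', the last block is simply left short, and the number a of nonce/associated-data blocks is passed in as a plain integer; the function names are hypothetical, and any padding rule used by a concrete scheme is not modeled.

```python
def parse_blocks(z, r):
    """Split the bit string z into ceil(|z|/r) blocks of r bits.

    The final block may be shorter than r bits; padding is left to the
    concrete scheme and is not modeled here.
    """
    if r <= 0:
        raise ValueError("block size r must be positive")
    return [z[i:i + r] for i in range(0, len(z), r)]

def total_blocks(a, z_len, r):
    """t = a + m with m = ceil(z_len / r), using integer arithmetic."""
    if r <= 0:
        raise ValueError("block size r must be positive")
    m = -(-z_len // r)  # ceiling division
    return a + m

blocks = parse_blocks("1011001", 3)
print(blocks, total_blocks(2, 7, 3))
```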
Feedback Functions: We also need some linear functions $L_{ad}, L_e : \{0,1\}^b \to \{0,1\}^b$, which are used to process associated data and message, respectively, in an encryption algorithm. Now, given a linear function L : x , is used to process the j-th block Z (either a plaintext or a ciphertext) using the output Y of the previous invocation of the random permutation:

The Description of Transform-then-Permute AEAD
We describe the Transform-then-Permute algorithm in Algorithm 2, which generalizes the duplexing method used in sponge AEAD.

Security Analysis of TtP
We prove the following result on the AE security of Transform-then-Permute when the linear functions $L_{d,i}$ and $L_e$ are invertible for all $1 \leq i \leq r$. Let $q_p$, $q_e$ and $q_d$ denote the number of primitive, encryption and decryption queries, respectively, made by an adversary, and let $\sigma_e$ and $\sigma_d$ denote the total number of data blocks processed, including nonce, associated data and message, in those encryption and decryption queries, respectively.

Theorem 3 (main theorem). Let TtP be a construction where $L_{d,i}$ for all $i \in [r]$ and $L_e$ are invertible. For any $(q_p, q_e, q_d, \sigma_e, \sigma_d)$-adversary $\mathscr{A}$, we have

How to Convert a Generalized Sponge-type Constructions to TtP
In this section we describe why Transform-then-Permute captures a wide class of permutation-based sequential constructions in which the only non-linear operation lies in the underlying permutation. Let $L : \{0,1\}^b \times \{0,1\}^r \to \{0,1\}^b \times \{0,1\}^r$ be any linear function defined by
$$L = \begin{pmatrix} L_{1,1} & L_{1,2} \\ L_{2,1} & L_{2,2} \end{pmatrix}.$$
Consider the Sponge-type construction which takes state input $X_i$ and data input $M_i$, and generates the data output $C_i$ and the next state input $X_{i+1}$ as follows, where $Y_i = \Pi(X_i)$:
$$X_{i+1} = L_{1,1} \cdot Y_i + L_{1,2} \cdot M_i, \qquad C_i = L_{2,1} \cdot Y_i + L_{2,2} \cdot M_i.$$
As $L_{2,1} \cdot Y + L_{2,2} \cdot M = C$, we must have $\mathrm{rank}(L_{2,2}) = r$; otherwise encryption is not a bijective function from the message space to the ciphertext space. For the sake of simplicity we can assume that $L_{2,2} = I_r$ (the identity matrix of size r). Otherwise, we can redefine the message block as $M' = L_{2,2} \cdot M$. Now we observe that $\mathrm{rank}(L_{2,1}) = r$. If not, then there exists a non-zero vector γ such that $\gamma \cdot L_{2,1} = 0$. Hence, $\gamma \cdot M = \gamma \cdot C$ holds with probability 1. In the case of an ideal permutation, as γ is non-zero and C is chosen uniformly and independently of M, this event occurs with probability $1/2$. Hence the privacy advantage of any adversary for such a construction will be $\geq 1/2$. As $\mathrm{rank}(L_{2,1}) = r$, there exists an invertible matrix $Z_{b \times b}$ such that $L_{2,1} \cdot Z = (I_r \;\; 0_{r \times (b-r)})$. Let $L_e = L_{1,1} \cdot Z$. Then by simple matrix algebra we have
$$\begin{pmatrix} X_{i+1} \\ C_i \end{pmatrix} = \begin{pmatrix} L_e & L_{1,2} \\ (I_r \;\; 0_{r \times (b-r)}) & I_r \end{pmatrix} \cdot \begin{pmatrix} Z^{-1} \cdot Y_i \\ M_i \end{pmatrix}.$$
Note that multiplication by an invertible matrix is a permutation, and the composition of a random permutation with a public permutation is again a random permutation. Hence, we can redefine the random permutation output as $Z^{-1} \cdot \Pi(X_i)$. Let us denote $\mathrm{encode}(M) = L_{1,2} \cdot M$; hence the general linear function based Sponge-type construction boils down to the construction TtP.
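The existence of the invertible matrix Z used in the conversion above can be made constructive with column operations over GF(2). The following is a sketch under stated assumptions: matrices are lists of 0/1 rows, the helper names are hypothetical, and the small matrix at the end is an arbitrary full-rank example rather than the feedback matrix of any real scheme.

```python
def gf2_matmul(A, B):
    """Multiply 0/1 matrices A (n x m) and B (m x p) over GF(2)."""
    return [[sum(a & b for a, b in zip(row, col)) & 1 for col in zip(*B)]
            for row in A]

def column_reduce(A):
    """Return an invertible b x b matrix Z with A.Z = [I_r | 0] over GF(2).

    A is an r x b matrix of full row rank r. Z is built as a product of
    elementary column operations, so it is invertible by construction.
    """
    r, b = len(A), len(A[0])
    W = [row[:] for row in A]                               # working copy of A
    Z = [[int(i == j) for j in range(b)] for i in range(b)]  # starts as I_b

    def swap_cols(M, j, k):
        for row in M:
            row[j], row[k] = row[k], row[j]

    def add_col(M, src, dst):
        for row in M:
            row[dst] ^= row[src]

    for i in range(r):
        # Columns >= i are zero in all earlier rows, so a pivot must be here.
        pivot = next((j for j in range(i, b) if W[i][j]), None)
        if pivot is None:
            raise ValueError("matrix does not have full row rank")
        if pivot != i:
            swap_cols(W, i, pivot)
            swap_cols(Z, i, pivot)
        for k in range(b):
            if k != i and W[i][k]:
                add_col(W, i, k)   # clears entry (i, k) of the working matrix
                add_col(Z, i, k)   # same operation recorded in Z
    return Z

L21 = [[1, 1, 0, 1],
       [0, 1, 1, 0]]               # r = 2, b = 4, rank 2
Z = column_reduce(L21)
print(gf2_matmul(L21, Z))
```

With such a Z, multiplying $L_{2,1}$ by $Z \cdot Y'$ simply returns the first r coordinates of $Y'$, which is the truncation that appears in the rewritten construction.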

New Improved Security of Beetle
In Beetle [14], the linear function L e is defined as L e (y

The secondary version of PHOTON-Beetle can be bounded by
In [15], the authors proved that for any (q p , q e , q d , σ e , σ d )-adversary A , The primary version of the PHOTON-Beetle [4] mode of AEAD has r = τ = c = 128 and b = 256. Comparing with the σ and q p values prescribed by NIST we have 2 r = 2 τ ≥ q p ≥ σ The secondary version of the PHOTON-Beetle [4] mode of AEAD has r = 32, c = 224, τ = 128 and b = 256. Comparing with the σ and q p values prescribed by NIST we have 2 τ ≥ q p ≥ σ, σ ≥ 2 r and 2 b ≥ b 2 q 2 p . By equation 4 the advantage of Beetle is bounded by qp 2 r−1 r . Hence for Beetle to be secure, r has to be large.
It can be noticed that the primary version of PHOTON-Beetle has r = 128 > 112. Hence by equation 4, it is secure within the NIST requirements.
For secondary version of PHOTON-Beetle, we have r = 32 < 112 and hence equation 4 fails to prove the security for this version under NIST requirements.
The major difference between our analysis and the analysis of [15] is that we use the expected number of multi-chains to bound the security of Beetle, whereas in [15] it was only done using the multicollision probability at the rate part. Hence our new bound is much tighter than the existing one. Now Corollary 1 follows from Theorem 3, Proposition 1 and Proposition 2. Further, using the relation that $\sigma \leq q_p$ (as per NIST LwC requirements), we can bound the advantage for the primary version as, and the secondary version as, Hence, by this new improved security bound, both the primary and the secondary versions of PHOTON-Beetle are secure under the NIST requirements.

Security of SpoC
In SpoC [1], the linear function $L_e$ is the identity, and the linear function $L_d$ is defined by the mapping $(x, y) \mapsto (x, (x \| 0^{c-r}) \oplus y)$, where $(x, y) \in \{0,1\}^r \times \{0,1\}^c$. Clearly the $L_e$ and $L_{d,i}$ functions are involutions, and hence invertible. Further, it is easy to check that they have full rank.
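The involution property of the mapping $(x, y) \mapsto (x, (x \| 0^{c-r}) \oplus y)$ can be checked exhaustively for small parameters. In the sketch below, x and y are modeled as non-negative integers of r and c bits, the padding places x in the most significant r bits of the c-bit word, and the function name and toy sizes are assumptions made for illustration.

```python
def spoc_feedback(x, y, r, c):
    """Map (x, y) to (x, (x || 0^(c-r)) XOR y) for an r-bit x and a c-bit y."""
    if c < r:
        raise ValueError("this sketch assumes c >= r")
    if not (0 <= x < 2 ** r and 0 <= y < 2 ** c):
        raise ValueError("x must fit in r bits and y in c bits")
    padded = x << (c - r)   # x followed by c - r zero bits
    return x, padded ^ y

# Exhaustive check on toy sizes: applying the map twice is the identity.
r, c = 3, 5
is_involution = all(
    spoc_feedback(*spoc_feedback(x, y, r, c), r, c) == (x, y)
    for x in range(2 ** r) for y in range(2 ** c)
)
print(is_involution)
```

Since the map is its own inverse, it is a bijection on $\{0,1\}^r \times \{0,1\}^c$, which matches the full-rank observation in the text.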

Corollary 2.
For any (q p , q e , q d , σ e , σ d )-adversary A , we have the primary version of SpoC can be bounded as, The primary version of SpoC mode of AEAD has r = τ = 64, b = 192. Using the NIST prescribed values of σ and q p we have σ < 2 r but 2 r = 2 τ ≤ q p and 2 b ≤ b 2 q 2 p . Hence corollary 2 follows from theorem 3, proposition 1 and proposition 2. Further using the relation that σ ≤ q p (as per NIST LwC requirements) we can bound the advantage as,

Interpretation of Corollary 1 and Corollary 2
Keeping in mind the NIST LwC requirement of time complexity $q_p = 2^{112}$ and data complexity $r\sigma = 2^{53}$, we try to find the smallest possible permutation under which the Beetle and SpoC modes can achieve security. For this discussion we ignore the constants appearing in the bounds on the advantage terms. We take $2^r \leq \sigma \leq q_p \leq 2^c$. We further assume that $\sigma \leq 2^{\tau} \leq q_p$ and $2^b \leq b^2 q_p^2$. Then by applying Proposition 1 and Proposition 2 in Corollary 1 or Corollary 2 we have, Hence, ignoring the constants, we conclude that, in the case of Beetle and SpoC, with rate r = 32 and permutation size b = 160, we achieve security close to the NIST LwC requirements.
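For a rough feel of the parameters discussed here, the approximate advantage stated in the abstract can be evaluated numerically. The sketch below reads that expression as $T(D + r2^r)/2^b$, ignores all constants, and counts the data complexity D in r-bit blocks so that $rD = 2^{53}$ bits; these readings, as well as the function name, are assumptions, and the output is an order-of-magnitude indication rather than a replacement for the corollaries.

```python
from math import log2

def log2_ae_estimate(T, D, r, b):
    """log2 of T*(D + r*2^r)/2^b, with T and D given as exact integers."""
    return log2(T * (D + r * 2 ** r)) - b

T = 2 ** 112              # time complexity (offline queries)
data_bits = 2 ** 53       # data complexity in bits, i.e. r * sigma
for b, r in [(192, 64), (160, 32)]:
    D = data_bits // r    # number of r-bit blocks
    print(f"b={b}, r={r}: log2(estimate) = {log2_ae_estimate(T, D, r, b):.3f}")
```

A value well below zero indicates a comfortable margin under this crude estimate, while a value near zero indicates parameters sitting at the boundary of what the estimate can guarantee.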

Security of Sponge
In the case of the original Sponge construction, the L d function is defined by Note that the $L_d$ function is not invertible. As described in Theorem 2, we have a bound for $\mu_{q_p}$ in the cases where $L_d$ is invertible or, more specifically, in the cases where Observation 1 holds. Hence the results of Theorem 2 cannot be applied to the original Sponge. However, since $L_e$ is invertible, with a similar analysis as in the case of TtP we get, Bounding $\mu_{q_p}$ in the case of Sponge is an interesting problem which is open to further research. However, it seems very hard to obtain a tight estimate of $\mu_{q_p}$ for Sponge AE. A straightforward estimate of $\mu_{q_p}$ leads to the known security bound of $\sigma_d q_p / 2^c$. So, as of now, the tight security bound of Sponge AE is still an open problem. However, our result helps in reducing the problem of finding a tight bound to solving a functional graph problem (the estimation of $\mu_{q_p}$). Functional graphs of random functions are well studied in the cryptanalysis of iterated hash functions and MACs [28,6,5]. It is quite possible that a similar approach may lead to a better understanding of the security of Sponge AE.

Proof of Theorem 3
The proof employs the coefficients H-technique of Theorem 1. To apply this method, we first need to describe the ideal world, which essentially tries to simulate the construction. The real world behaves exactly as the construction and is described later. For notational simplicity, we assume that the size of the nonce is at most $b - \kappa$. Later we mention how the proof can be extended when the nonce size exceeds $b - \kappa$. We also assume that the adversary makes exactly $q_p$, $q_e$ and $q_d$ primitive, encryption and decryption queries, respectively.

Ideal World and Real World
Online Phase of Ideal World. In the online phase, the ideal world responds to three types of queries, namely encryption queries, decryption queries and primitive queries.
(1) On Primitive Query $(W_i, dir_i)$: The ideal world simulates the $\Pi^{\pm}$ query honestly. In particular, if $dir_i = 1$, it sets

According to our convention, we assume that every decryption query is non-trivial. So the ideal world returns the abort symbol $M^*_i := \bot$.
Offline Phase of Ideal World. After completion of the oracle interaction (the above three types of queries, possibly interleaved), the ideal oracle sets $E$, $D$, $P$ to denote the sets of all query indices corresponding to encryption, decryption and primitive queries, respectively. So $E \sqcup D \sqcup P = [q_e + q_d + q_p]$ and $|E| = q_e$, $|D| = q_d$, $|P| = q_p$. Let the primitive transcript

The encryption transcript is $\omega_e = (X_{i,j}, Y_{i,j})_{i \in E,\, j \in [0..t_i]}$. So the transcript of the adversary consists of $\omega := (Q, \omega_p, \omega_e, \omega_d)$, where $Q := (Q_i)_{i \in E \cup D}$.
Real World. In the online phase, the AE encryption and decryption queries and the direct primitive queries are answered faithfully using $\Pi^{\pm}$. As in the ideal world, after completion of the interaction, the real world returns all $X$-values and $Y$-values corresponding to the encryption queries only. Note that a decryption query may return a message $M_i$ that is not $\bot$.

Bad Transcripts
We divide the bad transcripts into two main parts. We first define the bad events due to the encryption and primitive transcripts. These bad events say that (i) there is a collision among the inputs/outputs of $\omega_p$ and $\omega_e$, or (ii) there is a collision among the inputs/outputs of $\omega_e$. Hence, if no such collision occurs, all inputs and outputs are distinct and $\omega_e \cup \omega_p$ is permutation compatible (i.e., it can be realized by a random permutation). More formally, we define the following bad events:

Now we describe the bad event due to decryption queries. Suppose the bad events $(\mathrm{B1} \vee \cdots \vee \mathrm{B5})$ defined above, due to encryption and primitive queries, do not occur, i.e., $\omega_p \cup \omega_e$ is permutation compatible. Let $\Pi$ be the partially defined permutation over the domain of $\omega_p \cup \omega_e$ that maps each input to the corresponding range element. For each decryption query $Q_i$, we define $p_i$ as the largest index $j$ for which the input $X_j$ is in the domain of $\omega_e \cup \omega_p$ when we run the decryption algorithm on $Q_i$ using $\Pi$. Consider the case $p_i = t_i$, i.e., the complete computation of the decryption algorithm for the query is determined by the transcript $\omega_e \cup \omega_p$. In such a case, we call the transcript bad (event mBAD) if the corresponding tag also matches. Note that for such a transcript the real world would not abort the decryption query. We now define all bad events more formally.
Definition of $p'_i$. Before we define $p_i$, we first define $p'_i$, the input index that we can compute for the decryption query using only the encryption query transcript. Formally, $p'_i$ is defined as $-1$ if $N_{i'} \neq N^*_i$ for all $i' \in E$. Otherwise, there exists a unique $i' \in E$ such that $N_{i'} = N^*_i$ (as we consider nonce-respecting adversaries only). Let $p'_i + 1$ denote the length of the longest common prefix of $(D_{i',0}, \ldots, D_{i',t_{i'}})$ and $(D^*_{i,0}, \ldots, D^*_{i,t_i})$. Note that $p'_i = -1$ in case there is no common prefix.
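The auxiliary index defined above via the longest common prefix can be illustrated with a small sketch. This is only an illustration under simplifying assumptions: encryption queries are modelled as `(nonce, list_of_data_blocks)` pairs, nonces are unique across encryption queries (nonce-respecting adversary), and the names `common_prefix_index`, `enc_queries`, `nonce_star` and `data_star` are hypothetical.

```python
def common_prefix_index(enc_queries, nonce_star, data_star):
    """Return the auxiliary index for one decryption query.

    enc_queries : list of (nonce, data_blocks) pairs from encryption queries
    nonce_star  : nonce of the decryption query
    data_star   : list of data blocks of the decryption query

    The result is -1 if the nonce never appeared in an encryption query,
    otherwise (length of the longest common block prefix) - 1.
    """
    for nonce, data in enc_queries:
        if nonce == nonce_star:
            k = 0
            # Count matching leading blocks.
            while k < len(data) and k < len(data_star) and data[k] == data_star[k]:
                k += 1
            return k - 1
    return -1  # no encryption query used this nonce


enc = [("N1", ["a", "b", "c"]), ("N2", ["a", "x"])]
print(common_prefix_index(enc, "N1", ["a", "b", "z"]))  # two matching blocks
print(common_prefix_index(enc, "N3", ["a", "b", "c"]))  # unseen nonce
```

Note that a matching nonce with no matching first block and an unseen nonce both give $-1$, in line with the remark that the index is $-1$ when there is no common prefix.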
We now define $Y^*_{i,0..p'_i} = Y_{i',0..p'_i}$ and $X^*_{i,0..p'_i} = X_{i',0..p'_i}$ when $p'_i \geq 0$, and

By Lemma 2, $p'_i < t_i$ and $p'_i < t_{i'}$. By the definition of the longest common prefix, we have

Definition of $p_i$. If $p'_i < a_i$ or if $X^*_{i,p'_i+1} \notin \mathrm{domain}(\omega_p)$, define $p_i = p'_i$. Otherwise, we further extend the $X^*$-values and $Y^*$-values based on the primitive transcript $\omega_p$. Let $x_{i,j} := D^*_{i,j}$ for all $i \in D$, $1 \leq j \leq t_i$. If there is a labeled walk (in the labeled directed graph induced by $\omega_p$, as described in section 4) from $Y^*_{i,p'_i+1}$ with label $(x_{i,p'_i+2}, \ldots, x_{i,j})$, then we denote its end node by $Y^*_{i,j}$. In notation, we have

Let $p_i$ denote the maximum of all such possible $j$'s. For all those $i$ and $j$ for which $Y^*_{i,j}$ has been defined as described above, we define $X^*_{i,j+1}$:

Bad events due to decryption transcript:
mBAD: For some $i \in D$ with $p_i = t_i$ and $\lceil Y^*_{i,t_i} \rceil_\tau = T^*_i$.
We write BAD to denote the event that the ideal world transcript $\Theta_0$ is bad. Then, with a slight abuse of notation and by the union bound, we have

We postpone the proofs of Lemma 3 and Lemma 4 to subsection 6.5.

Good Transcript Analysis
The motivation for all the bad events becomes clear once we understand a good transcript (i.e., a transcript that is not bad). Let $\omega = (Q, \omega_p, \omega_e, \omega_d)$ be a good transcript. For notational simplicity, we ignore the query transcript $Q$, as it is not required to compute the probability of a transcript.
1. The set of tuples $\omega_e$ is permutation compatible and disjoint from $\omega_p$. So the union $\omega_e \cup \omega_p$ is also permutation compatible.
2. Let $D_1$ (type-1 decryption queries) be the set of all $i \in D$ such that $p_i = t_i$ and $\lceil Y^*_{i,t_i} \rceil_\tau \neq T^*_i$. In this case, the decryption algorithm aborts with probability one. The set of all other indices is denoted by $D_2$ (type-2 decryption queries). In this case, $p_i < t_i$ and $X^*_{i,p_i+1} \notin \mathrm{domain}(\omega_e \cup \omega_p)$. So the value $Y^*_{i,p_i+1}$ and the subsequent $Y$-values have almost $b$ bits of entropy. Thus, the query fails to abort only with negligible probability.
Ideal World Interpolation Probability. Let $\Theta_0$ and $\Theta_1$ denote the transcript random variables obtained in the ideal world and the real world, respectively. As noted before, all the input-output pairs for the underlying permutation are compatible. In the ideal world, all the $Y$-values are sampled uniformly at random, the list $\omega_p$ is just a partial representation of $\Pi$, and all the decryption queries are degenerately aborted; whence we get

Here $\sigma_e$ denotes the total number of blocks present in all encryption queries, including the nonce. In notation, $\sigma_e = q_e + \sum_i m_i$.
Real World Interpolation Probability. In the real world, for $\omega$ we denote the encryption query, decryption query and primitive query tuples by $\omega_e$, $\omega_d$ and $\omega_p$, respectively. Then, we have

Here we slightly abuse notation and use $\neg\omega_{d,i}$ to denote the event that the $i$-th decryption query successfully decrypts, and $\neg\omega_d$ is the union $\cup_{i \in D_2} \neg\omega_{d,i}$ (i.e., at least one decryption query successfully decrypts). The encryption and primitive queries are mutually permutation compatible, so we have

Now we show an upper bound on $\Pr_{\Theta_1}(\neg\omega_{d,i} \mid \omega_e, \omega_p)$, where the relevant values have been defined recursively as follows

Let $I$ and $O$ denote the sets of inputs and outputs of $\Pi$ that are present in the transcript $(\omega_e, \omega_p)$. Recall that $X^*_{i,p_i+1}$ is fresh, i.e., $X^*_{i,p_i+1} \notin I$.
Proof. Since $X^*_{i,p_i+1}$ is not the last block, the next input block may collide with some encryption or primitive input block with probability at most $\frac{\sigma_e + q_p}{2^b - \sigma_e - q_p}$. Applying the same argument to all the successive blocks up to the last one, we get that, if none of the previous block inputs collides, the probability that the last block input collides is at most

Proof. Since the last input block $X^*_{i,t_i}$ is fresh, $\lceil \Pi(X^*_{i,t_i}) \rceil_\tau = T^*_i$ holds with probability at most $2/2^\tau$ (provided $\sigma_e + q_p \leq 2^{b-1}$, which can be assumed, since otherwise our bound is trivially true).
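One standard way to see the $2/2^\tau$ estimate is by counting: at most $2^{b-\tau}$ permutation outputs carry a given $\tau$-bit tag, and at least $2^b - (\sigma_e + q_p)$ outputs are still available for a fresh input. The sketch below checks this counting argument in exact rational arithmetic. It is an illustration of that reasoning, not a transcription of the paper's proof, and the helper name `tag_match_bound` is hypothetical.

```python
from fractions import Fraction

def tag_match_bound(b, tau, used):
    """Upper bound on Pr[tau-bit tag of a fresh permutation output equals a fixed tag].

    b    : permutation size in bits
    tau  : tag size in bits
    used : number of outputs already fixed by the transcript (sigma_e + q_p)
    """
    favourable = 2 ** (b - tau)  # outputs whose tag bits equal the target
    remaining = 2 ** b - used    # outputs still available to the permutation
    return Fraction(favourable, remaining)

# SpoC-like sizes with the extreme allowed value used = 2^(b-1).
b, tau = 192, 64
worst = tag_match_bound(b, tau, 2 ** (b - 1))
print(worst <= Fraction(2, 2 ** tau))  # the bound 2/2^tau holds under the proviso
```

The proviso $\sigma_e + q_p \leq 2^{b-1}$ is exactly what keeps the denominator at least $2^{b-1}$, which is where the factor 2 comes from.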
Let $E_j$ denote the event that $X^*_{i,j}$ is fresh, and let $E := \wedge_{j=p_i+1}^{t_i} E_j$. Using the claims, we have $\Pr_{\Theta_1}(\neg\omega_{d,i} \mid \omega_e, \omega_p) \leq \Pr_{\Theta_1}(\neg\omega_{d,i} \wedge E \mid \omega_e, \omega_p) + \Pr(E^c)$.
The last inequality follows from the above claims. Now, we can proceed by using the union bound as follows.

$\Pr[\text{Case 1}] \leq \frac{\sigma_e + q_p}{2^b}$.

Case 2, $a_i \leq p'_i \leq p_i$: This corresponds to the case when either the first non-trivial decryption query block does not match any primitive query, or it matches a primitive query, follows a partial chain, and then matches some encryption query block. By an analysis similar to Case 3 of B3|¬B1, the probability that this happens for the $i$-th decryption query is at most $\frac{q_p}{2^c} \times \frac{m_i \Phi_{out}}{2^c}$. Summing over all $i \in D$, the conditional probability is at most

By taking the expectation, we obtain the following: $\Pr[\text{Case 3}] \leq \frac{q_p \sigma_d\, \mathrm{mcoll}(\sigma_e, 2^r)}{2^{2c}}$.
By adding all these probabilities we prove our result.

Matching Attack on Transform-then-Permute
We now describe some attacks that match the bound. We explain the attacks for the simplified version (with empty associated data).
1. Suppose $\mu_{q_p}$ is attained by some adversary $\mathcal{B}$ interacting with $\Pi$. The AE algorithm $\mathcal{A}$ runs $\mathcal{B}$ to obtain the primitive transcript $\omega_p$. We first make $q_d$ encryption queries with single-block messages and distinct nonces $N_1, \ldots, N_{q_d}$, so that for all $1 \leq i \leq q_d$ the values $\lceil Y_{i,0} \rceil_r$, $\lceil X_{i,1} \rceil_r$ and $\lceil Y_{i,1} \rceil_\tau$ are known. Suppose that, for length $m_i$, the multi-chain in the graph induced by $\omega_p$ starts from nodes whose $r$ most significant bits are $u_i$, ends at nodes whose $\tau$ most significant bits are $T_i$, and has label $x_i$. We choose the first ciphertext block $C^*_{i,1}$ such that $\lceil X^*_{i,1} \rceil_r = u_i$. Moreover, we choose $C^*_{i,j}$ to be the same as $x_{i,j}$ (here we assume that $\mathcal{B}$ makes its queries so that the labels are compatible with the encoding function). Now we make the decryption queries $(N_i, C^*_i, T_i)$. With probability $W_{m_i}/2^c$, the $i$-th forgery attempt is successful. Then, by maximizing $W_{m_i}/m_i$ and taking the expectation, we achieve the desired success probability.
2. Guessing the key $K$ through primitive queries leads to key recovery, and hence to all other attacks. A correct key guess can be easily detected by making a few more queries per guess to recompute an encryption query. This attack requires $q_p = O(2^\kappa)$. Similarly, random forging succeeds with probability about $O(q_d/2^\tau)$.
3. Another attack strategy can be adapted to achieve the $\sigma_e q_p/2^b$ bound. We look for a collision among the $X$-values and the primitive-query inputs. This can again be detected by adding one or two queries to each guess. The same attack works with success probability $q_p\, \mathrm{mcoll}(\sigma_e, 2^r)/2^c$ if we make the primitive queries after making all encryption queries.

4. A similar attack strategy can be adapted to achieve the $q_p\, \mathrm{mcoll}(\sigma_e, 2^r)/2^{b-\tau}$ bound. We look for a collision among the $T$-values and the primitive-query inputs, where the primitive queries are made after the encryption queries in order to predict the unknown $b - \tau$ bits of the final output value.
These attacks show that the bounds in theorem 3 and equation (5) are tight.
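Several of the attack costs above depend on the quantity $\mathrm{mcoll}(\sigma_e, 2^r)$. Assuming, as we do here purely for illustration, that $\mathrm{mcoll}(q, N)$ denotes the expected size of the largest multicollision when $q$ values are sampled uniformly from a set of size $N$, it can be estimated at toy sizes with a simple Monte Carlo experiment. The function names and the toy parameters below are hypothetical, and the sketch says nothing about cryptographic-size parameters.

```python
import random
from collections import Counter

def max_multicollision(q, n_bins, rng):
    """Largest number of samples landing on the same value among q uniform draws."""
    counts = Counter(rng.randrange(n_bins) for _ in range(q))
    return max(counts.values())

def estimate_mcoll(q, n_bins, trials=200, seed=0):
    """Monte Carlo estimate of the expected maximum multicollision size (q >= 1)."""
    rng = random.Random(seed)  # seeded for reproducibility
    total = sum(max_multicollision(q, n_bins, rng) for _ in range(trials))
    return total / trials

# Toy example: 2^8 samples into 2^6 bins (stand-ins for sigma_e and 2^r).
print(estimate_mcoll(2 ** 8, 2 ** 6))
```

By the pigeonhole principle the estimate can never fall below $\lceil q/N \rceil$, which gives a quick sanity check on the simulation.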

Conclusion
In this paper we have proved an improved bound for Beetle and provided a similar bound for the newly proposed mode SpoC. Our bound resolves all limitations known for Beetle and Sponge AE. We are able to provide a tight estimate of $\mu_{q_p}$ when the feedback function for decryption is invertible. This is the case for Beetle and SpoC, but not for the Sponge duplex.
Although, as discussed in section 7, we obtain a tight expression for the AE advantage of Sponge AE, the variable $\mu_{q_p}$ (present in our upper bound) still needs to be tightly estimated.