Improved Security Bound of (E/D)WCDM

In CRYPTO'16, Cogliati and Seurin proposed a block cipher based nonce-based MAC, called Encrypted Wegman-Carter with Davies-Meyer (EWCDM), that gives 2n/3-bit MAC security in the nonce-respecting setting and n/2-bit security in the nonce-misuse setting, where n is the block size of the underlying block cipher. However, this construction requires two independent block cipher keys. In CRYPTO'18, Datta et al. came up with a single-keyed block cipher based nonce-based MAC, called Decrypted Wegman-Carter with Davies-Meyer (DWCDM), that also provides 2n/3-bit MAC security in the nonce-respecting setting and n/2-bit security in the nonce-misuse setting. However, the drawback of DWCDM is that it takes only 2n/3-bit nonces. In fact, the authors have shown that DWCDM cannot achieve beyond the birthday bound security with n-bit nonces. In this paper, we prove that DWCDM with 3n/4-bit nonces provides MAC security up to $O(2^{3n/4})$ MAC queries against all nonce-respecting adversaries. We also improve the MAC bound of EWCDM from 2n/3 bits to 3n/4 bits. The backbone of these two results is a refined treatment of extended mirror theory that systematically estimates the number of solutions to a system of bivariate affine equations and non-equations, which we apply to the security proofs of the constructions to achieve 3n/4-bit security.


Introduction
In the era of digital transmissions, cryptographic algorithms are used to authenticate messages transmitted over an insecure communication channel. A Message Authentication Code, or MAC for short, is a popular symmetric-key cryptographic primitive that enables two legitimate parties (having access to a shared secret key) to authenticate their transmissions. One natural approach to authenticating a message M is to generate a random string of constant size, which is used to mask the hash of the message. The disadvantage of this scheme is that every message to be authenticated requires a fresh constant-sized random string. To eliminate this one-time authentication problem, Brassard [Bra82] suggested using a pseudorandom generator that generates a sequence of pseudorandom strings from a short master key. But in some applications, messages may arrive in arbitrary order due to network latency. Therefore, a direct means of computing the pseudorandom string (instead of computing the strings sequentially) is much desired. Although Brassard suggested the use of the Blum-Blum-Shub generator [Bra82] for directly computing the pseudorandom string, a pseudorandom function (PRF) is a natural choice for computing the pseudorandom string directly out of a nonce, a non-repeating value. This construction is known as the Wegman-Carter (WC) MAC, defined as follows:
$$\mathsf{WC}[F, H]_{k, k_h}(\nu, M) := F_k(\nu) \oplus H_{k_h}(M),$$
where $\nu$ is the nonce, $F$ is the PRF and $H$ is the hash function. WC is a powerful MAC that provides a security guarantee up to the differential probability of the underlying hash function (also known as its almost-xor-universal advantage) when no nonce repeats in the queries (also known as the nonce-respecting setting). The primary disadvantage of the WC construction is that it is completely broken when a nonce repeats even once (in other words, in the nonce-misuse setting). In fact, one can mount a universal forgery after a single repetition of a nonce.
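The nonce-misuse weakness of WC can be illustrated with a toy instance. The sketch below is only an illustration under assumptions that are not in the text: the block size is n = 8, the hash is multiplication by the hash key in GF(2^8), and the PRF is a truncated SHA-256 stand-in. Two tags under one repeated nonce cancel the PRF mask, reveal the hash key, and allow a forgery on any further message under that nonce.

```python
import hashlib

def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1 (toy field, n = 8)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r

def prf(key: bytes, nonce: int) -> int:
    """Stand-in for the PRF F_k (assumption): first byte of SHA-256(key || nonce)."""
    return hashlib.sha256(key + bytes([nonce])).digest()[0]

def wc_tag(key: bytes, hash_key: int, nonce: int, msg: int) -> int:
    """Toy Wegman-Carter tag F_k(nonce) xor H_kh(msg), with H_kh(m) = kh * m."""
    return prf(key, nonce) ^ gf_mul(hash_key, msg)

def forge_after_nonce_reuse(m1: int, t1: int, m2: int, t2: int, m3: int) -> int:
    """Use two tags issued under the same nonce to forge the tag of m3."""
    diff = t1 ^ t2  # equals kh * (m1 xor m2): the PRF mask cancels out
    kh = next(k for k in range(256) if gf_mul(k, m1 ^ m2) == diff)
    return t1 ^ gf_mul(kh, m1 ^ m3)

KEY, HASH_KEY, NONCE = b"toy-key", 0x53, 7
T1 = wc_tag(KEY, HASH_KEY, NONCE, 0x10)
T2 = wc_tag(KEY, HASH_KEY, NONCE, 0x2A)  # the nonce is repeated here
FORGED = forge_after_nonce_reuse(0x10, T1, 0x2A, T2, 0xC4)
```

The forged value matches the honestly computed tag of the third message because the recovered hash key is unique whenever the two signed messages differ.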
Due to the lack of practical PRFs, Shoup suggested that F can be replaced by a block cipher E. The resulting MAC is known as Wegman-Carter-Shoup (WCS). However, unlike the WC MAC, the security of WCS drops to the birthday limit in the number of queries even when no nonce is repeated, and it also fails to provide adequate security in the nonce-misuse setting. To achieve security in the nonce-misuse setting, Cogliati and Seurin [CS16] proposed the Encrypted Wegman-Carter (EWC) construction, which offers birthday bound security in the nonce-misuse setting but provides high security in the nonce-respecting setting. EWC is defined as follows:
$$\mathsf{EWC}[F, H, E]_{k_1, k_h, k_2}(\nu, M) := E_{k_2}\big(F_{k_1}(\nu) \oplus H_{k_h}(M)\big).$$
However, replacing the PRF F of EWC with a block cipher E makes its security drop to the birthday bound in the nonce-respecting setting. To alleviate the problem, one can instantiate the PRF F of the EWC construction with the xor of two permutations (XoP) construction [BKR98,Luc00]. Since XoP has been proved to be optimally secure [DHT17], the resulting construction provides optimal MAC security in the nonce-respecting setting. Although the construction provides high MAC security, it requires three block cipher calls altogether. Interestingly, Cogliati and Seurin [CS16] were able to reduce the number of block cipher calls by one through their construction Encrypted Wegman-Carter with Davies-Meyer (EWCDM), where they instantiated the PRF F with the Davies-Meyer construction. They have shown that EWCDM provides 2n/3-bit MAC security in the nonce-respecting setting and n/2-bit security in the nonce-misuse setting.

Encrypted Wegman-Carter with Davies-Meyer
In CRYPTO'16, Cogliati and Seurin [CS16] proposed EWCDM, a nonce-based MAC, defined as follows:
$$\mathsf{EWCDM}[E, H]_{k_1, k_2, k_h}(\nu, M) := E_{k_2}\big(E_{k_1}(\nu) \oplus \nu \oplus H_{k_h}(M)\big),$$
where $\nu$ is the nonce and M is the message. Note that EWCDM uses two independent block cipher keys, $k_1$ and $k_2$, and an independent hash key $k_h$ for the AXU hash function. The authors have proved that EWCDM is secure against all nonce-respecting adversaries that make $q_m \ll 2^{2n/3}$ MAC queries and $q_v \ll 2^n$ verification queries. They have also shown n/2-bit security of EWCDM against nonce-misuse adversaries. It is interesting to note that, although the Davies-Meyer (DM) construction is not a beyond birthday bound secure PRF, encrypting its output after masking it with the hash of a message makes the construction a beyond birthday bound secure MAC. Later in CRYPTO'17, Mennink and Neves [MN17] proved n-bit PRF security of EWCDM in the nonce-respecting setting using the result of Mirror theory for general $\xi_{\max}$ [Pat05, Pat10], and mentioned that the analysis extends to the unforgeability of the construction. The trick involved in proving the optimal security of EWCDM is to replace the last block cipher call with its inverse. This subtle change does not make any difference in the output distribution and, as a bonus, it trivially allows one to view an evaluation of $T = \mathsf{EWCDM}(\nu, M)$ as the xor of two permutations in the middle of the function (or in general a bi-variate affine equation), i.e.,
$$E_{k_1}(\nu) \oplus E_{k_2}^{-1}(T) = \nu \oplus H_{k_h}(M).$$
It is only this feature which is captured by the mirror theory to derive the security bound of the construction. However, as the construction requires two independent block cipher keys, reducing the number of block cipher keys to one was posed as an open problem.
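The "xor of two permutations in the middle" view can be checked mechanically on a toy instance. The following is a minimal sketch under assumptions that are not in the text: the block size is n = 8, the two block ciphers are replaced by seeded random permutation tables, and the hash is a seeded random lookup table. It evaluates EWCDM as E2(E1(ν) ⊕ ν ⊕ H(M)) and checks that E1(ν) ⊕ E2⁻¹(T) = ν ⊕ H(M) for the resulting tag.

```python
import random

SIZE = 1 << 8  # toy block size n = 8 (assumption for illustration)

def random_permutation(seed: int):
    """Return (table, inverse table) of a seeded permutation of {0,1}^8."""
    rng = random.Random(seed)
    perm = list(range(SIZE))
    rng.shuffle(perm)
    inv = [0] * SIZE
    for x, y in enumerate(perm):
        inv[y] = x
    return perm, inv

def random_hash_table(seed: int):
    """Stand-in for the keyed hash H_kh: a seeded random function table."""
    rng = random.Random(seed)
    return [rng.randrange(SIZE) for _ in range(SIZE)]

E1, _ = random_permutation(1)       # plays the role of E under key k1
E2, E2_INV = random_permutation(2)  # plays the role of E under key k2
H = random_hash_table(3)            # plays the role of H under key kh

def ewcdm_tag(nonce: int, msg: int) -> int:
    """T = E2(E1(nonce) xor nonce xor H(msg))."""
    return E2[E1[nonce] ^ nonce ^ H[msg]]

def mirror_equation_holds(nonce: int, msg: int) -> bool:
    """Check E1(nonce) xor E2^{-1}(T) == nonce xor H(msg)."""
    tag = ewcdm_tag(nonce, msg)
    return E1[nonce] ^ E2_INV[tag] == nonce ^ H[msg]
```

Every MAC query therefore contributes one bivariate affine equation in the two unknowns E1(ν) and E2⁻¹(T), which is the shape of system that mirror theory counts solutions for.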

Decrypted Wegman-Carter with Davies-Meyer
As an attempt to reduce the number of block cipher keys of EWCDM to one, Datta et al. [DDNY18] proposed a clever idea, where they replace the second block cipher call of EWCDM with the inverse of the first block cipher. This resulted in the construction called Decrypted Wegman-Carter with Davies-Meyer (DWCDM), a nonce-based MAC, defined as follows:
$$\mathsf{DWCDM}[E, H]_{k, k_h}(\hat{\nu}, M) := E_k^{-1}\big(E_k(\nu) \oplus \nu \oplus H_{k_h}(M)\big),$$
where $\hat{\nu} \in \{0,1\}^{2n/3}$ is the nonce, M is the message and $\nu = \hat{\nu} \| 0^{n/3}$. Note that DWCDM uses a single block cipher key k and another independent hash key $k_h$ for the AXU hash function. However, the main drawback of the construction is that DWCDM can only take 2n/3-bit nonces. In fact, the authors have proved that DWCDM is not secure beyond the birthday limit with full n-bit nonces. They have shown that DWCDM is 2n/3-bit secure against all nonce-respecting adversaries and n/2-bit secure against nonce-misuse adversaries. Moreover, the authors have also proposed a single-keyed nonce-based MAC, dubbed 1K-DWCDM, where the hash key is derived using a block cipher evaluation on the input $0^{n-1}1$. This construction is secure up to 2n/3 bits (resp. n/2 bits) in the nonce-respecting (resp. the nonce-misuse) setting. The nice property of DWCDM is that it allows one to view an evaluation of the construction as the xor of permutations in the middle of the function, i.e., $T = \mathsf{DWCDM}(\nu, M)$ can be equivalently viewed as
$$E_k(\nu) \oplus E_k(T) = \nu \oplus H_{k_h}(M).$$
This feature allows the authors to use the mirror theory result for proving the security of their construction. However, to incorporate the verification attempts in the proof, they extended the mirror theory result by including univariate and bivariate affine non-equations along with bivariate affine equations. This result is known as the Extended Mirror Theory [DDNY18].
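A toy evaluation makes the single-permutation structure concrete. The sketch below assumes a block size of n = 6 (so nonces have 2n/3 = 4 bits and are padded with two zero bits), a seeded random permutation table in place of the block cipher, and a seeded random table in place of the hash; none of these choices come from the text. It checks that each tag satisfies π(ν) ⊕ π(T) = ν ⊕ H(M) under the same permutation π.

```python
import random

N = 6                    # toy block size, divisible by 3 (assumption)
NONCE_BITS = 2 * N // 3  # 2n/3-bit nonces
SIZE = 1 << N

_rng = random.Random(2018)
PI = list(range(SIZE))   # stand-in for E_k: a seeded random permutation
_rng.shuffle(PI)
PI_INV = [0] * SIZE
for _x, _y in enumerate(PI):
    PI_INV[_y] = _x
H = [_rng.randrange(SIZE) for _ in range(SIZE)]  # stand-in for H_kh

def pad_nonce(nonce_hat: int) -> int:
    """Form nu = nonce_hat || 0^{n/3} by appending n/3 zero bits."""
    return nonce_hat << (N - NONCE_BITS)

def dwcdm_tag(nonce_hat: int, msg: int) -> int:
    """T = pi^{-1}(pi(nu) xor nu xor H(msg))."""
    nu = pad_nonce(nonce_hat)
    return PI_INV[PI[nu] ^ nu ^ H[msg]]

def mirror_equation_holds(nonce_hat: int, msg: int) -> bool:
    """Check pi(nu) xor pi(T) == nu xor H(msg)."""
    nu = pad_nonce(nonce_hat)
    tag = dwcdm_tag(nonce_hat, msg)
    return PI[nu] ^ PI[tag] == nu ^ H[msg]
```

In contrast to EWCDM, both unknowns of each equation are outputs of the same permutation, so nonces and tags share one input space.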
In the same paper [DDNY18], the authors have mentioned that DWCDM can asymptotically achieve full n-bit security. They have given a sketch of a proof that, for a general k with nonce space $\{0,1\}^{kn/(k+1)}$, DWCDM achieves $kn/(k+1)$-bit MAC security in the nonce-respecting setting under the following condition and conjecture:
1. Condition: the underlying hash function must be j-way regular for all 3 ≤ j ≤ k, i.e., for any j distinct input points, the probability that the sum of the hash values evaluated at those points equals any given non-zero value should be very low.
2. Conjecture: Proving $kn/(k+1)$-bit security for the extended mirror theory with $\xi_{\max} = k$.
Even though the above condition can be realized with a certain class of hash functions (e.g., Polyhash [MI11]), it is very difficult to prove the conjecture. In fact, in a follow-up work, Datta et al. [DDNY19] could only prove 2n/3-bit MAC security of DWCDM with an (n − 1)-bit nonce space and left open the problem of proving its security up to 3n/4 bits. It is worth mentioning here that it is hard to improve the security of DWCDM beyond $2^{2n/3}$ with a 2n/3-bit nonce space. In general, improving the security of DWCDM beyond $2^{kn/(k+1)}$ with $kn/(k+1)$ bits of nonce space is a challenging task. In fact, we also do not know whether there exists an attack on DWCDM that uses $kn/(k+1)$-bit nonces with $2^{kn/(k+1)}$ MAC queries.

Mirror Theory and Its Relatable Debate
Mirror theory [Pat10] is an important combinatorial tool that provides a lower bound on the number of distinct solutions to a system of bivariate affine equations over any finite abelian group. Patarin stated this result as a conjecture in [Pat03] and proved it in [Pat05]. This result was known as Theorem $P_i \oplus P_j$ for $\xi_{\max} = 2$ [Pat05], which was later renamed Mirror theory for $\xi_{\max} = 2$ in [Pat10]. The result of Mirror theory with $\xi_{\max} = 2$ has been acknowledged in the community as a strong approach to establishing the optimal security of XoP constructions [DHT17].
Besides the result of Mirror theory for $\xi_{\max} = 2$, Patarin [Pat05] also claimed that the number of distinct solutions to a system of q bivariate affine equations with $\xi_{\max} > 2$ and with non-equality among the variables is always larger than the average number of solutions, provided $q \le 2^n / (67(\xi_{\max} - 1))$. Patarin named this result the Theorem $P_i \oplus P_j$ for any $\xi_{\max}$. This result was also stated as a conjecture in [Pat03] (see Conjecture 8.1) in analyzing the security of the Feistel cipher. Only a couple of years later, this result was articulated in many follow-up works for analyzing the security of the xor of two permutations, and it took a few articles [Pat05,Pat08b,Pat10,Pat13] for the result and its security argument to evolve. Later, in 2017, this work culminated in a book [NPV17] called Feistel Ciphers: Security Proofs and Cryptanalysis by Nachef et al. However, the proofs of this result in most of these works are very sketchy, with plenty of giant equations, and are missing most of the important details.
The Theorem $P_i \oplus P_j$ for any $\xi_{\max}$ plays a crucial role in deriving higher security bounds of numerous cryptographic constructions [DDNY18, DDNY19, DNT19, ML19, BDLN20, IMV16, MN17] that use the XoP function as a component in their designs. The security proofs of most of these designs require a degeneration of the final outputs to get rid of the adaptive nature of the adversary. Hence the proof cannot use the fact that the XoP function is a PRF. Instead, these security proofs require (by applying the H-Coefficient technique [Pat08a]) a good lower bound on the number of distinct solutions to a system of bivariate affine equations with a general $\xi_{\max}$, and therein comes the role of the result. As stated earlier, Mennink and Neves [MN17] used it to prove the optimal security bound of EWCDM. Iwata et al. [IMV16] also used this result to show the optimal security bound of CENC.
Despite the many applications of Theorem $P_i \oplus P_j$ for general $\xi_{\max}$, its proof is not very well understood in the community. The existing proofs of this result [Pat03,Pat05,Pat10] are very involved, with lots of complicated equations. Moreover, the derivations in these proofs are imprecise at most of the crucial junctures. Hence, these proofs are practically not verifiable. Although the correctness of the proofs [Pat05,Pat10,NPV17] is debated in the community, several authors have used this precarious result to derive optimal bounds for constructions such as [IMV16,MN17,ZHY18]. Recently, Dutta et al. [DNS20] and Cogliati and Patarin [CP20] have independently developed a concrete and verifiable proof of Mirror theory for $\xi_{\max} = 2$. However, a verifiable proof of Theorem $P_i \oplus P_j$ for any $\xi_{\max}$ is still unavailable.
Remark 1. We would like to mention that applying the result of Theorem $P_i \oplus P_j$ for any $\xi_{\max}$ in deriving the optimal security of cryptographic constructions like EWCDM and CENC is technically correct. However, it may not be scientifically appropriate to apply a result whose correctness is still a matter of debate.

Our Contribution
In this paper, we prove that DWCDM with nonce space $\{0,1\}^{3n/4}$ is secure against all computationally bounded adversaries that make roughly $2^{3n/4}$ MAC queries and $2^n$ verification queries in the nonce-respecting setting. We have also improved the MAC security bound of EWCDM from 2n/3 bits to 3n/4 bits in the nonce-respecting setting. We would like to reiterate here that Mennink and Neves have already shown n-bit PRF security of EWCDM, leaving the proof of unforgeability open. However, as stated earlier, their analysis is solely based on the result of Theorem $P_i \oplus P_j$ for any $\xi_{\max}$, so the correctness of the proof is a subject of debate. Inspired by the results of [KLL20, JN20], we have proved that the extended mirror theory for general $\xi_{\max}$ holds roughly up to 3n/4 bits. In particular, we have proved two versions of this result. In one version, the system of equations and non-equations of the extended mirror theory is based on the same permutation, whereas in the other version, the system of equations and non-equations is based on two independent random permutations. Our security proof of the constructions is based on the H-Coefficient technique [Pat08a]. Our first result of the extended mirror theory helps to bound the real interpolation probability for a good transcript of DWCDM, whereas the other one helps to bound the real interpolation probability for a good transcript of EWCDM. We would like to point out that the proof of EWCDM is similar to that of [CLLL20]. Moreover, the proof establishing 3n/4-bit security of EWCDM is less involved than the proof of the 3n/4-bit bound of the nEHtM construction [CLLL20], as our construction deals with two independent random permutations whereas the latter deals with a single random permutation. However, our non-trivial primary contribution in the paper is to establish 3n/4-bit security of DWCDM.
As EWCDM is a close contender of DWCDM, and EWCDM was proven secure only up to $2^{2n/3}$ MAC queries by Cogliati and Seurin [CS16] (notwithstanding the optimal PRF bound by Mennink and Neves [MN17]), we also include the proof of the improved bound of EWCDM in the paper.

Proof Approach
Our MAC security proof of DWCDM and EWCDM fundamentally relies on Patarin's H-coefficient technique [Pat08a,Pat08b]. Similar to the technique of [CS16,DNT19], we cast the unforgeability game of the MAC to an equivalent indistinguishability game, with a suitable choice of an ideal world, that allows us to apply the H-coefficient technique for bounding the distinguishing advantage of the construction of our concern. One can express the evaluation of DWCDM (resp. EWCDM) as a sum of two identical permutations (resp. two independent permutations). Thus, q such evaluations of DWCDM give us a system of q affine bi-variate equations as follows:
$$E_k(\nu_i) \oplus E_k(T_i) = \nu_i \oplus H_{k_h}(M_i), \quad i \in [q],$$
where $(\nu_i, M_i)$ is the i-th MAC query and $T_i$ is its response. Along with this, we also need to ensure that every verification attempt of the adversary fails (as a part of the good transcript), i.e., for a verification query $(\nu', M', T')$ chosen by the adversary, we should always have
$$E_k(\nu') \oplus E_k(T') \neq \nu' \oplus H_{k_h}(M') \ \text{(for DWCDM)}, \qquad E_{k_1}(\nu') \oplus E_{k_2}^{-1}(T') \neq \nu' \oplus H_{k_h}(M') \ \text{(for EWCDM)}.$$
Hence, we also need to incorporate bivariate affine non-equations along with the system of bivariate affine equations. This leads us to extend the mirror theory technique by incorporating the affine non-equations along with the affine bivariate equations. We use this extended mirror theory result when lower bounding the real interpolation probability for a good transcript.

Preliminaries
General Notations: For a set $\mathcal{X}$, we use the notation $X \leftarrow_{\$} \mathcal{X}$ to denote that X is sampled uniformly at random from $\mathcal{X}$, independently of all random variables defined so far. We denote the empty set as ∅. For two mutually disjoint sets $\mathcal{X}$ and $\mathcal{Y}$, i.e., $\mathcal{X} \cap \mathcal{Y} = \emptyset$, we denote their union as $\mathcal{X} \sqcup \mathcal{Y}$, which we refer to as the disjoint union. For a natural number n, $\{0,1\}^n$ denotes the set of all binary strings of length n and $\{0,1\}^*$ denotes the set of all binary strings of arbitrary length. For a non-empty finite set $\mathcal{X} \subseteq \{0,1\}^n$ and an element $\lambda \in \{0,1\}^n$, we write $\mathcal{X} \oplus \lambda$ to denote the set $\{x \oplus \lambda : x \in \mathcal{X}\}$. For any binary string $x \in \{0,1\}^*$, |x| denotes its length, i.e., the number of bits in x. For $x, y \in \{0,1\}^n$, we write $z = x \oplus y$ to denote the xor of x and y. $\mathbf{0}$ denotes the element $0^n \in \{0,1\}^n$ and $\mathbf{1}$ denotes $0^{n-1}1 \in \{0,1\}^n$. For integers 1 ≤ b ≤ a, we write $(a)_b$ to denote $a(a-1)\cdots(a-b+1)$, where $(a)_0 = 1$ by convention, and for any natural number q, [q] denotes the set {1, . . . , q}. We denote the set of all permutations over $\mathcal{X}$ as $\mathsf{Perm}(\mathcal{X})$. When $\mathcal{X} = \{0,1\}^n$, we omit $\mathcal{X}$ and simply write $\mathsf{Perm}$ to denote the set of all permutations over $\{0,1\}^n$.
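Two of these notations recur in the counting arguments: the falling factorial $(a)_b$ and the shifted set $\mathcal{X} \oplus \lambda$. As a small illustration (the function names are hypothetical and chosen only for this sketch), both can be written directly:

```python
import math

def falling_factorial(a: int, b: int) -> int:
    """(a)_b = a (a-1) ... (a-b+1), with (a)_0 = 1 by convention."""
    result = 1
    for i in range(b):
        result *= a - i
    return result

def xor_shift(xs: set, lam: int) -> set:
    """X xor lambda = {x xor lambda : x in X}, for strings encoded as integers."""
    return {x ^ lam for x in xs}
```

For instance, $(2^n)_q$ is the number of ways to assign pairwise distinct n-bit values to q variables, which is why it appears as the normalising term when counting injective solutions.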

Security Definition of Block Cipher
A block cipher with key space $\mathcal{K}$ and domain $\{0,1\}^n$ is a mapping $E : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ such that for every key $k \in \mathcal{K}$, $x \mapsto E(k, x)$ is a permutation over $\{0,1\}^n$, and we write $E_k(x)$ for $E(k, x)$. We consider a distinguisher A with oracle access to a permutation of $\{0,1\}^n$ that makes at most q queries with running time at most t and outputs a single bit after it finishes the interaction with the oracle. We define the pseudorandom permutation (prp)-advantage of A against the block cipher E as
$$\mathbf{Adv}^{\mathrm{prp}}_{E}(A) := \Big| \Pr[k \leftarrow_{\$} \mathcal{K} : A^{E_k} \Rightarrow 1] - \Pr[\pi \leftarrow_{\$} \mathsf{Perm} : A^{\pi} \Rightarrow 1] \Big|.$$
$\mathbf{Adv}^{\mathrm{prp}}_{E}(q, t)$ is the maximum prp advantage, where the maximum is taken over all adversaries A that make q queries with running time at most t. Similar to the prp advantage, we say that A has strong pseudorandom permutation (sprp)-advantage against E if A is given additional oracle access to the inverse of the permutation, such that A makes at most $q^+$ (forward) queries to the permutation and $q^-$ (backward) queries to the inverse permutation with running time at most t. We often merge the forward and backward queries and simply say that A makes q queries in total, including forward and backward queries.

Nonce Based MAC
Let $F : \mathcal{K} \times \mathcal{N} \times \mathcal{M} \to \mathcal{T}$ be a keyed function where $\mathcal{K}$, $\mathcal{N}$, $\mathcal{M}$ and $\mathcal{T}$ are the key space, nonce space, message space and tag space respectively. Based on F, we define the nonce-based message authentication code I = (I.KGen, I.TagGen, I.Ver) as follows: for $k \in \mathcal{K}$, the signing algorithm $I.\mathsf{TagGen}_k$ takes as input $(\nu, M) \in \mathcal{N} \times \mathcal{M}$ and outputs $T \leftarrow F(k, \nu, M)$, and the verification algorithm $I.\mathsf{Ver}_k$ takes as input $(\nu, M, T) \in \mathcal{N} \times \mathcal{M} \times \mathcal{T}$ and outputs 1 if $F_k(\nu, M) = T$; otherwise it outputs 0. Let A be a $(q_m, q_v, t)$-adversary against the unforgeability of I with oracle access to the signing algorithm $I.\mathsf{TagGen}_k$ and the verification algorithm $I.\mathsf{Ver}_k$ such that it makes $q_m$ signing and $q_v$ verification queries with running time at most t. A is said to be nonce-respecting if it does not repeat a nonce in signing queries. However, A may repeat nonces in its verification queries. Moreover, the signing and the verification queries can be interleaved. A is said to forge I if for any of its verification queries (not obtained through a previous signing query), the verification algorithm returns 1. The advantage of A against the unforgeability of the nonce-based MAC I is defined as
$$\mathbf{Adv}^{\mathrm{nMAC}}_{I}(A) := \Pr\big[A^{I.\mathsf{TagGen}_k,\, I.\mathsf{Ver}_k} \text{ forges}\big],$$
where the randomness is defined over $k \leftarrow_{\$} \mathcal{K}$ and the randomness of the adversary (if any). We write $\mathbf{Adv}^{\mathrm{nMAC}}_{I}(q_m, q_v, t) := \max_A \mathbf{Adv}^{\mathrm{nMAC}}_{I}(A)$, where the maximum is taken over all $(q_m, q_v, t)$-adversaries A. In this paper, we skip the time parameter of the adversary, as we will assume throughout the paper that the adversary is computationally unbounded. This allows us to assume that the adversary is deterministic. Moreover, A is non-trivial in the sense that it does not repeat any queries and does not make any queries whose output can be trivially computed.
Upper bound on $\mathbf{Adv}^{\mathrm{nMAC}}_{I}(A)$. We obtain an upper bound for the nonce-respecting MAC security of I in terms of the distinguishing advantage [DJN17], where the ideal world is comprised of a random oracle \$ that samples the tag T independently and uniformly at random from $\{0,1\}^n$ for every nonce-message pair $(\nu, M)$ and the reject oracle ⊥ that always returns 0 for any $(\nu, M, T)$. Then, for any computationally unbounded and non-trivial nonce-respecting adversary A,
$$\mathbf{Adv}^{\mathrm{nMAC}}_{I}(A) \le \max_{D} \Big| \Pr\big[D^{I.\mathsf{TagGen}_k,\, I.\mathsf{Ver}_k} \Rightarrow 1\big] - \Pr\big[D^{\$,\, \bot} \Rightarrow 1\big] \Big|,$$
where $D^{O} \Rightarrow 1$ denotes that the distinguisher D outputs 1 after interacting with its oracle O.

H-Coefficient Technique for Nonce-Based MAC
Let I = (I.KGen, I.TagGen, I.Ver) be a nonce-based MAC based on a keyed function $F : \mathcal{K} \times \mathcal{N} \times \mathcal{M} \to \mathcal{T}$, where $\mathcal{K}$, $\mathcal{N}$, $\mathcal{M}$ and $\mathcal{T}$ are the key space, nonce space, message space and tag space respectively. We fix a non-trivial and computationally unbounded distinguisher D that interacts with one of two worlds: (1) in the real world it interacts with the oracles $(I.\mathsf{TagGen}_k, I.\mathsf{Ver}_k)$ for a random key k, or (2) in the ideal world it interacts with the oracles (\$, ⊥). D makes at most $q_m$ queries to its left (MAC) oracle and at most $q_v$ queries to its right (verification) oracle, and outputs a single bit. Let
$$\tau_m := \{(\nu_1, M_1, T_1), \ldots, (\nu_{q_m}, M_{q_m}, T_{q_m})\}$$
be the list of MAC queries and responses of D and
$$\tau_v := \{(\nu'_1, M'_1, T'_1, b_1), \ldots, (\nu'_{q_v}, M'_{q_v}, T'_{q_v}, b_{q_v})\}$$
be the list of verification queries and responses of D, where for all j, $b_j \in \{0,1\}$ denotes accept ($b_j = 1$) or reject ($b_j = 0$). We consider D to be stronger in the sense that it obtains some additional information after it has made all its queries and obtained the corresponding responses, but before it outputs its decision. If D interacts with the real world, then it obtains the key k of the construction, and if D interacts with the ideal world, then a dummy key k is sampled uniformly at random from $\{0,1\}^n$ and released to the adversary. The triplet $\tau = (\tau_m, \tau_v, k)$ constitutes the query transcript of the attack. Let $X_{re}$ and $X_{id}$ denote the random variables of realizing a transcript τ in the real world and the ideal world respectively. τ is said to be attainable (with respect to D) if $\Pr[X_{id} = \tau] > 0$. Θ denotes the set of all attainable transcripts. Note that for an attainable transcript $\tau = (\tau_m, \tau_v, k)$, $b_i = 0$ for every $i \in [q_v]$. Now, we state the main result of the H-coefficient technique (see e.g. [CS14] for the proof) as follows:
Lemma 1. Let D be a fixed deterministic distinguisher and $\Theta = \Theta_g \sqcup \Theta_b$ be some partition of the set of all attainable transcripts. Suppose there exists $\epsilon_{ratio} \ge 0$ such that for any $\tau \in \Theta_g$,
$$\frac{\Pr[X_{re} = \tau]}{\Pr[X_{id} = \tau]} \ge 1 - \epsilon_{ratio},$$
and there exists $\epsilon_{bad} \ge 0$ such that $\Pr[X_{id} \in \Theta_b] \le \epsilon_{bad}$. Then, $\mathbf{Adv}(D) \le \epsilon_{ratio} + \epsilon_{bad}$.
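The statement of Lemma 1 can be sanity-checked numerically on a toy pair of transcript distributions. The sketch below is illustrative only: the four transcripts and their probabilities are made up for the example, and the statistical distance is used as an upper bound on the advantage of any distinguisher. It computes the two quantities of the lemma for a chosen set of bad transcripts and compares their sum against the statistical distance.

```python
from fractions import Fraction as Fr

def h_coefficient_terms(p_real: dict, p_ideal: dict, bad: set):
    """Return (eps_ratio, eps_bad) for the partition of attainable transcripts."""
    eps_bad = sum((p_ideal[t] for t in bad), Fr(0))
    # Smallest eps_ratio >= 0 with p_real/p_ideal >= 1 - eps_ratio on good transcripts.
    eps_ratio = max(
        [Fr(0)] + [1 - p_real.get(t, Fr(0)) / p_ideal[t]
                   for t in p_ideal if t not in bad]
    )
    return eps_ratio, eps_bad

def statistical_distance(p_real: dict, p_ideal: dict) -> Fr:
    """Half the L1 distance; no distinguisher can have a larger advantage."""
    keys = set(p_real) | set(p_ideal)
    return sum((abs(p_real.get(t, Fr(0)) - p_ideal.get(t, Fr(0))) for t in keys), Fr(0)) / 2

# Hypothetical toy distributions over four transcripts (values are made up).
P_IDEAL = {"a": Fr(1, 4), "b": Fr(1, 4), "c": Fr(1, 4), "d": Fr(1, 4)}
P_REAL = {"a": Fr(3, 10), "b": Fr(3, 10), "c": Fr(1, 5), "d": Fr(1, 5)}
EPS_RATIO, EPS_BAD = h_coefficient_terms(P_REAL, P_IDEAL, bad={"d"})
DISTANCE = statistical_distance(P_REAL, P_IDEAL)
```

Declaring more transcripts bad lowers the ratio term but raises the bad term, which is the trade-off a proof has to balance when defining good transcripts.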

Universality and Regularity of Keyed Hash Functions
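Let $\mathcal{K}_h$ and $\mathcal{X}$ be two non-empty finite sets and H be a keyed function $H : \mathcal{K}_h \times \mathcal{X} \to \{0,1\}^n$. In the following, all probabilities are taken over $k_h \leftarrow_{\$} \mathcal{K}_h$.
(i) Almost-Xor-Universality: H is said to be an $\epsilon_{axu}$-almost xor universal (AXU) hash function if for any distinct $x, x' \in \mathcal{X}$ and for any $\Delta \in \{0,1\}^n$, $\Pr[H_{k_h}(x) \oplus H_{k_h}(x') = \Delta] \le \epsilon_{axu}$.
(ii) Almost Regularity: H is said to be an $\epsilon_{reg}$-almost regular (AR) hash function if for any $x \in \mathcal{X}$ and for any $\Delta \in \{0,1\}^n$, $\Pr[H_{k_h}(x) = \Delta] \le \epsilon_{reg}$.
(iii) r-way Regularity: H is said to be an $\epsilon_{r\text{-}reg}$ r-way regular hash function if for any distinct $x_1, x_2, \ldots, x_r \in \mathcal{X}$ and for any non-zero $\Delta \in \{0,1\}^n$, $\Pr[H_{k_h}(x_1) \oplus H_{k_h}(x_2) \oplus \cdots \oplus H_{k_h}(x_r) = \Delta] \le \epsilon_{r\text{-}reg}$.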
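For a small enough hash family, the AXU advantage can be computed by exhaustive search. The sketch below assumes a toy family that is not taken from the text: n = 4 and $H_k(x) = k \cdot x$ in GF(2^4) with reduction polynomial $x^4 + x + 1$. It enumerates every pair of distinct inputs and every key, and reports the largest probability with which any output difference occurs.

```python
from fractions import Fraction

def gf16_mul(a: int, b: int) -> int:
    """Multiply in GF(2^4) modulo x^4 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10:
            a ^= 0x13
        b >>= 1
    return r

def max_differential_probability() -> Fraction:
    """Largest Pr_k[H_k(x) xor H_k(y) = delta] over distinct x, y and all delta."""
    worst = 0
    for x in range(16):
        for y in range(16):
            if x == y:
                continue
            counts = [0] * 16  # counts[delta] = number of keys giving that difference
            for k in range(16):
                counts[gf16_mul(k, x) ^ gf16_mul(k, y)] += 1
            worst = max(worst, max(counts))
    return Fraction(worst, 16)
```

For this family the search returns 1/16, i.e., $2^{-n}$, since multiplication by the non-zero difference of the two inputs is a bijection on the key space.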

Extended Mirror Theory
We prove the MAC security of EWCDM and DWCDM using the H-Coefficient technique, where one is required to lower bound the probability of realizing a good transcript in the real and the ideal world. In order to compute this probability in the real world, we need to count the number of permutations such that the following system of bivariate affine equations and non-equations holds. Note that $\pi_1$ and $\pi_2$ are two independent n-bit permutations for EWCDM, whereas $\pi_1 = \pi_2 = \pi$ for DWCDM. Moreover, . . , $Z_{s_r}\}$ such that $s = s_\ell + s_r$ be the total number of vertices in the graph. We write an edge of E as $\{Y_i, Z_j\}$, and we denote its label as $L(\{Y_i, Z_j\})$. We write $G^= := (V^=, E, L_{|E})$ for the equation subgraph, where $V^=$ is the set of vertices of V that are incident on at least one edge of E and $L_{|E}$ is the function L restricted to the set E. For a path P in the graph $G^=$, we define the label of the path as $L(P) := \bigoplus_{e \in P} L(e)$. Similarly, for a cycle C in the graph G, we define the label of the cycle as $L(C) := \bigoplus_{e \in C} L(e)$. We say the graph G is good if it satisfies the following two conditions:
1. $L(P) \neq 0$ for all paths P in the graph $G^=$, and
2. $L(C) \neq 0$ for all cycles C containing exactly one non-equation edge $e' \in E'$ (i.e., all the remaining edges of C are elements of E).
For a bipartite graph G, we say that G is good if it satisfies the following two conditions:
1. $L(P) \neq 0$ for all paths P of even length in the graph $G^=$, and
2. $L(C) \neq 0$ for all cycles C of even length containing exactly one non-equation edge $e' \in E'$ (i.e., all other edges of C are elements of E).
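The two goodness conditions for a general graph can be checked mechanically. The sketch below is an illustration under stated assumptions: the equation subgraph is acyclic (a forest), vertices are numbered from 0, and edges are given as hypothetical triples (u, v, label). It assigns each vertex the xor of the labels on the path from the root of its component, so that the label of the path between two vertices is the xor of their two values.

```python
from collections import defaultdict, deque

def is_good(num_vertices: int, eq_edges: list, neq_edges: list) -> bool:
    """Check conditions 1 and 2 for a graph whose equation edges form a forest."""
    adj = defaultdict(list)
    for u, v, label in eq_edges:
        adj[u].append((v, label))
        adj[v].append((u, label))
    comp, pot = {}, {}  # component root and xor-distance from that root
    for root in range(num_vertices):
        if root in comp:
            continue
        comp[root], pot[root] = root, 0
        seen = {0}
        queue = deque([root])
        while queue:
            u = queue.popleft()
            for v, label in adj[u]:
                if v in comp:
                    continue
                comp[v], pot[v] = root, pot[u] ^ label
                if pot[v] in seen:
                    return False  # condition 1 fails: some path has label 0
                seen.add(pot[v])
                queue.append(v)
    for u, v, label in neq_edges:
        # The cycle formed by the equation path u..v and this edge has label
        # pot[u] ^ pot[v] ^ label; condition 2 requires it to be non-zero.
        if comp[u] == comp[v] and pot[u] ^ pot[v] == label:
            return False
    return True
```

A non-equation edge joining two different components never closes a cycle with exactly one non-equation edge, so it cannot violate condition 2 and only affects the counting later on.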
Why is a good graph called good? Let P be any path in $G^=$, and let $Y_s$, $Y_t$ be the starting and the end vertex of the path respectively. Note that if L(P) is zero, then $Y_s \oplus Y_t = 0$, which is nothing but a permutation collision. Regarding condition (2), let C be any cycle of G that contains exactly one non-equation edge $e'$, and let x be the label of the path $P = C \setminus e'$, where $Y_s$ is the starting and $Y_t$ is the ending vertex of P. Note that the label of the edge $\{Y_s, Y_t\}$ is $L(e')$. Therefore, if L(C) = 0, then $Y_s \oplus Y_t = x = L(e')$, which contradicts the non-equation $Y_s \oplus Y_t \neq L(e')$. This is why we exclude such graphs from the set of good graphs. Similarly, for a bipartite graph G, we assume P to be a path of even length in $G^=$, with $Y_s$ and $Y_t$ the starting and end vertex of the path respectively. Note that, as the path length is even, both end vertices lie on the same side of the bipartition.
In fact, the starting and the ending vertex could have been $Z_s$ and $Z_t$. However, if the path length is odd, then the starting and ending vertex would have been $Y_s$ and $Z_t$ respectively. Note that, for such an even-length path P, if L(P) is zero, then $Y_s \oplus Y_t = 0$ or $Z_s \oplus Z_t = 0$ (if the starting and ending vertex are $Z_s$ and $Z_t$ respectively), which is nothing but a permutation collision. Regarding condition (2), let C be any cycle of G of even length that contains exactly one non-equation edge $e'$, and let x be the label of the path $P = C \setminus e'$. Then $Y_s \oplus Z_t = x$, where we assume that $Y_s$ is the starting and $Z_t$ is the ending vertex of P. Note that the label of the edge $\{Y_s, Z_t\}$ is $L(e')$, and hence, if L(C) = 0, then $Y_s \oplus Z_t = L(e')$. This contradicts the non-equation $Y_s \oplus Z_t \neq L(e')$, and hence we exclude such graphs from the set of good graphs. For such a good graph G, we associate a system of bivariate affine equations and non-equations for the general graph and for the bipartite graph as follows: Note that, in where we assume that there are α components of $G^=$ (i.e., $C_1, \ldots, C_\alpha$) with component size greater than 2 and β components of $G^=$ (i.e., $D_1, \ldots, D_\beta$) with component size exactly 2. We write C to denote $C_1 \sqcup \ldots \sqcup C_\alpha$ and D to denote $D_1 \sqcup \ldots \sqcup D_\beta$.
Definition 1. Let $\mathcal{E}_G$ be a system of equations and non-equations corresponding to a good acyclic edge-labelled graph G (as defined above). An injective function Φ :
In the following, we state and prove a result of mirror theory which says that if G is a good acyclic edge-labelled (bipartite) graph such that its subgraph $G^=$ can be decomposed into finitely many components of size greater than 2 and of size exactly 2, then the number of injective solutions to $\mathcal{E}_G$ is very close to the average number of solutions until the number of edges in E is roughly $2^{3n/4}$.
Disclaimer: Although we define a component as a set of vertices, from now onwards we also equivalently view a component as a graph with the appropriate edges. Thus, $C = C_1 \sqcup \ldots \sqcup C_\alpha$ alternatively denotes a disjoint collection of subgraphs of $G^=$. Now, we state the main theorem of Extended Mirror Theory, which estimates a lower bound on the number of solutions to the induced system of equations and non-equations for a good graph G. We state two versions of the theorem: one for a good acyclic general graph, and another for a good acyclic bipartite graph.
Let $q_c$ denote the total number of edges in C. Then the total number of injective solutions to $\mathcal{E}_G$ which are chosen from $\{0,1\}^n$ is at least:
Theorem 2 (Bipartite Graph). Let $G = (V_1 \sqcup V_2, E \sqcup E', L)$ be a good bipartite graph with $s_\ell$ vertices in $V_1$ and $s_r$ vertices in $V_2$, such that $|E| = q_m$, $|E'| = q_v$ and $s = s_\ell + s_r$ is the total number of vertices of the graph G. Let $q_c$ denote the total number of edges in C. Then the total number of injective solutions to $\mathcal{E}_G$ which are chosen from $\{0,1\}^n$ is at least:
Notations: Before we prove the above two theorems, we set up a few notations. Let h(G) denote the number of solutions to the graph G. Let $h_c(i)$ denote the number of solutions for the subgraph $C_1 \sqcup \ldots \sqcup C_i$ and $h_d(i)$ denote the number of solutions for the subgraph . For the graph G, a blue dashed edge represents a non-equation edge and hence belongs to the set $E'$, and a red continuous edge represents an equation edge and hence belongs to the set E. Moreover, $V^=$ denotes the set of all vertices of the subgraph $G^=$. We assume that there are $\mu_{i,j}$ edges from $E'$ connecting vertices of the i-th and j-th components of $G^=$, where j < i. Moreover, let $|V \setminus V^=| = k$, and for any vertex $v_i \in V \setminus V^=$, there are $\mu_i$ blue dashed edges incident on $v_i$.

Proof of Theorem 1
We prove the result step by step. We first estimate a lower bound on $h_c(\alpha)$, then we estimate a lower bound on $h_d(\beta)$, and finally we estimate a lower bound on the number of solutions to $G \setminus G^=$. Let $V^=_C$ denote the set of vertices of C and $w_i$ the number of vertices of the component $C_i$.

Lower Bound on h c (α).
To lower bound h c (α), we count the number of solutions in each of the α components of C. For the first component C 1 , there are 2 n ways to assign values to any one of the vertices of the component, and that uniquely determines the values to the rest of the variables in that component. For example, consider the graph as depicted in Fig. 3.2 and let us assume that we assign a value to vertex v and let the assigned value ve x. Then the value at node v 1 is x ⊕ λ 1 , at node v 2 is x ⊕ λ 2 , at node v 3 is x ⊕ λ 3 and at node v 4 is x ⊕ λ 4 . As the graph is good, none of the λ values is zero, and all the λ values are distinct. These two facts ensure the distinctness of the values assigned at node v 1 , v 2 , v 3 and v 4 by assigning the value to node v which has 2 n choices. Once such a solution is fixed for the first component, we consider the second component. We consider any arbitrary vertex in the second component should not take w 1 w 2 values. This is due to the fact that Y iw 1 +1 cannot take w 1 values. Moreover, once an assignment is done to Y iw 1 +1 , it fixes the value of the rest of w 2 − 1 vertices of C 2 such that each of the remaining vertices of C 2 do not collide with the previous w 1 values. Therefore, a total of w 1 + (w 2 − 1)w 1 = w 1 w 2 values are discarded. Additionally, as there are µ 2,1 many blue dashed edges connecting the component C 1 and C 2 , there are µ 2,1 many paths from the vertex Y iw 1 +1 to the vertices of the component C 1 , and hence it cannot take µ 2,1 values that violate the non-equality conditions of µ 2,1 many blue dashed edges. As a result, there are at most w 1 w 2 + µ 2,1 forbidden values for assignment to the vertex Y iw 1 +1 . Hence, there are at least (2 n − w 1 w 2 − µ 2,1 ) valid choices for Y iw 1 +1 . Once a valid value is assigned to the variable Y iw 1 +1 , the remaining variables in the second component will be assigned uniquely.
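The counting argument for two components can be verified by brute force on a toy system. The sketch below uses made-up parameters that are not from the text: n = 4, a first component with vertex values x, x ⊕ 1, x ⊕ 2, a second component with vertex values y, y ⊕ 3, y ⊕ 5 (so $w_1 = w_2 = 3$), and a single blue dashed edge demanding (x ⊕ 1) ⊕ (y ⊕ 5) ≠ 7 (so $\mu_{2,1} = 1$). It enumerates all assignments and compares the exact count with $2^n (2^n - w_1 w_2 - \mu_{2,1})$.

```python
from itertools import product

N_VALUES = 1 << 4        # toy block size n = 4
C1_OFFSETS = (0, 1, 2)   # path labels from the root of C1 (hypothetical)
C2_OFFSETS = (0, 3, 5)   # path labels from the root of C2 (hypothetical)
NEQ_LABEL = 7            # label of the single blue dashed edge

def count_injective_solutions() -> int:
    """Count assignments with all six values distinct and the non-equation satisfied."""
    count = 0
    for x, y in product(range(N_VALUES), repeat=2):
        vals1 = [x ^ o for o in C1_OFFSETS]
        vals2 = [y ^ o for o in C2_OFFSETS]
        if len(set(vals1 + vals2)) != 6:
            continue  # a collision between the two components
        if vals1[1] ^ vals2[2] == NEQ_LABEL:
            continue  # violates the blue dashed edge
        count += 1
    return count

def lower_bound() -> int:
    """2^n choices for C1 times at least 2^n - w1*w2 - mu_{2,1} choices for C2."""
    w1, w2, mu21 = len(C1_OFFSETS), len(C2_OFFSETS), 1
    return N_VALUES * (N_VALUES - w1 * w2 - mu21)
```

The exact count exceeds the bound here because several of the $w_1 w_2$ forbidden differences coincide, which illustrates that the bound discards values conservatively.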
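In general, for the $i$-th component, once an injective solution is fixed for the previous $i-1$ components, there are at least $2^n - (w_1 + \cdots + w_{i-1})\,w_i - \delta_i$ valid choices, where $\delta_i = \mu_{i,1} + \cdots + \mu_{i,i-1}$ counts the blue dashed edges between $C_i$ and the earlier components. Multiplying these counts over all $\alpha$ components yields the lower bound on $h_c(\alpha)$, where (1) holds as $\delta_1 + \delta_2 + \cdots + \delta_\alpha$ is the total number of blue dashed edges across the components of $G^{=}$, and (2) holds as $(w_1 + \cdots + w_\alpha) = \sigma_\alpha = q_c + \alpha$ and $\alpha \le q_c/2$.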
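The counting step for the second component can be checked by brute force on a toy instance. The sketch below is only an illustration under assumed parameters: the block size `n = 6`, the edge labels, and the helper names `component_values` and `count_valid_roots` are hypothetical, each component is modelled as a star whose vertices take the root value XORed with an edge label, and the non-equation (blue dashed) edges are ignored.

```python
def component_values(root, labels):
    """Values taken by a star component: the root and root XOR each edge label."""
    return [root] + [root ^ lam for lam in labels]


def count_valid_roots(n, fixed_values, labels):
    """Count roots y in {0,1}^n whose whole component avoids every fixed value."""
    fixed = set(fixed_values)
    return sum(
        1
        for y in range(2 ** n)
        if fixed.isdisjoint(component_values(y, labels))
    )


n = 6                                   # toy block size (assumption)
labels_1 = [1, 2, 3, 4]                 # nonzero, pairwise distinct labels
labels_2 = [5, 9]
c1 = component_values(7, labels_1)      # first component, w_1 = 5 vertices
w1, w2 = len(c1), len(labels_2) + 1     # second component has w_2 = 3 vertices
valid = count_valid_roots(n, c1, labels_2)
print(valid, ">=", 2 ** n - w1 * w2)
```

The exhaustive count never falls below $2^n - w_1 w_2$, since each of the $w_2$ vertices of the second component rules out at most $w_1$ root values.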

Lower Bound on h d (β).
Now we would like to find a lower bound on $h_d(i+1)$ in terms of $h_d(i)$. Let $\lambda^*_i$ denote the label of the edge in component $D_i$, and recall that $\sigma_\alpha$ denotes the total number of vertices in $C$. Now, we consider the component $D_{i+1}$.
. Now, we are interested in obtaining a lower bound on h d (i + 1) as follows. Let Y σα+2i+1 be the vertex of D i+1 . Then, Y σα+2i+1 must satisfy the following: Since We say two indices u, v ∈ {σ α + 1, . . . , σ α + 2i} are in the same component of D i , if there is an edge between vertices Y u and Y v in the subgraph D i . Now, there are the following two cases: • On the other hand, we consider the case when u and v are in different components is the label of the edge whose end vertex is Y u (resp Y v ). In this case, we estimate a lower bound on h (P, Q). Note that when u and v are in different components and if any of the above conditions hold, then h (P, Q) = 0.
The following lemma gives a lower bound on h (P, Q) in terms of h d (i), proof of which is postponed in Sect. 3.2.
Now, we define the following two sets: Note that, δ is the number of multi-collisions of λ * i+1 and let ∆ denote the maximum number of multi-collisions maximized over λ * values. Now, it is easy to see that This is because k and k are not in the same component, and hence we have 2i(2i−2) choices for (k, k ). However, out of these many choices, there are 2δ(2i−2δ) Therefore, from Eqn. (4), Eqn. (5), Lemma 2 and the above two cases where (u, v) either belongs to the same component or in a different component, we have: .
Having expressed the number of solutions for $D_{i+1}$ in terms of the number of solutions for $D_i$ (Eqn. (6)), we now count the number of solutions for the remaining variables.

Lower Bound on the Number of Solutions to Remaining Variables.
Now, we lower bound the number of solutions for $V \setminus V^{=}$. Recall that $|V \setminus V^{=}| = k$. Fix such a vertex $Y_{\sigma_\alpha+2\beta+i}$ and assume that $\mu_{\sigma_\alpha+2\beta+i}$ blue dashed edges are incident on it. Let $y$ be the value assigned to the variable $Y_{\sigma_\alpha+2\beta+i}$. For $y$ to be a valid assignment, it must satisfy the following: (i) $y$ is distinct from the $\sigma_\alpha + 2\beta$ previously assigned values; (ii) $y$ is distinct from the $(i-1)$ values already assigned to variables of the set $V \setminus V^{=}$; (iii) $y$ avoids the $\mu_{\sigma_\alpha+2\beta+i}$ values that would violate the non-equality conditions of the $\mu_{\sigma_\alpha+2\beta+i}$ blue dashed edges.
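Therefore, the number of valid choices of $y$ is at least $2^n - \sigma_\alpha - 2\beta - (i-1) - \mu_{\sigma_\alpha+2\beta+i}$. Summarizing everything, the total number of possible injective solutions for the remaining vertices is at least
$$\prod_{i=1}^{k} \bigl(2^n - \sigma_\alpha - 2\beta - i + 1 - \mu_{\sigma_\alpha+2\beta+i}\bigr). \qquad (7)$$
Therefore, combining Eqn. (3), Eqn. (6) and Eqn. (7) gives the overall lower bound of Eqn. (8).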
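The sequential counting argument for the remaining variables can be compared with an exhaustive count on a small instance. This is a sketch under simplifying assumptions: the names `product_lower_bound` and `count_free_solutions` are hypothetical, and each non-equation edge is modelled as a single forbidden value for the free variable it touches.

```python
from itertools import product


def product_lower_bound(n, num_fixed, mus):
    """Product over free variables of (2^n - num_fixed - (i-1) - mu_i)."""
    bound = 1
    for i, mu in enumerate(mus):        # i = 0 is the first free variable
        bound *= 2 ** n - num_fixed - i - mu
    return bound


def count_free_solutions(n, fixed, forbidden):
    """Exhaustively count valid assignments to len(forbidden) free variables."""
    fixed = set(fixed)
    count = 0
    for ys in product(range(2 ** n), repeat=len(forbidden)):
        if len(set(ys)) != len(ys):
            continue                     # free variables must be pairwise distinct
        if any(y in fixed or y in bad for y, bad in zip(ys, forbidden)):
            continue                     # collision with a fixed or forbidden value
        count += 1
    return count


n = 4
fixed = range(6)                         # stands in for the sigma_alpha + 2*beta earlier values
forbidden = [{6, 7}, {8}, set()]         # values excluded by non-equation edges
bound = product_lower_bound(n, 6, [len(b) for b in forbidden])
exact = count_free_solutions(n, fixed, forbidden)
print(bound, "<=", exact)
```

The product is a lower bound rather than an exact count because some forbidden values may coincide with values that are already excluded for another reason.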

Algebraic Calculation.
In this section, we individually bound A.1, A.2 and A.3. We begin with bounding A.1 as follows: Bounding A.1: Recall that σ α = q c + α. By using Eqn.
Merging the three bounds: In the final step, we merge the bounds obtained for A.1, A.2 and A.3. Plugging the lower bounds of A.1, A.2 and A.3 into Eqn. (8), we obtain where the above inequality follows as $q'_v + q''_v + q'''_v = q_v$, the total number of non-equation edges.

Proof of Lemma 2
In this section, we prove Lemma 2. Let $P = Y_u$ and $Q = Y_v$ such that $u$ and $v$ are in different components, and let $\lambda^*_a$ (resp. $\lambda^*_b$) be the label of the edge whose end vertex is $Y_u$ (resp. $Y_v$). By removing the two equations whose constant parts are $\lambda^*_a$ and $\lambda^*_b$, we derive Therefore, from the above two equations together with the trivial inequality $2^{2n} \ge (2^n - (\sigma_\alpha + 2i - 4))(2^n - (\sigma_\alpha + 2i - 2))$, we obtain the result.

Proof of Theorem 2
We prove the result in the same way as Theorem 1: we first lower bound $h_c(\alpha)$, then $h_d(\beta)$, and finally the number of solutions to $G \setminus G^{=}$. For the $i$-th component $C_i$ of $C$, which is an acyclic edge-labelled bipartite graph, let $V^{=}_{C_i}$ be the set of vertices of $C_i$.

Lower Bound on h c (α).
We lower bound $h_c(\alpha)$ by counting the number of solutions in each of the $\alpha$ components of $C$. For the first component $C_1$, there are $2^n$ ways to assign a value to any one of the vertices of $V^{\uparrow}_1$, which uniquely determines the values of all the vertices of $V^{\downarrow}_1 \cup V^{\uparrow}_1$. A vertex of $V^{\uparrow}_2$ in the second component $C_2$ cannot take $s_{\ell,1} s_{\ell,2} + s_{r,1} s_{r,2}$ values. Additionally, as there are $\mu_{2,1}$ blue dashed edges connecting the components $C_1$ and $C_2$, there are $\mu_{2,1}$ paths from the assigned vertex to the vertices of $C_1$, and hence it cannot take the $\mu_{2,1}$ values that violate the non-equality conditions of these edges. As a result, there are at least $(2^n - s_{\ell,1} s_{\ell,2} - s_{r,1} s_{r,2} - \mu_{2,1})$ valid choices. In general, for the $i$-th component, there are at least $(2^n - (s_{\ell,1} + \cdots + s_{\ell,i-1})\,s_{\ell,i} - (s_{r,1} + \cdots + s_{r,i-1})\,s_{r,i} - \mu_{i,1} - \cdots - \mu_{i,i-1})$ injective solutions. For notational simplicity, we write $\delta_i = \mu_{i,1} + \cdots + \mu_{i,i-1}$. Hence, we have where (1) holds as $\delta_1 + \delta_2 + \cdots + \delta_\alpha$ is the total number of blue dashed edges across the components of $G^{=}$, and (2) holds as $(s_{\ell,1} + s_{r,1} + \cdots + s_{\ell,\alpha} + s_{r,\alpha}) = q_c + \alpha$ and $\alpha \le q_c/2$.

Lower Bound on h d (β).
We want a lower bound on $h_d(i+1)$ in terms of $h_d(i)$. Let $\lambda^*_i$ denote the label of the edge in the component $D_i$ and recall that We write the vertex of one part of $D_i$ as $Y_{s^{\uparrow}+i}$ and that of the other part as $Z_{s^{\downarrow}+i}$, i.e., Note that $|Z_1| = (s^{\uparrow} + i)$ and $|Z_2| = (s^{\downarrow} + i)$. Applying the inclusion-exclusion principle, we have where $h(P, Q)$ denotes the number of solutions to We say two indices $u \in \{s^{\uparrow}+1, \ldots, s^{\uparrow}+i\}$, $v \in \{s^{\downarrow}+1, \ldots, s^{\downarrow}+i\}$ are in the same component of $D_i$ if there is an edge between the vertices $Y_u$ and $Z_v$ in the subgraph $D_i$. Now, there are the following two cases:
• If $P = Y_u$ and $Q = Z_v$ such that $u$ and $v$ are in the same component of $D_i$, and $\lambda^*_{i+1} = \lambda$ (where $\lambda$ is the label of the edge connecting the vertices $Y_u$ and $Z_v$), then $h(P, Q) = h_d(i)$. Moreover, if $\lambda^*_{i+1} \neq \lambda$, then $h(P, Q) = 0$.
• Otherwise, we consider the case when $u$ and $v$ are in different components, and is the label of the edge whose end vertex is $Y_u$ (resp. $Z_v$). In this case, we estimate a lower bound on $h(P, Q)$. Note that when $u$ and $v$ are in different components and any of the above conditions holds, then $h(P, Q) = 0$.
The following lemma gives a lower bound on h (P, Q) in terms of h d (i), proof of which can be found in Eqn. (4) of [KLL20].
Now, we define the following two sets: Recall that, δ is the number of multi-collisions of λ * i+1 . Now, using the similar argument, one can check that Therefore, Eqn. (10), Eqn. (11), Lemma 3, and the above two cases where (u, v) either belongs to the same component or in different component lead us to the following inequality: where (1) holds as i ≤ 2 n−1 .Having the number of solutions for D i+1 in terms of the number of solutions to D i , we find out the number of solutions to the remaining variables as follows:

Lower Bound on the Number of Solutions to Remaining Variables.
Now, we lower bound the number of solutions for $V \setminus V^{=}$. Recall that $|V \setminus V^{=}| = k$. Fix such a vertex and assume that $\mu_{\sigma_\alpha+2\beta+i}$ blue dashed edges are incident on it. Let $y$ be the value assigned to the variable. For $y$ to be a valid assignment, it must satisfy the following:
• $y$ should be distinct from the $\sigma_\alpha + 2\beta$ previously assigned values.
• $y$ should be distinct from the $(i-1)$ values already assigned to the variables of the set $V \setminus V^{=}$.
• $y$ should not take any of the $\mu_{\sigma_\alpha+2\beta+i}$ values that violate the non-equality conditions of the $\mu_{\sigma_\alpha+2\beta+i}$ blue dashed edges.
Merging the three bounds: In the final step, we merge the bounds obtained for A.1, A.2 and A.3. Using the fact that $q'_v + q''_v + q'''_v = q_v$, the total number of non-equation edges, and plugging the lower bounds of A.1, A.2 and A.3 into Eqn. (14), we obtain
Remark 2. We note that the proofs of Theorem 1 and Theorem 2 differ from that of [KLL20]: the proof in [KLL20] only lower bounds the number of solutions to a system of bivariate affine equations, whereas our result lower bounds the number of solutions to a system of bivariate affine equations and non-equations. That is why, while counting the number of solutions in Sect. 3.1.1 and Sect. 3.3.1 for extended mirror theory, we discarded the choices that violate the non-equality conditions. Such restrictions were not present in [KLL20].

Security Result of EWCDM
In this section, we state and prove that EWCDM is secure up to $2^{3n/4}$ MAC queries and $2^n$ verification queries against nonce respecting adversaries. The following result bounds the MAC advantage of EWCDM against nonce respecting adversaries.
, $t_H$ be the time for computing the hash function. Assuming $\epsilon_{axu} \approx 2^{-n}$, EWCDM is secure up to roughly $q_m \approx 2^{3n/4}$ MAC queries and $q_v \approx 2^n$ verification queries.

Proof of Theorem 3
For the sake of notational simplicity, we refer to the construction EWCDM[E, H] simply as EWCDM when the primitives are understood from the context. As the first step of the proof, we replace the two independent block ciphers of the construction with two independently sampled $n$-bit uniform random permutations $\pi_1$ and $\pi_2$ at the cost of the sprp advantage of $E$, and denote the resulting construction as EWCDM$^*[\pi_1, \pi_2, H]$, i.e.,
$$\mathrm{EWCDM}^*[\pi_1, \pi_2, H](\nu, M) = \pi_2\bigl(\pi_1(\nu) \oplus \nu \oplus H_{k_h}(M)\bigr).$$
Instead of arguing the security of EWCDM$^*$, we argue the security of EWCDM$^*[\pi_1, \pi_2^{-1}, H]$, which we denote as EWCDM$^+$. Note that the distinguishing advantage of the adversary D for the latter is identical to that for the former, as $\pi_1, \pi_2$ are mutually independent. The advantage of analysing the latter construction is that one can view an evaluation $T = \mathrm{EWCDM}^+(\nu, M)$ as the xor of two permutations in the middle of the function, i.e.,
$$\pi_1(\nu) \oplus \pi_2(T) = \nu \oplus H_{k_h}(M).$$
Our goal is to upper bound the information-theoretic MAC security of EWCDM$^+$. To do this, we resort to Eqn. (1), which allows us to bound the MAC security of EWCDM$^+$ in terms of the advantage in distinguishing EWCDM$^+$ from an ideal world consisting of a random oracle \$ that outputs a random tag on every input $(\nu, M) \in \{0,1\}^n \times \mathcal{M}$ and a reject oracle $\perp$ that always outputs 0 on every query $(\nu, M, T)$. At the end of the interaction, the real world releases the hash key $k_h$ and the ideal world releases a random dummy key $k_h$. We then apply the H-Coefficient Technique [Pat08a] to bound the distinguishing advantage of EWCDM$^+$. We can represent an attainable transcript $\tau = (\tau_m, \tau_v, k_h)$ in terms of the following equations and non-equations:
$$E_m = \{\pi_1(\nu_i) \oplus \pi_2(T_i) = \nu_i \oplus H_{k_h}(M_i) : i \in [q_m]\}, \qquad E_v = \{\pi_1(\nu'_a) \oplus \pi_2(T'_a) \neq \nu'_a \oplus H_{k_h}(M'_a) : a \in [q_v]\}.$$
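The change of viewpoint from EWCDM$^*$ to EWCDM$^+$ can be made concrete with a toy model. The sketch below is illustrative only and rests on assumptions not taken from the paper: an 8-bit block size, lookup-table permutations sampled with Python's `random` module, and a Horner-style polynomial hash over GF($2^8$) standing in for $H$. All function names are hypothetical.

```python
import random

N_BITS = 8                     # toy block size (assumption)
SIZE = 1 << N_BITS


def random_permutation(rng):
    """Sample a uniform permutation of {0, ..., 2^n - 1} as a lookup table."""
    table = list(range(SIZE))
    rng.shuffle(table)
    return table


def invert(perm):
    """Return the inverse lookup table of a permutation."""
    inv = [0] * SIZE
    for x, y in enumerate(perm):
        inv[y] = x
    return inv


def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & SIZE:
            a ^= poly
        b >>= 1
    return result


def poly_hash(key, blocks):
    """Toy stand-in for H: Horner evaluation m_1*k^l + ... + m_l*k."""
    acc = 0
    for block in blocks:
        acc = gf_mul(acc ^ block, key)
    return acc


def ewcdm_star(pi1, pi2, key, nonce, blocks):
    """T = pi2(pi1(nonce) XOR nonce XOR H_key(M))."""
    return pi2[pi1[nonce] ^ nonce ^ poly_hash(key, blocks)]


rng = random.Random(2024)
pi1, pi2 = random_permutation(rng), random_permutation(rng)
hash_key, message = 0x53, [0x01, 0xA7, 0x3C]

# EWCDM+ is EWCDM* instantiated with (pi1, pi2^{-1}).
tag = ewcdm_star(pi1, invert(pi2), hash_key, 0x10, message)
print(pi1[0x10] ^ pi2[tag] == 0x10 ^ poly_hash(hash_key, message))
```

Every MAC query of EWCDM$^+$ therefore contributes one bivariate affine equation in the two unknowns $\pi_1(\nu)$ and $\pi_2(T)$, which is the form that mirror theory counts.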

Definition and Probability of Bad Transcripts
In this section, we define and bound the probability of bad transcripts in the ideal world. We say a transcript τ = (τ m , τ v , k h ) is bad if it satisfies either of the following conditions: Having defined the bad transcripts, we bound the probability of realizing bad transcripts in the ideal world as follows.
Lemma 4. Let X id and Θ b be defined as above. Then, we have We defer the proof of the Lemma in Sect. 5.

Analysis of good transcripts
Let $\tau = (\tau_m, \tau_v, k_h)$ be a good transcript. We show that realizing $\tau$ is almost as likely in the real world as in the ideal world. In particular, we prove the following result.
Lemma 5. Let τ = (τ m , τ v , k h ) be a good transcript. Then Proof. Since the MAC oracle in the ideal world is perfectly random and the verification oracle always outputs 0, one simply has To lower bound the real interpolation probability, we say that a pair of permutations (π 1 , π 2 ) is compatible with τ if Let Comp(τ ) denotes the set of all pair of permutations (π 1 , π 2 ) that are compatible with τ . Therefore, we have Lower bounding P mv implies lower bounding the probability of the number of solutions to the system of q m many bivariate affine MAC equations and q v many bivariate affine verification non-equations E m ∪ E v . From the above system of bivariate affine equations and non-equations, one can induce an edge-labelled undirected bipartite graph G τ = (V = V 1 V 2 , E E , L), where the set of nodes V is partitioned into two sets, V 1 = {Y 1 , . . . , Y s } and V 2 = {Z 1 , . . . , Z sr }, E is the set of edges corresponding to each MAC equation, and E is the set of edges corresponding to each verification non-equation. Therefore, q m = |E| and is the subgraph of G τ . Now, it is easy to argue the following.  Note that for the star type component, if we consider any path of G = τ of even length, then the label of the path is non-zero, otherwise bad condition B.3 would have been satisfied. Moreover, the graph is acyclic by construction, which proves the claim.
Resuming the proof of Lemma 5. Since $G_\tau$ is good, the graph is acyclic. Therefore, let us assume that there are $\alpha + \beta$ components in the subgraph $G^{=}_\tau$ such that each of the first $\alpha$ components has size greater than 2 and each of the remaining $\beta$ components has size 2. Moreover, by construction, each of the first $\alpha$ components is a star-type graph. As the transcript $\tau$ is good, the total number of edges in the first $\alpha$ components is at most $q_m^{2/3}$. Applying Theorem 2 yields the lower bound on $P_{mv}$ in Eqn. (17), and combining Eqn. (16) with Eqn. (17) gives Eqn. (18). Finally, by taking the ratio of Eqn. (18) to Eqn. (15), we obtain the result.
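The split of $G^{=}_\tau$ into $\alpha$ larger components and $\beta$ components of size two can be sketched with a small union-find routine. This is an illustration under assumed conventions: each MAC query is reduced to a (nonce, tag) pair, nodes are tagged `'Y'` for inputs of $\pi_1$ and `'Z'` for inputs of $\pi_2$, and the function names are hypothetical.

```python
from collections import Counter


def equation_graph_components(queries):
    """Return the component sizes of the bipartite graph with one edge per query."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for nonce, tag in queries:
        root_y, root_z = find(("Y", nonce)), find(("Z", tag))
        if root_y != root_z:
            parent[root_y] = root_z

    sizes = Counter(find(node) for node in list(parent))
    return sorted(sizes.values(), reverse=True)


def split_alpha_beta(component_sizes):
    """Count components of size > 2 (alpha) and of size exactly 2 (beta)."""
    alpha = sum(1 for s in component_sizes if s > 2)
    beta = sum(1 for s in component_sizes if s == 2)
    return alpha, beta


# Nonce-respecting queries: nonces are distinct, tags 10 repeat three times.
queries = [(0, 10), (1, 10), (2, 10), (3, 11), (4, 12)]
sizes = equation_graph_components(queries)
print(sizes, split_alpha_beta(sizes))
```

With distinct nonces every `'Y'` node has degree one, so any component with more than two vertices is a star centred at a repeated tag, matching the star-type structure used in the proof.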

Proof of Lemma 4
Using the union bound, we write In the following, we bound the probabilities of all the bad events individually.
Bounding B.1: Let $X$ denote the cardinality of the set $\{i \neq j \in [q_m] : T_i = T_j\}$ and, for all $i \neq j \in [q_m]$, let $I_{ij}$ be the indicator random variable that takes the value 1 if $T_i = T_j$, and 0 otherwise. Therefore, $X = \sum_{i \neq j} I_{ij}$. By the linearity of expectation and the uniformity of the tags in the ideal world, we have $\mathbf{E}[X] = \sum_{i \neq j} \mathbf{E}[I_{ij}] \le q_m^2/2^n$. Note that, for a fixed choice of $i, a$, the probability of the above event is at most $\epsilon_{axu}$ due to the randomness of the hash key. However, the number of choices for $i$ is at most one as the adversary is nonce-respecting. Therefore, by varying over all possible choices of $a$, we have Since, in the ideal oracle, the hash key is sampled independently of all previously sampled MAC responses $T_i$, we write Finally, Lemma 4 follows from Eqn. (19)-(23).
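The linearity-of-expectation step used for B.1 can be verified exhaustively for tiny parameters. The sketch below assumes that collisions are counted over unordered pairs $i < j$ and that tags are independent and uniform, as in the ideal world; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product


def expected_collisions(q, n):
    """Closed form: (q choose 2) pairs, each colliding with probability 2^-n."""
    return Fraction(q * (q - 1), 2 * 2 ** n)


def exhaustive_mean_collisions(q, n):
    """Average number of colliding pairs over all (2^n)^q tag tuples."""
    total = 0
    for tags in product(range(2 ** n), repeat=q):
        total += sum(1 for i, j in combinations(range(q), 2) if tags[i] == tags[j])
    return Fraction(total, (2 ** n) ** q)


print(expected_collisions(4, 2), exhaustive_mean_collisions(4, 2))
```

Both values agree, which is what makes a Markov-type bound on the number of tag collisions straightforward to state in terms of $q_m$ and $n$.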

Security Result of DWCDM
In this section, we state and prove that DWCDM is secure up to $2^{3n/4}$ MAC queries and $2^n$ verification queries against nonce respecting adversaries. The following result bounds the MAC advantage of DWCDM against nonce respecting adversaries. For the sake of notational simplicity, we refer to DWCDM as $\Pi$.
, $t_H$ be the time for computing the hash function. Assuming $\epsilon_{reg}, \epsilon_{axu}, \epsilon_{3\text{-}reg}, \epsilon_{4\text{-}reg} \approx 2^{-n}$, we obtain the desired bound for DWCDM.
We want to point out that the hash function's 3-way regular and 4-way regular properties do not necessarily demand longer hash keys. For example, the 3-way and 4-way regular bound of the Polyhash [MI11] function with an $n$-bit key is $\ell/2^n$ [DDNY19, Proposition 1], where $\ell$ denotes the maximum number of message blocks.
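The $\ell/2^n$ flavour of such bounds can be illustrated on the simplest case, plain regularity, using a toy polynomial hash. This sketch is an assumption-laden illustration and not the Polyhash of [MI11]: it uses GF($2^8$) with the AES reduction polynomial, a Horner-style evaluation without length encoding, and hypothetical function names. It only checks that, for a nonzero message of $\ell$ blocks, no output value is hit by more than $\ell$ keys.

```python
from collections import Counter

SIZE = 1 << 8                  # toy field GF(2^8) (assumption)


def gf_mul(a, b, poly=0x11B):
    """Multiply in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & SIZE:
            a ^= poly
        b >>= 1
    return result


def poly_hash(key, blocks):
    """Horner evaluation of m_1*k^l + m_2*k^(l-1) + ... + m_l*k."""
    acc = 0
    for block in blocks:
        acc = gf_mul(acc ^ block, key)
    return acc


def max_preimage_count(blocks):
    """Largest number of keys mapping the message to one output value."""
    counts = Counter(poly_hash(key, blocks) for key in range(SIZE))
    return max(counts.values())


message = [3, 0, 7]            # l = 3 blocks, not all zero
print(max_preimage_count(message), "<=", len(message))
```

The check succeeds because a nonzero polynomial of degree at most $\ell$ over a field has at most $\ell$ roots, so each target value is reached with probability at most $\ell/2^n$ over the key.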

Proof of Theorem 4
For the sake of simplicity, we will refer to the construction Π[E, E −1 , H] as Π when the underlying primitives are understood from the context. As the first step of the proof, we replace the block cipher E and its inverse with an n-bit uniform random permutation π and its inverse respectively. This comes at the cost of the sprp advantage of E, and we denote the resulting construction as Π * . Our goal is to upper bound the information-theoretic MAC security of Π * . For doing this, we resort to the Eqn.(1), which allows us to bound the MAC security of Π * in terms of the distinguishing advantage in distinguishing Π * from an ideal world consisting of a random oracle $ that outputs a random tag on every input (ν, M ) ∈ {0, 1} 3n/4 × M and a reject oracle ⊥ that always outputs 0 on every query (ν, M, T ). As a result, we apply the H-Coefficient Technique [Pat08a] to bound the distinguishing advantage of Π * . For the sake of notational simplicity, we write ν =ν 0 n/4 . As before, we can represent an attainable transcript τ = (τ m , τ v , k h ) in terms of the following equations: We say a cycle C = = (i 1 , i 2 , . . . , i p ) of length p in the graph G = = (V = , E, L |E ) is valid if the imposed equality pattern of (ν, T ), generated out of C = , derives the following equation: We also consider a cycle C = = (i 1 , i 2 , . . . , i p ) of length p in G = , containing exactly one non-equation edge e ∈ E (i.e., all other edges of C = are elements of E). We call C = to be valid if the imposed equality pattern of (ν, T ) and (ν , T ), generated out of C = , derives the equation where e represents the equation π(ν ) ⊕ π(T ) = ν ⊕ H k h (M ). Now, we state a simple result for MAC queries. Proof. If (ν j , M j , T j ) appears after the i-th query (ν i , M i , T i ), then the event that T j collides with ν i holds with probability 2 −n . 
Moreover, if (ν j , M j , T j ) appears before the i-th query (ν i , M i , T i ), then the event that T j collides with ν i holds with probability exactly 1/2 n/4 . Here we use the fact that the probability of the last n/4 bits of T i set to all zero is 1/2 n/4 .
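The two probabilities in this argument can be illustrated with a toy model of the single-permutation construction. The sketch below is a hedged illustration: it assumes an 8-bit block with a 6-bit nonce padded by $n/4 = 2$ zero bits, a lookup-table permutation, and a fixed stand-in value for $H_{k_h}(M)$; all names are hypothetical. It checks the relation $\pi(\nu) \oplus \pi(T) = \nu \oplus H_{k_h}(M)$ and counts how many $n$-bit strings have the shape of a valid padded nonce.

```python
import random
from fractions import Fraction

N_BITS = 8                         # toy block size (assumption)
PAD_BITS = N_BITS // 4             # the last n/4 bits of a nonce block are zero
SIZE = 1 << N_BITS


def pad_nonce(short_nonce):
    """Append n/4 zero bits to a 3n/4-bit nonce."""
    return short_nonce << PAD_BITS


def is_valid_nonce(block):
    """True if the last n/4 bits of the block are all zero."""
    return block & ((1 << PAD_BITS) - 1) == 0


def dwcdm_star(pi, pi_inv, hash_value, short_nonce):
    """T = pi^{-1}(pi(nu) XOR nu XOR h), with h standing in for H(M)."""
    nu = pad_nonce(short_nonce)
    return pi_inv[pi[nu] ^ nu ^ hash_value]


rng = random.Random(7)
pi = list(range(SIZE))
rng.shuffle(pi)
pi_inv = [0] * SIZE
for x, y in enumerate(pi):
    pi_inv[y] = x

valid_fraction = Fraction(sum(is_valid_nonce(t) for t in range(SIZE)), SIZE)
print(valid_fraction)              # fraction of n-bit strings usable as a nonce
```

A uniformly random tag therefore has the form of a valid nonce with probability $2^{-n/4}$, which is the quantity that appears whenever an earlier tag is later reused as a nonce.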

Definition and Probability of Bad Transcripts
In this section, we define and bound the probability of bad transcripts in the ideal world. We say a transcript $\tau = (\tau_m, \tau_v, k_h)$ is bad if its associated graph $G_\tau$ satisfies any of the following conditions:
-B.2 : $G^{=}$ has a component of size at least 5.
-B.4 : G = contains a valid cycle C = of any arbitrary length.
-B.5 : G = contains a valid cycle C = of any arbitrary length.
Moreover, τ is also said to be bad if -B.9 : G = has a path of length 3 such that its label is zero.
Having defined the bad transcripts, we bound the probability of realizing bad transcripts in the ideal world as follows.
Lemma 7. Let X id and Θ b be defined as above. Then, we have We defer the proof of the Lemma in Sect. 7.

Analysis of good transcripts
Let us consider τ = (τ m , τ v , k h ) a good transcript, and we show that realizing τ is almost as likely in the real world as in the ideal world. In particular, we prove the following result. Proof. Since the MAC oracle in the ideal world is perfectly random and the verification oracle always outputs 0, we obtain To lower bound the real interpolation probability, we say that a permutation π is compatible Let Comp(τ ) denotes the set of permutations that are compatible with τ . Therefore, we have is the subgraph of G τ . Now, it is easy to argue the following.

Claim 1. For a good transcript τ , the induced graph G τ is a good graph.
Proof. For a good transcript τ , as the component size of G = τ is at most 4, it is to be noted that the subgraph G = τ contains only three types of components as follows in Fig. 6.1. Resuming the proof of Lemma 8. Suppose there are α + β components in the subgraph G = τ such that the size of each of the first α components is greater than two and the remaining β components are of size two each. As the transcript τ is good, the total number of edges in the first α components is at most q 2/3 m . Now applying Theorem 1, we obtain P mv ≥ 1 2 nqm 1 − Therefore, from Eqn. (25) and Eqn. (26), we have Note that the hash key is not derived using π, it is sampled independent to the block cipher keys.

Proof of Lemma 7
Using the union bound, we write Depending on the collision pattern of the vertices, we have the following cases, which we analyze one by one.
Bounding Case-I. $\exists\, i, j, k, l \in [q_m]$ such that $T_i = T_j = T_k = T_l$. For a fixed set of $i, j, k, l \in [q_m]$, the probability of this event is bounded by $2^{-3n}$, as each $T_i$ is sampled uniformly at random from $\{0,1\}^n$. Summing over all possible choices of $i, j, k$ and $l$, we obtain the bound $\frac{q_m^4}{24 \cdot 2^{3n}} \le \frac{q_m}{2^{3n/4}}$, assuming $q_m \le 2^{3n/4}$.
Bounding Case-II. $\exists\, i, j, k, l \in [q_m]$ such that $T_i = T_j = T_k = \nu_l$ or $T_i = T_j = T_k,\ \nu_k = T_l$. For a fixed set of $i, j, k$, the probability of $T_i = T_j = T_k$ is bounded by $2^{-2n}$. Now, we have the following two subcases for $\nu_l = T_i = T_j = T_k$:
• Case (a): If $l < i, j, k$, then the probability of $\nu_l = T_i$ is bounded by $2^{-n}$. Thus the overall probability becomes $2^{-3n}$. Summing over all possible choices of $i, j, k, l$, we obtain $\frac{q_m^4}{24 \cdot 2^{3n}}$.
• Case (b): Otherwise, without loss of generality, we assume that l > i. In that case, the probability that ν l can be set to T i is the probability that T i is a valid nonce. Applying Lemma 6, the overall probability is 2 −9n/4 , and if we sum over all possible choices of i, j, k, we obtain the bound q 3 m 6·2 9n/4 ≤ qm 2 3n/4 . For the other case, i.e., T i = T j = T k , ν k = T l , we have the following two subcases: • Case (a): If l < k, then due to Lemma 6, the probability of ν k = T l is bounded by 2 −n/4 , and thus the overall probability becomes 2 −9n/4 . The number of choices for each of i, j, l is q m , and the number of choices for k = 1. Summing over all possible choices of i, j, k, l, we obtain • Case (b): If l > k, then the probability of the event ν k = T l is 2 −n , and therefore, the overall probability becomes 2 −3n . Summing over all possible choices of i, j, k, l, we have • Case (a): If l < k < i, j, then the probability of ν k = T i is bounded by 2 −n , and thus the overall probability becomes 2 −3n . Summing over all possible choices of i, j, k, l, we obtain the bound q 4 m 2 3n .
• Case (b): If l < k, k > i or k > j, then without loss of generality, we assume that k > i, and in that case the probability that ν k is set to T i is the probability that T i is a valid nonce, and applying Lemma 6, the overall probability becomes 2 −9n/4 . Summing over all possible choices of j, k, l, we have q 3 m 2 9n/4 ≤ qm 2 3n/4 .
• Case (c): If l > k and k < i, j, then the probability of ν k = T i is bounded by 2 −n , ν l is set to T k is the probability that T k is a valid nonce, and thus due to Lemma 6, the overall probability is 2 −9n/4 . Summing over all possible choices of i, j, k, l, we obtain the bound q 3 m 2 9n/4 .
• Case (d): If l > k, k > i or k > j, then without loss of generality, we assume that k > i, and in that case the probability that ν k is set to T i is the probability that T i is a valid nonce, and applying Lemma 6, the overall probability becomes 2 −3n/2 . Summing over all possible choices of j, k, l, we have Bounding Case-IV. ∃i, j, k, l ∈ [q m ] such that ν i = T j , ν j = T k , ν k = T l . We bound this event using different subcases.
• Case (a): If $i < j < k < l$, then by Lemma 6 the probability is bounded by $2^{-3n}$, and varying over all possible choices of $i, j, k, l$, we obtain the bound $\frac{q_m^4}{2^{3n}}$.
• Case (b): If $i < j < k$ and $k > l$, then by Lemma 6 the probability is bounded by $2^{-9n/4}$, but there is exactly one choice of $k$ and $q_m$ choices for each of $i, j, l$. Hence, by summing over all possible choices of $i, j, k, l$, we have $\frac{q_m^3}{2^{9n/4}} \le \frac{q_m}{2^{3n/4}}$.
• Case (c): If $i < j$ and $j > k > l$, then by Lemma 6 the probability is bounded by $2^{-3n/2}$, but there is exactly one choice of $k, l$ and $q_m$ choices for each of $i, j$. Hence, by summing over all possible choices of $i, j, k, l$, we obtain the bound $\frac{q_m^2}{2^{3n/2}} \le \frac{q_m}{2^{3n/4}}$.
• Case (d): If $i > j > k > l$, then by Lemma 6 the probability of the event is bounded by $2^{-3n/4}$, and there is exactly one choice of $i, j, k$, leaving $q_m$ choices for $l$, which gives the bound $\frac{q_m}{2^{3n/4}}$.
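The simplifications of the form $q_m^a / 2^{bn} \le q_m / 2^{3n/4}$ used in these cases can be checked by comparing exponents. The sketch below is a small arithmetic aid under the assumption $q_m = 2^{cn}$ with $c \le 3/4$; constant factors are ignored and the helper names are hypothetical.

```python
from fractions import Fraction


def log2_term(a, b, c):
    """Return log2(q^a / 2^(b*n)) divided by n, when q = 2^(c*n)."""
    return a * c - b


def is_dominated(a, b, c=Fraction(3, 4)):
    """Check q^a / 2^(b*n) <= q / 2^(3n/4) for q = 2^(c*n)."""
    return log2_term(a, b, c) <= log2_term(1, Fraction(3, 4), c)


terms = {
    "q^4 / 2^(3n)": (4, Fraction(3)),
    "q^3 / 2^(9n/4)": (3, Fraction(9, 4)),
    "q^2 / 2^(3n/2)": (2, Fraction(3, 2)),
}
for name, (a, b) in terms.items():
    print(name, is_dominated(a, b))
```

All three comparisons are tight at $c = 3/4$, which is why the analysis caps the number of MAC queries at $2^{3n/4}$; for any larger $c$ the first comparison already fails.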
Bounding Case-V. ∃i, j, k, l ∈ [q m ] such that T i = T j , ν j = T k , ν k = T l . For a fixed set of i, j T i = T j is bounded by 2 −n . Now, we have the following three subcases for ν j = T k , ν k = T l : • Case (a): If l < k < j, then the probability of both ν j = T k , and ν k = T l are bounded by the probability that T k and T l are valid nonce respectively, both of which is equal to 2 −n/4 . Thus, the overall probability becomes 2 −3n/2 , and summing over all possible choices of i, j, k, l, we obtain • Case (c): If l, j > k or l, j < k, then with similar argument as above, the overall probability becomes 2 −9n/4 , and the number of choices for i, j, k, l is q 3 m , and hence we obtain By using the linearity of expectation, we have Bounding Parallel Edges. A parallel edge or a cycle of length 2 in G = implies that ∃i = j ∈ [q m ] such that ν i = T j , ν j = T i . For a fixed choice of i, j (w.l.o.g, assume i < j), the probability of ν i = T j , ν j = T i is bounded by 2 −5n/4 . This is because of Lemma 6, the probability of ν i = T j is bounded by 2 −n and the probability of ν j = T i is bounded by 2 −n/4 . As there exists only one choice of j and q m many choices of i, summing over all possible choices of i and j, we obtain qm 2 5n/4 bound. Bounding Triangle. A triangle in G = implies that ∃i = j = k ∈ [q m ] such that ν i = T j , ν j = T k , ν k = T i . If i < j < k, the probability of ν i = T j , ν j = T k , ν k = T i is bounded by 2 −9n/4 . Note that, due to Lemma 6, the probability of ν i = T j , and ν j = T k can be bounded by 2 −n each, and the probability of ν k = T i is bounded by 2 −n/4 . As there exists only one choice of k and q m many choices for i, j, summing over all possible choices of i and j, we obtain q 2 m 2 9n/4 bound. 
On the other hand, if i > j > k, the probability of ν i = T j , ν j = T k , ν k = T i is bounded by 2 −3n/2 , and the number of choices for each of i, j is 1, and the choice of k is q m , and hence we obtain a bound of qm 2 3n/2 . All the other ordering of i, j, k leads to similar analysis as done in the above two cases, and hence this bad case can be bounded by qm 2 3n/2 . Note that any other way of forming the triangle involving only the MAC queries immediately implies that the triangle either contains a self loop or contains a parallel edge. Since these two events (i.e., self loop and parallel edge) have already bounded, we have deliberately skipped the analysis for these cases.
Bounding Square. A square in $G^{=}$ implies that $\exists\, i \neq j \neq k \neq l \in [q_m]$ such that $\nu_i = T_j,\ \nu_j = T_k,\ \nu_k = T_l,\ \nu_l = T_i$. If $i < j < k < l$, the probability of this event is bounded by $2^{-13n/4}$: by Lemma 6, the probabilities of $\nu_i = T_j$, $\nu_j = T_k$ and $\nu_k = T_l$ are bounded by $2^{-n}$ each, and the probability of $\nu_l = T_i$ is bounded by $2^{-n/4}$. As there is only one choice of $l$ and $q_m$ choices for each of $i, j, k$, summing over all possible choices of $i, j$ and $k$, we obtain the bound $\frac{q_m^3}{2^{13n/4}}$. On the other hand, if $i > j > k > l$, the probability of the event is bounded by $2^{-7n/4}$. Moreover, there is one choice for each of $i, j, k$, and $q_m$ choices for $l$. Hence, we obtain a bound of $\frac{q_m}{2^{7n/4}}$. All other orderings of $i, j, k, l$ lead to an analysis similar to the above two cases, and hence this bad case can be bounded by $\frac{q_m}{2^{7n/4}}$. Note that any other way of forming the square involving only the MAC queries implies that the square contains a self loop, a parallel edge, or a triangle. Since these events have already been bounded, we skip these cases. Therefore, from the above four cases, we obtain the maximum bound to be $\frac{q_m}{2^n}$, and thus we write
Bounding $\mathrm{B.5} \mid \overline{\mathrm{B.1}} \wedge \overline{\mathrm{B.2}} \wedge \overline{\mathrm{B.4}}$. Recall that event B.5 holds if there exists a valid cycle containing a verification (non-equation) edge, which implies that the sum of the labels of the cycle is zero. However, as we have conditioned on $\overline{\mathrm{B.1}} \wedge \overline{\mathrm{B.2}} \wedge \overline{\mathrm{B.4}}$, it is enough to bound the existence of a valid cycle of length one (self loop), two (parallel edges), three (triangle) and four (square) involving a verification query.
Bounding Self-Loop. A self loop or cycle of length 1 in G = implies that ∃a ∈ [q v ] such that ν a = T a and H k h (M a ) = ν a . Note that, for a fixed choice of a, the above event holds with probability at most reg , as we have assumed the hash function to be reg regular. Summing over all choices of a, we obtain q v reg bound.
Node with concentric circle denotes the verification query node.
Bounding Parallel Edges. A parallel edge or cycle of length 2 in G = that involves a verification query implies either of the following two conditions: ∃i ∈ [q m ], a ∈ [q v ] : • Bounding Case-I. For a fixed choice of i and a, the probability of the event is bounded by axu . This is due to the randomness of hash key k h (note that M i = M a , as we have assumed a non-trivial distinguisher). As there exists only one choice of i for which the above probability is bounded by axu , summing over all possible choices of i and a, we obtain the bound q v axu .
• Bounding Case-II. Similarly, for a fixed choice of i, a, the probability of is bounded by axu due to the randomness of hash key k h , Note that, in this case there exists at most three choices of i (as we have at most three collision of T ) for which the above probability is bounded by axu . Summing over all possible choices of i and a, we obtain the bound 3q v axu .
Thus, from the above two cases, the probability of forming a parallel edge can be bounded by 3q v axu .
Bounding Triangle. For the case of a triangle or cycle of length 3 in G = that involves the valid cycle C = implies that ∃i, j ∈ [q m ], a ∈ [q v ] such that either of the following holds: Case-III : We bound each of these events as follows: • Bounding Case-I. For a fixed choice of i, j, a, the probability of the event is bounded by 3-reg using the randomness of the hash key as we have assumed the hash function is 3-reg -3-way regular (note that we have conditioned on B.1 and therefore T j = 0). Note that, the number of choice for j and i is restricted to one and three respectively. Hence, summing over all possible choices of indices, we obtain the bound to be 3q v 3-reg .
• Bounding Case-II. We analyze this case in two different subcases: = 0, and thus we can bound the event by 3-reg . In this case, choices of i and j is 3 an 1 respectively, and hence we obtain the bound to be 3q v 3-reg .
= 0, and in that case we again consider two different subcases: (i) if i < j, then T j = ν i holds with probability 2 −n and T i is to be valid (i.e. last n/4 bits of T i has to be zero), which holds with probability 2 −n/4 . Moreover, the number of choices of i, j in this case are q m and 1 (as ν j = ν i ⊕ T i ) resp. and thus we obtain the bound to be qm 2 5n/4 . (ii) If i > j, then T j = ν i holds with probability 2 −n/4 , and T i should be ν i ⊕ ν j which holds with probability 2 −n . In this case, the number of choices of j is q m and i in 1, resulting in probability of the event to be bounded by qm 2 5n/4 . • Bounding Case-III. For a fixed choice of i, j and a, the probability of the event is bounded by 3-reg as = T i holds with probability at most 3-reg (by the assumption that the hash function is 3-reg -3-way regular). Note that, in this case choice of j and i is one. Hence, summing over all possible choices of indices, we obtain the bound to be q v 3-reg .
Therefore, we see that for all the above cases, the maximum probability of forming a closed triangle is max{3q v 3-reg , qm 2 5n/4 }. Note that the other way of forming the valid cycle C = in G = that involves the verification query immediately implies the existence of a self loop or parallel edges.
Bounding Square. For the case of a square or a valid cycle C = of length 4 in G = implies that ∃i, j, k ∈ [q m ], a ∈ [q v ] such that either of the following holds: Case-III : ν a = T i , ν i = T j , ν j = T k , T a = ν k and We bound each of these events as follows.
• Bounding Case-I. Note that, = 0 as T i = T j implies ν j = ν k , which is not possible, and thus we can bound the event by q v 4-reg , as the choices of i, j and k are one each.
• Bounding Case-II. Here also we have, as T k = T a implies ν k = T k which essentially gives a self loop. Thus, we can bound the event by q v 4-reg , as the choices of i, j and k are one each.
• Bounding Case-III. We analyze this case in two different subcases, depending on whether $\nu_k$ equals $T_i \oplus T_j \oplus T_k$ or not. For a fixed choice of $i$, $j$ and $a$, if $\nu_k \neq T_i \oplus T_j \oplus T_k$, then […] $\neq 0$, and thus we can bound the event by $\epsilon_{4\text{-reg}}$. In this case, the number of choices of $i$, $j$ and $k$ is 1, 1 and 1 respectively, and hence we obtain the bound $q_v \epsilon_{4\text{-reg}}$.
On the other hand, if $\nu_k = T_i \oplus T_j \oplus T_k$, then $H_{k_h}(M_i) \oplus H_{k_h}(M_j) \oplus H_{k_h}(M_a) = 0$, and in that case we consider the following subcases:
- Case (a): If $k > i, j$, then $T_k = \nu_j$ holds with probability $2^{-n}$. If $i < j$, then $T_j = \nu_i$ holds with probability $2^{-n}$, and the number of choices of $i$, $j$, $k$ in this case is $q_m$, $q_m$ and 1 (as $\nu_k = T_i \oplus \nu_i \oplus \nu_j$) respectively, and thus we obtain the bound $q_m^2 / 2^{2n}$. On the other hand, if $i > j$, then $T_j = \nu_i$ holds with probability $2^{-n/4}$, and the number of choices for $i$ becomes 1, giving a bound of $q_m / 2^{5n/4}$.
- Case (b): If $j > i, k$, then $T_j = \nu_i$ holds with probability $2^{-n}$, and $T_k = \nu_j$ holds with probability $2^{-n/4}$. The number of choices for $i$, $j$, $k$ is $q_m$, 1 and 1, so the probability of the event is bounded by $q_m / 2^{5n/4}$.
- Case (c): If $i > j, k$, then if $j > k$, both $T_j = \nu_i$ and $T_k = \nu_j$ hold with probability $2^{-n/4}$, and $T_i = \nu_i \oplus \nu_j \oplus \nu_k$ holds with probability $2^{-n}$, hence bounding the overall probability by $q_m / 2^{3n/2}$. If $k > j$, then $T_k = \nu_j$ holds with probability $2^{-n}$, while the number of choices for $j$ can be at most $q_m$. Hence, the probability can be bounded by $q_m^2 / 2^{9n/4}$.

Therefore, over all the above cases, the maximum probability of forming a closed square is $\max\{3 q_v \epsilon_{4\text{-reg}},\ q_m / 2^{3n/4}\}$. Combining all of the above cases, the maximum probability of forming a valid cycle $C^{=}$ in $G^{=}$ is $\max\{3 q_v \epsilon_{4\text{-reg}},\ 3 q_v \epsilon_{3\text{-reg}},\ 3 q_v \epsilon_{\text{axu}},\ q_v \epsilon_{\text{reg}},\ q_m / 2^{3n/4}\}$.
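As a quick sanity check on the sub-case bounds obtained for Case-III of the square, the following sketch substitutes $q_m = 2^{3n/4}$ (the number of MAC queries targeted in this paper) and reports each bound as a power $2^{cn}$. The dictionary and function names are illustrative assumptions and not part of the proof; only the five bound expressions are taken from the analysis above.

```python
from fractions import Fraction as F

# Each sub-case bound has the form q_m^a / 2^(b*n); store it as the pair (a, b).
# The labels and this encoding are hypothetical, chosen only for illustration.
SQUARE_SUBCASE_BOUNDS = {
    "(a), i < j": (2, F(2)),      # q_m^2 / 2^(2n)
    "(a), i > j": (1, F(5, 4)),   # q_m   / 2^(5n/4)
    "(b)":        (1, F(5, 4)),   # q_m   / 2^(5n/4)
    "(c), j > k": (1, F(3, 2)),   # q_m   / 2^(3n/2)
    "(c), k > j": (2, F(9, 4)),   # q_m^2 / 2^(9n/4)
}


def exponent_over_n(power_of_qm, denom_exp, log2_qm_over_n):
    """Return c such that q_m^a / 2^(b*n) = 2^(c*n) when q_m = 2^(t*n)."""
    return power_of_qm * log2_qm_over_n - denom_exp


def evaluate(log2_qm_over_n=F(3, 4)):
    """Evaluate every sub-case bound at q_m = 2^(log2_qm_over_n * n)."""
    return {
        name: exponent_over_n(a, b, log2_qm_over_n)
        for name, (a, b) in SQUARE_SUBCASE_BOUNDS.items()
    }


if __name__ == "__main__":
    for name, c in evaluate().items():
        print(f"Case {name}: bound = 2^({c} * n)")
```

Under this substitution every exponent $c$ is negative ($-1/2$ or $-3/4$), so each of these terms remains exponentially small in $n$ at $q_m = 2^{3n/4}$.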
Bounding B.7: Recall that the event B.7 holds if $\exists\, i \neq j \in [q_m]$ such that $\nu_i \oplus H_{k_h}(M_i) = \nu_j \oplus H_{k_h}(M_j)$ and $\nu_i = T_j$. Now, we consider two subcases:
• (a) For fixed $i$ and $j$, if $i < j$ then $\nu_i = T_j$ holds with probability $2^{-n}$ (due to Lemma 6), and $\nu_i \oplus H_{k_h}(M_i) = \nu_j \oplus H_{k_h}(M_j)$ holds with probability $\epsilon_{\text{axu}}$. Summing over all possible choices of $i$ and $j$, we obtain the bound $q_m^2 \epsilon_{\text{axu}} / 2^{n}$.
• (b) When $i > j$, $\nu_i = T_j$ holds with probability $2^{-n/4}$ (due to Lemma 6), and as before $\nu_i \oplus H_{k_h}(M_i) = \nu_j \oplus H_{k_h}(M_j)$ holds with probability $\epsilon_{\text{axu}}$. In this case, the number of choices of $i$ and $j$ is 1 and $q_m$ respectively, and therefore, summing over all possible choices of indices, we obtain the bound $q_m \epsilon_{\text{axu}} / 2^{n/4}$.

Therefore, from each of the above cases we have […]

Bounding Case-I. $\exists\, i, j, k \in [q_m]$ such that $T_i = T_j$, $\nu_j = T_k$. For a fixed pair $i, j \in [q_m]$, the probability of the event $T_i = T_j$ can be bounded by $2^{-n}$, as each $T_i$ is sampled uniformly at random from $\{0,1\}^n$. Now, we first consider the case $\nu_i \oplus \nu_j \oplus \nu_k \neq 0$. In this case, we can bound the probability of the event by the $\epsilon_{3\text{-reg}}$ property of the hash function. Now we have the following subcases:
• Case (a): If $k < j$, the probability of the event $\nu_j = T_k$ is bounded by $2^{-n/4}$. Moreover, the total number of choices for $i$, $j$, $k$ is at most $q_m^2$. Therefore, summing over all possible choices of $i$, $j$ and $k$, we obtain the bound $\max\{q_m^2$ […]

[…]
1. the induced graph from the transcript contains a cycle, or
2. any of the components of the induced graph contains a path of length 3 or more.
For example, consider the following system of equations

$$E_m = \begin{cases} \text{(i)} & \pi(\nu_1) \oplus \pi(T_1) = \lambda_1 \\ \text{(ii)} & \pi(\nu_2) \oplus \pi(T_2) = \lambda_2 \\ \text{(iii)} & \pi(\nu_3) \oplus \pi(T_3) = \lambda_3 \end{cases}$$

with the equalities $\nu_1 = T_2$ and $T_1 = T_3$. Now, if we represent the above system of equations in the form of a graph as defined in [DDNY18], then it results in graph (a) as depicted in Fig. 8, where the path length is 2. As the number of equations is 3, the graph in (a) contains three nodes. An edge joins node (i) with node (ii), as equations (i) and (ii) have a common variable $\pi(\nu_1) = \pi(T_2)$. Similarly, node (i) is joined by an edge with node (iii), because equations (i) and (iii) have a common variable $\pi(T_1) = \pi(T_3)$. However, if we represent $E_m$ in our graph-theoretic setup, then it results in graph (b) as depicted in Fig. 8, where the path length is three.

In order to improve the security bound of DWCDM with a $3n/4$-bit nonce, we allow the transcripts whose induced graph contains a component having path length of at most 3. In other words, we reject all the transcripts that result in a graph with components of path length four or more. As before, we also avoid the presence of any cycles in the components. We also reject the transcript if any verification query forms a cycle with some MAC queries of the transcript. These restrictions immediately require one additional assumption on the hash function: the 3-way regular property alone is not sufficient, and the hash function should be 4-way regular as well. Now, it is natural to wonder whether one can get $kn/(k+1)$-bit security with a $kn/(k+1)$-bit nonce by allowing the components to have a path of length at most $k$ and rejecting the transcripts that induce components of path length $k+1$ or more. In fact, this result was stated as a conjecture in [DDNY18].
However, the bottleneck in proving this result is the good transcript analysis, which relies on the availability of a verifiable proof of the "Theorem $P_i \oplus P_j$ for any $\xi_{\max}$" result.
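The difference between the two graph representations of the example system $E_m$ (with $\nu_1 = T_2$ and $T_1 = T_3$) can be checked mechanically. The sketch below is only an illustration under stated assumptions: it takes the [DDNY18]-style graph to have one node per equation, with an edge whenever two equations share a variable (as described in the text), and it assumes that the other representation has one node per distinct variable and one edge per equation. All identifiers are hypothetical.

```python
from itertools import combinations

# Example system E_m: equation label -> (nonce variable, tag variable).
EQUATIONS = {"i": ("nu1", "T1"), "ii": ("nu2", "T2"), "iii": ("nu3", "T3")}

# Collisions assumed in the example: nu1 = T2 and T1 = T3.
MERGED = {"T2": "nu1", "T3": "T1"}


def canon(var):
    """Map a variable to the representative of its equality class."""
    return MERGED.get(var, var)


def equation_graph():
    """Nodes are equations; an edge joins two equations sharing a variable."""
    adj = {e: set() for e in EQUATIONS}
    for e, f in combinations(EQUATIONS, 2):
        vars_e = {canon(v) for v in EQUATIONS[e]}
        vars_f = {canon(v) for v in EQUATIONS[f]}
        if vars_e & vars_f:
            adj[e].add(f)
            adj[f].add(e)
    return adj


def variable_graph():
    """Nodes are distinct variables; each equation contributes one edge."""
    adj = {}
    for nu, tag in EQUATIONS.values():
        a, b = canon(nu), canon(tag)
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def longest_path(adj):
    """Number of edges on the longest simple path (brute-force DFS)."""
    def dfs(node, seen):
        return max(
            (1 + dfs(nxt, seen | {nxt}) for nxt in adj[node] if nxt not in seen),
            default=0,
        )
    return max(dfs(v, {v}) for v in adj)


if __name__ == "__main__":
    print("equation-as-node graph, longest path:", longest_path(equation_graph()))
    print("variable-as-node graph, longest path:", longest_path(variable_graph()))
```

With these assumptions the equation-as-node graph has a longest path of 2 edges, while the variable-as-node graph has a longest path of 3 edges, matching the path lengths quoted for graphs (a) and (b). The brute-force search is exponential in general and is meant only for small examples like this one.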