Cryptanalysis of Rocca and Feasibility of Its Security Claim

Abstract. Rocca is an authenticated encryption with associated data scheme for beyond 5G/6G systems. It was proposed at FSE 2022/ToSC 2021(2), and the designers make a security claim of achieving 256-bit security against key-recovery and distinguishing attacks, and 128-bit security against forgery attacks (the security claim regarding distinguishing attacks was subsequently weakened in the full version in ePrint 2022/116). A notable aspect of the claim is the gap between the privacy and authenticity security. In particular, the security claim regarding key-recovery attacks allows an attacker to obtain multiple forgeries through the decryption oracle. In this paper, we first present a full key-recovery attack on Rocca. The data complexity of our attack is $2^{128}$ and the time complexity is about $2^{128}$, where the attack makes use of the encryption and decryption oracles, and the success probability is almost 1. The attack recovers the entire 256-bit key in a single-key and nonce-respecting setting, breaking the 256-bit security claim against key-recovery attacks. We then extend the attack to various security models and discuss several countermeasures to see the feasibility of the security claim. Finally, we consider a theoretical question of whether achieving the security claim of Rocca is possible in the provable security paradigm. We present both negative and positive results to the question.


Introduction
Background. An authenticated encryption with associated data (AEAD) scheme is a symmetric-key primitive that authenticates associated data (AD) and both authenticates and encrypts messages. Various AEAD schemes have been proposed, and we focus on one of them called Rocca [SLN+21, SLN+22] that was proposed at FSE 2022/ToSC 2021(2). Rocca is an AES-based design that follows the design approach of AEGIS [WP13], Tiaoxin-346 [Nik14], and that of Jean and Nikolić [JN16]. In these designs, a round function is designed based on one AES round (aesenc) and a 128-bit XOR operation to fully take advantage of the AES-NI and SIMD (single instruction, multiple data) instructions.
Rocca was designed to meet the performance and security requirements in beyond 5G/6G systems. Concretely, it was designed to achieve 100 Gbps encryption/decryption speed, 256-bit security against key-recovery attacks, and 128-bit security against forgery attacks. Indeed, [SLN+21] reports that Rocca achieves the speed of 138.22 Gbps on an Intel Ice-lake CPU, and the result shows that Rocca outperforms AEGIS and Tiaoxin-346, and other relevant schemes AES-256-GCM [MV04], ChaCha20-Poly1305 [NL05], and SNOW-V-GCM [EJMY19].
As for the security, the designers make the following claim in [SLN+21]:
Claim 1 ([SLN+21]). Rocca provides 256-bit security against key-recovery and distinguishing attacks and 128-bit security against forgery attacks in the nonce-respecting setting. We do not claim its security in the related-key and known-key settings.
We note that Rocca is a nonce-based AEAD scheme, meaning that its security relies on the uniqueness of the nonce, and we also note that the tag length of Rocca is 128 bits.
A notable aspect of Claim 1 is the gap between the privacy security and the authenticity security. In particular, the security claim regarding key-recovery attacks allows an attacker to obtain multiple forgeries through the decryption oracle. These observations raise a natural question of the feasibility and infeasibility of achieving the claim, as we detail below. We remark that the security claim was weakened in [SLN+22] after the publication of [SLN+21]. Specifically, in [SLN+22], the indistinguishability security claim is weakened to 128-bit security, and various limitations on the input lengths are added. To quote:
Claim 2 ([SLN+22]). Rocca provides 256-bit security against key-recovery and 128-bit security against distinguishing and forgery attacks in the nonce-respecting setting. We do not claim its security in the related-key and known-key settings.
The message length for a fixed key is limited to at most $2^{128}$ and we also limit the number of different messages that are produced for a fixed key to be at most $2^{128}$. The length of associated data of a fixed key is up to $2^{64}$.
In this paper, we present our security analysis of Rocca. We also present a theoretical analysis of the security claim focusing on Claim 1.

Key-Recovery Attacks on Rocca.
We first present a key-recovery attack on Rocca. Our attack makes one encryption query and $2^{128}$ decryption queries, and recovers the entire 256-bit key with probability almost 1, where the time complexity is about $2^{128}$. This attack is in a single-key setting and follows the nonce-respecting scenario, breaking the 256-bit security claim against key-recovery attacks made in both Claims 1 and 2.
As mentioned in the background, Claims 1 and 2 both claim 256-bit security against key-recovery attacks. However, the tag length is only 128 bits, so these claims cannot exclude key-recovery attacks that exploit multiple forgeries through a decryption oracle. This can be interpreted as follows: at the cost of $2^{128}$ decryption queries, an attacker is in the releasing unverified plaintext (RUP) setting [ABL+14], in which the attacker has an oracle that returns the message for any ciphertext without verifying its correctness, and is free to repeat nonces.
In [SLN+21], the designers already observed the feasibility of a state-recovery attack under a nonce-misuse setting, but the detailed attack procedure is not presented. In particular, it is interesting to see how many nonce-repeated plaintext-ciphertext pairs are required for the state-recovery attack. We show a detailed attack procedure exploiting the property of the AES S-box and recovering the entire 1024-bit state from only one nonce-repeated input-output pair. Moreover, we show a meet-in-the-middle technique that reduces the attack time complexity to a practical level. As a result, the time complexity is about $2^{20}$, and the success probability is sufficiently high. Note that in Rocca, the state-recovery attack immediately leads to the key-recovery attack. Since the attack complexity is practical (under the nonce-misuse/RUP setting), we implemented our key-recovery attack and verified the correctness.
Our attack is a chosen-ciphertext attack (CCA), and it is one of the critical attack scenarios for AEAD schemes as discussed, e.g., in [Mèg19,Kha22]. If the secret key of the AEAD is efficiently recovered under the RUP scenario as in Rocca, it is unlikely to enhance the security level beyond the tag length. Such an AEAD scheme is extremely vulnerable when the tag is truncated. For example, our attack implies that if the tag length of Rocca is reduced to 32 bits, it only ensures 32-bit security against all kinds of attacks. It is instructive to consider the impact of CCAs to understand the risk of truncating the tag.
We next discuss extensions of the above attack to various other security models. In the above attack, the attacker has the encryption and decryption oracles. We point out that limiting the number of decryption queries still gives a key-recovery attack that is faster than the exhaustive key search. We also consider the case where the attacker has the decryption oracle only, and the case where the nonce-respecting condition is applied to the decryption oracle as well. The latter case is highly impractical, while a key-recovery attack is still possible for both cases.
Then, we consider several countermeasures. These include increasing the number of rounds in the initialization and/or finalization, increasing the nonce length, and increasing the tag length, and we conclude that none of them works. The most promising idea (with negligible impact on the cost) to reduce the impact of our attack is to use the secret key after the initialization and before the finalization as is done, e.g., in ASCON [DEMS21].
Theoretical Analysis of the Security Claim. We next turn our attention to a theoretical analysis of Claim 1 of Rocca. Specifically, our attention is on the gap in the bit security between the distinguishing and forgery attacks. We present theoretical analyses that are valid not only for Rocca but also for any AEAD with different bit security for distinguishing and forgery attacks.
We observe that our key-recovery attack above also invalidates the 256-bit security claim against distinguishing attacks in Claim 1, depending on the interpretation of the security model. As mentioned, Rocca is a nonce-based AEAD scheme and its security relies on the uniqueness of the nonce. However, since the tag length of Rocca is 128 bits, it is always possible to obtain a nonce-repeated input-output pair after $2^{128}$ decryption queries, and whether the attacker has the decryption oracle (in a distinguishing attack) depends on the security model considered. It is often the case that the security against distinguishing attacks is modelled as the indistinguishability against chosen-plaintext attacks (IND-CPA), and the security against forgery attacks is modelled as the integrity of ciphertexts (INT-CTXT), as these two notions imply the indistinguishability against CCAs (IND-CCA) and also imply the unified AEAD security notion, which are regarded as the right security notions for AEAD schemes to achieve [BN08,NRS14,RS06,NRS13]. Indeed, a significant number of AEAD schemes with a proof of security aim at proving the IND-CPA security and INT-CTXT security. See, e.g., [Rog04a,KR11,IOM12]. Now the entire security as AEAD in the IND-CCA notion or in the unified AEAD notion is given as the lower bound of the two notions, i.e., if an AEAD scheme has $k_1$-bit IND-CPA security and $k_2$-bit INT-CTXT security, then it ensures $\min\{k_1, k_2\}$-bit IND-CCA security and $\min\{k_1, k_2\}$-bit AEAD security (in the unified notion). See [Mèg19,Kha22] for a related discussion. In many cases, the security levels suggested by the IND-CPA bound and the INT-CTXT bound are comparable, and this approach has no visible impact on the security of the AEAD scheme as a whole. However, this is not the case in Claim 1.
Claim 1 can be interpreted in at least two ways. One is to achieve 256-bit IND-CPA security and 128-bit INT-CTXT security. This is motivated by the approach taken by many AEAD schemes, where the difference is that Rocca has a gap between the two, and if we follow the discussion above, it ensures only 128-bit IND-CCA security. Having a stronger IND-CPA security claim does not cause an issue if real-world attackers are isolated from a decryption oracle, in which case users can benefit from the strong IND-CPA security bound. However, in this case, it is not clear whether using an AEAD scheme is needed in the first place, as the functionality of authenticity is redundant if the adversary does not have the decryption oracle. In contrast to this, in many use cases of AEAD schemes, real-world attackers do have access to the decryption oracle and users expect that the security is maintained in this situation, which is one of the primary reasons to use an AEAD scheme.
In order to capture this, another way to interpret Claim 1 is to achieve 256-bit IND-CCA security and 128-bit INT-CTXT security, and the question we ask is the feasibility and infeasibility of achieving this type of security, together with the 256-bit security against key-recovery attacks, in the provable security paradigm, where we consider schemes with 128-bit tags as in Rocca.
Feasibility and Infeasibility of the Security Claim. In order to give the answer to the question, we start by pointing out that achieving 256-bit IND-CPA security and 128-bit INT-CTXT security is possible with known approaches. Concretely, we point out that a variant of GCM [MV04], OCB [KR11], OPP [GJMN16], and duplex sponge [BDPA11] can achieve the security. However, as stated above, this does not necessarily imply that 256-bit privacy is guaranteed in an environment where attackers have a decryption oracle.
We next show that a class of AEAD schemes called an online AEAD scheme [BBKN12] cannot achieve 256-bit IND-CCA security, by presenting a distinguishing attack with $2^{128}$ query complexity. In an online AEAD scheme, the i-th output block depends only on the first i blocks of input, and this class includes all the schemes stated above and Rocca. We remark that an online AEAD scheme here is different from those studied in [FFL12,HRRV15] in that the goal is to have the best possible security under the nonce-repeating scenario.
This result rules out efficient solutions, and we present our feasibility result to answer the question above with an offline construction. We show that the Encode-then-Encipher approach [BR00] with an appropriate assumption on the interface can simultaneously achieve 256-bit IND-CCA security and 128-bit INT-CTXT security. The efficiency is limited, while this result does show that achieving Claim 1 in a provable security paradigm is feasible. We remark that works on AEAD schemes with variable stretches [RVV16,GRV21] consider a problem of varying the tag length during the lifetime of the key, which is a different problem from the focus of this paper.
Organization. In Sect. 2, we review the specification of Rocca. In Sect. 3, we present our key-recovery attack and discuss extensions of the attack to several security models. In Sect. 4, we consider countermeasures to mitigate the impact of our attack. In Sect. 5, we present our theoretical treatment of achieving Claim 1. We conclude the paper in Sect. 6.

Specification of Rocca
Rocca is an authenticated encryption with associated data (AEAD) scheme. Rocca has a 128 × 8 = 1024-bit internal state, and the state is updated by the round update function $R$ while absorbing associated data and a message, and squeezing output to encrypt the message.

Notation
We use the following notations in the paper.
• $Z_0$: A 128-bit constant block defined as $Z_0$ = 428a2f98d728ae227137449123ef65cd (in hex).
• $A(X)$: The AES round function without AddRoundKey, defined as $A(X) = \mathrm{MixColumns} \circ \mathrm{ShiftRows} \circ \mathrm{SubBytes}(X)$, where MixColumns, ShiftRows and SubBytes are the same operations as defined in AES.
• $R(S_t, X_0, X_1)$: The round function used to update the state $S_t$.
For a bit string $X$, $|X|$ denotes the length of $X$ in bits. We write $0^l$ for a zero string of length $l$ bits. For bit strings $X$ and $Y$, $X \| Y$ denotes their concatenation.
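As an illustration of the notation, the following Python sketch computes A(X) as SubBytes, then ShiftRows, then MixColumns, with no round-key addition. The helper names (`gf_mul`, `build_sbox`, `aes_round_no_key`) and the column-major byte order of the 16-byte block are assumptions of this sketch rather than part of the Rocca specification, and the code is meant for checking small examples, not for performance.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox() -> list:
    """Build the AES S-box from field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
                inv = gf_mul(inv, x)
        y = inv
        for shift in (1, 2, 3, 4):
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(y ^ 0x63)
    return sbox


SBOX = build_sbox()


def aes_round_no_key(block: bytes) -> bytes:
    """A(X) = MixColumns(ShiftRows(SubBytes(X))) on a 16-byte block.

    Byte i sits in row i % 4 and column i // 4 (column-major order).
    """
    assert len(block) == 16
    s = [SBOX[b] for b in block]                                # SubBytes
    s = [s[(i % 4) + 4 * (((i // 4) + (i % 4)) % 4)]            # ShiftRows
         for i in range(16)]
    out = []
    for c in range(4):                                          # MixColumns
        col = s[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gf_mul(2, col[r]) ^ gf_mul(3, col[(r + 1) % 4])
                       ^ col[(r + 2) % 4] ^ col[(r + 3) % 4])
    return bytes(out)


print(aes_round_no_key(bytes(16)).hex())
```

For the all-zero block every byte becomes S(0) = 0x63, and a column of four equal bytes is a fixed point of MixColumns, so the printed value is 0x63 repeated 16 times.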

The Round Update Function
The input of the round function $R(S_t, X_0, X_1)$ of Rocca consists of the state $S_t$ and two blocks $(X_0, X_1)$. The output $S_{t+1} \leftarrow R(S_t, X_0, X_1)$ is computed as follows:

The Mode of Operation of Rocca
The processing of Rocca consists of four phases: initialization, processing associated data, encryption, and finalization. Rocca accepts a 256-bit key $K_0 \| K_1 \in \mathbb{F}_2^{128} \times \mathbb{F}_2^{128}$, a 128-bit nonce $N$, associated data $AD$, and a message $M$ as input. The output is the corresponding ciphertext $C$ and a 128-bit tag $T$. For a string $X$ of any bit length, define $\overline{X} = X \| 0^l$, where $l$ is the minimal non-negative integer such that $|\overline{X}|$ is a multiple of 256. In addition, for a string $X$ with $|X|$ being a multiple of 256, we write $X$ as $X = X_0 \| X_1 \| \cdots \| X_{|X|/256-1}$ with $|X_i| = 256$, and each $X_i$ as $X_i = X_i^0 \| X_i^1$ with $|X_i^0| = |X_i^1| = 128$.

Initialization.
A 128-bit nonce $N$ and a 256-bit key $K_0 \| K_1$ are loaded into the state $S$ in the following way: Then, 20 iterations of the round update function $R(S, Z_0, Z_1)$ are applied to the state.
Processing the Associated Data. Associated data $AD$ is padded to $\overline{AD}$ and the state is updated as follows: until $t = |\overline{AD}|/256$. Note that this phase is skipped if $AD$ is empty.
Processing Message. On encryption, we process a message as follows: A message $M$ is first padded to $\overline{M}$. Then, $\overline{M}$ is absorbed with the round function, and the corresponding ciphertext $C$ is generated. If the length of the last block of $M$ is $b$ bits for some $0 < b < 256$, the last block of $C$ is truncated to the first $b$ bits. A detailed procedure is shown as follows: Note that this phase is skipped if $M$ is empty.
Processing Ciphertext. On decryption, we process a ciphertext as follows: A ciphertext $C$ is first padded to $\overline{C}$. Then, $\overline{C}$ is absorbed with the round function, and the corresponding message $M$ is generated. If the length of the last block of $C$ is $b$ bits for some $0 < b < 256$, the last block of $M$ is truncated to the first $b$ bits. A detailed procedure is shown as follows: Note that this phase is skipped if $C$ is empty.

Finalization.
After processing the message/ciphertext, the state $S$ passes through 20 iterations of the round function $R(S, |AD|, |M|)$ and then the tag is computed in the following way: On decryption, the computed tag is compared with the tag given as input. Figure 2 shows the encryption procedure of Rocca.

Attack Concept
We first review the security claims of Rocca from Claims 1 and 2. Rocca outputs a 128-bit tag and claims 128-bit security against forgery attacks and 256-bit security against key-recovery attacks. Rocca is a nonce-based AEAD scheme. Namely, attackers cannot ask for different messages with the same nonce to the encryption oracle.
These security claims are sound at first glance. However, when a decryption oracle is available to an attacker, the decryption oracle returns a valid plaintext-ciphertext pair when the 128-bit tag is consistent. Therefore, "nonce-repeated" pairs can be collected even if the attacker follows the nonce-respecting scenario with respect to the encryption oracle. This observation leads to the following attack procedure.
1. The attacker makes an encryption query $(N, M)$ to obtain $(C, T)$, where $N$ is any nonce and $M$ is any message whose length is properly chosen (specifically, at least eight blocks for our key-recovery attacks).
2. The attacker injects a proper difference $\Delta$ to the ciphertext and makes a decryption query $(N, C \oplus \Delta, T')$ while trying out $q_d$ possible values of $T'$. The procedure returns a "nonce-repeated" plaintext $(N, M')$ with a probability of $q_d/2^{128}$.
3. The attacker recovers the internal state by exploiting the collected "nonce-repeated" pair. The attacker recovers the secret key from the recovered internal state by applying the inverse of the round update function.
The procedure above shows that attackers can collect nonce-repeated pairs even in the nonce-respecting scenario. In [SLN+21], the designers noted that the state-recovery attack seems trivial if the nonce is misused. That is, when nonce-repeated pairs are given, the state-recovery attack would be possible. It remains to determine the number of "nonce-repeated" pairs needed to recover the state. In the concrete attack against Rocca (shown in the following subsection), we show the attack procedure to recover the internal state by using only one "nonce-repeated" pair with a practical time complexity. We combine the state-recovery attack using a "nonce-repeated" pair with the attack exploiting the decryption oracle. With a sufficiently large number of decryption queries, $q_d$, we obtain a key-recovery attack with data complexity $q_d$ that recovers the 256-bit secret key with a probability of $q_d/2^{128}$ under the nonce-respecting scenario. When $q_d = 2^{128}$, it recovers the 256-bit secret key with a probability of 1 and a time complexity of about $2^{128}$.
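The first two steps of the procedure can be sketched on a toy scheme. The code below is not Rocca: it assumes a hypothetical stream cipher built from SHA-256 with an 8-bit HMAC-based tag, so that exhausting the tag space takes at most 256 decryption queries instead of $2^{128}$. All function names are illustrative. It shows only how a verifying decryption oracle ends up releasing a plaintext for a modified ciphertext under an already used nonce.

```python
import hashlib
import hmac


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode (assumption, not Rocca)."""
    blocks = (hashlib.sha256(key + nonce + i.to_bytes(4, "big")).digest()
              for i in range((length + 31) // 32))
    return b"".join(blocks)[:length]


def toy_tag(key: bytes, nonce: bytes, ciphertext: bytes) -> int:
    """Toy 8-bit tag: first byte of HMAC-SHA256 over nonce and ciphertext."""
    return hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()[0]


def encrypt(key: bytes, nonce: bytes, message: bytes):
    ciphertext = xor_bytes(message, keystream(key, nonce, len(message)))
    return ciphertext, toy_tag(key, nonce, ciphertext)


def decrypt(key: bytes, nonce: bytes, ciphertext: bytes, tag: int):
    """Return the plaintext only if the tag verifies, otherwise None."""
    if tag != toy_tag(key, nonce, ciphertext):
        return None
    return xor_bytes(ciphertext, keystream(key, nonce, len(ciphertext)))


def collect_nonce_repeated_pair(dec_oracle, nonce, ciphertext, delta):
    """Step 2: query C xor delta under the same nonce with every tag value."""
    forged = xor_bytes(ciphertext, delta)
    for queries, guess in enumerate(range(256), start=1):
        plaintext = dec_oracle(nonce, forged, guess)
        if plaintext is not None:
            return plaintext, queries
    raise AssertionError("one of the 256 tag values must verify")


key, nonce = b"k" * 32, b"n" * 16
message = b"sixteen byte msg" * 8
ciphertext, tag = encrypt(key, nonce, message)       # step 1: one encryption query
delta = bytes(16) + b"\x01" * 16 + bytes(len(message) - 32)
recovered, queries = collect_nonce_repeated_pair(
    lambda n, c, t: decrypt(key, n, c, t), nonce, ciphertext, delta)
print(queries, recovered == xor_bytes(message, delta))
```

In this toy the released plaintext is simply the original message XORed with the difference, because the keystream does not depend on the message. In Rocca the decrypted blocks feed back into the state, which is what step 3 exploits.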
Although each attack is not surprising, the combined attack breaks the claimed security. In practice, there is a demand to truncate the tag length on a real security system. Users may expect that the tag truncation does not affect the claimed security except for the security against forgery attacks. However, our attack implies that, in Rocca, the tag truncation directly degrades all the security claims, e.g., Rocca with a 32-bit tag ensures only 32-bit security.

Key-Recovery Attack Using a Nonce-Repeated Pair
As noted by the designers in [SLN+21], the state-recovery attack would be trivial when a nonce is misused. However, it is not obvious whether the secret key can be recovered when only one nonce-repeated pair is available. We tackle this problem and show that, surprisingly, only one nonce-repeated pair is enough to recover the whole 1024-bit internal state with a practical time complexity, i.e., about $2^{20}$. Note that the state recovery of Rocca directly leads to the secret key recovery.
We describe our attack under the releasing unverified plaintext (RUP) setting because of a straightforward conversion to the attack exploiting the decryption oracle in the nonce-respecting scenario. Note that the same attack works when the nonce is misused in the encryption oracle.

State-Recovery Attack
We first introduce a well-known property regarding input-output differences through the AES S-box, which is another view of Observation 1 in [BDD+12].

Lemma 1. Let x be an unknown random input of the AES S-box. Given the knowledge of a non-zero input
We split these pairs into subsets such that any pair in the same subset implies the same $S(x) \oplus S(x \oplus \Delta_{in}) = \Delta_{out}$. Then, independently of $\Delta_{in}$, we have 63 subsets: one subset contains four pairs, and 62 subsets contain two pairs. Thus, the observed output difference $\Delta_{out}$ implies two candidates of $x$ with a probability of 124/128 = 31/32 and four candidates with a probability of 1/32.
Given an input-output difference of the AES S-box, we can reduce the number of candidates of corresponding input-output values to two in many cases. With a probability of 1/32, the number of candidates is four. They correspond to every entry of the differential distribution table of the AES S-box.
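The candidate reduction can be tried out with a short Python sketch. It assumes only the standard AES S-box, built here from field inversion and the affine map; the helper name `candidates` and the example values are illustrative. Given a non-zero input difference and the observed output difference, it enumerates every input consistent with the pair.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_sbox() -> list:
    """Build the AES S-box from field inversion followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):
                inv = gf_mul(inv, x)
        y = inv
        for shift in (1, 2, 3, 4):
            y ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(y ^ 0x63)
    return sbox


SBOX = build_sbox()


def candidates(delta_in: int, delta_out: int) -> list:
    """All x with S(x) xor S(x xor delta_in) == delta_out."""
    if delta_in == 0:
        raise ValueError("a zero input difference gives no information")
    return [x for x in range(256)
            if SBOX[x] ^ SBOX[x ^ delta_in] == delta_out]


secret_x, delta_in = 0x3A, 0x01          # secret_x is unknown to the attacker
delta_out = SBOX[secret_x] ^ SBOX[secret_x ^ delta_in]
cands = candidates(delta_in, delta_out)
print([hex(x) for x in cands])
```

The true input and its partner `secret_x ^ delta_in` always appear in the list, so the attacker is left with a small candidate set per byte.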
We next show a concrete attack procedure to recover the internal state. Figure 3 shows the differential transition on the decryption process when non-zero byte differences are injected to 16 bytes of $C_0^1$, where SB, SR, and MC denote SubBytes, ShiftRows, and MixColumns, respectively. We call this a proper difference. The message needs to contain at least eight blocks to mount our state-recovery attack. White-colored bytes are constant, yellow-colored bytes have known differences, and gray-colored bytes have unknown differences. The green-colored number shows the order of the 128-bit state whose candidates are narrowed down following the step number. Moreover, for the sake of simplicity of the attack procedure, we assume the case that each yellow-colored byte has a non-zero difference. With a low probability, a few yellow-colored bytes do not have differences due to a collision. Then, a complicated procedure is required, and we discuss such a case later.
Step 1. We focus on the round function labeled with A. The input difference is equal to $\Delta C_0^1$. The output difference is equal to $\Delta M_1^1$. Due to Lemma 1, the number of candidates of the input, $S_1[4] \oplus S_1[0]$, is reduced to $2^{16+i_1}$ with a probability of $\binom{16}{i_1} (31/32)^{16-i_1} (1/32)^{i_1}$. These probabilities are about 0.6, 0.31, 0.075, and 0.011 for $i_1 = 0$, $i_1 = 1$, $i_1 = 2$, and $i_1 = 3$, respectively. Thus, the probability that the number of candidates is narrowed down to $2^{16}$ is the highest, and the total probability of $i_1 \le 3$ is higher than 99%.
Step 2. We focus on the round function labeled with B. The input difference is equal to $\Delta C_0^1$. The output difference is equal to $\Delta M_2^0$. Due to Lemma 1 and similarly to Step 1, the number of candidates of the input, $S_1[4]$, is reduced to $2^{16+i_2}$ with a high probability for small $i_2$. In Step 1, each byte of $S_1[4] \oplus S_1[0]$ has two candidates; let $a$ and $a \oplus \delta$ be the candidates. In Step 2, each byte of $S_1[4]$ also has two candidates; let $b$ and $b \oplus \delta$ be the candidates. Note that the common difference $\delta$ appears in Step 1 and Step 2. Thus, each byte of $S_1[0]$ has two candidates, $a \oplus b$ or $a \oplus b \oplus \delta$. Therefore, the number of candidates of $S_1[0]$ is reduced to $2^{16+i_1+i_2}$.
Step 3. We focus on the round function labeled with C. Similarly to Step 1, the number of candidates of the input, $S_2[4] \oplus S_2[0]$, is reduced to $2^{16+i_3}$ for small $i_3$, and $S_2[2]$ is computed.
Step 4. We focus on the round function labeled with D. Similarly to Step 2, the number of candidates of the input, $S_2[4]$, is reduced to $2^{16+i_4}$ for small $i_4$. The number of candidates of $S_2[0]$ is also reduced to $2^{16+i_3+i_4}$.
Step 6. We focus on the round function labeled with E. The number of candidates of the input, , is reduced to $2^{16+i_6}$ for small $i_6$, and $S_3[2]$ is computed.
Step 7. On each guess, we recover the following state blocks: Then, the number of candidates of the whole state of $S_1$ is $2^{16+i_6}$, and the unique one is easily recovered by observing recovered plaintext blocks or the tag. Once the whole state is recovered, we immediately recover the secret key.
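The way Steps 1 and 2 combine can be checked on a single byte. In the sketch below the concrete values of `a`, `b`, and `delta` are arbitrary illustrative choices; the point is only that two candidate sets sharing the same difference XOR to two distinct values rather than four.

```python
from itertools import product


def combine(cands_sum, cands_s4):
    """Candidates for a byte of S_1[0], from candidates for the matching
    bytes of S_1[4] xor S_1[0] (Step 1) and of S_1[4] (Step 2)."""
    return sorted({u ^ v for u, v in product(cands_sum, cands_s4)})


a, b, delta = 0x5C, 0xA7, 0x01
step1 = [a, a ^ delta]        # candidates for a byte of S_1[4] xor S_1[0]
step2 = [b, b ^ delta]        # candidates for the same byte of S_1[4]
combined = combine(step1, step2)
print([hex(v) for v in combined])

# With two candidates per byte over 16 bytes, 2**16 state candidates remain.
print(len(combined) ** 16 == 2 ** 16)
```

Because the two sets share the difference, the four XOR combinations collapse to `a ^ b` and `a ^ b ^ delta`, which is why the count for $S_1[0]$ stays at $2^{16}$ in the best case.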

Reducing Complexity Using Meet-in-the-Middle Technique
The attack described in Sect. 3.2.1 requires only one "nonce-repeated" pair and at least $2^{64}$ time complexity. Since the required complexity is significantly lower than $2^{256}$, the attack already breaks the claimed security when combined with the attack exploiting the decryption oracle. However, it is not a practical attack.
The dominant step is Step 5. Thanks to the meet-in-the-middle technique, we can reduce the attack complexity significantly, to a practical level. For the sake of simplicity, we describe the case where $i_1 = i_2 = i_3 = i_4 = 0$. The complexity is slightly larger but never significantly larger when they are not zero. We discuss this case in detail in Sect. 3.2.4. Figure 4 shows the meet-in-the-middle relation, where the red line and the blue line can be computed independently. The two lines collide in each column of $S_2[5]$. In the red line, $S_2[5]$ is computed as is computed, the whole state of $S_1[0]$ is involved but only one diagonal of $S_2[0]$ is involved. Therefore, there are $2^{16} \times 2^4 = 2^{20}$ candidates of each column of $S_2[5]$. On the other hand, in the blue line, $S_2[5]$ is computed as and one column of $S_2[4]$ are involved. Therefore, there are $2^4 \times 2^4 = 2^8$ candidates of each column of $S_2[5]$. In the meet-in-the-middle process, there are $2^{20} \times 2^8 = 2^{28}$ candidates, and they are filtered by a 32-bit collision that happens with a probability of $2^{-32}$ at random. Therefore, by changing target columns, we recover the unique solution from $2^{64}$ candidates with a complexity of only $2^{20}$.
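The matching step can be sketched generically as follows. The two sides below are hypothetical stand-ins (SHA-256 truncated to 32 bits), not the actual red and blue relations of Figure 4, and the guess spaces are shrunk to 12 and 8 bits. To plant a known solution, the toy matches on a known XOR target; both sides computing the same column corresponds to a target of zero.

```python
import hashlib
from collections import defaultdict


def toy_word(label: bytes, guess: int) -> int:
    """Hypothetical stand-in for one side: a 32-bit word derived from a guess."""
    digest = hashlib.sha256(label + guess.to_bytes(4, "big")).digest()
    return int.from_bytes(digest[:4], "big")


def meet_in_the_middle(red_guesses, blue_guesses, target: int):
    """Find (r, b) with toy_word(red, r) xor toy_word(blue, b) == target.

    The red side is tabulated once, then each blue guess needs one lookup,
    so the work is |red| + |blue| evaluations instead of |red| * |blue|.
    """
    table = defaultdict(list)
    for r in red_guesses:
        table[toy_word(b"red", r)].append(r)
    matches = []
    for b in blue_guesses:
        needed = target ^ toy_word(b"blue", b)
        for r in table.get(needed, []):
            matches.append((r, b))
    return matches


RED_BITS, BLUE_BITS = 12, 8              # toy sizes; the text has 20 and 8
secret_red, secret_blue = 0x5A3, 0x7C    # planted solution
target = toy_word(b"red", secret_red) ^ toy_word(b"blue", secret_blue)
matches = meet_in_the_middle(range(2 ** RED_BITS), range(2 ** BLUE_BITS), target)
print((secret_red, secret_blue) in matches)
```

A 32-bit filter applied to $2^{20}$ toy pairs leaves few or no false matches, which mirrors the filtering argument in the text at a smaller scale.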

Case of Collisions in Some Bytes with Known Difference
The procedure above assumes the case that all yellow-colored bytes have a non-zero difference. Recall Lemma 1. If the corresponding byte difference is zero by chance, we cannot narrow down the number of candidates.
See Figure 3. Lemma 1 is applied to five AES round functions labeled with A, B, C, D, and E. Input differences in AES round functions labeled with A and B coincide with $\Delta C_0^1$, which is chosen by an attacker. Therefore, we can avoid collisions happening here by injecting a proper difference. Unfortunately, input differences in AES round functions labeled with C, D, and E are uncontrollable. Since C and D take the same input difference, the probability that we do not observe any collision is $(255/256)^{32} \approx 88.2\%$.
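The 88.2% figure can be reproduced with a few lines of Python. The sketch assumes, as the text does, that each of the 32 uncontrolled byte differences is zero independently with probability 1/256; the function name is illustrative.

```python
def prob_no_zero_difference(num_bytes: int) -> float:
    """Probability that none of num_bytes independent byte differences is zero."""
    return (255 / 256) ** num_bytes


p = prob_no_zero_difference(32)
print(f"{p:.3f}")
```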
We now discuss the case that there are some colliding bytes by chance. Then, $2^8$ candidates remain in the corresponding bytes instead of 2 or 4 candidates.
We recall the meet-in-the-middle procedure, where 16 bytes of $S_1[0]$ and 4 bytes of We observe some colliding bytes in E too. When there are at most $x$ colliding bytes out of 16 bytes, the probability is Note that this unfortunate case only happens in the RUP setting. In the nonce-misuse setting, we can avoid it by choosing message differences such that yellow-colored bytes have non-zero differences.

Experimental Verification
We conducted experiments to verify the validity of the proposed attack. The following is our experimental environment: a Linux machine with 28-core Intel(R) Xeon(R) Gold 6258R CPU (2.70 GHz), 256.0 GB of main memory, a gcc 9.4.0 compiler, and the C programming language. We used the Mersenne Twister, which is a pseudorandom number generator proposed by Matsumoto and Nishimura [MN98], to generate the secret keys, nonces, associated data, and messages used in all our experiments, and thus did not reuse them in any of the experiments. Our experimental verification procedure is as follows:
Step 1. We generate a secret key $K$ $(= K_0 \| K_1)$, a nonce $N$, associated data $AD$, and a message $M$ at random, and then simulate the encryption oracle to get the ciphertext $C$ from a tuple $(K, N, AD, M)$.
Step 2. We inject a proper difference $\Delta$ to the ciphertext $C$, and then simulate the decryption oracle to get the message $M'$ from a tuple $(K, N, AD, C \oplus \Delta)$. We used $\Delta C_0^1$ = 0x01010101010101010101010101010101 in our experiments. Note that we use the decryption oracle that releases unverified plaintexts to make the experiment practical.
Step 3. We use the "nonce-repeated" pair $(M, C)$ and $(M', C \oplus \Delta)$ to recover the whole internal state of $S_1$ based on the proposed attack procedure described in Sects. 3.2.1 and 3.2.2.
We provide a test case for the proposed attack in Appendix A. After completing multiple trials with the above steps, we estimate the time complexity and success probability for the proposed attack while taking the specific case explained in Sect. 3.2.3 into consideration.
To this end, we conducted experiments with $2^{20}$ trials, i.e., we used $2^{20}$ different tuples $(K, N, AD, M)$ to verify the proposed attack. Table 1 summarizes the results of estimating the time complexity and success probability for the proposed attack. As described in the beginning of Sect. 3.2.1, given an input-output difference of the AES S-box, we can reduce the number of candidates of corresponding input-output values to two with a probability of 31/32, whereas the number of candidates is four with a probability of 1/32. This suggests that the proposed attack can be optimized (i.e., the proposed attack can be performed with a time complexity of $2^{20}$) only when all of our guesses can be narrowed down to two candidates. However, it is not always possible to optimize. Indeed, Table 1 indicates that the proposed attack can be performed with an optimized time complexity of $2^{20}$, but its success probability is about 0.565.
There are three cases in which the attack complexity increases from $2^{20}$: the number of candidates for given input-output differences is four, an input difference of the target S-boxes becomes zero by chance, and an incorrect guess passes through a 128-bit filter. Taking all these cases into consideration, we summarize the success probability of our attack with time complexity $2^{20} \times \alpha$ for $\alpha \le 2^{11}$, estimated from $2^{20}$ experiments, in Table 1. We observe that the success probability is sufficiently high for all the values of $\alpha$ we considered. Therefore, our experimental verification demonstrates that the proposed attack is practical.
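The first of these cases can be quantified for one 16-byte S-box layer. The sketch below takes the per-byte probability of 1/32 for the four-candidate case from Lemma 1 as given and assumes the 16 bytes behave independently. The function name and the `p_four` parameter are illustrative, so other per-byte probabilities can be tried. It reproduces the values of about 0.6, 0.31, 0.075, and 0.011 quoted in Step 1 of the state-recovery attack.

```python
from math import comb


def prob_four_candidate_bytes(i: int, n_bytes: int = 16,
                              p_four: float = 1 / 32) -> float:
    """Probability that exactly i of n_bytes S-boxes leave four candidates,
    so that the layer leaves 2**(n_bytes + i) candidates in total."""
    return comb(n_bytes, i) * p_four ** i * (1 - p_four) ** (n_bytes - i)


probs = [prob_four_candidate_bytes(i) for i in range(4)]
for i, p in enumerate(probs):
    print(f"i = {i}: {p:.3f}")
print(f"total for i <= 3: {sum(probs):.4f}")
```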

Summary of Attacks
Once an attacker obtains one "nonce-repeated" pair (with a proper difference), it is possible to recover the secret key with a complexity of about $2^{20}$. The attack implies a risk of immediate recovery of the secret key if a nonce is misused even only once. Besides, we can combine the nonce-misuse attack with the attack exploiting the decryption oracle. That is, an attacker makes a known-plaintext query to the encryption oracle, injects a proper difference to the ciphertext, and queries the modified ciphertexts to the decryption oracle while trying out all the $2^{128}$ possible tags. Since $2^{128}$ trials always return the corresponding plaintext, the attacker can recover the secret key with a success probability of 1 and a time complexity of $2^{128} + 2^{20}$. This attack is more efficient than an exhaustive search of the 256-bit key.

Extension to Various Security Models
In addition to the attack with one encryption query and $2^{128}$ decryption queries, in this section, we discuss the feasibility of the attack in three specific security models:
1) The case where the number of decryption queries that an attacker can make is limited.
2) The case where the encryption oracle is unavailable to the attacker.
3) The case where the decryption oracle is stateful and does not allow nonce-repeated decryption queries.
Interestingly, in all these security models, Rocca is insecure. Table 2 shows the summary of attacks against Rocca in these security models.

Query Limitation.
Our attack shown in Sect. 3.1 requires 2^{128} decryption queries to collect a nonce-repeated pair. When the number of decryption queries q_d is limited, e.g., q_d ≪ 2^{128}, the attacker cannot try all 2^{128} tags against the decryption oracle. In this case, the success probability of our attack decreases from 1 to q_d/2^{128}, and the time complexity of the attack is q_d + 2^{20}. Although the success probability never reaches 1, the attack is still more efficient than an exhaustive search with the same success probability. For example, when q_d = 2^{64}, the time complexity is about 2^{64} with a success probability of 2^{-64}. The corresponding exhaustive search with the same success probability would require a time complexity of 2^{256-64} = 2^{192}.
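The trade-off in this paragraph can be recomputed for other query budgets with a few lines of exact integer arithmetic. The following Python sketch is only an illustration; the function name limited_query_attack and the constant names are hypothetical and do not come from the paper.

```python
from fractions import Fraction

TAG_BITS = 128             # tag length of Rocca
KEY_BITS = 256             # key length of Rocca
STATE_RECOVERY_LOG2 = 20   # cost of the key recovery once a nonce-repeated pair is known


def limited_query_attack(log2_qd: int):
    """Return (success probability, attack time, exhaustive-search time at equal success)."""
    q_d = 2 ** log2_qd
    # Each decryption query tries one of the 2^128 possible tags.
    success = Fraction(q_d, 2 ** TAG_BITS)
    # q_d decryption queries plus the 2^20 key-recovery step.
    attack_time = q_d + 2 ** STATE_RECOVERY_LOG2
    # Exhaustive key search reaching the same success probability
    # must test a `success` fraction of the 2^256 keys.
    exhaustive_time = success * 2 ** KEY_BITS
    return success, attack_time, exhaustive_time


if __name__ == "__main__":
    for log2_qd in (32, 64, 96, 128):
        success, attack_time, exhaustive_time = limited_query_attack(log2_qd)
        print(log2_qd, success, attack_time.bit_length() - 1,
              int(exhaustive_time).bit_length() - 1)
```

For q_d = 2^{64}, the function returns the figures quoted above: success probability 2^{-64}, attack time 2^{64} + 2^{20}, and an exhaustive-search cost of 2^{192}.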

No Encryption Oracle.
Interestingly, there is a variant of the attack that no longer requires an encryption oracle. An attacker makes decryption queries with a random ciphertext while trying out all 2^{128} possible tags and obtains the corresponding message. Then, the attacker injects a proper difference into the ciphertext and runs the same attack. This variant requires 2^{129} decryption queries, and the time complexity is 2^{129} + 2^{20}.

Decryption with Nonce-Repeat Detection.
Let us assume a decryption oracle with a state to reject ciphertexts whose nonce is already used in a previous decryption query. Such a decryption oracle is not common because it is hard to implement. Besides, if a nonce is blocked even when the earlier ciphertext queried with that nonce failed verification, attackers can easily exhaust nonces and mount a DoS attack. Even such an impractical and complicated decryption oracle cannot invalidate our attack, because our attack can recover the secret key with only one nonce-repeated pair. An attacker makes an encryption query and obtains a plaintext-ciphertext pair. Then, the attacker injects a difference into the ciphertext and makes a decryption query with the same nonce used in the encryption query. If it is rejected, the attacker no longer uses this nonce for either encryption or decryption queries. Instead, the attacker makes an encryption query with a fresh nonce and repeats this procedure until the decryption oracle returns a valid message. With q encryption and q decryption queries, the time complexity of this attack is 2q + 2^{20} with a success probability of q/2^{128}.
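The figures for the stateful decryption oracle follow from a short calculation. The sketch below assumes that each modified ciphertext is accepted independently with probability 2^{-128}, i.e., that the guessed tag behaves like a uniform 128-bit value; this independence assumption is ours and is not stated explicitly in the text.

```latex
\begin{align*}
\Pr[\text{success}] &= 1 - \left(1 - 2^{-128}\right)^{q} \le q \cdot 2^{-128}, \\
\Pr[\text{success}] &\approx q / 2^{128} \quad \text{for } q \ll 2^{128}, \\
\text{time} &= \underbrace{q}_{\text{encryption queries}} + \underbrace{q}_{\text{decryption queries}} + 2^{20} = 2q + 2^{20}.
\end{align*}
```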

Countermeasures
In this section, we consider several approaches to tweak Rocca to mitigate the impact of our attack.

Parameter Change
We first discuss whether changing parameters, namely the number of rounds in the initialization and finalization, the nonce length, and the tag length, can invalidate the attack. Our attack does not exploit the concrete structure of the initialization and finalization. Therefore, adding more rounds to the initialization and finalization never invalidates our attack.
Our attack works in the nonce-respecting setting and requires only one encryption query. Therefore, increasing the nonce length does not invalidate our attack.
While increasing the tag length is relatively promising, we need to evaluate whether Rocca ensures the higher security against forgery attacks that corresponds to the longer tag. Let us consider the case where we increase the tag length from 128 bits to 256 bits. The designers guarantee at least 24 active S-boxes in differential trails available in the forgery attack, which is insufficient to ensure 256-bit security against the forgery attack. We evaluated the tight maximum differential characteristic probability to confirm the insufficiency. As a result, we found a differential trail available in the forgery attack with a probability of 2^{-6 × 25} = 2^{-150}. The existence of this differential trail implies that increasing the tag length is not a promising direction to invalidate the attack using the decryption oracle. We refer to Appendix B for details.
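As a rough reading of these numbers, assume (as a simplification of ours) that a forgery attempt following a differential trail of probability p is accepted with probability about p. The trail with 25 active S-boxes then gives:

```latex
\begin{align*}
p &= \left(2^{-6}\right)^{25} = 2^{-150}, \\
\text{expected number of forgery attempts} &\approx 1/p = 2^{150} \ll 2^{256}.
\end{align*}
```

Under this assumption, a 256-bit tag would not raise the cost of obtaining a valid modified ciphertext to 2^{256}.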

Key-Dependent Initialization and Finalization
To invalidate our attack, the most promising idea (with negligible impact on the cost) is to involve the secret key in the initialization and finalization. Such an idea was already introduced in ASCON to enhance the security under the nonce-misuse scenario. Specifically, we XOR the secret key K_0 and K_1 into two branches of the state after the initialization. Note that this countermeasure never invalidates our state-recovery attack. It only makes it non-trivial for attackers to recover the secret key even if they recover the internal state.
Our attack reveals the whole state of one output of the initialization at the cost of 2^{128} decryption queries. The designers of Rocca did not analyze the initialization in such a setting. Therefore, whether 20 rounds are sufficient has to be re-evaluated to see the feasibility of the key-recovery attack under such a scenario. Although we make no formal claim, we expect that 20 rounds are sufficient.
We additionally recommend XORing the secret key K_0 and K_1 into two branches of the state before the finalization. This replacement makes it non-trivial for attackers to mount a universal forgery attack even if they recover the internal state. Note that the number of decryption queries required for our attack already exceeds the claimed 128-bit security against forgery attacks. Therefore, this tweak is not necessary for the security claims of Rocca to hold. However, considering practical risks under the nonce-misuse scenario, we recommend that the finalization also involves the secret key.
We stress that this countermeasure is for the 256-bit security against our key-recovery attack. This countermeasure is not helpful to achieve 256-bit security against distinguishing attacks in Claim 1. Besides, note that message-recovery attacks or universal forgery attacks are still possible at the cost of about 2^{128} even if this countermeasure is adopted, since the internal state can be recovered.
The indistinguishability security is strong and demanding, and we focus on the theoretical (in)feasibility of such strong security in the next section.

(In)feasibility of Achieving the Security Claim
In this section, we consider the feasibility and infeasibility of achieving Claim 1 in the provable security paradigm. We distinguish IND-CPA and IND-CCA for the security notion of distinguishing attacks, and we consider INT-CTXT as the security notion of forgery attacks.
We first define the relevant notions following [Rog02, Rog04b, BN08, ABL+14]. Let A be an adversary and Π = (Enc, Dec) be an AEAD scheme. The encryption algorithm Enc_K takes a tuple (K, N, AD, M) of a key, a nonce, AD, and a message as input, and returns (C, T) = Enc_K(N, AD, M), a ciphertext and a tag. The decryption algorithm Dec_K takes (K, N, AD, C, T) as input, and returns M = Dec_K(N, AD, C, T) or ⊥, where the latter indicates rejection. We consider the following advantage functions on an adversary A: Here, $ denotes an oracle that returns uniformly random bits of the same length as the output of Enc_K, and ⊥ denotes an oracle that always returns the reject symbol ⊥. We only consider nonce-respecting adversaries.  [BN08] for the first one. However, as discussed in Sect. 1, when interpreted as bit security, they only ensure min{k_1, k_2}-bit IND-CCA or AEAD security when an AEAD scheme has k_1-bit IND-CPA and k_2-bit INT-CTXT security.
Recall that we denote the concatenation of bit strings X and Y by X ∥ Y, and |X| denotes the bit length of X. This proposition can be confirmed by various known AEAD modes of operation that use a 4n-bit block cipher or a 5n-bit cryptographic permutation (with an n-bit rate) and ensure up-to-birthday-bound security, when they have n-bit tags. For example, we can extend GCM so that it works with a 4n = 512-bit block cipher together with an extended GHASH defined by multiplications over F_{2^{4n}}. We use (say) 2n-bit nonces and a 2n-bit block counter whose concatenation forms the initial counter-mode block input, just as the original GCM uses 96-bit nonces and a 32-bit block counter. The tag length is τ = 128 bits. Such an extended version of GCM has the following security bounds: where A_priv denotes the privacy adversary using σ_e total encrypted blocks and q_e encryption queries, and A_auth denotes the authenticity adversary using σ_e total encrypted and decrypted blocks and q_e encryption and q_d decryption queries, with a maximum of ℓ_A blocks of AD in any query, for 4n-bit blocks. The first terms of the bounds show the indistinguishability of the underlying block cipher E from a 4n-bit random permutation (i.e., the pseudorandom permutation advantage) and are assumed to be negligible. The proofs are mostly immediate from the original proofs [IOM12, Corollaries 3 and 4]. Similar results can be obtained for OCB [KR11], if the underlying (small) constant multiplications over F_{2^{4n}} (so-called doubling, tripling, etc.) fulfill certain distinctness conditions. Rogaway [Rog04a] presented concrete instances for the case of F_{2^{64}} and F_{2^{128}}, and Granger et al. [GJMN16] extended the analysis to F_{2^{512}} to F_{2^{1024}}, which can be useful for our case. In fact, [GJMN16] defined OPP, which is a permutation-based variant of (large-block) OCB.
It gives an AEAD scheme based on a 4n = 512-bit permutation having 256-bit IND-CPA and 128-bit INT-CTXT security if tags are truncated to τ = 128 bits. To confirm this claim, we refer to the bound shown in the full version of the paper [GJMN15, Theorem 5], which fixed a simple error in the proof of the original paper and presented a security bound for the unified AEAD security notion, Adv^{aead}_{OPP}(A), for OPP. When a 512-bit permutation is used and the key length is k = 256, one can confirm the above 256-bit IND-CPA claim by setting the number of decryption queries to zero in the unified bound. The 128-bit INT-CTXT bound is clear since the unified bound contains the term q_d/2^{τ} and a unified bound is essentially the sum of the IND-CPA and INT-CTXT bounds.
Sponge-based AEAD, such as duplex sponge [BDPA11], would also allow similar security bounds by using a permutation of an appropriate size. That is, a permutation of size 5n bits with an n-bit rate and a tag length of n bits is enough to achieve the bound.
Therefore, there are schemes with 256-bit IND-CPA security and 128-bit INT-CTXT security. However, as discussed in Sect. 1, the high indistinguishability security is guaranteed only when the decryption oracle is not available to the adversary. If the adversary has the decryption oracle, the security guarantee (in indistinguishability or in the unified notion) degrades to 128 bits. In general, this does not necessarily imply the existence of an attack with 2^{128} complexity. However, we next show that distinguishing attacks are possible in the IND-CCA security notion.

Infeasibility of 256-bit IND-CCA and 128-bit INT-CTXT Security.
We consider a class of AEAD schemes called an online AEAD scheme [BBKN12, ABL + 14]. Intuitively, in these schemes, the i-th output block depends only on the first i blocks of the input. This class includes all the schemes stated above, Rocca, and its variant in Sect. 4.2, and we show that they cannot achieve 256-bit IND-CCA security.
Let m be a positive integer. Following [BBKN12, ABL+14], we call an AEAD scheme Π = (Enc, Dec) m-online if it satisfies the following condition: For any tuple (K, N, AD), there exist functions f_1, f_2, . . . whose codomain is F_2^m and another function g such that

The following proposition roughly shows that an online AEAD scheme with a τ-bit tag cannot achieve more than τ-bit IND-CCA security. In particular, 128-bit tag AEAD schemes based on the sponge(-like) construction, including Rocca (and its variant mentioned in Sect. 4.2), cannot achieve 256-bit IND-CCA security.
Proposition 2. Let Π be an m-online AEAD satisfying |Enc_K(N, AD, M)| = |M| + τ for a fixed constant τ > 0. Let Enc′_K be a truncated version of Enc_K discarding outputs of g (see Eq. (2)), and assume the restriction of Enc′_K(N, AD, ·) to F_2^{2m} is a permutation for any choice of N and AD. Then there exists an adversary A making at most 2^{τ} decryption queries and a single encryption query such that Adv^{ind-cca}_Π(A) = 1 − 1/2^m holds.
Proof. In what follows, we assume that no AD is involved in encryption or decryption queries.
Let A be an adversary running as follows: (1) Choose a nonce N and C_0, C_1 ∈ F_2^m arbitrarily.
(2) For each T ∈ F_2^{τ}, query (N, , proceed to the next step.

Proposition 2 shows that distinguishing attacks are possible against Rocca and its variant in Sect. 4.2, while for these schemes, message-recovery attacks are also possible as the internal state can be recovered.
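To make the idea behind Proposition 2 concrete, the following Python sketch plays the IND-CCA game against a toy online scheme with m = 8-bit blocks and a τ = 8-bit tag. Everything in it is an assumption for illustration: the toy scheme (built from HMAC-SHA256), the names prf, enc, dec, adversary, and play, and the adversary's exact steps. The adversary shown is one plausible strategy consistent with the stated advantage 1 − 1/2^m and is not necessarily the one used in the proof: it tries all tags for a fixed (N, C_0 ∥ C_1) to learn the plaintext M_0 ∥ M_1, then asks for the encryption of M_0 followed by a different second block under the same nonce and checks whether the first output block equals C_0.

```python
import hashlib
import hmac
import random

TAU_BITS = 8  # toy tag length; blocks are one byte, i.e. m = 8


def prf(key: bytes, label: bytes, nonce: bytes, data: bytes) -> int:
    """Toy PRF with one-byte output (HMAC-SHA256 truncated to 8 bits)."""
    msg = label + bytes([len(nonce)]) + nonce + data
    return hmac.new(key, msg, hashlib.sha256).digest()[0]


def enc(key: bytes, nonce: bytes, msg: bytes):
    """Toy online encryption: block i is masked by a pad of blocks 0..i-1 only."""
    ct = bytes(msg[i] ^ prf(key, b"E", nonce, msg[:i]) for i in range(len(msg)))
    return ct, prf(key, b"T", nonce, msg)


def dec(key: bytes, nonce: bytes, ct: bytes, tag: int):
    """Recover the message block by block, then verify the 8-bit tag."""
    msg = bytearray()
    for c in ct:
        msg.append(c ^ prf(key, b"E", nonce, bytes(msg)))
    return bytes(msg) if prf(key, b"T", nonce, bytes(msg)) == tag else None


def adversary(enc_oracle, dec_oracle) -> int:
    """Output 1 to guess 'real encryption oracle', 0 to guess 'random oracle'."""
    nonce, ct = b"\x00\x00\x00\x00", bytes([0x12, 0x34])  # arbitrary N and C_0 || C_1
    recovered = None
    for tag in range(2 ** TAU_BITS):  # at most 2^tau decryption queries
        recovered = dec_oracle(nonce, ct, tag)
        if recovered is not None:
            break
    if recovered is None:
        return 0
    probe = bytes([recovered[0], recovered[1] ^ 1])  # same M_0, different M_1
    out, _ = enc_oracle(nonce, probe)                # the single encryption query
    return 1 if out[0] == ct[0] else 0


def play(real: bool, rng: random.Random) -> int:
    """One IND-CCA game; the decryption oracle is the real one in both worlds."""
    key = bytes(rng.getrandbits(8) for _ in range(16))

    def dec_oracle(n, c, t):
        return dec(key, n, c, t)

    def enc_oracle(n, m):
        if real:
            return enc(key, n, m)
        return bytes(rng.getrandbits(8) for _ in m), rng.getrandbits(TAU_BITS)

    return adversary(enc_oracle, dec_oracle)


if __name__ == "__main__":
    rng = random.Random(0)
    trials = 200
    real_rate = sum(play(True, rng) for _ in range(trials)) / trials
    ideal_rate = sum(play(False, rng) for _ in range(trials)) / trials
    print("estimated advantage:", real_rate - ideal_rate)
```

In the real world the adversary always outputs 1, because the first ciphertext block depends only on M_0; in the ideal world it outputs 1 only when a random byte happens to equal C_0.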
Now Proposition 2 rules out efficient solutions, and we next consider an offline construction to see the feasibility.

Feasibility of 256-bit IND-CCA and 128-bit INT-CTXT Security.
We show that, with the Encode-then-Encipher approach [BR00], we can simultaneously achieve 256-bit IND-CCA security and 128-bit INT-CTXT security.
Before presenting our theorem and the proof, let us fix some notation. Let n be a positive integer and X be an n-bit string. For a positive integer x ≤ n, msb_x(X) (resp. lsb_x(X)) denotes the truncation of X to its x most (resp. least) significant bits. A tweakable block cipher (TBC) [LRW11] is a keyed function E : \mathcal{K} × \mathcal{TW} × \mathcal{M} → \mathcal{M}, where \mathcal{K} is a key space, \mathcal{TW} is a tweak space, and \mathcal{M} is a message space. A tweak is a public parameter, and E(K, TW, ·) is a permutation on \mathcal{M} for all (K, TW) ∈ \mathcal{K} × \mathcal{TW}. We also write E_K to mean E(K, ·, ·). Let Perm(n) denote the set of all permutations on F_2^n. An n-bit tweakable permutation with a tw-bit tweak is a function π : F_2^{tw} × F_2^n → F_2^n such that π(TW, ·) ∈ Perm(n) for all TW ∈ F_2^{tw}. The set of all n-bit tweakable permutations with a tw-bit tweak is denoted by TPerm(tw, n). Let P, sampled uniformly at random from TPerm(tw, n), be a tweakable uniform random permutation (TURP). The (strong) security of a TBC E is defined as the advantage function Adv^{tsprp}

Proof. Let E_K(·, ·) be a TBC with a 256-bit input/output and a 256-bit tweak. We prove that the Encode-then-Encipher (EtE) scheme based on E fulfills the bounds in Theorem 1. For simplicity, we assume that EtE only accepts a fixed message length of m := 128 bits and that there is no AD. We later cover the general case.

For IND-CCA security, we first replace E with the TURP P having the same tweak and message spaces. This step adds Adv^{tsprp}_E(A′) to the bound, which we assume to be small. Let (N_1, M_1, C_1), . . . , (N_{q_e}, M_{q_e}, C_{q_e}) be the sequence of encryption queries, and

There is no bad event between encryption queries because the adversary follows the nonce-respecting setting and P takes a nonce as tweak input. Also, there is no bad event between decryption queries because both worlds have the same (real decryption) scheme. In the real world, Case-1 and Case-2 do not appear since P is a random permutation under the fixed nonce.
We evaluate Pr[Case-1 ∪ Case-2] in the ideal world.
Let q_i denote the number of decryption queries whose nonce equals the nonce N_i of the i-th encryption query, thus \sum_{i=1}^{q_e} q_i ≤ q_d. We then split q_i into two variables, q_{i,b} and q_{i,a}. The former is the number of decryption queries with the nonce N_i made before the adversary obtains (N_i, M_i, C_i), and the latter is the number made after that, thus q_i = q_{i,b} + q_{i,a}. For the i-th encryption query, the probability of Case-1 is q_{i,b}/2^{256}. Note that M′ in the decryption queries can be ⊥, i.e., there is no need for the adversary to first succeed in forgery. For Case-2, we renumber the decryption query, whose nonce is the same as N_i, as . . , C ′i,a j−1 }, to the decryption oracle. Thus, assuming

For INT-CTXT security, we use P for E. This adds a term Adv^{tsprp}_E(A′′), which is assumed to be small. Now it is easy to see Adv^{int-ctxt}

For the general case, where there is variable-length associated data AD and a variable-length message M, the proof is easily extended. We assume that E accepts a variable-length tweak and a variable-length message block, and use (N ∥ AD) as a tweak and (0^{128} ∥ pad(M)) as an input block with a certain injective padding pad that ensures |pad(M)| > 128 for any M.
We do not specify the choice of E, which could be built from scratch or could be a mode of operation such as [HR04, WFW05] (instantiated with a block cipher with an appropriate block length). If E has 256-bit security against key-recovery attacks, we see that the claim against key-recovery attacks in Claim 1 can also be achieved.
Finally, we remark that Theorem 1 can be extended to handle a general case. For any positive integers k_1, k_2 such that k_1 > k_2, we can simultaneously achieve k_1-bit IND-CCA security and k_2-bit INT-CTXT security. Let n := k_1, τ := k_2, m := n − τ. Let E be a TBC with an n-bit input/output and an n-bit tweak. We define a generalized EtE, which we call gEtE, as gEtE.
We can prove that gEtE has k_1-bit IND-CCA security and k_2-bit INT-CTXT security in the same manner as the proof of Theorem 1.
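The Encode-then-Encipher idea can be sketched in a few lines of Python for n = 256, τ = 128, and m = 128. This is a toy illustration under several assumptions of ours: the encoding 0^τ ∥ M is inferred from the general-case description (0^{128} ∥ pad(M)), the nonce is used directly as the tweak, and the TBC is replaced by a hypothetical 4-round Feistel network keyed through HMAC-SHA256, which is only a stand-in and does not offer the 256-bit TSPRP security that Theorem 1 requires. All function names are hypothetical.

```python
import hashlib
import hmac

N_BYTES = 32    # block length n = k_1 = 256 bits
TAU_BYTES = 16  # redundancy tau = k_2 = 128 bits, so m = n - tau = 128 bits
HALF = N_BYTES // 2
ROUNDS = 4


def _round(key: bytes, tweak: bytes, r: int, half: bytes) -> bytes:
    """Feistel round function; the tweak and round index are bound into the HMAC input."""
    msg = bytes([r]) + len(tweak).to_bytes(2, "big") + tweak + half
    return hmac.new(key, msg, hashlib.sha256).digest()[:HALF]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def tbc_encipher(key: bytes, tweak: bytes, block: bytes) -> bytes:
    """Toy tweakable permutation on 256-bit blocks (illustrative stand-in only)."""
    left, right = block[:HALF], block[HALF:]
    for r in range(ROUNDS):
        left, right = right, _xor(left, _round(key, tweak, r, right))
    return left + right


def tbc_decipher(key: bytes, tweak: bytes, block: bytes) -> bytes:
    """Inverse of tbc_encipher: undo the rounds in reverse order."""
    left, right = block[:HALF], block[HALF:]
    for r in reversed(range(ROUNDS)):
        left, right = _xor(right, _round(key, tweak, r, left)), left
    return left + right


def ete_encrypt(key: bytes, nonce: bytes, msg: bytes) -> bytes:
    """Encode the message as 0^tau || M and encipher it under the nonce as tweak."""
    if len(msg) != N_BYTES - TAU_BYTES:
        raise ValueError("this sketch only handles fixed-length 128-bit messages")
    return tbc_encipher(key, nonce, bytes(TAU_BYTES) + msg)


def ete_decrypt(key: bytes, nonce: bytes, ct: bytes):
    """Decipher and accept only if the tau redundancy bits are all zero."""
    if len(ct) != N_BYTES:
        return None
    block = tbc_decipher(key, nonce, ct)
    if not hmac.compare_digest(block[:TAU_BYTES], bytes(TAU_BYTES)):
        return None  # plays the role of the reject symbol
    return block[TAU_BYTES:]


if __name__ == "__main__":
    key, nonce = bytes(range(32)), b"nonce-0001"
    ct = ete_encrypt(key, nonce, b"sixteen byte msg")
    print(ete_decrypt(key, nonce, ct))
```

Note that the whole 256-bit ciphertext acts as both ciphertext and tag: there is no separate tag that an attacker could enumerate while keeping the rest of the ciphertext fixed, which is the intuition for why this offline construction escapes Proposition 2.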

Conclusions
In this paper, we first presented a key-recovery attack on Rocca, where the time complexity is about 2^{128} and the success probability is almost 1. This shows that, in terms of key recovery, Rocca has 128-bit security. We then considered extensions of the attack to various security models and discussed countermeasures. We also studied the theoretical question of whether the security claim of Rocca can be achieved, and showed that the Encode-then-Encipher approach gives a feasible result at the cost of efficiency.
In Rocca, only one nonce-repeated pair is enough to recover the internal state and the secret key with a practical time complexity, which is the main reason for the success of the attack. Even if nonce-misuse security is optional, this highlights the importance of maintaining a level of security under the nonce-misuse scenario. Involving the secret key in the initialization and finalization is an effective countermeasure, as the security against key-recovery attacks improves and the cost is negligible, while message-recovery attacks or universal forgery attacks are still possible at the cost of about 2^{128}. Related to this, Rocca defines a raw encryption mode, which is obtained by removing the processing of AD and the finalization, and our attack implies that if an attacker has a decryption oracle for this raw encryption mode, this immediately allows a key-recovery attack with 2 decryption queries.
As we discussed in Sect. 3.1, having a strong privacy security bound and a weaker authenticity bound is relevant in practical contexts. For instance, given some AEAD scheme, possibly used in IoT applications, one may want to truncate the tag to reduce the bandwidth to save energy for data transmission, hoping that the impact on the privacy security bound is limited. Rocca is not suitable for this purpose, as its tag length directly degrades the security. The Encode-then-Encipher scheme has an efficiency issue, and designing an efficient scheme retaining a level of CCA security even with tag truncation is an interesting question.

B Forgery Attack against Rocca Using Longer Tag Length
Increasing the tag length is one of the simplest countermeasures against our attack. However, the designers of Rocca do not guarantee the stronger security against forgery attacks that the increased tag length would imply. Therefore, we evaluated differential trails that are available in forgery attacks, i.e., differential trails whose input and output differences are zero but in which a non-zero message difference is absorbed. Figures 5 and 6 show such a differential trail. There are 25 active S-boxes, and all of them transit with probability 2^{-6}. Therefore, the probability is 2^{-6 × 25} = 2^{-150}.

Figure 6: Bottom 3 rounds in the differential trail available for the forgery attack.

A. Hosoyamada, A. Inoue, R. Ito, T. Iwata, K. Minematsu, F. Sibleyras, Y. Todo