Power Yoga: Variable-Stretch Security of CCM for Energy-Efficient Lightweight IoT

Abstract. The currently ongoing NIST LWC project aims at identifying new standardization targets for lightweight authenticated encryption with associated data (AEAD) and (optionally) lightweight cryptographic hashing. NIST has deemed it important for performance and cost to be optimized on relevant platforms, especially for short messages. Reyhanitabar, Vaudenay and Vizár (Asiacrypt 2016) gave a formal treatment of the security of nonce-based AEAD with variable stretch, i.e., when the length of the authentication tag is changed between encryptions without changing the key. They argued that AEAD supporting variable stretch is of practical interest for constrained applications, especially low-power devices operated by battery, due to the ability to flexibly trade communication overhead against the level of integrity. In this work, we investigate this hypothesis with affirmative results. We present vCCM, a variable-stretch variant of the standard CCM, and prove it is secure when used with variable stretch. We then experimentally measure the energy consumption of a real-world wireless sensor node when encrypting and sending messages with vCCM and CCM, respectively. Our projections show that the flexible trade of integrity level and ciphertext expansion can reduce the overall energy consumption by up to 21% in certain scenarios. As vCCM is obtained from the widely used CCM by a black-box transformation, allowing any existing CCM implementation to be reused as-is, our results can be immediately put to use in practice. vCCM is all the more relevant because neither the NIST LWC project nor any of the candidates gives consideration to the support of variable stretch and the related integrity-overhead trade-off.


Introduction
In the absence of a broadly accepted precise definition, lightweight cryptography can be roughly understood to comprise cryptographic designs created and optimized for a specific design trade-off (along the axes of computational complexity, memory complexity, security level, qualitative security properties etc.), such that this trade-off is not well-served by the existing general-purpose cryptography. In practice, a vast majority of these specific design targets is dictated by severely constrained resources of the intended applications, such as limited amount of energy available in a battery-operated device, limited power available in externally-powered passive RFIDs, limited computational resources (HW area/RAM/ROM/computational cycles) available for an implementation of cryptography, or a limited bandwidth available for the communication overhead due to the use of cryptography [KM].
NIST LWC. Even though the field has already seen nearly two decades of research activities [BP17,SWE02,WSRE03,nisa], lightweight cryptography has been truly brought into the spotlight only recently by the ongoing NIST Lightweight Cryptography (LWC) Standardization process [nisb]. NIST LWC aims at identifying the most suitable candidates for a standardization of two symmetric-key functionalities: authenticated encryption with associated data (AEAD) [Rog02], and optionally also cryptographic hashing. Mirroring our informal definition, NIST states as their motivation that "the majority of current cryptographic algorithms were designed for desktop/server environments, many of these algorithms do not fit into constrained devices" [nisa]. The design requirement naturally implied is that new special-purpose candidate algorithms, in particular lightweight AEAD schemes, should perform significantly better in constrained environments compared to current general-purpose NIST standards, such as GCM and CCM modes for AEAD (instantiated with standard primitives e.g. AES).
Among other criteria, NIST has designated several cost metrics (e.g., area, memory, energy consumption), performance metrics (e.g., latency, throughput, power consumption) and implementation flexibility (in terms of achieving cost/performance trade-offs) for the evaluation of a candidate algorithm's merit. As has been the case in the other standardization projects and academic competitions, the target security levels have been fixed (e.g., 112-bit or 224-bit security against key recovery [nisb]), and the design parameters of a competing algorithm, such as the nonce-length or the ciphertext expansion must be fixed, resulting in one or more candidate instances.
Security-overhead trade-off. Meeting a clearly communicated, and generally accepted quantitative security level is a baseline necessity in modern cryptography. However, insisting on a single monolithic quantitative target for several security properties (of an AEAD scheme), and the related fixing of certain design parameters make it impossible to achieve an optimal trade-off between security and overhead, which is one of the central principles in lightweight cryptography [BP17]. In particular, we argue that facilitating a "flexible" security-overhead trade-off can be a practically relevant feature for several use cases of lightweight algorithms, provided such a "flexible" trade-off can be done safely.

Variable stretch.
To clarify what is meant by "flexible" security-overhead trade-off, consider for example a nonce-based AEAD algorithm that is secure to use with variable stretch [RVV16]; that is, under the same key, several tag lengths (a.k.a. ciphertext expansion or stretch) can be securely used for authenticated encryption of different messages. The tag length is a main determinant of the security level of an AEAD algorithm, but also a major contributor to the communication overhead, and hence to the energy consumption of a battery-powered constrained device. The ease of varying the tag length without the need for re-keying thus allows the authentication-related communication overhead to be adapted to the level of sensitivity of each transmitted message. This has a significant impact on the overall operation of the device, as energy consumption is a critical cost metric in many application areas.
The impact of the incurred communication overhead can be quite significant in use cases where authenticated encryption of predominantly (very) short messages is intended, and unsurprisingly such use cases are abundant in practice [ALP+19, ADP+20]. For example, in applications such as smart parking lots, the payload to be sent by the sensors most of the time is just one bit ("free" or "occupied"), together with a unique identifier of the sensor in the parking lot (e.g., 2 bytes). Here, a high tag length, e.g. 128 bits, will then be the main contributor to the communication cost compared to the actually transmitted information, and arguably disproportional to the impact of a successful forgery (changing between "free" and "occupied" in a single status update). However, the sensors will also need to send and receive management traffic, which is likely going to be more sensitive (e.g., "go to sleep" or "enter high performance mode" commands being used for

Contribution. Our main motivation is to investigate the usefulness of the notion of variable-stretch AEAD for enabling the security-overhead trade-off with respect to the energy consumption overhead due to the ciphertext expansion. In many applications relying on battery-operated low-power devices (such as wireless sensor networks, smart medical implants, etc.) energy is a critical resource and decreasing its consumption is one of the major optimization targets.
Towards this goal, we first propose a variable-stretch variant of the CCM standard, naming it vCCM, and prove that it is a secure nonce-based variable-stretch AEAD scheme, i.e., nvAE-secure as formalized by Reyhanitabar et al. [RVV16]. We then experimentally measure the energy consumption of a real-world low-power wireless sensor platform in a simple model scenario when using vCCM and CCM, respectively. Based on our measurements, we estimate that the overall energy consumption can be reduced by about 8-21% in applications similar to our scenario, just by treating a majority of messages with a shorter stretch than used for the most sensitive messages (which is not possible with AEAD schemes that do not support the variable-stretch security).
Our results also raise the bar for evaluation of new (not only) lightweight algorithms, adding the criterion of supporting variable tag length for achieving optimal security-energy trade-off, with our lightweight variant of the existing CCM standard as a baseline reference.
Variable-stretch CCM in the wild. CCM mode for AEAD was originally proposed in 2002 [WHF02,WHF03] for inclusion in the IEEE 802.11i standard. Despite initial academic criticisms [RW03], CCM has become one of the most widely supported AEAD schemes in real-world crypto, being used in IPsec, TLS, Bluetooth Low Energy (BLE) and a minor variant (CCM*) in the ZigBee standard, to mention some.
Our proposed vCCM scheme is obtained from CCM using a black-box transformation, i.e., without requiring any changes in the internals of the standard CCM scheme. This property has been the primary design goal in this work: it allows practitioners to benefit from existing software and hardware implementations of CCM, while instantly enabling the trade-off between security level and energy consumption in a flexible and provably sound manner. At the same time, our black-box transform itself is very lightweight, requiring only a few elementary operations.
At the time of writing this paper, it is expected that the NIST LWC project will conclude in a year, resulting hopefully in one or several promising lightweight AEAD schemes with significant improvements compared to current standards; however, considering the time for standardization, inclusion in protocols and actual deployment in embedded systems, the real availability of the new schemes can be expected to be several years away at best. In the meantime, we believe that salvaging a widely implemented standard such as CCM and instantly enabling a provably graceful trading of security for energy savings is of high interest, with the potential to bring measurable improvements to real-world applications.

Preliminaries and Prior AE Definitions
Notations. We let $a \xleftarrow{\$} S$ denote the uniform sampling of an element of a finite set $S$ and its assignment to the variable $a$. All strings are binary strings. We let $|X|$ denote the length of a string $X$, and $X \| Y$ the concatenation of two strings $X$ and $Y$. We let $\varepsilon$ denote the empty string of length 0. We let $\{0,1\}^*$ denote the set of all strings of arbitrary finite length (s.t. $\varepsilon \in \{0,1\}^*$) and we let $\{0,1\}^n$ denote the set of all strings of length $n$ for a positive integer $n$. For a string $X \in \{0,1\}^*$, we let $\mathrm{left}_\ell(X)$ denote the $\ell$ leftmost bits of $X$ for any $0 \le \ell \le |X|$. We let $\mathbb{N}$ denote the set of all (positive) natural numbers and $\mathbb{N}_0 = \mathbb{N} \cup \{0\}$. As the specification of CCM is byte-oriented, we define the shorthand $\mathbb{B} = \{0,1\}^8$ for the set of bytes, and for a byte string $X \in \mathbb{B}^*$ the length in bytes $|X|_8 = |X|/8$. The set of arbitrary-length byte strings $\mathbb{B}^*$ and the set of byte strings of $m$ bytes $\mathbb{B}^m$ are defined in the natural way. It is also useful to define $(\{0,1\}^n)^*$, the set of all strings whose length is a multiple of some integer $n$.
We call a mapping $\mathrm{encode} : S \to (\{0,1\}^n)^*$ that encodes elements of a domain $S$ into strings of multiple-of-$n$-bit length prefix-free if for each pair of distinct $s, s' \in S$ we have $\mathrm{llcp}_n(\mathrm{encode}(s), \mathrm{encode}(s')) < \min(|\mathrm{encode}(s)|, |\mathrm{encode}(s')|)/n$.

Resource-parameterized advantage. The (in)security of a scheme $\Pi$ with respect to a notion xxx is measured by taking the maximum over all adversaries $A$ which use resources bounded by $r$.
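The prefix-free condition on encode can be checked mechanically on a finite set of encodings. The following is a minimal Python sketch, assuming that llcp_n counts the leading n-bit blocks shared by two strings (the surrounding text uses llcp_n without restating its definition here) and that n is a multiple of 8 so that byte strings can be used. The function names `llcp` and `is_prefix_free` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from typing import Iterable


def llcp(n_bits: int, x: bytes, y: bytes) -> int:
    """Length, in n-bit blocks, of the longest common blockwise prefix of x and y.

    ASSUMPTION: n_bits is a multiple of 8, so blocks align with bytes.
    """
    step = n_bits // 8
    shared = 0
    # Only compare complete blocks that exist in both strings.
    for off in range(0, min(len(x), len(y)) - step + 1, step):
        if x[off:off + step] != y[off:off + step]:
            break
        shared += 1
    return shared


def is_prefix_free(n_bits: int, encodings: Iterable[bytes]) -> bool:
    """Check llcp_n(e, e') < min(|e|, |e'|)/n for every pair of encodings."""
    step = n_bits // 8
    return all(
        llcp(n_bits, a, b) < min(len(a), len(b)) // step
        for a, b in combinations(list(encodings), 2)
    )


if __name__ == "__main__":
    # A toy length-prefixed encoding is prefix-free; raw strings are not.
    raw = [b"a", b"ab", b"abc"]
    print(is_prefix_free(8, raw))
    print(is_prefix_free(8, [bytes([len(s)]) + s for s in raw]))
```

The toy example illustrates why a length field in the first block helps: once two encodings disagree in the length byte, neither can be a blockwise prefix of the other.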

Blockciphers.
A blockcipher is a function $E : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ for an integer $n$, the blocksize, such that the mapping $E_K(\cdot) = E(K, \cdot)$ is a permutation of $\{0,1\}^n$ for every $K \in \mathcal{K}$, with some finite $\mathcal{K}$. We define the security of blockciphers through indistinguishability from a random permutation by an adversary $A$ as

$$\mathbf{Adv}^{\mathrm{prp}}_{E}(A) = \Pr[K \xleftarrow{\$} \mathcal{K} : A^{E_K} \Rightarrow 1] - \Pr[\pi \xleftarrow{\$} \mathrm{Perm}(n) : A^{\pi} \Rightarrow 1],$$

where $\mathrm{Perm}(n)$ is the set of all the permutations over $n$-bit strings. Similarly, we define for a function with the same signature $E : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ and a finite $\mathcal{K}$ (not insisting on the permutation property of $E_K(\cdot)$)

$$\mathbf{Adv}^{\mathrm{prf}}_{E}(A) = \Pr[K \xleftarrow{\$} \mathcal{K} : A^{E_K} \Rightarrow 1] - \Pr[f \xleftarrow{\$} \mathrm{Func}(n) : A^{f} \Rightarrow 1],$$

where $\mathrm{Func}(n)$ is the set of functions from $\{0,1\}^n$ to itself. The resource-parameterized advantage functions are defined with the time complexity ($t$) of the adversary and the total number of queries ($q$) asked by the adversary.

We now recall the security notions for nonce-based AE (nAE) schemes with associated data (AEAD schemes) [Rog02]. We also require that if $M \in \mathcal{M}$ then $\{0,1\}^{|M|} \subseteq \mathcal{M}$, and that the scheme $\Pi$ be correct and tidy [NRS14]. Correctness means that for every (

An ordinary nAE scheme that has a fixed stretch $\tau$ can be seen as a special case of the just-defined syntax, having $I_T = \{\tau\}$. The value of the stretch argument being trivially assigned in all encryption and decryption calls, it can be omitted from the list of arguments in both cases. We let $\Pi[\tau]$ denote an ordinary nonce-based AE scheme obtained from a nonce-based AE scheme with variable stretch $\Pi$ by fixing the expansion value for all queries to some value $\tau \in I_T$.

nAE security definition. The all-in-one nAE security definition by Rogaway and Shrimpton captures AE security as indistinguishability of the real encryption and decryption algorithms from a random-strings oracle and an always-reject oracle in a nonce-respecting, chosen-ciphertext attack. The nae advantage of an adversary $A$ against a scheme $\Pi$ is defined as

$$\mathbf{Adv}^{\mathrm{nae}}_{\Pi}(A) = \Pr[A^{\text{nae-R}_\Pi} \Rightarrow 1] - \Pr[A^{\text{nae-I}_\Pi} \Rightarrow 1],$$

with the security games defined in Figure 1.
Figure 2, parameterized by the so-called challenge stretch value $\tau_c \in I_T$, where the adversary $A$ cannot repeat a nonce-stretch combination in its queries [RVV16]. As observed in Figure 2, a query, be it encryption or decryption, with a stretch other than $\tau_c$ will always be answered with the real encryption/decryption algorithm, giving the adversary extra information for distinguishing the real or ideal processing of queries stretched by $\tau_c$. We refer the reader to the original publication for a detailed explanation of the nvae notion. We define the advantage of $A$ in breaking the nvae security of $\Pi$ with the target stretch $\tau_c$ as

$$\mathbf{Adv}^{\mathrm{nvae}(\tau_c)}_{\Pi}(A) = \Pr[A^{\mathrm{nvae}(\tau_c)\text{-R}_\Pi} \Rightarrow 1] - \Pr[A^{\mathrm{nvae}(\tau_c)\text{-I}_\Pi} \Rightarrow 1].$$

We measure the resources of an nvae adversary in a fine-grained, vectorial fashion as $(t, \mathbf{q}_e, \mathbf{q}_d, \boldsymbol{\sigma}_e, \boldsymbol{\sigma}_d)$, where $t$ denotes the running time of the adversary, $\mathbf{q}_e = (q_e^{\tau} \mid \tau \in I_T)$ denotes the vector that holds the number of encryption queries $q_e^{\tau}$ made with stretch $\tau$ for every stretch $\tau \in I_T$, $\mathbf{q}_d = (q_d^{\tau} \mid \tau \in I_T)$ denotes the same for the decryption queries, $\boldsymbol{\sigma}_e = (\sigma_e^{\tau} \mid \tau \in I_T)$ denotes the vector that holds the total amount of data $\sigma_e^{\tau}$ processed in all encryption queries with stretch $\tau$ for every $\tau \in I_T$, while $\boldsymbol{\sigma}_d$ denotes the same for the decryption queries. As indicated by Reyhanitabar et al., a typical analysis will use the resources related to $\tau_c$ (i.e. $q_e^{\tau_c}$, $q_d^{\tau_c}$, $\sigma_e^{\tau_c}$, $\sigma_d^{\tau_c}$) and the aggregate resources $q_e$, $q_d$, $q$, $\sigma_e$, $\sigma_d$, $\sigma$. We additionally define the per-stretch aggregate variables $q^{\tau} = q_e^{\tau} + q_d^{\tau}$ and $\sigma^{\tau} = \sigma_e^{\tau} + \sigma_d^{\tau}$. Recall that an nAE scheme $\Pi$ with stretch $\tau$ is equivalent to an nvAE scheme with $I_T = \{\tau\}$, so the vector-based adversarial resources become synonymous with the aggregate variables.
Finally, we informally call a scheme $\Pi$ nvae-secure if for every $\tau_c \in I_T$, for all "practical" values of resources $r_{\tau_c}$ the advantage Adv

Key-equivalent separation by stretch. Reyhanitabar et al. proposed a notion that captures the intuition that changing the value of stretch with a single key is equivalent to having an independent key per stretch. The notion was primarily proposed to facilitate nvae security proofs for schemes known to be nae-secure. The kess property of an nvAE scheme $\Pi = (\mathcal{K}, \mathcal{E}, \mathcal{D})$ is defined through indistinguishability of the games defined in Figure 3 by an adversary $A$. The advantage of $A$ is defined as

$$\mathbf{Adv}^{\mathrm{kess}}_{\Pi}(A) = \Pr[A^{\text{kess-R}_\Pi} \Rightarrow 1] - \Pr[A^{\text{kess-I}_\Pi} \Rightarrow 1].$$
The adversarial resources of interest for the kess notion are (t, q e , q d , σ), as defined for the nvae(τ c ) notion in the current Section. Reyhanitabar et al. proved Theorem 1; to prove that an nae-secure AE scheme is nvae-secure, it suffices to show that the kess advantage of any adversary is small.

Counter with CBC-MAC (CCM)
CTR + CBC-MAC is a nonce-based AE scheme proposed by Whiting

[Figure: pseudocode of the CCM encryption and decryption algorithms. The message is partitioned into blocks $M_1, \ldots, M_m$ with $|M_m| \le n$ and $|M_i| = n$ otherwise; the ciphertext is partitioned likewise into $C_1, \ldots, C_m$ and a tag $T$ with $|T| = \tau$.]

of $\tau$ and $\nu$ we have

There is a seeming inflation of the bound compared to Jonsson's theorems. This is due to the difference in resource counting: while Jonsson defines the equivalents of $\sigma$ and $\sigma_e$ already including the overhead due to CCM's structure, we prefer to treat $\sigma$ as the data complexity as perceived by the user, who does not need to understand the internals of CCM. Combining the bounds of Theorem 2 yields the following nae security bound.

Variable Tag CCM (vCCM)
vCCM is a nonce-based AE scheme parameterized by a blockcipher $E : \mathcal{K} \times \{0,1\}^n \to \{0,1\}^n$ with $n = 128$, an ordered set of non-zero tag lengths $I_T \subseteq \{32, 48, 64, 80, 96, 112, 128\}$, and a nonce length $\nu \in \{56, 64, 72, 80, 88, 96\}$. There is a trade-off between the nonce length $\nu$ and the maximal message length of $8 \cdot (2^{112-\nu} - 1)$ bits. The associated data and message space of $\mathrm{vCCM}[E, I_T, \nu]$ are respectively $\mathcal{A} = \bigcup_{i=0}^{2^{64}-1} \mathbb{B}^i$ and $\mathcal{M} = \bigcup_{i=0}^{2^{112-\nu}-1} \mathbb{B}^i$. The encryption and the decryption algorithms of $\mathrm{vCCM}[E, I_T, \nu]$ are described in Figure 6 and illustrated in Figure 7.

The Transform. The nvAE scheme vCCM is obtained as a black-box transform of CCM, requiring no modifications of the standard CCM. This property has been our primary design goal, as it is key for the ability of the new nvAE scheme to benefit from existing software and hardware implementations, and thus be instantly used at a massive scale.
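The trade-off between the nonce length and the maximal message length stated for the vCCM parameters can be tabulated directly. The following Python sketch only evaluates the expression 8 · (2^(112−ν) − 1) for the admissible nonce lengths; the function name `max_message_bits` is a hypothetical helper introduced for illustration.

```python
# Admissible vCCM nonce lengths in bits, as listed in the parameter description.
NONCE_BITS = (56, 64, 72, 80, 88, 96)


def max_message_bits(nonce_bits: int) -> int:
    """Maximal vCCM message length in bits for a given nonce length nu."""
    if nonce_bits not in NONCE_BITS:
        raise ValueError("unsupported vCCM nonce length")
    return 8 * (2 ** (112 - nonce_bits) - 1)


if __name__ == "__main__":
    for nu in NONCE_BITS:
        # Longer nonces leave fewer bits for the message-length field.
        print(f"nu = {nu:2d} bits -> at most {max_message_bits(nu) // 8} message bytes")
```

Each additional nonce byte removes one byte from the message-length field, so the maximal message length shrinks by a factor of roughly 256 per step.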
The options available for a black-box transform are, informally speaking, injecting the tag length into the nonce, the associated data, or the key (not the message, due to undesirable additional ciphertext expansion). With the option of tag-dependent key undesirable, as discussed in the publication introducing the nvae model [RVV16], one is left with a tag-dependent nonce, tag-dependent associated data, or a combination thereof.
In the particular case of CCM, injecting the stretch in the associated data does not work; when the same message was encrypted with different tag lengths, the change would propagate through the computation of CBC-MAC, and hence only to the last block of the ciphertext. A trivial distinguishing attack could be mounted using the non-tag ciphertext blocks.
On the other hand, injecting the stretch into the nonce propagates to every block of ciphertext, thanks to the structure of CCM, as the nonce is encoded into the initial block $B_0$ processed by the plain CBC-MAC, as well as into each counter block used as blockcipher input to compute the key stream blocks. This being an efficient and security-wise sufficient option, our transform consists in simply dedicating the last byte of the CCM nonce to containing an encoding of the used tag length.
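The transform described above can be sketched as a thin wrapper around any existing CCM implementation. The Python code below is a minimal illustration under stated assumptions, not the authors' reference code: the exact one-byte encoding of the tag length is not given in this section, so the sketch assumes the tag length in bytes is used; the names `vccm_nonce`, `vccm_encrypt` and `vccm_decrypt` are hypothetical; and the CCM implementation is passed in as a callable with the assumed signature `(key, nonce, ad, data, tag_len_bytes) -> bytes`.

```python
from typing import Callable

TAG_BITS = (32, 48, 64, 80, 96, 112, 128)    # admissible stretch values
NONCE_BITS = (56, 64, 72, 80, 88, 96)        # admissible vCCM nonce lengths

# Assumed shape of a black-box CCM routine (encrypt or decrypt).
CcmFn = Callable[[bytes, bytes, bytes, bytes, int], bytes]


def vccm_nonce(nonce: bytes, tag_bits: int) -> bytes:
    """Build the CCM nonce: the vCCM nonce followed by one tag-length byte."""
    if 8 * len(nonce) not in NONCE_BITS:
        raise ValueError("unsupported vCCM nonce length")
    if tag_bits not in TAG_BITS:
        raise ValueError("unsupported tag length")
    # ASSUMPTION: the tag length is encoded as its byte count (4..16).
    return nonce + bytes([tag_bits // 8])


def vccm_encrypt(ccm_encrypt: CcmFn, key: bytes, nonce: bytes,
                 ad: bytes, msg: bytes, tag_bits: int) -> bytes:
    """Encrypt with unmodified CCM under a stretch-dependent nonce."""
    return ccm_encrypt(key, vccm_nonce(nonce, tag_bits), ad, msg, tag_bits // 8)


def vccm_decrypt(ccm_decrypt: CcmFn, key: bytes, nonce: bytes,
                 ad: bytes, ct: bytes, tag_bits: int) -> bytes:
    """Decrypt with unmodified CCM under the same stretch-dependent nonce."""
    return ccm_decrypt(key, vccm_nonce(nonce, tag_bits), ad, ct, tag_bits // 8)
```

Passing the CCM routine as a parameter mirrors the black-box property: the wrapper never touches CCM internals, so an adapter around an existing library or hardware driver can be plugged in unchanged. Since the CCM nonce is one byte longer than the vCCM nonce, the admissible vCCM nonce lengths of 56 to 96 bits map to CCM nonces of 8 to 13 bytes.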
Proof. We start by replacing the blockcipher in both the $\mathrm{nvae}(\tau_c)$-R and the $\mathrm{nvae}(\tau_c)$-I game by a secret random permutation $\pi \xleftarrow{\$} \mathrm{Perm}(n)$ to obtain the following inequality

$$\cdots, \mathbf{q}_e, \mathbf{q}_d, \sigma) + |I_T| \cdot \mathbf{Adv}^{\mathrm{prp}}_{E}(t', \sigma'),$$

with $\sigma' \le 2\sigma + 3q$ and $t' \le t + \sigma' \cdot \gamma$, as there is a single instance of $E$ used in the game $\mathrm{nvae}(\tau_c)$-R and $|I_T| - 1$ instances of $E$ (used with independent keys) in the game $\mathrm{nvae}(\tau_c)$-I, each of them processing no more than $2\sigma + 3q$ blocks in total. This is given by the fact that a CCM encryption (resp. decryption) query with $a$ AD blocks and $m$

[Figure 8: outline of the proof, showing which bad event each primitive input may trigger when a query is processed.]

From Corollary 1 and the fact that $\mathbf{Adv}^{\mathrm{prf}}_{f}(A') = 0$, the following inequality holds:

The proof is finalized by applying Lemma 1 to upper bound $\mathbf{Adv}^{\mathrm{kess}}_{\mathrm{vCCM}[f, I_T, \nu]}(t, \mathbf{q}_e, \mathbf{q}_d, \boldsymbol{\sigma}_e, \boldsymbol{\sigma}_d)$.
By using the kess security definition in Figure 3 and the games in Figure 9, we obtain the following bound.

$$\mathbf{Adv}^{\mathrm{kess}}_{\mathrm{vCCM}[f, I_T, \nu]}(\mathbf{q}_e, \mathbf{q}_d, \boldsymbol{\sigma}_e, \boldsymbol{\sigma}_d) \le \frac{2 \cdot (2\sigma + 3q)^2}{2^n}.$$

Proof. The proof is based on the games $G_0$ and $G_1$ in Figure 9. The code in Figure 9 is obtained by using vCCM to instantiate the Enc and Dec oracles of the kess games (Figure 3), such that in the place of each invocation of the underlying primitive, there is a nested if-else block (lines with gray background). In the following we will refer to the underlying primitive used to instantiate vCCM simply as the "primitive". The proof outline is visualized in Figure 8.
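To get a feeling for the magnitude of the kess bound 2 · (2σ + 3q)² / 2^n stated above, it can be evaluated numerically. The Python sketch below uses exact rational arithmetic; the function name `kess_bound` and the sample workload (2^20 queries, 2^24 blocks of data) are illustrative assumptions and are not taken from the paper. The sketch also assumes that σ is counted in n-bit blocks.

```python
import math
from fractions import Fraction


def kess_bound(sigma: int, q: int, n: int = 128) -> Fraction:
    """Evaluate 2 * (2*sigma + 3*q)^2 / 2^n exactly.

    sigma: total amount of data (ASSUMED to be counted in n-bit blocks)
    q:     total number of queries
    n:     blocksize of the underlying primitive in bits
    """
    return Fraction(2 * (2 * sigma + 3 * q) ** 2, 2 ** n)


if __name__ == "__main__":
    # Hypothetical workload, chosen only to illustrate the scale of the bound.
    bound = kess_bound(sigma=2 ** 24, q=2 ** 20)
    print(f"log2(bound) = {math.log2(float(bound)):.2f}")
```

The bound is quadratic in 2σ + 3q, so doubling both the data volume and the number of queries costs two bits of security.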
Simulating the kess games. Roughly speaking, these if-else blocks ensure that the supposed primitive outputs are being sampled such that they are simultaneously compatible with a single random function $f$ and $|I_T|$ independent random functions $f_{\tau_1}, \ldots, f_{\tau_{|I_T|}}$, for as long as possible. When it is no longer possible to maintain this double compatibility, both games set the flag bad to true. Formally, we claim that the games $G_0$ and $G_1$ are in fact implementing $\text{kess-R}_{\mathrm{vCCM}[f, I_T, \nu]}$ and $\text{kess-I}_{\mathrm{vCCM}[f, I_T, \nu]}$, respectively. This is easy to see for $G_0$. When we omit the boxed statements, then in each query, the primitive-output variable $y_j$, respectively $z_j$, is assigned a previously sampled value whenever the input to the primitive ($B_j \oplus Y_{j-1}$, resp. $Z_j$) is already in $\mathrm{Dom}(f)$, and freshly sampled and used to extend $f$ otherwise.
For $G_1$, the boxed statements are included, ensuring that in each query, in the special case when the input to the primitive ($B_j \oplus Y_{j-1}$, resp. $Z_j$) is in $\mathrm{Dom}(f)$ but not in $\mathrm{Dom}(f_\tau)$, the value of the primitive-output variable $y_j$, respectively $z_j$, is freshly re-sampled, with $\tau$ being the amount of stretch in the current query. We note that the seemingly missing case, when an input to the primitive ($B_j \oplus Y_{j-1}$, resp. $Z_j$) is present in $\mathrm{Dom}(f_\tau)$ but is missing in $\mathrm{Dom}(f)$ (in other words, $\mathrm{Dom}(f_\tau) \setminus \mathrm{Dom}(f) \ne \emptyset$), cannot occur. This is shown by a simple induction: at initialization, all functions are undefined everywhere, and so $\mathrm{Dom}(f_\tau) \setminus \mathrm{Dom}(f) = \emptyset$. Every time the primitive output is computed in $G_1$ in a query with stretch $\tau$, if $f_\tau$ is extended by a preimage-image pair $x, f_\tau(x)$, then $f(x)$ is defined at the same time, or $f(x)$ has already been defined before.
We thus have for any adversary $A$ that

$$\mathbf{Adv}^{\mathrm{kess}}_{\mathrm{vCCM}[f, I_T, \nu]}(A) \le \Pr[A^{G_0} \text{ sets bad}],$$

where the event "sets bad" is defined as the flag bad being true at the end of the experiment. When the flag bad is set, the game $G_1$ diverges from $G_0$, as visualized in Figure 8: the former samples a fresh primitive output, while the latter reuses a previously sampled one, and their distributions are no longer equivalent.
Characterizing the experiment. We start by characterizing the interaction of an adversary $A$ with its oracles as defined in Figure 9. We let $N^i$, $A^i$, $\tau^i$, $M^i$ and $C^i$ denote the nonce, the AD, the stretch, the (internally computed) plaintext and the ciphertext of the $i$th query. We denote by $B^i_0, \ldots, B^i_{\ell_i}$ the input blocks fed to plain CBC-MAC in the $i$th query. This notation is valid irrespective of whether the $i$th query is an encryption or a decryption query.
We assume that the adversary $A$ makes no pointless queries; i.e., $A$ makes no repeated queries, does not make a decryption query $N, A, C$ after a previous encryption query $N, A, M$ returned the ciphertext $C$, and does not make an encryption query $N, A, M$ after a previous decryption query $N, A, C$ returned $M \ne \bot$. This is without loss of generality, as the responses to these queries are all trivially known.

CBC collisions. To bound the probability $\Pr[A^{G_0} \text{ sets bad}]$, we first define an auxiliary flag coll-bad, and extend the games $G_0$ and $G_1$ with the specially marked lines in Figure 9. Here, the notation $\mathrm{llcp}_n(i)$ is a shorthand for $\mathrm{llcp}_n(B^1, \ldots, B^{i-1}; B^i) - 1$, the index of the first block of the input to the plain CBC-MAC in the $i$th query $B^i = B^i_0 \| \ldots \| B^i_{\ell_i}$ that is beyond the longest common blockwise prefix with any of such inputs in the previous $i - 1$ queries. Note that one is subtracted here because the blocks of $B^i$ are indexed from zero.
Since $A$ makes no trivial queries and because the string $B^i$ is a prefix-free encoding of $(N, A, M)$ in CCM, and consequently of $(N, A, \tau, M)$ in vCCM, we have $\mathrm{llcp}_n(i) < \ell_i$ for every $1 \le i \le q$. I.e., each $B^i$ is distinct from all prefixes of $B^1, \ldots, B^{i-1}$. In particular, the value of $\mathrm{llcp}_n(i)$ is independent of the game's randomness; when the $i$th query is a decryption query, the value of $\mathrm{llcp}_n(i)$ can be computed by evaluating $\mathrm{llcp}_n$ on inputs that use the ciphertexts stripped of the tag as substitutes for the plaintext. This works, because $\mathrm{encB}_0(N^r, A^r, \tau^r, \mathrm{left}_{|C^r|-\tau}(C^r))$ only depends on the length of $\mathrm{left}_{|C^r|-\tau}(C^r)$, and because the xor-based CTR encryption preserves the value of

If the last condition is not met, the plaintext, resp. ciphertext, blocks are irrelevant for the value of $\mathrm{llcp}_n(i)$.
In a nutshell, the flag coll-bad corresponds to reusing an input $B^i_j \oplus Y^i_{j-1}$ to the function $f$ (i.e., evaluating $f$ on a point already in $\mathrm{Dom}(f)$) beyond what is trivially determined by the common prefix with previous queries. We observe that

$$\Pr[A^{G_0} \text{ sets bad}] \le \Pr[A^{G_0} \text{ sets coll-bad}] + \Pr[A^{G_0} \text{ sets bad} \mid \neg (A^{G_0} \text{ sets coll-bad})]$$

and proceed to bound $\Pr[A^{G_0} \text{ sets coll-bad}]$ by induction on the blocks $y^i_j$, with the induction assumption being that coll-bad = false when the block is about to be processed. When the $i$th query is being processed, coll-bad can only be set when sampling the values of $y^i_j$ for $\mathrm{llcp}_n(i) + 1 \le j \le \ell_i$. For each such value of $j$, we bound the probability of collision between $B^i_j \oplus Y^i_{j-1}$ and each element already in $\mathrm{Dom}(f)$, with the help of the following case analysis.
We have three base cases for the value of $j$, namely $j = 0$, $0 < j = \mathrm{llcp}_n(i) + 1$ and $\mathrm{llcp}_n(i) + 1 < j \le \ell_i$. Before examining them, we note that the domain points of $f$, $x \in \mathrm{Dom}(f)$, can be classified in three "types". We say $x$ is of type1 when it has been added as the initial CBC-MAC input block

We say $x$ is of type2 when it has been added as an intermediate CBC-

We say $x$ is of type3 when it has been added as a counter block $\mathrm{ctr}(N^{i'}, \tau^{i'}, j')$ for $i' \le i$.

Case1: $j = 0$. The input $B^i_0$ to $f$ that may set coll-bad has no randomness. We examine collision probabilities with the three types of elements in $\mathrm{Dom}(f)$, determining that the maximal collision probability in this case is $1/2^n$:

type1: As $\mathrm{llcp}_n(i) < j = 0$ implies that $N^i$ has not been used in any previous query, the probability of collision with an $x$ of type1 is 0.
type2: This collision is equivalent to the event $Y^{i'}_{j'-1} = B^i_0 \oplus B^{i'}_{j'}$. Due to the induction assumption, $Y^{i'}_{j'-1}$ is statistically independent of all variables returned to the adversary so far (ciphertext blocks and tags), so this collision happens with probability at most $1/2^n$.
type3: This collision happens with probability zero, as the ranges of the two encoding functions encB 0 (·, ·, ·, ·) and ctr(·, ·, ·) have an empty intersection, thanks to the dedicated domain separation bits.
Case2: $0 < j = \mathrm{llcp}_n(i) + 1$. The potentially collision-causing input to $f$ here is $B^i_j \oplus Y^i_{j-1}$, such that the value $Y^i_{j-1} = Y^{\tilde{\imath}}_{\mathrm{llcp}_n(i)}$ has been sampled in the $\tilde{\imath}$th query with $\tilde{\imath} = \min\{r < i \mid \mathrm{llcp}_n(B^i, B^r) = \mathrm{llcp}_n(i) + 1\}$. Examining the collision probabilities with the three types of elements in $\mathrm{Dom}(f)$ reveals the maximal collision probability in this case to be $1/2^n$:

type1: Even though $Y^i_{j-1}$ has been sampled in the $\tilde{\imath}$th query, the induction assumption implies that it is statistically independent of all variables seen by the adversary, so this collision happens with probability at most $1/2^n$.

type2: We have two sub-cases here. In the first, special case, the collision event is

by the definition of the longest common prefix, corresponding to a collision probability zero.
In the second, general case, the collision event $Y^i_{j-1} \oplus B^i_j = Y^{i'}_{j'-1} \oplus B^{i'}_{j'}$ with $j = \mathrm{llcp}_n(i) + 1$ happens with probability $1/2^n$, because the variables $Y^i_{j-1}$ and $Y^{i'}_{j'-1}$, even though both sampled in a previous query, are statistically independent, and unknown to $A$ due to the induction hypothesis.
type3: Similarly to type1, the collision event $Y^i_{j-1} \oplus B^i_j = \mathrm{ctr}(N^{i'}, \tau^{i'}, j')$ happens with probability at most $1/2^n$.
Case3: $\mathrm{llcp}_n(i) + 1 < j \le \ell_i$. Because of the induction assumption, and because $\mathrm{llcp}_n(i) + 1 < j$, the variable $Y^i_{j-1}$ is fresh and has not been used in the game before. The probability of collision with an $x \in \mathrm{Dom}(f)$ of any type is thus at most $1/2^n$.
As there are at most $2\sigma + 3q$ possible values for $(i, j)$ that could set coll-bad, and because when sampling $y^i_j$ for each such value $(i, j)$ we have $|\mathrm{Dom}(f)| \le 2\sigma + 3q$, we have $\Pr[A^{G_0} \text{ sets coll-bad}] \le (2\sigma + 3q)^2 / 2^n$.

Bad events. We next define the following fine-grained bad events:

ey-bad$(i, j)$ is defined as the event that the $i$th adversarial query is an encryption query and bad gets set to true when sampling the value $y^i_j$ (i.e., it has remained false in the previous $i - 1$ queries and when sampling $y^i_0, \ldots, y^i_{j-1}$) for $1 \le i \le q$ and $0 \le j \le \ell_i$.

ez-bad$(i, j)$ is defined as the event that the $i$th adversarial query is an encryption query and bad gets set to true when sampling the value $z^i_j$ (i.e., it has remained false in the previous $i - 1$ queries, when sampling $y^i_0, \ldots, y^i_{\ell_i}$ and when sampling $z^i_0, \ldots, z^i_{j-1}$) for $1 \le i \le q$ and $0 \le j \le m_i$.

dz-bad$(i, j)$ is defined as the event that the $i$th adversarial query is a decryption query and bad gets set to true when sampling the value $z^i_j$ (i.e., it has remained false in the previous $i - 1$ queries and when sampling $z^i_0, \ldots, z^i_{j-1}$) for $1 \le i \le q$ and $0 \le j \le m_i$.

dy-bad$(i, j)$ is defined as the event that the $i$th adversarial query is a decryption query and bad gets set to true when sampling the value $y^i_j$ (i.e., it has remained false in the previous $i - 1$ queries, when sampling $z^i_0, \ldots, z^i_{m_i}$, and when sampling $y^i_0, \ldots, y^i_{j-1}$) for $1 \le i \le q$ and $0 \le j \le \ell_i$.
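Denoting by $E$ the event "$A^{G_0}$ sets coll-bad" further on, we have

$$\Pr[A^{G_0} \text{ sets bad} \mid \neg E] \le \sum_{i=1}^{q} \left( \sum_{j=0}^{\ell_i} \big( \Pr[\text{ey-bad}(i,j) \mid \neg E] + \Pr[\text{dy-bad}(i,j) \mid \neg E] \big) + \sum_{j=0}^{m_i} \big( \Pr[\text{ez-bad}(i,j) \mid \neg E] + \Pr[\text{dz-bad}(i,j) \mid \neg E] \big) \right).$$

We first turn to the bad events related to the sampling of $y^i_j$. We have that $\Pr[\text{ey-bad}(i, j) \mid \neg E] = \Pr[\text{dy-bad}(i, j) \mid \neg E] = 0$. This is because for $1 \le j \le \mathrm{llcp}_n(i)$, the input value $Y^i_{j-1} \oplus B^i_j$ has already been used in a previous query $i'$ with $\tau^{i'} = \tau^i$, as the latter is a necessary condition for $\mathrm{llcp}_n(i) > -1$.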
We bound $\Pr[\text{ez-bad}(i, j) \mid \neg E]$, which corresponds to $\mathrm{ctr}(N^i, \tau^i, j) \in \mathrm{Dom}(f) \setminus \mathrm{Dom}(f_{\tau^i})$, through a case analysis of the collision probabilities with individual elements of this set, using the three types of domain points defined before.
type1: This event is equivalent to $\mathrm{ctr}(N^i, \tau^i, j)$

Similarly to the Case1-type3 collision, this event happens with probability 0, thanks to the domain separation of the involved encoding functions.
type2: This is equivalent to ctr(

As previously argued, if coll-bad is not set during the experiment, all variables observed by the adversary are statistically independent of $Y^{i'}_{j'-1}$ for $1 \le i' \le q$ and $1 \le j' \le \ell_{i'}$. This event thus happens with probability $1/2^n$.

Experimental Validation of Energy Efficiency
In many applications relying on battery-operated low-power devices (such as wireless sensor networks), energy is among the most critical resources. In this section, we experimentally validate that the flexible trade-off between security level and communication overhead enabled by vCCM, an nvAE scheme, does indeed translate to measurable energy savings. Our experiment and projections are based on a simple communication scenario where one device always acts as a transmitter and another device as a receiver. We measure and compare the energy consumption of the sending device when using CCM and vCCM, respectively, to encrypt data and transmit it using a sub-GHz transceiver.
Communication scenario. In our scenario, the sender transmits two types of messages, "non-critical" and "critical", periodically. The "non-critical" messages are sent frequently and are assumed to require a lower level of protection. The "critical" messages are sent sporadically and are assumed to require a higher level of security. The sender does not receive any transmissions. This scenario is a simplified model of numerous applications where wireless sensor nodes regularly sense and report on their environment, as for example temperature sensors in smart building systems, parking-lot sensors reporting occupancy, manufacturing-line monitoring sensors in a predictive maintenance system, environmental sensors for prediction of avalanches and so on. Here, the non-critical messages correspond to the sensing data, where the impact of corruption of data due to an occasional forgery is typically low (in the sense of risk management). The critical messages correspond to control traffic (such as reporting a permanent shutdown due to a drained battery), where sporadic forgeries may have a significant impact. Thus, stretching the non-critical ciphertexts less than the critical ones is appropriate, from the perspective of a cost-security trade-off.
Experimental Setup. Our setup consists of two embedded HW platforms, a sender and a receiver. The sender is a custom low-power embedded platform designed at CSEM, called Wisenode VXI (WN), with the nRF52840 SoC by Nordic Semiconductor [nrf] as the main micro-controller unit (MCU) and the Sx1261 transceiver by Semtech [sx1] used for wireless communication. The receiver is a Raspberry Pi 3B board [rpi] extended with the LoRa/GPS hat by Dragino [dra]. For the transmission, we send raw packets using the LoRa PHY modulation. The HW platforms have been selected as follows. The sender platform uses an MCU well known for its low power consumption, and the LoRa radio is especially well suited to minimizing the energy overhead of transmission. This would be a natural choice for an application that has low-power requirements. The receiver platform is based on widely available off-the-shelf components and relies on open-source libraries; it serves as a reference, and indirectly as a validation of the communication stack of the sender.
To determine the energy consumption of the sender, we powered it with a lab power supply at 3V and measured the instantaneous current drawn from the power supply with a Keysight CX3324A Device Current Waveform Analyzer, using a passive 0Ω probe at a sampling frequency of 100MHz.
Implementation. The sender runs CSEM's custom embedded real-time, multiprocess operating system µ111 [MFI94,MFG99], which implements most of the embedded and process synchronization primitives, such as semaphores and precise timers, and is designed specifically for low-power applications. Thanks to these features, µ111 lends itself well to the task at hand. For the experiment, a driver for the Sx1261 radio has been integrated in the OS. The application code uses a hardware timer, run in one process, to precisely schedule data encryption and transmissions in a second, synchronized process. All unneeded peripherals are disabled, and the MCU and the radio are in sleep mode between transmissions. The receiver runs Raspbian, the native, Linux-based OS for Raspberry Pi, with the default driver for the Dragino LoRa hat. Both the sender and the receiver use the same implementation of vCCM, based on the CCM implementation of mbedTLS [mbe].

Measurements. We carry out an experiment consisting of 10 measurements. In each measurement, we capture the instantaneous current drawn by the sender while it repeatedly encrypts and sends a payload every 10 seconds, acquiring data from 70 transmissions. Such a measurement is done for each combination of a plaintext length (4 or 16 bytes) and an AE-scheme/tag-length pair (CCM with 8-byte tags, CCM with 16-byte tags, vCCM with 4-byte tags, vCCM with 8-byte tags, and vCCM with 16-byte tags). In a post-processing script, we isolate the consumption "peak" corresponding to the entire wake-up time of the sender during each of the 70 transmissions. This includes the consumption due to encryption, transmission and the OS-incurred processing overhead. In each measurement, we compute the average duration and the average energy consumed per single transmission "peak". These average values are displayed in Figure 12 and visualized in Figure 11.
To compute a projected energy consumption, we assume the sender transmits a non-critical message every 10 seconds, and a critical message once every minute. When using vCCM, the non-critical ciphertexts are stretched less than the critical ciphertexts; we therefore consider several combinations of a plaintext length and a pair of tag lengths. When using CCM, we treat all plaintexts with the longer of the two tag lengths. The approximate energy $E_{\mathrm{CCM}}(T)$ consumed by a T-second long operation of the sender using the standard CCM is computed as shown in (2), where $\bar{I}$ denotes the average current drawn during a transmission (it can be computed from the values in Figure 12) and $\bar{t}$ denotes the average duration of the wake-up state per transmission. The formula assumes an optimal idle state current consumption of 18 µA and a voltage of 3V.
$$E_{\mathrm{CCM}}(T) = 3 \cdot \left[ \left(T - \frac{T}{10} \cdot \bar{t}\right) \cdot 18 \times 10^{-6} + \frac{T}{10} \cdot \bar{t} \cdot \bar{I} \right] \ \mathrm{J}. \qquad (2)$$
The approximate energy $E_{\mathrm{vCCM}}(T)$ consumed by a T-second long operation of the sender using vCCM is computed as shown in (3), where $\bar{I}_1$ and $\bar{I}_2$ denote the average current drawn during the transmission of the non-critical and critical messages, respectively, and $\bar{t}_1$ and $\bar{t}_2$ denote the average duration of the wake-up state per transmission for non-critical and critical messages, respectively. The formula again assumes an optimal idle state current consumption of 18 µA and a voltage of 3V.
By evaluating the formulas (2) and (3) using the experimental results from Table 12 and by setting the time T = 3600s (1 hour), we obtain the results displayed in Table 13, and visualized in Figure 14 for T ≤ 3600s.
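The projection can be sketched in a few lines of Python. This is a minimal sketch, not the authors' script: `energy_ccm` follows formula (2) with the 3V supply voltage applied to both the idle and the wake-up term, while `energy_vccm` only illustrates the shape of formula (3) under the assumption that one transmission slot per minute carries a critical message in place of a non-critical one. All function and parameter names are hypothetical, and any currents and durations passed in must come from the measured averages.

```python
IDLE_CURRENT_A = 18e-6   # idle-state current assumed in the text (18 µA)
VOLTAGE_V = 3.0          # supply voltage assumed in the text (3 V)


def energy_ccm(T, t_bar, i_bar, period_s=10.0):
    """Projected energy in joules over T seconds with a single tag length.

    t_bar: average wake-up duration per transmission (seconds)
    i_bar: average current drawn during the wake-up state (amperes)
    """
    awake_s = (T / period_s) * t_bar          # total time spent awake
    idle_s = T - awake_s                      # remaining time spent idle
    return VOLTAGE_V * (idle_s * IDLE_CURRENT_A + awake_s * i_bar)


def energy_vccm(T, t1, i1, t2, i2, period_s=10.0, critical_period_s=60.0):
    """Projected energy in joules over T seconds with two tag lengths.

    (t1, i1): averages for non-critical messages (shorter tag)
    (t2, i2): averages for critical messages (longer tag)
    ASSUMPTION: a critical message replaces one non-critical message in
    every critical_period_s window; the paper's formula (3) may differ.
    """
    n_critical = T / critical_period_s
    n_noncritical = T / period_s - n_critical
    awake1_s = n_noncritical * t1
    awake2_s = n_critical * t2
    idle_s = T - awake1_s - awake2_s
    return VOLTAGE_V * (idle_s * IDLE_CURRENT_A + awake1_s * i1 + awake2_s * i2)


def saving_percent(e_ccm, e_vccm):
    """Relative energy saving of vCCM over CCM, in percent."""
    return 100.0 * (e_ccm - e_vccm) / e_ccm
```

With identical averages for both message types, `energy_vccm` reduces to `energy_ccm`, which is a quick sanity check for the two-tag formula.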
From Table 13, we see that we obtain an overall energy saving of 8% when using 4B/8B tags with vCCM instead of 8B tags with CCM, 21% when using 4B/16B tags with vCCM instead of 16B tags with CCM, and 14% when using 8B/16B tags with vCCM instead of 16B tags with CCM. This shows that an optimization as simple as adjusting the tag length in communication scenarios with modest frame sizes leads to a noticeable decrease in energy consumption. The relevance for practical applications is even higher because these savings come almost for free, owing to the simplicity of the black-box transform underlying vCCM.
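The post-processing step described under Measurements, isolating each wake-up "peak" in the current trace and averaging its duration and energy, could look roughly as follows. This is a sketch under stated assumptions: peaks are detected with a plain threshold on the sampled current, energy is integrated with a rectangle rule, and the function names, the threshold and the sample layout are all hypothetical rather than taken from the authors' scripts.

```python
from typing import List, Tuple


def find_peaks(samples: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of runs above threshold."""
    peaks = []
    start = None
    for i, current in enumerate(samples):
        if current > threshold and start is None:
            start = i                      # a wake-up period begins
        elif current <= threshold and start is not None:
            peaks.append((start, i))       # the wake-up period ends
            start = None
    if start is not None:                  # trace ended while still awake
        peaks.append((start, len(samples)))
    return peaks


def peak_stats(samples: List[float], sample_rate_hz: float,
               voltage: float, threshold: float) -> Tuple[float, float]:
    """Average duration (s) and average energy (J) per wake-up peak."""
    dt = 1.0 / sample_rate_hz
    peaks = find_peaks(samples, threshold)
    if not peaks:
        raise ValueError("no peak above threshold in trace")
    durations = [(end - start) * dt for start, end in peaks]
    # Rectangle-rule integration of P(t) = V * I(t) over each peak.
    energies = [voltage * sum(samples[start:end]) * dt for start, end in peaks]
    return sum(durations) / len(peaks), sum(energies) / len(peaks)
```

A real trace sampled at 100MHz is noisy, so a practical script would likely smooth the signal or add hysteresis before thresholding; the sketch omits this for brevity.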

Yoga Is Not For Everyone
In Sections 4 and 5 we show that it is possible for an nAE secure scheme to be black-box transformed into an nvAE secure scheme. This, together with the potential for substantial energy savings achievable with the help of nvAE schemes (shown in Section 6), raises the following questions.

Does this transformation work for all nAE schemes?
This question has previously been answered negatively by Reyhanitabar, Vaudenay and Vizár, who showed that an entire class of nAE secure schemes succumbs to a non-trivial nvAE forgery attack when they are transformed into nvAE schemes by encoding the tag length into their nonce, or into their AD, or even into both their nonce and their AD simultaneously [RVV16]. Moreover, this class of AE schemes includes OCB [RBBK01,Rog04,KR11] and GCM [MV04], two very well-known and widely used constructions, which further highlights CCM as the target whose transformation has the best potential for immediate real-world impact.

Does this transformation work for any other nAE schemes?
We confidently conjecture an affirmative response to this question. The insight learned from the forgery attack presented by Reyhanitabar et al. (and in a less general version by the Ascon team in the CAESAR competition before that [Eic]) is, informally speaking, that for an nvAE construction to be secure, each call to the underlying cryptographic primitive must have a stretch-dependent input. The black-box-transformed OCB and GCM do not have this property because the processing of their AD blocks is parallel and independent of the nonce. Interestingly enough, the sequential nature of CCM's authentication tag computation, sometimes criticized for its inefficiency [RW03], provides nvAE security with the nonce-based transform: the nonce is part of the input to the first blockcipher call in the plain CBC MAC, which propagates the "influence" of the stretch throughout the computation of the authentication tag. This observation immediately suggests sequential modes as good candidates for the nAE-to-nvAE transform from Section 4. In particular, sponge-based AE modes can be safely conjectured to enjoy nvAE security with the tag length encoded in the nonce. The informal proof sketch is that, thanks to the reduction to a duplex object [BDPA11,MRV15] (essentially a blockwise, stateful PRF), queries with different tag lengths will have all internal sponge states sampled "independently", which yields the kess property. A formal proof is required to confirm this intuition, however. Moreover, sponge variants that cannot be reduced to a keyed duplex, such as Ascon [DEMS], will require dedicated proofs.
Reyhanitabar et al. [RVV16] also presented vΘCB, an nvAE secure variant of OCB, where the tag length has been included as a tweak component for the underlying tweakable blockcipher, in order to obtain the kess property. This suggests tweakable blockcipher-based nAE modes as candidates for the nAE-to-nvAE transform from Section 4, though this class of AE schemes seems to be without a significant representative construction.
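To make the nonce-based transform discussed above concrete, the following sketch wraps an existing fixed-stretch encryption routine so that the tag length becomes part of the nonce it sees. The encoding used here (a single leading byte carrying the tag length in bytes) is an illustrative assumption and not necessarily the encoding specified for vCCM in Section 4; the `NaeEncrypt` signature, the function names and the allowed tag lengths are likewise hypothetical.

```python
from typing import Callable

# Hypothetical signature of an existing nAE encryption routine:
# (key, nonce, associated_data, plaintext, tag_len_bytes) -> ciphertext || tag
NaeEncrypt = Callable[[bytes, bytes, bytes, bytes, int], bytes]

# Tag lengths (in bytes) used in the experiment of this paper.
ALLOWED_TAG_LENGTHS = (4, 8, 16)


def encode_nonce(user_nonce: bytes, tag_len: int) -> bytes:
    """Prefix the user nonce with one byte encoding the tag length.

    ASSUMPTION: illustrative encoding only. Because the prefix has a fixed
    length, distinct (nonce, tag length) pairs map to distinct encodings.
    """
    if tag_len not in ALLOWED_TAG_LENGTHS:
        raise ValueError("unsupported tag length")
    return bytes([tag_len]) + user_nonce


def nvae_encrypt(nae_encrypt: NaeEncrypt, key: bytes, user_nonce: bytes,
                 ad: bytes, plaintext: bytes, tag_len: int) -> bytes:
    """Variable-stretch encryption as a black-box call to nae_encrypt."""
    return nae_encrypt(key, encode_nonce(user_nonce, tag_len), ad,
                       plaintext, tag_len)
```

The wrapper never inspects the internals of the underlying scheme, which is what allows an existing CCM implementation to be reused as-is. Note that the user-visible nonce becomes one byte shorter than the nonce accepted by the underlying scheme.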

Discussion
We have presented the first nvAE secure scheme that is obtained as a truly black-box transform of an existing AEAD construction, the latter being perhaps the most widely supported AEAD standard on embedded computational platforms. We have then experimentally confirmed that the use of such a scheme is of practical interest and brings measurable improvements in efficiency.
One important question, which has been addressed only partially in the existing literature, is the resistance of CCM to multiple forgeries; i.e., how difficult is it to mount a forgery with a τ-bit tag given that the adversary has already succeeded in making one or more forgeries? This is especially relevant when considering extreme tag lengths (below 32 bits, used e.g. in the Bluetooth standard), which can be meaningful for certain applications. Forler et al. indicate that CCM does resist such attacks in the nonce-respecting setting [FLLW17], suggesting vCCM may be used with extremely short tags. However, it is necessary to integrate the reforgeability and the nvAE security notions and to investigate the corresponding relations among notions in order to fully analyze the impact of simultaneous tag variations and reforgeries on the security of vCCM and other nvAE schemes.
Another interesting open problem is to identify which NIST LWC candidates are eligible for an nvAE black-box transform, and to define and analyze these transforms, in order to compensate for the lack of consideration for variable stretch in the standardization project.