Deck-Based Wide Block Cipher Modes and an Exposition of the Blinded Keyed Hashing Model

We present two tweakable wide block cipher modes from doubly-extendable cryptographic keyed (deck) functions and a keyed hash function: double-decker and docked-double-decker. Double-decker is a direct generalization of Farfalle-WBC of Bertoni et al. (ToSC 2017(4)), and is a four-round Feistel network on two arbitrarily large branches, where the middle two rounds call deck functions and the first and last rounds call the keyed hash function. Docked-double-decker is a variant of double-decker where the bulk of the input to the deck functions is moved to the keyed hash functions. We prove that the distinguishing advantage of the resulting wide block ciphers is simply two times the sum of the pseudorandom function distinguishing advantage of the deck function and the blinded keyed hashing distinguishing advantage of the keyed hash functions. We demonstrate that blinded keyed hashing is more general than the conventional notion of XOR-universality, and that it allows us to instantiate our constructions with keyed hash functions that have a very strong claim on bkh security but not necessarily on XOR-universality, such as Xoofffie (ePrint 2018/767). The bounds of double-decker and docked-double-decker are moreover reduced tweak-dependent, informally meaning that collisions on the keyed hash function for different tweaks only have a limited impact. We describe two use cases that can exploit this property opportunistically to get stronger security than what would be achieved with prior solutions: SSD encryption, where each sector can only be written to a limited number of times, and incremental tweaks, where one includes the state of the system in the variable-length tweak and appends new data incrementally.


Introduction
Block ciphers have long been the main building block for symmetric cryptography. However, block ciphers operate on data of fixed and predetermined length. For example, DES had a block size of 64 bits and AES has a block size of 128 bits. One can encrypt data of variable length by using a block cipher in a mode of operation, such as counter mode, CBC or OFB.
Security of such modes typically depends on a nonce. This might be a random value that should never be repeated or a counter. In some applications it is desirable to have security even in the absence of a nonce or in the case of accidental nonce violation. For example, with full disk encryption the nonce serves as the sector index. When the content of a sector changes, it is re-encrypted with the same sector index as nonce. Consider an adversary that may have access to the encrypted sector before and after the change. In the case of stream encryption, the adversary can derive from this the bitwise difference of the plaintexts: P ⊕ P′ = C ⊕ C′. If CBC were used, equality of the first blocks would still leak. A solution would be to store a counter in the sector that increments with each re-encryption, but often it is undesirable to have a difference in sector size with and without encryption. Another example is the Tor protocol for anonymity. It enciphers the payload in network packets iteratively with different keys. The encryption should be length-preserving and there is no place for a nonce. For these cases, one can use a tweakable wide block cipher. This encrypts arbitrarily large strings in such a way that each bit of the ciphertext depends on each bit of the plaintext and vice versa. If the plaintext changes, the ciphertext will look completely random, even if the change is just a single bitflip.
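To make the leakage concrete, here is a minimal Python sketch of the stream-encryption case, with SHAKE128 as a stand-in keystream generator (function and parameter names are illustrative, not from any particular cipher): reusing the sector index as nonce reveals the bitwise difference of the two plaintexts.

```python
import hashlib

def stream_encrypt(key, nonce, plaintext):
    # keystream from an XOF; stands in for any stream cipher keyed by (key, nonce)
    keystream = hashlib.shake_128(key + nonce).digest(len(plaintext))
    return bytes(p ^ k for p, k in zip(plaintext, keystream))

key, nonce = b"k" * 16, b"sector-0042"  # the sector index doubles as the nonce
p1 = b"old sector contents!"
p2 = b"new sector contents!"
c1 = stream_encrypt(key, nonce, p1)
c2 = stream_encrypt(key, nonce, p2)

# the ciphertext difference equals the plaintext difference: P xor P' = C xor C'
assert bytes(a ^ b for a, b in zip(c1, c2)) == bytes(a ^ b for a, b in zip(p1, p2))
```

A tweakable wide block cipher avoids this: re-encrypting a changed sector under the same tweak yields a ciphertext that looks unrelated to the previous one.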
The idea of wide block ciphers is gaining traction: the latest contribution to the field is Adiantum [CB18], a mode that is specifically developed for disk encryption but still fares very well as a general wide block cipher mode. Adiantum follows a structure reminiscent of a Feistel network, but asymmetric. It uses two branches: an arbitrary-length branch for bulk encryption, and a fixed-length branch that is processed using AES and that is used as a seed for encryption of the arbitrary-length branch. A description of Adiantum is given in Appendix A. Adopting arbitrary-length branches appears to be a fruitful direction, but decryption requires evaluation of the inverse of AES. Other notable constructions are HEH [NR97], EME [HR04], HCTR [WFW05], AEZ [HKR15], FAST [CGLS17] and Tweakable HCTR [DN18]. We refer to Table 1 for an overview of these modes.

Deck-Based Wide Block Cipher Modes
In this work, we formalize and analyze two similar deck-based tweakable wide block cipher modes: double-decker and docked-double-decker. Double-decker is an immediate generalization of Farfalle-WBC, a wide block cipher construction proposed by Bertoni et al. [BDH + 17]. Double-decker follows the well-established Feistel design, but unlike all previous instances, it is based on two arbitrary-length branches and is built upon two arbitrary-length primitives called doubly-extendable cryptographic keyed (deck) functions. Deck functions, a notion of Daemen et al. [DHVV18], are cryptographic primitives that have arbitrary-length input and output. In this light, they are the natural building block in our wide block cipher mode whose two branches are both arbitrary-length. Double-decker, as such, consists of two Feistel rounds of deck functions surrounded by two Feistel rounds of a keyed hash function.
We also introduce docked-double-decker, a variant of double-decker. It moves the bulk of the input of the deck functions to the keyed hash functions. This illustrates the flexibility that the Feistel structure provides: it does not matter which function absorbs inputs, as long as all input is absorbed by a cryptographic function. Moreover, the input to the deck functions becomes fixed length, as long as the tweak has a fixed length as well. This allows one to conceptually view the deck functions as stream ciphers in this case.
Nonetheless, there is more to the deck-based modes. As deck functions support multiple strings as input, we feed a tweak to the inner rounds. This leads to the actual constructions of Section 4: double-decker depicted in Figure 2 and docked-double-decker depicted in Figure 3.

Blinded Keyed Hashing Model
Security of cryptographic constructions based on keyed hash functions often relies on the universality of the hash function. For example, in the analyses of the schemes listed before, the keyed hash functions are assumed to be ε-XOR-universal for some small value of ε. However, not all keyed hash functions fit well in this model. An example of such function is Xoofffie, whose security claim [DHP + 18, Claim 2] uses a different security notion. Based on this claim, we introduce a more general security model for keyed hash functions: blinded keyed hashes (bkh).
In a nutshell, bkh does not look at single pairs of inputs at a time, as XOR-universality does, but instead looks at a whole list of queries at the same time. As not every pair of inputs is as bad as the worst case, this formalization allows for a more fine-grained security analysis and allows one to improve the security claim in some cases.
We show that an ε-XOR-universal function has a bkh advantage of at most (q choose 2)·ε, but the bound is not necessarily tight. We demonstrate the power of bkh based on Xoofffie. In detail, in Section 3.2 we show that the security claim for constructions based on Xoofffie increases from 64 bits to 128 bits in the common case, just by changing the security model from XOR-universality to bkh. This gain propagates through the mode, and hence our analyses of double-decker and docked-double-decker will be based on the assumption that the keyed hash functions are sufficiently bkh secure.

Security
In Section 5 we prove that a simplified security bound for the deck-based wide block cipher modes is of the form

2·Adv^prf(σ) + 2·Σ_W Adv^bkh(q_W, σ_W) , (1)

where σ is the total data complexity, σ_W is the data complexity with tweak W, q_W the number of queries with tweak W, Adv^prf is the prf-advantage of the deck function and Adv^bkh is the bkh-advantage of the keyed hash function. Our proof is based on ideas of Iwata and Kurosawa [IK02]. However, a difference is that we base our analysis on the bkh-security of the keyed hash functions rather than their XOR-universality.

Reduced Tweak-Dependence
Noting that the deck-based modes have an n-bit keyed hash function, a naive security bound would have been of the form

2·Adv^prf(σ) + 2·Adv^bkh(q, σ) . (2)

In particular, the loss of the keyed hash function is Adv^bkh(q, σ) directly. This is indeed the case for most existing tweakable wide block ciphers, including Adiantum. However, the deck-based modes are tweakable wide block ciphers, and the tweak turns out to allow for notable improvement of the bound. Different tweaks separate the domain, hence the underlying deck function should ideally produce independent outputs resulting in independent permutations. As given in Section 1.3, the security bound is of the form in (1). The bound of (1) significantly improves over the naive bound of (2) if the maximum number of tweak repetitions is limited. For example, if the cipher is called for q/ℓ tweaks W, each tweak is used ℓ times, and the hash is a traditional ε-XOR-universal function with Adv^bkh(q, σ) = (q choose 2)·ε (see Proposition 1), the loss on the keyed hash function in (1) is of the form

(q/ℓ)·(ℓ choose 2)·ε ≤ qℓε/2 , (3)

provided that ℓ divides q. Examples of such functions that would fulfill this include GHASH [NIS07] and Poly1305 [Ber05]. Moreover, if every tweak is used at most once, so ℓ = 1, we see that the mode has no security loss to the keyed hash functions. In this way our modes compare favorably with most other tweakable wide block ciphers (see also Table 1). Of course, queries for different tweaks still influence the first term of (1). Some applications can take advantage of (3) and use different tweaks in the deck-based modes. We demonstrate this in detail in Section 6 via two use cases. One use case is in the context of disk encryption, in Section 6.1. The use case sets physical SSD sector numbers as tweaks, and relies on the fact that the number of write operations to particular SSD sectors is physically limited [BD10].
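As an illustration of this effect, the following back-of-the-envelope calculation (with purely illustrative parameters, not taken from the use cases below) compares the naive (q choose 2)·ε-type loss with the per-tweak sum when each tweak is repeated a bounded number of times:

```python
from math import comb, log2

# illustrative assumptions: an eps-XOR-universal hash with eps = 2^-127,
# q = 2^40 queries, and each tweak repeated at most 2^6 times
eps = 2.0 ** -127
q = 2 ** 40
reps = 2 ** 6

naive_loss = comb(q, 2) * eps                       # ~ q^2 * eps / 2, bound (2) style
per_tweak_loss = (q // reps) * comb(reps, 2) * eps  # sum over q/reps tweaks, bound (1) style

assert per_tweak_loss < naive_loss  # roughly 2^-82 versus 2^-48 here
```

The gap is a factor of about q/ℓ, so the more tweaks are spread out, the smaller the keyed-hash loss becomes.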
In the use case detailed in Section 6.1, an adversary has an advantage of at most 2^46·ε on the hash functions with the deck-based modes, whereas it would be of the order 2^74·ε for the typical security bound that is met by, for example, Adiantum. Here we assume that the differentially uniform function is a traditional ε-XOR-universal function. The second use case is on incremental tweaks, in Section 6.2. Deck functions have the pleasant property that inputs consist of a sequence of an arbitrary number of strings, and in addition, that the inputs are incremental: appending a string to the sequence costs as much as just processing the new string. One can therefore use our deck-based modes in a stateful manner, where the tweak depends on all or most of the prior messages exchanged. This way, each new evaluation of the mode is performed for a new tweak, collisions among hash function evaluations for the same tweak cannot occur (as one needs at least two evaluations), and the loss on the keyed hash function, the second term in (1), vanishes.

Comparison with Prior Solutions
Various tweakable wide block ciphers have appeared over the past years. The most notable examples include HCTR [WFW05], HEH [NR97], FAST [CGLS17], Tweakable HCTR [DN18] and most recently Adiantum [CB18]. A comparison among these ciphers, double-decker and docked-double-decker can be found in Table 1. Most of these constructions resemble a Feistel network. However, (Tweakable) HCTR and Adiantum use a block cipher on a branch. The size of this branch is bounded by the size of a typical block cipher: 128 bits. As the birthday bound applies in some use cases, this limits their security to 64 bits. Furthermore, they require the use of the inverse block cipher, incurring additional implementation cost for most block ciphers. Double-decker and docked-double-decker do not have this limitation, without requiring extra passes over the bulk of the data.
Furthermore, double-decker and docked-double-decker have the nice security property of reduced tweak-dependence (see Section 1.4). This means that collisions in the keyed hash functions do not reduce the security as long as different tweaks were used. This feature is not fulfilled by many earlier schemes; among the ones listed, only by Tweakable HCTR.
To evaluate the efficiency of the deck-based modes, we need to look into instantiations with concrete deck functions and keyed hash functions. Recently, such functions have been proposed: Kravatte and Short-Kravatte in [BDH+17], and Xoofff and Xoofffie in [DHVV18, DHP+18]. The former are built on the 1600-bit permutation Keccak-p with 6 rounds and the latter on the 384-bit permutation Xoodoo. For the latter, [DHVV18, Table 5] gives performance numbers. On ARM Cortex M0 and M3, representative of low-end CPUs, Xoofff and Xoofffie process (long) input or output about 4 times faster than AES. So a Xoofff call with N bits of input and M bits of output is about four times as fast as applying AES-128 CBC-MAC to an N-bit input and then AES-128 in counter mode generating an M-bit keystream.
On high-end CPUs like Intel Skylake the presence of dedicated AES instructions makes the difference between Xoofff and AES smaller, but still on the recent Intel SkylakeX, Xoofff beats AES in speed.
Of course Xoofff has not gone through the amount of scrutiny that Rijndael/AES has, but these performance numbers indicate that there is great potential should Xoofff and Xoofffie turn out to stand up to their security claims, which are actually much more ambitious than those of AES.

Table 1: Comparison of double-decker and docked-double-decker with the state of the art. N is the length of the message, n is a security parameter (e.g., 128 or 256). The cost of keyed hash functions, stream ciphers (SC) and deck functions is displayed as the number of processed bits plus the number of generated bits. The cost of (tweakable) block ciphers ((T)BC) is displayed as only the number of processed bits (or equivalent in the case of AEZ). Refer to Section 1.5 for more information about the properties "inverse free" and "reduced tweak-dependence".

Related Work
Luby and Rackoff considered the Feistel construction with a fixed-length independent pseudorandom function in each round [LR88]. They proved that three rounds are sufficient to obtain a pseudorandom permutation that can be used in the forward direction, and four rounds are sufficient to obtain a strong pseudorandom permutation that can be used in both the forward and inverse direction. A large amount of research has been aimed at reducing the requirements for the underlying primitives. One way to do so is by reducing the number of independent primitives [Pie90,Pat92]. Another avenue is in replacing cryptographically strong pseudorandom functions by (weaker) universal hash functions. The idea was first suggested by Lucks [Luc96] and was further investigated in [PRS02,IK02]. Particularly relevant to our work is the analysis of Iwata and Kurosawa [IK02] that considered a four-round Feistel construction where the two internal rounds are based on a single pseudorandom function and the two outer rounds on two independent universal hash functions with certain conditions.
The above-mentioned results achieve n/2-bit security, where n is the size of the branches. Patarin proved security beyond n/2 bits for the Feistel construction with 5 or more rounds [Pat98,Pat03,Pat04]. We refer to Nachef et al. [NPV17] for a detailed discussion of the security of multiple-round Feistel networks.
One can consider length doublers as a specific type of wide block ciphers. Length doublers use a block cipher with block size n, and transform it to an encryption primitive that allows encryption of strings of length between n and 2n − 1. Due to their flexibility, they suit well as building blocks for arbitrary-length (authenticated) encryption. Ristenpart and Rogaway [RR07] introduced the concept in 2007 and presented XLS, a construction later broken by Nandi [Nan14]. Other length doublers are DE by Nandi [Nan09] and HEM by Zhang [Zha12], both based on block ciphers and both birthday-bound secure in the block size of the primitive, and LDT by Chen et al. [CLMP17,CMN18], which is secure beyond n/2 bits but based on a tweakable block cipher.

Security Model
Our schemes will be parameterized by a security parameter n. This security parameter restricts messages to be at least 2n bits long in our mode. For technical reasons we also limit the maximum message size to some natural number L. This allows us to more easily define a security model, as it is simpler to randomly draw from a finite set. We define message space M = {P ∈ {0,1}^* | 2n ≤ |P| ≤ L}, key space K, and tweak space W, and consider a tweakable wide block cipher

E : K × W × M → M .

For fixed key K, we write E_K = E(K, ·, ·). We require it to preserve length (i.e., |E_K(W, P)| = |P| for any (K, W, P) ∈ K × W × M). We also require it to be invertible for fixed (K, W) ∈ K × W and denote its inverse with abuse of notation by E_K^−1. For a tweakable wide block cipher E with security parameter n and maximum message size L, the distinguishing advantage of an adversary D is

Adv(D) = | P[K ←$ K : D^{E_K, E_K^−1} = 1] − P[π ←$ TPerm(M) : D^{π, π^−1} = 1] | ,

where TPerm(M) is the set of all length-preserving tweakable permutations over the message space M, i.e., the set of all π : W × M → M such that |π(W, P)| = |P| for all (W, P) ∈ W × M and π(W, ·) invertible for any fixed W ∈ W. Again, we denote its inverse with abuse of notation by π^−1.

Deck Functions
We adopt a simplification of the definition of a deck function of Daemen et al. [DHVV18]. A deck (doubly-extendable cryptographic keyed) function F takes as input a secret key K ∈ K_F and two arbitrarily long strings W, X ∈ {0,1}^*, produces a potentially infinite string of bits, and takes from it the range starting from a specified offset q ∈ N and of a specified length n ∈ N. We denote this selection by F_K(W, X)[q : q + n]. As a deck function has an arbitrarily long input and a potentially infinite output, a powerful security definition is that it should behave like a pseudorandom function (prf). This leads to the following security definition for a deck function F:

Adv^prf_F(σ) = max_D | P[K ←$ K_F : D^{F_K} = 1] − P[G ←$ PRF[L] : D^G = 1] | ,

where PRF[L] is the set of all pseudorandom functions with two arguments with maximum size L, i.e., the set of all G : {0,1}^* × {0,1}^* → {0,1}^L with inputs and outputs of size at most L, where the maximum is taken over distinguishers with total data complexity σ = Σ_i σ^(i), with σ^(i) the data complexity of query i, and where L = max_i σ^(i) is the maximum size of the inputs and outputs.
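As a toy illustration of the deck-function interface (not a real deck function, and not the authors' notation), SHAKE128 can stand in for the arbitrary-input, extendable-output behavior:

```python
import hashlib

def deck(K, W, X, offset, length):
    # stand-in "deck function": SHAKE128 over a length-prefixed encoding of
    # (K, W, X); the XOF output plays the role of the infinite output stream
    enc = (len(K).to_bytes(2, "big") + K +
           len(W).to_bytes(4, "big") + W + X)
    stream = hashlib.shake_128(enc).digest(offset + length)
    return stream[offset:offset + length]
```

Requesting a later output range is consistent with earlier requests, mirroring the extendable-output property: `deck(K, W, X, 0, 16) + deck(K, W, X, 16, 16)` equals `deck(K, W, X, 0, 32)`.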

Differentially Uniform Hash Functions
A keyed hash function H = {H_K : {0,1}^* → {0,1}^n | K ∈ K_H} is called ε-XOR-universal if for all distinct X, X′ ∈ {0,1}^* and all Y ∈ {0,1}^n,

P[K ←$ K_H : H_K(X) ⊕ H_K(X′) = Y] ≤ ε .

The value ε is typically of the form 2^(a−n) for a small value a.
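For concreteness, a classic example of such a function is multiplication by the key in GF(2^128): since H_K(X) ⊕ H_K(X′) = K·(X ⊕ X′), any fixed nonzero difference maps to a given Y for exactly one key, so ε = 2^−128. A minimal sketch (field polynomial as in GCM-like constructions; inputs are 128-bit integers; for illustration only):

```python
POLY = (1 << 128) | 0x87  # x^128 + x^7 + x^2 + x + 1

def gf_mul(a, b):
    # carry-less multiplication in GF(2^128) with reduction by POLY
    r = 0
    for _ in range(128):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a >> 128:
            a ^= POLY
    return r

def H(K, X):
    # H_K(X) = K * X in GF(2^128), a 2^-128-XOR-universal keyed hash
    return gf_mul(K, X)
```

The key property is GF(2)-linearity: `H(K, X) ^ H(K, Xp)` equals `H(K, X ^ Xp)`, so the collision probability over a random key is governed only by the input difference.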

H-Coefficient Technique
Our proof will rely on the H-coefficient technique by Patarin [Pat91,Pat08], but we will follow the modernization of Chen and Steinberger [CS14]. Consider two oracles O and P, and any distinguisher D that has query access to either of these oracles. The distinguisher's goal is to distinguish both worlds and we denote its advantage by

Adv(D) = | P[D^O = 1] − P[D^P = 1] | .

If we denote the maximum number of queries by q, we can define a transcript τ which summarizes all query-response tuples seen by the distinguisher during its interaction with its oracle O or P. Denote by X_O (resp., X_P) the probability distribution of transcripts when interacting with O (resp., P). We call a transcript τ ∈ T attainable if P[X_P = τ] > 0, or in other words if the transcript τ can be obtained from an interaction with P.
Lemma 1 (H-coefficient technique). Consider a fixed deterministic distinguisher D. Define a partition T = T_good ∪ T_bad, where T_good is the subset of T which contains all the "good" transcripts and T_bad is the subset with all the "bad" transcripts. Let 0 ≤ ε ≤ 1 be such that for all τ ∈ T_good:

P[X_O = τ] / P[X_P = τ] ≥ 1 − ε .

Then, we have Adv(D) ≤ ε + P[X_P ∈ T_bad].
Conventionally, O corresponds to the real world (in our case E K , E −1 K ) and P to the ideal world (in our case π, π −1 ). With this lemma, we can prove an upper bound by defining good and bad transcripts. It tells us that the advantage will be small, as long as the good transcripts are almost as likely to occur in the real world as in the ideal world, and the probability that a bad transcript happens in the ideal world is small.

Blinded Keyed Hashes
Consider a keyed hash H = {H_K : {0,1}^* → {0,1}^n | K ∈ K_H}. We do not look at its output directly; instead we first blind it through a random oracle: a function that returns an independent random value for every new input. The keyed hash is called a blinded keyed hash (bkh) if the distinguishing advantage between the following two worlds, with inputs (X, ∆), is small:
• Real world O: RO_1(H_K(X) ⊕ ∆) with a secret key K and a secret random oracle RO_1;
• Ideal world P: RO_2(X, ∆) with a secret random oracle RO_2.
We denote its advantage by Adv^bkh_H(D).

Figure 1: The distinguishability setup for bkh. Left is the real world O and right is the ideal world P.
For a fixed list of queries Q, we define

Adv^bkh_H(Q) = max_{D ∈ D(Q)} Adv^bkh_H(D) ,

where D(Q) is the set of all distinguishers that make the fixed list of queries Q. And for a total number of q queries with data complexity σ,

Adv^bkh_H(q, σ) = max_{D ∈ D(q,σ)} Adv^bkh_H(D) ,

where D(q, σ) is the set of all distinguishers that make at most q queries with data complexity σ = Σ_i |X^(i)| + |∆^(i)|.
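The following toy Python model (with SHAKE128 as a stand-in keyed hash, a lazily sampled random oracle, and illustrative names throughout) implements the real world O and shows that only collisions in the blinded value H_K(X) ⊕ ∆ are observable:

```python
import hashlib, secrets

N = 16  # hash output size in bytes (n = 128)

def H(K, X):
    # stand-in keyed hash, not a concrete bkh function
    return hashlib.shake_128(K + b"|" + X).digest(N)

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

class RandomOracle:
    # lazily sampled random oracle: fresh random output per new input
    def __init__(self):
        self.table = {}
    def __call__(self, inp):
        if inp not in self.table:
            self.table[inp] = secrets.token_bytes(N)
        return self.table[inp]

K, RO1 = secrets.token_bytes(N), RandomOracle()

def real_world(X, delta):
    # O: the hash output is blinded as RO1(H_K(X) xor delta)
    return RO1(xor(H(K, X), delta))

X1, X2 = b"left", b"right"
d1 = bytes(N)
d2 = xor(H(K, X1), H(K, X2))  # engineered so the two blinded inputs collide

# the collision is the only event the distinguisher can observe
assert real_world(X1, d1) == real_world(X2, d2)
```

Without such a collision, the real-world outputs are fresh uniform values, exactly as in the ideal world, which is the intuition behind Lemma 2 below.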

Characterization of the Advantage
The bkh security notion makes use of a random oracle to blind the output of the hash function. This prevents the distinguisher from seeing the output of the hash function. It can only know whether there was a collision in the input to the random oracle, and nothing more. We formally show that this is indeed the case in the following lemma.
Lemma 2. Let Q = ((X^(1), ∆^(1)), . . . , (X^(q), ∆^(q))) be a list of q distinct queries and let bad(Q) be the event that there exist 1 ≤ i < j ≤ q such that H_K(X^(i)) ⊕ ∆^(i) = H_K(X^(j)) ⊕ ∆^(j). Then Adv^bkh_H(Q) = P[bad(Q)].

Proof. Let Q be a list of queries. We prove the equality by proving both corresponding inequalities. Let D be a distinguisher that makes the queries Q. If bad(Q) does not occur, all inputs to the random oracle in the real world are distinct, so its outputs are independent and uniformly random, exactly as in the ideal world. Then

Adv^bkh_H(D) = P[D^O = 1] − P[D^P = 1] ≤ P[bad(Q)] .

As this holds for all distinguishers D that make the queries Q, we find that Adv^bkh_H(Q) ≤ P[bad(Q)]. We now look at the other inequality. For every m ∈ N, we define the distinguisher D_m as follows. It makes the queries Q and asks for the first m output bits of the random oracle. Then it returns 1 if there is a collision in these outputs, and 0 otherwise. For this distinguisher, we have the following two properties. First, it always finds a collision in the real world when bad(Q) occurs, so P[D_m^O = 1 | bad(Q)] = 1. Second, it only falsely finds a collision in the ideal world when there is a collision in the first m output bits of the random oracle. This happens with probability 2^−m for every pair, so P[D_m^P = 1] ≤ (q choose 2)·2^−m, where q is the length of the list. We find that

Adv^bkh_H(Q) ≥ P[D_m^O = 1] − P[D_m^P = 1] ≥ P[bad(Q)] − (q choose 2)·2^−m .

As lim_{m→∞} (q choose 2)·2^−m = 0, this means that Adv^bkh_H(Q) ≥ P[bad(Q)].

Relation to Differentially Uniform Functions
The more common definition of differentially uniform functions given in Section 2.3 is roughly equivalent to the definition of blinded keyed hashes. Proposition 1 shows the relation between ε-XOR-universal and bkh functions. We can reduce any ε to a bkh advantage bound and vice versa, but the ε-XOR-universal definition is more strict. It requires an upper bound ε on the probability of collisions, but not all inputs have to be near this ε, as the probabilities do not have to be uniform. On the other hand, the bkh definition allows for more flexibility as it looks at a list of queries and makes a claim about an attack as a whole. Indeed, some functions have much better guarantees with the bkh definition than the ε-XOR-universal one. For example, Xoofffie has the following security claim [DHP+18, Claim 2]:

Adv^bkh_Xoofffie(q, σ) ≤ q/2^128 + M^2/2^(n−4) ,

where M = σ/384 is the data complexity expressed in the number of input blocks and where n can be chosen variably. If n − 4 is larger than 256, the quadratic term is negligible and Xoofffie claims to have around 128 bits of security. We would not get this level of security in the claim when we look at the traditional ε-XOR-universal definition. By Proposition 1, Xoofffie is ε-XOR-universal with

ε = Adv^bkh_Xoofffie(2, 2·384) ≤ 2/2^128 + 2^2/2^(n−4) ≈ 2^−127 .
Here, we assume that Xoofffie is only called with one-block inputs. If it is called with more input, its ε-XOR-universal security becomes even worse. However, with this definition we approximate the advantage of Xoofffie as q^2·ε ≈ q^2/2^127 ≤ M^2/2^127, which would suggest that Xoofffie only has around 64 bits of security. By using the bkh definition, we can make use of these better security properties.
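The numeric gap between the two viewpoints can be sketched as follows (using the approximate ε derived above; the break-even points are where the respective advantage bounds reach 1):

```python
from math import log2

eps = 2.0 ** -127  # the approximate XOR-universal bound derived above

# XOR-universal view: advantage ~ q^2 * eps, which reaches 1 around q = eps^-0.5
xu_bits = log2(eps ** -0.5)  # ~ 63.5 bits of security

# bkh view: the dominant claim term q / 2^128 reaches 1 only at q = 2^128
bkh_bits = 128

assert xu_bits < bkh_bits
```

The same primitive thus supports roughly twice the security level when its claim is stated in the bkh model instead of via XOR-universality.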
To our knowledge, the only currently known cryptographic primitive with a bkh-model security claim is Xoofffie [DHP+18, Claim 2]. As a bkh claim is more general than a differential-uniformity one, proofs using the bkh model are also valid for the latter. However, conversely, a bkh-model security claim can give a better security bound for high data complexity. In particular, for Farfalle-like keyed hash functions, similar claims can be made when using permutations such as the Ascon permutation [DEMS16].

Proposition 1. Let H be a keyed hash function.
1. If H is ε-XOR-universal, then H is bkh with advantage Adv^bkh_H(q, σ) ≤ (q choose 2)·ε.
2. If H is bkh, then H is ε-XOR-universal with ε = Adv^bkh_H(2, 2l), where l is the maximum allowed query length.
Proof. We prove the two properties separately.
1. As the bkh definition uses a distinguisher between two worlds, we can apply the H-coefficient technique to it. As the distinguishers are computationally unbounded, we can assume without loss of generality that they are deterministic. The transcripts are the queries the distinguisher makes. Let Q be a list of q queries with data complexity σ. It is called bad if bad(Q) holds as in Lemma 2. Otherwise, it is called good. First, we analyze the bad queries. For every 1 ≤ i < j ≤ q we have that

P[H_K(X^(i)) ⊕ ∆^(i) = H_K(X^(j)) ⊕ ∆^(j)] ≤ ε .

Since there are (q choose 2) of these pairs, we find that P[bad(Q)] ≤ (q choose 2)·ε. Now we look at the good queries. If a list of queries Q is good, then there are no internal collisions. This means that all outputs are independent outputs from a random oracle, in both the real and ideal world. In other words, they are indistinguishable in this case, hence Adv^bkh_H(q, σ) ≤ (q choose 2)·ε.
2. Let H be bkh and let l be the maximum allowed query length. For any distinct elements X, X′ ∈ {0,1}^* with |X| + n ≤ l and |X′| + n ≤ l and any element Y ∈ {0,1}^n, we can consider the list of queries Q = ((X, 0^n), (X′, Y)). From Lemma 2 we get

P[H_K(X) ⊕ H_K(X′) = Y] = P[bad(Q)] = Adv^bkh_H(Q) ≤ Adv^bkh_H(2, 2l) .

This means that H is ε-XOR-universal with ε = Adv^bkh_H(2, 2l).

Deck-Based Wide Block Cipher Modes
We introduce two modes: double-decker, directly based on Farfalle-WBC [BDH + 17], and docked-double-decker, a slight modification of double-decker that moves the bulk of the input from the deck functions to the hash functions. Double-decker and docked-double-decker are tweakable wide block ciphers that take as input three keys (K, K 1 , K 2 ) ∈ K = K H × K F × K F , a tweak W ∈ W, and a message P ∈ M, and transform it to a ciphertext C ∈ M. Instead of using two keys K 1 , K 2 ∈ K F , it is also possible to use a single key and apply domain separation between the functions, as that gives two independent functions as well. Both double-decker and docked-double-decker are a generalized four-round Feistel construction based on two independent deck functions F K1 and F K2 and two evaluations of a bkh function H K .
Double-decker is illustrated in Figure 2. Here, the message P is first split into the inputs to the two branches U and V using a split function split. This function split takes as input the length of the message P and outputs the length of the left branch, |U|. The states U and V are further split into U_L, U_R, V_L and V_R with the outer branches of length n, i.e., |U_L| = |V_R| = n. This means that we have the condition n ≤ split(|P|) ≤ |P| − n for all P. In other words, split can be an arbitrary function, as long as it is defined for input strings of at least 2n bits, and its corresponding branches are of length at least n.
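A minimal example of a valid split function, shown in Python with byte-granular lengths (an assumption for simplicity; the construction itself works at bit granularity):

```python
N = 16  # n in bytes for this sketch

def split(p_len):
    # one valid choice: a (near-)balanced split, clamped so that both
    # branches are at least n bytes long
    assert p_len >= 2 * N
    return max(N, min(p_len - N, p_len // 2))
```

Any other function satisfying n ≤ split(|P|) ≤ |P| − n would do equally well; the clamping enforces exactly that condition.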
Docked-double-decker is illustrated in Figure 3. It is similar to double-decker, but with two differences. First, the branch V_L is removed, i.e., split(|P|) = |P| − n. Second, the bulk of the input to F K1, on branch U_R, is moved from the deck function F K1 to the first bkh function H K. This illustrates the flexibility that the Feistel structure provides: it does not matter which function absorbs inputs, as long as all input is absorbed by a cryptographic function. Moreover, the input to the deck functions F K1 and F K2 becomes fixed length, as long as the tweak has a fixed length as well. This allows one to conceptually view the deck functions as stream ciphers in this case.
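The four-round structure can be sketched in Python as follows. This is one plausible reading of the double-decker description (hash of the right branch into U_L, deck calls across the branches, hash of the left branch into V_R), with SHAKE128 standing in for both the keyed hash and the deck functions; it is a sketch under these assumptions, not the authors' exact specification.

```python
import hashlib

N = 16  # n = 128 bits, in bytes

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def kh(K, X):
    # stand-in n-bit keyed hash (not a concrete bkh function like Xoofffie)
    return hashlib.shake_128(b"H" + K + X).digest(N)

def deck(K, W, X, outlen):
    # stand-in deck function; the length prefix keeps (W, X) unambiguous
    enc = b"F" + K + len(W).to_bytes(4, "big") + W + X
    return hashlib.shake_128(enc).digest(outlen)

def split(m):
    return max(N, min(m - N, m // 2))

def encrypt(Kh, K1, K2, W, P):
    s = split(len(P))
    U, V = bytearray(P[:s]), bytearray(P[s:])
    U[:N] = xor(U[:N], kh(Kh, bytes(V)))           # round 1: hash of V into U_L
    V[:] = xor(V, deck(K1, W, bytes(U), len(V)))   # round 2: deck of U into V
    U[:] = xor(U, deck(K2, W, bytes(V), len(U)))   # round 3: deck of V into U
    V[-N:] = xor(V[-N:], kh(Kh, bytes(U)))         # round 4: hash of U into V_R
    return bytes(U + V)

def decrypt(Kh, K1, K2, W, C):
    s = split(len(C))
    U, V = bytearray(C[:s]), bytearray(C[s:])
    V[-N:] = xor(V[-N:], kh(Kh, bytes(U)))         # undo round 4
    U[:] = xor(U, deck(K2, W, bytes(V), len(U)))   # undo round 3
    V[:] = xor(V, deck(K1, W, bytes(U), len(V)))   # undo round 2
    U[:N] = xor(U[:N], kh(Kh, bytes(V)))           # undo round 1
    return bytes(U + V)
```

Decryption simply undoes the four rounds in reverse order, so no inverse of the underlying primitives is needed, matching the inverse-free property discussed earlier.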

Analysis
We prove security of both the double-decker and the docked-double-decker construction of Section 4.

Theorem 1. Let E be either the double-decker construction or the docked-double-decker construction of Section 4. For any distinguisher D making at most q queries with a length of at most L, we have

Adv(D) ≤ Adv^prf_F(σ_F1) + Adv^prf_F(σ_F2) + Σ_W ( Adv^bkh_H(q_W, σ_H1,W) + Adv^bkh_H(q_W, σ_H2,W) ) + Σ_{ℓ∈L,W} (q_{ℓ,W} choose 2)/2^ℓ ,

where σ_F1, σ_F2 are the total data complexities on each F, where σ_H1,W, σ_H2,W are the data complexities on each H with tweak W, and where q_W is the number of queries with tweak W made by distinguisher D. The exact data complexities differ between double-decker and docked-double-decker, as the two constructions distribute the inputs differently over F and H.

Figure 2: Double-decker tweakable wide block cipher.

The proof of Theorem 1 will be given in the remainder of this section. We first use the triangle inequality to transform the deck functions to random oracles G 1, G 2. This happens at a cost of Adv^prf_F(σ_F1) + Adv^prf_F(σ_F2), where σ_F1 and σ_F2 sum the per-query data complexities of the first and second deck function, respectively. The rest of the proof is based on ideas of Iwata and Kurosawa [IK02]. It is performed using the H-coefficient technique outlined in Section 2.4. For this technique, we have to define our transcripts and the way we partition them into bad and good ones. We do this in Section 5.1. The analysis of the bad transcripts is done in Section 5.2, and the analysis of the good ones is done in Section 5.3, which completes the proof.

Double-Decker
Distinguisher D makes q queries to its oracle (O or P) and these queries are summarized in a transcript τ consisting of the tuples (W^(i), P^(i), C^(i)) for 1 ≤ i ≤ q. Note that in the transcript, P^(i) is already parsed into U^(i) and V^(i) implicitly. Additionally, after all the queries are made, the key K is added to the transcript in order to be able to define the bad transcripts. Furthermore, since both worlds are consistent, we can assume without loss of generality that D does not make any pointless queries that encrypt a plaintext or decrypt a ciphertext already queried before.
For every 1 ≤ i ≤ q we define ℓ^(i) = |U^(i)| + |V^(i)| as the length of the query. We also define L = {ℓ^(i) | 1 ≤ i ≤ q} as the set of used lengths, q_W = #{i | 1 ≤ i ≤ q, W^(i) = W} as the number of queries with tweak W, and finally q_{ℓ,W} = #{i | 1 ≤ i ≤ q, ℓ^(i) = ℓ, W^(i) = W} as the number of queries with length ℓ and tweak W.
A transcript is called bad if there exist 1 ≤ i < j ≤ q with W^(i) = W^(j) such that at least one of the following holds: bad 1, the queries i and j yield a collision in the inputs to G 1; or bad 2, the queries i and j yield a collision in the inputs to G 2. Absence of these events implies that a good transcript has no collision in an input to G 1 or G 2.

Docked-Double-Decker
Distinguisher D makes q queries to its oracle (O or P) and these queries are summarized in a transcript τ consisting of the tuples (W^(i), P^(i), C^(i)) for 1 ≤ i ≤ q. Note that in the transcript, P^(i) is already parsed into T^(i), U^(i) and V^(i) implicitly. Additionally, after all the queries are made, the key K is added to the transcript in order to be able to define the bad transcripts. Furthermore, since both worlds are consistent, we can assume without loss of generality that D does not make any pointless queries that encrypt a plaintext or decrypt a ciphertext already queried before.
For every 1 ≤ i ≤ q we define ℓ^(i) = |T^(i)| + |U^(i)| + |V^(i)| as the length of the query. We also define L = {ℓ^(i) | 1 ≤ i ≤ q} as the set of used lengths, q_W = #{i | 1 ≤ i ≤ q, W^(i) = W} as the number of queries with tweak W, and finally q_{ℓ,W} = #{i | 1 ≤ i ≤ q, ℓ^(i) = ℓ, W^(i) = W} as the number of queries with length ℓ and tweak W.
A transcript is called bad if there exist 1 ≤ i < j ≤ q with W^(i) = W^(j) such that at least one of the following holds: bad 1, the queries i and j yield a collision in the inputs to G 1; or bad 2, the queries i and j yield a collision in the inputs to G 2. Absence of these events implies that a good transcript has no collision in an input to G 1 or G 2.

Double-Decker
Let W ∈ W be a tweak and let Q_W = {(V^(i), U_L^(i)) | 1 ≤ i ≤ q, W^(i) = W} be the set of input queries with tweak W. If event bad 1 occurs with tweak W, that means that for some 1 ≤ i < j ≤ q with W^(i) = W^(j) = W and U_R^(i) = U_R^(j), we have H_K(V^(i)) ⊕ U_L^(i) = H_K(V^(j)) ⊕ U_L^(j). It cannot be the case that V^(i) = V^(j) and U_L^(i) = U_L^(j) as well, as that would mean that j is a pointless query. In other words, all elements of Q_W are distinct, so bad(Q_W) occurred. By Lemma 2 we get

P[bad(Q_W)] = Adv^bkh_H(Q_W) ≤ Adv^bkh_H(q_W, σ_H1,W) .

Since bad 1 can only occur between queries with the same tweak, we can estimate the final probability P[bad 1] by summing over all tweaks. This means that

P[bad 1] ≤ Σ_W Adv^bkh_H(q_W, σ_H1,W) .

Bounding the probability of bad 2 is similar, but with the corresponding queries on the ciphertext side.

Docked-Double-Decker
Let $W \in \mathcal{W}$ be a tweak and let $Q_W = \{(U^{(i)} \,\|\, V^{(i)}, T^{(i)}) \mid 1 \leq i \leq q,\ W^{(i)} = W\}$ be the set of input queries with tweak $W$. If event $\mathsf{bad}_1$ occurs with tweak $W$, that means that for some $1 \leq i < j \leq q$ with $W^{(i)} = W^{(j)} = W$ we have $T^{(i)} \oplus H_K(U^{(i)} \,\|\, V^{(i)}) = T^{(j)} \oplus H_K(U^{(j)} \,\|\, V^{(j)})$. As before, all elements of $Q_W$ are distinct, so $\mathsf{bad}(Q_W)$ occurred, and by Lemma 2 and summing over all tweaks we get $\Pr[\mathsf{bad}_1] \leq \sum_{W \in \mathcal{W}} \mathbf{Adv}^{\mathrm{bkh}}_{H}(q_W, \sigma_W)$. Bounding the probability of $\mathsf{bad}_2$ is similar, but with the corresponding queries on the output side.

Analysis of Good Transcripts
Let τ be a good transcript.
Lemma 3. $G_1$ and $G_2$ never receive the same input with the same tweak twice.
Proof (for double-decker). Suppose that $G_1$ receives the same input with the same tweak for $1 \leq i < j \leq q$. This immediately implies that $W^{(i)} = W^{(j)}$ and $U_R^{(i)} = U_R^{(j)}$. Furthermore, the first $n$ bits of the input have to be the same, which means that $U_L^{(i)} \oplus H_K(V^{(i)}) = U_L^{(j)} \oplus H_K(V^{(j)})$. But this means that $\mathsf{bad}_1$ occurred, which is impossible as $\tau$ is a good transcript.
The case for $G_2$ is similar, but with $\mathsf{bad}_2$.
Proof (for docked-double-decker). Suppose that $G_1$ receives the same input with the same tweak for $1 \leq i < j \leq q$. This immediately implies that $W^{(i)} = W^{(j)}$. Furthermore, the inputs to $G_1$ have to be the same, which means that $T^{(i)} \oplus H_K(U^{(i)} \,\|\, V^{(i)}) = T^{(j)} \oplus H_K(U^{(j)} \,\|\, V^{(j)})$. But this means that $\mathsf{bad}_1$ occurred, which is impossible as $\tau$ is a good transcript.
The case for $G_2$ is similar, but with $\mathsf{bad}_2$.
By Lemma 3, there are no collisions in the inputs to $G_1$ and $G_2$. This means that for every query, every output of these functions is independent and equally likely. By choosing these outputs, every ciphertext is possible in an unambiguous way in the case of a forward query, and similarly for the plaintext in the case of a backward query. This means that the probability that the specific output in query $i$ occurs equals $2^{-\ell^{(i)}}$. As these probabilities are independent, the probability that transcript $\tau$ occurs in the real world equals $\prod_{i=1}^{q} 2^{-\ell^{(i)}}$. In the ideal world, a random permutation is chosen for every length and tweak. For every length $\ell$ and tweak $W$, there are $q_{\ell,W}$ values fixed. This means that there are $(2^\ell - q_{\ell,W})!$ possible permutations out of a total of $2^\ell!$, so the probability that the transcript $\tau$ occurs in the ideal world equals $\prod_{\ell \in \mathcal{L}} \prod_{W \in \mathcal{W}} (2^\ell - q_{\ell,W})!\,/\,2^\ell!$. This means that the ratio of the real-world to the ideal-world probability equals $\prod_{\ell \in \mathcal{L}} \prod_{W \in \mathcal{W}} \prod_{k=0}^{q_{\ell,W}-1} (1 - k/2^\ell)$. We have that $\prod_{i=1}^{k} (1 - x_i) \geq 1 - \sum_{i=1}^{k} x_i$ for non-negative $x_i$, which can be shown by induction on $k$. Hence the ratio is at least $1 - \sum_{\ell \in \mathcal{L}} \sum_{W \in \mathcal{W}} \binom{q_{\ell,W}}{2} 2^{-\ell}$.

Applications
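The inequality $\prod_{i}(1 - x_i) \geq 1 - \sum_{i} x_i$ used above can also be checked numerically. The following sketch tests it on random inputs in $[0,1]$ (the relevant range here, since each $x_i = k/2^\ell \leq 1$); the trial setup is ours, purely for illustration.

```python
import random

random.seed(1)

def lhs(xs):
    # prod_i (1 - x_i)
    out = 1.0
    for x in xs:
        out *= 1.0 - x
    return out

# Check prod(1 - x_i) >= 1 - sum(x_i) on random non-negative inputs,
# with a small slack for floating-point error.
ok = all(
    lhs(xs) >= 1.0 - sum(xs) - 1e-12
    for xs in (
        [random.uniform(0.0, 1.0) for _ in range(random.randint(1, 8))]
        for _ in range(1000)
    )
)
print(ok)  # True
```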

Disk Encryption
In our deck-based modes, a naive security loss to the hash functions would be of the form $2\,\mathbf{Adv}^{\mathrm{bkh}}_{H}(q, \sigma)$, where $q$ is the total number of queries and $\sigma$ the total data complexity. However, the more fine-grained security loss given in Theorem 1 is of the form $\sum_{W \in \mathcal{W}} 2\,\mathbf{Adv}^{\mathrm{bkh}}_{H}(q_W, \sigma_W)$, where $q_W$ is the number of queries and $\sigma_W$ the data complexity with tweak $W$. This is a significant improvement over the naive security loss for some functions if the tweak varies a lot. It does not give a real improvement for hash functions with a more linear security loss in the bkh model. However, some hash functions have a quadratic security loss, even in the bkh model. Examples of such functions include GHASH [NIS07] and Poly1305 [Ber05]. In these cases, the naive security loss to the hash functions would be of the form $2q^2\varepsilon$ and the more fine-grained one given in Theorem 1 would be of the form $\sum_{W \in \mathcal{W}} 2 q_W^2 \varepsilon$. This turns out to give a significant gain in the context of disk encryption, if the deck-based modes are called with the physical sector number as tweak. In this case, every sector basically gets its own dedicated wide block cipher. This is especially useful in the context of SSDs. As SSDs get damaged every time data is written to a sector, the firmware of the SSD tries to distribute the data uniformly over its sectors [GT05]. If a sector is written to too many times, the sector cannot be used anymore [BD10]. In the context of the bound of Theorem 1, this means that the distinguisher is limited in its attack: every tweak can only be used at most a constant number of times.
We outline the gain with a concrete example. The Kingston UV500 960 GB [Kin18] has a Total Bytes Written specification of 480 TB. This means that every sector can be written at most $480\,\mathrm{TB}/960\,\mathrm{GB} = 500$ times. For a sector size of 4 KiB, which implies $N = 960\,\mathrm{GB}/4\,\mathrm{KiB} \approx 2^{28}$ sectors, this concretely means a security loss on the hash functions of $2N \cdot 500^2\,\varepsilon \approx 2^{46}\varepsilon$. This is an improvement over the typical birthday bound of $2(500N)^2\varepsilon \approx 2^{74}\varepsilon$ that would be achieved by, for example, Adiantum [CB18]. Concretely, the security level goes up by $\log_2 N$ bits, and the gain becomes particularly meaningful if $\varepsilon$ is not very small.
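The arithmetic of this example can be reproduced as follows; the code is a sketch using only the device parameters quoted above, with both losses expressed as the base-2 exponent of the factor multiplying $\varepsilon$.

```python
from math import log2

# Device parameters quoted above for the Kingston UV500 960 GB.
tbw = 480e12       # Total Bytes Written: 480 TB
capacity = 960e9   # 960 GB
sector = 4096      # 4 KiB sectors

writes_per_sector = tbw / capacity   # 500 writes per sector
N = capacity / sector                # number of sectors, about 2^28
q = writes_per_sector * N            # total number of queries

# Naive birthday-type loss 2 q^2 eps versus the fine-grained Theorem 1
# loss sum_W 2 q_W^2 eps, with q_W = 500 for each of the N sector tweaks.
naive_exp = log2(2 * q**2)                     # about 74.5, i.e. ~2^74
fine_exp = log2(2 * N * writes_per_sector**2)  # about 46.7, i.e. ~2^46

print(naive_exp - fine_exp)  # equals log2(N): about 27.8 bits gained
```

Note that the gain is exactly $\log_2 N$: with $q = 500N$, the ratio $2q^2 / (2N \cdot 500^2)$ collapses to $N$.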

Incremental Tweak
Following Daemen et al. [DHVV18], deck functions have two features: (i) their inputs consist of a sequence of an arbitrary number of strings, each of arbitrary length; (ii) their inputs are incremental: appending a string to the input sequence costs only the processing of the new string.
These features, along with the reduced tweak-dependence of the bounds of our constructions, lead to pleasant use cases for double-decker and docked-double-decker. In more detail, in our constructions we already make use of feature (i) by presenting the tweak and intermediate result as a sequence of two strings. We do not yet use feature (ii). However, suppose we make the tweak dependent on prior messages exchanged, or in general on the history of the protocol. It suffices to clone the state of the deck function after absorbing the new tweak string and to start from this state in the next construction call. The tweak used in the computation of a construction call will then consist of the sequence of the tweak strings that were absorbed in all previous construction calls. This tweak is different for each evaluation of double-decker/docked-double-decker, which, in the context of the security bound, implies $q_W = 1$ for each tweak $W$. Note that this use case comes at a modest cost only: by incrementality (feature (ii)), appending data to the earlier sequence costs only the processing of the new string.
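The state-cloning idea above can be sketched as follows. This is an illustrative model only: a SHA-256 object stands in for the deck function's absorbing state, and the names `DeckState`, `absorb`, `clone` and `squeeze` are ours, not an API from the paper.

```python
import hashlib

class DeckState:
    """Toy stand-in for an incremental deck function (sketch only)."""

    def __init__(self, key: bytes):
        self._h = hashlib.sha256(key)

    def absorb(self, s: bytes) -> None:
        # Feature (ii): appending a string costs only processing it.
        self._h.update(len(s).to_bytes(4, "big"))  # frame the string
        self._h.update(s)

    def clone(self) -> "DeckState":
        # Clone the state after absorbing the new tweak string, so the
        # next construction call can start from it.
        c = object.__new__(DeckState)
        c._h = self._h.copy()
        return c

    def squeeze(self) -> bytes:
        return self._h.digest()

# The effective tweak of call i is the sequence of all tweak strings
# absorbed so far, so every evaluation uses a fresh tweak (q_W = 1).
state = DeckState(b"key")
outputs = []
for tweak_string in [b"call 1", b"call 2", b"call 3"]:
    state.absorb(tweak_string)
    state = state.clone()            # keep the state for the next call
    outputs.append(state.squeeze())  # output depends on whole history

print(len(set(outputs)))  # 3: all three histories are distinct
```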
We remark that this use case is not far-fetched: similar ideas may be found in several modern protocols. One example is the TLS handshake [TD08], where a hash is computed over all steps and where only this hash is signed later on. Another example is the STROBE protocol framework of Hamburg [Ham17]. STROBE makes use of stateful objects in such a way that all strings absorbed by the object impact its state. Hence, any output generated by the stateful object depends on the concatenation of all that came before.