Errata to Sound Hashing Modes of Arbitrary Functions, Permutations, and Block Ciphers

Abstract. In ToSC 2018(4), Daemen et al. performed an in-depth investigation of sound hashing modes based on arbitrary functions, permutations, or block ciphers. However, for the case of invertible primitives, there is a glitch. In this errata, we formally fix this glitch by adding an extra term to the security bound, $q/2^{b-n}$, where $q$ is the query complexity, $b$ the width of the permutation or the block size of the block cipher, and $n$ the size of the hash digest. For permutations that are wider than two times the chaining value, this term is negligible. For block cipher based hashing modes where the block size is close to the digest size, the term degrades the security significantly.


Introduction
In [DMA18], Daemen, Mennink, and Van Assche performed a thorough investigation of cryptographic hashing modes. They considered a very large class of hashing modes built on top of arbitrary functions, permutations, or block ciphers, and derived sufficient conditions for these modes to be hard to differentiate from a random oracle. Their analysis generalized earlier attempts of Dodis et al. [DRRS09] and Bertoni et al. [BDPV14]. Most importantly, the contribution of Daemen et al. consisted of cleaner sufficiency conditions and analyses for permutation based cryptographic hashing modes. While the conceptually cleaner sufficiency conditions simplified the security analyses, a level of complication was introduced by the fact that more general modes than in [DRRS09, BDPV14] were taken into consideration.
After publication of the original article, it turned out that there was an error in the proof of the mode for a truncated permutation or a block cipher [Nev19]. At a high level, the attack consists of (i) querying the construction oracle for an arbitrary message $M \| m$ to get a hash digest $h$, (ii) querying the inverse primitive on input of the hash outcome $h$ (possibly appended, as truncation is involved), and (iii) using the previous result to compute the hash of $M \| m'$, without having to know $M$ but only $h$, $m$ and $m'$. Intuitively, the attack would succeed after $2^{b-n}$ attempts, where $b$ is the width of the permutation or block length of the block cipher, and $n$ the hash digest size. The problem is discussed at a higher level of technicality in Section 2.
Fortunately, the glitch is quite simple to fix. In this errata to the original article of Daemen et al. [DMA18], we correct the analysis for the case of modes based on a permutation or block cipher. The original analysis carries over with an additional term $q/2^{b-n}$, where $b$ is the width of the primitive or block size of the block cipher, and $n$ the hash digest size. The updated indifferentiability bounds for the relevant hashing modes are given in Table 1. The updated analysis is described in Section 3.

Table 1: Updated indifferentiability bounds for hashing modes of an arbitrary function, a truncated permutation or a (truncated) block cipher. The rectangles denote the additional terms. The conditions SF, RD, MD, and LA stand for subtree-freeness, radical-decodability, message-decodability, and leaf-anchoring, respectively (see the original article [DMA18] for their definitions). $q$ is the adversarial complexity expressed as the number of primitive queries, either direct or indirect, $n$ the CV length, and $b \geq n$ the width of the permutation resp. the block length of the block cipher. Columns: compression function type; SF+RD+MD; LA; bound.

For truncated permutations, the impact of the additional term is negligible if $b - n \geq n$, i.e., if $b \geq 2n$. This is the case for wide permutations such as Keccak-p[1600] or Keccak-p[800], or the permutation used in MD6. For lightweight permutations, the presence of the term becomes problematic. However, these do not lend themselves easily to the tree hashing modes considered in this work in the first place, as at least two CVs should fit in a single permutation input, hence $b \geq 2n$.

For the modes based on block ciphers, $b$ is the block length. In most cases there is no truncation, implying $b = n$, and all security evaporates. However, if there is truncation and $n \leq b/2$, the additional term is negligible. For example, using the block cipher underlying SHA-512/256 [SHA15] we get $b = 512$ and $n = 256$, which gives 128 bits of security both without and with the extra term. Concluding, the extra term leads to an extra requirement for modes to be secure, in that sufficient truncation has to be done.
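To make the effect of the extra term concrete, the following sketch compares the query complexity at which a birthday-type term $q^2/2^n$ and the extra term $q/2^{b-n}$ each reach 1. It assumes that the original bound is dominated by a term of order $q^2/2^n$ (this matches the 128-bit figure quoted above for $n = 256$, but it is not taken from Table 1). The function name `security_bits` and the choice $n = 256$ for Keccak-p[1600] are illustrative assumptions.

```python
def security_bits(b, n):
    """Rough security level in bits for width b and digest size n.

    Assumption (not from Table 1): the original bound is dominated by a
    birthday-type term q^2 / 2^n.
    """
    if b < n:
        raise ValueError("the width b must be at least the digest size n")
    birthday = n / 2   # q^2 / 2^n reaches 1 at q = 2^(n/2)
    extra = b - n      # q / 2^(b-n) reaches 1 at q = 2^(b-n)
    return {"without_extra": birthday, "with_extra": min(birthday, extra)}


if __name__ == "__main__":
    examples = [
        ("block cipher of SHA-512/256", 512, 256),
        ("Keccak-p[1600], n = 256 (assumed)", 1600, 256),
        ("no truncation, b = n = 256", 256, 256),
    ]
    for name, b, n in examples:
        print(name, security_bits(b, n))
```

Under these assumptions, the security level is unchanged whenever $b - n \geq n/2$, and drops to zero when $b = n$.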

Problem in Original Analysis
For simplicity, consider a block cipher based tree hashing mode without truncation, i.e., with $b = n$, for the processing of a message $M \| m$, where $M$ is of arbitrary length and $m$ of fixed length. The computation of the final node when computing $T(M \| m) = h$ would typically be of the form $h = E_m(CV)$, where $CV$ is the intermediate result when compressing $M$. In short, an attacker can invert this final call to recover $CV$ from $h$, and then evaluate the final call again under a different $m'$.

The problem for the simulator is that it does not remain consistent. We assume that at the end of the interaction, the distinguisher will always verify its queries to the random oracle. In the above case, it will query the simulator for the compression of $M$. The simulator will return a random value $CV'$, which is unlikely to be equal to $CV$. However, to be consistent with the random oracle, the simulator has to return $h$ for $E_m(CV')$ as well. This means that it is no longer consistent as a block cipher, as both $CV$ and $CV'$ are mapped to $h$ under $E_m$. Note that the simulator is already inconsistent without querying $E_{m'}(CV)$, but we do need that additional query to exploit the weakness in a real hashing mode.
The same problem holds when a truncated permutation is used and $b - n$ is small. In general, the attacker gets an advantage of $q/2^{b-n}$ by guessing the truncated bits.
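One way to read this term is as a union bound. The display below is a heuristic sketch, assuming that each of the attacker's $q$ attempts guesses the $b - n$ truncated bits correctly with probability at most $2^{-(b-n)}$; it is not the formal argument of Section 3.

```latex
\Pr[\text{some guess is correct}]
  \;\le\; \sum_{i=1}^{q} 2^{-(b-n)}
  \;=\; \frac{q}{2^{b-n}} .
```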
In general, the problem is caused by the fact that the final primitive call in the hashing mode is invertible and that the attacker succeeds in inverting it in $2^{b-n}$ attempts, as it must correctly guess the $(b-n)$-bit truncated part. The attack therefore does not affect the analysis of [DMA18] for arbitrary functions, but only for permutations and block ciphers.
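The attack without truncation can be illustrated with a toy sketch. Everything below is an assumption made for illustration and does not come from [DMA18]: the block size is $b = n = 8$ bits, the block cipher is replaced by a keyed shuffle of a lookup table, and the compression of $M$ is a hypothetical one-byte truncation of SHA-256.

```python
import hashlib
import random

N_BITS = 8  # toy block size, b = n (no truncation); illustrative assumption


def _table(key):
    """Keyed permutation of all N_BITS-bit values (toy stand-in for E_key)."""
    table = list(range(2 ** N_BITS))
    random.Random(key).shuffle(table)  # deterministic for a fixed key
    return table


def E(key, x):
    """Toy block cipher call E_key(x)."""
    return _table(key)[x]


def E_inv(key, y):
    """Inverse call E_key^{-1}(y)."""
    return _table(key).index(y)


def compress(M):
    """Hypothetical compression of M into a chaining value CV."""
    return hashlib.sha256(M).digest()[0]


def T(M, m):
    """Toy mode whose final node is h = E_m(CV)."""
    return E(m, compress(M))


def attack(h, m, m_prime):
    """Predict T(M || m') from h, m and m' only, without knowing M."""
    cv = E_inv(m, h)       # invert the final call to recover CV
    return E(m_prime, cv)  # redo the final call under m'


if __name__ == "__main__":
    M = b"a message the attacker never sees"
    h = T(M, b"m")
    print(attack(h, b"m", b"m-prime") == T(M, b"m-prime"))
```

Against a random oracle, the predicted value would match only by chance, which is what allows the distinguisher to tell the two worlds apart.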

Updated Analysis
As a reference, we first restate the simulator and bad views of [DMA18, Theorem 2] for the case of a permutation. The simulator is given in Algorithm 3. For the bad views, we denote by $\mathcal{M} = \{(M_1, Z_1, h_1), \ldots, (M_r, Z_r, h_r)\}$ the view seen by distinguisher $D$ on interaction with the construction oracle, and by $\mathcal{L} = \{(x_1, y_1), \ldots, (x_q, y_q)\}$ the view seen by $D$ on interaction with the primitive oracle. The set $\mathcal{L}$ is split into forward queries $\mathcal{L}_{\mathrm{fwd}}$ and inverse queries $\mathcal{L}_{\mathrm{inv}}$. We further split $\mathcal{L}_{\mathrm{fwd}}$ into $\mathcal{L}_{\mathrm{rad}}$ and $\mathcal{L}_{\mathrm{other}}$. Denote $\nu = (\mathcal{M}, \mathcal{L}_{\mathrm{rad}}, \mathcal{L}_{\mathrm{other}}, \mathcal{L}_{\mathrm{inv}})$. The set $\mathcal{V}$ denotes the set of attainable views that can be observed by $D$.
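An attainable view $\nu$ is called bad if:
(i) There exist distinct $(x_i, y_i), (x_j, y_j) \in \mathcal{L}_{\mathrm{rad}}$ with $\lfloor y_i \rfloor_n = \lfloor y_j \rfloor_n$;
(ii) There exist distinct $(x_i, y_i), (x_j, y_j) \in \mathcal{L}_{\mathrm{other}}$ with $\lfloor y_i \rfloor_n = \lfloor y_j \rfloor_n$;
(iii) There exist $(x_i, y_i) \in \mathcal{L}_{\mathrm{rad}}$ and $(x_j, y_j) \in \mathcal{L}_{\mathrm{other}}$ with $i < j$ such that $\lfloor y_j \rfloor_n = \mathrm{radicalValue}[\mathcal{L}_{i-1}](x_i)$;
(iv) There exist $(x_i, y_i) \in \mathcal{L}_{\mathrm{inv}}$ such that $\lfloor x_i \rfloor_n = \mathrm{IV}$;
(v) There exist $(x_i, y_i) \in \mathcal{L}_{\mathrm{inv}}$ and $(x_j, y_j) \in \mathcal{L}_{\mathrm{fwd}}$ such that $\lfloor x_i \rfloor_n = \lfloor y_j \rfloor_n$;
(vi) There are distinct $(x_i, y_i), (x_j, y_j) \in (\mathcal{L}_{\mathrm{fwd}} \cup \mathcal{L}_{\mathrm{inv}})$ with $x_i = x_j$ or $y_i = y_j$.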
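Bad event (vi) states that the recorded primitive queries are not consistent with a permutation. As a minimal sketch of that condition, the check below scans a list of query pairs; the function name `violates_bad_vi` and the representation of queries as pairs of integers are illustrative assumptions.

```python
def violates_bad_vi(queries):
    """Return True if two distinct (x, y) pairs share x or share y."""
    seen_x, seen_y = set(), set()
    for x, y in set(queries):  # identical repeated queries count once
        if x in seen_x or y in seen_y:
            return True
        seen_x.add(x)
        seen_y.add(y)
    return False


if __name__ == "__main__":
    # Two different inputs mapped to the same output, as happens when the
    # simulator maps both CV and CV' to h under the same key.
    print(violates_bad_vi([(0x11, 0xA0), (0x22, 0xA0)]))
    # A consistent list of queries, with one query repeated.
    print(violates_bad_vi([(0x11, 0xA0), (0x22, 0xB0), (0x11, 0xA0)]))
```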
The error of [DMA18] happens in the analysis of bad event (vi). The original paper assumes that the simulator always returns random values from $\mathbb{Z}_2^b$ for new inputs, which leads to the term $|\mathcal{L}|^2/2^b = q^2/2^b$. However, when a query completes a tree, the first $n$ bits of its result are taken from the random oracle, and these bits are not random when the distinguisher has queried the random oracle earlier. Below, a corrected analysis of bad event (vi) is given.
Let $(x_i, y_i), (x_j, y_j) \in (\mathcal{L}_{\mathrm{fwd}} \cup \mathcal{L}_{\mathrm{inv}})$ be arbitrary distinct queries with $i < j$. We distinguish cases based on the types of queries $i$ and $j$.
In the example of Section 2, $CV$ is the intermediate result when compressing $M$. Because the block cipher is invertible, an attacker can compute $CV = E_m^{-1}(h)$. It can use this information to compute $T(M \| m') = h' = E_{m'}(CV)$ based on just $h$, $m$ and $m'$, thus without any knowledge of $M$. It can use this trick to differentiate the hashing mode from a random oracle.