Short Non-Malleable Codes from Related-Key Secure Block Ciphers, Revisited

We construct non-malleable codes in the split-state model with codeword length m + 3λ or m + 5λ, where m is the message size and λ is the security parameter, depending on how conservative one is. Our scheme is very simple and involves a single call to a block cipher meeting a new security notion, which we dub entropic fixed-related-key security. Essentially, this means that the block cipher behaves like a pseudorandom permutation when queried upon inputs sampled from a distribution with sufficient min-entropy, even under related-key attacks with respect to an arbitrary but fixed key relation. Importantly, indistinguishability only holds with respect to the original secret key (and not with respect to the tampered secret key). In a previous work, Fehr, Karpman, and Mennink (ToSC 2018) used a related assumption (where the block cipher inputs can be chosen by the adversary, and where indistinguishability holds even with respect to the tampered key) to construct a non-malleable code in the split-state model with codeword length m + 2λ. Unfortunately, no block cipher (even an ideal one) satisfies their assumption when the tampering function is allowed to be cipher-dependent. In contrast, we are able to show that entropic fixed-related-key security holds in the ideal cipher model with respect to a large class of cipher-dependent tampering attacks (including those which break the assumption of Fehr, Karpman, and Mennink).


Introduction
Consider the classical setting in which a message µ is encoded via an algorithm Encode, yielding a codeword σ. The decoding algorithm Decode allows one to recover µ from σ, efficiently. The goal of such an encoding procedure is to prevent modifications to the codeword, either benign (e.g., because of errors introduced by the communication medium) or adversarial (e.g., because of malicious tampering attacks). In more detail, let σ̃ = f(σ) be a modified codeword for some tampering function f over the codeword space. The property of error correction guarantees that Decode(σ̃) still results in the original message µ. The property of error detection, instead, guarantees that Decode(σ̃) either results in the original message µ or in an error symbol ⊥ (denoting that tampering occurred but the original message cannot be recovered). While the goals of error correction/detection are very well understood, it is well known that these guarantees are simply impossible for certain classes of tampering functions, particularly so for those classes that model adversarial tampering (e.g., the family of constant functions).
Motivated by this shortcoming, Dziembowski, Pietrzak, and Wichs introduced the beautiful notion of non-malleable codes [DPW10,DPW18], which guarantees that Decode(σ̃) either results in the original message or in a completely unrelated value. While being weaker than error correction and error detection, non-malleability can be achieved for much larger classes of tampering functions. Moreover, a non-malleable code can be used to protect arbitrary cryptographic primitives against tampering attacks targeting the memory (a.k.a. related-key attacks). The latter is achieved by simply storing the secret key in encoded form, and by decoding it prior to invoking the underlying cryptographic algorithms. Intuitively, this ensures that memory tampering either results in the same key (and thus has no effect) or in a completely unrelated key (which does not harm security).
In this work, we focus on the so-called split-state model in which a codeword consists of two parts (σ_0, σ_1) that can be tampered arbitrarily yet independently; namely, a tampering function f has a type f = (f_0, f_1) and the mauled codeword is of the form σ̃ = (σ̃_0, σ̃_1) = (f_0(σ_0), f_1(σ_1)). While non-malleable codes in the split-state model exist unconditionally, Cheraghchi and Guruswami [CG16] established that the best achievable rate for such codes in the information-theoretic setting is 1/2, where the rate refers to the (asymptotic) ratio between the length of the message and the length of the codeword when the message length goes to infinity. This limitation motivated cryptographers to build more efficient codes under (as weak as possible) computational assumptions. We refer the reader to Section 1.3 for a summary of known results.

The Work of Fehr, Karpman, and Mennink
The starting point of our work is a paper by Fehr, Karpman, and Mennink [FKM18] (improving previous works by Aggarwal, Agrawal, Gupta, Maji, Pandey, and Prabhakaran [AAG+16] and Kiayias, Liu, and Tselekounis [KLT16]), where the authors show how to construct non-malleable codes in the split-state model assuming sufficiently strong block ciphers. Their construction is the simplest possible cipher-based split-state non-malleable code: the left part of the codeword is the key κ for a block cipher, and the right part of the codeword is the ciphertext γ encrypting the message. Namely, Encode(µ) → (κ, Encrypt(κ, µ)).
The length of the codeword in their candidate construction is m + 2λ, where m is the size of the message and λ is a security parameter; this is the shortest codeword length known to date.
One of the explicit goals of Fehr et al. was to understand the assumptions needed from the block cipher in order to prove non-malleability of the above simple construction without relying on trusted setup or other (non-falsifiable) assumptions. Their main technical result is that the latter is indeed possible assuming the underlying block cipher is: (i) a pseudorandom permutation (PRP) under leakage (a.k.a. PRP-with-leakage security), and (ii) related-key secure with respect to an arbitrary but fixed key relation (a.k.a. FRK security). Property (i) means that the block cipher behaves as a PRP even given arbitrary leakage on the secret key, so long as the latter is still unpredictable given the leakage. Property (ii) means that the block cipher behaves as a PRP even if the adversary is allowed to ask en-/decryption queries under a single related key (next to the original key), so long as the related key is hard to guess.
As observed by an anonymous ToSC 2021 reviewer, the notion of FRK security as defined in [FKM18] is impossible to achieve whenever the tampering function f is allowed to depend on the block cipher.³ There is a simple attack (originally due to Bernstein [Ber10]) against FRK security. Before describing the attack, we give more details on the definition of FRK security from [FKM18]. As mentioned, the attacker is allowed to specify a tampering function f that is applied to the secret key. The security experiment samples a uniformly random key κ and computes κ̃ = f(κ). At this point, depending on the challenge bit b, the adversary gets oracle access either to the oracles Encrypt(κ, ·), Decrypt(κ, ·), Encrypt(κ̃, ·), and Decrypt(κ̃, ·), or to two uniformly random permutations π and π′ and their respective inverses. The security guarantee states that, for any tampering function f, the adversary cannot tell the difference between these two scenarios except with negligible advantage. It is easy to see that, without further restrictions, this notion is not achievable. Indeed, an adversary can fix κ̃ to be the string 0^k (where k is the key length) and trivially distinguish the two scenarios. Thus, the FRK security experiment additionally first checks that the pre-image of κ̃ under the function f is a set with at most 2^{k/2} elements and, if not, the experiment aborts and the adversary is not allowed to query the oracle. Thanks to this restriction, the related key κ̃ is hard to guess if κ is chosen uniformly at random.
Now we are ready to describe the attack. Fix a block cipher Π = (Encrypt, Decrypt) and consider the function f that outputs ⌈Encrypt(κ, 0^m)⌉_k, where k is the key size, m is the input size, and ⌈w⌉_k denotes the first k bits of a string w.
Since f(κ) is a truncated evaluation of the block cipher, which behaves like a PRP, the tampering function f satisfies the property that f(κ) is hard to predict for a random κ, and so the check on the preimage size described above does not cause the experiment to abort except with extremely small probability. In fact, if there were many keys that map 0^m to the same value, then we would have a distinguisher against the (standard) PRP security of Π. The attacker against FRK security using this tampering function f behaves as follows:
1. Extract Tampered Key. Obtain an encryption of 0^m under κ from the oracle, call it x.
2. Test Oracle. Obtain an encryption of 0^m under the tampered key f(κ) from the oracle, call it y; compute offline the encryption z = Encrypt(⌈x⌉_k, 0^m) and check if z = y.
In the real world, we have z = y with probability one, while if Encrypt is replaced by an independent and truly random permutation in the oracle, then z = y only with very small probability. This contradicts the FRK security assumption from [FKM18], as it ranges over all (hard to guess) functions, even ones that can depend on the block cipher. Note that this attack applies even in the ideal cipher model (i.e., assuming that the block cipher behaves like a truly random permutation for every choice of the key). An analogous argument shows that the notion of PRP-with-leakage security as defined in [FKM18] is impossible to achieve whenever the leakage function g is allowed to depend on the block cipher. The latter can be seen by considering the leakage function g that returns the first bit of Encrypt(κ, 0 m ) and later obtains an encryption of 0 m from the oracle. Clearly, the secret key is still unpredictable given the leakage; yet, the attacker can distinguish the block cipher from a truly random permutation by comparing the output of the leakage function with the first bit of the output obtained from the oracle.
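The two steps of the attack can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 4-round Feistel network below is a hypothetical toy stand-in for the block cipher Π (with k = m = 128 bits), and the names `tamper`, `frk_oracles`, and `distinguisher` are introduced here for the example and do not come from [FKM18] or [Ber10]. Only forward (encryption) oracles are modelled, since the attack needs nothing else.

```python
import hashlib
import secrets

BLOCK = 16  # toy block length in bytes (m = 128 bits); an assumption
KEY = 16    # toy key length in bytes (k = 128 bits); an assumption


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, i, half):
    # Round function of the toy Feistel cipher (not a vetted design).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:BLOCK // 2]


def encrypt(key, block):
    # Toy 4-round Feistel network standing in for Encrypt(kappa, .).
    left, right = block[:BLOCK // 2], block[BLOCK // 2:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right


def tamper(key):
    # Cipher-dependent tampering: f(kappa) = first k bits of Encrypt(kappa, 0^m).
    return encrypt(key, bytes(BLOCK))[:KEY]


class LazyPermutation:
    # Lazily sampled random permutation (forward direction only).
    def __init__(self):
        self.table, self.used = {}, set()

    def __call__(self, x):
        if x not in self.table:
            y = secrets.token_bytes(BLOCK)
            while y in self.used:
                y = secrets.token_bytes(BLOCK)
            self.table[x] = y
            self.used.add(y)
        return self.table[x]


def frk_oracles(real):
    # Returns (encryption under kappa, encryption under f(kappa)) in the real
    # world, or two independent random permutations in the ideal world.
    if real:
        kappa = secrets.token_bytes(KEY)
        kappa_tilde = tamper(kappa)
        return (lambda x: encrypt(kappa, x)), (lambda x: encrypt(kappa_tilde, x))
    return LazyPermutation(), LazyPermutation()


def distinguisher(enc_orig, enc_tampered):
    zero = bytes(BLOCK)
    x = enc_orig(zero)            # Step 1: Extract Tampered Key
    y = enc_tampered(zero)        # Step 2: Test Oracle
    z = encrypt(x[:KEY], zero)    # offline evaluation under the recovered key
    return z == y                 # True means "real world"
```

In this sketch, `distinguisher(*frk_oracles(True))` always returns `True`, whereas in the ideal world a match requires a 128-bit coincidence.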
The above attacks fit into a class of cipher-dependent attacks studied by Albrecht, Farshim, Paterson, and Watson [AFPW11] in the context of modelling related-key attacks in the ideal cipher model. Their class includes the attack of Bernstein [Ber10], as well as another attack by Harris [Har09]. This discussion showcases the subtle challenges imposed by cipher-dependent attacks, and we find it interesting to study how to handle such attacks with as little impact as possible on the performance of the candidate schemes.
³ An earlier version of this paper which made use of the faulty assumptions from [FKM18] was submitted to ToSC 2021.

Our Contributions
In this paper, we put forward a meaningful weakening of FRK security which intrinsically rules out the above cipher-dependent attacks, while still being sufficient to formally prove security of a slight tweak of the original construction by Fehr, Karpman, and Mennink. The codeword size in our construction can be as small as m + 3λ. Furthermore, we provide evidence of the robustness of our new assumption by proving that it holds unconditionally in the ideal cipher model with respect to a broad class of tampering functions covering, in particular, Bernstein's attack. We elaborate on these contributions below.

Entropic Fixed-Related-Key Security
In Section 3, we consider a different form of FRK security in which the attacker has limited access to the encryption and decryption oracles under the original key κ. In particular, our notion of security relaxes the definition of [FKM18], which we discussed in Section 1.1, in two ways:
1. The attacker has arbitrary oracle access to the encryption and decryption oracles under the tampered key, but it is allowed to observe the output of the block cipher under the original key κ only for random inputs sampled from a distribution with sufficiently high min-entropy.
2. We do not require indistinguishability from a random permutation for the block cipher instantiated with the tampered key.
We refer to our notion as entropic FRK security. Briefly, the rationale for the first relaxation is that such limited access to the encryption oracle under the original key rules out the "Extract Tampered Key" part of the aforementioned class of cipher-dependent attacks from Section 1.1; the rationale for the second relaxation is that having access to the real-world encryption and decryption oracles under the tampered key, independently of the challenge bit, rules out the "Test Oracle" part of those attacks. To understand why this is indeed the case, consider the following scenario: the oracle samples n messages µ_1, . . . , µ_n independently from a distribution D with s bits of min-entropy, i.e., Pr[D = x] ≤ 2^{−s} for every x. Then, the adversary learns the message/ciphertext pairs (µ_i, γ_i = Encrypt(κ, µ_i))_{i∈[n]}; the attacker has no further oracle access to Encrypt(κ, ·). In order to carry out the cipher-dependent related-key attack described in Section 1.1, the adversary must learn the encryption under κ of a message that was also queried by the tampering function f on input κ. Let τ_tamp denote the running time of f (so that f(κ) can compute encryptions and decryptions of at most τ_tamp inputs). By a union bound, the probability that f queries the cipher on one of the messages µ_1, . . . , µ_n or ciphertexts γ_1, . . . , γ_n is at most n · τ_tamp · 2^{−s}. Therefore, if the min-entropy parameter s satisfies s ≫ log τ_tamp + log n, it follows that the probability that the cipher-dependent attack above succeeds is extremely small. Moreover, we notice that the second relaxation means that we do not require any privacy guarantee from the block cipher instantiated with the tampered key, besides that oracle access to Encrypt(f(κ), ·) and Decrypt(f(κ), ·) for a tampering function f cannot help in breaking the privacy of the ciphertexts computed under the original key.
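To get a feeling for the condition s ≫ log τ_tamp + log n, the following sketch evaluates the union bound numerically. It assumes the bound has the form n · τ_tamp · 2^{−s} (each of the at most τ_tamp cipher queries of f hits any one of the n high-entropy samples with probability at most 2^{−s}); the function name `log2_hit_bound` and the sample parameters are illustrative choices, not values taken from the paper.

```python
import math


def log2_hit_bound(n, tau_tamp, s):
    """Base-2 logarithm of the union bound n * tau_tamp * 2^(-s), capped at 0.

    n        -- number of message/ciphertext pairs given to the adversary
    tau_tamp -- upper bound on the number of cipher queries made by f
    s        -- min-entropy (in bits) of the message distribution D
    """
    return min(0.0, math.log2(n) + math.log2(tau_tamp) - s)


# Illustrative parameters: a single challenge pair, a tampering function
# making up to 2^64 cipher queries, and 128 bits of min-entropy.
print(log2_hit_bound(n=1, tau_tamp=2**64, s=128))
```

With these sample parameters the bound is 2^{−64}; when s is close to log τ_tamp + log n the bound becomes vacuous, which is why s must exceed that sum by a comfortable margin.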
In contrast, Fehr, Karpman, and Mennink require indistinguishability of the block cipher from a random permutation to hold even with respect to the tampered key. For this reason, their definition requires that the tampered key cannot be easily guessed by the adversary, as otherwise there would be a trivial distinguisher. By giving up on the indistinguishability of the block cipher under the tampered key, we additionally gain that we no longer need any restriction on the unpredictability of the tampered key. Informally speaking, the attacker could decide to tamper with the original key and set it to an "easy to guess" tampered key κ̃, thus receiving oracle access to Encrypt(κ̃, ·) and Decrypt(κ̃, ·) (independently of the challenge bit). If such a tampered key is easy to guess, however, the same oracle access could have been simulated by the adversary in its head. Thus, predictable tampered keys cannot harm the security definition and can be allowed.

Our Construction
The non-malleable code construction we consider is a slight variation of the original construction by [FKM18]. Namely, in Section 4, we consider the non-malleable code in the split-state model that encodes a message µ as Encode(µ) → (κ, Encrypt(κ, µ∥ρ)), where κ is a uniformly random secret key and ρ is a uniformly random λ-bit string. Notice that the only difference between our construction and the construction of Fehr, Karpman, and Mennink is that we additionally sample a random string ρ and encrypt the concatenation µ∥ρ.
As a bonus, we substantially simplify the security analysis. Indeed, the original security proof involves a case analysis according to the unpredictability of the tampered codeword: if the tampered key κ̃ is unpredictable, then the security of the non-malleable code reduces to the FRK security of the block cipher. Otherwise, it reduces to the PRP-with-leakage security of the block cipher. In the latter case, the reduction leaks the full tampered key and uses this leakage to simulate oracle access to Decrypt(κ̃, ·).
In our case, entropic FRK security directly provides oracle access to Decrypt(κ̃, ·), and it considers such an oracle as the only leakage an adversary can get from a tampering attack. This allows us to bypass the case analysis and reduce directly to the entropic FRK security of the block cipher (independently of the unpredictability of the tampered key), without explicitly assuming any form of leakage resilience from the block cipher.

Security in the Ideal Cipher Model
Recall that Bernstein's cipher-dependent attack on the FRK security notion from [FKM18] described in Section 1.1 applies even in the ideal cipher model. To further validate our approach, in Section 5, we prove that, unlike the notion of FRK security, entropic FRK security does hold (unconditionally) in the ideal cipher model, albeit with respect to a restricted (but still broad) family of tampering functions which includes cipher-dependent attacks such as Bernstein's.
The main ideas behind our analysis follow what we already described in Section 1.2.1. The intuition is that the tampering function, even with access to the original key κ, cannot query the ideal cipher on the challenge messages, because those messages are sampled from a distribution with high min-entropy. Thus, the tampered key is independent of the challenge ciphertexts, and so are the queries to the tampered encryption and decryption oracle. It is easy to see that, in the ideal cipher model, the only way for the adversary to distinguish the ideal from the real experiment is to query the ideal cipher on the original key. Thus, we give a bound on the unpredictability of the original key when the adversary has oracle access to the tampered encryption and decryption oracle.
However, we need to make a simplifying assumption on the structure of the tampering functions. The problem is that the tampered key could be chosen as a function of the outputs of the ideal cipher queried on the tampered key itself. In principle, this allows for rejection-sampling adversarial strategies that can leak partial information about the original key. For example, consider the tampering function that sets the tampered key to an arbitrary string κ̃ such that the first bit of an encryption of the all-zero string under κ̃ matches the first bit of the original key. The adversary can then learn the first bit of κ through its oracle access. While these kinds of tampering functions do not seem to help in breaking entropic FRK security, they nevertheless make the analysis more complicated, as they can bias in unexpected ways the distribution of the original key given oracle access to the ideal cipher and the tampered encryption and decryption oracle. Specifically, we can no longer easily argue that the output of the ideal cipher queried on the tampered key is independent of the tampered key.
Our solution to avoid these contrived tampering attacks is to additionally assume that the tampering functions do not query the ideal cipher on the tampered key. We notice that this additional assumption holds true for the tampering functions of Bernstein's [Ber10] and Harris' attacks [Har09] as discussed in [AFPW11]. Moreover, we point out that such a restriction was already considered by Albrecht, Farshim, Paterson and Watson [AFPW11] under the more generic notion of oracle-independence. We conjecture that full entropic FRK security holds in the ideal cipher model, and leave a formal proof of this fact as an interesting open problem.
Remark 1. It is natural to wonder whether one can prove the security of our proposed construction directly in the ideal cipher model. While we believe that this would indeed be possible, we do not pursue this direction. Our main goal is to base security on a falsifiable assumption which is plausibly satisfied by real-world block ciphers and is easier to evaluate in practice. Moreover, we believe that the notion of Entropic FRK security might have other applications (for example, hybrid encryption and tamper-resilient secret-key encryption).

Parameter Instantiations
In Section 6, we give two possible parameter instantiations for our construction. The first instantiation simply assumes that practical block ciphers (such as AES-128 and SHACAL-2) directly have good entropic FRK security; this yields codewords of size close to m + 3λ at about λ bits of security.
Alternatively, we can be more conservative and consider the advantage upper bound on entropic FRK security we establish in the ideal cipher model; this yields a slightly longer codeword, of size close to m + 5λ, at about λ bits of security.

Related Work
A long line of research explores constructions of non-malleable codes in the split-state model, both with information-theoretic [DPW10, DKO13, ADL14, CG16, CG17, ADKO15, CGL16, Li17, Li19, AO20, AKO+22] and computational [LL12, AAG+16, KLT16] security. Currently, the best explicit non-malleable code in the information-theoretic setting achieves rate 1/3 [AKO+22] (versus 1/2, which is the best possible rate in the information-theoretic setting [CG16]). More precisely, this means that if m denotes the message length and n = n(m) denotes the codeword length corresponding to m-bit messages, then m/n(m) → 1/3 as m → ∞.
Aggarwal et al. [AAG+16] show how to compile an information-theoretic non-malleable code in the split-state model with rate bounded away from 1 into a non-malleable code with much lower redundancy in the computational setting. Their construction encodes the secret key κ of a secret-key encryption scheme under the low-rate non-malleable code, obtaining a codeword (σ_0, σ_1), and then encrypts the message µ under the key κ, obtaining a ciphertext γ. The final encoding is σ′_0 = (σ_0, γ) and σ′_1 = σ_1. Therefore, encoding m-bit messages using a k-bit key under this construction leads to codewords of length |γ| + n(k), where n(k) is the codeword length of the underlying information-theoretic non-malleable code on k-bit messages. The security proof requires the encryption scheme to be non-malleable, i.e., a so-called authenticated encryption scheme, and the underlying non-malleable code to satisfy a slightly stronger non-malleability flavor known as augmented non-malleability, which is satisfied by the construction in [AKO+22]. Given the above, it is natural to compare our construction with the one obtained by combining the compiler of Aggarwal et al. and the best known information-theoretic non-malleable code.
In general, known constructions of information-theoretic non-malleable codes rely heavily on tools from pseudorandomness, such as randomness extractors (e.g., the GUV seeded extractor [GUV09]), which suffer from large hidden constants in their various parameters and from an impractical (although asymptotically polynomial) running time. With this in mind:
1. Since the code from [AKO+22] has rate 1/3, its codeword length on k-bit messages is n(k) = 3k + o(k), where the o(k) term satisfies o(k)/k → 0 when k → ∞ but hides a large constant.
2. As mentioned above, the running time of our encoding/decoding algorithms essentially only involves evaluating the block cipher, while the encoding/decoding of [AKO+22] involves objects from pseudorandomness whose running time is impractical.
3. The resulting security error would be at least the statistical security error ϵ(k) of the underlying information-theoretic non-malleable code on k-bit messages. The rate-1/3 code from [AKO+22] achieves statistical error 2^{−Ω(k / log^3 k)}, with Ω(·) hiding a big constant. Hence, even ignoring hidden constants, one needs to take the key length k to be k > λ · log^3(λ) in order to get overall (computational) security error 2^{−λ}. For usual values of the security parameter (say, λ = 256), this implies an extra multiplicative factor of at least 8^3 = 512.
Combining (3) with the last item, we conclude that, even ignoring the large hidden constants and impractical encoding/decoding procedures, achieving security error comparable to 2^{−λ} would require codewords of length larger than m + 200λ for currently reasonable values of λ. In contrast, our construction only requires codewords of length close to m + 3λ or m + 5λ, depending on how conservative one is, to achieve the same security level, and is easy to implement in practice since it only involves encoding/decoding via a single call to a block cipher. Without considering [FKM18], the best previous construction [KLT16] with concrete security required codewords of length m + 18λ (or m + 9λ + 2 log^2 λ) to obtain security 2^{−λ}, and relied on non-falsifiable assumptions.
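The arithmetic behind the factor 512 can be checked with a short computation. The sketch below only evaluates the requirement k > λ · log^3(λ) while ignoring the constant hidden in Ω(·), exactly as the text does; the helper name `min_key_length` is an illustrative choice, and base-2 logarithms are assumed.

```python
import math


def min_key_length(lam):
    """Smallest k (in bits) allowed by k > lam * log2(lam)^3, ignoring hidden constants."""
    return lam * math.log2(lam) ** 3


lam = 256
factor = math.log2(lam) ** 3          # log2(256) = 8, so the factor is 8^3
print(factor, min_key_length(lam))    # multiplicative factor and resulting key length
```

For λ = 256 this gives a factor of 512 and a key length above 131072 bits, before accounting for any hidden constants.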

Notation
We denote by [n] the set {1, . . . , n}; for any a ≤ b, we let [a, b] := {a, . . . , b}. For a string x ∈ {0, 1}^*, we denote its length by |x|. We denote sets by calligraphic letters such as X. The size of a set X is denoted by |X|. When x is chosen randomly from X, we write x ←$ X. We denote the family of permutations over a set X by P(X). Sometimes we will denote the family P({0, 1}^m), for a natural number m, with P(m). Similarly, we denote the family of keyed permutations with keys ranging over {0, 1}^k and permutation set {0, 1}^m by P(k, m). If I ⊆ [n] is a set and x ∈ S^n is a string, we define the projection x_I = (x_i)_{i∈I}. When A is a randomized algorithm, we write y ←$ A(x) to denote a run of A on input x (and implicit random coins ρ) and output y; the value y is a random variable and A(x; ρ) denotes a run of A on input x and randomness ρ.

Non-Malleable Codes in the Split-State Model
We start by giving the definition of coding schemes in the split-state model.
The non-malleability advantage Adv^nm_Σ(τ) of Σ is defined as a maximum taken over all µ_0, µ_1 ∈ M, all algorithms A_τ running in time at most τ, and all (f_0, f_1) with running time at most τ.
Informally, the goal of an adversary A_τ is to learn whether the encoded message is µ_0 or µ_1, and the only information A_τ gets to see is the result on the reconstructed message corresponding to the tampering functions (f_0, f_1) of its choice. Intuitively, by requiring Adv^nm_Σ(τ) to be small, we are saying that no adversary (running in time at most τ) is able to tell the difference between µ_0 and µ_1 with meaningful advantage.
We require that the block cipher satisfy perfect correctness, namely for all κ ∈ {0, 1}^k and µ ∈ {0, 1}^m it holds that Decrypt(κ, Encrypt(κ, µ)) = µ. We proceed to discuss the notion of tamper resilience we require for the block ciphers we use. Before presenting the main definition of this section, we introduce the notion of (samplable) s-entropic distributions.
Definition 3 (Entropic distribution). We say that a family of distributions D(1^λ, aux), parameterized by a security parameter λ and an auxiliary input aux, is s-entropic for a function s(·) if for any λ, aux, and any possible output x we have Pr[D(1^λ, aux) = x] ≤ 2^{−s(λ)}. We say that D is τ_samp-samplable if there is a randomized algorithm running in time τ_samp which generates a sample of D(1^λ, aux) given (1^λ, aux) as input.
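For a distribution given explicitly as a finite probability table, the s-entropic condition amounts to checking that the most likely outcome has probability at most 2^{−s}. The following sketch illustrates this under that assumption; the dictionary representation and the names `min_entropy` and `is_s_entropic` are hypothetical helpers for the example (the definition itself concerns families indexed by λ and aux, which are fixed here).

```python
import math


def min_entropy(dist):
    """Min-entropy in bits of a finite distribution {outcome: probability}."""
    return -math.log2(max(dist.values()))


def is_s_entropic(dist, s):
    """Check that every outcome has probability at most 2^(-s)."""
    return max(dist.values()) <= 2.0 ** (-s)


# Uniform distribution over 3-bit strings: exactly 3 bits of min-entropy.
uniform = {format(i, "03b"): 1 / 8 for i in range(8)}

# A skewed distribution: one outcome has probability 1/2, so only 1 bit.
skewed = {"a": 0.5, "b": 0.25, "c": 0.25}

print(min_entropy(uniform), is_s_entropic(uniform, 3))
print(min_entropy(skewed), is_s_entropic(skewed, 2))
```

The skewed example shows why min-entropy, rather than support size, is the relevant measure: a single heavy outcome is enough for a tampering function to guess a challenge message.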
Let O_{Π,f,D,n}(b) be an oracle depending on a block cipher Π, an arbitrary tampering function f : {0, 1}^k → {0, 1}^k, an arbitrary s-entropic distribution D supported on the message space {0, 1}^m, and a natural number n. Whenever it is clear from context, we omit the parameters Π and n from the definition of the oracle. The oracle acts as follows when interacting with some adversary A who is allowed to make multiple queries:
• It chooses a uniformly random key κ ←$ {0, 1}^k.
Note that this oracle differs from the one used in the FRK security definition from [FKM18] (discussed in Section 1.1) because it does not allow unrestricted query access to the functions Encrypt(κ, ·) and Decrypt(κ, ·). Namely, the adversary only observes outputs of these functions on messages sampled independently from some distribution with enough min-entropy. We note that in this work we only require Entropic FRK security with respect to q = 1 oracle queries and n = 1 message-ciphertext pairs. Nevertheless, we define this notion with respect to general q and n as we believe it may find applications elsewhere at this level of generality.

Our Construction
Our construction of non-malleable codes in the split-state model is based only on block ciphers and is depicted in Fig. 1. The encoding of a message µ is (κ, γ), where κ is a random block cipher key and γ is an encryption of the string µ∥ρ for a random λ-bit string ρ. The decoding algorithm simply decrypts the ciphertext and discards the last λ bits of the resulting plaintext.
[Fig. 1: The split-state coding scheme Σ = (Encode, Decode) built from a block cipher Π = (Encrypt, Decrypt).]
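The encoding and decoding procedures described above can be sketched as follows. This is a minimal illustration, not the scheme of Fig. 1 verbatim: the variable-length 4-round Feistel network is a hypothetical toy stand-in for the block cipher Π, the choices λ = 128 bits and k = 2λ are assumptions made only so that the codeword has length m + 3λ, and tamper handling (such as error symbols) is omitted.

```python
import hashlib
import secrets

LAMBDA_BYTES = 16             # lambda = 128 bits (illustrative assumption)
KEY_BYTES = 2 * LAMBDA_BYTES  # k = 2 * lambda, so the codeword is m + 3 * lambda


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key, i, half, n):
    # Toy round function with n bytes of output (not a vetted design).
    return hashlib.shake_256(key + bytes([i]) + half).digest(n)


def encrypt(key, block):
    # Toy balanced Feistel network over even-length byte strings.
    assert len(block) % 2 == 0
    h = len(block) // 2
    left, right = block[:h], block[h:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right, h))
    return left + right


def decrypt(key, block):
    # Inverse of encrypt: undo the rounds in reverse order.
    assert len(block) % 2 == 0
    h = len(block) // 2
    left, right = block[:h], block[h:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left, h)), left
    return left + right


def encode(mu):
    # Left state: a fresh key kappa. Right state: encryption of mu || rho.
    kappa = secrets.token_bytes(KEY_BYTES)
    rho = secrets.token_bytes(LAMBDA_BYTES)
    gamma = encrypt(kappa, mu + rho)
    return kappa, gamma


def decode(kappa, gamma):
    # Decrypt and discard the last lambda bits of the plaintext.
    return decrypt(kappa, gamma)[:-LAMBDA_BYTES]


mu = b"sixteen byte msg"
kappa, gamma = encode(mu)
print(decode(kappa, gamma) == mu, len(kappa) + len(gamma) - len(mu))
```

The second printed value is the redundancy in bytes, which in this sketch equals 3 · `LAMBDA_BYTES`: the key contributes 2λ bits and the random string ρ contributes λ bits.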
The theorem below characterizes the non-malleability of our code in terms of the security of the underlying block cipher.
Theorem 1. Let Π be a block cipher and let τ′ = 2τ + O(1). Then, the non-malleability advantage of the split-state coding scheme Σ described in Fig. 1 against adversaries running in time τ is bounded in terms of the entropic FRK advantage of Π against adversaries running in time τ′.
We give the main ideas behind the proof of the theorem. Given a tampering function (f_0, f_1) against the non-malleable code, our reduction uses f_0 to define the related key κ̃ in the Entropic FRK security experiment, and uses f_1 to compute a ciphertext γ̃ that can be decrypted using its oracle access to the tampered decryption oracle, thus providing a decoding of the codeword. We can then replace the real codeword (κ, Encrypt(κ, µ∥ρ)) with a fake codeword (κ, ω), where ω is a uniformly random string. Clearly, such a fake codeword is independent of the encoded message. We now present the formal proof.
3. Query the oracle O_frk with (dec, γ̃), thus receiving the message µ̃∥ρ̃. Note that f_0 is used implicitly in this step, since the oracle O_frk attempts to decrypt γ̃ using the related key f_0(κ).
Observe that B computes f_1, runs A, and performs some constant-time operations. Therefore, its running time is bounded by τ + τ_tamp + O(1) ≤ τ′. Also notice that, by design of the FRK experiment, when both κ̃ = κ and γ̃ = γ the adversary B sets the message µ̃ to be the output of the decryption oracle, which is equal to ⋄. Finally, we observe that the simulation performed by B is perfect. We can verify this by inspection. Indeed, when B is interacting with O_frk(0), by definition of the security game for FRK we have that γ = Encrypt(κ, µ∥ρ) as in experiment Exp^nm. On the other hand, when B is interacting with O_frk(1), we have that γ = π(µ∥ρ) as in experiment Exp_1. Therefore, the advantage in distinguishing Exp^nm from Exp_1 is bounded by the advantage of B in the entropic FRK experiment. Finally, we show that the distribution of Exp_1(µ_0, µ_1, b) is the same for all b ∈ {0, 1}. In fact, the only value that might depend on the bit b in the experiment Exp_1(µ_0, µ_1, b) is the random variable γ = π(µ_b∥ρ). However, such a random variable is uniformly distributed over {0, 1}^m.

Entropic FRK Security in the Ideal Cipher Model
In this section, we show that entropic FRK security holds in the ideal cipher model for a natural family of cipher-dependent tampering functions (generalizing Bernstein's attack, as described in Section 1). Since Bernstein's attack applies equally well in the ideal cipher model, this result stands in sharp contrast with the fact that the notion of FRK security considered in [FKM18] is unachievable even in the ideal cipher model. Before we proceed to state and prove the result, we formally define what is meant by entropic FRK security in the ideal cipher model w.r.t. a tampering family. The definition is almost identical to the one in Section 3, except that both the adversary and the tampering function are given oracle access to a random keyed permutation Π ←$ P(k, m). The latter essentially means that a random permutation is chosen for every possible key, and the attacker, as well as the tampering function, can query such permutations in the forward direction (via queries (enc-ideal, κ′, µ)) and in the backward direction (via queries (dec-ideal, κ′, γ)). An identical formalization was used in [AFPW11].
As discussed earlier, our security analysis in the ideal cipher model only holds w.r.t. a large class of cipher-dependent tampering functions which we name oracle-independent. The latter roughly means that f^Π(κ) does not output a key κ′ which was used as part of an (enc-ideal, κ′, ·) or a (dec-ideal, κ′, ·) query. A similar notion (which was probabilistic in nature and considered multiple tampering functions) was considered in [AFPW11].
We say that f is oracle-independent if for any κ and Π we have f^Π(κ) ∉ Q_{f,κ,Π}. Moreover, we denote by F^* the set of all oracle-independent tampering functions.
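The membership condition can be illustrated on a toy ideal cipher that records which keys a tampering function queries. Everything below is a sketch under stated assumptions: Q_{f,κ,Π} is taken to be the set of keys appearing in the ideal-cipher queries of f^Π(κ), as the informal description above suggests; the 8-bit parameters and the names `ToyIdealCipher`, `bernstein_tamper`, `self_querying_tamper`, and `violates_oracle_independence` are hypothetical. The check inspects a single run for one κ and one sampled Π, whereas the definition quantifies over all κ and Π.

```python
import random

K_BITS = 8  # toy key length (assumption; far too small for security)
M_BITS = 8  # toy block length (assumption)


class ToyIdealCipher:
    """Independent random permutation per key, sampled lazily; logs queried keys."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._perms = {}
        self.queried_keys = set()

    def _perm(self, key):
        if key not in self._perms:
            table = list(range(2 ** M_BITS))
            self._rng.shuffle(table)
            self._perms[key] = table
        return self._perms[key]

    def enc_ideal(self, key, mu):
        self.queried_keys.add(key)
        return self._perm(key)[mu]

    def dec_ideal(self, key, gamma):
        self.queried_keys.add(key)
        return self._perm(key).index(gamma)


def bernstein_tamper(cipher, kappa):
    # Bernstein-style f: first k bits of the encryption of 0^m under kappa.
    # It only queries the cipher on the original key kappa.
    return cipher.enc_ideal(kappa, 0) >> (M_BITS - K_BITS)


def self_querying_tamper(cipher, kappa):
    # Rejection sampling: pick a tampered key whose encryption of 0^m has the
    # same first bit as kappa. The output key is itself queried by f.
    for candidate in range(2 ** K_BITS):
        if cipher.enc_ideal(candidate, 0) >> (M_BITS - 1) == kappa >> (K_BITS - 1):
            return candidate
    return 0


def violates_oracle_independence(tamper, kappa, seed=0):
    # True if, in this run, the tampered key is among the keys queried by f.
    cipher = ToyIdealCipher(seed)
    tampered = tamper(cipher, kappa)
    return tampered in cipher.queried_keys


print(violates_oracle_independence(self_querying_tamper, kappa=5))
```

The rejection-sampling function always violates the condition in this sketch, because the key it outputs is one it has just queried.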
Theorem 2. For any parameters q O , q Π , k, m, s, n such that q Π ≤ 2 k/4 and q O ≤ 2 m−1 we have
Proof. At a high level, we begin by introducing a hybrid experiment H(b) parameterized by a challenge bit b. Then, we show that the advantage of any given adversary A in distinguishing between H(0) and H(1) is appropriately close to the advantage of the same adversary in distinguishing between the original Entropic FRK security experiment described in Definition 5 with challenge bit b = 0 and b = 1, respectively. Finally, we argue that the advantage in distinguishing between the experiments H(0) and H(1) is small, which concludes the proof.
Let A be an adversary, f be a tampering function, and D be an s-entropic distribution such that A asks at most q O oracle queries to O frk , f is oracle-independent, A and f ask cumulatively at most q Π oracle queries to Π, and A, f , and D maximize the advantage in the FRK security experiment, i.e., We now define the hybrid experiment H(b) where we run A, f , and D in an experiment that is similar to the original FRK experiment with challenge bit b but where the hybrid experiment aborts and outputs ⊥ if certain events hold. We describe the hybrid experiment in the following and in pseudo-code in Fig. 2. Specifically, the oracles O frk and Π in the hybrid H(b) are modified to raise flags flg 1 , flg 2 , and flg 3 . The hybrid experiment returns ⊥ if at least one of the flags is set to 1 during the experiment, otherwise it returns the response bit output by A. Notice that in the pseudo-code description we assume that all the variables are shared between the hybrid experiment and the two oracles. Thus, for example, the flags are initially set to 0 by the hybrid experiment and might be updated (and set to 1) at each invocation of the oracles by the adversary. For i = 1, 2, 3 we define the event E i to be the event that the flag flg i is set to 1. The events are as follows:
• Event E 3 . The adversary A finds a collision. Namely, it holds that f Π (κ) = κ and either the adversary sends a query of the form (enc, x) with x ∉ {µ * 1 , . . . , µ * n } and receives Encrypt(κ, x) ∈ {γ * 1 , . . . , γ * n }, or sends a query of the form (dec, y) with y ∉ {γ * 1 , . . . , γ * n } and receives Decrypt(κ, y) ∈ {µ * 1 , . . . , µ * n }.
Note that the events E i may at first sight have different probabilities in H(0) and H(1). In fact, as we shall argue, E 1 and E 2 have the same probability of happening in H(0) and H(1), while E 3 does not. To avoid overloading the notation, we do not explicitly write whether we are referring to event E i in H(0) or H(1), as this will always be clear from context. By inspection of the hybrid experiment we can see that if the flags are not raised, namely if (¬E 1 ∧ ¬E 2 ∧ ¬E 3 ), then the hybrid experiment and the FRK experiment are exactly the same. In fact, the changes introduced in the hybrid do not influence the outputs of the oracles. Thus it follows that for b ∈ {0, 1} we have We proceed to bound the three rightmost terms appropriately. First, notice that for any choice of Π, index j, and key κ ′ , the probability that the i-th query of f Π is of the form (enc-ideal, κ ′ , µ * j ) or (dec-ideal, κ ′ , γ * j ) is at most 2 −s(λ) since D is s-entropic and the message samples from D are i.i.d. for different j. Taking a union bound over all j ∈ [n] and the q Π queries made by f Π , it follows that Pr[E_1] ≤ n · q_Π · 2^{−s} (this is Eq. (6)). Notice that the event E 1 is independent of the challenge bit b, i.e., Pr[E_1^0] = Pr[E_1^1], where the superscript denotes the challenge bit. In fact the event E 1 depends only on f Π (κ) and it is independent of the queries made by A to the oracle O frk (b).
We now move to bound the probability of the event (E 2 |¬E 1 ). To bound this event we make use of our assumption that the tampering function is oracle-independent (recall Definition 6). Since Π is an ideal cipher, the tuples (µ * j , γ * j ) for any j are independent from κ. Moreover, conditioned on ¬E 1 such tuples are independent of the tampered key κ̃, because κ̃ = f Π (κ) and f Π (κ) has not queried κ̃ on µ * j for any j. Additionally, since f is oracle-independent and Π is an ideal cipher, the queries of A to O frk and the oracle answers are independent of κ̃ and κ. This means that the key κ is uniformly distributed over a subset given the values (µ * j , γ * j ) j∈[n] . Also, note that if the i-th query of A to Π features a key κ ′ different from κ and a message/ciphertext of its choice, then, because Π is an ideal cipher, it only learns whether κ ′ ∈ {κ, κ̃} and whether κ ′ ∈ (f Π ) −1 (κ̃). To see the latter, namely that the adversary can learn whether κ ′ is in the pre-image of the tampered key or not, notice that the tampering function could set κ̃ according to an arbitrary predicate that depends on the queries it makes to the ideal cipher on the original key κ. Thus, by querying on κ ′ , the adversary could verify if the predicate is satisfied or not by κ ′ . For any key κ ′ we set Λ(κ ′ ) to be the event that is true if and only if We begin by bounding the leftmost term in the right hand side of Eq. (7). More precisely, we show that We can assume that Pr [Λ(κ)|¬E 1 ] > q Π ·2 −k/2 , as otherwise Eq. (8) holds trivially. For each κ and Π such that Λ(κ) holds there are at most 2 k/2 values κ ′ such that f Π (κ) = f Π (κ ′ ), and furthermore Λ(κ ′ ) = 1 for all such κ ′ . This comes readily by the definition of the event Λ(κ). We are interested in bounding the probability that A queries Π on κ for the first time in the i-th query conditioned on Λ(κ) holding. Note that there are exactly 2 k · Pr [Λ(κ)|¬E 1 ] possible values for the key κ conditioned on Λ(κ) holding.
Furthermore, each previous query to κ ′ ≠ κ such that Λ(κ ′ ) holds rules out at most 2 k/2 key values by definition of Λ(κ ′ ). More precisely, it rules out κ ′ along with all values in the preimage (f Π ) −1 (κ ′ ), which are fewer than 2 k/2 . Therefore, the probability that A queries Π on κ for the first time in the i-th query conditioned on Λ(κ) holding is at most where the leftmost inequality uses our assumption that Pr [Λ(κ)|¬E 1 ] ≥ q Π · 2 −k/2 . Taking a union bound over all q Π queries yields Eq. (8).
We now bound the rightmost term in the right hand side of Eq. (7). Conditioned on the event ¬Λ(κ), we consider the worst-case scenario where the adversary A knows the value κ̃ and that all its queries are in (f Π ) −1 (κ̃). Since ¬Λ(κ) holds, we know that there are at least 2 k/2 keys κ ′ such that f Π (κ ′ ) = f Π (κ). Moreover, each query to such a key κ ′ ≠ κ only rules out κ ′ itself, and κ is still uniformly distributed over the remaining set of keys. Therefore, the probability that A queries Π on κ for the first time in the i-th query is at most where the last inequality uses our hypothesis that q Π ≤ 2 k/4 . Taking a union bound over all q Π queries shows that Combining this inequality with Eqs. (7) and (8) yields as desired. Notice that the event E 2 is independent of the challenge bit b, namely the probability of E 2 is the same in the distributions defined by H(0) and H(1). In fact, the tuples (µ * j , γ * j ) j∈[n] and the key κ are identically distributed in H(0) and H(1) given the full views before the first query of A that triggers the event E 2 .
Finally, we bound the probability of the event (E 3 |¬E 2 ). The only way to find collisions conditioned on ¬E 2 is for the adversary to query O frk . Notice that if b = 0 then the probability of E 3 is 0 because Π is a keyed permutation and we must have f Π (κ) = κ for E 3 to hold. We now focus on the case b = 1. In this case, for any choice of (µ * i , γ * i ) i∈[n] the probability that the i-th distinct query of the form (enc, µ) with µ ∉ {µ * 1 , . . . , µ * n } yields π κ (µ) ∈ {γ * 1 , . . . , γ * n } is at most n/(2^m − i) ≤ n · 2^{−m+1}, because π κ (µ) is uniformly random and distinct from the answers to all the previous encryption queries. The last inequality uses our hypothesis that q O ≤ 2 m−1 , which implies 2^m − i ≥ 2^{m−1} for every i ≤ q O . An analogous argument shows that the probability that the i-th distinct query of the form (dec, γ) with γ ∉ {γ * 1 , . . . , γ * n } yields π −1 κ (γ) ∈ {µ * 1 , . . . , µ * n } is also at most n/(2^m − i) ≤ n · 2^{−m+1}. Combining these bounds with a union bound over all q O queries to O frk yields From Eqs. (6), (9) and (10) combined with Eq. (5) it follows that It remains to upper bound the advantage in distinguishing between H(0) and H(1). We claim that We first argue that Pr[H(0) = 1 | Ē] = Pr[H(1) = 1 | Ē]. Conditioned on ¬E 1 , we have that κ̃, when κ̃ ≠ κ, is independent of (µ * i , γ * i ) i∈[n] even fixing Π, and thus the queries to O frk made by A are independent of (µ * i , γ * i ) i∈[n] (and thus of b). Conditioned on ¬E 2 , the joint distribution of (µ * i , γ * i ) i∈[n] and the outputs of the queries to Π is independent of b, because Π is an ideal cipher and none of the queries by the adversary A to Π intersect with the queries made by the oracle O frk to Π. Conditioned on ¬E 3 , when f Π (κ) = κ the joint distribution of (µ * i , γ * i ) i∈[n] and the outputs of the queries to O frk is independent of the challenge bit b, because both when b = 0 and b = 1 these values are uniformly random from the set {0, 1} m and distinct. Therefore, Eq. (12) follows if we show that Pr where we stress that the first probability in the left hand side is over the probability space induced by H(0), while the second probability is over the probability space induced by H(1). Notice that, as we have already argued, the events E 1 and E 2 are independent of the challenge bit b (and so they have the same probabilities in both probability spaces), while E 3 is not. As a result, we have Pr as desired. The second equality holds because Pr H(0) [E 3 |¬E 1 ∧ ¬E 2 ] = 0 as argued above, and the last inequality follows from Eq. (10). Combining Eqs. (4), (11) and (12) with the triangle inequality concludes the proof.

Setting Parameters
We provide two possible parameter instantiations for our coding scheme. The first instantiation, based on arguments from [FKM18], leads to codewords of length close to m + 3λ and non-malleability advantage comparable to τ tamp · 2 −λ . A more conservative instantiation, based on Theorem 2, leads to codewords of length close to m + 5λ for the same non-malleability advantage. In both cases, we achieve codewords significantly shorter than the state of the art. Fehr, Karpman, and Mennink [FKM18, Remarks after Definitions 3 and 4, and Section 6] argue that a good cipher, such as AES-128 and SHACAL-2, with keylength k should have advantage close to τ tamp · 2 −k/2 against fixed related-key attacks, with τ tamp denoting the runtime of the tampering function. Although, as we have shown, this argument breaks down with respect to their fixed related-key security assumption, we believe that the cipher-dependent attacks which break their assumption are necessarily contrived. Therefore, since our weaker entropic fixed related-key security assumption precludes the relevant attacks, we find it reasonable to assume that good ciphers Π with keylength k satisfy Adv frk Π (1, τ tamp , k, s, 1) ≈ τ tamp · (2 −k/2 + 2 −s ).
The additional rightmost term τ tamp · 2 −s stems from the discussion surrounding Eq. (1): the probability that the adversary learns encryptions (under the true key κ) of messages which were also queried by the tampering function is at most τ tamp · 2 −s via a union bound and the fact that padding the original message with an s-bit random string yields an s-entropic distribution. Therefore, taking into account Theorem 1 and Eq. (13), in order to obtain non-malleability advantage Adv nm Σ (τ tamp ) ≈ τ tamp · 2 −λ it is enough to set k ≈ 2λ and s ≈ λ in our coding scheme Σ described in Fig. 1. This leads to codewords of overall length m + k + s ≈ m + 3λ.
As an alternative, more conservative instantiation method, we may instead consider the advantage upper bound provided by Theorem 2 in the ideal cipher model. This result states that an ideal cipher with keylength k has entropic fixed related-key security advantage roughly τ tamp · (2 −k/4 + 2 −s ) when the message is padded with an s-bit random string. We extrapolate that a good cipher should satisfy this property in practice, and remark that it is our belief that this bound is loose and the true advantage in the ideal cipher model should be τ tamp · (2 −k/2 + 2 −s ). We leave it as an interesting open problem to prove this conjecture. Similarly to the previous paragraph, taking into account Theorem 1 and Eq. (14), in order to obtain non-malleability advantage Adv nm Σ (τ tamp ) ≈ τ tamp · 2 −λ it is enough to set k ≈ 4λ and s ≈ λ in our coding scheme Σ described in Fig. 1. This more conservative instantiation leads to codewords of overall length m + k + s ≈ m + 5λ.
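The two instantiations can be compared with a short calculation. The Python sketch below is an illustration only: the function name `instantiate` and its return format are hypothetical, and it simply evaluates the approximate advantage expressions quoted above (per unit of τ tamp ) with the approximations k ≈ 2λ or k ≈ 4λ and s ≈ λ taken as exact equalities.

```python
import math


def instantiate(m, lam, conservative=False):
    """Evaluate the parameter choices from the text for message length m and security parameter lam.

    Standard:      k = 2*lam, advantage per unit of tau_tamp ~ 2^(-k/2) + 2^(-s).
    Conservative:  k = 4*lam, advantage per unit of tau_tamp ~ 2^(-k/4) + 2^(-s).
    In both cases the padding length is s = lam.
    """
    key_exponent_divisor = 4 if conservative else 2
    k = key_exponent_divisor * lam
    s = lam
    # log2 of the advantage expression, without the tau_tamp factor
    log2_adv = math.log2(2.0 ** (-k / key_exponent_divisor) + 2.0 ** (-s))
    return {
        "k": k,
        "s": s,
        "codeword_length": m + k + s,
        "log2_advantage_per_tamp_step": log2_adv,
    }


standard = instantiate(m=1024, lam=128)
conservative = instantiate(m=1024, lam=128, conservative=True)
print(standard["codeword_length"], conservative["codeword_length"])
```

For a 1024-bit message and λ = 128, the sketch gives codewords of 1408 bits (m + 3λ) and 1664 bits (m + 5λ). In both cases the two terms of the advantage expression are equal, so the per-step advantage is 2 · 2^{−λ}, which matches the target τ tamp · 2 −λ up to a factor of two.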

Conclusions and Future Directions
We have given a construction of non-malleable codes in the split-state model with codeword length m+3λ, where m is the message size and λ is the security parameter. Our construction involves a single call to a block cipher, and can be proven secure under a form of related-key security which we named entropic FRK security. Previous work either achieved rather worse codeword length under non-falsifiable assumptions [KLT16], or a similar (in fact, slightly better) codeword length under a much stronger form of related-key security [FKM18] that unfortunately does not hold in the presence of cipher-dependent tampering attacks. In contrast, entropic FRK security holds unconditionally in the ideal cipher model w.r.t. a large class of oracle-independent tampering functions (which includes cipher-dependent tampering attacks which break the assumption from [FKM18]).
Natural directions for future work include reducing the codeword length even further, for example through a better analysis of the entropic FRK assumption in the ideal cipher model or through a different set of assumptions (indeed, while the assumptions of [FKM18] do not hold against cipher-dependent tampering functions, the adaptation of Bernstein's attack [Ber10] to the context of non-malleable codes does not seem to trivially violate the non-malleability of Fehr, Karpman, and Mennink's construction) and establishing that entropic FRK security holds in the ideal cipher model even for tampering functions that are not oracle-independent. It would also be interesting to extend our techniques to obtain practical non-malleable secret sharing [GK18] with very short shares based on related-key secure block ciphers.