Breaking HALFLOOP-24

HALFLOOP-24 is a tweakable block cipher that is used to protect automatic link establishment messages in high frequency radio, a technology commonly used by government agencies and industries that need highly robust long-distance communications. We present the first public cryptanalysis of HALFLOOP-24 and show that HALFLOOP-24, despite its key size of 128 bits, is far from providing 128-bit security. More precisely, we give attacks for ciphertext-only, known-plaintext, chosen-plaintext and chosen-ciphertext scenarios. In terms of their complexities, most of them can be considered practical. However, in the real world, the amount of available data is too low for our attacks to work. Our strongest attack, a boomerang key-recovery, finds the first round key with fewer than $2^{10}$ encryption and decryption queries. In conclusion, we strongly advise against using HALFLOOP-24.


Introduction
Protocols for automatic link establishment (ALE) were developed during the 1980s to simplify communications via high frequency (HF) radio. The current ALE protocols are described in the US standards MIL-STD-188-141 and FED-STD-1045 and by NATO in STANAG 4538. To prevent spoofing and to protect the transmitted data, ALE includes an optional linking protection mode. Second (2G) and third generation (3G) ALE use the same block cipher, named SoDark, which operates on 24- and 48-bit states under a 56-bit key [JKF+12]. In [Dan21], Dansarie described several practical key-recovery attacks against the 8-round version of this cipher, corresponding to the exact number of rounds standardized for 2G ALE. Moreover, a key length of 56 bits is far too short to offer adequate security against modern computers and GPUs. For that reason, a replacement for SoDark was specified in 2017. This new block cipher, named HALFLOOP, has been standardized in the latest revision of MIL-STD-188-141 with a key size of 128 bits and block sizes of 24, 48 and 96 bits [DoD17].
[...] particles in the upper atmosphere. Through this effect, two radios can communicate across very large distances without any external infrastructure. This makes HF radio attractive to users who need communications to work even when conventional infrastructure is unavailable, such as after disasters, in war and in the polar regions where geostationary satellites cannot be reached. Examples of users include the military, diplomatic services, disaster management agencies, humanitarian non-governmental organizations, and the air and maritime industries. Large HF antennas are a common sight on embassies throughout the world.
Skywave propagation is heavily dependent on a number of constantly changing factors, such as season, time of day and space weather. Additionally, transmitter and receiver locations as well as technical characteristics of the radio equipment also affect propagation. For that reason, HF radio has historically required trained and experienced operators. The first attempts at reducing this dependence on operators were made during the 1980s, when the first ALE systems were developed. Since different HF radio manufacturers developed their own standards, they were not interoperable. This was solved with the introduction of 2G ALE, which became a US military standard in 1988. 3G ALE was introduced in 1999.
Since HALFLOOP is a tweakable block cipher and the tweak is simply xored to the key, there is a generic related-tweak key-recovery attack. For HALFLOOP-24, this means that we can recover the key with time, data and memory complexity $2^{64}$. To do so, we would simply query a message for all $2^{64}$ tweaks, store the resulting ciphertexts in a table and then brute force the 64 bits of key not influenced by the tweak. However, as we show in this work, there are significantly more powerful attacks against HALFLOOP-24. Our attacks rely on differential cryptanalysis, which is one of the most powerful cryptanalysis techniques. It was proposed by Biham and Shamir in [BS91] and has generated much attention since then. The aim of differential cryptanalysis is to study the propagation of differences through a cipher to highlight unexpected behaviors compared to a random permutation. Typically, it concerns the existence of a differential characteristic with a high probability, but we may also search for impossible transitions [BBS99]. In particular, HALFLOOP-24 is very weak against boomerang attacks [Wag99] and there exists a related-tweak boomerang distinguisher of probability 1 against its full version.

Results
The attacks presented in this paper break HALFLOOP-24. However, because the amount of data is extremely limited in the real world, we do not expect that these attacks can be used in practice. Table 1 shows a summary of the attacks and their complexities. Note that the complexity of our ciphertext-only attack heavily depends both on how callsigns are assigned and on the traffic intensity in the attacked network. The complexities we provide in Table 1 assume that callsigns are randomly assigned and that the rate of traffic is very high. The complexities improve significantly as the randomness of callsigns decreases. Ignoring the actual complexities, ciphertext-only attacks are the simplest to mount since an attacker only needs a radio to eavesdrop on ciphertexts. In the case of ALE, plaintexts consist mostly of callsigns, which can be obtained with moderate effort. Chosen-plaintext and chosen-ciphertext attacks require (temporary) access to a radio that holds the desired key. All attacks are significantly more efficient than brute force. Therefore, we strongly advise against the use of HALFLOOP-24. We implemented and verified our chosen-plaintext and chosen-ciphertext attacks. The source code is available at https://doi.org/10.5281/zenodo.7043329.
Structure of the paper. The next section gives the HALFLOOP-24 specification and clarifies related notation. In Section 3, we give an attack against HALFLOOP-24 in a theoretical model and in Section 4 we describe how we can adapt this attack for the real world. After that, Section 5 presents our boomerang key-recovery before we conclude the paper in Section 6.

Description of HALFLOOP-24
Here, we briefly describe the tweakable block cipher HALFLOOP-24. HALFLOOP-24 operates on 24-bit blocks and features a 128-bit key and a 64-bit tweak. We write $c = \text{HL-24}_k(p, t)$ to denote the encryption of a plaintext $p \in \mathbb{F}_2^{24}$ under the key $k \in \mathbb{F}_2^{128}$ and the tweak $t \in \mathbb{F}_2^{64}$, which results in the ciphertext $c \in \mathbb{F}_2^{24}$. HALFLOOP-24 borrows many operations from AES. The state consists of three bytes, arranged in a $3 \times 1$ matrix (instead of $4 \times 4$ for AES). The SBox is the same as for AES. Instead of ShiftRows, the second byte is rotated by six and the third byte is rotated by four bits to the left. This operation is called RotateRows. For MixColumns, we multiply the state with $c(x) = x^2 + 2x + 9$ modulo $x^3 + 1$, which can be seen as a $3 \times 3$ MDS matrix over $\mathbb{F}_{2^8}$. Note that [DoD17] gives $x^4 + 1$ as modulus, which is apparently a typing error since the test vectors indeed work for $x^3 + 1$. Finally, as for AES-128, the number of rounds is ten and the last round does not involve the MixColumns operation.
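The two linear operations described above can be sketched in a few lines of Python. This is a minimal illustration, not the reference implementation: it assumes the AES field polynomial $x^8 + x^4 + x^3 + x + 1$, assumes that the topmost state byte corresponds to the constant coefficient when the state is read as a polynomial, and uses the function names `rotate_rows`, `mix_columns` and `linear_layer`, which are our own.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8), reduced by the AES polynomial 0x11B (assumed)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def rotl8(x: int, n: int) -> int:
    """Rotate a byte to the left by n bits (0 < n < 8)."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def rotate_rows(state):
    """Leave byte 0 alone, rotate byte 1 by six bits and byte 2 by four bits to the left."""
    return [state[0], rotl8(state[1], 6), rotl8(state[2], 4)]


# Circulant matrix of c(x) = x^2 + 2x + 9 modulo x^3 + 1.
# The row/column orientation is an assumption about the byte ordering.
MIX = [[9, 1, 2],
       [2, 9, 1],
       [1, 2, 9]]


def mix_columns(state):
    """Multiply the three-byte state by the circulant matrix over GF(2^8)."""
    return [
        gf_mul(row[0], state[0]) ^ gf_mul(row[1], state[1]) ^ gf_mul(row[2], state[2])
        for row in MIX
    ]


def linear_layer(state):
    """RotateRows followed by MixColumns."""
    return mix_columns(rotate_rows(state))


print([hex(b) for b in linear_layer([0x01, 0x00, 0x00])])
```

Under these assumptions, every nonzero one-byte input difference is spread to all three output bytes, which is the diffusion property that the differential arguments in this paper rely on.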

Key Schedule
The key schedule is the same as for AES-128 with one crucial difference. For HALFLOOP-24, there is a 64-bit tweak $t$ (called seed) that is xored to the master key $k$ before the key schedule takes place. Furthermore, we only need 24-bit round keys and hence a far shorter expansion. We depict the key schedule in Fig. 1. The internal function $g$ is the same as for AES, i.e. it applies the AES SBox to all four input bytes, rotates the results by one byte and adds a round constant to the most significant byte. For more details, we refer to [DR02]. Note that, if we consider two tweaks $t_1$ and $t_2$ for the same key $k$, then we obtain different round keys. However, the differences in the round keys depend only on the difference $\Delta t = t_1 \oplus t_2$, except for the last byte of $rk_{10}$. Hence, we often transform a round key for a tweak $t$ to the round key we would obtain for the all-zero tweak and call this a normalized round key $\widetilde{rk}$.
Example. To make the notation, especially the ordering of bytes, clear, we give a test vector from [DoD17] as an example. Consider the key $k$, tweak $t$ and plaintext $p$ = 0x010203 of that test vector. The first round key is 0x7f45cd. When we arrange $p$ into the state matrix, 0x01 is the first, i.e. topmost entry. We often refer to it as $p[0{:}8]$ or the most significant byte of $p$.
Analogously, $p[8{:}16]$ = 0x02 and $p[16{:}24]$ = 0x03. We use the same notation for (round) keys and tweaks, i.e. we always count bits from the left-hand side, inspired by the notation in the Python programming language. Below we give the first steps of $\text{HL-24}_k(p, t)$, i.e. the encryption of plaintext $p$ with key $k$ and tweak $t$. For convenience, we combine the RotateRows and MixColumns steps into a single linear layer $L$ in the following.
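Since the slicing notation is borrowed from Python, it can be mirrored by a small helper. The function name `bits` and its `width` parameter are our own choices for illustration.

```python
def bits(value: int, start: int, stop: int, width: int = 24) -> int:
    """Return value[start:stop], counting bits from the left of a width-bit word."""
    return (value >> (width - stop)) & ((1 << (stop - start)) - 1)


p = 0x010203
# The three state bytes, from the topmost (most significant) to the last.
print(hex(bits(p, 0, 8)), hex(bits(p, 8, 16)), hex(bits(p, 16, 24)))
```

The same helper applies to 64-bit tweaks by passing `width=64`.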

A Related-Tweak Attack
Our attack builds upon some simple observations. First, the state size of HALFLOOP-24 is only 24 bits. Therefore, it is not too unlikely that multiple round key differences cancel each other. Second, except for the least significant byte of the last round key, a difference in the tweak only has linear influence on the round key differences. Hence, we can easily craft tweaks which skip multiple rounds in a differential attack. Lastly, 10 rounds do not seem adequate given that there is a simple meet-in-the-middle attack against the full cipher. This is possible because guessing 5 round keys is easier than guessing the complete key.
We visualize the influence of the tweak difference on the round keys in Fig. 2. There, an empty white block indicates that the round key is not influenced by this part of the tweak. A gray block shows that the difference in the byte given in the box is given by the byte of the tweak difference in the head of that column. When there is a light and a dark gray block with the same byte number, the round key difference is the xor of both. Notice, we can use Fig. 2 [...] On the other hand, the visualization makes clear that differences in the least significant byte of the tweak only influence $\Delta rk_2$, $\Delta rk_7$, $\Delta rk_9$ and $\Delta rk_{10}$. But beware that the difference in the least significant byte of $rk_{10}$ is not visualized there. As depicted in Fig. 1, the last byte of the round key is derived as the xor of the second byte of $rk_5$ and the first output byte of the $g$ function applied to the bytes of $rk_9$ and $rk_{10}$. Hence, it does not only depend on the difference in the tweak, but on the concrete values of the tweaks as well as on the value of the ninth round key $rk_9[16{:}24]$. Since this itself depends on the choice of the tweak, to avoid ambiguity, we use $\widetilde{rk}_9$ to denote the ninth round key obtained by running the key schedule with the all-zero tweak.
Then, for two tweaks $t_1, t_2$, we have
$$\Delta rk_{10}[16{:}24] = \Delta t[0{:}8] \oplus S\big(\widetilde{rk}_9[16{:}24] \oplus t_1[8{:}16] \oplus t_1[40{:}48]\big) \oplus S\big(\widetilde{rk}_9[16{:}24] \oplus t_2[8{:}16] \oplus t_2[40{:}48]\big). \quad (1)$$
Now, more concretely, consider a plaintext $x \in \mathbb{F}_2^{24}$, a tweak $t \in \mathbb{F}_2^{64}$ and a difference $\delta \in \mathbb{F}_2^{8}$. When we encrypt $x$ with tweak $t$ under a key $k \in \mathbb{F}_2^{128}$ and do the same for $x' = x \oplus 0^{16}\|\delta$ and $t' = t \oplus 0^{16}\|\delta\|0^{40}$, we observe the following differences in the round keys:
$$\Delta rk_0 = 0^{16}\|\delta, \quad \Delta rk_6 = \delta\|0^{16}, \quad \Delta rk_7 = 0^{8}\|\delta\|0^{8}, \quad \Delta rk_8 = 0^{16}\|\delta, \quad \Delta rk_{10} = \delta\|0^{16},$$
and no difference in all other round keys. The differences in the plaintexts and in $rk_0$ cancel and hence, there is no state difference after the addition of $rk_0$. It stays like this until we add $rk_6$. Now, our hope is that the difference in the state introduced by the difference in $rk_6$ is canceled by the differences in $rk_7$ and $rk_8$ so that after round eight there is no state difference again. The linear layer $L$ is MDS. Hence, the one-byte difference $\delta$ in $rk_6$ is transitioned to a three-byte difference. The same holds in the backward direction for the difference in $rk_8$. Now, after the addition of $rk_7$ and the SBox layer, those differences must be the same for the cancellation to take place. Therefore, the probability for cancellation is about $2^{-24}$. Further notice that this cancellation happens if and only if we observe the ciphertext difference $\delta\|0^{16}$, because there is no difference in $rk_9$ and difference $\delta\|0^{16}$ in $rk_{10}$. Pairs that lead to such a cancellation can only stem from a rather small set of middle states, which eventually allows us to restore the last three round keys.
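The way a tweak difference in the third byte travels through the round keys can be checked with a short sketch of the key schedule. The code below is an illustration under assumptions rather than the standardized implementation: it assumes that the tweak is xored onto the first eight key bytes, that the 24-bit round keys are consecutive three-byte chunks of the AES-128 expansion, and it uses a hypothetical key. The names `round_keys` and `SBOX` are ours.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def _sbox_entry(a: int) -> int:
    """AES SBox: field inversion (a^254) followed by the affine map."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, a)
    out = inv
    for i in range(1, 5):
        out ^= ((inv << i) | (inv >> (8 - i))) & 0xFF
    return out ^ 0x63


SBOX = [_sbox_entry(a) for a in range(256)]
RCON = [0x01, 0x02]  # only two applications of g are needed for 33 bytes


def round_keys(key: bytes, tweak: bytes):
    """Return the eleven 24-bit round keys rk_0..rk_10 as integers."""
    seeded = bytes(k ^ t for k, t in zip(key, tweak + bytes(8)))
    words = [list(seeded[4 * i:4 * i + 4]) for i in range(4)]
    for i in range(4, 9):
        temp = list(words[i - 1])
        if i % 4 == 0:  # the function g: rotate, SBox, round constant
            temp = [SBOX[b] for b in temp[1:] + temp[:1]]
            temp[0] ^= RCON[i // 4 - 1]
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    stream = bytes(b for w in words for b in w)[:33]
    return [int.from_bytes(stream[3 * i:3 * i + 3], "big") for i in range(11)]


key = bytes(range(16))                      # hypothetical key, for illustration only
t = bytes.fromhex("543bd88000017550")
delta = 0x5A
t_prime = bytes(a ^ b for a, b in zip(t, bytes([0, 0, delta, 0, 0, 0, 0, 0])))
for i, (a, b) in enumerate(zip(round_keys(key, t), round_keys(key, t_prime))):
    print(f"delta rk_{i} = {a ^ b:06x}")
```

Running the sketch with other keys or other values of `delta` is a quick way to see which round keys are touched by a given tweak difference, without consulting Fig. 2.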

Overview of the Attack
We divide our attack into four steps, which we list below. Each step is then discussed in more detail in the following subsections. Fig. 3 gives an overview of the attack. We implemented our attack and executed it in a lab setting. We report the results in Section 3.6.
1. Gather Data: Gather three tuples $(p_\sigma, t_\sigma, \delta_\sigma, c_\sigma) \in \mathbb{F}_2^{24} \times \mathbb{F}_2^{64} \times \mathbb{F}_2^{8} \times \mathbb{F}_2^{24}$, $\sigma \in \{x, y, z\}$, such that $\delta_\sigma \neq 0$ and the ciphertexts $c_\sigma = \text{HL-24}_k(p_\sigma, t_\sigma)$ and $c'_\sigma = \text{HL-24}_k(p_\sigma \oplus 0^{16}\|\delta_\sigma, t_\sigma \oplus 0^{16}\|\delta_\sigma\|0^{40})$ have difference $\delta_\sigma\|0^{16}$.
2. Build Tables: We build two tables. $T_L$ stores all middle states $(s_x, s_y, s_z)$ that can lead to a difference-less state after round eight and the corresponding candidates for $rk_7$. $T_R$ contains the two most significant bytes of the states $(v_x, v_y, v_z)$ immediately before the last SBox layer together with corresponding choices of $rk_{10}[0{:}16]$.
3. First Enumeration: We enumerate all possible $rk_8$ and $(s_x, s_y, s_z) \in T_L$ to compute forward to $(q_x, q_y, q_z)$. Then, we check for hits in $T_R$ to restore $rk_9$ and $rk_{10}$.
4. Second Enumeration: We brute force the remaining bits of the key.

Gathering Data
Our attack needs three tuples $(p_x, t_x, \delta_x, c_x)$, $(p_y, t_y, \delta_y, c_y)$ and $(p_z, t_z, \delta_z, c_z)$. For each $\sigma \in \{x, y, z\}$, this constitutes a plaintext pair $(p_\sigma, p'_\sigma)$, a tweak pair $(t_\sigma, t'_\sigma)$ and a ciphertext pair $(c_\sigma, c'_\sigma)$ with corresponding differences
$$p_\sigma \oplus p'_\sigma = 0^{16}\|\delta_\sigma, \quad t_\sigma \oplus t'_\sigma = 0^{16}\|\delta_\sigma\|0^{40}, \quad c_\sigma \oplus c'_\sigma = \delta_\sigma\|0^{16}.$$
Depending on whether we are in a known-plaintext or chosen-plaintext setting, there are different approaches to obtain these. Here, we give a rough theoretical estimate of the amount of data needed. In Section 4, we discuss what data we would use in practice and how long it would take to obtain it.
In the known-plaintext setting, two random plaintexts fulfill the input difference $0^{16}\|\delta$ with probability $2^{-16}$ since $\delta$ is arbitrary. The tweak difference is fulfilled with probability $2^{-64}$ but, as we discuss in more detail later, 30 bits of the tweak can be assumed to be constant for a given frequency, which leaves a probability of $2^{-34}$. As described above, the spontaneous cancellation happens with probability roughly $2^{-24}$ and if and only if the ciphertexts have difference $\delta\|0^{16}$. Hence, the overall probability that a random pair is good is $2^{-16-34-24} = 2^{-74}$. Therefore, considering birthday effects, our attack needs about $2^{37}$ random known plaintexts.
In the chosen-plaintext setting, we could simply query random plaintexts and tweaks and then add the right input difference. Then only the output difference must match, which happens with probability $2^{-24}$, and hence we need around $3 \cdot 2^{24}$ queries on average. Notice that we can improve this using birthday effects again. For a random plaintext $p$ and a random tweak $t$, we query $(p \oplus 0^{16}\|\delta, t \oplus 0^{16}\|\delta\|0^{40})$ for all $\delta \in \mathbb{F}_2^{8}$ to obtain $2^{8}$ data and therefore $2^{15}$ pairs that have good input differences. Hence, the probability to obtain at least one pair that also has the correct output difference is approximately $2^{-9}$. Repeating this process $2^{10.42}$ times, i.e. using $2^{18.42}$ queries, the probability of obtaining at least three good pairs is around 50%, assuming a binomial distribution.
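The chosen-plaintext estimate can be reproduced numerically. The following sketch treats every pair with a good input difference as an independent Bernoulli trial with success probability $2^{-24}$, which is a simplifying assumption of ours; the helper name `binomial_tail` is hypothetical.

```python
from math import comb, log2


def binomial_tail(n: int, p: float, k: int) -> float:
    """Probability of at least k successes in n independent trials."""
    return 1.0 - sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k))


pairs_per_batch = comb(2**8, 2)   # pairs among the 2^8 queries of one batch
batches = round(2**10.42)         # number of (p, t) batches
queries = batches * 2**8          # total number of chosen-plaintext queries
p_good = 2.0**-24                 # probability of the spontaneous cancellation

print(f"pairs per batch  ~ 2^{log2(pairs_per_batch):.2f}")
print(f"total queries    ~ 2^{log2(queries):.2f}")
print(f"P[>= 3 good pairs] = {binomial_tail(batches * pairs_per_batch, p_good, 3):.3f}")
```

The same helper can be used to explore how the success probability changes when fewer or more batches are queried.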

Building Tables
Building the Left-Hand Table $T_L$. Recall that, except for the first and last round, the only round key differences are $\Delta rk_6$, $\Delta rk_7$ and $\Delta rk_8$. In particular, $\Delta rk_9 = 0$. Consequently, the observed output difference $\Delta c_\sigma = \delta_\sigma\|0^{16}$ implies that there is no difference after the addition of $rk_8$. That is, the differences in $rk_6$, $rk_7$ and $rk_8$ spontaneously canceled each other. Now, consider Fig. 4. It is clear that there must be difference $0^{16}\|\delta_\sigma$ before we add $rk_8$ since this is the difference $\Delta rk_8$ for $t_\sigma$ and $t'_\sigma$. Further, we swap the addition of $rk_7$ and the linear layer and hence transform the round key difference accordingly as $L^{-1}(0^{8}\|\delta_\sigma\|0^{8})$, which we refer to as $\alpha_\sigma\|\beta_\sigma\|\gamma_\sigma$ as depicted in Fig. 4. Observe that there is no difference in the two least significant bytes of the input to round seven. Hence, when we compute backward, $\beta_\sigma$ and $\gamma_\sigma$ must cancel the differences induced by the difference in the least significant byte of $s_\sigma$. Similarly, $L^{-1}(rk_7)[0{:}8]$ must be such that we obtain difference $\delta_\sigma$ in front of the most significant SBox.
Now, to construct $T_L$, we first construct three sub tables. For each pair $\sigma \in \{x, y, z\}$, we enumerate all middle states $s_\sigma \in \mathbb{F}_2^{24}$ and check if the differences in round seven in front of the two least significant SBoxes are canceled by $\beta_\sigma$ and $\gamma_\sigma$ respectively. If so, we compute all possible values of the most significant byte of $L^{-1} rk_7$ such that we obtain difference $\delta_\sigma$ in front of the most significant SBox. We store these in a table $[(\widetilde{rk}_7, s)]_\sigma$ where $\widetilde{rk}_7$ is the normalized most significant byte of $L^{-1} rk_7$. Here, normalized means that we remove the influence of the tweak, i.e., we compute the value that we would obtain with an all-zero tweak. We sort the table by $\widetilde{rk}_7$.
All three tables are of size about $2^{8}$ since we enumerate $2^{24}$ middle states, check a 16-bit condition and on success store the solutions of the resulting SBox difference equation in $\widetilde{rk}_7$. In this equation, $u_\sigma$ is the most significant state byte derived by computing $s_\sigma$ back, and hence all values but $\widetilde{rk}_7$ are given. $S$ is the AES SBox, for which it is well known that such an equation has either 0, 2 or 4 solutions. On average, there is one solution and hence the size of roughly $2^{8}$. Now, we combine the three sub tables into $T_L$. To do so, we iterate over all $\widetilde{rk}_7 \in \mathbb{F}_2^{8}$ and look up $s_x$, $s_y$ and $s_z$ in the previously generated tables. For all hits, we add an entry $(s_x, s_y, s_z, \widetilde{rk}_7)$ to the table $T_L$. The table $T_L$ is also of size around $2^{8}$. This is because we are enumerating $2^{8}$ values for $\widetilde{rk}_7$ and the average number of hits is roughly one, since we check an 8-bit condition on $2^{8}$ entries per small table.
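The statement about the number of solutions is a property of the difference distribution table (DDT) of the AES SBox, and it can be checked by exhaustive counting. The sketch below regenerates the SBox from its algebraic definition so that it is self-contained; the names `SBOX` and `ddt_row` are ours.

```python
from collections import Counter


def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return r


def _sbox_entry(a: int) -> int:
    """AES SBox: field inversion (a^254) followed by the affine map."""
    inv = 1
    for _ in range(254):
        inv = gf_mul(inv, a)
    out = inv
    for i in range(1, 5):
        out ^= ((inv << i) | (inv >> (8 - i))) & 0xFF
    return out ^ 0x63


SBOX = [_sbox_entry(a) for a in range(256)]


def ddt_row(delta_in: int):
    """For each output difference, count the x with S(x) ^ S(x ^ delta_in) equal to it."""
    row = [0] * 256
    for x in range(256):
        row[SBOX[x] ^ SBOX[x ^ delta_in]] += 1
    return row


# Histogram of the number of solutions over all nonzero input differences.
histogram = Counter(v for d in range(1, 256) for v in ddt_row(d))
print(sorted(histogram.items()))
```

Each row sums to 256 over 256 possible output differences, which is where the average of one solution per equation comes from.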

First Enumeration
Now, we connect $T_L$ and $T_R$ by enumerating $rk_8$. More precisely, we enumerate all $(\widetilde{rk}_8, (s_x, s_y, s_z, \widetilde{rk}_7)) \in \mathbb{F}_2^{24} \times T_L$. For each of those tuples, we compute $q_\sigma$ and search for $((q_x \oplus q_y)[0{:}16], (q_y \oplus q_z)[0{:}16])$ in $T_R$. For a hit, we restore the candidate for the two most significant bytes of $rk_{10}$ from $T_R$ and compute the corresponding candidate for $rk_9[0{:}16]$.
Restoring the least significant byte of $rk_9$ and $rk_{10}$ is only slightly more complicated. Recall that the difference in the least significant byte of $rk_{10}$ depends on the tweak difference and the value of the least significant byte of $rk_9$. Hence, we enumerate all possible $\widetilde{rk}_9[16{:}24]$ and compute the corresponding differences in $rk_{10}[16{:}24]$. We denote these as $\Delta_{xy}$ and $\Delta_{yz}$. Then, we compute all $q_\sigma[16{:}24]$ forward to $w_\sigma$ and check
$$w_x \oplus w_y = (c_x \oplus c_y)[16{:}24] \oplus \Delta_{xy} \quad \text{and} \quad w_y \oplus w_z = (c_y \oplus c_z)[16{:}24] \oplus \Delta_{yz}.$$
If both equations are fulfilled, we restore the least significant byte of the candidate for $rk_{10}$ by normalizing $w_x \oplus c_x$ as in Eq. (1).
Table 2: Average execution times and other metrics averaged over 10 runs on a 16-core computer. The steps refer to the numbered list in Section 3.1. #Queries is the number of chosen-plaintext queries in step 1. #Candidates is the number of key candidates generated in step 3.
The total cost of this enumeration is $|T_L| \cdot 2^{24} \approx 2^{32}$. The table $T_R$ is of size $2^{16}$ and we check a 32-bit condition, i.e. we encounter about $2^{24+8+16-32} = 2^{16}$ hits in $T_R$. For these, we enumerate one more byte and then check a 16-bit condition. Therefore, we expect $2^{16+8-16} = 2^{8}$ candidates for 80 bits of the key. In other words, we learn 72 bits of the key using time $2^{32}$ and memory $2^{16}$.

Second Enumeration
We repeat this step for all candidates from the last step, i.e. around $2^{8}$ times for three tuples ($2^{32}$ times for two tuples, and only once for four tuples). The least significant bytes of $rk_{10}$ and $rk_9$ allow us to compute $rk_5[8{:}16]$. Hence, we are only missing 48 bits of the second 128-bit block of the key schedule. We simply enumerate all possibilities to find the correct key. So, in total, the complexity of this step is $2^{56}$ ($2^{80}$ and $2^{48}$ respectively) and hence dominates the costs of our attack.
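The complexity bookkeeping of the two enumeration steps is simple exponent arithmetic. The following lines collect it in one place, working with base-2 logarithms; the variable names are ours.

```python
# All quantities are base-2 logarithms, taken from the text.
LOG_TL = 8            # size of the left-hand table T_L
LOG_TR = 16           # size of the right-hand table T_R
LOG_RK8 = 24          # enumerated values of rk_8

first_enumeration = LOG_TL + LOG_RK8                 # time of the first enumeration
hits_in_tr = LOG_RK8 + LOG_TL + LOG_TR - 32          # survivors of the 32-bit condition
candidates = hits_in_tr + 8 - 16                     # one more byte, 16-bit condition
learned_bits = 80 - candidates                       # information gained about the key

MISSING_BITS = 48     # key bits brute-forced per candidate in the second enumeration
second_enumeration = {
    tuples: log_candidates + MISSING_BITS
    for tuples, log_candidates in ((2, 32), (3, candidates), (4, 0))
}

print(first_enumeration, hits_in_tr, candidates, learned_bits)
print(second_enumeration)
```

The dictionary `second_enumeration` maps the number of available good tuples to the exponent of the final brute-force cost, which makes the trade-off between data and time explicit.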
Notice that a straightforward time-memory trade-off does not seem possible because the missing key bits are, informally speaking, too scattered. We leave it to future work to find more clever ways to recover the remaining bits of the key. We do not pursue this here, since $2^{56}$ is a reasonable complexity, especially for large-scale adversaries, and this approach does not require any more data. Notice that in total we use six plaintexts and six ciphertexts (of course many more are simply ignored), which is nearly optimal since each pair contains roughly 24 bits of information about the key.

Experimental Verification
We wrote a bitsliced [Bih97] implementation of the attack for processors with the AVX instruction set and verified our chosen-ciphertext attack in a lab setting. The output of one example run is given in Appendix A. We executed our attack 10 times, each time using four randomly generated good plaintext-ciphertext-tweak tuples, on a 16-core computer to obtain the averaged metrics presented in Table 2.
Step 4 clearly dominates the overall complexity. However, since that step is a simple brute force search, parallelization is trivial and we expect it to be at least an order of magnitude faster when implemented on modern GPUs. Hence, we conclude that the cost of the attack is practical even for a small-scale adversary with access to four good tuples.

Plaintexts in 2G ALE
Radio stations in 2G ALE can transmit a number of different frames between them. The most fundamental of these are the frames involved in making a call, i.e. setting up a transmission channel between two radios, since this is the primary purpose of the ALE standards. An ALE network will typically have a number of different assigned frequencies and the radios will scan them sequentially while listening for calls. The typical ALE call scenario is illustrated in Fig. 7. An operator or computer that wishes to start a voice call or transmit a message initiates the call (3). The calling radio then selects a suitable frequency based on a radio propagation model (4) and transmits a three-word call frame on the selected frequency (5). When the called radio receives the call frame, it transmits an equivalent frame in return (6). The three-way handshake is completed by a second frame from the calling radio (7). After this step, both radios will be in the linked state and the radio channel is available for higher-level protocols to use. Either station can end the call. This is done using a single three-word frame (11), after which both radios return to the unlinked state and resume scanning frequencies. Notice that, in all frames, the contents of the first and second words are identical.
All 2G ALE words are 24 bits long. The three words used in the typical calls shown here are TO, TIS and TWAS. They all have the same structure, shown in Fig. 5, and start with three bits that identify the word type. This is followed by three seven-bit characters that represent a station callsign. Instead of the full ASCII character set, 2G ALE only uses the BASIC 38 character set which consists of the uppercase characters A-Z, numbers 0-9, @ and ?. The latter two characters are the null and wildcard characters which are not used in ordinary three-letter callsigns. The tweak used in the encryption of any ALE word is generated from the date, time, message word number and transmission frequency, as shown in Fig. 6. All these properties are immediately observable by anyone capable of intercepting frames. This is deliberate, since the ALE standard does not have a method for transferring the tweak. The required tweak differences for the attack described in Section 3 occur when two transmitted words are encrypted with tweaks that differ only in the third byte, i.e. in bits 17-24. This byte contains the four least significant bits of the coarse time (minutes since midnight) and the four most significant bits of the fine time (seconds). This means that, in every 16-minute time window, any call frames that are sent during the same number of seconds modulo 4 will provide three word pairs with the required tweak difference. For example, the first word of a frame transmitted at 1755 kHz on May 8 at 15:57:34 will have the tweak 54 3b d8 80 00 01 75 50 and the first word of a frame transmitted at 1755 kHz on May 8 at 15:59:58 will have the tweak 54 3b fe 80 00 01 75 50. Similarly, the second and third words in the two frames will have the same differences.
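The condition on the tweaks can be tested mechanically. The sketch below only models what is stated above about the third tweak byte (the four least significant bits of the minute count followed by the four most significant bits of the six-bit second count); the layout of the remaining tweak bytes is not modeled, and both function names are ours.

```python
def third_tweak_byte(minutes_since_midnight: int, seconds: int) -> int:
    """Low nibble of the coarse time followed by the top four bits of the seconds."""
    return ((minutes_since_midnight & 0xF) << 4) | ((seconds & 0x3F) >> 2)


def has_required_difference(t1: bytes, t2: bytes) -> bool:
    """True if two 64-bit tweaks differ in the third byte and nowhere else."""
    diff = bytes(a ^ b for a, b in zip(t1, t2))
    return diff[2] != 0 and all(b == 0 for i, b in enumerate(diff) if i != 2)


# The two example tweaks from the text (1755 kHz, May 8, 15:57:34 and 15:59:58).
t1 = bytes.fromhex("543bd88000017550")
t2 = bytes.fromhex("543bfe8000017550")

print(hex(third_tweak_byte(15 * 60 + 57, 34)), hex(third_tweak_byte(15 * 60 + 59, 58)))
print(has_required_difference(t1, t2))
```

Two transmissions whose second counts differ modulo 4 would also differ in the fourth tweak byte, so `has_required_difference` would reject them.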

Gathering Data
As mentioned in Section 3.2, the probability that a pair of words will have the required ciphertext difference is $2^{-24}$. If all captured frames have three words, i.e. the lowest possible number of words per frame, the probability that a pair of two frames with the required tweak difference also has at least one pair of words with the required ciphertext difference is $1 - (1 - 2^{-24})^{3} \approx 2^{-22.4}$. We assume that the lower two bits of the second at which a radio transmits a message are uniformly random. Then, the number of frames with each of the four possible numbers of seconds modulo 4 will follow a multinomial distribution. An approximation of the expected number of pairs in a 16-minute time window with $n$ captured frames that have the required tweak difference is $4 \binom{n/4}{2} = n(n-4)/8$, each with probability $2^{-22.4}$ of having the required ciphertext difference in at least one of the three words. A day has 90 such 16-minute bins. For example, a 16-minute bin with 96 captured frames (one every 10 seconds), corresponding to very high intensity traffic, is expected to have about 1104 pairs with the required tweak difference. The probability that at least one of those pairs has the required ciphertext difference is $1 - (1 - 2^{-22.4})^{1104} \approx 2^{-12.3}$.
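The figures for a single 16-minute window can be recomputed as follows. The sketch assumes that the captured frames are spread evenly over the four residue classes of the second count modulo 4, which is an idealization of the multinomial model; the function name `expected_pairs` is ours.

```python
from math import comb, log2


def expected_pairs(frames: int) -> int:
    """Frame pairs with the required tweak difference, for evenly spread frames."""
    return 4 * comb(frames // 4, 2)


# At least one of the three word pairs of two frames has the ciphertext difference.
p_frame_pair = 1 - (1 - 2.0**-24) ** 3

frames = 96  # one frame every 10 seconds in a 16-minute window
pairs = expected_pairs(frames)
p_window = 1 - (1 - p_frame_pair) ** pairs

print(f"pairs per window      : {pairs}")
print(f"per frame pair        : 2^{log2(p_frame_pair):.1f}")
print(f"per 16-minute window  : 2^{log2(p_window):.1f}")
print(f"windows per day       : {24 * 60 // 16}")
```

Lower traffic intensities can be explored by changing `frames`; the number of usable pairs grows quadratically with the number of captured frames.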

Plaintext differences
Having the required difference in ciphertext and tweak is not enough. The two callsigns in the plaintext must also have the same difference in the last eight bits of the word, i.e. in the last character and the least significant bit of the middle character, as in the ciphertexts and tweaks. The callsigns must be identical in the other 13 bits. The probability of this ultimately depends on how callsigns are assigned in the particular radio network. A network could have no callsigns that differ only in the last eight bits. This would mean that the required plaintext difference can never appear. This would however significantly limit the number of possible callsigns in a network. In the case of randomly assigned callsigns, the probability of a plaintext pair with the same difference as the tweak is $36^{-1} \cdot 18^{-1} \cdot 2^{-8} \approx 2^{-17.3}$. In practice, the probability may actually be higher for a number of reasons. For example, the number of stations in a given network is often not close to the maximum number of possible callsigns, callsigns may be assigned sequentially according to organizational structure (as opposed to randomly), and stations that are close to each other organizationally are more likely to communicate with each other [Cal89].
A document available online [ALE21] contains, among other things, a large list of observed callsigns in a number of unencrypted ALE networks, mostly belonging to government agencies in the United States and Europe. While the data is of uncertain origin, it is the only source available that provides any insight into how callsigns are assigned in ALE networks. We sorted all valid three-letter callsigns from the list by network name and enumerated all possible pairs of three-character callsigns for each network. We then calculated how many possible pairs in each network differed only in the least significant byte. The results for the networks in the list with more than 20 observed callsigns are presented in Table 3. All but one of the networks have significantly more possible pairs of callsigns that differ in only the least significant byte than what would be expected if the callsigns were randomly assigned using all 36 available characters.
Figure 7: Sequence diagram of a typical 2G ALE call between two radio stations with callsigns AAA and AAB. The users can be either humans or computers, depending on the application.
Combining the assumption of 96 messages per 16-minute window with the highest observed fraction of plaintext pairs that differ only in the least significant byte from Table 3 yields a probability of about $2^{-22.5}$ that a given 16-minute window contains at least one ALE word with good plaintext, ciphertext and tweak differences. This can be considered the best-case scenario for an attacker only capable of recording transmitted frames and their corresponding plaintexts. Since three good pairs are needed to mount the attack, an attacker would have to collect frames for $\frac{3 \cdot 2^{22.5}}{90 \cdot 365.25} \approx 541$ years to have a 50% probability of success. (Two pairs can be leveraged for a complexity $2^{80}$ attack. This would lower the expected time by one third.) In other words, although getting pairs of messages with a good tweak difference is fairly easy, the low probability of getting the same difference in ciphertext and plaintext, even under favorable conditions, makes the attack unlikely to work in practice. Fig. 8 shows the probability of success as a function of time for three different traffic intensities, representing high to very high network usage. In the high-traffic scenarios, the probability of success after a year may still be higher than what is acceptable for some users. In the worst case presented, it is 1.5% after a year.
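The collection time follows from the number of 16-minute windows per year. The short computation below assumes a per-window success probability of $2^{-22.5}$, as implied by the expression for the number of years, and simply scales the expected waiting time by the number of good pairs needed; the function name `years_needed` is ours.

```python
WINDOWS_PER_DAY = 24 * 60 // 16      # 16-minute windows in a day
DAYS_PER_YEAR = 365.25
LOG2_P_WINDOW = -22.5                # assumed per-window probability of a good pair


def years_needed(good_pairs: int) -> float:
    """Expected collection time until the given number of good pairs is seen."""
    windows = good_pairs * 2.0 ** (-LOG2_P_WINDOW)
    return windows / (WINDOWS_PER_DAY * DAYS_PER_YEAR)


print(f"three good pairs: {years_needed(3):.0f} years")
print(f"two good pairs  : {years_needed(2):.0f} years")
```

Comparing the two printed values shows the reduction by one third when the attacker settles for two pairs and the more expensive key search.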

A Ciphertext-only Attack
By leveraging knowledge of the structure of the plaintexts, the attack described in Section 3 can be adapted into a ciphertext-only attack that is performed as follows.
1. Gather Data: In each 16-minute window, we collect all observed ciphertexts and store all pairs that have the required ciphertext and tweak differences. We do this until we have a sufficiently high probability of having collected four pairs with good plaintexts. The probability $p_{PT}$ that a stored pair will have the required plaintext difference varies between networks as described in Section 4. The probability of having found at least four good pairs after $n$ collected ciphertext pairs is $1 - \sum_{i=0}^{3} \binom{n}{i} p^{i} (1 - p)^{n-i}$, where $p = 2^{-8} \cdot p_{PT}$ is the probability that the plaintexts have the correct difference (see Section 4.3).
2. Enumerate candidate keys: For each possible combination of three collected pairs, we build the left and right tables as described in Section 3.3 and perform the first enumeration as described in Section 3.4. We store all candidate keys in a sorted list. For $n$ ciphertext pairs with the required ciphertext and tweak differences, this means performing the enumeration step for $\binom{n}{3}$ choices of three ciphertext pairs. Each search for candidate keys has a time complexity proportional to $2^{32}$ and will yield approximately $2^{8}$ random 80-bit candidate keys, meaning that approximately $m = \binom{n}{3} \cdot 2^{8}$ candidate keys will be generated in total. The correct candidate appears at least four times since there are four combinations of three good pairs, whereas the number of wrong key candidates appearing four times is roughly $\binom{m}{4} \cdot 2^{-240}$, assuming they are distributed randomly. Thus, it is easy to identify the correct key candidate. With a time complexity of $\binom{n}{3} \cdot 2^{32}$, this step dominates the overall time complexity of the attack.
3. Full key search: We run the full key search as described in Section 3.5. Instead of known plaintexts, we verify that the computed plaintexts have the correct type bits, contain only allowed callsign characters, that words one and two of each frame are identical, and that the plaintext pairs have the expected differences. The time complexity of this step is $2^{48}$.
Fig. 9 shows the number of good ciphertext pairs that are required for a 50% probability of success as a function of $p_{PT}$. We list the time, data and memory complexity for some values of $p_{PT}$ in Table 4.
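The quantities that drive the ciphertext-only attack can be explored numerically. The sketch below models the number of good pairs among the stored pairs as binomial, which is our modeling assumption, and works with base-2 logarithms for the candidate counts; all function names, and the example value of $p_{PT}$, are hypothetical.

```python
from math import comb, log2


def p_at_least_four(n: int, p: float) -> float:
    """Probability that at least four of n stored pairs have good plaintexts."""
    return 1.0 - sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(4))


def log2_enumeration_time(n: int) -> float:
    """log2 of binom(n, 3) * 2^32, the cost of enumerating candidate keys (n >= 3)."""
    return log2(comb(n, 3)) + 32


def log2_false_positives(n: int) -> float:
    """log2 of the expected number of wrong 80-bit candidates seen four times (n >= 3)."""
    m = comb(n, 3) * 2**8
    return log2(comb(m, 4)) - 240


p_pt = 0.2                 # illustrative value only, not taken from Table 3
p = 2.0**-8 * p_pt
for n in (1000, 5000, 10000):
    print(n, round(p_at_least_four(n, p), 3),
          round(log2_enumeration_time(n), 1),
          round(log2_false_positives(n), 1))
```

Under this model the expected number of false positives stays far below one for the printed values of `n`, which matches the claim that the correct candidate is easy to identify.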
A Note on 3G ALE. In addition to its use in 2G ALE, HALFLOOP-24 is also used for encrypting 26-bit robust link setup (RLSU) frames, called protocol data units (PDU), in 3G ALE. This is achieved by transmitting the two most significant bits unencrypted and encrypting only the least significant 24 bits. There are five types of RLSU PDUs. All have linear checksums in the least significant bits. In three cases, the checksums are eight bits long, meaning that plaintexts that differ in only the least significant byte are impossible. The other two, the call PDU and notification PDU, have four-bit checksums. The notification PDU is, among other things, used for regular sounding calls [JKF+12]. In those cases, ten of the 24 bits will be known with certainty to an outside observer. Since four of the remaining bits are a checksum, the transmitter's ten-bit identity is the only uncertain part of an intercepted 3G ALE RLSU sounding. Three of those bits are in the least significant byte, meaning that a pair of such intercepted PDUs will have probability $2^{-7}$ of differing only in that byte, if the identities are distributed uniformly at random. This could be leveraged for ciphertext-only attacks against 3G ALE RLSU networks with complexities similar to those for 2G ALE.

A Boomerang Attack
In this section, we describe a boomerang attack, i.e. a chosen-ciphertext attack, against HALFLOOP-24. This requires both an encryption and a decryption oracle, which could be obtained, for example, by gaining temporary physical access to a radio that stores the desired key.
Consider Fig. 10, where we recover the first key byte. To do so, we utilize a boomerang that returns with probability one if the plaintext difference $\delta$ cancels the tweak-induced difference $\beta$ in the first round key. More specifically, consider two plaintexts $p_0, p_1$ with $p_0 \oplus p_1 = \delta \| 0^{16}$ and two corresponding tweaks $t_0, t_1$ with $t_0 \oplus t_1 = 0^{24} \| \hat{\beta} \| 0^{16}$, where $\hat{\beta}$ is such that $L^{-1}\hat{\beta} = \beta \| 0^{16}$. For convenience, we assume $t_0 = 0$ in the following, so that we can omit the normalization. We swap the first linear layer with the addition of $rk_1$. Then, the only state difference after the addition of $L^{-1} rk_1$ is in the most significant byte and its value is
$$S(x \oplus k) \oplus S(x \oplus \delta \oplus k) \oplus \beta, \tag{2}$$
where $S$ is the SBox and $x$ and $k$ denote the most significant bytes of $p_0$ and $rk_0$, respectively. Detecting that this difference is zero for specific values of $\delta$ and $\beta$ enables us to recover $rk_0$. Now recall Fig. 2 and notice that there is no round key difference until $rk_6$. Hence, if Eq. (2) is zero, then the state difference before adding $rk_6$ is zero too. After adding $rk_6$, the difference of the middle state $s$ is therefore $0 \| \hat{\beta}[8{:}16]$.
Now consider the resulting ciphertexts $c_0$ and $c_1$. For the backward part of our boomerang, we do not change these, i.e. we have $c'_0 = c_0$ and $c'_1 = c_1$. The tweaks are changed to $t'_0$ and $t'_1$ by adding the same difference, determined by an arbitrary $\gamma \in \mathbb{F}_2^n \setminus \{0\}$, to both $t_0$ and $t_1$. Now, we decrypt $(c'_0, t'_0)$ and $(c'_1, t'_1)$. The difference of $t_0$ and $t'_0$ is chosen such that it does not influence the last four round keys and, since $c_0 = c'_0$, we obtain the same middle state $s'_0 = s_0$. The same holds for $s'_1 = s_1$. Hence, the difference $s'_0 \oplus s'_1$ is again $0 \| \hat{\beta}[8{:}16]$ and is therefore canceled by the addition of $rk_6$, which has the same difference as before since $t'_0 \oplus t'_1 = t_0 \oplus t_1$. There is no difference for the round keys $rk_5$, $rk_4$, $rk_3$, $rk_2$, and so we end up with the difference $\beta$ induced by the addition of $L^{-1} rk_1$ in the most significant byte again. Finally, we obtain two plaintexts $p'_0$ and $p'_1$ with $p'_0 \oplus p'_1 = \delta' \| 0^{16}$, and Eq. (2) is also fulfilled if we replace $\delta$ with $\delta'$.
Summing up, we need at most $2^8$ encryption and decryption queries to find the first key byte. In the same way, we can find the second and third key bytes. Note that for the third key byte we have to add $\gamma$ to the input of both SBoxes in Eq. (2) for the backward direction. This already breaks the 128-bit security since only 104 unknown key bits remain, and it is also not hard to see that these remaining bits can be recovered more efficiently than by brute force. We leave the details to future work.
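The key-byte detection step can be illustrated with a heavily simplified toy model. The sketch below does not implement HALFLOOP-24 and is not the implementation used for the experiments: the SBox is a stand-in pseudo-random permutation, and the boomerang is replaced by an idealized oracle that simply reports whether the SBox output difference for a plaintext byte difference $\delta$ equals $\beta$, i.e. whether the first-round difference cancels. All function names are hypothetical. Narrowing the candidates by repeating the search with further values of $\beta$ is a simplification of this sketch and costs more queries than the attack described above.

```python
import random

# Stand-in 8-bit SBox: a fixed pseudo-random permutation.
# Assumption for illustration only, NOT the HALFLOOP SBox.
SBOX = list(range(256))
random.Random(2022).shuffle(SBOX)


def make_oracle(secret_key_byte):
    """Idealized boomerang oracle for one byte (an assumption of this sketch).

    The returned function reports whether the first-round difference
    cancels; the state dict counts how many queries were made.
    """
    state = {"queries": 0}

    def returns(x, delta, beta):
        state["queries"] += 1
        diff = SBOX[x ^ secret_key_byte] ^ SBOX[x ^ delta ^ secret_key_byte]
        return diff ^ beta == 0

    return returns, state


def find_delta(returns, x, beta):
    """Try every nonzero delta until the oracle reports a return.

    Because the SBox is a permutation, exactly one nonzero delta works
    for each nonzero beta, so at most 255 queries are needed.
    """
    for delta in range(1, 256):
        if returns(x, delta, beta):
            return delta
    raise ValueError("no delta found; beta must be nonzero")


def recover_key_byte(returns, x=0):
    """Intersect the key candidates implied by successive (delta, beta) hits."""
    candidates = set(range(256))
    for beta in range(1, 256):
        delta = find_delta(returns, x, beta)
        # Keep only key bytes consistent with the observed cancellation.
        candidates = {k for k in candidates
                      if SBOX[x ^ k] ^ SBOX[x ^ delta ^ k] == beta}
        if len(candidates) == 1:
            return candidates.pop()
    raise ValueError("key byte not uniquely determined")


if __name__ == "__main__":
    oracle, stats = make_oracle(0xA7)
    print(hex(recover_key_byte(oracle)), stats["queries"])
```

A single hit already reduces the candidates to the solutions of one differential equation over the SBox, which is why few values of $\beta$ are needed in this toy setting.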

Experimental Verification
We implemented our boomerang attack and tested it in a lab setting. The results are shown in Appendix A. Executing our attack takes less than a second on a modern laptop. In other words, the time needed to run the attack essentially depends only on the rate of encryption and decryption queries. Therefore, we report only the average number of queries needed. Over 1000 runs, we obtain an average of 385 encryption plus 385 decryption queries to recover the first round key, which is in line with our theoretical estimate of a maximum of $3 \cdot 2^8 = 768$ queries per round key.
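As a rough plausibility check of the reported average, one can assume that the search for each key byte tries the 255 nonzero one-byte differences in some order and that exactly one of them, equally likely at any position, makes the boomerang return. This model is an assumption made here for illustration, not an analysis taken from the experiments. Under it, the expected number of tries per byte and per round key is:

```python
from fractions import Fraction


def expected_tries_per_byte(num_differences=255):
    """Expected position of the single successful difference when it is
    uniformly distributed over positions 1..num_differences."""
    total = sum(range(1, num_differences + 1))
    return Fraction(total, num_differences)


def expected_tries_per_round_key(num_bytes=3):
    """Expected tries for a 24-bit round key recovered byte by byte."""
    return num_bytes * expected_tries_per_byte()


if __name__ == "__main__":
    print(expected_tries_per_byte(), expected_tries_per_round_key())
```

Under this model the expectation is 384 tries per round key, which is close to the measured average of 385.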

Conclusion
We have presented theoretical and practical attacks against the HALFLOOP-24 cipher which is used to protect the automatic link establishment in HF radio. We revealed that HALFLOOP-24 is far from providing 128 bits of security, although it is not known whether that is indeed the intended security level. In fact, the design requirements for HALFLOOP that have been published [Joh16] mostly focus on the requirement that the encryption should not decrease ALE performance, along with a note that the US government has considered the use of AES-128 in approved implementations sufficient to protect classified information. In any case, it is apparent that HALFLOOP-24 was not subjected to a thorough security analysis before its introduction. This, as well as the fact that its predecessor SoDark is only a 56-bit cipher, leaves many open questions about the design goals, decisions and the process of the authorities in charge. Surely, one could try to mitigate the presented weakness, e.g. by increasing the number of rounds as was done when the number of rounds in SoDark was increased from eight in 2G ALE to sixteen in 3G ALE. Alternatively, one could avoid using the 2G ALE or 3G ALE RLSU modes which use HALFLOOP-24, instead using only modes that use HALFLOOP-48 or HALFLOOP-96. However, the security of the latter has not yet been studied and is therefore unknown. We leave this as future work. In the meantime, we advise against the use of any HALFLOOP variant. Instead, security should rely on ciphers that have undergone rigorous public security analysis.
Listing 3: Output of one run of the boomerang attack described in Section 5.
[10:28:11] Initializing HALFLOOP-24 library.