Comparing Large-unit and Bitwise Linear Approximations of SNOW 2.0 and SNOW 3G and Related Attacks

Abstract. In this paper, we study and compare the byte-wise and bitwise linear approximations of SNOW 2.0 and SNOW 3G, and present a fast correlation attack on SNOW 3G by using our newly found bitwise linear approximations. On one side, we reconsider the relation between a large-unit linear approximation and the smaller-unit/bitwise ones derived from it, showing that approximations on large-unit alphabets have advantages over all the smaller-unit/bitwise ones in linear attacks. On the other side, by comparing the byte-wise and bitwise linear approximations of SNOW 2.0 and SNOW 3G respectively, we have found many concrete examples of 8-bit linear approximations whose certain 1-dimensional/bitwise linear approximations have almost the same SEI (Squared Euclidean Imbalance) as that of the original 8-bit ones. That is, each of these byte-wise linear approximations is dominated by a single bitwise approximation, and thus the whole SEI is not essentially larger than the SEI of the dominating single bitwise approximation. Since correlation attacks can be more efficiently implemented using bitwise approximations rather than large-unit approximations, improvements over the large-unit linear approximation attacks are possible for SNOW 2.0 and SNOW 3G. For SNOW 3G, we make a careful search of the bitwise masks for the linear approximations of the FSM and obtain many mask tuples which yield high correlations. By using these bitwise linear approximations, we mount a fast correlation attack to recover the initial state of the LFSR with the time/memory/data/pre-computation complexities all upper bounded by 2^174.16, slightly improving the previous best attack, which used an 8-bit (vectorized) linear approximation in a correlation attack with all the complexities upper bounded by 2^176.56.
Though not a significant improvement, our results illustrate that it is possible to improve on the large-unit attacks by using bitwise linear approximations in a linear approximation attack, and they provide new insight into the relation between large-unit and bitwise linear approximations.


Introduction
A stream cipher ensures the privacy of the message transmitted over a communication channel. In such algorithms, the ciphertext is usually the XOR sum of the plaintext and the generated keystream, resembling the one-time pad primitive. Among these, the binary LFSR-based stream cipher is a classical class, and such designs have been the main target of correlation attacks [Sie84, Sie85]. With the development of modern computer science, many word-oriented stream ciphers have been proposed, based on an LFSR over an extension field of the binary field F_2 together with a non-linear combiner, with or without memory, to generate the keystream from the underlying LFSR sequence. Typical examples include SOSEMANUK [BBC+08], SNOW 2.0 [EJ02], SNOW 3G [SAG] and SNOW-V [EJMY19]. SNOW 2.0 and SNOW 3G are both members of the SNOW family of stream ciphers. SNOW 2.0 was proposed by Ekdahl and Johansson in 2002 as an improved version of SNOW 1.0 [EJ00], and was selected as an ISO standard in 2005. It consists of two main components, a Linear Feedback Shift Register (LFSR) and a Finite State Machine (FSM), based on operations on 32-bit words, with high efficiency in both software and hardware environments. SNOW 3G was designed in 2006 by ETSI/SAGE and differs from SNOW 2.0 by introducing a third 32-bit register to the FSM and a corresponding 32-bit nonlinear transformation for updating this register. SNOW 3G serves as the core of the 3GPP Confidentiality and Integrity Algorithms UEA2 & UIA2 for UMTS and LTE networks, and it is currently in use in 3G/4G mobile telephony systems. As a new member of the SNOW family, SNOW-V keeps most of the design of SNOW 3G in terms of the LFSR and the FSM, but both components are updated to better align with vectorized implementations.
Linear attacks have been widely used to analyze stream ciphers, and many research results have shown that the SNOW ciphers are vulnerable to the class of linear approximation attacks, such as distinguishing attacks and correlation attacks. The basic technique is to first approximate the nonlinear operations in the cipher and then derive a linear approximation relation involving the keystream symbols. If the linear approximation also involves symbols from the LFSR states, a correlation attack can be mounted by exploiting some correlation between the keystream and the LFSR states. The fast correlation attack was first introduced by Meier and Staffelbach in 1989 with two algorithms [MS89], and has since evolved constantly and steadily [CT00, CJS00, CS91, CJM02, JJ99, JJ00], with wide applications to many concrete constructions [LLP08, LV04, ZGM17]. However, previous bitwise fast correlation attacks are not considered to work well for word-oriented stream ciphers, due to the complex form of the reduced LFSR recursion from the extension field to F_2. As a major step, fast correlation attacks over extension fields were proposed in [ZXM15] based on linear approximations over larger alphabets, i.e., large-unit linear approximations, providing the best key recovery attack against SNOW 2.0 by using byte-wise linear approximations. Later in [YJM19], inspired by the results of [ZXM15], a fast correlation attack on SNOW 3G was given using the method in [ZXM15] with byte-wise linear approximations. In the design document [EJMY19] of SNOW-V, the designers present linear approximation attacks using byte-wise linear approximations. All these works seem to show that correlation attacks can be improved by using large-unit linear approximations.
Related Work. The resistance of SNOW 2.0 against linear approximation attacks has been widely studied. In [WBDC03], the bitwise linear approximations over two rounds of the FSM of SNOW 2.0 were constructed through the linear masking method, and a distinguishing attack was given with complexity 2^225. At FSE 2006, Nyberg and Wallén [NW06] presented an improved distinguishing attack with complexity 2^174 by using the bitwise linear mask 0x00018001 for the two-round linear approximation of the FSM. Later in [LLP08], the same bitwise mask was applied to launch a correlation attack on SNOW 2.0 with time complexity 2^212.38, using linear approximation relations between the keystream words and the LFSR states and the technique of the fast Walsh transform (FWT). All these attacks [WBDC03, NW06, LLP08] were launched with bitwise linear approximations. At CRYPTO 2015, Zhang et al. [ZXM15] introduced the terminology "large-unit" linear approximations, and mounted a fast correlation attack on SNOW 2.0 by building two-round byte-wise (8-bit) linear approximations and adopting the k-tree algorithm [Wag02], giving significantly reduced complexities, all below 2^164.15. Recently, [GZ20] investigated the bitwise linear approximation of a certain type of composition function present in SNOW 2.0 and proposed a linear-time algorithm to compute the correlation for an arbitrary given linear mask. Based on this algorithm, they carried out a wider search for bitwise masks and found some strong linear approximations which enabled them to slightly improve the data complexity of the previous fast correlation attacks by using multiple bitwise linear approximations.
Several linear attacks on SNOW 3G have been proposed. In [NW06], Nyberg and Wallén devoted one section to SNOW 3G, where the bitwise linear approximations over three rounds of the FSM were depicted, but only rough estimates of the upper bounds of their correlations were given. In [GZ20], a fast correlation attack was given with time complexity 2^222.33 by constructing bitwise linear approximations whose correlations were accurately computed. In [YJM19], inspired by the results of [ZXM15] where the large-unit approach was used to achieve improvements over the previous attacks on SNOW 2.0, Yang et al. constructed the three-round byte-wise (vectorized) linear approximations for the FSM of SNOW 3G and performed searches for actual byte-wise masks that gave high SEI values for the approximations. The byte-wise linear approximations found in [YJM19] were also applied to launch a fast correlation attack against SNOW 3G by the method in [ZXM15] with all the complexities upper bounded by 2^176.56.
For SNOW-V, the byte-wise linear approximation attacks and the bitwise ones are studied in [EJMY19] and [GZ21] respectively.
Our Contributions. In this paper, we study and compare the large-unit and bitwise linear approximations of SNOW 2.0 and SNOW 3G, and present a bitwise fast correlation attack on SNOW 3G by using our newly found bitwise linear approximations. On one hand, we first show that approximations on large-unit alphabets have advantages over all the smaller-unit/bitwise ones in linear approximation attacks; meanwhile, the results on SNOW 2.0 in [ZXM15] gave the impression that large-unit approximations lead to larger SEI and also to better attacks. However, by studying and comparing the byte-wise and bitwise linear approximations of SNOW 2.0 and SNOW 3G, we have found many concrete examples of byte-wise linear approximations whose certain 1-dimensional/bitwise linear approximations have almost the same SEI as that of the original 8-bit ones. That is, each of these byte-wise approximations is dominated by a single bitwise approximation, and thus the whole SEI is not essentially larger than the SEI of the dominating single bitwise one. Since correlation attacks can be more efficiently implemented using bitwise approximations rather than large-unit approximations, improvements over the large-unit linear approximation attacks [ZXM15, YJM19] are possible for SNOW 2.0 and SNOW 3G. For SNOW 3G, we make a careful search of the bitwise masks for the linear approximations of the FSM and obtain many mask tuples which yield high correlations. By using these bitwise linear approximations, we mount a fast correlation attack to recover the initial state of the LFSR with the time/memory/data/pre-computation complexities all upper bounded by 2^174.16, slightly improving the previous best result in [YJM19], which mounted a fast correlation attack using an 8-bit (vectorized) linear approximation with all the complexities upper bounded by 2^176.56.
Organization of the paper. Some basic notations and definitions are presented in Section 2, together with the descriptions of SNOW 2.0 and SNOW 3G. In Section 3 and Section 4, we study and compare the byte-wise and bitwise linear approximations of SNOW 2.0 and of SNOW 3G, respectively. In Section 5, we describe in detail how to search for bitwise linear mask tuples for the linear approximation of SNOW 3G that yield high correlations. In Section 6, a bitwise fast correlation attack on SNOW 3G is given by using the bitwise masks. Finally, conclusions and future work are provided in Section 7.

Notations and Definitions
The following notations and definitions are used throughout this paper.
• The modular addition is denoted by "⊞" and the bitwise exclusive-OR by "⊕".
• The binary field is denoted by F_2 and its m-dimensional extension field by F_2^m. Besides, we denote by F*_2^m the multiplicative group of nonzero elements of F_2^m.
• Given two binary vectors a = (a_0, a_1, ..., a_{n-1}) and b = (b_0, b_1, ..., b_{n-1}) over F_2, the inner product of a and b is defined as a·b = a_0b_0 ⊕ a_1b_1 ⊕ ... ⊕ a_{n-1}b_{n-1}.
• Given two m-dimensional vectors x = (x_0, x_1, ..., x_{m-1}) and y = (y_0, y_1, ..., y_{m-1}) over F_2^8 with the same defining polynomial, e.g., p(z), the inner product of x and y over F_2^8 is defined as x * y = x_0y_0 ⊕ x_1y_1 ⊕ ... ⊕ x_{m-1}y_{m-1}, with the multiplications x_iy_i taken over F_2[z]/(p(z)).
• Let n, m be two positive integers such that m divides n. Any x ∈ F_2^n can be written as x = (x_0, x_1, ..., x_{n/m-1}) with x_i ∈ F_2^m, where x_0 is the least significant part.
• For a set S, the number of elements in S is denoted by |S|.
• An n-variable Boolean function f(x) is a mapping from F_2^n to F_2, i.e., f: F_2^n → F_2. An (n, m)-function F is a mapping F: F_2^n → F_2^m, written as F = (f_0, ..., f_{m-1}), where the f_i's are n-variable Boolean functions, called the coordinate functions of F. F is also called an m-dimensional vectorial Boolean function.
• Given an (n, m)-function F: F_2^n → F_2^m and a nonzero vector v ∈ F_2^m, the Boolean function v·F is called a (non-zero) component of F. In our analysis, the Boolean function v·F is denoted by F_v for v ∈ F_2^m, v ≠ 0.
Definition 1. Let X be a binary random variable; the correlation between X and zero is defined as c(X) = Pr{X = 0} − Pr{X = 1}. Given a Boolean function f: F_2^n → F_2, the correlation of f to zero is defined as c(f) = Pr{f(X) = 0} − Pr{f(X) = 1}, where X is a uniformly distributed random variable in F_2^n.
Note that "correlation" is often used to evaluate the efficiency of bitwise linear approximations in a linear approximation attack, where the data complexity is proportional to 1/c^2(f).
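As a small illustration of Definition 1, the correlation of a Boolean function can be computed by exhaustive evaluation over all inputs. The following is a toy Python sketch; the functions and the parameter n = 4 are purely illustrative and not taken from the ciphers.

```python
# Correlation of a Boolean function f: F_2^n -> F_2 to zero,
#   c(f) = Pr{f(X)=0} - Pr{f(X)=1} for uniform X,
# computed by exhaustive evaluation over all 2^n inputs.
def correlation(f, n):
    acc = 0
    for x in range(1 << n):
        acc += 1 if f(x) == 0 else -1
    return acc / (1 << n)

# A single mask bit is balanced (correlation 0); a two-input AND
# is biased toward 0 (correlation 1/2).
assert correlation(lambda x: x & 1, 4) == 0.0
assert correlation(lambda x: (x & 1) & ((x >> 1) & 1), 4) == 0.5
```

With c(f) = 1/2 for the AND example, the data complexity proportional to 1/c^2(f) would be 4 samples, which matches the intuition that strongly biased approximations need fewer keystream samples.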
Definition 2. The correlation of an (n, m)-function F: F_2^n → F_2^m with a linear output mask Γ ∈ F_2^m and a linear input mask Λ ∈ F_2^n is defined as c_F(Γ, Λ) = Pr{Γ·F(X) ⊕ Λ·X = 0} − Pr{Γ·F(X) ⊕ Λ·X = 1}, where X is a uniformly distributed random variable in F_2^n.
At CRYPTO 2015, Zhang et al. [ZXM15] introduced the terminology "large-unit" linear approximations, and achieved improvements over the previous attacks on SNOW 2.0 by providing a fast correlation attack over F_2^8 using byte-wise (8-bit) linear approximations. Given an (n, m)-function F: F_2^n → F_2^m, denote by D_F the probability distribution of F(X) when X is uniformly distributed in F_2^n. We next define the Squared Euclidean Imbalance (SEI), which is usually used to evaluate the efficiency of a large-unit linear approximation in a linear approximation attack.
Definition 3. The Squared Euclidean Imbalance (SEI) of a distribution D_F is defined as ∆(D_F) = 2^m · Σ_{a∈F_2^m} (Pr{F(X) = a} − 2^{-m})^2, which measures the distance between the target distribution and the uniform distribution. Especially for m = 1, ∆(D_F) is closely related to the correlation of F by ∆(D_F) = c^2(F).
Note that the "SEI" of a distribution D F over a general alphabet is used to evaluate the efficiency of large-unit linear approximations in a linear approximation attack, where the data complexity is proportional to the value of 1/∆(D F ). Besides, there is a fundamental fact about the SEI of a distribution [BJV04,NH07] shown as follows.
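The SEI of Definition 3 is straightforward to evaluate once a distribution is given explicitly. A minimal Python sketch, with purely illustrative toy distributions over a 2-bit alphabet:

```python
# SEI of a distribution over F_2^m:
#   Delta(D) = 2^m * sum_a (Pr{a} - 2^-m)^2,
# where probs is a list of length 2^m.
def sei(probs):
    u = 1.0 / len(probs)
    return len(probs) * sum((p - u) ** 2 for p in probs)

assert sei([0.25] * 4) == 0.0            # uniform distribution: SEI is 0
print(sei([0.5, 0.25, 0.125, 0.125]))    # a biased distribution: 0.375

# For m = 1 the SEI collapses to the squared correlation:
# D = [Pr{0}, Pr{1}] with c = Pr{0} - Pr{1} = 0.5 gives SEI = c^2 = 0.25.
assert abs(sei([0.75, 0.25]) - 0.25) < 1e-12
```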

Lemma 1. For an (n, m)-function F with the probability distribution vector D_F, we have ∆(D_F) = Σ_{v∈F*_2^m} c^2(F_v).
For brevity, we adopt the simplified notation ∆(F) to denote ∆(D_F) hereafter. Then we have ∆(F) = c^2(F) when m = 1, and ∆(F) = Σ_{v∈F*_2^m} c^2(F_v) in general. With the notations just introduced, we now study the relation between a large-unit linear approximation and certain smaller-unit/bitwise linear approximations derived from it.
Let F: F_2^n → F_2^m, x ↦ (f_0, ..., f_{m-1}), be a large-unit linear approximation relation. Let low_l(x) denote the l least significant bits of x. For m > 1 and 1 ≤ m' < m, we define another linear approximation relation F^(m') from F as F^(m')(x) = low_{m'}(F(x)) = (f_0, ..., f_{m'-1}). We call F^(m') a smaller-unit linear approximation function, which is actually the low-dimensional projective function derived from F. According to Lemma 1, the SEI of the distribution D_F of F is equal to the sum of the SEIs of the distributions D_{F_v}, where v runs through all non-zero vectors in F_2^m. When 1 ≤ m' < m, the SEI of the distribution D_{F^(m')} is the partial sum consisting of those SEIs of the distributions D_{F_v} where v has non-zero entries only in the m' least significant bits. The following conclusion then follows immediately.
Property 1. Let F be an m-bit (m > 1) large-unit linear approximation, and F^(m') be an m'-bit (1 ≤ m' < m) linear approximation derived from F as above. Then for any integer m' with 1 ≤ m' < m, we have ∆(F^(m')) ≤ ∆(F).
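Lemma 1 and Property 1 can be checked numerically on a toy function. In the Python sketch below, the (6, 3)-function F is an arbitrary random lookup table (purely illustrative): its SEI is computed both directly from the distribution and as the sum of squared component correlations, and every low-bit projection is verified to have no larger SEI.

```python
# Numerical check of Lemma 1 and Property 1 on a random (n, m)-function
# (toy sizes n = 6, m = 3; F is an arbitrary illustrative lookup table).
import random

def parity(x):
    return bin(x).count("1") & 1

def sei(probs):
    u = 1.0 / len(probs)
    return len(probs) * sum((p - u) ** 2 for p in probs)

def distribution(F, n, m):
    probs = [0.0] * (1 << m)
    for x in range(1 << n):
        probs[F[x]] += 1.0 / (1 << n)
    return probs

random.seed(1)
n, m = 6, 3
F = [random.randrange(1 << m) for _ in range(1 << n)]

# Lemma 1: Delta(D_F) = sum over nonzero v of c^2(F_v), with F_v(x) = v . F(x).
full = sei(distribution(F, n, m))
comp_sum = 0.0
for v in range(1, 1 << m):
    c = sum(1 if parity(v & F[x]) == 0 else -1 for x in range(1 << n)) / (1 << n)
    comp_sum += c * c
assert abs(full - comp_sum) < 1e-12

# Property 1: any projection onto the m' low bits has SEI <= Delta(D_F).
for mp in range(1, m):
    proj = [F[x] & ((1 << mp) - 1) for x in range(1 << n)]
    assert sei(distribution(proj, n, mp)) <= full + 1e-12
```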
Property 1 shows a theoretical relation between linear approximations of different sizes. Since the data complexity in a linear approximation attack is proportional to 1/∆(F), Property 1 seems to suggest that the larger the unit, the better the complexity we can obtain in a linear approximation attack.
For the large-unit linear approximation relation F: F_2^n → F_2^m, the function F_v: F_2^n → F_2 is a bitwise linear approximation for any nonzero vector v ∈ F_2^m. The following conclusion follows directly from Lemma 1; it depicts the relation between the distribution of any bitwise linear approximation and that of the original large-unit one.
Property 2. Let F be an m-bit (m > 1) large-unit linear approximation, and F_v be the bitwise linear approximation derived from F as above. Then for any nonzero vector v ∈ F_2^m, we have ∆(F_v) ≤ ∆(F).
Property 2 seems to suggest that approximations on large-unit alphabets have advantages over all the bitwise ones in linear approximation attacks. However, as shown in Section 3.3 and Section 4.3 for the linear approximations of SNOW 2.0 and SNOW 3G, we have found that there are many concrete examples of byte-wise linear approximations whose certain 1-dimensional/bitwise linear approximations have almost the same SEI as that of the original large-unit ones. Since correlation attacks can be more efficiently implemented using bitwise approximations rather than large-unit approximations, improvements over large-unit linear approximation attacks [ZXM15,YJM19] are possible for SNOW 2.0 and SNOW 3G.

Description of SNOW 2.0
Both SNOW 2.0 and SNOW 3G are word-oriented stream ciphers and contain two main components: a Linear Feedback Shift Register (LFSR) and a Finite State Machine (FSM). SNOW 3G differs from SNOW 2.0 by introducing a third 32-bit register to the FSM and a corresponding 32-bit nonlinear transformation for updating this register. The keystream generation phase of SNOW 2.0 is depicted in Fig.1. For more details on the design, please refer to the original design documents [EJ02].
The LFSR of SNOW 2.0 consists of sixteen 32-bit cells with the feedback recursion s_{t+16} = α^{-1}s_{t+11} ⊕ s_{t+2} ⊕ αs_t, where α ∈ F_2^32 is a root of the primitive polynomial y^4 + β^23·y^3 + β^245·y^2 + β^48·y + β^239 ∈ F_2^8[y], and β is a root of the polynomial z^8 + z^7 + z^5 + z^3 + 1 ∈ F_2[z] (field constant 0xA9). Let (s_{t+15}, s_{t+14}, ..., s_t), s_{t+i} ∈ F_2^32, denote the LFSR state at time t. The FSM part has two 32-bit registers R1 and R2. The LFSR state feeds into the FSM with the input (s_{t+15}, s_{t+5}), and the output of the FSM is F_t = (s_{t+15} ⊞ R1_t) ⊕ R2_t. The keystream word is generated by z_t = F_t ⊕ s_t. The registers R1 and R2 are updated according to R1_{t+1} = s_{t+5} ⊞ R2_t and R2_{t+1} = S(R1_t), where S(·) is a 32-bit to 32-bit mapping composed of four parallel AES S-boxes (denoted by S_R) followed by the AES MixColumn operation (denoted by M_1). Here M_1 is defined over F_2^8 with the polynomial z_1^8 + z_1^4 + z_1^3 + z_1 + 1 ∈ F_2[z_1] (field constant 0x1B) for field multiplication. Precisely, let w = (w_0 w_1 w_2 w_3), w_i ∈ F_2^8, be the 32-bit input to S(·); then S(w) = M_1 · (S_R(w_0), S_R(w_1), S_R(w_2), S_R(w_3))^T, where the multiplications underlying M_1 are taken over G_1 = F_2[z_1]/(z_1^8 + z_1^4 + z_1^3 + z_1 + 1).

The keystream generation phase of SNOW 3G is depicted in Fig.2. For more details on the design, please refer to the original design document [SAG]. SNOW 3G preserves all features of SNOW 2.0, but adds a third register R3 and a transformation S_2 to the FSM. The FSM part has three 32-bit registers R1, R2 and R3. The output of the FSM is F_t = (s_{t+15} ⊞ R1_t) ⊕ R2_t, and the keystream is generated by z_t = F_t ⊕ s_t. The registers R1, R2 and R3 are updated according to R1_{t+1} = (s_{t+5} ⊕ R3_t) ⊞ R2_t, R2_{t+1} = S_1(R1_t) and R3_{t+1} = S_2(R2_t).
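As a quick sanity check of the two byte fields involved here, the following minimal Python sketch multiplies in F_2^8 by shift-and-reduce; the constants 0x1B and 0xA9 are the field constants from the text, and the worked product {57}·{83} = {c1} in the AES field is the standard FIPS-197 example. This is an illustrative sketch, not a reference implementation of either cipher.

```python
# Multiplication in F_2^8 for a given reduction polynomial, expressed by its
# low-byte "field constant" (the polynomial is z^8 plus the constant's bits).
def gf_mul(a, b, const):
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a          # add a * (current power of z)
        b >>= 1
        hi = a & 0x80
        a = (a << 1) & 0xFF  # multiply a by z ...
        if hi:
            a ^= const       # ... and reduce modulo the field polynomial
    return r

AES_CONST  = 0x1B  # z^8+z^4+z^3+z+1, the MixColumn field of S(.)
LFSR_CONST = 0xA9  # z^8+z^7+z^5+z^3+1, the beta-field of the LFSR

# FIPS-197 worked example in the AES field: {57} * {83} = {c1}.
assert gf_mul(0x57, 0x83, AES_CONST) == 0xC1
# Multiplying by 1 is the identity in either field.
assert gf_mul(0x53, 0x01, LFSR_CONST) == 0x53
```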

Comparing Large-unit and Bitwise Linear Approximations of SNOW 2.0
In this section, we first recap the previous bitwise linear approximations of SNOW 2.0 used in [WBDC03, NW06, LLP08, GZ20, FTIM18], then review the byte-wise linear approximations in [ZXM15], and finally illustrate the relationship between the bitwise and byte-wise linear approximations with some concrete examples.

Recap on the Bitwise Linear Approximations of the FSM
In [WBDC03], the linear masking method was applied to SNOW 2.0, and the bitwise linear approximations over two rounds of the FSM were constructed, as depicted in Fig.3. In [WBDC03], it is always assumed that all masks Γ_i ∈ F_2^32 used in the linear approximations in Fig.3 have the same value. Denoting all the masks by Γ, the bitwise linear approximation of the FSM takes the form of relation (1), where n^(t) is the binary noise. [NW06] considered the case when the output masks at time t and t + 1 are different, and used the bitwise linear approximation relation (1) four times according to the feedback polynomial of the LFSR to build the linear distinguisher: twice with the mask tuple (Γ, Λ) at times t + 2 and t + 16, once with the mask tuple (Γα, Λα) at time t, and once with the mask tuple (Γα^{-1}, Λα^{-1}) at time t + 11. Let c_FSM(Γ, Λ) denote the correlation of the linear approximation relation (1) under the bitwise mask tuple (Γ, Λ). By the Piling-up Lemma, the total correlation for the linear distinguisher combining these four approximations is c_FSM^2(Γ, Λ) · c_FSM(Γα, Λα) · c_FSM(Γα^{-1}, Λα^{-1}). Using the bitwise masks Γ = Λ = 0x00018001 for approximating the FSM, they constructed their best linear distinguisher and presented a distinguishing attack accordingly. For Γ = Λ = 0x00018001, they obtained c_FSM(Γ, Λ) = 2^{-14.496}, c_FSM(Γα, Λα) = 2^{-26.676} and c_FSM(Γα^{-1}, Λα^{-1}) = 2^{-30.221}, and thus the correlation of their best linear distinguisher is 2^{-85.89}. Using this mask tuple, a distinguishing attack on SNOW 2.0 was given with time complexity 2^174, given 2^174 keystream words. Later in [LLP08], the same bitwise masks Γ = Λ = 0x00018001 for the linear approximation (1) were applied to launch a correlation attack on SNOW 2.0 with time complexity 2^212.38, given 2^193.77 keystream words.

Table 1: The best bitwise mask tuples (Γ, Λ) for the linear approximation (1) in [GZ20]
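The combined correlation of the [NW06] distinguisher can be checked with a two-line computation; the sketch below simply adds the base-2 exponents of the four constituent correlations (the (Γ, Λ) tuple enters twice, the two α-shifted tuples once each).

```python
import math

# Piling-up combination for the [NW06] distinguisher on SNOW 2.0:
# the exponents of the four correlations add as 2*a + b + c.
a = -14.496   # log2 of c_FSM(Gamma, Lambda), used at times t+2 and t+16
b = -26.676   # log2 of c_FSM(Gamma*alpha, Lambda*alpha), used at time t
c = -30.221   # log2 of c_FSM(Gamma*alpha^-1, Lambda*alpha^-1), at time t+11
total = 2 * a + b + c
print(total)  # -85.889, i.e. a distinguisher correlation of 2^-85.89
assert math.isclose(total, -85.889, rel_tol=1e-9)
```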
Recently, [GZ20] investigated the bitwise linear approximation of a certain type of composition function present in SNOW 2.0 and proposed a linear-time algorithm for computing the correlation. Based on this algorithm, they found some strong bitwise mask tuples for the linear approximation (1) yielding high correlations, of which the best three are listed in Table 1. Note that the mask tuple Γ = Λ = 0x01800001 numbered (1) in Table 1 is the same as the one in [FTIM18], which was found by a MILP-aided automatic search algorithm, and Γ = Λ = 0x00018001 numbered (2) is the same as the one found in [NW06]. They also applied these bitwise masks to launch a fast correlation attack with total time complexity 2^162.86 and data complexity 2^159.62, improving the previous fast correlation attacks in [ZXM15, FTIM18].

Recap on the Byte-wise Linear Approximations of the FSM
In [ZXM15], the large-unit approach was used to achieve improvements over the previous attacks on SNOW 2.0, where two-round byte-wise linear approximations for the FSM of SNOW 2.0 were constructed, as depicted in Fig.4.
Let T = (T_0, T_1, T_2, T_3) and N = (N_0, N_1, N_2, N_3) be 4-byte linear masks defined over the AES MixColumn field F_2^8, i.e., G_1 = F_2[z_1]/(z_1^8 + z_1^4 + z_1^3 + z_1 + 1), where T_0 and N_0 are the least significant bytes. First, T and N are applied to z_t and z_{t+1} respectively as T * z_t and N * z_{t+1}, where "*" is operated in G_1. Then two byte-wise linear approximations are used; letting n_2 denote the folded noise introduced by these two linear approximations, the byte-wise linear approximation (2) for the FSM of SNOW 2.0 is obtained. It should be noted that the operation "*" in the linear approximation (2), and also the byte-wise masks T, N, are defined in the AES MixColumn field G_1, while the state elements s_{t+i} are generated by the LFSR defined with a different polynomial (field constant 0xA9). Thus it is necessary to unify the two fields for an efficient decoding. In [ZXM15], a general routine is described to solve this problem, which tries to find an equivalent representation of the LFSR part so that it is defined over the new F_2^32 field. In this paper, as illustrated in Section 4.2 for the SNOW 3G case, we describe in more detail how to unify the fields defined by different polynomials from a different perspective.
In [ZXM15], two algorithms are provided to compute the distributions of n_1 and n_2 over the AES MixColumn field F_2^8, with complexities 2^26.58 and 2^33.58 respectively for each given byte-wise mask tuple. In the following parts, we provide two slightly improved algorithms to compute these distributions, whose complexities are 2^20.25 and 2^27.33 respectively.

Improving the Computation of the Distribution of n_1
For the given 4-byte masks T and N, let N' = N * M_1. The noise variable n_1 can then be expressed in terms of X and Y, where X and Y are uniformly distributed random variables in F_2^32, written byte-wise as X = (X_0, X_1, X_2, X_3) and Y = (Y_0, Y_1, Y_2, Y_3). Following the basic idea of Algorithm 3 in [ZXM15], we first split the expression of n_1 into 4 sub-expressions n_1^j (j = 0, 1, 2, 3), where "⊞_8" represents the addition modulo 2^8, and cr_j ∈ {0, 1} are local carries introduced by the addition modulo 2^32 such that cr_j = ⌊(X_j + S_R^{-1}(Y_j) + cr_{j-1})/2^8⌋ for j = 0, 1, 2, 3 (cr_{-1} = 0 by default), with "⌊·⌋" being the floor function over the integers. The sub-expressions are connected with each other by the one-directional information propagation from the least significant n_1^0 to the most significant n_1^3, caused by the local carries introduced by the addition modulo 2^32 and the output of n_1^j.
Based on the above, we describe our algorithm for computing the distribution of n_1 as follows. We first describe a sub-algorithm, ComputeMatrix(cr_{j-1}, T_j, N_j), which computes the values of n_1^j together with the output carry cr_j, and stores all the output and carry information in a 256 × 2 matrix Mat. Then we present Algorithm 1 to compute the distribution of n_1 by using the sub-algorithm ComputeMatrix.
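The carry chain underlying the sub-expressions n_1^j can be illustrated with plain modular addition. This is a simplified sketch: the S_R^{-1} layer of the actual noise expression is omitted, since only the byte-wise carry propagation is being demonstrated.

```python
# Byte-wise split of a 32-bit modular addition with explicit local carries,
# mirroring the decomposition of n_1 into sub-expressions n_1^j:
#   cr_j = floor((X_j + Y_j + cr_{j-1}) / 2^8),
# with the carry flowing from the least to the most significant byte.
def split_add32(X, Y):
    out, cr = [], 0
    for j in range(4):
        s = ((X >> (8 * j)) & 0xFF) + ((Y >> (8 * j)) & 0xFF) + cr
        out.append(s & 0xFF)     # byte j of X boxplus Y
        cr = s >> 8              # local carry cr_j in {0, 1}
    return out

# The byte-wise chain reproduces the full addition modulo 2^32.
X, Y = 0x12F4A67C, 0x0FE2B390
ref = (X + Y) & 0xFFFFFFFF
assert split_add32(X, Y) == [(ref >> (8 * j)) & 0xFF for j in range(4)]
```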

New Results of the Byte-wise Linear Approximations
As illustrated in the above sections, the distributions of the noise variables n_1 and n_2 can be accurately computed by Algorithm 1 and Algorithm 2. Then the distribution of the folded noise variable n = n_1 ⊕ n_2 can be derived as the convolution of the two noise distributions. In our experiments, we have carried out a wide-ranging search for good byte-wise masks (T, N) for SNOW 2.0. One important observation from our experiments is that the best byte-wise mask tuple (T, N) given in [ZXM15], i.e., T = N = (0x03, 0x00, 0x01, 0x00), is not optimal. We have found two more independent byte-wise masks which give larger SEI values, as shown in Table 2.
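The convolution step just mentioned (for n = n_1 ⊕ n_2) is an XOR-convolution of the two distributions. A toy Python sketch over a 2-bit alphabet instead of F_2^8; the distributions are illustrative, not the actual noise distributions.

```python
# Distribution of n = n1 xor n2 for independent n1, n2:
#   Pr{n = a} = sum_b Pr{n1 = b} * Pr{n2 = a xor b}.
def xor_convolve(p1, p2):
    out = [0.0] * len(p1)
    for b, pb in enumerate(p1):
        for c, pc in enumerate(p2):
            out[b ^ c] += pb * pc
    return out

biased  = [0.5, 0.25, 0.125, 0.125]
uniform = [0.25] * 4

# Convolving with the uniform distribution destroys all bias ...
assert all(abs(p - 0.25) < 1e-12 for p in xor_convolve(biased, uniform))
# ... while folding two biased noises keeps a (smaller) bias.
print(xor_convolve(biased, biased))
```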

Examples of Relations Between Large-unit and Bitwise Linear Approximations
According to Property 1, the SEI of a smaller-unit linear approximation is not larger than that of the original large-unit one. Besides, Property 2 indicates that approximations on large-unit alphabets have advantages over all the bitwise ones derived from the large-unit one. For SNOW 2.0, we let F_(T,N) denote the byte-wise linear approximation with the 4-byte mask tuple (T, N), and f_(Γ,Λ) denote the bitwise linear approximation with the 32-bit mask tuple (Γ, Λ). Note that F^v_(T,N) = v·F_(T,N) is a bitwise linear approximation for any nonzero vector v ∈ F_2^m, and we always have ∆(F^v_(T,N)) ≤ ∆(F_(T,N)). For SNOW 2.0, we have m = 8.
Let I = (1, 0, 0, 0, 0, 0, 0, 0); then ∆(F^I_(T,N)) ≤ ∆(F_(T,N)) for any given 4-byte mask tuple (T, N). Below we compare the bitwise masks in Table 1 with the byte-wise masks in Table 2, and give some concrete examples to show the relationship between the 8-bit linear approximation F_(T,N) and the bitwise linear approximation F^I_(T,N).
• Similarly, for T = N = (0x03, 0x00, 0x01, 0x00) numbered (3) in Table 2 and Γ = Λ = 0x00010081 numbered (3) in Table 1, we have verified that the SEI of the byte-wise approximation is almost the same as the squared correlation of the corresponding bitwise one.

The results on SNOW 2.0 in [ZXM15] gave the impression that large-unit approximations lead to larger SEI and also to better attacks. In our experiments, however, we have found many concrete examples of 8-bit large-unit linear approximations for SNOW 2.0 whose certain 1-dimensional bitwise linear approximations have almost the same SEI as that of the original large-unit ones. That is, each of these byte-wise linear approximations is dominated by a single bitwise approximation, and thus the whole SEI is not essentially larger than the SEI (squared correlation) of the dominating single bitwise approximation. Since correlation attacks can be more efficiently implemented using bitwise approximations rather than byte-wise approximations, this provides an opportunity to achieve improvement over the large-unit linear approximation attack on SNOW 2.0 in [ZXM15]. Actually, a bitwise fast correlation attack on SNOW 2.0 has been mounted in [GZ20] by using multiple bitwise masks as listed in Table 1, with total time complexity 2^162.86 and data complexity 2^159.62, slightly improving the large-unit correlation attack of [ZXM15], whose total time complexity is 2^164.15 and data complexity is 2^163.59.

Comparing Large-unit and Bitwise Linear Approximations of SNOW 3G
In this section, we first study the bitwise and the byte-wise linear approximations of the FSM of SNOW 3G respectively, and then illustrate the relationship between them with some concrete examples.

Bitwise Linear Approximations of the FSM
In [NW06], the bitwise linear approximations over three rounds of the FSM were depicted, as shown in Fig. 5, but only rough estimates of the upper bounds of their correlations were given. We consider a similar approach to analyze SNOW 3G against linear approximation attacks. That is, we try to approximate the FSM part through linear masking and then cancel out the contributions of the registers R1, R2 and R3 by combining expressions for several keystream words. Generally, to build the bitwise linear approximation of the FSM of SNOW 3G, we apply the 32-bit linear masks Φ, Γ and Λ to z_{t-1}, z_t and z_{t+1} respectively through linear masking. Let u_t = R1_{t-1}, v_t = R2_{t-1} and w_t = R1_t. According to the update expressions for the registers of the FSM, the first register R1 is updated as R1_{t+1} = (s_{t+5} ⊕ R3_t) ⊞ R2_t. We then have R1_{t+1} = (s_{t+5} ⊕ S_2(v_t)) ⊞ S_1(u_t), R2_t = S_1(u_t) and R2_{t+1} = S_1(w_t), so that the three masked keystream words can be expressed in terms of u_t, v_t, w_t and the LFSR states. Regarding the internal states and keystream words, we consider four associated linear approximations, introducing a 32-bit intermediate linear mask Θ. Basically, we want to find mask tuples (Φ, Γ, Λ) for (3) which yield highly biased linear approximations, and then employ them in a bitwise fast correlation attack. Before that, we present the following illustrations for the four linear approximation relations.
1. For the linear approximation relation (1), we write u_t = (u_{t,0} u_{t,1} u_{t,2} u_{t,3}), where u_{t,j} ∈ F_2^8 for j = 0, 1, 2, 3. Let x_t = (x_{t,0} x_{t,1} x_{t,2} x_{t,3}) be the output of the four parallel AES S-boxes S_R, i.e., x_{t,j} = S_R(u_{t,j}), so that S_1(u_t) = M_1 · x_t, where M_1 is expressed as a 32 × 32 binary matrix and x_t is viewed as a 32-bit variable. Given the 32-bit linear mask Θ, we let Θ' be the mask such that Θ' · x_t = Θ · (M_1 · x_t). (We refer to Appendix A for the computation of Θ' from the given mask Θ.) Based on these expressions, the noise e^(t)_1 can be rewritten accordingly. 2. Similarly, for the linear approximation relation (2), we let y_t = Sbox(w_t), and Λ' be the mask such that Λ' · y_t = Λ · (M_1 · y_t). Then the noise e^(t)_2 can be expressed accordingly.

3. For the linear approximation relation (3), the corresponding noise e^(t)_3 is introduced analogously.
4. For the linear approximation relation (4), note that the transformation S_2 of SNOW 3G is composed of four parallel 8-bit to 8-bit substitutions S_Q, followed by the AES MixColumn transform M_2. We will use Sbox'(·) to denote the output of the four parallel substitutions S_Q. Given the mask Λ, let Λ'' be the mask such that Λ'' · x = Λ · (M_2 · x) for all 32-bit x, where M_2 is expressed as a 32 × 32 binary matrix. (See Appendix B for the computation of Λ'' from the given mask Λ.)

To sum up, the four linear approximation relations can be rewritten in terms of the transformed masks. Since the distributions of the e^(t)_j are independent of the time instance t, we simplify them by writing e_j, and denote by c(e_j) the corresponding correlation to zero, i.e., c(e_j) = Pr{e_j = 0} − Pr{e_j = 1} for each j. Let c_FSM(Φ, Γ, Λ) denote the total correlation of the resulting bitwise linear approximation of the FSM.

Results. In Section 5, we will present the detailed process of searching for linear mask tuples (Φ, Γ, Λ) such that |c_FSM(Φ, Γ, Λ)| is as large as possible. The results we obtained are as follows:
• For Λ = 0x1014190f, there exist three linear mask tuples (Φ, Γ, Λ) such that |c_FSM| ≥ 2^{-21}; the linear masks Φ and Γ are listed in Table 3;
• For Λ = 0x1014190f, there exist 16 linear mask tuples (Φ, Γ, Λ) such that 2^{-22} < |c_FSM| < 2^{-21}; the linear masks Φ and Γ are listed in Table 4.
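The mask transformations used in items 1, 2 and 4 (finding a mask Θ' with Θ'·x = Θ·(M·x) for a binary matrix M) amount to multiplying the mask by the transpose M^T over F_2. A toy 8-bit Python sketch with a random illustrative matrix, not the actual M_1 or M_2:

```python
# For a linear map x -> M*x over F_2, the mask Theta' with
# Theta'.x = Theta.(M*x) for all x is Theta' = M^T * Theta.
# Rows of M are stored as integers: bit k of M[i] is M_{i,k}.
import random

def parity(x):
    return bin(x).count("1") & 1

def apply_mat(M, x):
    # (M*x)_i = parity(M_i & x)
    return sum(parity(Mi & x) << i for i, Mi in enumerate(M))

def transpose(M, n=8):
    return [sum(((M[j] >> i) & 1) << j for j in range(n)) for i in range(n)]

random.seed(7)
M = [random.randrange(256) for _ in range(8)]  # illustrative binary matrix
Theta = 0xB5                                   # illustrative mask
ThetaP = apply_mat(transpose(M), Theta)        # Theta' = M^T * Theta

# Verify Theta'.x = Theta.(M*x) exhaustively on the toy 8-bit space.
for x in range(256):
    assert parity(ThetaP & x) == parity(Theta & apply_mat(M, x))
```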

Byte-wise Linear Approximations of the FSM
Recent work in the cryptanalysis of stream ciphers has paid more attention to approximations over larger alphabets, and has shown that such approximations can improve the attacks. In [ZXM15], the large-unit approach was used to achieve improvements over the previous attacks on SNOW 2.0. In [YJM19], inspired by the results of [ZXM15], Yang et al. constructed three-round byte-wise linear approximations for the FSM of SNOW 3G and searched for actual byte-wise masks giving high SEI values for the approximations. The byte-wise linear approximations found in [YJM19] were also applied to launch a fast correlation attack against SNOW 3G. In this section, we study directly the byte-wise linear approximations of SNOW 3G, strictly following a procedure similar to that in [ZXM15] for approximating the FSM part.

In this part, all the 32-bit words are divided into 4 bytes and regarded as 4-dimensional vectors over $\mathbb{F}_{2^8}$. As defined in Section 2.1, for two 4-dimensional vectors $x = (x_0, x_1, x_2, x_3)$ and $y = (y_0, y_1, y_2, y_3)$ over $\mathbb{F}_{2^8}$ with the defining polynomial $p(z)$, we have $x * y = x_0y_0 \oplus x_1y_1 \oplus x_2y_2 \oplus x_3y_3$ with the multiplications $x_iy_i$ taken over $\mathbb{F}_{2^8}$.

To build the byte-wise linear approximation for the FSM of SNOW 3G, we first apply the 4-byte linear masks $Q$, $T$ and $N$ to $z_{t-1}$, $z_t$ and $z_{t+1}$ respectively. Regarding the internal states and keystream words, we consider four linear approximations obtained by introducing a 4-byte intermediate linear mask $\Omega$, where $e_j^{(t)}$ for $j = 1, 2, 3, 4$ are the 8-bit noises introduced by these approximations. With these relations, the byte-wise linear approximation of the FSM of SNOW 3G follows, where $e^{(t)} = e_1^{(t)} \oplus e_2^{(t)} \oplus e_3^{(t)} \oplus e_4^{(t)}$ is the folded noise introduced by the four linear approximations.
We will try to find byte-wise mask tuples $(Q, T, N)$ for the linear approximation (4) such that the SEIs of the distributions of $e^{(t)}$ are as large as possible; these can then be used in a fast correlation attack over $\mathbb{F}_{2^8}$. It is important to note that the fields $\mathbb{F}_{2^8}$ involved in $e^{(t)}$ have three different defining polynomials:
• $z^8 + z^7 + z^5 + z^3 + 1 \in \mathbb{F}_2[z]$ (field constant 0xA9), used for defining the LFSR of SNOW 3G. We denote the corresponding field by $G = \mathbb{F}_2[z]/(z^8 + z^7 + z^5 + z^3 + 1)$. For two 4-dimensional vectors $x = (x_0, x_1, x_2, x_3)$ and $y = (y_0, y_1, y_2, y_3)$ with $x_i, y_i \in G$, the inner product $x * y = x_0y_0 \oplus x_1y_1 \oplus x_2y_2 \oplus x_3y_3$ is computed with the multiplications $x_iy_i$ taken over $G$;
• $z^8 + z^4 + z^3 + z + 1 \in \mathbb{F}_2[z]$, the AES polynomial underlying $S_1$; we denote the corresponding field by $G_1$;
• $z^8 + z^6 + z^5 + z^3 + 1 \in \mathbb{F}_2[z]$, the polynomial underlying $S_2$; we denote the corresponding field by $G_2$.
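The arithmetic of $G$ and the 4-byte inner product can be sketched as follows; 0x1A9 is the bit pattern of $z^8 + z^7 + z^5 + z^3 + 1$, and reducing $z^8$ indeed produces the field constant 0xA9 mentioned above:

```python
MOD = 0x1A9  # z^8 + z^7 + z^5 + z^3 + 1

def gmul(a, b):
    # multiplication in G = F_2[z]/(z^8 + z^7 + z^5 + z^3 + 1)
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= MOD
    return r

def inner(x, y):
    # x * y = x0*y0 XOR x1*y1 XOR x2*y2 XOR x3*y3, multiplications over G
    r = 0
    for xi, yi in zip(x, y):
        r ^= gmul(xi, yi)
    return r

assert gmul(0x02, 0x80) == 0xA9          # z * z^7 = z^8 = 0xA9 in G
assert inner((1, 0, 0, 0), (0x5A, 7, 7, 7)) == 0x5A
```

Swapping the modulus for 0x11B or 0x169 gives the corresponding arithmetic in $G_1$ and $G_2$.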
We need to unify all the involved multiplications in $\mathbb{F}_{2^8}$ by specifying a single defining polynomial. Different from the method in [ZXM15], which unifies two fields for SNOW 2.0, here we unify three fields for SNOW 3G during the course of approximating the FSM. We will use $G = \mathbb{F}_2[z]/(z^8 + z^7 + z^5 + z^3 + 1)$ as the specified field; that is, all the masks are defined over $G$, and the multiplications over $\mathbb{F}_{2^8}$ are taken modulo the polynomial $z^8 + z^7 + z^5 + z^3 + 1$. Let $M$ denote the AES MixColumn operation defined over $G$. Then:
• For any 4-byte value $w = (w_0, w_1, w_2, w_3)$, let $W = (W_0, W_1, W_2, W_3)$ be the output of the four parallel AES S-boxes $S_R(\cdot)$, i.e., $W = \mathrm{Sbox}(w) = (S_R(w_0), S_R(w_1), S_R(w_2), S_R(w_3))$. According to the definition of $S_1(\cdot)$, we have $S_1(w) = M_1 *_1 W$, where $M_1$ is the AES MixColumn operation defined over $G_1$ and the operation "$*_1$" is taken over $G_1$. Let $W' = l_1(W)$ be another 4-byte value such that $M * W' = M_1 *_1 W$, where the involved multiplications on the left and right sides are taken over $G$ and $G_1$ respectively. Then we derive $S_1(w) = M * l_1(W)$. We refer to Equation (12) in Appendix C for the computation of $W'$ from $W$.

• For any 4-byte value $v = (v_0, v_1, v_2, v_3)$, let $V = (V_0, V_1, V_2, V_3)$ be the output of the four parallel substitutions $S_Q(\cdot)$, i.e., $V = \mathrm{Sbox}'(v) = (S_Q(v_0), S_Q(v_1), S_Q(v_2), S_Q(v_3))$. According to the definition of $S_2(\cdot)$, we have $S_2(v) = M_2 *_2 V$, where $M_2$ is the AES MixColumn operation defined over $G_2$ and the operation "$*_2$" is taken over $G_2$. Let $V' = l_2(V)$ be another 4-byte value such that $M * V' = M_2 *_2 V$, where the involved multiplications on the left and right sides are taken over $G$ and $G_2$ respectively. Then we derive $S_2(v) = M * l_2(V)$. We refer to Equation (13) in Appendix D for the computation of $V'$ from $V$.
After unifying all the involved operations to $G = \mathbb{F}_2[z]/(z^8 + z^7 + z^5 + z^3 + 1)$, we can restate the above four linear approximation relations as follows.
1. For the linear approximation relation (1), we let $U_t = \mathrm{Sbox}(u_t)$ be the output of the four parallel AES S-boxes $S_R(\cdot)$; then $S_1(u_t) = M_1 *_1 U_t$. Let $U_t' = l_1(U_t)$ be the 4-byte value such that $M * U_t' = M_1 *_1 U_t$, computed by Equation (12) in Appendix C. We then obtain the rewritten relation, where $\Omega' = \Omega * M$, with the involved multiplications taken over the unified field $G$.
2. Similarly, for the linear approximation relation (2), we let $W_t = \mathrm{Sbox}(w_t)$, and thus obtain the rewritten relation, where $N' = N * M$, with the involved multiplications taken over $G$.
3. For the linear approximation relation (3), we set $\xi_t = s_{t+5} \oplus S_2(v_t)$ and $\eta_t = S_1(u_t)$, and express the noise $e_3^{(t)}$ in terms of $\xi_t$ and $\eta_t$.
4. For the linear approximation relation (4), we let $V_t = \mathrm{Sbox}'(v_t)$ be the output of the four parallel 8-bit to 8-bit substitutions $S_Q(\cdot)$; then $S_2(v_t) = M_2 *_2 V_t$. Let $V_t' = l_2(V_t)$ be the 4-byte value such that $M * V_t' = M_2 *_2 V_t$, computed by Equation (13) in Appendix D. We then obtain the rewritten relation, where $N' = N * M$, with the involved multiplications taken over $G$.
To sum up, the four linear approximation relations can be rewritten in the unified field $G$. As described in Section 3.2, two-round byte-wise linear approximations for the FSM of SNOW 2.0 were constructed in [ZXM15], and searches were performed for finding byte-wise masks. Based on this, we have provided two improved algorithms in Sections 3.2.1 and 3.2.2, namely Algorithm 1 and Algorithm 2, to compute the distributions of two types of byte-wise linear approximations, with complexities $2^{20.25}$ and $2^{27.33}$ respectively. For SNOW 3G, we have computed the SEI of the distributions of $e_j$ for $j = 1, 2$, i.e., $\Delta(e_j)$, by modifying Algorithm 1 to account for the linear transform $l_1(\cdot)$ introduced when unifying the field $\mathbb{F}_{2^8}$, and computed $\Delta(e_3)$ by Algorithm 2. For $e_4$, since $\mathrm{Sbox}'^{-1}(\cdot)$ consists of four parallel applications of $S_Q^{-1}$, which do not affect the independence among the bytes of $V_t$, the distribution of $e_4$ can be derived by a method similar to that in Algorithm 1 and Algorithm 2. After this, the SEI of the distribution of $e$, i.e., $\Delta(e)$, can be obtained by the convolution of the above four distributions.
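The final step amounts to XOR-convolving the four single-noise distributions and measuring the SEI of the result; a small sketch with generic distributions over $\mathbb{F}_{2^8}$ standing in for the actual distributions of $e_1, \ldots, e_4$ (which come out of Algorithms 1 and 2):

```python
import random

def xor_convolve(p, q):
    # distribution of X XOR Y for independent X ~ p, Y ~ q over F_2^n
    r = [0.0] * len(p)
    for x, px in enumerate(p):
        for y, qy in enumerate(q):
            r[x ^ y] += px * qy
    return r

def sei(p):
    # Squared Euclidean Imbalance: Delta(p) = 2^n * sum_x (p[x] - 2^-n)^2
    n = len(p)
    u = 1.0 / n
    return n * sum((px - u) ** 2 for px in p)

random.seed(2)
dists = []
for _ in range(4):  # four toy noise distributions standing in for e_1..e_4
    w = [random.random() for _ in range(256)]
    s = sum(w)
    dists.append([x / s for x in w])
e = dists[0]
for d in dists[1:]:
    e = xor_convolve(e, d)
# folding independent noises can only shrink the imbalance
assert sei(e) <= min(sei(d) for d in dists)
```

The closing assertion reflects the general fact that each squared correlation is at most 1, so the SEI of the folded noise never exceeds that of any single component.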

Examples of Relations Between Large-unit and Bitwise Linear Approximations
For SNOW 3G, we let $G_{(Q,T,N)}$ denote the byte-wise linear approximation with the 4-byte mask tuple $(Q, T, N)$, and $g_{(\Phi,\Gamma,\Lambda)}$ denote the bitwise linear approximation with the 32-bit mask tuple $(\Phi, \Gamma, \Lambda)$. For a bit-selection vector $I$ over $\mathbb{F}_2^8$, let $G^I_{(Q,T,N)}$ denote the 1-dimensional approximation obtained by applying $I$ to $G_{(Q,T,N)}$; e.g., for $I = (1, 0, 0, 0, 0, 0, 0, 0)$, we always have $\Delta(G^I_{(Q,T,N)}) \leq \Delta(G_{(Q,T,N)})$. As was done in Section 3.3, below we give some concrete examples of byte-wise mask tuples $(Q, T, N)$ such that $\Delta(G^I_{(Q,T,N)}) \approx \Delta(G_{(Q,T,N)})$, by comparing the bitwise and byte-wise masks in Table 3 and Table 5. This means that each of these byte-wise linear approximations has a single dominating bitwise linear approximation whose squared correlation almost equals the SEI of the whole byte-wise approximation.
In our experiments, we have found many concrete examples of 8-bit linear approximations for SNOW 3G for which a certain 1-dimensional bitwise linear approximation has almost the same SEI as the original large-unit one. As explained in Section 3.3, correlation attacks can be implemented more efficiently using bitwise approximations than large-unit ones. Thus we have an opportunity to improve the attack against SNOW 3G in [YJM19], which mounted a fast correlation attack using 8-bit (vectorized) linear approximations. We will show the detailed process of the bitwise fast correlation attack on SNOW 3G in Section 6.
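The dominance phenomenon is conveniently analyzed through the identity $\Delta(D) = \sum_{w \neq 0} \varepsilon_w^2$, where $\varepsilon_w$ is the correlation of the bitwise approximation selected by the mask $w$ (a consequence of Parseval's relation). A quick numerical check on a random 8-bit distribution:

```python
import random

def sei(p):
    n = len(p)
    u = 1.0 / n
    return n * sum((px - u) ** 2 for px in p)

def bit_corr(p, mask):
    # correlation of the 1-dimensional approximation selected by mask
    return sum(px * (-1) ** bin(x & mask).count("1") for x, px in enumerate(p))

random.seed(0)
w = [random.random() for _ in range(256)]
s = sum(w)
p = [x / s for x in w]

total = sum(bit_corr(p, m) ** 2 for m in range(1, 256))
# Parseval: the SEI equals the sum of the squared bitwise correlations
assert abs(sei(p) - total) < 1e-9
```

If one term $\varepsilon_{w_0}^2$ accounts for essentially all of the sum, the byte-wise approximation is dominated by the single bitwise approximation with mask $w_0$, which is exactly the situation observed in Tables 3 and 5.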

Search for Bitwise Masks of SNOW 3G
In this section, we describe how to search for bitwise mask tuples (Φ, Γ, Λ) of the linear approximation relation (3) for the FSM of SNOW 3G.

Computing the Bitwise Linear Approximations of the FSM
As described in Section 4.1, the bitwise linear approximations for the FSM of SNOW 3G have the form of (3), built from four linear approximations with the noise variables $e_1, e_2, e_3, e_4$. For any given bitwise mask tuple $(\Phi, \Gamma, \Lambda)$, the correlation of the linear approximation relation (3) is computed as
$$\varepsilon_{\mathrm{FSM}}(\Phi, \Gamma, \Lambda) = \varepsilon(e_2)\,\varepsilon(e_4) \sum_{\Theta} \varepsilon(e_1)\,\varepsilon(e_3).$$
We try to find $(\Phi, \Gamma, \Lambda)$ such that $|\varepsilon_{\mathrm{FSM}}(\Phi, \Gamma, \Lambda)|$ is as large as possible. Accordingly, we need to compute the correlations of $e_1, e_2, e_3, e_4$ for given masks. In the following parts, we show in detail how to compute these correlations in linear time by utilizing the existing algorithms in [GZ20, NW06].

Computation of the Correlations of $e_1$ and $e_2$.
Note that the noises $e_1$ and $e_2$ have the same form but different 32-bit mask tuples, namely $(\Phi, \Theta')$ for $e_1$ and $(\Gamma, \Lambda')$ for $e_2$. As was done in [GZ20] for the linear approximation of SNOW 2.0, a certain type of function $G: \mathbb{F}_2^{32} \times \mathbb{F}_2^{32} \to \mathbb{F}_2^{32}$ is derived from the expressions of $e_1$ and $e_2$, where $x^{(1)}, x^{(2)}$ are both 32-bit (4-byte) random variables and "$\boxplus$" denotes addition modulo $2^{32}$. We note that $\varepsilon(e_1)$ is exactly the correlation of the linear approximation of $G$ with the output mask $\Phi$ and the input masks $\Phi$ and $\Theta'$, and $\varepsilon(e_2)$ equals the correlation of the linear approximation of $G$ with the output mask $\Gamma$ and the input masks $\Gamma$ and $\Lambda'$; thus we have $\varepsilon(e_1) = \varepsilon_G(\Phi; \Phi, \Theta')$ and $\varepsilon(e_2) = \varepsilon_G(\Gamma; \Gamma, \Lambda')$.
Let $A'$ be the 32-bit output mask of $G$, and $A, B$ be the 32-bit input masks; the correlation of the linear approximation of $G$ under the bitwise mask tuple $(A'; A, B)$ is denoted by $\varepsilon_G(A'; A, B)$. In [GZ20], a linear-time algorithm is proposed to compute this correlation for an arbitrary bitwise mask tuple, and it is then used to mount attacks on SNOW 2.0 and SNOW 3G. The general idea is to divide the 32-bit values into four 8-bit values according to the specific structure of the underlying function $S_R^{-1}(\cdot)$, to pre-compute and store some useful matrices independent of the input/output masks, and finally to compute the correlation under an arbitrary bitwise mask tuple by matrix multiplications using these pre-computed matrices. To be specific, for any 32-bit mask tuple $(A'; A, B)$ of $G$, we write $A'$, $A$ and $B$ in bytes as $A' = (A'_0\,A'_1\,A'_2\,A'_3)$, $A = (A_0\,A_1\,A_2\,A_3)$ and $B = (B_0\,B_1\,B_2\,B_3)$ with $A'_j, A_j, B_j \in \mathbb{F}_{2^8}$ for $j = 0, 1, 2, 3$; then $\varepsilon_G(A'; A, B)$ is obtained as the ordered product of $l_2$, the four $2 \times 2$ matrices $M_{(A'_j, A_j, B_j)}$ pre-computed by Algorithm 3 as shown below, and $e_0$, where $l_2 = (1, 1)$ is a row vector and $e_0 = (1, 0)^T$ is a column vector.

Computation of the Correlation of $e_3$.
Note that the correlation of $e_3$ is closely related to the correlation of the addition modulo $2^{32}$ with three inputs. The $k$-input addition modulo $2^n$ is defined as $F: \mathbb{F}_{2^n} \times \cdots \times \mathbb{F}_{2^n} \to \mathbb{F}_{2^n}$ such that $F(x^{(1)}, \ldots, x^{(k)}) = x^{(1)} \boxplus \cdots \boxplus x^{(k)}$, where "$\boxplus$" denotes addition modulo $2^n$. Let $\varepsilon_{+}(\Gamma^{(0)}; \Gamma^{(1)}, \ldots, \Gamma^{(k)})$ denote the correlation of $F$ with respect to the $n$-bit output mask $\Gamma^{(0)}$ and the $n$-bit input masks $\Gamma^{(1)}, \ldots, \Gamma^{(k)}$. In [NW06], the authors proposed a linear-time algorithm to accurately compute the correlation of the linear approximation of $F$ for any given mask tuple; we describe it in the following theorem.
Theorem 1 ([NW06]). Let $k > 1$ be a fixed integer. For all $R \in \mathbb{F}_2^{k+1}$, let $D_R$ be the $k \times k$ matrix whose $(oc, ic)$-element is determined by $R$ as specified in [NW06]. Let $l_k$ be the row vector of length $k$ with all elements equal to 1, and let $e_0$ be the column vector of length $k$ with a single 1 in the 0-th row and 0 elsewhere. For any given mask tuple $(\Gamma^{(0)}, \Gamma^{(1)}, \ldots, \Gamma^{(k)})$ of the $k$-input addition modulo $2^n$, write each $\Gamma^{(i)}$ in bits $\Gamma^{(i)}_j$ for $j = 0, 1, \ldots, n-1$; the correlation is then obtained as an ordered matrix product sandwiched between $l_k$ and $e_0$.
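For small $n$ the correlation $\varepsilon_{+}$ can be cross-checked by exhaustive evaluation (a brute-force reference, not the linear-time algorithm of Theorem 1):

```python
from itertools import product

def corr_add(n, out_mask, in_masks):
    # correlation of out_mask.(x1 + ... + xk mod 2^n) XOR sum_i in_masks[i].x_i
    total = 0
    for xs in product(range(1 << n), repeat=len(in_masks)):
        s = sum(xs) & ((1 << n) - 1)
        b = bin(out_mask & s).count("1")
        for m, x in zip(in_masks, xs):
            b += bin(m & x).count("1")
        total += 1 - 2 * (b & 1)
    return total / (1 << (n * len(in_masks)))

# bit 0 of a sum is the XOR of the input bits 0, so the all-LSB masks are exact
assert corr_add(4, 0x1, [0x1, 0x1, 0x1]) == 1.0
# masking only bit 1 leaves the carry out of bit 0 unaccounted for
assert corr_add(4, 0x2, [0x2, 0x2, 0x2]) == 0.0
```

Such a brute-force routine is convenient for validating any implementation of Theorem 1 on random mask tuples before running it at $n = 32$.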

Computation of the Correlation of $e_4$.
Note that the correlation of $e_4$ is exactly the correlation of the function $\mathrm{Sbox}'(\cdot)$ with respect to the linear output mask $\Lambda''$ and the linear input mask $\Phi$, i.e., $\varepsilon(e_4) = \varepsilon_{\mathrm{Sbox}'}(\Lambda''; \Phi)$.
We write $\Lambda''$ and $\Phi$ in bytes as $\Lambda'' = (\Lambda''_0\,\Lambda''_1\,\Lambda''_2\,\Lambda''_3)$ and $\Phi = (\Phi_0\,\Phi_1\,\Phi_2\,\Phi_3)$, where $\Lambda''_j, \Phi_j \in \mathbb{F}_{2^8}$ for $j = 0, 1, 2, 3$. Since $\mathrm{Sbox}'(\cdot)$ is composed of four parallel applications of $S_Q$, according to the Piling-up Lemma we have
$$\varepsilon(e_4) = \prod_{j=0}^{3} \varepsilon_{S_Q}(\Lambda''_j; \Phi_j). \quad (7)$$
Complexity Analysis. We pre-compute a linear approximation table (LAT) storing all the linear approximations of $S_Q$ over all possible values of $a$ and $b$, i.e., the values $\varepsilon_{S_Q}(a; b)$ for all $a, b \in \mathbb{F}_{2^8}$ are stored in the row of the LAT indexed by $(a, b)$. For this, we loop over all $x \in \mathbb{F}_{2^8}$ and compute $a \cdot S_Q(x) \oplus b \cdot x$, which requires a time complexity of $2^8 \times 2^8 \times 2^8 = 2^{24}$. Using the LAT, the accurate value of $\varepsilon(e_4)$ for the given masks $\Lambda''$ and $\Phi$ is obtained from Equation (7) by 4 table lookups, i.e., in linear time.
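The LAT construction can be sketched as follows. Since the table of $S_Q$ is not reproduced here, the AES S-box $S_R$ (multiplicative inverse in $\mathbb{F}_2[z]/(z^8+z^4+z^3+z+1)$ followed by the affine map) stands in for it; the LAT and Piling-up logic are identical:

```python
AES_MOD = 0x11B  # z^8 + z^4 + z^3 + z + 1, the field underlying S_R

def mul(a, b):
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= AES_MOD
    return r

def inv(a):
    # a^254 = a^(-1) in GF(2^8); maps 0 to 0
    r, base, e = 1, a, 254
    while e:
        if e & 1:
            r = mul(r, base)
        base = mul(base, base)
        e >>= 1
    return r

def s_r(x):
    # AES S-box: inversion followed by the affine transform
    b = inv(x)
    y = b
    for k in range(1, 5):
        y ^= ((b << k) | (b >> (8 - k))) & 0xFF
    return y ^ 0x63

def lat_corr(S, a, b):
    # eps_S(a; b) = (#{x : a.S(x) = b.x} - 128) / 128
    t = sum(1 - 2 * ((bin(a & S[x]).count("1") + bin(b & x).count("1")) & 1)
            for x in range(256))
    return t / 256

S = [s_r(x) for x in range(256)]
assert S[0x00] == 0x63 and S[0x01] == 0x7C  # known AES S-box entries

def eps_sbox(lam_bytes, phi_bytes):
    # Piling-up over the four parallel S-boxes (Equation (7))
    e = 1.0
    for lj, pj in zip(lam_bytes, phi_bytes):
        e *= lat_corr(S, lj, pj)
    return e

assert eps_sbox((0, 0, 0, 0), (0, 0, 0, 0)) == 1.0
```

Replacing `s_r` with the actual $S_Q$ table gives exactly the pre-computation described above.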
With the above methods, the accurate value of $\varepsilon_{\mathrm{FSM}}(\Phi, \Gamma, \Lambda)$ is obtained as
$$\varepsilon_{\mathrm{FSM}}(\Phi, \Gamma, \Lambda) = \varepsilon(e_2)\,\varepsilon(e_4) \sum_{\Theta} \varepsilon(e_1)\,\varepsilon(e_3), \quad (8)$$
whose complexity is essentially proportional to the number of terms in the sum over $\Theta$.

Search for Bitwise Masks
In this part, we aim to find 32-bit mask tuples $(\Phi, \Gamma, \Lambda)$ for the linear approximation (3) such that $|\varepsilon_{\mathrm{FSM}}(\Phi, \Gamma, \Lambda)|$ computed by Equation (8) is as large as possible. Obviously, an exhaustive search over all possible mask values is impractical. Therefore, we use a search strategy that targets some potentially good linear masks. For ease of description, we define a subset $S$ of all the 32-bit masks as $S = \{\Lambda = (\Lambda_0\,\Lambda_1\,\Lambda_2\,\Lambda_3) : \Lambda_0 = a \in \mathbb{F}_{2^8}^{*} \text{ and } \Lambda_k = \texttt{0x00} \text{ for } k = 1, 2, 3\}$.
There are 255 values in $S$ in total. In our attempt to find good linear approximations, we have observed, based on Corollary 1 and our experiments, that the term $|\varepsilon(e_2)| = |\varepsilon_G(\Gamma; \Gamma, \Lambda')|$ is more likely to be large when both $\Gamma \in S$ and $\Lambda' \in S$. Besides, taking into consideration the term $\varepsilon(e_4) = \varepsilon_{S_Q}(\Lambda''_3; \Phi_3)\,\varepsilon_{S_Q}(\Lambda''_2; \Phi_2)\,\varepsilon_{S_Q}(\Lambda''_1; \Phi_1)\,\varepsilon_{S_Q}(\Lambda''_0; \Phi_0)$, we have confined the search to $\Lambda$ and $\Phi$ such that $\Lambda'' \in S$ and $\Phi \in S$, to ensure that $|\varepsilon(e_4)|$ is as large as possible. Furthermore, we deduce from Corollary 1 that $|\varepsilon(e_1)| = |\varepsilon_G(\Phi; \Phi, \Theta')|$ is nonzero if and only if $\Theta' \in S$. Above all, we have confined the search to mask tuples $(\Phi, \Gamma, \Lambda)$ such that $\Phi \in S$, $\Gamma \in S$, $\Lambda' \in S$ and $\Lambda'' \in S$, and only the terms of the sum over $\Theta$ with $\Theta' \in S$ are included in Equation (8). According to the computations of $\Lambda'$ and $\Lambda''$ from $\Lambda$ in Appendices A and B, we obtained 31 choices for $\Lambda$, which are listed in Table 6 of Appendix F.
Step 3: For all the $255 \times 255 \times 31 \approx 2^{21}$ choices of $(\Phi, \Gamma, \Lambda)$, we compute the values of $\varepsilon_{\mathrm{FSM}}(\Phi, \Gamma, \Lambda)$.
Following the above procedure, we obtained some newly found mask tuples for the bitwise linear approximation of the FSM of SNOW 3G; some of the results are presented in Table 3 and Table 4 of Section 4.1. We have obtained two best results, corresponding to the linear mask tuples $(\Phi, \Gamma, \Lambda)$ with $\Phi = \texttt{0x00000020}$ or $\texttt{0x00000030}$, $\Gamma = \texttt{0x00000001}$ and $\Lambda = \texttt{0x1014190f}$; the best bitwise linear approximations have correlation $\pm 2^{-20.48}$ and thus SEI $2^{-40.96}$.
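The size of the restricted search space is easy to replay (the placement of $\Lambda_0$ as the high byte of the 32-bit word is an assumption about byte order here):

```python
import math

# S: 32-bit masks whose only nonzero byte is Lambda_0
S = [a << 24 for a in range(1, 256)]
assert len(S) == 255

# candidate tuples (Phi, Gamma, Lambda): Phi in S, Gamma in S, 31 choices of Lambda
count = 255 * 255 * 31
assert round(math.log2(count), 2) == 20.94   # approx 2^21, as in Step 3
```

Compared with the $2^{96}$ unrestricted mask tuples, the restriction makes the exhaustive evaluation of Equation (8) over all remaining candidates entirely practical.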

Bitwise Fast Correlation Attack on SNOW 3G
In this section, we first present a fast correlation attack on SNOW 3G using the bitwise linear approximations, and then briefly compare the bitwise approximation attack with the large-unit approximation attack in [YJM19].

Using the Bitwise Masks in a Fast Correlation Attack
The bitwise linear approximations of the FSM of SNOW 3G have the form given above. Given a mask tuple $(\Phi, \Gamma, \Lambda)$, we let $\varphi_t = \Phi \cdot z_{t-1} \oplus \Gamma \cdot z_t \oplus \Lambda \cdot z_{t+1}$. For a given parameter $N$ (to be determined later), set $D = N/2 + 2$; given the keystream words $z_0, z_1, \ldots, z_{D-1}$, we can obtain $\varphi_t$ for $t = 1, 2, \ldots, N/2$. Let $(x_0, x_1, \ldots, x_{l-1})$ be the LFSR initial state of SNOW 3G in bits ($l = 512$). With the feedback polynomial of the LFSR, we can express the above linear approximations in the initial-state form, where $g_t$ is the corresponding $l$-bit coefficient column vector.

For SNOW 3G, we will use the best two bitwise mask tuples for the approximations, both yielding the correlation $\varepsilon_{\mathrm{FSM}}(\Phi, \Gamma, \Lambda) = \pm 2^{-20.48}$. In such a case we can obtain $N$ parity checks in total, written as $Z = (x_0, x_1, \ldots, x_{l-1}) \cdot G \oplus E$, where $Z$ is the $N$-bit row vector computed from the given keystream words $z_0, z_1, \ldots, z_{D-1}$, $G$ is the $l \times N$ generator matrix, and $E$ is the $N$-bit noise vector with correlation $\alpha = \pm 2^{-20.48}$. We present a high-level description of our attack on SNOW 3G in Algorithm 4.

Generally, the fast correlation attack is modelled as a decoding problem: the keystream segment $Z$ can be seen as the result of transmitting the LFSR sequence $U$ through a Binary Symmetric Channel (BSC) with error probability $p$, as shown in Fig. 6. Our bitwise fast correlation attack on SNOW 3G is divided into a preprocessing phase and a processing phase. In the preprocessing phase, we first collect $N$ samples of (9) involving only the keystream words and the $l = 512$ LFSR initial state bits, and then reduce the number of involved LFSR initial state bits to $l'\ (< l)$ bits, at the expense of folding the noise, by employing Wagner's $k$-tree algorithm to generate parity check equations. After this, we enter the processing phase to recover the target $l'$ bits by using the fast Walsh transform (FWT), as was done in [CJM02, LV04], and further the whole LFSR initial state of SNOW 3G.
Algorithm 4: The bitwise fast correlation attack on SNOW 3G
Parameters: $N$, $m_2$, $l\ (= 512)$ and $l'\ (< l)$; let $D = N/2 + 2$
Input: the keystream words $z_0, z_1, \ldots, z_{D-1}$, and the best two mask tuples $(\Phi, \Gamma, \Lambda)$
1: Compute $Z$ and $G$;
2: Follow the preprocessing phase to derive $m_2$ parity check equations involving only $l'$ bits of the LFSR initial state, e.g., $(x_0, x_1, \ldots, x_{l'-1})$;
3: Follow the processing phase to recover $(x_0, x_1, \ldots, x_{l'-1})$ by using the FWT;
4: Recover the remaining $l - l'$ LFSR initial state bits by a similar method;
5: Recover the secret key by running the initialization backwards;
Output: the correct secret key.
Rewriting the matrix $G$ in column vectors as $G = (g_1, g_2, \ldots, g_N)$, where $g_j$ is the $j$-th column, we try to find linear combinations of columns that vanish on $l - l'$ bits, thereby reducing the dimension of the secret (i.e., the number of involved LFSR initial state bits). For SNOW 3G, the number of folded noise variables is set to 4. Specifically, we look for a number of 4-tuples of columns of $G$ whose XOR is 0 on the most significant $l - l'$ bits. This is usually solved using Wagner's $k$-tree algorithm [Wag02]. It is noted in [GZ20] that a small trick can be combined with the $k$-tree algorithm; below we illustrate this process using the method in [GZ20].
Preprocessing Phase. Let $l_1$ and $l_2$ be two positive integers such that $l_1 + l_2 = l - l'$, and let $\mathrm{high}_n(a)$ be the value of the vector $a$ on its most significant $n$ bits. Collecting the $N$ vectors in a single list $L$, we carry out the following steps:
Step 1. Create a new list $L_1$ from the original list $L$, composed of all XORs $g_{j_1} \oplus g_{j_2}$ with $g_{j_1} \neq g_{j_2}$, $g_{j_1}, g_{j_2} \in L$, such that $\mathrm{high}_{l_1}(g_{j_1} \oplus g_{j_2}) = 0$; we say that $l_1$ bits are eliminated. For $j = 1, 2, \ldots, N$, we regard the column vectors $g_j$ as random vectors, so $L_1$ has an expected size of $m_1 = \binom{N}{2} 2^{-l_1} \approx N^2 2^{-(l_1+1)}$. This step is fulfilled by a sort-and-merge procedure as follows: first, sort the $N$ vectors into $2^{l_1}$ equivalence classes according to their values on the most significant $l_1$ bits, so that any two vectors in the same equivalence class agree on these bits; then, consider each pair of vectors $(g_{j_1}, g_{j_2})$ in each equivalence class to create $L_1$.
Step 2. Create a new list $L_2$ from $L_1$ by further eliminating $l_2$ bits using the same sort-and-merge procedure as in Step 1. That is, first sort the $m_1$ vectors in $L_1$ into $2^{l_2}$ equivalence classes according to their values on the next most significant $l_2$ bits, and then consider each pair of vectors in each equivalence class to create $L_2$. Similarly, the expected number of elements in $L_2$ is $m_2 = \binom{m_1}{2} 2^{-l_2} \approx m_1^2 2^{-(l_2+1)}$.
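The sort-and-merge steps can be sketched on scaled-down toy parameters (random columns stand in for the actual $g_j$, and original column indices are tracked as bitmasks):

```python
import random
from collections import defaultdict

def merge_step(vectors, total_bits, top_bits):
    # pair entries agreeing on the `top_bits` most significant bits;
    # their XOR then vanishes there (one level of Wagner's k-tree algorithm)
    buckets = defaultdict(list)
    for idx, g in vectors:
        buckets[g >> (total_bits - top_bits)].append((idx, g))
    out = []
    for cls in buckets.values():
        for a in range(len(cls)):
            for b in range(a + 1, len(cls)):
                (i1, g1), (i2, g2) = cls[a], cls[b]
                if i1 & i2:          # skip combinations reusing a column
                    continue
                out.append((i1 | i2, g1 ^ g2))
    return out

random.seed(7)
l, l1, l2 = 24, 6, 6                 # toy sizes; the attack uses l=512, l1=l2=173
L = [(1 << j, random.randrange(1 << l)) for j in range(256)]
L1 = merge_step(L, l, l1)            # top l1 bits eliminated
L2 = merge_step(L1, l, l1 + l2)      # next l2 bits eliminated
# every element of L2 is the XOR of 4 distinct columns, zero on the top l1+l2 bits
assert L2 and all(g >> (l - l1 - l2) == 0 for _, g in L2)
assert all(bin(i).count("1") == 4 for i, _ in L2)
```

The bucketing realizes the sort into equivalence classes; the pair loop is the merge, and its cost is proportional to the list sizes, matching the $O(N + m_1)$ estimate below.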
Following the above steps, we estimate that we obtain about $m_2$ 4-tuples $(g_{j_1}, g_{j_2}, g_{j_3}, g_{j_4})$ such that $\mathrm{high}_{l-l'}(g_{j_1} \oplus g_{j_2} \oplus g_{j_3} \oplus g_{j_4}) = 0$, which correspond to $m_2$ parity checks with correlation $\alpha^4$ involving only $x_0, x_1, \ldots, x_{l'-1}$. The running time and memory complexities of the above procedure are essentially proportional to the sizes of the processed lists, i.e., $O(N + m_1)$.
Processing Phase. We now recover the first $l'$ bits of the LFSR initial state of SNOW 3G. Following a method similar to that in [CJM02, GZ20, LV04], we use the FWT to speed up the evaluation of the $m_2$ parity check equations and thus recover the value of the target $l'$ bits, which needs a time complexity of $O(m_2 + l' \cdot 2^{l'})$ and a memory complexity of $O(2^{l'})$. To guarantee a high success rate, we set $m_2 = 2 l' \ln 2 / (\alpha^4)^2$. Since $m_1 = N^2 2^{-(l_1+1)}$ and $m_2 = m_1^2 2^{-(l_2+1)}$, the parameter $N$ is determined as $N = (m_2 \cdot 2^{2l_1+l_2+3})^{1/4}$, and the required number of keystream words is $D = N/2 + 2$.
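The FWT evaluation can be sketched on a noise-free toy instance with $l'$ scaled down to 3 bits (the real attack works identically, with the counts weighted by the $m_2$ noisy parity checks):

```python
def fwht(a):
    # in-place Walsh-Hadamard transform: T[u] = sum_m a[m] * (-1)^<m,u>
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

def recover(checks, lp):
    # checks: (mask over the l' state bits, observed parity bit)
    acc = [0] * (1 << lp)
    for mask, bit in checks:
        acc[mask] += 1 - 2 * bit
    t = fwht(acc)
    # the candidate maximizing |T[u]| agrees with the most parity checks
    return max(range(1 << lp), key=lambda u: abs(t[u]))

secret, lp = 0b101, 3
checks = [(m, bin(m & secret).count("1") & 1) for m in range(1, 1 << lp)]
assert recover(checks, lp) == secret
```

One transform evaluates all $2^{l'}$ candidates at once, which is exactly where the $l' \cdot 2^{l'}$ term in the time complexity comes from.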
Complexity Analysis. For SNOW 3G, we follow the above preprocessing and processing phases with the parameters $l = 512$ and $l' = 166$. In this case, we need to prepare $m_2 = 2 l' \ln 2 / (\alpha^4)^2 = 2^{171.69}$ approximation relations involving only $x_0, x_1, \ldots, x_{165}$. By choosing $l_1 = l_2 = 173$, we have $m_1 = 2^{172.84}$ and $N = 2^{173.42}$. Thus it requires a keystream of length $D = N/2 + 2 = 2^{172.42}$, and the time/memory complexity for preparing the $m_2$ approximation relations is $O(N + m_1)$, i.e., $2^{174.16}$. The FWT is utilized to determine the first $l' = 166$ bits of the LFSR initial state, which costs a time complexity of $2^{173.77}$ and a memory complexity of $2^{166}$. Therefore, all the complexities are upper bounded by $2^{174.16}$. Once the first 166 bits are recovered, the other LFSR state bits and the FSM state can be recovered by a similar method and a small-scale exhaustive search with a much lower complexity. Since the initialization of SNOW 3G is a reversible process, the secret key can be recovered once the initial state is known.
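The parameter derivation can be replayed numerically (base-2 logarithms throughout; pure arithmetic mirroring the formulas above):

```python
import math

lp = 166                           # l'
log_alpha = -20.48                 # log2 of the approximation correlation
log_eps2 = 8 * log_alpha           # (alpha^4)^2 for the folded 4-tuples

m2 = math.log2(2 * lp * math.log(2)) - log_eps2   # m2 = 2 l' ln2 / (alpha^4)^2
l1 = l2 = 173
m1 = (m2 + l2 + 1) / 2                            # from m2 = m1^2 * 2^-(l2+1)
N = (m1 + l1 + 1) / 2                             # from m1 = N^2 * 2^-(l1+1)
D = N - 1                                         # D = N/2 + 2, the +2 negligible
prep = math.log2(2 ** N + 2 ** m1)                # O(N + m1)
fwt = math.log2(2 ** m2 + lp * 2 ** lp)           # O(m2 + l' * 2^l')

assert round(m2, 2) == 171.69
assert round(m1, 2) == 172.84
assert round(N, 2) == 173.42
assert round(D, 2) == 172.42
assert round(prep, 2) == 174.16
assert abs(fwt - 173.77) < 0.05
```

The preprocessing term $2^{174.16}$ dominates, which is where the overall bound on all complexities comes from.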

Comparison
Note that the first significant result on SNOW 3G was given in [YJM19], with all the complexities upper bounded by $2^{176.56}$, using an 8-bit linear approximation in a fast correlation attack over $\mathbb{F}_{2^8}$. Here we slightly improve this result, with all the complexities upper bounded by $2^{174.16}$, in a bitwise fast correlation attack using the bitwise linear approximations. Moreover, the bitwise fast correlation attack offers more choices of tradeoff parameters than the attack over $\mathbb{F}_{2^8}$, which can lead to somewhat better attacks. Though not a significant improvement, our results illustrate that there is an opportunity to improve on the large-unit attacks by using bitwise linear approximations in a linear approximation attack.

Conclusion
In this paper, we study and compare the large-unit and bitwise linear approximations of SNOW 2.0 and SNOW 3G, and present a bitwise fast correlation attack on SNOW 3G using our newly found bitwise linear approximations, slightly improving the best known attack on SNOW 3G in [YJM19], which mounted a fast correlation attack using an 8-bit (vectorized) linear approximation. On one hand, Property 1 and Property 2 indicate that approximations over large-unit alphabets have advantages over all the smaller-unit/bitwise ones derived from them in linear approximation attacks, and the results on SNOW 2.0 in [ZXM15] gave the impression that large-unit approximations lead to larger SEI and also to better attacks. However, as shown in Section 3.3 and Section 4.3 for the linear approximations of SNOW 2.0 and SNOW 3G, we have found many concrete examples of byte-wise linear approximations for which a certain 1-dimensional/bitwise linear approximation has almost the same SEI as the original 8-bit one. That is, each of these byte-wise approximations is dominated by a single bitwise approximation, and thus the whole SEI is not essentially larger than the SEI of the dominating single bitwise one. Since correlation attacks can be implemented more efficiently using bitwise approximations than large-unit ones, improvements over the large-unit linear approximation attacks [ZXM15, YJM19] are possible for SNOW 2.0 and SNOW 3G. For SNOW 3G, we have given a fast correlation attack utilizing bitwise linear approximations, with all the complexities upper bounded by $2^{174.16}$, slightly improving the previous best one in [YJM19]. Though not a significant improvement, our results illustrate that bitwise linear approximations may lead to better attacks than large-unit ones.
The cryptanalyst should carefully work out the internal relation between large-unit linear approximations and the smaller ones, and make his/her best choice of attack parameters according to the concrete structure of the primitive. Note that for the new SNOW stream cipher SNOW-V, the large-unit linear approximations and the bitwise ones for several close variants of SNOW-V have been studied in [EJMY19] and [GZ21] respectively. It is our future work to study the relation between the large-unit and bitwise linear approximations of these variants, and to present large-unit and bitwise linear approximation attacks on the full SNOW-V.

B Computing the mask $\Lambda''$ from the given mask $\Lambda$
For $\Lambda = (\Lambda_0\,\Lambda_1\,\Lambda_2\,\Lambda_3)$ and $\Lambda'' = (\Lambda''_0\,\Lambda''_1\,\Lambda''_2\,\Lambda''_3)$, where $\Lambda_j \in \mathbb{F}_{2^8}$ and $\Lambda''_j \in \mathbb{F}_{2^8}$, we have $(\Lambda''_0\,\Lambda''_1\,\Lambda''_2\,\Lambda''_3) = \mathrm{lin}(\Lambda_0\,\Lambda_1\,\Lambda_2\,\Lambda_3)$ such that
$$\Lambda''_0 = \mathrm{trans}(\Lambda_0) \oplus \Lambda_1 \oplus \Lambda_2 \oplus \Lambda_3 \oplus \mathrm{trans}(\Lambda_3). \quad (11)$$

C Computing the 4-byte value $W' = l_1(W)$ such that $M * W' = M_1 *_1 W$
For any 4-byte value $W = (W_0, W_1, W_2, W_3)$ and $W' = l_1(W) = (W'_0, W'_1, W'_2, W'_3)$, we write $W_i \in \mathbb{F}_{2^8}$ and $W'_i \in \mathbb{F}_{2^8}$ in bits $W_{i,j}$ and $W'_{i,j}$, where $W_{i,j}, W'_{i,j} \in \mathbb{F}_2$ for $i = 0, 1, 2, 3$ and $j = 0, 1, \ldots, 7$, and derive Equation (12).

D Computing the 4-byte value $V' = l_2(V)$ such that $M * V' = M_2 *_2 V$
For any 4-byte value $V = (V_0, V_1, V_2, V_3)$ and $V' = l_2(V) = (V'_0, V'_1, V'_2, V'_3)$, we write $V_i \in \mathbb{F}_{2^8}$ and $V'_i \in \mathbb{F}_{2^8}$ in bits $V_{i,j}$ and $V'_{i,j}$, where $V_{i,j}, V'_{i,j} \in \mathbb{F}_2$ for $i = 0, 1, 2, 3$ and $j = 0, 1, \ldots, 7$, and derive
$$V'_{i,4} = \cdots \oplus V_{i \bmod 4,\,7} \oplus V_{(i+1) \bmod 4,\,7} \oplus V_{(i+2) \bmod 4,\,7} \oplus V_{(i+3) \bmod 4,\,7},$$
$$V'_{i,5} = V_{i,5} \oplus V_{i \bmod 4,\,7} \oplus V_{(i+2) \bmod 4,\,7},$$
$$V'_{i,6} = V_{i,6} \oplus V_{(i+2) \bmod 4,\,7} \oplus V_{(i+3) \bmod 4,\,7},$$
$$V'_{i,7} = V_{(i+1) \bmod 4,\,7}. \quad (13)$$