Cryptanalysis of Plantlet

Plantlet is a lightweight stream cipher designed by Mikhalev, Armknecht and Müller in IACR ToSC 2017. It has a Grain-like structure with two state registers of size 40 and 61 bits. In spite of this small state, the cipher does not seem to lose security against generic Time-Memory-Data Tradeoff attacks, due to the novelty of its design. The cipher uses an 80-bit secret key and a 90-bit IV. In this paper, we present a key recovery attack on Plantlet that requires around $2^{76.26}$ Plantlet encryptions. The attack leverages the fact that two internal states of Plantlet that differ in the 43rd LFSR location are guaranteed to produce keystream segments that are either equal or unequal in 45 locations with probability 1. Thus an attacker can guess, with some probability of being right, that when two keystream blocks possess the 45-bit difference just mentioned, they have been produced by two internal states that differ only in the 43rd LFSR location. Thereafter, by solving a system of polynomial equations representing the keystream bits, the attacker can find the secret key if his guess was indeed correct, or reach a contradiction if his guess was incorrect. In the latter event, he repeats the procedure for other keystream blocks with the given difference. We show that the process, when repeated a finite number of times, does indeed yield the value of the secret key.


Introduction
Lightweight stream ciphers have become immensely popular in the cryptological research community since the advent of the eStream project [est08]. The three hardware finalists included in the final portfolio of eStream, i.e. Grain v1 [HJM07], Trivium [CP08] and MICKEY 2.0 [BD08], all use bitwise shift registers to generate keystream bits. After the design of Grain v1 was proposed, two other members, Grain-128 [HJMM06] and Grain-128a, were added to the Grain family, mainly with the objective of providing a larger security margin and including the functionality of message authentication respectively. In FSE 2015, Armknecht and Mikhalev proposed the Grain-like stream cipher Sprout [AM15] with a startling trend: the size of the internal state of Sprout was equal to the size of its key. After the publication of [BS00], it was widely accepted that to be secure against generic Time-Memory-Data tradeoff attacks, the internal state of a stream cipher needed to be at least twice the size of the secret key. However, the novelty of the Sprout design ensured that the cipher remained secure against generic TMD tradeoffs. The smaller internal state makes the cipher particularly attractive for compact lightweight implementations. However, Sprout has been cryptanalyzed in more ways than one [Ban15, EK15, LNP15, ZG15], and so naturally there has been a lot of research going into the design of secure lightweight stream ciphers.
At the FSE 2017 conference of IACR ToSC, two lightweight stream ciphers, Lizard [HKM17] and Plantlet [MAM16], were proposed. Plantlet was essentially a re-design of Sprout after patching some existing weaknesses. The main differences between Plantlet and Sprout are as follows:
• Plantlet uses a 61-bit LFSR, whereas Sprout used a 40-bit LFSR. This slight increase in state size is done to prevent guess and determine attacks [LNP15, Ban15].
• In [EK15], a TMD tradeoff attack is outlined using an online time complexity of $2^{33}$ encryptions and 770 TB of memory. The paper first observes that it is easy to deduce the secret key from the knowledge of the internal state and the keystream. The paper then makes an observation on special states of Sprout that produce keystream without the involvement of the secret key. A method to generate and store such states in tables is first outlined. The online stage consists of inspecting keystream bits, retrieving the corresponding state from the table (assuming, of course, that the state in question is a special state), and then computing the secret key. The process, if repeated a certain number of times, guarantees that a special state is encountered, from which the correct secret key is found. The attack leveraged the fact that the addition of the key bits to the state was done via a non-linear function, due to which it became easy to find special states. Plantlet patched this weakness by making the key addition strictly linear, thus preventing this attack.
• In [Ban15], it was observed that it was easy to find Key-IV pairs that led the LFSR to enter the all zero state after Key-IV mixing. Using this fact the authors were able to compute Key-IV pairs that produced keystream of period 80 bits. The authors were further able to use this fact to mount a guess and determine attack that required around $2^{66.7}$ Sprout encryptions. To counter this attack, the designers of Plantlet kept the 61st LFSR bit fixed to 1 during the entire Key-IV mixing phase. This ensured that after the Key-IV phase terminated, the LFSR would never enter the all zero state, and hence both the above weaknesses were patched.
No cryptanalytic result has yet been reported against Plantlet that recovers the secret key without the use of side channels. In [HKMZ18], a distinguishing attack against Plantlet was reported that uses data and memory complexity of $2^{61}$ bits, and time complexity of $2^{55}$ steps. In [MSS17], a differential fault attack was reported against Plantlet that recovered the secret key using 4 fault injections.

Contribution and Organization of the Paper
In this paper, we present a key recovery attack on Plantlet that requires a computational complexity of around $2^{76.26}$ encryptions. As mentioned in the abstract, it is first observed that two internal states of Plantlet that differ in the 43rd LFSR location are guaranteed to produce keystream segments that are deterministically equal or unequal in 45 locations. Thus, when two keystream blocks that possess the above 45-bit difference are encountered, it can be guessed, with some probability of being right, that they have been generated by two internal states that differ in the 43rd LFSR location.
Thereafter the set of polynomial expressions representing each keystream bit is computed and tabulated in an equation bank. If the guess was correct, the secret key is computed by solving the above set of polynomial equations. If not, the attacker reaches a contradiction in the computations, concludes that his guess was incorrect, and starts afresh. We show that the process, when repeated a finite number of times, does indeed yield the value of the secret key. After this, we observe that the previous attack was limited to internal state differences that occurred at time instances congruent to 0 mod 80. We further observe that by generalizing the attack to include internal state differences at time instances in all equivalence classes modulo 80, we lower the total number of keystream bits required to perform the attack and in the process reduce the attack complexity. The rest of the paper is organized in the following manner.

1. In Section 2, we present the mathematical description of the Plantlet stream cipher.
2. In Section 3, we introduce some lemmas which serve to lay the mathematical foundations and form the building blocks of the attack.
3. In Section 4, we present a mathematical description of the attack. We outline the stages of the attack in order and estimate the time and memory complexities of each phase. We also present experimental evidence in support of the claims made in the complexity analysis.
4. In Section 5, we show how the attack can be extended to all equivalence classes modulo 80, and hence reduce the attack complexity.

Description of Plantlet
The exact structure of Plantlet is explained in Figure 1. It consists of a 61-bit LFSR and a 40-bit NFSR. Certain bits of both the shift registers are taken as inputs to a combining Boolean function, whence the keystream is produced. The keystream is produced after performing the following steps:
Initialization Phase: The cipher uses an 80-bit key and a 90-bit IV. The first 40 most significant bits of the IV are loaded on to the NFSR and the remaining IV bits are loaded on to the first 50 most significant bits of the LFSR. The last 11 bits of the LFSR are initialized with the 11-bit constant 0x7fd, i.e. the string of nine 1s followed by 01. Let $L_t = [l_t, l_{t+1}, \ldots, l_{t+60}]$ and $N_t = [n_t, n_{t+1}, \ldots, n_{t+39}]$ be the 61-bit and 40-bit vectors that denote respectively the LFSR and NFSR states at the $t$-th clock interval. During the initialization phase, the registers are updated as follows.

(b) The LFSR updates as $l_{t+59} = z_t + f(L_t)$ (the last bit is fixed to 1), where $f$ denotes the linear feedback function of the LFSR.
(c) The NFSR updates as $n_{t+40} = z_t + g(N_t) + c^4_t + k^*_t + l_t$, where $c^4_t$ denotes the 4th LSB of the modulo 80 up-counter which starts at $t = 0$, and $k^*_t$ is the output of the Round Key function, defined as $k^*_t = K_{t \bmod 80}$. Here $K_i$ simply denotes the $i$-th bit of the secret key. The non-linear update function $g(N_t)$ and the keystream output function are as given in the Plantlet specification [MAM16].
Keystream Phase: After the initialization phase is completed, the cipher discontinues the feedback of the keystream bit $z_t$ to the update functions of the NFSR and LFSR and makes it available as the output bit. During this phase, the LFSR and NFSR update themselves as $l_{t+60} = f(L_t)$ and $n_{t+40} = g(N_t) + c^4_t + k^*_t + l_t$ respectively. Thus the LFSR now behaves as a 61-bit linear register, whereas during key-IV mixing it was essentially functioning as a 60-bit register. It is recommended by the designers that one single key-IV pair not be used to generate more than $2^{30}$ keystream bits.
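The register loading at the start of the initialization phase can be sketched as follows. This is only an illustrative sketch: the function name `load_initial_state` is hypothetical, and the bit ordering (index 0 taken as the most significant IV bit) is an assumption that should be checked against the Plantlet specification [MAM16].

```python
def load_initial_state(iv_bits):
    """Return (nfsr, lfsr) as bit lists after loading a 90-bit IV.

    Assumption: iv_bits[0] is the most significant IV bit.
    """
    if len(iv_bits) != 90 or any(b not in (0, 1) for b in iv_bits):
        raise ValueError("expected a list of 90 bits")
    # The first 40 IV bits fill the 40-bit NFSR.
    nfsr = list(iv_bits[:40])
    # 0x7fd written as an 11-bit string: nine 1s followed by 0, 1.
    constant = [1] * 9 + [0, 1]
    # The remaining 50 IV bits fill the first 50 LFSR cells; the constant fills the last 11.
    lfsr = list(iv_bits[40:]) + constant
    return nfsr, lfsr


nfsr, lfsr = load_initial_state([0] * 90)
print(len(nfsr), len(lfsr))
```

The padding constant guarantees that the LFSR never starts from the all zero state, whatever IV is supplied.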

Observations on the differential structure of Plantlet
Before outlining the attack on Plantlet, we list some observations on the structure of Plantlet that will help us construct the attack. The following observations and lemmas should be seen as building blocks of our attack.

Observation 1: If the secret key is known, then the state update in the keystream phase is one-to-one and efficiently invertible. Before proceeding, we give a formal algorithmic description of the state update inversion routine in the keystream and initialization phase. We denote the algorithm by the notation $KS^{-1}$.
Input: $L_t, N_t$: The LFSR, NFSR state at time $t$;
Output: $L_{t-1}, N_{t-1}$: The LFSR, NFSR state at time $t - 1$;
Lemma 1. Let $t_1, t_2$ be two time instances during the keystream phase (with $t_2 > t_1$ and both less than $2^{30}$), and let the 61-bit difference vector $\delta = L_{t_1} \oplus L_{t_2}$ be given. Then it is possible to compute the LFSR states $L_{t_1}, L_{t_2}$ efficiently.
Proof. If $M$ is the companion matrix over GF(2) of the connection polynomial $p(x)$ of the LFSR, then we can write $L_{t+1}$ as a matrix-vector product between $M$ and $L_t$, i.e. $L_{t+1} = M \cdot L_t$. We thus have $L_{t_2} = M^{t_2 - t_1} \cdot L_{t_1}$, and so
$\delta = L_{t_1} \oplus L_{t_2} = (M^{t_2 - t_1} \oplus I) \cdot L_{t_1}.$
The above is a system of linear equations with the 61 variables in the $L_{t_1}$ vector as unknowns. Further, it is known that the minimal polynomial of $M$ is the connection polynomial $p(x)$ of the LFSR itself. Since $p(x)$ is primitive, its roots $\alpha^{2^i}$, $\forall i \in [0, 60]$, are the eigenvalues of $M$ (here $\alpha$ denotes any root of $p(x)$). Define $T = t_2 - t_1$. The eigenvalues of $M^T \oplus I$ are then $\alpha^{2^i T} + 1$. If one of them were zero, we would have $\alpha^{2^i T} = 1$, i.e. $2^i T \equiv 0 \bmod 2^{61} - 1$, because $\alpha$ has multiplicative order $2^{61} - 1$. Since $2^i$ is coprime with $2^{61} - 1$, we must have $T \equiv 0 \bmod 2^{61} - 1$, which is a contradiction, since $0 < T < 2^{30}$. Thus $M^T \oplus I$ has nonzero eigenvalues and is hence invertible. So the above system of equations can be solved efficiently by using Gaussian elimination to compute $L_{t_1}$ and hence $L_{t_2}$.
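The linear-algebra step in Lemma 1 can be illustrated on a toy register. The sketch below is not the Plantlet LFSR: it assumes a 5-bit LFSR with the primitive connection polynomial $x^5 + x^2 + 1$ (recurrence $l_{t+5} = l_{t+2} + l_t$) purely for illustration, and all function names are hypothetical. Matrices over GF(2) are stored as lists of integer bitmasks, one per row.

```python
def companion(n, taps):
    """Companion matrix M with L_{t+1} = M * L_t for l_{t+n} = XOR of l_{t+j}, j in taps."""
    rows = [1 << (i + 1) for i in range(n - 1)]   # shift: new cell i takes old cell i+1
    rows.append(sum(1 << j for j in taps))        # feedback row
    return rows


def mat_mul(a, b):
    """Product a*b over GF(2); row i of the result is the XOR of rows b[j] with a[i][j] = 1."""
    out = []
    for row in a:
        acc, j = 0, 0
        while row:
            if row & 1:
                acc ^= b[j]
            row >>= 1
            j += 1
        out.append(acc)
    return out


def mat_pow(m, e):
    """Square-and-multiply exponentiation of a GF(2) matrix."""
    result = [1 << i for i in range(len(m))]      # identity
    while e:
        if e & 1:
            result = mat_mul(result, m)
        m = mat_mul(m, m)
        e >>= 1
    return result


def mat_vec(m, v):
    """Matrix-vector product over GF(2); v is a bitmask."""
    return sum((bin(row & v).count("1") & 1) << i for i, row in enumerate(m))


def solve(a, b):
    """Solve a*x = b over GF(2) by Gauss-Jordan elimination (a must be invertible)."""
    n = len(a)
    rows = [a[i] | (((b >> i) & 1) << n) for i in range(n)]   # bit n holds the right-hand side
    for col in range(n):
        pivot = next((r for r in range(col, n) if (rows[r] >> col) & 1), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col and (rows[r] >> col) & 1:
                rows[r] ^= rows[col]
    return sum(((rows[i] >> n) & 1) << i for i in range(n))


def recover_state(n, taps, T, delta):
    """Return L_{t1} satisfying delta = (M^T xor I) * L_{t1}."""
    m_t = mat_pow(companion(n, taps), T)
    a = [row ^ (1 << i) for i, row in enumerate(m_t)]         # M^T xor I
    return solve(a, delta)


# Toy check: pick a state, clock it T times, and recover it from the difference alone.
M = companion(5, (0, 2))
l1 = 0b10110
l2 = mat_vec(mat_pow(M, 7), l1)
print(recover_state(5, (0, 2), 7, l1 ^ l2) == l1)
```

For the toy register the period is 31, so any $0 < T < 31$ gives an invertible system, mirroring the role of the bound $T < 2^{30}$ against the period $2^{61} - 1$ in the lemma.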
Lemma 2. Consider two Plantlet internal states $S_{t_1} = (N_{t_1}, L_{t_1})$ and $S_{t_2} = (N_{t_2}, L_{t_2})$ during the keystream phase such that $N_{t_1} = N_{t_2}$ and $L_{t_1} \oplus L_{t_2} = e_{43}$ ($e_i$ is the 61-bit vector of unit Hamming weight, with 1 at location $i$). We further impose the condition that $t_1 \neq t_2$ and that they are both multiples of 80. Then consider the vectors $Z_{t_1}$ and $Z_{t_2}$ of the first 80 keystream bits generated by $S_{t_1}$ and $S_{t_2}$ respectively. Also consider the vectors $Y_{t_1}$ and $Y_{t_2}$ of the first 80 keystream bits produced by $S_{t_1}$ and $S_{t_2}$ respectively in the backward direction, i.e. by running the $KS^{-1}$ routine. To be more specific, $Z_t = [z_t, z_{t+1}, \ldots, z_{t+79}]$, and $Y_t$ lists the keystream bits obtained by stepping backward from time $t$. Then in the 160-bit difference vector $\Delta = (Z_{t_1} \| Y_{t_1}) \oplus (Z_{t_2} \| Y_{t_2})$, there are 45 bits that take the value 1 or 0 with probability 1, where the probability is computed over all possible initial states $S_{t_1}$.
Proof. The above result is not difficult to verify if we analyze the differential trail of the difference introduced in the 43rd LFSR location. First of all, since both $t_1$ and $t_2$ are multiples of 80, the values of the key bit $k^*_t$ and counter bit $c^4_t$ used in their respective update functions are the same. In the forward direction, for $j \in [0, 10] \cup \{12\} \cup [14, 19] \cup [21, 23] \cup \{25\} \cup [27, 28] \cup \{30, 32, 34, 39, 41\}$, the differences (between the Plantlet LFSR states $L_{t_1+j}$ and $L_{t_2+j}$) sit on LFSR locations that are not used in the computation of the keystream bit. Hence for all such $j$, $z_{t_1+j} = z_{t_2+j}$. Whereas, for $j \in \{13, 31, 40\}$, the difference appears at tap location $l_{30}$, which contributes to the keystream equation linearly. For all such $j$, we have $z_{t_1+j} \oplus z_{t_2+j} = 1$ with probability 1. Similarly, in the backward direction, for $m \in \{-13\} \cup [-11, -1]$, the differences do not affect the keystream equation, hence for all such $m$ we have $z_{t_1+m} = z_{t_2+m}$. At $m = -12$, the difference is on the NFSR tap location $n_1$, which also contributes to the keystream equation linearly. Hence we have $z_{t_1+m} \oplus z_{t_2+m} = 1$. There are, in total, 45 time instances where these events take place, hence a total of 45 bits in the difference vector are guaranteed to be either 0 or 1 with probability 1.
Note that the 43rd LFSR bit was chosen as the initial difference location because it maximizes the number of bits in ∆ that are deterministically equal to 0 or 1.
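The count of 45 deterministic positions can be checked directly from the index sets listed in the proof of Lemma 2. The snippet below only tallies those sets; the variable names are illustrative.

```python
def span(a, b):
    """Inclusive integer interval [a, b]."""
    return list(range(a, b + 1))


# Forward offsets j where the two keystreams agree.
forward_equal = (span(0, 10) + [12] + span(14, 19) + span(21, 23)
                 + [25] + span(27, 28) + [30, 32, 34, 39, 41])
# Forward offsets j where the two keystreams differ.
forward_flip = [13, 31, 40]
# Backward offsets m where the two keystreams agree, and where they differ.
backward_equal = [-13] + span(-11, -1)
backward_flip = [-12]

positions = forward_equal + forward_flip + backward_equal + backward_flip
assert len(set(positions)) == len(positions)   # no offset is listed twice
print(len(forward_equal), len(forward_flip), len(backward_equal), len(backward_flip))
print(len(positions))   # prints 45
```

The split is 29 + 3 forward offsets and 12 + 1 backward offsets, i.e. 41 positions with difference 0 and 4 positions with difference 1.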

Lemma 3. Consider the same setting as in the previous lemma, i.e. we have two Plantlet internal states $S_{t_1} = (N_{t_1}, L_{t_1})$ and $S_{t_2} = (N_{t_2}, L_{t_2})$ during the keystream phase such that $N_{t_1} = N_{t_2}$ and $L_{t_1} \oplus L_{t_2} = e_{43}$ ($e_i$ is the 61-bit vector of unit Hamming weight, with 1 at location $i$), with $t_1 \neq t_2$ and both multiples of 80. Then consider the vectors $Z_{t_1}, Y_{t_1}$ and $Z_{t_2}, Y_{t_2}$ as defined previously. We have the following identities:
Proof. The above lemma is similarly verified by a study of how the single bit difference at LFSR location 43 propagates through the internal state. For $j = 20$, for example, the difference between $S_{t_1}, S_{t_2}$ is at LFSR location 23. Then the sum of the keystream bits $z_{t_1+20}, z_{t_2+20}$ can be essentially expressed as $z_{t_1+20} \oplus z_{t_2+20} = l_{t_1+39}$. The other expressions can be verified similarly. Note that 7 of the 14 expressions listed above depend only on $L_{t_1}$, whereas $z_{t_1+46} \oplus z_{t_2+46}$ consists of a single product term involving an LFSR bit.
Lemma 4. In the event that we generate, uniformly at random, $N$ internal states of Plantlet $S_{t_i} \in \{0, 1\}^{101}$, for $i \in [1, N]$, then the probability that there exist two of them whose difference is $0^{40} \| e_{43}$ is approximately $p \approx N^2 / 2^{102}$. The above can be modeled as a Bernoulli trial with success probability $p$ (where "success" is defined as the event in which we sample two internal states with the given difference). Then by repeating the above experiment (in which we sample internal states randomly) around $\frac{1}{p}$ times, we can expect to obtain one successful event.
Proof. The first probability value in the lemma is easy to prove by birthday bound considerations. Given $N$ samples in a domain of size $2^{101}$, there are $N(N-1)/2$ pairs of samples, each of which has the required difference with probability $2^{-101}$. The probability $q$ that there are no collisions of the required type is therefore approximately $q \approx (1 - 2^{-101})^{N(N-1)/2} \approx 1 - N^2 / 2^{102}$. Thus $p = 1 - q$ results in the required expression. We repeat the above experiment a number of times, as follows:
1. Randomly sample $N$ states and look for the given difference.
2. If the above trial fails, then erase the above samples and repeat step 1.
It is easy to see that the above results in a series of Bernoulli trials with probability of success $p$. The number of such trials needed to get one success follows a geometric distribution with mean $\frac{1}{p}$. Hence the second claim in the lemma follows.
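As a numerical sanity check of Lemma 4, the sketch below evaluates the pair-counting estimate in the $\log_2$ domain. It assumes the approximation "about $N^2/2$ unordered pairs, each hitting the target difference with probability $2^{-101}$"; the function name is hypothetical.

```python
import math


def log2_success_probability(log2_n, state_bits=101):
    """log2 of p, using ~N^2/2 pairs that each match with probability 2^-state_bits."""
    log2_pairs = 2 * log2_n - 1
    return log2_pairs - state_bits


# N = 2^30 / 80 keystream segments available from one key-IV pair.
log2_n = 30 - math.log2(80)
log2_p = log2_success_probability(log2_n)

# Mean of the geometric distribution: 1/p trials are expected before the first success.
log2_expected_trials = -log2_p

print(round(log2_n, 2), round(log2_p, 2), round(log2_expected_trials, 2))
```

With $N \approx 2^{23.7}$ this gives $p \approx 2^{-54.6}$, and the same estimate reaches $p \approx 1$ at $N = 2^{51}$.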

Key recovery attack on Plantlet
Having made some preliminary observations about the differential structure of Plantlet, we are now ready to mathematically describe the cryptanalytic steps. Note that in the preceding experiment, if $N = 2^{51}$, then by the birthday bound one such trial would be sufficient. However, we limit the value of $N$, because there is a limit on the number of keystream bits that can be generated using a single key-IV pair, and this limit is $2^{30}$ bits. The attacker lets the cipher run for $2^{30}$ cycles and collects the required keystream, with the idea that the cipher during its operation hits two internal states at times $t_1, t_2$ (both multiples of 80) that differ only in the 43rd LFSR location. If it does, then the attacker can identify the states and the corresponding values of $t_1, t_2$ by looking at the difference keystream vector $\Delta = (Z_{t_1} \| Y_{t_1}) \oplus (Z_{t_2} \| Y_{t_2})$ (which was defined in the previous section). But there are obvious obstacles to this idea:
• Firstly, given the limited number of keystream bits one is allowed to generate with one key-IV pair, by Lemma 4 it is extremely unlikely that the attacker will actually encounter two states with the required difference. Hence the attacker must repeat the experiment multiple times with the same key and some other randomly selected IV. Lemma 4 also gives the number of times (i.e. $\frac{1}{p}$) the experiment needs to be repeated to get a success.
• Second, although it is true that two internal states with a difference only at the 43rd LFSR location produce keystream bits whose differential is guaranteed to be 0 or 1 at 45 fixed locations, the converse is not true. In fact, two completely random internal states of Plantlet produce a keystream differential of 0/1 at the same 45 locations enumerated in Lemma 2 with probability around $2^{-45}$. Thus the attacker, when for some $(t_1, t_2)$ he observes a differential keystream having the required 0/1 pattern in the locations enumerated in Lemma 2, may still proceed to the next steps, assuming that the segments were generated by two Plantlet states with a difference in the 43rd LFSR bit. But if his assumption about the state difference is wrong, then in the subsequent steps he would certainly reach a contradiction that would invalidate the assumption. The attacker would then need to repeat the experiment to obtain some other $t_1, t_2$ until he is successful in getting internal states with the required difference.
• Thus any attack must compensate for the computational overheads listed above.
Thus, at the very top level, the strategy of the attacker will be as follows:
A: Generate $2^{30}$ keystream bits with the given secret key and any random IV. This generates $N = \frac{2^{30}}{80} \approx 2^{23.7}$ keystream segments of length 80 bits each.
B: For all $t = 80 \cdot i$ where $i \in [1, N - 1]$, store in a hash table the tuple $(t, Z_t, Y_t)$ as defined.
C: From this table, try to find, if it exists, a pair $t_1, t_2$ so that $\Delta = (Z_{t_1} \| Y_{t_1}) \oplus (Z_{t_2} \| Y_{t_2})$ exhibits the 1/0 pattern in the locations listed in Lemma 2. We refer to such an event as a keystream-collision.
D: If there exist one or multiple such $t_1, t_2$, then assuming that the state differential between the states at times $t_1, t_2$ is $0^{40} \| e_{43}$, try to solve the resulting system of equations to find the key.
E: If a contradiction is reached, try other values of $t_1, t_2$, if they exist. If none exist, then repeat step A with another IV. If the attacker does not encounter a contradiction and is able to solve the equations, he would have computed the secret key.
By Lemma 4, $\frac{1}{p} = 2^{54.6}$ such trials should be sufficient to solve for the secret key. We now explain the finer details of the attack, starting with a precomputation step that eases the computational burden in the online stage of the attack.

Precomputation Stage
If the attacker encounters a keystream-collision, regardless of whether it was generated by two states with a single bit difference at the 43rd LFSR location, he will need to try to solve the resulting system of polynomial expressions for each keystream bit in $Z_{t_1}, Z_{t_2}, Y_{t_1}$ and $Y_{t_2}$, assuming that the corresponding internal state difference is $0^{40} \| e_{43}$. This is a system of Boolean polynomials in 181 variables over GF(2) (40 for the NFSR state, 61 for the LFSR state, and 80 for the key). Such a system should generally be intractable to solve. However, the attacker can use the result in Lemma 1 to get the values of the states $L_{t_1}$ and $L_{t_2}$, since a keystream-collision automatically provides the values of $t_1, t_2$. He simply solves the equation $e_{43} = (M^{t_2 - t_1} \oplus I) \cdot L_{t_1}$ to compute $L_{t_1}$, and then computes $L_{t_2} = L_{t_1} \oplus e_{43}$. Once the entire LFSR states at times $t_1, t_2$ are known, the resulting equation system is defined over 120 unknowns, which is much easier to solve using any publicly available equation solver.
However, since $T = t_2 - t_1$ is the only varying parameter in the equation $e_{43} = (M^{t_2 - t_1} \oplus I) \cdot L_{t_1}$, one can pre-solve the above set of equations for all possible values of $T$. Note that $1 \leq t_1 < t_2 \leq 2^{30}$, and since $t_1, t_2$ are multiples of 80, there are only around $N - 1 = \frac{2^{30}}{80} - 1 \approx 2^{23.7}$ different values of $T$. Each equation can be solved offline, and the solutions stored in a table indexed by the value of $T$. In this way, in the online stage, finding the values of $L_{t_1}, L_{t_2}$ from the values of $t_1, t_2$ amounts to only a table lookup.
The total computational complexity of the offline stage amounts to solving $N - 1 \approx 2^{23.7}$ equation systems over 61 variables. Assuming conservatively that it takes $O(n^3)$ steps to do Gaussian elimination to solve each system, the total number of steps involved is bounded by $61^3 \cdot 2^{23.7} \approx 2^{41.5}$. It takes 61 bits to store the solution of each equation system (in the table cell indexed by $T$), and so the memory complexity of this stage is $61 \cdot 2^{23.7} \approx 2^{29.6}$ bits.
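The precomputation estimates above are plain arithmetic and can be reproduced as follows. The snippet simply evaluates the two stated bounds in the $\log_2$ domain; the variable names are illustrative.

```python
import math

# Number of distinct values of T = t2 - t1 (multiples of 80 below 2^30).
log2_n = 30 - math.log2(80)

# Time: one cubic-cost Gaussian elimination on a 61 x 61 system per value of T.
log2_time = 3 * math.log2(61) + log2_n

# Memory: one 61-bit solution stored per value of T.
log2_memory_bits = math.log2(61) + log2_n

print(f"values of T  ~ 2^{log2_n:.1f}")
print(f"offline time ~ 2^{log2_time:.1f} steps")
print(f"table size   ~ 2^{log2_memory_bits:.1f} bits")
```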

Online Stage I: Collecting and storing keystream bits
In the online stage, the attacker needs to collect keystream bits and store them in a judicious manner. Note that since $N = \frac{2^{30}}{80} \approx 2^{23.7}$, the value of $p$ calculated in Lemma 4 is around $p = 2^{-54.6}$, and so the number of IVs we need to try is around $V = \frac{1}{p} = 2^{54.6}$. For each such IV, the attacker proceeds to generate $2^{30}$ keystream bits, i.e. $N$ keystream segments.
To facilitate detection of keystream-collisions, one must choose a data structure to efficiently store keystream segments. For all $t = 80 \cdot i$ where $i \in [1, N - 1]$, the attacker has to store in a hash table the tuple $(t, Z_t, Y_t)$ as defined in Lemma 2. However, it is unwise to insert the tuple into the table location indexed by $t$. Instead we insert the tuple in the table location $I = z_{t+g_0} \| z_{t+g_1} \| \cdots \| z_{t+g_{44}}$, where the $g_i$ are the locations enumerated in Lemma 2 where the differential keystream is guaranteed to be 0/1. Thus $(g_0, g_1, \ldots, g_{40}) = (0, 1, 2, \ldots, 10, 12, 14, 15, \ldots, 19, 21, 22, 23, 25, 27, 28, 30, 32, 34, 39, 41, -13, -11, -10, \ldots, -1)$ are the locations where the difference is 0, and $(g_{41}, g_{42}, g_{43}, g_{44}) = (13, 31, 40, -12)$ are the locations where the difference is 1. Note that each table location should be able to store multiple tuples. It is not difficult to see that a keystream-collision occurs if, during an insertion into index $I$, the attacker checks the index $I^* = I \oplus (0^{41} \| 1^4)$ and finds one or multiple tuples already stored at $I^*$. For each such keystream-collision pair in $(I, I^*)$, the attacker proceeds to the next steps of the attack.
It takes 30 bits to store $t$ and 160 bits to store $Z_t, Y_t$, and so each IV trial takes around $190 \cdot N \approx 2^{31.25}$ bits of memory on average.
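The indexing scheme can be sketched as follows. This is a simplified illustration, not the authors' implementation: the table stores only $t$ rather than the full tuple $(t, Z_t, Y_t)$, the keystream is a plain Python list of bits indexed by absolute time, and the names `table_index` and `find_collisions` are hypothetical. The demonstration uses random bits with one planted pair instead of real Plantlet keystream.

```python
import random
from collections import defaultdict

# Offsets (g_0..g_40) where the difference is 0, then (g_41..g_44) where it is 1.
EQUAL = (list(range(0, 11)) + [12] + list(range(14, 20))
         + [21, 22, 23, 25, 27, 28, 30, 32, 34, 39, 41]
         + [-13] + list(range(-11, 0)))
FLIP = [13, 31, 40, -12]
FLIP_MASK = (1 << len(FLIP)) - 1   # the four FLIP bits are the low bits of the index


def table_index(z, t):
    """Concatenate the 45 keystream bits z[t + g_i] into an integer index I."""
    idx = 0
    for g in EQUAL + FLIP:
        idx = (idx << 1) | z[t + g]
    return idx


def find_collisions(z, block=80):
    """Return all (t1, t2), t1 < t2, whose indices differ exactly in the FLIP bits."""
    table = defaultdict(list)
    pairs = []
    for t in range(block, len(z) - block + 1, block):
        idx = table_index(z, t)
        # A partner must sit at I* = I xor (0^41 || 1^4).
        pairs.extend((t1, t) for t1 in table.get(idx ^ FLIP_MASK, ()))
        table[idx].append(t)
    return pairs


# Demonstration on random bits with one planted keystream-collision.
z = [random.getrandbits(1) for _ in range(80 * 20)]
t1, t2 = 160, 800
for g in EQUAL:
    z[t2 + g] = z[t1 + g]
for g in FLIP:
    z[t2 + g] = z[t1 + g] ^ 1
print(find_collisions(z))
```

Each insertion costs one index computation and one lookup, so detecting collisions is linear in the number of segments rather than quadratic in the number of pairs.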

Online Stage II: Further filtering
For each keystream-collision pair obtained in the previous step, the attacker can perform further filtering. First, let $(t_1, Z_{t_1}, Y_{t_1})$ and $(t_2, Z_{t_2}, Y_{t_2})$ be a pair filtered from the previous stage. The attacker can then compute $t_2 - t_1$ and retrieve the value of $L_{t_1}$ from the precomputed table. By Lemma 3, there are 8 other bits in $Z_{t_1} \oplus Z_{t_2}$ that are directly related to $L_{t_1}$. Since during the keystream stage the LFSR evolves independently, all $l_{t_1+i}$ can be computed with the knowledge of $L_{t_1}$ alone. This provides us with an opportunity to further filter the keystream-collision pairs obtained from the previous stage. For example:
1. If the attacker finds that $z_{t_1+20} \oplus z_{t_2+20} \neq l_{t_1+39}$, he can reject the pair.
So, let us calculate the probability that a given IV produces a keystream-collision pair that survives both the filter levels described above. Note that since a single IV can produce $N$ tuples, the total number of pairs of tuples is $D = \frac{N(N-1)}{2} \approx 2^{46.36}$. Denote $\alpha_i = z_{t_1+g_i} \oplus z_{t_2+g_i}$, and let $\rho$ denote the probability that a pair is not rejected. In calculating $\rho$, we have assumed that the events $\beta_7 = 1$ and $l_{t_1+78} = 0$ are statistically independent, but under this situation this is a fair assumption to make. Let $X_{t_1,t_2}$ be the indicator variable that is 1 when the tuples at $t_1, t_2$ are not rejected by the filters, and zero otherwise. Then we have shown that $E(X_{t_1,t_2}) = \rho$. Let $P_s$ be the expected number of pairs that survive while processing the keystream generated by a single IV. By linearity of expectation, we have $P_s = D \cdot \rho$. Thus the total number of pairs that survive trials with $V$ different IVs is given as $P_u = V \cdot P_s = 2^{54.6} \cdot P_s$. This is the number of pairs that proceed to the next stage of the attack.
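To give a feel for the numbers before the second filter is applied, the sketch below computes the pair count $D$ and the expected number of purely random keystream-collisions per IV. It assumes, as a heuristic, that a random pair matches the 45-bit pattern with probability about $2^{-45}$; it does not reproduce the second-level survival probability $\rho$.

```python
import math

n_segments = 2 ** 30 / 80                       # N segments per IV
pairs = n_segments * (n_segments - 1) / 2       # D = N(N-1)/2
log2_pairs = math.log2(pairs)

# Heuristic: a random pair shows the 45-bit pattern with probability ~2^-45.
log2_random_hits = log2_pairs - 45

print(f"D ~ 2^{log2_pairs:.2f}")
print(f"random keystream-collisions per IV ~ 2^{log2_random_hits:.2f}")
```

Under this heuristic each IV yields a handful of false keystream-collisions, which is why the cheap LFSR-based filter is applied before any equation solving.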

Online Stage III: Solving Equation System
The final stage of the attack involves attempting to solve the equation system resulting from the keystream segment pairs. The attacker has to try to solve the $P_u$ sets of equations, assuming that they were generated by two Plantlet states that differ by $0^{40} \| e_{43}$. Most of the time the assumption is wrong, so that a contradiction is arrived at. However, the value of $V$ has been chosen so that the attacker encounters, on average, at least one state pair with the required difference, which he can solve to find the secret key.
We construct the equation system over the polynomial ring $\mathbb{Z}_2[N, K]$, where $N = \{n_0, \ldots, n_{39}\}$ and $K = \{k_0, \ldots, k_{79}\}$, where the variables $k_i$ correspond to the bits of the key, and the variables $n_i$ correspond to the bits of the NFSR. As explained in Section 4.3, if $Z_{t_1}, Y_{t_1}$ and $Z_{t_2}, Y_{t_2}$ satisfy the filtering criteria, we assume that $N_{t_1} = N_{t_2}$ and $L_{t_1} = L_{t_2} \oplus e_{43}$. We can compute the values of $L_{t_1}$ and $L_{t_2}$ from the precomputed tables. So let us assume $L_{t_1} = (l_0, \ldots, l_{60})$ and $L_{t_2} = (l_0, \ldots, l_{60}) + e_{43}$. We can now generate the polynomial expressions for each keystream bit by considering the content of the LFSR to be the 61-bit string $(l_0, \ldots, l_{60})$ over GF(2), the content of the NFSR to be the Boolean variables $(n_0, \ldots, n_{39})$, and the key to be the Boolean variables $(k_0, \ldots, k_{79})$, and doing all the computations in $R = \mathbb{Z}_2[N, K]$. Let the polynomial expressions generated this way be $(z^*_{t_1}, \ldots, z^*_{t_1+79})$, where all the entries are polynomials with unknowns in $K \cup N$. Now we consider the equations of the form $z^*_{t_1+i} \oplus z_{t_1+i} = 0$ for $i \in [0, 79]$ and add them to the equation system. We can also do the same for the stream generated from time $t_2$, by loading the LFSR with the initial value $(l_0, \ldots, l_{60}) + e_{43}$ and the NFSR with the same variables $(n_0, \ldots, n_{39})$, and construct the equations of the form $z^*_{t_2+i} \oplus z_{t_2+i} = 0$ for $i \in [0, 79]$. We also generate the polynomial expressions for the keystream bits in the backward direction, $(z^*_{t_1}, z^*_{t_1-1}, \ldots, z^*_{t_1-79})$ and $(z^*_{t_2}, z^*_{t_2-1}, \ldots, z^*_{t_2-79})$, with the same initial register values, and add the equations of the form $z^*_{t_1-i} \oplus z_{t_1-i} = 0$ and $z^*_{t_2-i} \oplus z_{t_2-i} = 0$ to our system. This way we will have $4 \times 80$ equations over 120 unknowns. As we are only looking for solutions in $\mathbb{Z}_2$, we can consider this system a SAT problem. We feed the system of equations to a SAT solver. For an incorrect assumption on the states generating a given differential keystream, a SAT-based solver returns UNSAT, which is to say that the system of equations fed to it is inconsistent. This gives us an efficient method to arrive at a contradiction and reject an incorrect guess of the initial state difference.

Experimental Results
The computations in this section were done using the computer algebra software SAGE 8.7 [Dev17], and we used the Cryptominisat 5.0.1 [SNC09] package for solving the underlying equation system. All experiments were done on an AMD Opteron 8354 processor with CPU speed of 2200 MHz running on Ubuntu 14.04.6 LTS. We ran three experiments:
1. First we estimated the time it takes the SAT solver to return UNSAT, i.e. when the attacker incorrectly assumes that a given differential keystream (which satisfies all filter requirements in Section 4.3) is produced by two Plantlet states differing only by $0^{40} \| e_{43}$. Note that most of the time (around $P_u - 1$ times), the attacker will have to face this situation, and hence it is important to measure the computational cost of this task.
2. Second we estimated the time it takes the SAT solver to return SAT, i.e. when the attacker correctly assumes that a given differential keystream (which satisfies all filter requirements in Section 4.3) is produced by two Plantlet states differing only by $0^{40} \| e_{43}$. When this event occurs, the attacker would have successfully computed the value of the secret key.
3. Finally, we estimated the amount of time needed to perform one Plantlet encryption. This step is important because this way we can estimate the computational cost of solving an equation in terms of the computational cost of an encryption.
Since there is no straightforward way to compute the number of steps taken by the solver to solve a given polynomial system, there is no good way of directly comparing the computational costs of solving an equation and performing one encryption. Due to this fact, many papers [ZLFL14, MAM16] in the past have measured the physical time to perform the above tasks to make a comparison. In [MAM16], in order to estimate the computational complexity of guess and determine attacks, the authors measured the time of performing one encryption and concluded that it was possible to perform around $2^{10}$ encryptions per second on their system. Using this fact, and after experimentally finding the average physical time required to solve a particular set of equations, they concluded that guess and determine attacks on Plantlet did not perform better than a brute force attack. We adopt a similar method to estimate the bounds we present in this paper.
First we estimate the time it takes the SAT solver to return UNSAT. For this, we randomly generate a pair of keystream segments ($Z_{t_1}, Y_{t_1}$ and $Z_{t_2}, Y_{t_2}$) of length 160 bits each and a 61-bit initial LFSR state $L$ such that they satisfy all the filtering criteria in Section 4.3. Using the variables $N, K$ and the bit string $L$, the polynomial expressions for $Z_{t_1}, Y_{t_1}$ are computed. Similarly, using the variables $N, K$ and the bit string $L \oplus e_{43}$, we generate the polynomial expressions for $Z_{t_2}, Y_{t_2}$. The polynomials and keystream bits generated earlier form the left and right sides of an equation bank. Since the keystream bits and polynomial equations were generated randomly and independently of each other, the system of equations, when fed to a SAT solver, will with high probability make the solver return UNSAT. Note that we have set up the above system of equations so as to simulate the event in which the attacker observes a differential keystream that satisfies all filtering requirements and incorrectly assumes that the bits were generated by two states differing in the 43rd LFSR location.
While doing the experiments, we found that the solver returns a SAT/UNSAT verdict faster when the last 4 bits of the NFSR are additionally guessed. So we essentially solve the system of equations over 116 unknowns. We ran the above set of experiments for 1000 randomly generated samples, and the results are presented in Figure 2. The figure is a probability distribution histogram of the time taken for the solver to return UNSAT. The x-axis represents the time taken in seconds and the y-axis the corresponding probability.
Second, we estimate the time it takes the solver to return SAT, i.e. when the attacker correctly predicts the state differential. For this, we randomly generate a 101-bit initial state $N, L$ and an 80-bit key $K$. We generate $Z_{t_1}, Y_{t_1}$ using $N, L, K$ and $Z_{t_2}, Y_{t_2}$ using $N, L \oplus e_{43}, K$. Thereafter, we generate the polynomial expressions for $Z_{t_1}, Y_{t_1}$ using the variables $N, K$ and the bit string $L$. Similarly, we generate the polynomial expressions for $Z_{t_2}, Y_{t_2}$ using the variables $N, K$ and the bit string $L \oplus e_{43}$. Again, an equation bank is created with the expressions and keystream bits on the left and right sides. Since the keystream bits and polynomial equations were generated consistently with each other, the system of equations, when fed to a SAT solver, will return the correct solution, i.e. the values of $N$ and $K$ that were used to generate the keystream. Note that we have set up the above system of equations so as to simulate the event in which the attacker observes a differential keystream that satisfies all filtering requirements and correctly assumes that the bits were generated by two states differing in the 43rd LFSR location.
Again, the last 4 bits of the NFSR are additionally guessed to obtain faster solutions. We ran the above set of experiments for 1000 randomly generated samples, and the results are presented in Figure 3. The probability density function was again close to a normal distribution (as indicated by the black curve), with mean $\mu_{SAT} = 66.17$ seconds and standard deviation $\sigma_{SAT} = 39.26$ seconds.
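The difference between the two experiments can be illustrated on a toy scale. The sketch below is not the SAGE code used for the paper: it replaces Plantlet's keystream polynomials with random quadratic polynomials over GF(2), uses only 8 unknowns instead of 116, and substitutes exhaustive search for the SAT solver. All names and sizes in it are hypothetical. It only shows why right-hand sides generated independently of the polynomials tend to give an inconsistent system, while right-hand sides generated from a planted assignment always admit that assignment as a solution.

```python
import itertools
import random


def random_quadratic(n, rng, terms=6):
    """Random Boolean polynomial of degree <= 2, stored as a list of monomials.

    Each monomial is a tuple of variable indices (length 1 or 2).
    """
    monomials = []
    for _ in range(terms):
        degree = rng.choice([1, 2])
        monomials.append(tuple(sorted(rng.sample(range(n), degree))))
    return monomials


def evaluate(poly, x):
    """Evaluate the polynomial over GF(2) at the bit-vector x."""
    return sum(all(x[i] for i in mono) for mono in poly) % 2


def solutions(polys, rhs, n):
    """Exhaustive-search stand-in for a SAT solver: all x with poly_j(x) = rhs_j."""
    return [x for x in itertools.product((0, 1), repeat=n)
            if all(evaluate(p, x) == b for p, b in zip(polys, rhs))]


rng = random.Random(1)
n, m = 8, 24  # toy sizes: 8 unknowns, 24 equations (assumption, not from the paper)
polys = [random_quadratic(n, rng) for _ in range(m)]

# "SAT" experiment: right-hand sides are computed from a planted assignment.
secret = tuple(rng.randrange(2) for _ in range(n))
consistent_rhs = [evaluate(p, secret) for p in polys]

# "UNSAT" experiment: right-hand sides are drawn independently of the polynomials.
independent_rhs = [rng.randrange(2) for _ in range(m)]

sat_case = solutions(polys, consistent_rhs, n)
unsat_case = solutions(polys, independent_rhs, n)
print("consistent system:", len(sat_case), "solution(s); planted one found:", secret in sat_case)
print("independent system:", len(unsat_case), "solution(s)")
```

With many more equations than unknowns, an independently drawn right-hand side is expected to be satisfiable only rarely, which mirrors the role of the incorrect-guess experiment above.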
Third, we also computed the time it takes to perform a single encryption. As argued in [EK15, Ban15], one Plantlet encryption should be taken as the average number of Plantlet rounds that need to be executed per trial with a guessed value of the key (in a brute-force search). This comes to 320 initialization rounds and 4 rounds in the keystream generation phase. We have given a proof of this in Appendix A (at the end of this paper). Thus one Plantlet round is equivalent to around $\frac{1}{324} = 2^{-8.34}$ Plantlet encryptions. We measured the time to perform 320 initialization and 4 keystream generation rounds (for around 10,000 random Key-IV samples), and the results are presented in Figure 4. As expected, the distribution is close to normal, with mean $\mu_{ENC} = 0.0057$ seconds and standard deviation $\sigma_{ENC} = 0.000699$ seconds. The SAGE code we used is provided as auxiliary material attached to the paper. Note that in Grain-like designs, it is possible to optimize the encryption speed by computing multiple rounds in a single clock cycle. For example, Grain v1 does not use any of the last 16 bits of either the linear or the non-linear register as inputs to the update or the keystream generating functions. As a result, a 16-fold speedup in software is possible by doing 16 round updates in one iteration. However, that is not the case with Plantlet, as even the last NFSR bit is used in the non-linear update function $g$.
As pointed out earlier, the main aim of the previous experiment was to compare the cost of solving an equation with that of performing an encryption, so as to find the equivalent computational cost of solving $P_u$ equations in terms of Plantlet encryptions. Since we guess 4 bits of the NFSR, the computational cost of returning UNSAT can be estimated on average to be $C_u = 2^4 \cdot \frac{\mu_{UNS}}{\mu_{ENC}} \approx 2^{17.13}$ Plantlet encryptions. Similarly, the cost of returning SAT is around $C_s = 2^4 \cdot \frac{\mu_{SAT}}{\mu_{ENC}} \approx 2^{17.5}$ Plantlet encryptions.
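The conversion from solver time to Plantlet encryptions can be checked with a few lines of arithmetic. The sketch below uses only the measured means quoted above ($\mu_{SAT} = 66.17$ s, $\mu_{ENC} = 0.0057$ s) and the 324 rounds per encryption; the variable names are ours. The value of $\mu_{UNS}$ is not quoted in this section, so the last line merely back-calculates the mean that the stated $C_u \approx 2^{17.13}$ would imply, and should be read as a derived figure rather than a reported measurement.

```python
from math import log2

ROUNDS_PER_ENCRYPTION = 320 + 4   # initialization rounds + keystream rounds
MU_SAT = 66.17                    # mean solver time for the SAT case, seconds
MU_ENC = 0.0057                   # mean time of one encryption, seconds
GUESSED_NFSR_BITS = 4             # each guess multiplies the solver work by 2

# One round expressed in encryptions, as a base-2 logarithm.
log2_round_cost = -log2(ROUNDS_PER_ENCRYPTION)


def solver_cost_log2(mean_solver_seconds):
    """log2 of the cost, in encryptions, of one equation system with 4 guessed bits."""
    return GUESSED_NFSR_BITS + log2(mean_solver_seconds / MU_ENC)


log2_cs = solver_cost_log2(MU_SAT)

# Back-calculated (not reported here): the mean UNSAT time implied by C_u = 2^17.13.
implied_mu_uns = 2 ** (17.13 - GUESSED_NFSR_BITS) * MU_ENC

print(f"one round       = 2^{log2_round_cost:.2f} encryptions")
print(f"C_s             = 2^{log2_cs:.2f} encryptions")
print(f"implied mu_UNS  = {implied_mu_uns:.1f} seconds")
```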

Total Complexity of attack
The total time complexity of the precomputation stage has already been calculated in Section 4.1 and is bounded by $2^{41.5}$ bit operations. The online complexity is dominated by three computational tasks.
1. First is the time required to generate keystream: $2^{30}$ keystream bits for each of $2^{54.6}$ IVs, which is equivalent to around $2^{76.26}$ Plantlet encryptions.
2. Second is the time required to solve equations. A total of $P_u = 2^{48.54}$ equations need to be solved. Only one of these equations is expected to yield the correct solution for the secret key. In the previous section we have argued that the time to solve an equation unsuccessfully (i.e. yielding UNSAT from the solver) is computationally equivalent to $C_u = 2^{17.13}$ encryptions, whereas the time to solve one successfully is equivalent to $C_s = 2^{17.5}$ encryptions. The computational burden for this task is therefore around $(P_u - 1) \cdot C_u + C_s \approx 2^{65.7}$ encryptions.
3. The total memory access is dominated by the number of table insertions done in the online stage of the attack. We need a total of $N \cdot V = 2^{54.6+23.7} = 2^{78.3}$ table insertions. Note that at any point of time during the attack we do not need more than $2^{31.25} + 2^{29.6}$ bits (≈ 304 MB) of memory. Thus the tables can be stored in primary memory and accessed reasonably quickly. We do not have a method to reliably compare this to the number of encryptions, but $2^{78.3}$ memory accesses are not likely to take more time than $2^{76.26}$ Plantlet encryptions, by any fair estimation.
Thus the computational complexity is dominated by the task of generating keystream and is equal to $2^{76.26}$ Plantlet encryptions. The memory required for the storage of the precomputed tables has already been calculated to be around $2^{29.6}$ bits. Other than that, for each IV trial in the online stage, we have already shown that around $2^{31.25}$ bits are required.
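The figures in this section can be reproduced by working with base-2 logarithms. The following sketch is our own bookkeeping, not code from the paper. It assumes that the keystream-generation cost is obtained as (number of IVs) × (keystream bits per IV) × (cost of one round in encryptions), with one round per keystream bit; this reading is an assumption, although it does reproduce the stated $2^{76.26}$.

```python
from math import log2

# All quantities are base-2 logarithms taken from the text.
LOG2_IVS = 54.6               # number of IVs (N)
LOG2_KEYSTREAM_BITS = 30      # keystream bits generated per IV
LOG2_ROUND_COST = -log2(324)  # one round = 1/324 of an encryption
LOG2_PU = 48.54               # number of equation systems to solve
LOG2_CU = 17.13               # cost of one UNSAT verdict, in encryptions
LOG2_CS = 17.5                # cost of one SAT verdict, in encryptions
LOG2_V = 23.7                 # table insertions per IV (V)

# Task 1: keystream generation (assumed: one round per keystream bit).
log2_keystream = LOG2_IVS + LOG2_KEYSTREAM_BITS + LOG2_ROUND_COST

# Task 2: (P_u - 1) unsuccessful solves plus one successful solve.
log2_solving = log2((2 ** LOG2_PU - 1) * 2 ** LOG2_CU + 2 ** LOG2_CS)

# Task 3: table insertions (memory accesses, not encryptions).
log2_insertions = LOG2_IVS + LOG2_V

print(f"keystream generation: 2^{log2_keystream:.2f} encryptions")
print(f"equation solving:     2^{log2_solving:.2f} encryptions")
print(f"table insertions:     2^{log2_insertions:.1f} memory accesses")
```

The single successful solve contributes a negligible amount to the second task, so its cost is essentially $P_u \cdot C_u$.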

Improving Attack Complexity
Based on the observations in the above section, we can propose a more efficient key recovery attack on Plantlet. First of all, we note that one of the conditions of Lemmas 2 and 3 is that both $t_1$ and $t_2$ are multiples of 80. It can be easily seen that both lemmas also hold if we relax the conditions on $t_1, t_2$ to $t_1 \equiv t_2 \bmod 80$. This is because if $t_1, t_2$ belong to the same equivalence class modulo 80, then the sequences of key and counter bits used to update $S_{t_1} = (N_{t_1}, L_{t_1})$ and $S_{t_2} = (N_{t_2}, L_{t_2})$ are the same. Thus the differential evolution of $S_{t_1}, S_{t_2}$ is independent of the key and counter bits, and hence the lemmas naturally hold.
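The relaxation can be made concrete with a small sketch. It assumes, as the argument above implies, that the key bit and the counter bits entering round $t$ depend on $t$ only through $t \bmod 80$; the function name and the chosen round numbers are hypothetical. Under that assumption, two states starting at rounds in the same equivalence class see identical key and counter schedules, so those contributions cancel in the difference.

```python
KEY_BITS = 80  # Plantlet key length; assumed period of the key/counter schedule


def schedule(t_start, length):
    """Schedule indices (t mod 80) used in rounds t_start, ..., t_start + length - 1.

    Assumption: the key bit and counter bits of round t are functions of t mod 80 only.
    """
    return [t % KEY_BITS for t in range(t_start, t_start + length)]


# Same equivalence class modulo 80, neither a multiple of 80: schedules coincide.
t1, t2 = 167, 807
assert t1 % KEY_BITS == t2 % KEY_BITS
same_class = schedule(t1, 160) == schedule(t2, 160)

# Different equivalence classes: schedules differ, so the cancellation fails.
different_class = schedule(160, 160) == schedule(161, 160)

print("t1 = 167, t2 = 807 share a schedule:", same_class)
print("t1 = 160, t2 = 161 share a schedule:", different_class)
```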
Thus, we can see the attack in the previous section as limited to the equivalence class 0 mod 80. One can naturally try to improve the attack complexity by extending the attack to all equivalence classes modulo 80. We will see how subtly tweaking the above steps can lead to a more efficient attack.
a) Generating Keystream: We need to generate $2^{30}$ keystream bits from each of $2^{48.32}$ IVs.
b) Solving Equations: Since the total number of trials does not change, the attack procedures described in Sections 4.3 and 4.4 do not change. A total of $P_u = 2^{48.54}$ equations satisfy both filtering levels and need to be solved. Thus the computational burden for this task is around $(P_u - 1) \cdot C_u + C_s \approx 2^{65.7}$ encryptions.

c) Memory access: The total memory access is proportional to the number of trials. Hence this complexity too does not change. We need a total of $2^{54.6+23.7} = 2^{78.3}$ table insertions.
Thus the dominant time complexity is the one required to generate keystream, and it is around $2^{69.98}$ Plantlet encryptions.
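The gain of the improved attack can be quantified with the same kind of logarithmic arithmetic. This sketch is ours and assumes, as before, one round per keystream bit and 324 rounds per encryption; the function name is hypothetical. It compares the keystream-generation cost for $2^{54.6}$ and $2^{48.32}$ IVs.

```python
from math import log2

LOG2_ROUND_COST = -log2(324)  # one round = 1/324 of an encryption


def keystream_cost_log2(log2_ivs, log2_bits_per_iv=30):
    """log2 of the keystream-generation cost in encryptions (assumed model)."""
    return log2_ivs + log2_bits_per_iv + LOG2_ROUND_COST


log2_basic = keystream_cost_log2(54.6)      # attack restricted to class 0 mod 80
log2_improved = keystream_cost_log2(48.32)  # attack over all classes modulo 80
log2_gain = log2_basic - log2_improved

print(f"basic attack:    2^{log2_basic:.2f} encryptions")
print(f"improved attack: 2^{log2_improved:.2f} encryptions")
print(f"gain:            2^{log2_gain:.2f} = {2 ** log2_gain:.1f}x")
```

The resulting factor is close to 80, the number of equivalence classes now exploited, and the keystream-generation cost still exceeds the $2^{65.7}$ encryptions spent on solving equations.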

Conclusion
In this paper, we propose a key recovery attack on the Plantlet stream cipher. The first attack requires $2^{30}$ keystream bits to be generated with the secret key for each of $2^{54.6}$ randomly chosen IVs. This is computationally equivalent to performing $2^{76.26}$ Plantlet encryptions.
The attack takes advantage of the sparse locations of the bits tapped from the LFSR that are used as inputs to the filter function producing the keystream bit. As a result, two Plantlet states that differ in the 43rd LFSR location are guaranteed to produce keystreams that are either equal or unequal in 45 locations with probability 1. This enables us to get a reasonably reliable probabilistic mapping from a differential keystream with a given pattern to a given difference in the internal state. Using precomputed tables, we can probabilistically extract the LFSR part of the two states once we encounter a differential keystream with the required 0/1 pattern. We then try to solve for the remainder of the state and the secret key. The process, if repeated with $2^{54.6}$ randomly chosen IVs, each generating $2^{30}$ keystream bits, is expected to give us the correct value of the secret key at least once.
In the second part of the paper, we observe that the previous attack was limited to values of $t_1, t_2$ in the equivalence class 0 mod 80. We extend the scope of the attack to all equivalence classes modulo 80. This requires the attacker to generate keystream from far fewer IVs, and reduces the online complexity to $2^{69.98}$ Plantlet encryptions.

Figure 2: Histogram of the SAT solver abort time

Figure 3: Histogram of the SAT solver runtime

Figure 4: Histogram of the encryption runtime