Abstract. ICEPOLE is a CAESAR candidate with the intermediate level of robustness under nonce misuse circumstances in the original document. In particular, it was claimed that key recovery attack against ICEPOLE is impossible in the case of nonce misuse. ICEPOLE is strong against the differential cryptanalysis and linear cryptanalysis. In this paper, we developed the differential-linear attacks against ICEPOLE when nonce is misused. Our attacks show that the state of ICEPOLE–128 and ICEPOLE–128a can be recovered with data complexity 246 and time complexity 246 ; the state of ICEPOLE–256a can be recovered with data complexity 260 and time complexity 260 . For ICEPOLE–128a and ICEPOLE–256a, the secret key is recovered once the state is recovered. We experimentally verified the attacks against ICEPOLE–128 and ICEPOLE–128a.

Key words: ICEPOLE, Authenticated cipher, CAESAR, Differential-linear cryptanalysis

1

Introduction

ICEPOLE is a new hardware-oriented single-pass authenticated cipher designed by Morawiecki et al. . It was submitted to the CAESAR competition [14] and published in CHES 2014 [15]. ICEPOLE is designed to be hardware-efficient. It can achieve 41 Gbits/s on the modern FPGA device Virtex 6 which is over 10 times faster than the equivalent implementation of AES-128-GCM [12]. ICEPOLE adopts the well-known duplex construction by Bertoni et al. [2] and uses a Keccak-like permutation as its iterative function. The ICEPOLE family of authenticated ciphers includes three variants: ICEPOLE-128, ICEPOLE–128a and ICEPOLE–256a. Note that the definition of ICEPOLE–128 has been slightly modified in the CHES 2014 version, which removed the use of secret message number. In this paper, we will follow the version submitted to the CAESAR competition. For the security of ICEPOLE, the designers claim that the confidentiality is the same as the key length, which is 128-bit for ICEPOLE–128 and ICEPOLE–128a and 256-bit for ICEPOLE–256a. The authentication security is 128-bit for all the three variants. In particular, it is mentioned in the document that the internal permutation is strong enough such that “in the case of nonce reuse the key-recovery attack is not possible” and the authenticity “is not threatened by nonce reuse”. In this paper, we apply the differential-linear cryptanalysis to study the security of the permutation of the ICEPOLE family. The differential-linear cryptanalysis is the combination of differential cryptanalysis [5] and linear cryptanalysis [11]. It was introduced by Langford and Hellman in [9] in 1994 to attack the block cipher DES, and later Biham, Dunkelman and Keller gave an enhanced version of this method [3] . This method has been applied to analyze a number of block ciphers such as Serpent [4, 7], CTC2 [8, 10] and SHACAL–2 [16]. Differential-linear attack is also successful in the analysis of certain stream ciphers, e.g., Phelix which involves the message in the state update function [17]. Although the design of authenticated cipher ICEPOLE is different from block ciphers, we manage to exploit the differential-linear property of the permutation when the nonce is reused 1 . We show 1

Here nonce includes the public message number and secret message number. The associated data is set to be identical or empty which is generally allowed.

2

Tao Huang, Ivan Tjuawinata, Hongjun Wu

that under the nonce-reuse assumption, there exists distinguishing attacks on ICEPOLE with both time and data complexity less than 236 . Furthermore, it is possible to recover the 256 bits unknown state of ICEPOLE–128 and ICEPOLE–128a with practical complexity 246 , and recover the 320 bits unknown state of ICEPOLE–256a with complexity 260 . We experimentally verified our results by recovering the state of ICEPOLE–128 using a 64-core server within 10 days. Thus, the security claims of ICEPOLE do not hold under the nonce-reuse circumstances. Due to the analysis of this paper, the designers have updated the security claims as “in the case of nonce misuse, the intermediate level of robustness (specified in the documentation) holds only when the SMN is present and respected, namely each message has the corresponding, unique secret message number ” [13]. In the presence of unique SMN, the attack in this paper will no longer work. The reason is that the unique SMN plays the role of the nonce in the initialization, and prevents the differential attack in the message processing. The rest of this paper is structured as follows: The specification of ICEPOLE is given in Section 2. Section 3 describes a differential-linear distinguishing attack on ICEPOLE. Section 4 introduces the state-recovery attack. Section 5 provides our experimental results of the state-recovery attack on ICEPOLE-128. Section 6 concludes the paper.

2

The ICEPOLE Authenticated Cipher

The ICEPOLE family of authenticated ciphers uses three parameters: key length (128 or 256 bits), secret message number (SMN) length (0 or 128 bits) and nonce length (96 or 128 bits). ICEPOLE –128 has 128-bit secret message number, 128-bit key, and 128-bit nonce. The other two variants, ICEPOLE –128a and ICEPOLE –256 a, have no secret message number with 128- and 256-bit secret key respectively. The nonce length for these two variants is 96-bit. We will briefly describe the specification of ICEPOLE authenticated cipher. The full specification can be found in [14]. An overview of ICEPOLE–128 is provided in Figure 1.

Fig. 1. General scheme of ICEPOLE encryption and authentication (Fig. 1 of [14])

2.1

Notations

The ICEPOLE algorithm has a 1280-bit internal state S. It uses the little-endian convention. The organization of internal state is similar to Keccak [1], which uses a 3-dimension structure. Therefore, the 1280-bit state S can be represented as S[4][5][64], or shortly S[4][5], an array of 64-bit words. For S[x][y][z], it is corresponding to the 64(x + 4y) + z-th bit of the input. ICEPOLE uses 4 × 5 slices. Each slice has 4 rows and 5 columns. And Sbnc denotes the first n bits of the state.

Differential-Linear Cryptanalysis of ICEPOLE

2.2

3

The ICEPOLE permutation P

The permutation P is applied iteratively on the ICEPOLE state S during the encryption and authentication. Each permutation is called a round or R. The 6–and 12–round of P are represented as P6 and P12 respectively. Each round includes five operations: µ, ρ, π, ψ, and κ. R=κ◦ψ◦π◦ρ◦µ The operations are defined as follows: µ: A column vector (Z0 , Z1 , Z2 , Z3 ) is multiplied by a constant matrix to produce a vector of four 5-bit words. 2Z0 + Z1 + Z2 + Z3 Z0 2 1 1 1 1 1 18 2 Z1 Z0 + Z1 + 18Z2 + 2Z3 1 2 1 18 × Z2 = Z0 + 2Z1 + Z2 + 18Z3 Z0 + 18Z1 + 2Z2 + Z3 Z3 1 18 2 1 The operations are done in GF (25 ). The irreducible polynomial x5 + x2 + 1 is used for the field multiplication. And µ can be efficiently implemented with simple bitwise equations, see Appendix B in [14]. ρ: The ρ step is the bitwise rotation on each of the 20 64-bit words. It is defined as: ρ(S[x][y]) = S[x][y] ≪ offsets[x][y]

for all(0 ≤ x ≤ 3), (0 ≤ y ≤ 4)

The rotation offsets are as follows: offset[0][0] := 0 offset[0][4] := 18 offset[1][3] := 45 offset[2][2] := 43 offset[3][1] := 55

offset[0][1] := 36 offset[1][0] := 1 offset[1][4] := 2 offset[2][3] := 15 offset[3][2] := 25

offset[0][2] := 3 offset[1][1] := 44 offset[2][0] := 62 offset[2][4] := 61 offset[3][3] := 21

offset[0][3] := 41 offset[1][2] := 10 offset[2][1] := 6 offset[3][0] := 28 offset[3][4] := 56

π: π reorders the bits within each slice. It maps S[x][y] to S[x0 ][y 0 ] using following rule: - x0 := (x + y) mod 4 - y 0 := (((x + y) mod 4) + y + 1) mod 5 φ: φ is the S-box layer. ICEPOLE uses following 5-bit S-box: { 31, 9, 18, 11, 5, 12, 22, 15, 10, 3, 24, 1, 13, 4, 30, 7, 20, 21, 6, 23, 17, 16, 2, 19, 26, 27, 8, 25, 29, 28, 14, 0 } The φ applies the 5-bit S-box to all the 256 rows of the state. κ: In κ the 64-bit constant is xored with S[0][0]. The constants are different for each round and we omit the values here.

4

Tao Huang, Ivan Tjuawinata, Hongjun Wu

2.3

Initialization

First, the state S is initialized with 1280-bit constant: S[0][0] := 0XFF97A42D7F8E6FD4 S[0][1] := 0X90FEE5A0A44647C4 S[0][2] := 0X8C5BDA0CD6192E76 S[0][3] := 0XAD30A6F71B19059C S[0][4] := 0X30935AB7D08FFC64 S[1][0] := 0XEB5AA93F2317D635 S[1][1] := 0XA9A6E6260D712103 S[1][2] := 0X81A57C16DBCF 555F S[1][3] := 0X43B831CD0347C826 S[1][4] := 0X01F22F1A11A5569F S[2][0] := 0X05E5635A21D9AE61 S[2][1] := 0X64BEFEF28CC970F2 S[2][2] := 0X613670957BC46611 S[2][3] := 0XB87C5A554FD00ECB S[2][4] := 0X8C3EE88A1CCF32C8 S[3][0] := 0X940C7922AE3A2614 S[3][1] := 0X1841F924A2C509E4 S[3][2] := 0X16F53526E70465C2 S[3][3] := 0X75F644E97F30A13B S[3][4] := 0XEAF1FF7B5CECA249 Then, the key (K) and the nonce are XORed to the state. The nonce is 128-bit for ICEPOLE–128. And the 96-bit nonce for ICEPOLE–128a and ICEPOLE–256a will be padded with 32 zeros to form a 128-bit nonce. nonce0 and nonce1 denote two 64-bit words of the padded nonce. For ICEPOLE–128 and ICEPOLE–128a, K0 and K1 denote two 64-bit words of the key, S[0][0] := S[0][0] ⊕ K0 S[1][0] := S[1][0] ⊕ K1 S[2][0] := S[2][0] ⊕ nonce0 S[3][0] := S[3][0] ⊕ nonce1

For ICEPOLE–256a, K0 , K1 , K2 and K3 denote four 64-bit words of the key, S[0][0] := S[0][0] ⊕ K0 S[1][0] := S[1][0] ⊕ K1 S[2][0] := S[2][0] ⊕ K2 S[3][0] := S[3][0] ⊕ K3 S[0][1] := S[0][1] ⊕ nonce0 S[1][1] := S[1][1] ⊕ nonce1

After that, the P12 permutation is applied to the state S. 2.4

Processing associated data and plaintext

ICEPOLE–128 uses 128-bit secret message number (SMN) σ SM N . It will be processed before associated data and the plaintext. Since ICEPOLE–128a and ICEPOLE–256a do not have SMN, only the associated data σiAD and the plaintext σiP will be processed. For ICEPOLE–128 and ICEPOLE–128a, the length of blocks σiAD and σiP is in the range [0, 1024] bits. The blocks will be padded to 1026 bits according to following rule. First, a frame bit is appended. It is set to ‘1’ for the last σ AD block and all σiP except the last one. For all other blocks, it is set to ‘0’. Then, a bit ‘1’ is appended, following by ‘0’s to make the length of padded block be 1026-bit. The number of blocks under a single key is less than 2126 . For ICEPOLE–256a, the associated data and plaintext blocks have length in the range [0, 960]. The same padding rule is applied and the padded blocks have length 962-bit. The number of blocks under a single key is less than 262 . The process of secret message number is as below for ICEPOLE–128: cSM N = Sb128c ⊕ σ SM N

Differential-Linear Cryptanalysis of ICEPOLE

5

σ SM N := pad(σ SM N ) Sb1026c := Sb1026c ⊕ σ SM N S := P6 (S) The process of associated data and plaintext blocks is as below: for all blocks σiAD { σiAD := pad(σiAD ) Sb1026c := Sb1026c ⊕ σiAD S := P6 (S) } for all blocks σiP { ci := Sblc ⊕ σiP (l is the length of σiP ) σiP := pad(σiP ) Sb1026c := Sb1026c ⊕ σiP S := P6 (S) } 2.5

Tag generation

After the AD and P are processed, the 128-bit tag T is derived: (T0 and T1 are two 64-bit words of T). T0 := S[0][0] T1 := S[0][1] The decryption and verification is trivial and we omit it here. 2.6

Security goals of ICEPOLE

The main security goals of ICEPOLE are: 128-bit encryption security for ICEPOLE–128 and ICEPOLE–128a; 256-bit encryption security for ICEPOLE–256a; and 128-bit authentication security for all variants. An important property of ICEPOLE is that the intermediate level of robustness under noncemisuse circumstance. It claimed that 1. “...in the case of nonce reuse the key-recovery attack is not possible”. 2. “Authenticity (integrity) in the duplex construction does not need a nonce requirement, thus is not threatened by nonce reuse.

3

Differential-Linear Distinguishing Attack on ICEPOLE

In [14], the designers have performed initial cryptanalysis on ICEPOLE, including the differential cryptanalysis, linear cryptanalysis and rotational cryptanalysis. In this section, we will revisit the cryptanalysis in [14] and introduce our analysis on ICEPOLE based on the differential-linear cryptanalysis. We would like to emphasize that our attacks only work under the assumption that the nonce and secret message number can be reused. The main idea is to query messages with certain input difference and analyze the statistics of the differences of chosen bits (according to the linear mask) in the output. When the XORed differences of the chosen bits have a significant bias from 0.5, the adversary can distinguish the cipher from a random permutation. In [10], Lu studied the implicit assumptions made in [9] and [3] and gave a theorem to compute the probability for the differential-linear distinguisher under the original two assumptions: 1. The involved round functions behave independently.

6

Tao Huang, Ivan Tjuawinata, Hongjun Wu

2. The two inputs E0 (P ) and E0 (P ⊕ α) of the linear characteristic for E1 behave as independent inputs with respect to the linear characteristic, where E0 is the encryption for the differential rounds and E1 is the encryption for the linear rounds, P is the plaintext and α is the input difference. Let pˆ be the probability that the input of the linear mask has no XORed difference after the differential step while be the linear characteristic bias for the linear step. The theorem says that p − 1)2 . the probability that the XORed difference of the output linear mask to be 0 is 12 + 2(2ˆ In [6], C´eline et al. further developed a method on computing the bias which only relies on the independence of the two parts of the cipher. Hence, when the bias is large enough, it is possible to distinguish it from a random permutation. In the analysis on ICEPOLE, we first divide the encryption P6 into two parts with equal number of rounds. So the first 3 rounds will be the differential step and the last 3 rounds will be the linear step. Our task is to find good 3-round differential characteristics and 3-round linear characteristics. Unless otherwise specified, we are discussing the ICEPOLE–128 and ICEPOLE–128a in this section under the nonce-misuse assumption. The ICEPOLE–256a is similar and we will discuss it later. 3.1

Constructing the differential characteristics

In ICEPOLE, S-box is the only non-linear operation, and the maximum differential probability of the S-box is 2−2 . The differential probability is largely determined by the number of active Sboxes. Although the designers expected that only 3% of the difference transitions has the maximum probability 2−2 , this probability is not rare in the early rounds. This is because most of the active S-boxes have 1 bit input difference after the diffusion and the differential probability of a single ICEPOLE S-box is 2−2 when the input and output differences are identical with weight 1. Those 1-to-1 identical difference transitions are preferred to the 1-to-n (n ≥ 2) cases even with the same probability as they will propagate to less number of active S-boxes in the next round. In [14], the designers analyzed the minimum number of active S-boxes using SAT solver and found that the minimum number is 9 for 3 rounds ICEPOLE. There is an intuitive way to construct the differential characteristics reach this lower bound. We can place 1 active S-box in the middle round and let it propagate backward and forward for one round. Then the number of active S-boxes will be in the form of 4–1–4, which reaches the minimum number 9. However, the above 3 round differential path is not feasible. In fact, only at most 1024 bits(for ICEPOLE–256a, 960 bits) out of the 1280 bits in a state can be affected by the plaintext and are feasible to introduce difference. Thus, we have the following observation on the operation µ. Observation 1 For any 20-bit slice, when the output difference is one bit after µ, there is at least one bit input difference at the last column (S[·][4]). This observation can be easily verified by analyzing the inverse operation of µ. Therefore, when an active S-box is propagated backward, it is likely that a number of active bits will be propagated to the last column of the input state, which is infeasible to introduce. And we programed to verify that it is impossible to find such feasible 4–1–4 differential path. It implies that for any feasible differential characteristics of ICEPOLE, the minimum number of active S-boxes is 2 in the first round. As a result, we will consider the differential characteristics with two active S-boxes in the first round. The ideal case is that the output difference of each active S-box is only 1-bit. Then after µ, this 1-bit difference in a slice will propagate to 4 bits. So we expect the good differential characteristics will have 8 active S-boxes in round 2 and no more than 32 active S-boxes in round 3. We searched for good 3-round differential characteristics, and found 5 possible initial differences in Table 1 (the rotated differences on the 64-bit word are not considered here), which can lead to feasible 3-round differential characteristics. We will name them as D1 , D2 , D3 , D4 and D5 . The total number of 3-round active S-boxes is 42, which implies that the differential probability is at most 2−84 . But in fact, the only requirement is that the input linear mask does not have XORed difference at the beginning of round 4. Hence, a large number of differential characteristics

Differential-Linear Cryptanalysis of ICEPOLE (a) D1 0x0 0x1 0x1 0x0

0x0 0x0 0x1 0x1

0x1 0x1 0x1 0x0

0x0 0x0 0x1 0x0

(b) D2

0x0 0x0 0x0 0x0

0x0 0x1 0x0 0x1

0x0 0x1 0x1 0x0

0x1 0x1 0x0 0x1

(c) D3

0x0 0x1 0x1 0x0

0x0 0x0 0x0 0x0

(d) D4 0x0 0x1 0x1 0x0

0x0 0x0 0x1 0x1

0x0 0x1 0x1 0x0

0x0 0x0 0x1 0x1

7

0x1 0x1 0x0 0x1

0x1 0x0 0x0 0x1

0x0 0x0 0x0 0x1

0x0 0x1 0x1 0x0

0x0 0x0 0x0 0x0

(e) D5

0x0 0x0 0x0 0x0

0x0 0x0 0x1 0x1

0x0 0x1 0x0 0x1

0x0 0x0 0x1 0x1

0x0 0x1 0x0 0x1

0x0 0x0 0x0 0x0

Table 1. The initial differences. Each entry represents a 64-bit word.

are satisfied. As long as the number of active S-boxes is relatively low (note that 32 is only 1/8 of the total 256 S-boxes), there is a high probability for the differential step. 3.2

Constructing the linear characteristic

The bias of an ICEPOLE linear characteristic is determined by the S-boxes involved in the linear characteristic. We use the following method to construct the 3-round linear characteristics. Take a single bit in both the input and output masks of an S-box in the middle round. Then find related bits in the input of the first round (input linear mask ) and the output of the third round before S-box (output linear mask ), assuming that the S-boxes are identical mappings. The rationale 3 , it nearly reaches the maximum bias 2−2 here is that the 1-bit identical mappings have a bias 16 while keeping the masks low-weight. The most essential criterion for the input and output masks is low-weight. When the input linear mask has lower weight, the probability that the input linear mask does not have XORed difference after the 3-round differential step will be higher. When the output linear mask has lower weight, less number of active S-boxes in round 6 will be involved in the linear relation. Two good linear characteristics we found are given in Table 2. They are denoted as L1 and L2 . (a) Linear characteristic 1, input linear mask 0x0 0x40 0x0 0x0

0x800000000000000 0x0 0x2000000000 0x0

0x2000000000 0x800000000000000 0x800000000000000 0x800002000020000

0x20000 0x2000020000 0x40 0x0

0x800000000000040 0x0 0x2000020000 0x40

(b) Linear characteristic 1, output linear mask 0x40000 0x0 0x0 0x0

0x1 0x4 0x200000 0x0

0x0 0x0 0x2000000000000000 0x20000000000

0x80000000000 0x0 0x0 0x100000000000000

0x0 0x0 0x0 0x0

(c) Linear characteristic 2, input linear mask 0x120000004000000 0x8020000000000000 0x8100000000000000 0x4000000

0x8000000000000000 0x4000000 0x20000000000000 0x8100100000000000

0x100000000000 0x8000100000000000 0x0 0x0

0x0 0x0 0x100000000000 0x0

0x0 0x100000000000000 0x4000000 0x20100000000000

(d) Linear characteristic 2, output linear mask 0x40000 0x8000 0x0 0x0

0x1 0x4 0x0 0x400

0x0 0x0 0x2000000000000000 0x20000000000

0x0 0x0 0x0 0x100000000000000

0x0 0x0 0x0 0x0

Table 2. The linear characteristics, assuming S-boxes are identical mappings

3.3

Observations to improve the attack

The following observation will be helpful in the selection of the differential characteristics.

8

Tao Huang, Ivan Tjuawinata, Hongjun Wu

Observation 2 When the 1024 bits S[0..3][0..3] of an ICEPOLE state S are known, we are able to determine S[0][2], S[1][3], S[2][4], S[3][0], S[3][2] after µ, ρ and π operations are performed on S. According to the above observation, we are able to determine the values of certain input bits to the S-boxes in the first round. Thus, we can update the differential tables for the S-boxes with fixed input values. The following example shows how this will help to improve the differential probability. Suppose the input of an S-box is in the form “1 ∗ 0 ∗ ∗”, where the ‘∗’s are the unknown bits and the ‘1’ and ‘0’ are the bits with fixed values, and the input difference is 2, then the output difference is 2 with probability 1, which is larger than the 0.25 in the general case. To get the fixed values, we can manipulate the input bits according to the mask in Table 3.

0x0 0x800000010 0x0 0x800000010

0x800000010 0x0 0x800000010 0x0

0x200000001 0x0 0x0 0x800000010

0x0 0x200000001 0x200000001 0x200000001

0x0 0x0 0x0 0x0

Table 3. Input mask for the fixed bits in round 1.

By setting the bits in S[0][1] selected by the mask 0x800000010 as ‘1’ and all the other bits selected by the other masks as ‘0’, the two active S-boxes in round 1 will satisfy the above condition on the input values. Our next observation is about the S-box in the round 6 of P6 . Observation 3 When 4 of the 5 bits in the output of an S-box are known, it is possible to recover some of the input bits from the output bits of the S-boxes. Table 4 provides the probability that we can recover a bit at each position of the S-box input.

Position

0 1 2 3 4

Probability

6 5 4 4 1 8 8 8 8 8

Table 4. Probability that the input bits can be recovered for an S-box at round 6.

Since the ciphertext is generated by XORing the keystream and the plaintext, we can obtain the values of at most 1024 bits in the state after 6 rounds. For ICEPOLE–128 and ICEPOLE–128a, 4 of the 5 bits in the output of all the S-boxes are known. And for ICEPOLE–256a, 4 of the 5 bits for 192 S-boxes and 3 of the 5 bits for 64 S-boxes in the output are known. Therefore, instead of using the bias of the linear relation in the round 6, we can recover the chosen input bits in round 6 before S-box by enumerating the values of output bits. This can then be used to recover the value of the bit in the linear relation in the output of round 5. For example, if we consider the output linear mask of L1 , the probability that we can recover the 8 bits can be computed as P rL1 = (3/4) × (5/8)3 × (1/2)4 = 2−6.45 . And using these 8 bits, we can compute the value of bit S[1][1][0] at the output of round 5. Similarly, the probability for L2 is P rL2 = (3/4)2 × (5/8)3 × (1/2)3 = 2−5.86 .

Differential-Linear Cryptanalysis of ICEPOLE

3.4

9

Concatenating the differential and linear characteristics

After the 3-round differential characteristics and 3-round linear characteristics are constructed, we can concatenate them to form 6-round differential-linear characteristics. First, we choose the initial difference D2 because the two active S-boxes in round 1 are in the last row which has two known bits. The difference of state before the S-box for D2 is given in Table 5. 0x0 0x0 0x0 0x0

0x0 0x0 0x0 0x400

0x0 0x0 0x0 0x20000000000

0x0 0x0 0x0 0x0

0x0 0x0 0x0 0x0

Table 5. The difference of state before the first S-box in D2 .

To choose the 3-round linear characteristic, we consider all the possible rotations of the two linear characteristics L1 and L2 given in Section 3.2. There are 128 possible rotated linear characteristics. The selection of linear characteristic is done experimentally: 1. Randomly pick 1024-bit plaintext blocks pairs with the chosen initial difference D2 and the fix values (1, 0) for the two known bits at positions 0 and 2 in the round 1 active S-boxes. 2. After 5 rounds, verify whether the XORed difference of the states under the linear characteristic is zero or not. If it is zero, add it to a counter cntSame. 3. Repeat the above process, and choose the one with highest bias at cntSame from 0.5. For D2 as the initial difference, the highest bias is 2−9.2 when the linear characteristic L1 left rotated by 33 bits is used. We remark that since round independence does not hold in the case of ICEPOLE, the experimental results would be better choices for the biases rather than the theoretical estimation. From the bias above, we immediately get a distinguishing attack on ICEPOLE. 1. Generate 233.9 pairs of two-block plaintext such that the first 1024-bit plaintext block has initial difference D2 and the fixed values as specified above. All the other bits are random. 2. Use ICEPOLE to encrypt the plaintext blocks and then decide the 1024 bits input and output state of the first P6 . Discard those pairs if the two bits in round 5 output in the linear characteristic L1 left rotated 33 bits cannot be recovered. Note that for each bit, the probability is 2−6.45 to recover. So there are 221 pairs left. Then compute the bit at position S[1][1][33] of the output of round 5 using the recovered bits from round 6. 3. Analyze the bias of the XORed difference of those two bits (S[1][1][33]). If the bias is larger than 2−10.2 we conclude that it is the ICEPOLE encryption. The success probability is computed by using the normal distribution to approximate the binomial distribution of the bias. For ICEPOLE, the bias is a random variable X ∼ N (n(1/2 + 2−9.2 ), n(1/2 + 2−9.2 )(1/2 − 2−9.2 )), where n is the number of pairs of the recovered 5 rounds output. When n = 221 , the probability that X ≥ 2−10.2 is 99.3%. A random permutation, on the other hand, has its bias to be a random variable Y ∼ N (n/2, n/4). The value is larger than 2−10.5 with probability 0.7%. Hence, we have a very good chance to distinguish the ICEPOLE encryption from a random permutation by using 235.9 plaintext blocks (a pair of two-block messages are counted as 4 plaintext blocks), assuming the nonce can be reused.

4

State Recovery Attack on ICEPOLE

In this section, we will use the differential-linear characteristics to launch state recovery attacks on ICEPOLE. 4.1

State recovery attacks on ICEPOLE–128 and ICEPOLE–128a

For ICEPOLE–128 and ICEPOLE–128a, there are 256 unknown bits in the state before P6 . They are in the last column of each slice. For convenience, we denote those four 64-bit unknown words in the last column as {U0 , U1 , U2 , U3 } according to the row index. We will recover them step by step.

10

Tao Huang, Ivan Tjuawinata, Hongjun Wu

4.1.1

Recovering U0 and U3

To recover the unknown bits in the state, we will analyze the input values at the active S-boxes. For D2 , there are two active S-boxes in round 1. One has input difference 2 and the other has input difference 4. We will focus on the one with input difference 4 and denote the five input bits as b0 , b1 , ..., b4 according to their positions (b0 being the least significant bit). When the two bits of an ICEPOLE S-box are fixed with values b0 = 1 and b2 = 0, there are 8 possible values for the remaining three bits. When the input difference is 4 (at b2 ), the output difference have weight 1 only when b1 = 1 and b3 = 0. Intuitively, the lower the weight of the output difference after the first round, the higher the probability that there is no XORed difference at the output linear mask after 5 rounds. Hence, it is possible to relate the input value of the active S-box in round 1 to the bias of XORed output difference in round 5. We experimentally find the following biases for different values of the input bit b1 and b3 . Note that we collected 230 data in the experiments to compute the bias and repeated for several time. When the bias is less than 2−14 , the experimental results were not very stable, and the average number is listed in the table. In fact, it is not necessary to consider those low biases as only the highest ones could be useful for our analysis.

b1 b1 b1 b1

values = 0, b3 = 0 = 1, b3 = 0 = 0, b3 = 1 = 1, b3 = 1

bias (log based 2) −13.0 −7.3 −13.9 −11.9

Note that b1 is related to an unknown bit in U3 , and b3 is related to two unknown bits, in U0 and U3 . So we have following relations: b1 = U331 ⊕ a0 and b3 = U049 ⊕ U349 ⊕ a1 , where U x is the x-th bit in the 64-bit word U ; a0 and a1 are constants which can be computed from the 1024-known bits. We describe the state recovery process given as below: 1. Generate 233.9 pairs of two-block plaintext satisfied following requirements. The first block of the plaintext has difference D2 and each active S-box has fixed values ‘1’ and ‘0’ in bit 0 and 2 respectively. All the other bits are random. 2. Use ICEPOLE to encrypt the plaintext blocks and then decide the 1024 bits input and output state of the first P6 . Discard the pairs if the two bits in round 5 output in the linear characteristic L1 left rotated 33 bits cannot be recovered. There are 221 pairs left. Then compute the bit at position S[1][1][33] of the output of round 5 using the recovered bits from round 6. 3. If the two bits at the position S[1][1][33] are the same, we compute the values of a0 and a1 according to the input and increase the counter for the value of (a0 , a1 ) by 1. 4. Suppose the largest counter of (a0 , a1 ) takes value a0 = v0 and a1 = v1 , we guess that U331 = v0 ⊕1 and U049 ⊕ U349 = v1 . 5. By rotating the differential-linear characteristic for the other 63 bits, we can recover the two 64-bit unknown words U0 and U3 . The success probability of this scheme is equivalent to the probability that a random variable X ∼ N (n(1/2 + 2−7.3 ), n(1/2 + 2−7.3 )(1/2 − 2−7.3 )) has value greater than the random variable Y ∼ N (n(1/2 + 2−11.9 ), n(1/2 + 2−11.9 )(1/2 − 2−11.9 )). When n = 219 , the probability is almost 1. Since we have 221 pairs of input, each of the four choices of b0 and b1 will be around 219 pairs. We remark that here the probability is high enough such that even if the experiment is repeated for 64 times, the success probability is still close to 1.

Differential-Linear Cryptanalysis of ICEPOLE

4.1.2

11

Recovering U2

Assuming that U0 and U3 have been recovered correctly, we can use similar method to recover U2 . In this case, we use D1 and L2 left rotated by 58 bits as the differential-linear characteristic. The difference of state before the first round S-box for D1 is given in the Table 6. 0x0 0x0 0x0 0x0

0x0 0x4 0x200000 0x0

0x0 0x0 0x0 0x0

0x0 0x0 0x0 0x0

0x0 0x0 0x0 0x0

Table 6. The difference of state before the first S-box in D1 .

Fixed bits: With the knowledge of U0 and U3 , we fix bit 0, 2, 3 with the values (1, 0, 0) respectively for the active S-box at the second row. This is to ensure the output difference of this active S-box has weight exactly 1. And we fix bit 0, 4 with the values (1, 1) for the active S-box at the third row. Then, the weight of output difference can be distinguished from the input value of bit 2, which is denoted as b2 . We experimentally find the following biases for b2 after 5 rounds. The biases are based on the difference of the output bit at position S[3][1][58].

values bias (log based 2) b2 = 0 −11.0 b2 = 1 −15.4

From the active S-box at the third round, it is possible to find following relation: b2 = U224 ⊕ a, where the constant a can be computed from the known input bits. The state recovery process process: 1. Generate 236.7 pairs of two-block plaintext satisfying following requirements. The first block of the plaintext has difference D1 and each active S-box has fixed values according to the above paragraph. All the other bits are random. 2. Use ICEPOLE to encrypt the plaintext blocks and then decide the 1024 bits input and output state of the first P6 . Discard the pairs if the two bits in the linear relation (according to L2 rotated by 58 bits) in the output of round 5 cannot be recovered. There are 225 pairs left. Then compute the bit at position S[3][1][58] of the output of round 5 using the recovered bits from round 6. 3. If the two bits at the position S[3][1][58] are the same, we compute the value of a according to the input and increase the counter for that value by 1. 4. Suppose the largest counter of a take value a = v, we guess that U10 = v. 5. By rotating the differential-linear characteristic for the other 63 bits, we can recover the 64-bit unknown word U1 . The estimated success probability is 99.6% for each bit. 4.1.3

Recovering U1

At this stage, we assume that U0 , U2 and U3 have been recovered correctly. In this case, we select D3 and L2 left rotated by 35 bits as the differential-linear characteristic. The differential of state before the first round S-box for D3 is given in the Table 7.

12

Tao Huang, Ivan Tjuawinata, Hongjun Wu 0x0 0x8000 0x0 0x0

0x0 0x0 0x0 0x0

0x80000000000000 0x0 0x0 0x0

0x0 0x0 0x0 0x0

0x0 0x0 0x0 0x0

Table 7. The differential of state before the first S-box in D3 .

Fixed bits: We fix bit 1, 2, 4 with the values (1, 0, 1) for the active S-box at the first row. It is to ensure that the weight of output difference of this active S-box is determined by the value of input bit 3 which is denoted as b0,3 . And we fix bit 0, 2, 3, 4 with the values (0, 1, 1, 1) for the active S-box at the second row. It is to ensure that the weight of output difference is determined by the value of input bit 1 which is denoted as b1,1 . The b0,3 and b1,1 are related to the unknown bits: b0,3 = a0 ⊕ U112 b1,1 = a1 ⊕ U113 where a0 and a1 are constants which can be computed from the known input. We experimentally find the following biases for different values of the input bit b0,3 and b1,1 . The biases are base on the difference of the output bit at position S[3][1][35].

b0,3 b0,3 b0,3 b0,3

values = 0, b1,1 = 1, b1,1 = 0, b1,1 = 1, b1,1

=0 =0 =1 =1

bias (log based 2) −11.2 −15.2 −16.4 −14.8

We remark that the biases other than the first row in above table may not be very accurate considering the small bias. The state recovery process process: 1. Generate 237.7 pairs of two-block plaintext satisfied following requirements. The first block of the plaintext has difference D3 and each active S-box has fixed values according to the above paragraph. All the other bits are random. 2. Use ICEPOLE to encrypt the plaintext blocks and then decide the 1024 bits input and output state of the first P6 . Discard those pairs if the two bits in the linear relation (according to L2 rotated by 35 bits) in the output of round 5 cannot be recovered. There are 226 pairs left. Then compute the bit at position S[3][1][35] of the output of round 5 using the recovered bits from round 6. 3. If the two bits at the position S[3][1][35] are the same, we compute the value of a0 and a1 according to the input and increase the counter for (a0 , a1 ) by 1. 4. Suppose the largest counter of (a0 , a1 ) take value a0 = v0 and a1 = v1 , we guess that U112 = v0 and U113 = v1 . 5. By rotating the differential-linear characteristic for the other 31 even bits less than 64, we can recover the 64-bit unknown word U1 . Note that in each rotation of the differential-linear characteristic, we are able to recover two consecutive bits, so we only need to test 32 rotations to recover the 64-bit U1 . The estimated success probability is 98.7% for each bit. 4.1.4

Correcting the recovered state

In the previous state recovery attack, only the first 128 bits can be correctly recovered with probability almost 1. For the other 128 bits, although the success probability is around 99%, it is

Differential-Linear Cryptanalysis of ICEPOLE

13

still possible that the state we recovered has some error bits. Whether the state is correct can be easily verified by encrypting some new messages and compare the ciphertext. We can correct up to 7 error bits with relatively low complexity. To correct i error bits is to choose any i bits from the 128 bits and flip the values. Then test whether the modified unknown state is correct. Suppose that for the 128 bits U1 and U2 , the probability that each bit is correct is 0.99, we can compute the probability that the number of error bits is less than 8 as 7 X 128 i=0

i

× .99128−i × .01i = 0.99995.

The total number of encryptions to correct up to 7 error bits is 237.5 , which is negligible to the whole attack. 4.1.5

Summary of the attack

The data complexity is: - U0 and U3 : 2 × 2 × 233.9 × 26 . We multiply 233.9 by 2 two times due to the fact that we use 233.9 pairs of 2-blocks plaintext. - U2 : 2 × 2 × 236.7 × 26 - U1 : 2 × 2 × 237.7 × 25 - Total: 245.8 The time complexity is the 245.8 encryptions of the one block plaintext and the possible 237.5 encryptions for correction. The memory cost is mainly on the storage of some counters, which is negligible. The success rate of this attack is close to 1, and can be adjusted through the number of input messages. 4.2

State recovery attack on ICEPOLE–256a

In the case of ICEPOLE–256a, there are 320 unknown bits in the input and output states in the encryption of a block. In addition to the U0 to U3 in the previous subsection, we use U4 to denote the unknown 64-bit word S[3][3]. Different from the ICEPOLE–128 and ICEPOLE–128a, in the last row of the output in ICEPOEL–256a, there are only 3 known bits instead of 4 known bits. Consequently, it is impossible to recover the input bits given the output bits for that row. To deal with this issue, we have to consider the linear relation of the input mask of the S-box. When the value of the input mask is less than 8, the largest bias is 3/16, but if the value of the input mask is 8, the largest bias is 1/16. For L1 , there are two mask bits placed in the last row, one is 4 and the other is 8. From the Piling-Up Lemma [11], the bias is 2−5.4 . For L2 , there are three mask bits placed in the last row, 2, 4, and 8. By Piling up Lemma, the bias is 2−6.8 . To increase the bias of the last row, we introduce another linear characteristic L3 (Table 4.2), which has only 1 bit in the last row of the input mask of round 6 S-box. For L3 , the bias is 3/16 and the probability to recover other bits is 2−9.6 . To recover the U4 , we use the differential characteristic D2 with linear characteristic L1 left rotated 33 bits, same as the recovery of U0 and U3 . Since the value of S[3][2] is not known, we will only fix the active bits in S[3][0], setting it to 1. We will find the value of bit 2 in the input of active S-box related to S[3][1] with difference 0x400. We denote this bit b3,2 , and we have b3,2 = a ⊕ U433 , where a is a constant from the known input. We experimentally find the following biases for the input bit values b3,2 after 5 rounds. The biases are base on the difference of the output bit at position S[1][1][33].

14

Tao Huang, Ivan Tjuawinata, Hongjun Wu (a) input linear mask 0x0 0x410000000000000 0x400000000000000 0x410000000000000

0x10000000000000 0x0 0x10000000000000 0x0

0x20000 0x80000000000 0x2000000000 0x10000000000000

0x82000000000 0x20000 0x80000020000 0x2000020000

0x400000000000000 0x2000000000 0x0 0x80000000000

(b) output linear mask 0x40000 0x8000 0x8 0x0

0x0 0x4 0x200000 0x0

0x0 0x0 0x0 0x20000000000

0x80000000000 0x0 0x0 0x0

0x200000000000 0x0 0x100000000000 0x0

Table 8. The linear characteristic 3, assuming S-boxes are identical mappings

values bias (log based 2) b3,2 = 0 −9.2 b3,2 = 1 −15.3 (negative)

Considering the linear relation of this XORed value to the 4 bits in the two output states, the bias of the XORed difference of the 4 bits becomes 2−18 by Piling up lemma. To ensure the high probability of correctly guessing the value, we need around 240 pairs of two-block plaintext. The total data needed is around 258.7 . Next, the complexity of recovering the other unknown words need to be amended. We omit the similar attack process and discuss the estimated complexities here. For U0 and U3 , the increased bias is 2−8.8 which needs to be compensated by around 217.6 additional messages. Furthermore, some of the bits from the S-box layer in round 6 can directly be recovered from the linear relation, hence increasing the success probability by a factor of 24 . The total effect is the data complexity becomes 255.5 . For U2 , D1 with L3 left rotated 13 bits will be used and for U1 , D3 with L1 left rotated 2 bits will be used. The increased bias is roughly 2−2.8 for both cases (including the increased 5-round bias). And the decreased probability for recovering the values before round 6 S-box has a factor 2−7.5 . Therefore, the total data complexity for recovering each bit is increased by 213.1 , which is 257.8 . Therefore, the total data complexity of the state recovery attack for ICEPOLE–256a is estimated to be 259.8 , which is less than the constraint 262 . 4.3

Implications of the state recovery attacks

For ICEPOLE–128, the state recovery attack implies the failure of encryption security if both the nonce and the secret message number are reused. When an adversary has the full knowledge of a state, he can invert the cipher until the secret message number is injected. Thus, the adversary can decrypt arbitrary plaintext blocks. It also implies a forgery attack on the authentication of plaintext and associated data since the valid tag for any modified associated data or plaintext block can be computed. Since both the key and secret message number are unknown, the adversary cannot recover the key. For ICEPOLE–128a and ICEPOLE–256a, the state recovery attack implies the whole security is broken if the nonce is reused. The initialization of ICEPOLE–128a is invertible, so the adversary can directly recover the secret key from the known state. Then both the encryption and authentication are insecure. Summary of the security of ICEPOLE under our analysis when nonce is reused is given in Table 9.

5

Experimental Results

We experimentally verified the state recovery attack on ICEPOLE–128a. And we managed to recover the 256 unknown bits practically. The experiments used the state after the initialization with all zeros IV and key. The unknowns states are:

Differential-Linear Cryptanalysis of ICEPOLE

confidentiality for the plaintext confidentiality for the secret message number integrity for the plaintext integrity for the associated data integrity for the secret message number integrity for the public message number

15

ICEPOLE-128 ICEPOLE-128a ICEPOLE-256a 46 46 60 128 46 46 60 46 46 60 46 46 46 60

Table 9. Number of bits of security when nonce and secret message number can be reused.

U0 U1 U2 U3

= 0x1e7aed5bf aeb535f = 0xe0dcc6422595e5ba = 0x892bf 76586876c23 = 0x8b2ef 3bf 50e902f 6

To recover U0 and U3 , we run the attack in Sect. 4.1.1 on a server with 64 cores (AMD Opteron(tm) Processor 6276). Instead of checking the constants from input, we used an equivalent method: directly extract the input bits of the active S-box in the first round, and decide whether they are the estimated ones. The number of plaintext pairs we used is 234 for each bit. The attacks takes 15.3 hours and all the 128 bits recovered are correct. To recover U2 , we run the attack in Sect. 4.1.2 on the server. The number of plaintext pairs we used is 237 for each bit. The attacks takes 3.5 days and all the bits are correct. To recover U1 , we run the attack in Sect. 4.1.3 on the server. The number of plaintext pairs we used is 238 for each bit. The attacks takes 3.5 days and there is one error bit. Since the number of error bits is very small, the experiments show that our state recovery attack indeed works for ICEPOLE–128a.

6

Conclusion

In this paper, we analyzed the security of the ICEPOLE family of authenticated ciphers using the differential-linear cryptanalysis when nonce is misused. ICEPOLE is strong against differential cryptanalysis since only part of the input difference is affected by message in the attack (so the best differential attack against the permutation cannot be applied to break ICEPOLE); and ICEPOLE is strong against linear cryptanalysis since only part of the input and output of the permutation are known in the attack (so the best linear attack against the permutation cannot be applied to break ICEPOLE). We successfully developed the differential-linear cryptanalysis against ICEPOLE by bypassing the input/output constraints of ICEPOLE. Our attacks show that the states of all the ICEPOLE variants can be recovered, and the secret key of ICEPOLE–128a and ICEPOLE–256a can also be recovered. The security claims of ICEPOLE do not hold under the nonce misuse circumstances. From the attacks against ICEPOLE, the lesson we learned is that when we are designing a strong cipher based on a permutation, it is better to consider the best attacks against the permutation in which the input/output can affect the whole state. Furthermore, if the performance of a cipher is improved by considering input/output constraints, the designers should analyze whether the input/ output constraints could be bypassed in the attacks.

References 1. G. Bertoni, J. Daemen, M. Peeters, and G. Van Assche. Keccak Sponge Function Family Main Document. Submission to NIST (Round 2), 2009. 2. G. Bertoni, J. Daemen, M. Peeters, and G. Van Assche. Duplexing the sponge: Single-pass authenticated encryption and other applications. In A. Miri and S. Vaudenay, editors, Selected Areas in Cryptography, volume 7118 of Lecture Notes in Computer Science, pages 320–337. Springer Berlin Heidelberg, 2012. 3. E. Biham, O. Dunkelman, and N. Keller. Enhancing Differential-Linear Cryptanalysis. In Y. Zheng, editor, Advances in Cryptology–ASIACRYPT 2002, volume 2501 of Lecture Notes in Computer Science, pages 254–266. Springer Berlin Heidelberg, 2002.

16

Tao Huang, Ivan Tjuawinata, Hongjun Wu

4. E. Biham, O. Dunkelman, and N. Keller. Differential-Linear Cryptanalysis of Serpent. In T. Johansson, editor, Fast Software Encryption, volume 2887 of Lecture Notes in Computer Science, pages 9–21. Springer Berlin Heidelberg, 2003. 5. E. Biham and A. Shamir. Differential Cryptanalysis of DES-like Cryptosystems. Journal of Cryptology, 4(1):3–72, 1991. 6. C. Blondeau, G. Leander, and K. Nyberg. Differential-Linear Cryptanalysis Revisited. In Fast Software Encryption–FSE 2014. Springer, 2014. 7. O. Dunkelman, S. Indesteege, and N. Keller. A Differential-Linear Attack on 12-Round Serpent. In D. Chowdhury, V. Rijmen, and A. Das, editors, Progress in Cryptology - INDOCRYPT 2008, volume 5365 of Lecture Notes in Computer Science, pages 308–321. Springer Berlin Heidelberg, 2008. 8. O. Dunkelman and N. Keller. Cryptanalysis of ctc2. In M. Fischlin, editor, Topics in Cryptology CT–RSA 2009, volume 5473 of Lecture Notes in Computer Science, pages 226–239. Springer Berlin Heidelberg, 2009. 9. S. Langford and M. Hellman. Differential-Linear Cryptanalysis. In Y. Desmedt, editor, Advances in Cryptology–CRYPTO 1994, volume 839 of Lecture Notes in Computer Science, pages 17–25. Springer Berlin Heidelberg, 1994. 10. J. Lu. A Methodology for Differential-Linear Cryptanalysis and Its Applications. In A. Canteaut, editor, Fast Software Encryption, volume 7549 of Lecture Notes in Computer Science, pages 69–89. Springer Berlin Heidelberg, 2012. 11. M. Matsui. Linear Cryptanalysis Method for DES cipher. In Advances in Cryptology–EUROCRYPT’93, pages 386–397. Springer, 1994. 12. D. McGrew and J. Viega. The Galois/Counter Mode of Operation (GCM). http://csrc.nist.gov/ CryptoToolkit/modes/proposedmodes/gcm/gcm-spec.pdf. 13. P. Morawiecki. Nonce reuse in ICEPOLE. available from http://competitions.cr.yp.to/round1/ icepole-misuse.html, Jul 2014. 14. P. Morawiecki, K. Gaj, E. Homsirikamol, K. Matusiewicz, J. Pieprzyk, M. Rogawski, M. Srebrny, and M. W´ ojcik. ICEPOLE v1. submission to CAESAR competition, available from http://competitions.cr.yp.to/round1/icepolev1.pdf. 15. P. Morawiecki, K. Gaj, E. Homsirikamol, K. Matusiewicz, J. Pieprzyk, M. Rogawski, M. Srebrny, and M. W´ ojcik. Icepole: High-speed, hardware-oriented authenticated encryption. In L. Batina and M. Robshaw, editors, Cryptographic Hardware and Embedded Systems, CHES 2014, volume 8731 of Lecture Notes in Computer Science, pages 392–413. Springer Berlin Heidelberg, 2014. 16. Y. Shin, J. Kim, G. Kim, S. Hong, and S. Lee. Differential-Linear Type Attacks on Reduced Rounds of SHACAL-2. In H. Wang, J. Pieprzyk, and V. Varadharajan, editors, Information Security and Privacy, volume 3108 of Lecture Notes in Computer Science, pages 110–122. Springer Berlin Heidelberg, 2004. 17. H. Wu and B. Preneel. Differential-Linear Attacks Against the Stream Cipher Phelix. In A. Biryukov, editor, Fast Software Encryption, volume 4593 of Lecture Notes in Computer Science, pages 87–100. Springer Berlin Heidelberg, 2007.