XOR#
XOR is short for "exclusive or," a fundamental bitwise operation. Its rule is to compare two bits; if the two bits are the same, the result is 0; if the two bits are different, the result is 1.
Since โ 0.5, there is information. Thus, even with just one subset of values, the deviation can be exploited in a roughly similar manner. Even 0.5000001.
source of info = If the degree of consistency between the input expression and the output expression does not exceed 0.5, then one expression is the source of information for the other expression.
Linear Cryptanalysis#
 Plaintext and ciphertext are visible
 It is a known plaintext attack
 A sufficient number of plaintext and ciphertext pairs are needed to find the linear relationship between input bits and output bits.
 Requires sufficiently good linear approximations.
 Refers to a type of attack aimed at block ciphers.
 Approximates block ciphers using linear expressions (some variables represented by mod2 bit operations).
 The truth (or falsehood) of these expressions carries a certain deviation, which is used to discover key bits.
Substitution Boxes (Sbox)#
 Determine the bias of the Sbox and construct the LAT.
 Use (biased) linear expressions.
 Link expressions from the initial plaintext to the intermediate ciphertext.
 Through Matsui's bias theorem.
 For the final subkey, determine the frequency of the expression's validity.
 If the bias matches, you have discovered the key!
 Repeat the above steps for each subkey.
Implemented in the form of a lookup table, but cannot scale with increasing input size, so outputs must be computed differently, i.e., calculated as expressions of input bits. e.g.
Approximating Sboxes#
one key XORing and one Substitution Box
Assuming X3 โ X4 = Y1 โ Y4,
then K3 โ K4 = P3 โ P4 โ Y1 โ Y4
Linear Approximation Table (LAT) for Sboxes#
Analysis:
 x represents input, y represents output
 This is the bias table:
 0 means unbiased, so probability is 0.5.

 means the actual occurrence is more than expected,  means the actual occurrence is less than expected
 Nonzero indicates information; the larger the absolute value, the more information it contains
A[1,2] means the input mask is '01', which means we only care about the second bit of the input. The output mask is '10', which means we only care about the first bit of the output.
Calculate the XOR value of the output bit corresponding to each input#
We need to perform the following substeps for each input:
 Look at the output value of that input in the Sbox.
 Apply the input mask to the input value, keeping only the bits where the mask corresponds to 1.
 Apply the output mask to the output value, similarly keeping only the bits where the mask corresponds to 1.
 Calculate the XOR value of these two bits.
01 00 11 10 โ 1010
11 01 10 00 โ 1010
XOR โ 0000 = 4/4
so answer is +2
also for B answer is 2
Matsuiโs Piling Up Lemma#
Used to calculate the bias of multiple variables
 e.g.: bias is not 0
 e.g.: As long as one is 0, the final result is unbiased. Example Z2 is unbiased, effectively randomizing Z1 โ Z2
 Conclusion: The formula is as follows
Linking Sbox Approximations#
Steps:
 How to calculate probability? Based on input rather than output, as illustrated to derive the approximation of G2
 With the approximation, subtract the expected probability (usually 1/2) to get the bias
 Common XOR, eliminate intermediate nodes to move to the right side of the equation, turning it into a probability of =0
 Substitute into Matsuiโs Lemma formula to find that value
 Continue calculating the overall, as illustrated
 In the substitution steps, G2 appears, so XOR becomes 0, moving to the right side of the equation
 At the same time, the probability of =1 is calculated as 5/8
 Transformation example
 Calculation example
 The transformation of the above XOR formula, so the result is the same
 Summary of the above transformations
Discovering Keys#
Visible?#
 The system only has plaintext P and ciphertext C visible, after all, it is linear cryptanalysis
Final Round Key#
The last round Sbox is invertible, so the corresponding U can be calculated.
For any round key guessed by the cryptanalyst, they can generate U.
 Assume a last round key, obtaining the intermediate ciphertext, which represents the intermediate steps U1, U2, U3, U4, etc., in the encryption process.
 If the guessed key is correct, then the intermediate ciphertext is correct.
 Next, with P and U and C, calculate the approximate expression (approx. expression), which has a certain bias (bias), such as 1/8 or 3/8.
 If this approximate expression matches the observed bias, then the assumed key can be considered correct.
 If the guessed key is incorrect, then the intermediate ciphertext is also incorrect.
 If one or more key bits are guessed incorrectly, then this approximate expression will only hold true half the time, as the result is randomized.
Therefore, by comparing the approximate expression under the assumed key with the bias of 1/2, one can distinguish between correct and incorrect key guesses. Finally, try all possible last round keys to see which (P,U) approximation is least consistent with the bias of 1/2; this key is likely the actual key.
Applying#
Substitution Permutation Network (Heyโs Cipher)#
 In this system, Heys' Cipher enhances the security of encryption through repeated substitution and permutation operations and key mixing.
 3 rounds Sbox: XORing, all Sboxes are the same.
 Pbox: Permutation box, permuting, it rearranges bits according to specific rules to diffuse any changes in input during each round.
 16bit subkeys: Assumed to be independently generated, usually obtained through key scheduling, similar to the approach in DES (Data Encryption Standard).
 Example: Each probability is based on input.
 Key mixing uses bitwise XOR operations.

E2E refers to endtoend approximation, the entire encryption process from plaintext to ciphertext.
If you have enough plaintextciphertext pairs, theoretically, you can use this linear approximation to decrypt or more accurately guess the key.

How to calculate? Similarly, Matsuiโs Lemma
Key subblocks and reverse engineering:
 Goal: Generate Intermediate Ciphertexts through K5 and C (the second and fourth subsets)
 Unrelated to the 1st and 3rd subsets, so the interest is in the 2nd and 4th.
From C to U, the correct K yields the correct U, and vice versa.
Try All Subkeys#
 The table only shows a portion of the subkeys, assuming there are 64 subkeys.
 Assuming there are 10,000 plaintextciphertext pairs, using 64 subkeys.
 For each pc pair, c and subkey generate p, calculating the bias.
 The subkey with the largest bias is usually correct, but not always.
 It can be seen that when the subkeys are 24, the bias is the largest, a candidate.