13/07/2024

Codebreaking with AI

In 2016, the public became aware of how dramatically AI (artificial intelligence) had improved the quality of machine translation, when Google released its neural machine translation for Japanese. This naturally raises a question: is codebreaking possible with AI?
For a long time, I could not find a positive answer to this question. Although my search was half-hearted, even Copilot with GPT-4 did not give a relevant answer to my question: "Are there papers about codebreaking by using AI?" Then I noticed a poster abstract in HistoCrypt 2024: Oriol Closa, "Polyalphabetic cipher decryption function learning with LSTM networks", which seems to be based on the author's master thesis, Oriol Closa (2023), "LSTM-attack on polyalphabetic cyphers with known plaintext: Case study on the Hagelin C-38 and Siemens and Halske T52" (KTH). It teaches that "the application of Machine Learning to extract key information from intercepts is not a well researched area yet" (Abstract), and that there are even "many authoritative opinions within the field" against the utility of machine learning in classical cryptography (p.57).

Machine "learns" by finding a best parameter set for a model, which is like a very complex filter that receives an input and produces an output. In an example of machine translation, the input is a sequence of words in English and the output is a sequence of words in Japanese. In order to train a machine in this example, bilingual corpus of corresponding texts in the two languages is fed to the computer, whereby the computer learns an English-Japanese translation model. Given a new text in English, the trained computer can apply the model (filter) on the input to produce a text in Japanese as an output.

By analogy, given a corpus of ciphertext/plaintext pairs, AI may learn to decipher a new ciphertext into plaintext. I had assumed this would be limited to the case where the new ciphertext is based on the same cipher key used for training. But the thesis taught me that machine learning can do more than that. (The key is included in the training data (p.39).)
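
To see what such training data might look like, here is a minimal sketch (my own, not the thesis's data pipeline) that generates labelled examples for the simplest cipher the thesis considers, the Vigenère: each example is a (plaintext, key, ciphertext) triple, so the key is part of the training data.

```python
import random
import string

A = string.ascii_uppercase

def vigenere_encrypt(plaintext, key):
    # Classic Vigenère: shift each plaintext letter by the corresponding key letter.
    return "".join(A[(A.index(p) + A.index(key[i % len(key)])) % 26]
                   for i, p in enumerate(plaintext))

def make_example(text_len=25, key_len=5):
    # One hypothetical training example: random plaintext, random key, and the
    # resulting ciphertext. (Real experiments would use natural-language plaintext.)
    plaintext = "".join(random.choice(A) for _ in range(text_len))
    key = "".join(random.choice(A) for _ in range(key_len))
    return plaintext, key, vigenere_encrypt(plaintext, key)

corpus = [make_example() for _ in range(10000)]
print(corpus[0])
```
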
Remember that the output of a trained model need not be similar to the input. For example, when the input is a text in English, the output may be some classification or labelling of the text rather than a text in another language. In Closa (2023), in my understanding, the input is the ciphertext, a crib (known plaintext, presumably corresponding to the ciphertext), and a null key (a placeholder in the input vector), and the output is the plaintext (which should match the crib and is included only for analysis) and the extracted key (p.34). Thus, the model seems to receive a matching ciphertext/plaintext pair and produce its key ("extract the external key given a combination of plaintext and ciphertext without the use of the internal setting", p.51).
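
For the Vigenère case, the function the network has to approximate can even be written down in closed form, which may help to see what "extracting the key from a matching pair" means. This is my own illustration, not code from the thesis; for machine ciphers like the Hagelin C-38 and the T52 there is no such simple inverse, which is exactly where a learned model becomes interesting.

```python
import string

A = string.ascii_uppercase

def extract_vigenere_key(plaintext, ciphertext, key_len):
    # For Vigenère, key letter = ciphertext letter minus plaintext letter (mod 26),
    # so a matching plaintext/ciphertext pair directly reveals the repeating key.
    shifts = [(A.index(c) - A.index(p)) % 26 for p, c in zip(plaintext, ciphertext)]
    return "".join(A[s] for s in shifts[:key_len])

# The classic textbook pair enciphered with the key "LEMON":
print(extract_vigenere_key("ATTACKATDAWN", "LXFOPVEFRNHR", 5))  # -> LEMON
```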

The thesis deals with four ciphers -- Vigenère, Playfair, Hagelin C-38, and Siemens and Halske T52 -- using LSTM networks (a kind of neural network). The main differences among the four setups are the decipher function reflecting the cipher scheme (I take this to mean the cipher algorithm is known and only the key is to be found), the crib length (e.g., 15, 25), and the size of the hidden layer (e.g., 256, 2048) (pp.33, 40). The author says LSTM networks can extract key information given a crib (in my understanding, a matching plaintext/ciphertext pair) for the Vigenère, Hagelin C-38, and Siemens and Halske T52, but not for Playfair.
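
As a rough idea of what such a network might look like, below is a minimal sketch in Keras. This is my own guess at the general shape, not the architecture used in the thesis: the ciphertext and the crib are fed in as one letter sequence, an LSTM layer with a hidden size like those mentioned (256 here) summarizes it, and the output is one 26-way prediction per key position; the crib length and key length are assumed values.

```python
import tensorflow as tf

VOCAB = 26       # letters A-Z
CRIB_LEN = 25    # assumed crib length (one of the lengths mentioned in the thesis)
KEY_LEN = 5      # hypothetical key length for a Vigenère-style target
HIDDEN = 256     # one of the hidden-layer sizes mentioned in the thesis

# Input: ciphertext and crib concatenated into one sequence of letter indices.
inputs = tf.keras.Input(shape=(2 * CRIB_LEN,), dtype="int32")
x = tf.keras.layers.Embedding(VOCAB, 32)(inputs)
x = tf.keras.layers.LSTM(HIDDEN)(x)

# Output: a 26-way softmax for each of the KEY_LEN key positions.
x = tf.keras.layers.Dense(KEY_LEN * VOCAB)(x)
x = tf.keras.layers.Reshape((KEY_LEN, VOCAB))(x)
outputs = tf.keras.layers.Softmax()(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```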

(16 July 2024) Ajeet Singh, Kaushik Bhargav Sivangi, and Appala Naidu Tentu (2024), "Machine Learning and Cryptanalysis: An In-Depth Exploration of Current Practices and Future Potential", Journal of Computing Theories and Applications (JCTA), Vol. 1, No. 3, DOI: https://doi.org/10.62411/jcta.9851, also says "the integration of machine learning, and specifically deep learning, into cryptanalysis has been relatively unexplored."
