Neural Network for File Decryption - Possible?

I have worked with neural networks before and know most of the basics. I have the most experience with regular multi-layer perceptrons. Someone recently asked me whether the following is possible, and I feel challenged to master the problem :)

The Situation

Let's assume I have a program that can encrypt and decrypt regular ASCII-encoded files. I know nothing about the specific encryption method or the key used. All I know is that the program can reverse the encryption and thus read the original content.

What I want?

Now my question is: do you think it is possible, with acceptable effort, to train (some kind of) neural network that replicates the exact decryption algorithm?

My ideas and work so far

I don't have much experience with encryption. Someone suggested simply assuming AES encryption, so I could write a little program to batch-encrypt ASCII-encoded files. That would cover gathering the training data for supervised learning: using the encrypted files as input for the neural network and the original files as target output, I could train any net.

But now I am stuck: how would you suggest feeding the input and output data to the neural network? How many input and output neurons would you use? Since I have no idea what the encrypted files will look like, it might be best to pass the data in binary form. But I can't just use thousands of input and output neurons and pass all bits at the same time. Maybe use a recurrent network and feed in one bit after another? That doesn't sound very effective either.
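One possible way to frame the binary input, sketched under assumptions: the file is cut into fixed-size blocks and each block becomes one vector of 0/1 values, so the network only ever sees one block at a time. The 16-byte block size (what AES would use), the zero padding, and the helper names `bytes_to_bits`, `bits_to_bytes` and `to_block_vectors` are all hypothetical choices for illustration. A real cipher has its own padding scheme, and with a chaining mode the blocks are not independent of each other.

```python
BLOCK_BYTES = 16  # assumption: 128-bit blocks, as AES would use


def bytes_to_bits(data: bytes) -> list[int]:
    """Unpack bytes into a flat list of 0/1 values, most significant bit first."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def bits_to_bytes(bits: list[int]) -> bytes:
    """Inverse of bytes_to_bits: pack groups of 8 bits back into bytes."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def to_block_vectors(data: bytes, block_bytes: int = BLOCK_BYTES) -> list[list[int]]:
    """Zero-pad data to a multiple of the block size and return one bit vector per block."""
    padded = data + b"\x00" * (-len(data) % block_bytes)
    return [
        bytes_to_bits(padded[i:i + block_bytes])
        for i in range(0, len(padded), block_bytes)
    ]


vectors = to_block_vectors(b"Hello, neural network!")
print(len(vectors), "blocks of", len(vectors[0]), "bits each")
```

With this framing the network would need 128 input and 128 output neurons regardless of file size, and the same encoding would be applied to the plaintext blocks to get the targets.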

Another problem is that you can't decrypt partially, meaning you can't be roughly correct. You either got it right or not. To put it in other words, in the end the net error has to be zero. From what I have experienced so far with ANNs, this is nearly impossible to achieve for big networks. So is this problem solvable?
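To put a rough number on the "error has to be zero" requirement, here is a small back-of-the-envelope sketch. It assumes 128 output bits per block and that bit errors are independent of each other, which is a simplification; the function name `exact_block_probability` is made up for this example.

```python
def exact_block_probability(bit_accuracy: float, block_bits: int = 128) -> float:
    """Chance that every bit of a block is right, assuming independent bit errors."""
    return bit_accuracy ** block_bits


for accuracy in (0.99, 0.999, 0.9999):
    print(f"per-bit accuracy {accuracy}: "
          f"whole block correct with probability {exact_block_probability(accuracy):.4f}")
```

Under these assumptions even a net that gets 99% of the individual bits right would reproduce a complete 128-bit block only a bit more than a quarter of the time.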

asked May 13 '11 08:05

EliteTUM

3 Answers

"Another problem is, that you can't decrypt partially - meaning you can't be roughly correct. You either got it right or not."

That's exactly the problem. Neural networks can approximate continuous functions, meaning that a small change in the input values causes a small change in the output values, whereas encryption functions/algorithms are designed to be as discontinuous as possible: changing a single input bit should change about half of the output bits.
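The discontinuity can be observed directly. The sketch below flips each bit of a 16-byte message in turn and counts how many output bits change. AES is not part of the Python standard library, so SHA-256 from `hashlib` is used as a stand-in here. That is an assumption of this sketch: SHA-256 is a hash, not a cipher, but it is designed to show the same avalanche behaviour. The message and the helper names are made up for illustration.

```python
import hashlib


def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def flip_bit(data: bytes, index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[index // 8] ^= 1 << (index % 8)
    return bytes(out)


message = b"sixteen byte msg"  # hypothetical 16-byte input block
base = hashlib.sha256(message).digest()

# Flip every input bit once and measure how far the 256-bit output moves.
distances = [
    hamming_distance(base, hashlib.sha256(flip_bit(message, i)).digest())
    for i in range(len(message) * 8)
]
average = sum(distances) / len(distances)
print(f"average output bits changed per flipped input bit: {average:.1f} of 256")
```

An average close to half of the output bits means that neighbouring inputs tell you essentially nothing about each other's outputs, which is the opposite of what gradient-based training relies on.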

answered Oct 26 '22 08:10

Andre Holzner

I think if that worked, people would be doing it. As far as I know, they aren't doing it.

Seriously, if you could just throw a lot of plaintext/ciphertext pairs at a neural network and construct a decrypter, then it would be a very effective known-plaintext or chosen-plaintext attack. Yet the attacks of that kind we have against current ciphers are not very effective at all. That means that either the entire open cryptographic community has missed the idea, or it doesn't work. I realise that this is far from a conclusive argument (it's effectively an argument from authority), but I would suggest it's indicative that this approach won't work.

answered Oct 26 '22 09:10

Tom Anderson

Say you have two keys A and B that translate ciphertext K into Pa and Pb respectively. Pa and Pb are both "correct" decryptions of ciphertext K. So if your neural network has only K as input, it has no means of actually predicting the correct answer. Most ways of cracking encryption involve looking at the result to see if it looks like what you're after. For example, readable text is more likely to be the plaintext than apparently random junk. A neural network would need to be good at guessing whether it got the right answer according to what the user would expect the contents to be, which could never be 100% correct.
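A toy illustration of the two-keys point, using a plain XOR pad as a stand-in cipher (an assumption made for brevity; the keys and messages are invented). With a pad, every plaintext of the right length is a valid decryption under some key. With a block cipher such as AES the wrong key would typically give junk rather than readable text, but the ambiguity for a network that only sees K is the same.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings; for a pad this both encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, key))


plain_a = b"ATTACK AT DAWN"
plain_b = b"RETREAT AT TEN"

key_a = bytes(range(len(plain_a)))       # hypothetical key A
ciphertext = xor_bytes(plain_a, key_a)   # the ciphertext K

# Construct a second key B under which the very same K reads as plain_b.
key_b = xor_bytes(ciphertext, plain_b)

print(xor_bytes(ciphertext, key_a))  # Pa
print(xor_bytes(ciphertext, key_b))  # Pb
```

Given only `ciphertext`, nothing distinguishes the two results, so the network has to be trained for one fixed key or be told the key as an additional input.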

However, neural networks can in theory learn any function. So if you have enough ciphertext/plaintext pairs for a particular encryption key, then a sufficiently complex neural network can learn to be exactly the decryption algorithm for that particular key.
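To get a feeling for what "enough pairs" means, here is a sketch with a toy cipher that works on single bytes: a fixed, seeded random permutation of the 256 byte values plays the role of "encryption with one particular key". A plain lookup table stands in for a network that has memorised every pair. It is not a neural network, and both the toy cipher and the seed are assumptions made for illustration.

```python
import random

rng = random.Random(42)  # hypothetical fixed "key": the seed fixes the permutation
encrypt_table = list(range(256))
rng.shuffle(encrypt_table)  # toy block cipher with 8-bit blocks

# Known (ciphertext byte, plaintext byte) pairs; here all 256 of them exist.
pairs = [(encrypt_table[p], p) for p in range(256)]
decrypt_table = dict(pairs)  # "learned" decryption for this one key


def decrypt(ciphertext: bytes) -> bytes:
    """Decrypt by looking up each ciphertext byte in the memorised table."""
    return bytes(decrypt_table[c] for c in ciphertext)


message = b"known plaintext"
encrypted = bytes(encrypt_table[p] for p in message)
print(decrypt(encrypted))

print("pairs needed to cover 8-bit blocks:  ", 2 ** 8)
print("pairs needed to cover 128-bit blocks:", 2 ** 128)
```

For 8-bit blocks, 256 pairs pin the function down completely. For 128-bit blocks, pure memorisation is out of reach, so the network would have to generalise from a vanishingly small sample of the input space.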

Also, regarding the continuous vs. discrete problem, this is basically solved. The output units use something like a sigmoid activation, so you just have to pick a threshold to separate 1 from 0; 0.5 could work. With enough training you could in theory get the correct answer for 1 vs. 0 100% of the time.
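The thresholding step could look like the following minimal sketch. The eight pre-activation values are invented for the example and do not come from any trained network; the helper names are hypothetical as well.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def threshold_bits(activations: list[float], threshold: float = 0.5) -> list[int]:
    """Turn continuous outputs into hard 0/1 decisions."""
    return [1 if a > threshold else 0 for a in activations]


# Hypothetical pre-activation values of eight output neurons (one byte).
logits = [-4.0, 3.1, -2.2, -0.3, 5.0, -6.0, -1.5, 2.7]
activations = [sigmoid(z) for z in logits]

bits = threshold_bits(activations)
byte_value = int("".join(str(b) for b in bits), 2)
print(bits, "->", chr(byte_value))
```

Note that the fourth output (sigmoid of -0.3, roughly 0.43) is close to the threshold; outputs like that are where a single wrong bit would come from.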

The above assumes that you have one network big enough to process the entire file at once. For arbitrarily sized ciphertext, you would probably need to do blocks at a time with an RNN, but I don't know if that still has the same "compute any function" properties as for a traditional network.

None of this is to say that such a solution is practically doable.

answered Oct 26 '22 09:10

Hans

Tags: machine-learning, neural-network, encryption