
Various questions about RSA encryption

I'm currently writing my own AES/RSA encryption program in C++ for Unix. I've been going through the literature for about a week now, and I've started to wrap my head around it all, but I'm still left with some pressing questions:

1) Based on my understanding, an RSA key in its most basic form is the combination of the product of the two primes (R) used and the exponents. It's obvious to me that storing the key in such a form in plaintext would defeat the purpose of encrypting anything at all. Therefore, in what form can I store my generated public and private keys? Ask the user for a password and do some "simple" shift/replacing on the individual digits of the key with an ASCII table? Or is there some other standard I haven't run across? Also, when the keys are generated, are R and the respective exponent simply stored sequentially, i.e. ##primeproduct####exponent##? In that case, how would a decryption algorithm parse the key into the two separate values?

2) How would I go about programmatically generating the private exponent, given that I've decided to use 65537 as my public exponent for all encryptions? I've got the equation P*Q = 1 mod(M), where P and Q are the exponents and M is the result of Euler's Totient Function. Is this simply a matter of generating random numbers and testing their relative primality to the public exponent until you hit pay dirt? I know you can't simply start from 1 and increment until you find such a number, as anyone could simply do the same thing and get your private exponent themselves.

3) When generating the character equivalence set, I understand that the numbers used in the set must be less than and relatively prime to P*Q. Again, this is a matter of testing relative primality of numbers to P*Q. Is the speed of testing relative primality independent of the size of the numbers you're working with? Or are special algorithms necessary?

Thanks in advance to anyone who takes the time to read and answer, cheers!

Connor Spangler asked Nov 21 '13 03:11


1 Answer

There are some standard formats for storing/exchanging RSA keys, such as the one in RFC 3447 (PKCS #1). For better or worse, most (many, anyway) use ASN.1 encoding, which adds more complexity than most people like, all by itself. A few use Base64 encoding, which is a lot easier to implement.

As far as what constitutes a key goes: in its most basic form, you're correct; the public key includes the modulus (usually called n) and an exponent (usually called e).

To compute a key pair, you start from two large prime numbers, usually called p and q. You compute the modulus n as p * q. You also compute a number (often called r) that's (p-1) * (q-1).

e is then a more or less randomly chosen number that's relatively prime to r. Warning: you don't want e to be really small though -- log(e) >= log(n)/4 as a bare minimum.

You then compute d (the private decryption key) as a number satisfying the relation:

d * e = 1 (mod r)

You typically compute this using the extended form of Euclid's algorithm, though there are other options (see below). Again, you don't want d to be really small either, so if it works out to a really small number, you probably want to try another value for e, and compute a new d to match.

There is another way to compute your e and d. You can start by finding some number K that's congruent to 1 mod r, then factor it. Put the prime factors together to get two factors of roughly equal size, and use them as e and d.

As far as an attacker computing your d goes: you need r to compute this, and knowing r depends on knowing p and q. That's exactly why/where/how factoring comes into breaking RSA. If you factor n, then you know p and q. From them, you can find r, and from r you can compute the d that matches a known e.

So, let's work through the math to create a key pair. We're going to use primes that are much too small to be effective, but should be sufficient to demonstrate the ideas involved.

So let's start by picking a p and q (of course, both need to be primes):

p = 9999991
q = 11999989

From those we compute n and r:

n = 119999782000099
r = 119999760000120

Next we need to either pick e, or else compute K and factor it to get e and d. For the moment, we'll go with your suggestion of e=65537. Since 65537 is prime, the only way it and r could fail to be relatively prime is if r were an exact multiple of 65537, which we can quite easily verify is not the case.

From that, we need to compute our d. We can do that fairly easily (though not necessarily very quickly) using the "Extended" version of Euclid's algorithm, Euler's theorem (with the totient you mentioned), Gauss' method, or any of a number of others.

For the moment, I'll compute it using Gauss' method:

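// Greatest common divisor by Euclid's algorithm.
template <class num>
num gcd(num a, num b) {
    num r;
    while (b > 0) {
        r = a % b;
        a = b;
        b = r;
    }
    return a;
}

// Modular inverse of a mod p by Gauss' method: treat the inverse as the
// fraction z/a (initially 1/a), keep adding p to the numerator (which does
// not change its value mod p), and cancel common factors until a is 1.
// Returns 0 if a has no inverse mod p.
template <class num>
num find_inverse(num a, num p) {
    num g, z;

    if (gcd(a, p) > 1) return 0;

    z = 1;

    while (a > 1) {
        z += p;
        if ((g=gcd(a, z))> 1) {
            a /= g;
            z /= g;
        }
    }
    return z;
}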

The result we get is:

d = 38110914516113

Then we can plug these into an implementation of RSA, and use them to encrypt and decrypt a message.

So, let's encrypt "Very Secret Message!". Using the e and n given above, that encrypts to:

74603288122996
49544151279887
83011912841578
96347106356362
20256165166509
66272049143842
49544151279887
22863535059597
83011912841578
49544151279887
96446347654908
20256165166509
87232607087245
49544151279887
68304272579690
68304272579690
87665372487589
26633960965444
49544151279887
15733234551614

And, using the d given above, that decrypts back to the original. Code to do the encryption/decryption (using hard-coded keys and modulus) looks like this:

#include <iostream>
#include <iterator>
#include <algorithm>
#include <vector>
#include <functional>
#include <string>

typedef unsigned long long num;

const num e_key = 65537;
const num d_key = 38110914516113;
const num n = 119999782000099;

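// Computes (a * b) % m by shift-and-add, so intermediate values stay
// below 2*m instead of overflowing. Assumes b < m and that 2*m fits in T.
template <class T>
T mul_mod(T a, T b, T m) {
    if (m == 0) return a * b;

    T r = T();

    while (a > 0) {
        if (a & 1)
            if ((r += b) >= m) r %= m;
        a >>= 1;
        if ((b <<= 1) >= m) b %= m;
    }
    return r;
}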

template <class T>
T pow_mod(T a, T n, T m) {
    T r = 1;

    while (n > 0) {
        if (n & 1)
            r = mul_mod(r, a, m);
        a = mul_mod(a, a, m);
        n >>= 1;
    }
    return r;
}

int main() {
    std::string msg = "Very Secret Message!";
    std::vector<num> encrypted;

    std::cout << "Original message: " << msg << '\n';

    std::transform(msg.begin(), msg.end(),
        std::back_inserter(encrypted),
        [&](num val) { return pow_mod(val, e_key, n); });

    std::cout << "Encrypted message:\n";
    std::copy(encrypted.begin(), encrypted.end(), std::ostream_iterator<num>(std::cout, "\n"));

    std::cout << "\n";

    std::cout << "Decrypted message: ";
    std::transform(encrypted.begin(), encrypted.end(),
        std::ostream_iterator<char>(std::cout, ""),
        [](num val) { return pow_mod(val, d_key, n); });

    std::cout << "\n";
}

To have even a hope of security, you need to use a much larger modulus though: 2048 bits is the usual minimum recommendation today (and 3072 or more for the paranoid). You could do that with a normal arbitrary precision integer library, or routines written specifically for the task at hand. RSA is inherently fairly slow, so at one time most implementations used code with lots of hairy optimization to do the job. Nowadays, hardware is fast enough that you can probably get away with a fairly average large-integer library fairly easily (especially since in real use, you only want to use RSA to encrypt/decrypt a key for a symmetrical algorithm, not to encrypt the raw data).

Even with a modulus of suitable size (and the code modified to support the large numbers needed), this is still what's sometimes referred to as "textbook RSA", and it's not really suitable for much in the way of real encryption. For example, right now, it's encrypting one byte of the input at a time. This leaves noticeable patterns in the encrypted data. It's trivial to look at the encrypted data above and see that the second and seventh words are identical--because both are the encrypted form of e (which also occurs a couple of other places in the message).

As it stands right now, this can be attacked as a simple substitution code. e is the most common letter in English, so we can (correctly) guess that the most common word in the encrypted data represents e (and relative frequencies of letters in various languages are well known). Worse, we can also look at things like pairs and triplets of letters to improve the attack. For example, if we see the same word twice in succession in the encrypted data, we know we're seeing a double letter, which can only be a few letters in normal English text. Bottom line: even though RSA itself can be quite strong, the way of using it shown above definitely is not.

To prevent that problem, with a (say) 512-bit key, we'd also process the input in chunks of nearly 512 bits (each chunk, read as a number, has to be smaller than the modulus). That means we only have a repetition if there are two places in the original input where an entire chunk is identical. Even if that happens, it's relatively difficult to guess what that chunk would be, so although it's undesirable, it's not nearly as vulnerable as the byte-by-byte version shown above. In addition, you always want to pad the input to a multiple of the size being encrypted.

Reference

https://crypto.stackexchange.com/questions/1448/definition-of-textbook-rsa

Jerry Coffin answered Sep 29 '22 02:09