An Explanation of the RSA Public Key Cryptosystem

INTRODUCTION

RSA (which stands for Rivest, Shamir, Adleman - the names of its inventors, all professors at MIT) is a "public key" encryption algorithm. That is, it uses two separate keys for encryption and decryption. The two share a relationship which makes it possible for data encrypted with the public key to be decrypted with the private key, but not the other way around. Further, it is considered computationally infeasible to derive the private key from the public key (meaning, it could in theory be done, but would require more processing power than anyone has available).

The RSA algorithm is based on the idea that, as of right now, there is no known way to factor large integers into their prime factors. Here, large means having decimal representations with many hundreds of digits (or binary representations with many thousands of digits). However, there is a way to tell whether a large number is prime fairly efficiently. In fact, there are several ways to do this: most implementations use a probabilistic test called the Rabin-Miller randomized primality test that tells, with extremely high likelihood, whether or not a given number is prime (with the probability of error being somewhere on order of $1/2^{200}$ - so small that the computer is much more likely to be broken than the algorithm wrong). There is also a completely deterministic method, but it's fairly expensive and in practice most people use Rabin-Miller.

To show how the algorithm works, we need some definitions and theorems from Algebra to understand the mathematics. I introduce these here, but don't give much in the way of proof (in most cases, the proofs are not particularly hard anyway); for those are interested in proofs of the underlying mathematics, I recommend getting a good book on abstract algebra to understand them. Michael Artin's Algebra is considered standard and is quite good.

DEFINITION OF THE mod OPERATOR IN THE INTEGERS

Given an integer a and another integer $N$, we define the modulus operator,

      
        \[
            a \pmod N
        \]

to be the remainder of a when divided by $N$.

That is, for all $a$ and $N$ in the naturals, there exist integers $q$ and $r$ with $r \lt N$ such that,

and we say that $(a \pmod N) = r$.

In fact, it can be shown that $q$ and $r$ are unique. Several properties of mod can also be shown from this definition:

For all $a, b, c, N$ in the integers,

$(a \pmod N)(b \pmod N) = (ab) \pmod N$
$(a \pmod N)^b \pmod N = a^b \pmod N$
$(a^b \pmod N)^c \pmod N = (a^b)^c \pmod N = a^{bc} \pmod N$

Note also that, if no confusion will arise, I will drop the parenthesis from around $\pmod N$.

GROUP DEFINITIONS

A group $(G, \cdot)$ is a set $G$, together with some binary operation $\cdot$ on the elements of $G$ such that the following axioms hold:

$\cdot$ is closed on $G$. That is, $\forall x, y \in G, x \cdot y$ and $y \cdot x$ are in $G$.
$\cdot$ is associative on $G$. That is, $\forall x, y, z \in G, x \cdot (y \cdot z) = (x \cdot y) \cdot z$.
Existence of an identity. There exists an element $e \in G$ such that, $\forall x \in G, e \cdot x = x \cdot e = x$. Note that this implies that $e \cdot e = e$. It can be shown that $e$ is unique for any given group (though we shall not do so here), and we call $e$ the identity element or identity of G.
Existence of inverses. $\forall x \in G, \exists x^{-1} \in G$ such that $x\cdot x^{-1} = x^{-1}\cdot x = e$.

Suppose that $(G, \cdot)$ is a group, and that $G$ is a finite set, then we call $(G, \cdot)$ a finite group.

(Note that we don't say that $\cdot$ need be commutative on $G$ [that is, for $x, y \in G, x \cdot y = y \cdot x$]. In general, this is not true of groups. But, if $\cdot$ is commutative, then we say that the group is commutative and call it an Abelian group. This is not particularly relevant to understanding RSA, however.)

If no ambiguity will arise, we often just write $G$ to mean $(G, \cdot)$.

If $G$ and $G^\prime$ are two groups, and there exists a function $f: G \rightarrow G^\prime$ such that, for all $x, y \in G, f(x \cdot y) = f(x) \cdot^\prime f(y)$, then we say that f is a homomorphism and that $G$ and $G^\prime$ are homomorphic.

If $G$ and $G^\prime$ are homomorphic groups and $f$ is a bijective homomorphism between then, then we call $f$ an isomorphism (or group isomorphism) and say that $G$ and $G^\prime$ are isomorphic.

EULER'S TOTIENT FUNCTION

For a positive integer $N$, Euler defined the totient function ($\phi(N)$) to be the number of positive integers less than or equal and relatively prime to $N$. That is,

      
        \[
            \phi(N) = |\lbrace k \in \mathbb{Z} \mid
                          0 \lt k \lt N \,\mathrm{and}\,
                          \gcd(N, k) = 1 \rbrace|
        \]

It can be shown that the numbers $\lbrace 1, \dots, \phi(N) \rbrace$ taken with multiplication $\mod\phi(N)$ form a group. We say that this group is isomorphic to the group $((\mathbb{Z}/N\mathbb{Z})^*, \cdot)$, but showing that is beyond the scope of this article. Interested parties should see Artin; just recognize the notation.

STATEMENT OF FERMAT'S LITTLE THEOREM AND GENERALIZATION TO A GROUP

(Not to be confused with Fermat's last theorem!)

Fermat's little theorem says that, given a prime number $p$ and any integer $a$, $a^{p - 1} = 1 \pmod p \implies a^p = a \pmod p$. This generalizes for any integer $n$ coprime to $a$ and we get: $a^{\phi(n)} = 1 \pmod n \implies a^{\phi(n) + 1} = a \pmod n$.

Proof of Fermat's little theorem is omitted here for brevity. Interested parties should see Artin or a similar reference.

We can generalize this theorem to apply to a group by noting that, for any element $a$ in a group $(\mathbb{Z}/N\mathbb{Z})^*$ of order $\phi(N)$ for some positive integer $N$, $a^{\phi(N)} = e$. That is, $a \cdot a \cdot \dots \cdot a$ for $\phi(N)$ factors is the identity.

For more details, again, see Artin or another reference.

THE EXTENDED EUCLIDIAN ALGORITHM AND FINDING INVERSES IN A FINITE GROUP OF INTEGERS

We know that $G = (\mathbb{Z}/N\mathbb{Z})^*$ for some positive integer $N$ is a finite group under multiplication $\mod N$. Further, this group is isomorphic to the group, $(\lbrace1, \dots \phi(N)\rbrace, \cdot \pmod{\phi(N)})$. Then, for every $x \in G$, $x$ also has an inverse in $G$. The question is, how do we find it? That is, we want a number $y$ such that,

      
        \[
            xy = 1 \pmod N
        \]

Recall that this means,

      
        \[
            xy - qN = 1
        \]

Several thousand years ago, Euclid gave his famous algorithm for finding the greatest common divisor of two integers $a$ and $b$: it was later extended to finding integers $c$ and $d$ such that:

      
        \[
            ac + bd = gcd(a, b)
        \]

This is the so called "Extended Euclidean Algorithm." Here is a version in Python:

      
        def euclid(a, b):
            if (a % b) == 0:
                return [0, 1]
            else:
                [c, d] = euclid(b, a % b)
                return [d, c - d*(a / b)]

Details about this program the Extended Euclidean Algorithm and its implementation here are beyond the scope of this article. For more information, see Artin or an equivalent text.

To find inverses in the group $G$, realize that every element in $G$ will be relatively prime to $N$ by definition, including our element $x$. So let $a = x, c = y$ and $b = N, d = -q$. Then we have:

      
        \[
            1 = \gcd(a, b) = ac + bd
        \]

And by applying Euclid, we can recover $-q = d$ and $c = y$, the inverse of $x$.

Finally, we have the mathematical background we need and are ready to proceed to the RSA algorithm itself.

GENERATING KEYS

We will start with generating keys and then move into performing encryption and decryption. This builds on the machinery that we have collected (if not completely assembled) above.

Randomly pick two prime integers p and q; make sure they are very large, as described in the beginning of this post.
Compute $N = pq$.
Pick some number $e$ randomly from $(\mathbb{Z}/N\mathbb{Z})^*$, which has order $(p - 1)(q - 1)$.
By calculating inverses in a group, find the number $d$ such that $f(d) = f(e^{-1}) \implies f(ed) = 1 \pmod{\phi(N)}$, where $f$ is an isomorphism from $(\mathbb{Z}/N\mathbb{Z})^*$ to $\mathbb{Z}/\phi(N)\mathbb{Z}$. That is, $e$ and $d$ are inverses when considered in the group $\mathbb{Z}/\phi(N)\mathbb{Z}$ which is isomorphic to $(\mathbb{Z}/N\mathbb{Z})^*$.
The tuple $(d, N)$ is the (private) decryption key. Keep it safe.
The tuple $(e, N)$ is the (public) encryption key. Publish it widely.

ENCRYPTION AND DECRYPTION ALGORITHMS

Consider a message $M$. We can think of $M$ as some number (even if it's really text, for instance). We require that $O \lt M \lt N$. Then, we encrypt $M$ as follows:

      
        \[
            C = M^e \mod N
        \]

That's it. Given an encrypted message $C^\prime$, we decrypt as follows:

      
        \[
            M^\prime = (C^\prime)^d \mod N
        \]

Suppose that $C^\prime = C$ as above. Then, we can see that $M^\prime = M$ as follows:

      
        \[
            \begin{eqnarray*}
              M^\prime &=& (C^\prime)^d \mod N  \\
                       &=& C^d \mod N  \\
                       &=& (M^e \mod N)^d \mod N  \\
                       &=& M^{ed} \mod N
            \end{eqnarray*}
        \]

But, recall that $ed = 1 \pmod{\phi(N)} \implies ed = q\phi(N) + 1$. Therefore,

      
        \[
            \begin{eqnarray*}
              M^\prime &=& M^{q\phi(N) + 1} \mod N  \\
                       &=& (M^{q\phi(N)} M) \mod N  \\
                       &=& ((M^{\phi(N)})^q M) \mod N  \\
                       &=& ((M^{\phi(N)} \mod N)^q M) \mod N
            \end{eqnarray*}
        \]

But $M^{\phi(N)} \pmod N = 1$, by Fermat's little theorem. Then,

      
        \[
            \begin{eqnarray*}
              M^\prime &=& (1^q M) \mod N  \\
                       &=& 1 \cdot M \mod N  \\
                       &=& M \mod N
            \end{eqnarray*}
        \]

But, $0 \lt M \lt N$, so $M \pmod N = M$ and thus

      
        \[
            M^\prime = M
        \]

as desired.

But what about the constraints we imposed on the message? They seem unreasonable. Fortunately, they can easily be worked around.

Any message we wish to send will be greater than $0$. Messages, by their very nature, are unsigned when considered as numbers. So, that's not a problem. Suppose that $M \gt N$. Then, we can simply split $M$ up smaller messages $M_1, M_2, \dots, M_n$ such that $0 \lt M_i \lt N$ for $i \in \lbrace 1, \dots, n\rbrace$. But, probably a better way to do it is to generate some random number $k$ that can be used as a secret key for a symmetric encryption algorithm (e.g., blowfish, AES, etc). Then, we encrypt the message $M$ using $k$, and encrypt the key $k$ using RSA, concatenate the two, and send them to the recipient. The recipient then splits them apart, decrypts the symmetric key using RSA, and uses that to decrypt the message itself. So let's assume that $M$ is always less than $N$.

And that is how RSA works.

Special thanks to Mark Conger for helpfully pointing out that the original post unnecessarily required that $\gcd(N, M) = 1$. This restriction has been removed. Also thanks to Abhijit Ray for pointing out a small error in the statement of Fermat's Little Theorem.