## What is RSA and how does it work?

People always say to me, “Dan, what is number theory even
needed for?”

Usually I give one of two responses, either “Does it have to have a use?” or “One major use is in cryptography”. I approach the second of these responses here.

Number theory is one of the oldest branches of mathematics that arose out of the mathematicians want to understand how the integers work. There is more to numbers than basic arithmetic, they have certain structures and properties that can help us to study other things.

It is quite a central branch in pure maths in that many links can be formed with a lot of seemingly unrelated branches. Over $50$ years ago most people would have categorised number theory as the “hobby” of the mathematical world, studying numbers just for the sake of it (and for pleasure).

However, in the $1970$‘s, number theory became the beholder of security. The RSA system was born.

————————–

Before going on, I had better give a quick run through of how cryptography works.

Cryptography is the art of secret writing. Imagine Alice wishes to send Bob a message, but wants noone else to read it (Alice and Bob are common names in cryptography). There are two major ways Alice can ensure this to happen; either Alice can hide the message in a place where noone can find it except Bob or Alice can hide the meaning of the message so that it cannot be read easily.

The first of these is the process of Steganography and the second is Cryptography. A process that operates on certain units of plaintext (i.e. letters) is called a cipher (this is something different to a code, they operate on words).

We call the message to be sent the plaintext and the result after applying a
cipher the ciphertext. So the ciphertext is the version of the message that is gibberish. The process of going from plaintext to ciphertext is called encryption, the reverse process is called decryption.

In order for Bob to read the gibberish, certain information must be known about how Alice did the encryption and how to use this info to perform the decryption (in order to get the intended message). This information is called the key.

So for example, let’s turn to one of the oldest ciphers, the Caesar cipher.

Suppose Alice wants to send the message “JANE IS AN IDIOT” to Bob, without Jane reading it. Alice might think, “If I shift each letter in my message 3 places on in the alphabet then the message will become gibberish”. So Alice does this and sends the message “MDQH LV DQ LGLRW” to Bob.

The key here is 3 because that is the information needed by Bob to decrypt the message. So Alice finds a way to tell Bob that the key is 3 and then Bob can reverse the procedure by shifting each letter backwards by 3 places in the alphabet…retrieving the original message. Of course the same procedure can be done with any shift between 0 and 25.

Mathematically this process is described as follows. If we label the alphabet A,B,C,…,Z by the numbers 0,1,2,…,25 then:

$C_i \equiv P_i + 3 \bmod 26$

gives you each ciphertext value $C_i$ from the plaintext value $P_i$.

Given knowledge of the key by Bob, he can easily work out the inverse of this procedure:

$P_i \equiv C_i - 3 \bmod 26$

and so can work out the plaintext.

Notice that if Jane ever gets her hands on this inbetween, she only sees the message “MDQH LV DQ LGLRW”, which makes no sense to her.

So at this point, some of you might be saying, “Hang on…can’t Jane just try all $25$ possible shifts until the message makes sense?”

Yes you can, and this takes about 2 minutes using pencil and paper, so the Caesar cipher is not very secure. Despite this, in a time when few could read or write, the Caesar cipher was secure enough to hide messages. From then on we were on a quest to find more secure ciphers.

This type of cipher is called a Substitution cipher, because we took each
letter in the ciphertext and substituted it for another. There is another main
type of cipher, the Transposition cipher, which changes the order of the letters in a message (so all of the message is there just anagrammed according to a specific rule).

————————–

Actually we can generalise the Caesar cipher in many ways. We can first make something called the Affine cipher. The main idea here is that the transformation from $\mathbb{Z}_{26}$ to itself defined by $n \mapsto n+k$ where $k$ is some “shift” value mimics the Caesar cipher. It is an invertible transformation…so we were able to go backwards to find the
plaintext!

There are many other invertible transformations of $\mathbb{Z}_{26}$. If $a$ is coprime to 26 then the transformation $n \mapsto an + k$ is invertible too! These are called affine transformations in number theory.

This means we can make other ciphers defined by:

$C_i \equiv aP_i + k \bmod 26$

with decryption coming from:

$P_i \equiv a^{-1}(C_i - k) \bmod 26$

(from basic number theory, $a^{-1}$ exists if and only if $a$ is coprime to 26). Play around with these.

We are already starting to see number theory crop up.

It should be noted that these ciphers are also substitution ciphers too. Still though, this cipher can be broken by trying all $\phi(26)*26 = 12*26 = 312$ possibilities for $a,k \bmod 26$ and seeing which one gives meaningful text.

One might have a flash of inspiration now and say “what if we just encrypt via any bijection of $\mathbb{Z}_{26}$?” Well this creates an even stronger cipher called the general monoalphabetic substitution cipher.

Put simply, this cipher lets any letter of the alphabet be sent to any letter of the alphabet in a bijective fashion. So for example:

A -> H, B -> S, C -> Y…., Z -> P.

The number of possibilities to brute force this cipher is much bigger than the Affine cipher, there are $26! = 403291461126605635584000000$
possible ways of permuting the letters of the alphabet so there are that many
keys!

However, there are ways to break these ciphers without having to check every possible key, by using statistical tools and looking for letter patterns in the ciphertext (for example we know that same letters in the ciphertext will correspond to same letters in the plaintext). Maybe I will mention the cryptanalysis of these ciphers in the future.

————————–

Another way to generalise the Caesar cipher is to make the so called Vigenere cipher. Instead of shifting each letter in your plaintext the same amount of places in the alphabet, why not change the number after every letter. For example, we might shift the first letter 1 place, the second letter 2 places, third 3 places etc. Then the message “JANE IS AN IDIOT” would become “KCQI NY HV RNTAG”.

We now note that same letters in the plaintext may not become the same letter in the ciphertext and vice versa. This is an example of a Polyalphabetic substitution cipher, each letter is encrypted with a “different” key.

Mathematically, the Vigenere cipher is described by forming the key $K_1 K_2 K_3 ... K_n$, where $n$ is the length of the plaintext and each $K_i$ is a “shift” value for place $i$ in the plaintext.

Then the encryption is:

$C_i \equiv P_i + K_i \bmod 26$

The decryption is similar to earlier, invert by subtracting $K_i$ from each $C_i$, working mod $26$.

The situation is easiest when the $K_i$ follow a pattern or repeat themselves after so long. You can turn the numbers into letters and make a keyword (words are often easier to remember than strings of numbers).

Now the number of keys is variable, if your ciphertext has length $n$ then there are $26^n$ possible keys to check! So if your message is big, you have an impossible amount of keys to check.

Fortunately, or unfortunately (however you view it) there are statistical tools to help us when the keyword is repetitive. So this cipher can sometimes be broken but is still of average security.

If the shifts are generated at random (or say the keyword is taken from a long passage in a book) then effectively you have an “unbreakable cipher”, the so called one time pad. By today’s standards there is not much that can be done to break these ciphers without either trying every key (which is infeasible) or trying to intercept other plaintexts encrypted with the same long key (hence the name one time pad, the key should be used once and then discarded).

————————–

Ok, so we have seen a few classical ciphers briefly (and I am sure I will revisit them sometime in the future in more detail) but for now let’s move into more modern territory.

The thing about all of the classical ciphers mentioned above is that once Jane finds out what the key is, she is sure to be able to decipher the message and find out what it says. So the main security in these ciphers is in keeping the key private. Could there be a way to get around this?

Surprisingly enough, there is a way! This was the start of what is now called public key cryptography.

The basic idea is to have two keys, a public key (for encryption) which will be made available to everyone and a private key (for decryption) which will be kept secret. We never had this before, we only had the one key that served for both encryption AND decryption (these systems are known as symmetric). Our system will be asymmetric, so that we require the private key for decryption.

Here comes the real surprise, the private key will come from certain properties of the public key. This might sound completely strange, after all we are telling everyone the public key so surely they can construct the private key from this? Actually the answer is no and the reason why is pure and simple, some things in maths are easy to do one way but hard to do in the other way.

We will base the method of finding the private key from the public on a trapdoor function, a mathematical procedure that is easy to perform one way but hard to invert.

Let us get to the aim of this post now, the beautiful RSA cipher. This was one of the first public key ciphers to be made and is still one of the most secure! It was created in the 1970’s by three mathematicians/cryptographers/computer scientists; Rivest, Shamir and Adleman (the initials make RSA). Actually I have just told a lie, this system was first invented under top secret conditions by British mathematician Clifford Cocks, whilst working for GCHQ intelligence (but first public knowledge of this system came from the three people mentioned above).

Ok so how does it work?

1) Bob chooses two different prime numbers $p$ and $q$ and forms the number $N = pq$.

2) He also forms the number $(p-1)(q-1)$ and chooses an integer $e>1$ that is coprime to $(p-1)(q-1)$. The reason for this will be explained in a minute.

3) The pair of numbers $(e,N)$ are made public for all to see. This pair form the public key for Bob.

4) Bob can use basic number theory (i.e. Euclids algorithm) to find an integer $d$ such that $ed \equiv 1 \bmod (p-1)(q-1)$. This integer $d$ forms the private key for Bob and is kept secret.

5) Say Alice wants to send a plaintext message $m$ to Bob (assume the message is converted to a number between $0$ and $N$, there are nice ways to do this). She looks up his public key and sends Bob the value of $c \equiv m^e \bmod N$.

6) Bob receives the value $c$ and can find the original message by calculating $c^d \bmod N$ (using his private key).

Ok so first we ask why working out $c^d \bmod N$ gives the original
plaintext. Well this is really none other than Fermat’s little theorem in action (or perhaps it is better to call this a use of Lagrange’s theorem in group theory, but i’ll do it using Fermat and let you spot easier way to do it using group theory).

We see that:

$c^d \equiv m^{ed} \equiv m^{k(p-1)(q-1)+1} \equiv m^{k(p-1)(q-1)}m \bmod N$

From this we deduce that:

$c^d \equiv (m^{p-1})^{k(q-1)}m \equiv m \bmod p$

and similarly:

$c^d \equiv m \bmod q$

Both using Fermat’s last theorem. The Chinese remainder theorem now tells us that $c^d \equiv m \bmod N$, so we really do get the plaintext message back!

————————–

Ok so let’s put ourselves in Jane’s shoes. Say she intercepts $c$ somehow. At this moment in time it is not known, mathematically, if there is any way to find $m$, other than to find the value of the private key $d$ first.

But to be able to find $d$ in a plausible amount of time you have to know the value of $(p-1)(q-1)$, and to know that you have to know the values of $p$ and $q$. However, Jane only knows the value of $N=pq$ so cannot know the values of the two primes without factorising $N$.

I am sure you are probably questioning now whether this system really is secure (for example if I chose $N = 15$ as part of my public key, I am sure you could easily find the values of the two primes I used and so could break the cipher easily).

The one detail I have left out so far is that the primes you choose need to be big. You see, a computer can easily multiply two big prime numbers together but if the primes are big enough, it is infeasible (with today’s methods) to break the number back down into the two primes.

The kind of primes used by RSA would typically have over 300 digits, making factorisation unbelievably difficult (modern supercomputers can only manage around 250 digits with lots of time needed to factorise). We know, by Euclid, that there are infinitely many primes and also we are able to generate them faster than we are able to factorise such $N$’s, so the system is secure…until someone is able to develop better and better algorithms for integer factorisation or if someone finds a fault with the system (like a shortcut that gets $m$ without needing $d$).

Of course, every system has its weaknesses and I shall be highlighting a few in the next post. The RSA system is a truly beautiful use of number theory in the modern world. It underpins a lot of internet/banking security amongst other things.