RSA
An encryption standard using prime number factorization to encrypt and decrypt with an asymmetric keypair
Description
Symbols
n: Modulus, part of public keypandq: The prime factors ofne: Exponent, part of public keyc: Ciphertext, result after encryptingd: Decryption exponent, part of the private keym: Message, plaintextor
phi: Decryption modulus
Equations
, where and are distinct primes.
such that
, where
Python
RsaCtfTool has a lot of attacks built-in for common challenges. See test.sh and the test() functions in the attacks on their GitHub for lots of examples of what inputs an attack needs
Common options
--private: Display private key if recovered, should always be included--attack: Specify the attack modes (default: all)-n N: Specify the modulus (format: int or 0xhex)-e E: Specify the public exponent, using commas to separate multiple exponents (format: int or 0xhex)--uncipher UNCIPHER: Uncipher a ciphertext, using commas to separate multiple ciphers--publickey PUBLICKEY: public key file. You can use comma-separated or wildcards (*or?) for multiple keys.
Attacks
A collection of how to exploit attacks on specific RSA cases
Private key from Public key
This automatically tries many common vulnerabilities in private key generation to guess it:
Factoring manually
The whole security of RSA comes from the difficulty of finding the private factors p and q that multiply to the public n. With huge numbers generated completely randomly, this task is impossible for today's computers. But for smaller numbers or numbers with specific patterns this task may become doable. The method above tries many patterns of primes, but if the primes are small we can try to factor them ourselves.
First, a good idea is to check if this hasn't been done already. FactorDB has a giant database of already factored numbers, some of which are surprisingly big. It does not contain every computable number though, so sometimes you'll want to do it manually.
One tool that does this very quickly and efficiently is yafu (install):
Small exponent, short plaintext (root)
With a small exponent, the plaintext (m) will not be very large after exponentiation. Then after the modulus n is applied only a few iterations of k will be done, or even none if . This means that we can just iterate over k until we find a perfect integer root.
A good indicator of this is when c is significantly smaller than n. Here's an example where e=3, resulting in the equation we can brute-force:
Chinese Remainder Theorem (CRT)
The attack using the Chinese Remainder Theorem is a more powerful version of the idea from above. Instead, it works when , and you have amount of different 's and 's with the same plaintext. So often when the message isn't too long (no padding) and you have multiple ciphertexts and public keys, you can use this attack.
Let's say that in RSA you use (can be any exponent just requires more samples). Then you would need 3 ciphertext and public key examples. The equations for this would look like this:
Then we can use the idea of CRT which says using:
where you know and , you can find efficiently. In the case of RSA, this would be , and then we can simply get the cube root to find and we have cracked the message. (source)
An example implementation for this attack would be:
Here get_encrypted() is a simple RSA implementation with a high enough e=257 that a simple root of the ciphertext won't work. But using the CRT you can get 257 different samples and compute x, to finally get m.
This script runs for about 40 seconds, but for e=65537 and 1024-bit primes, it would take about 10 hours. The biggest bottleneck here is generating the primes by the "server", as this can take around a second for 1024-bit primes. When we need 65537 samples this really adds up, but in a real-world scenario, 10 hours is very doable.
Coppersmith's Attack
The small exponent attack explained in the earlier root section only works when the plaintext is short. That is why there is another attack that requires any of the following information:
A part of the plaintext (Stereotyped Messages)
High bits of either
pandqprimes (writeup)
The technique involves quite a bit of math, being a result of Lattice Reduction (LLL). A great page with some details and resources about the specifics of the Coppersmith's attack is this GitHub repository by ashutosh1206.
Stereotyped Messages
For this type of attack, we need to know a part of the start of the plaintext. An example of such a challenge would be the following (also see this writeup):
To verify we can use this attack to efficiently recover the plaintext we need to make sure that: . Where this difference is between the plaintext and your guess of the plaintext. If you know a significant part of the start of a plaintext this difference will be small enough to satisfy the condition. It also means that the smaller e is, the larger the upper bound for the difference, and the less plaintext we need to know. We can do a sanity check in Python like this:
In this example, the XXX is small enough that we can use this attack to efficiently recover the plaintext. We can use SageMath to do some mathematical magic and compute possible differences, which is Coppersmith's attack. We can even try multiple lengths of the unknown text because this attack only takes a few seconds:
Euclidean Algorithm
The Euclidean Algorithm is originally an algorithm for efficiently computing the Greatest Common Divisor (GCD) for two numbers. If this answer is 1, it means the numbers don't share any factors, which may be important in some cryptosystems.
There is also the Extended Euclidean Algorithm that can do a lot more. It can find two new numbers, that when multiplied with their respective numbers and added, equal the greatest common divisor. This is especially useful when the numbers are coprime, meaning the GCD is equal to 1. Then you can rearrange the equation to have something useful in modular arithmetic. In the example below, and would be the inputs, and the Extended Euclidean Algorithm finds and :
This last line is how we get the multiplicative inverse in RSA, used to generate the value of d. But in custom schemes similar to RSA, this may be exploitable. When working with it may be possible to reduce some arguments to 1 like above.
Reverse of Modulo Multiplication
Here is an example of using this knowledge to break a flawed cryptosystem. Let's say it uses two random numbers that multiply together, and after the modulus is applied, you get the answer. In this example you have only one of the factors, the modulus, and the result, you want to calculate the other factor that was used.
In normal arithmetic without a modulus, this would be easy. Just divide the answer by the factor you know, to get the other one. When in modular arithmetic, this is a bit harder. The answer might have wrapped around to another iteration of the modulus. With big numbers, just brute-forcing this takes way too long. That is where the Extended Euclidean Algorithm comes in. As explained in this math exchange answer, it tells us we can get a number that when multiplied with , becomes . So in our original equation, we can just multiply by this to get a nice equation for calculating with only variables we know:
So this finally means we have an efficient way of calculating the other factor we wanted. In code, it would look something like this:
Redacted Private Key
Sometimes you'll find part of a private key or a partially redacted screenshot for example. In some cases, there is still enough information in the redacted key that we can recover the entire private key. A good example is this writeup from Cryptohack.
Often you'll find the private key in the PEM format:
This format encodes a few numbers that RSA uses in the following order:
When the key is decoded from Base64, the raw data is split by ASN.1 headers. These differ per private key but are in a simple format. For example 02 82 01 01:
02: The data type: Integer82: Meaning the length of the encoded integer value will be stored in the following 2 bytes0101: The length of the integer value that follows. Taking this value (257) and reading the next 257 from Big Endian results in the number
You can then search the redacted private key for these header values and find parts of the private key. Then after writing down every number you have found, you can try to use the RSA Equations to calculate or brute-force unknown values.
You might find the value and q, but no n as seen in the writeup linked above. In that case, we know some equations, and we can find p having only one unknown:
Here meaning we can easily brute-force it until we get a valid p that is prime.
Now we have p and q, and can easily multiply them to get n. We can also now calculate d from p and q to get the decryption key.
If you get multiple possible values for p, you could use some partial numbers in the redacted private key to verify each possible value. If you have some bits of the n for example, you could calculate n and then verify it with the known bits.
No modulus (n)
n)If exponent e is small, the ciphertext might not have wrapped around with the modulus and you can just root it to get the original plaintext. You can verify this by checking if the ciphertext is a perfect root.
Known plaintext
If you have 2 plaintext-ciphertext pairs, you can recover N with some math:
Here the last two equations both have n multiplied by some different k. The product of can be easily calculated as the difference between and , which we know from the plaintext-ciphertext pairs. This means we know two different numbers, which have n as a factor. We can use the Greatest Common Divisor (GCD) to calculate this common factor n. Sagemath has a very fast implementation for the power and GCD:
2 keys: Same n, same ciphertext, different e
Multiple keys with common factors
Converting keys
Formats of RSA keys can be tricky. Here are a few common ways to convert public/private keys into other useful formats
n and e to any file format
n and e to any file formatAny file format to n and e
n and eLast updated

