Pseudo-Random Number Generators (PRNG)
Often the default random function in whatever language is not cryptographically secure, making it possible to predict values
Last updated
Often the default random function in whatever language is not cryptographically secure, making it possible to predict values
Last updated
import random
= Mersenne TwisterThe default Python random
module is very fast, but not very secure. If the application allows you to get 624 or more random values from it, you can crack the random seed used by the generator to predict future values. You can also use this to know values that are generated by the application, but not shown to you, like generating a key right after showing you 624 random values.
The mersenne-twister-predictor library is a nice Python implementation of this attack. It can recreate the internal state of Python's random
module, after submitting 624 32-bit values. Then the regular python random
interface is available to make perfect predictions for the future:
I made a writeup of a challenge where you had to crack the seed after 624 random values, to generate the same key as the application would encrypt the flag:
Note that this library is not limited to 32-bit numbers, larger numbers divisible by 32 are also possible. These are made up of multiple smaller 32-bit numbers, and thus also require fewer total samples because each one gives more information:
When the code you are trying to break leaks information after the numbers you want to retrieve, we need to predict the past rather than the future. Luckily this is fairly easy by just implementing an algorithm that untwists the internal state so that we can even reach to before the state was leaked. The following functions can be added to mt19937predictor.py
MT19937Predictor
class to add an interface for offsetting the state by any amount, even backward.
Imagine the following scenario where we want to recover the unknown
values:
When samples are smaller than 32 bits, or less than 624 samples, you won't have enough to perfectly recreate the internal state. However, it turns out that statement solvers like Z3 that use symbolic execution can get constraints set to the samples, and then solve for the seed. This process will be a lot slower, but it may be your only option.
The symbolic_mersenne_cracker repository implements an easy-to-use symbolic solver using Z3 where you can feed it partial samples with for example 16 bits instead of 32, which it then solves and gives a synchronized random
instance. It works by submitting binary strings with ?
question marks for the unknown bits.
Tip: To solve a scenario where the solution isn't very simple like with getrandbits()
, you can view the source code of the random
module yourself or for going deeper even the C implementation. Then figure out what parts of getrandbits
your function uses.
Math.random()
= xorshift128+Javascript has a few variants per browser. Chrome, Firefox, and Safari all do slightly different things when it comes to generating random numbers. But Chrome uses V8 for JavaScript, and so does NodeJS. This makes it the biggest target.
The Math.random()
function is not cryptographically secure, and with about 5 random numbers from it, one can crack the random state and predict future values.
PwnFunction made a great video explaining the attack and published a Python script that uses the Z3 solver to solve the random state and predict future numbers.
A more practical example is for situations where the leaked bits are truncated (eg. floored) to some smaller number of bits per sample. This could be for a simple activation code generator:
This code only leaks some part of the random output as it is a rounded decimal, but with this limited information, we can still efficiently solve the state as found in another piece of research with the tool below. These solutions need more samples, around 15 to be consistent. Then the solver can be run again with these inputs to predict past and future values.
See the README.md for more details, as the 64-wide cache from v8 makes it slightly confusing:
java.util.Random()
= Linear Congruential GeneratorJava's random number generator uses a Linear Congruential Generator (LCG). These generators are really fast but also really insecure. The internal state is only 48 bits long and can be recovered with only a few samples. For example, it doesn't help that the int
output values from the generator are simply the first 32 bits of the state, allowing you to brute-force the last 16 bits if you have another sample you can compare with.
This idea is explained here and implemented well for a few functions that give a lot of information:
For attacking general LCGs, see this writeup which shows a few different techniques depending on how much of the constants are known
The above method is possible because you get a big part of the internal state in one single number (like and int
). But in some cases, you can't get all this information, only small numbers that are generated using a range. This is where another more involved method comes in that can analyze the patterns between numbers in order to recover the internal state.
Getting a number from the java.util.Random()
generator works by truncating the state (seed) with a number of bits. If you need a 32-bit number, the first 32 bits of the seed are given and the seed rotates for the next time. If you need an 8-bit number the first 8 bits are given and the seed is rotated again. This means that using 8-bit numbers you only get a small part (top 8 bits) of the seed, and then it already rotates to the next seed. That is the problem we are trying to solve.
It turns out this can be done using LLL Reduction, a highly useful algorithm in cryptanalysis. The math of this all is pretty complex, but implementations already exist allowing you to use them. This gist has a few different methods for cracking general LCGs and can be applied to Java specifically by only changing the constants.
I made another gist that is specific to Java's Random()
class and has an easy-to-use CLI.
$RANDOM
The bash
shell has a dynamic variable called $RANDOM
you can access at any time to receive a random 15-bit number:
To seed this random number generator, it can be set directly to get the same values every time:
There are 2 different calculations depending on your bash version, which may make one seed give two different outputs.
The algorithm works by iterating a 32-bit integer internal seed every time you access it. This means that if you can sync up with the seed, you can predict all future values of the variable. Luckily, the calculations the algorithm performs are very fast meaning it is easy to try every possible 32-bit seed and compare the results with your expected values.
I looked at the bash source code to find out how it exactly works, and created a tool that uses the above idea to brute-force every seed in only a few seconds, supporting both bash versions: