Z3 Solver
The Z3 Theorem Prover can automatically solve puzzles in Python
Description
The Z3 Theorem Prover is a Python library that can automatically solve puzzles you give it in code.
The idea is that you define Z3 variables and perform certain operations on them. Then you can add constraints to the solver and lets it fulfill those constraints. To learn using Z3 I highly suggest looking up some random puzzles and trying to solve them with Z3. There are lots of useful functions in Z3 to try out.
One example Z3 would be good at is math equations. It can for example solve a quadratic equation like :
Note that the operations and constraints don't need to be in a specific order. You can call s.add()
any time to add a constraint with the current variables
By default, Z3 will only try to find one solution, but you can also let it find all the solutions by just adding a constraint to say that it can't use that same solution again, and then letting it solve again:
Variable types
Int(name)
: An integer value, without fractionsReal(name)
: A real number, with fractionsBool(name)
: A boolean value, True or FalseBitVec(name, bits)
: "Bit Vector", a collection of bits forming a number. Useful for working with bytes/integers that wrap around. See Bitwise Operations for details
Almost all variable types can also be defined as multiple at once by appending an s
to the function name. For example:
Bools('a b c d e')
Functions
And(*args)
: All conditions in this function must be metOr(*args)
: Any condition in this function must be metNot(a)
: Condition invertsXor(a, b)
: Performs the Exclusive OR operation on the two values. Only one of the two can be true.LShR(n, b)
: Logical right shift (abcd
->0abc
) The normal>>
operator doesabcd
->aabc
instead. For clarity about all these shifts see StackOverflowDistinct(*args)
: All values need to be different from each other
The rest of the operations should be available just using the normal arithmetic operators in python, like +
, -
, *
, /
or **
.
Also the bitwise operators: <<
, >>
, &
, |
And finally, the comparison operators: ==
, !=
, <
, >
, <=
, >=
Logic gates
Logic gates are used everywhere in computers. One common problem is finding out what input leads to a given output. You could try to reverse this by hand by going through the circuit backwards but this is often very tedious. Luckily Z3 can do this for us.
It has support for Bool
values that are either ON or OFF, just like a regular logic circuit. Then we can use functions like And
, Or
, Not
and Xor
to recreate the circuit in Z3, and finally, add the last output to the solver as a constraint. This way Z3 will find a value for all the input booleans that make the output true. See an example of a script that does this for the Google Beginners CTF:
Bitwise Operations
When reverse engineering a low-level cryptographic algorithm you're often looking at bitwise operations like shifts, XORs and multiplication or addition which wrap. This is difficult to do cleanly in plain Python, but Z3 can help us out with the BitVec
and BitVecVal
constructors to create variables that behave like n-bit numbers, allowing easy bitwise operations and solving.
With signed and unsigned numbers and varying bits, this can be a bit tricky to get right. Most operators work as you would expect:
+
,-
: Add and subtract as unsigned numbers, wrapping on overflow&
,|
,^
: AND, OR and XOR operations on each bit of both numbers~
: Invert all bits of one number*
: Multiply, works like adding multiple times
With BitVec
's, there are a few edge cases, however. Namely, some operators perform signed versions by default. This means the first bit of the number represents the sign of the decimal number, and is not always the desired behaviour. This is an especially large pitfall for the >>
shift right operator which you might expect to shift right and fill bits on the left with 0's, but instead, it will be filled with the sign (first) bit!
Not only >>
is a victim of this, but also other operators like /
divide and %
modulus. Even comparison operators like <
and >
do a signed comparison by default. Luckily, there are built-in replacements that do the unsigned version instead. For >>
, for example, there is LShR()
. Here are some examples of performing specific bitwise operations to explain their differences:
x << 1
Shift all bits to the left, discarding the leftmost and filling empty bits with 0's
11110001
11100010
x >> 1
Shift all bits to the right, discarding the rightmost and filling empty bits with the sign bit (leftmost)
10001111
11000111
LShR(x, 1)
Shift all bits to the right, discarding the rightmost and filling empty bits with 0's
10001111
01000111
RotateLeft(x, 1)
Shift all bits to the left, and wrap the leftmost bit back to the right
11100111
11001111
RotateRight(x, 1)
Shift all bits to the right, and wrap the rightmost bit back to the left
11100111
11110011
x / 2
Divide signed number, keeping the sign bit
10110000
11011000
UDiv(x, 2)
Divide unsigned number
10110000
01011000
All these operations can be used on BitVec
(variable) and BitVecVal
(constant) numbers. If you want to be able to follow how a constant changes and to make sure it follows n-bit operations, wrap it with BitVecVal
. This tells Z3 about the number and all operations will behave as explained above.
Solving cryptographic functions
Some bad implementations of cryptographic functions may have vulnerabilities that allow you to leak data. It might be hard to find these vulnerabilities yourself by looking at the code, so sometimes you can implement the algorithm in Z3 to check if it can be broken somehow.
Snippets
Some small but useful pieces of Z3 code that are common across scripts.
For more practical examples, see this repository:
CrossHair: RegEx and more
This testing framework is intended to prove statements you make in a Python docstring. It will use Z3 to try and find counterexamples for edge cases. These are useful to create functions that behave as expected, but also useful as a security researcher to find edge cases.
It is similar to the use of Z3 to solve statements, but much more flexible as it can directly integrate with the Python source code without having to be rewritten, which may change logic in the process. You can use it by defining pre:
conditions it should expect, and post:
conditions for it to disprove. You can play around with it on the live demo, here is an example (_
= return):
error: false when calling
make_bigger(-10)
(which returns-10
)
Here it finds the edge case where n
is negative enough that the multiplication outweighs the addition, making the implied effect of always returning a larger value false. We could fix the code, or add a pre: n >= 0
line before the post:
to tell CrossHair that the input value should never be negative during analysis.
The tool can be installed and run easily from the command line:
The check
command has a fairly small default timeout per condition, but it can be increased by setting the --per_condition_timeout
argument:
Another useful feature for finding differences in functions is the diffbehavior
tool. It takes two functions and compares the behavior of the two. Here, a refactor made an unrecognized response return None
instead of False
:
Also, see cover
for a tool that can automatically generate test cases for all code paths!
Regular Expressions
One of the biggest improvements on Z3 is the fact that it understands regular expressions and that it can solve statements involving them to look for edge cases or bypasses.
If we have a regular expression for which we want to find any valid string, we can simply tell CrossHair there is none and it will try to find a counterexample:
error: false when calling
simple_regex('abdd')
(which returns<...>
)
For a more complex example, it can find multiple conditions at once. Think of a first condition as passing through the checks and a second condition as being exploitable.
error: false when calling
intersect('acdddd')
(which returns<...>
)
Examples
Some examples of security research use cases to find edge cases and bypasses.
First, an RFC-compliant regex that shows it's possible to inject <
characters when surrounding the name with "
quotes:
Another shorter example where the IPv6 address allows any special characters:
When using multi-line regexes, it finds it is possible to bypass a $
restriction with a newline:
Taken from a real CTF challenge where the password was given as a regular expression that should be matched. Only one string would be able to match, and it would be the password. This tool can solve it without having to do any reverse engineering!
Last updated