Z3 Solver

The Z3 Theorem Prover can automatically solve puzzles in Python

Description

The Z3 Theorem Prover is a Python library that can automatically solve puzzles you give it in code.

$ pip install z3-solver

The idea is that you define Z3 variables and perform certain operations on them. Then you can add constraints to the solver and lets it fulfill those constraints. To learn using Z3 I highly suggest looking up some random puzzles and trying to solve them with Z3. There are lots of useful functions in Z3 to try out.

One example Z3 would be good at is math equations. It can for example solve a quadratic equation like 6x2+11xโˆ’35=06xยฒ + 11x - 35 = 0:

from z3 import *

s = Solver()

# Define variables
x = Real('x')

# Define operations (an equation in this case)
y = 6*x**2 + 11*x - 35

# Define constraints
s.add(y == 0)
# s.add(6*x**2 + 11*x - 35 == 0)  # Also works

if s.check() == sat:  # If satisfiable
    print(s.model())  # [x = 5/3]

Note that the operations and constraints don't need to be in a specific order. You can call s.add() any time to add a constraint with the current variables

By default, Z3 will only try to find one solution, but you can also let it find all the solutions by just adding a constraint to say that it can't use that same solution again, and then letting it solve again:

while s.check() == sat:  # While satisfiable
    m = s.model()
    print(m)  # [x = 5/3], [x = -7/2]
    s.add(x != m[x])  # Exclude this solution

Variable types

  • Int(name): An integer value, without fractions

  • Real(name): A real number, with fractions

  • Bool(name): A boolean value, True or False

  • BitVec(name, bits): "Bit Vector", a collection of bits forming a number. Useful for working with bytes/integers that wrap around. See Bitwise Operations for details

Almost all variable types can also be defined as multiple at once by appending an s to the function name. For example:

Bools('a b c d e')

Functions

  • And(*args): All conditions in this function must be met

  • Or(*args): Any condition in this function must be met

  • Not(a): Condition inverts

  • Xor(a, b): Performs the Exclusive OR operation on the two values. Only one of the two can be true.

  • LShR(n, b): Logical right shift (abcd -> 0abc) The normal >> operator does abcd -> aabc instead. For clarity about all these shifts see StackOverflow

  • Distinct(*args): All values need to be different from each other

The rest of the operations should be available just using the normal arithmetic operators in python, like +, -, *, / or **. Also the bitwise operators: <<, >>, &, | And finally, the comparison operators: ==, !=, <, >, <=, >=

Logic gates

Logic gates are used everywhere in computers. One common problem is finding out what input leads to a given output. You could try to reverse this by hand by going through the circuit backwards but this is often very tedious. Luckily Z3 can do this for us.

It has support for Bool values that are either ON or OFF, just like a regular logic circuit. Then we can use functions like And, Or, Not and Xor to recreate the circuit in Z3, and finally, add the last output to the solver as a constraint. This way Z3 will find a value for all the input booleans that make the output true. See an example of a script that does this for the Google Beginners CTF:

Bitwise Operations

When reverse engineering a low-level cryptographic algorithm you're often looking at bitwise operations like shifts, XORs and multiplication or addition which wrap. This is difficult to do cleanly in plain Python, but Z3 can help us out with the BitVec and BitVecVal constructors to create variables that behave like n-bit numbers, allowing easy bitwise operations and solving.

With signed and unsigned numbers and varying bits, this can be a bit tricky to get right. Most operators work as you would expect:

  • +, -: Add and subtract as unsigned numbers, wrapping on overflow

  • &, |, ^: AND, OR and XOR operations on each bit of both numbers

  • ~: Invert all bits of one number

  • *: Multiply, works like adding multiple times

With BitVec's, there are a few edge cases, however. Namely, some operators perform signed versions by default. This means the first bit of the number represents the sign of the decimal number, and is not always the desired behaviour. This is an especially large pitfall for the >> shift right operator which you might expect to shift right and fill bits on the left with 0's, but instead, it will be filled with the sign (first) bit!

Not only >> is a victim of this, but also other operators like / divide and % modulus. Even comparison operators like < and > do a signed comparison by default. Luckily, there are built-in replacements that do the unsigned version instead. For >>, for example, there is LShR(). Here are some examples of performing specific bitwise operations to explain their differences:

OperationDescriptionExample

x << 1

Shift all bits to the left, discarding the leftmost and filling empty bits with 0's

11110001 11100010

x >> 1

Shift all bits to the right, discarding the rightmost and filling empty bits with the sign bit (leftmost)

10001111 11000111

LShR(x, 1)

Shift all bits to the right, discarding the rightmost and filling empty bits with 0's

10001111

01000111

RotateLeft(x, 1)

Shift all bits to the left, and wrap the leftmost bit back to the right

11100111

11001111

RotateRight(x, 1)

Shift all bits to the right, and wrap the rightmost bit back to the left

11100111

11110011

x / 2

Divide signed number, keeping the sign bit

10110000 11011000

UDiv(x, 2)

Divide unsigned number

10110000 01011000

All these operations can be used on BitVec (variable) and BitVecVal (constant) numbers. If you want to be able to follow how a constant changes and to make sure it follows n-bit operations, wrap it with BitVecVal. This tells Z3 about the number and all operations will behave as explained above.

Example
from z3 import *

s = Solver()

# 16-bit numbers
var = BitVec('var', 16)
const = BitVecVal(1000, 16)

const *= 2000
s.add(var == const)

if s.check() == sat:
    var = s.model()[var].as_long()
    print(var)  # 33920, not 2_000_000

Solving cryptographic functions

Some bad implementations of cryptographic functions may have vulnerabilities that allow you to leak data. It might be hard to find these vulnerabilities yourself by looking at the code, so sometimes you can implement the algorithm in Z3 to check if it can be broken somehow.

Snippets

Some small but useful pieces of Z3 code that are common across scripts.

Get all flags
# Define 20 characters as bytes
flag = [BitVec(f"flag_{i}", 8) for i in range(20)]

# Restrict to printable ASCII
for character in flag:
    s.add(character >= 0x20, character < 0x7e)

[...constraints...]

# Find all solutions, and print as string
while s.check() == sat:
    m = s.model()
    result = bytes([m[flag[i]].as_long() for i in range(len(flag))])
    print(result)

    s.add(Or([flag[i] != m[flag[i]] for i in range(len(flag))]))

For more practical examples, see this repository:

CrossHair: RegEx and more

This testing framework is intended to prove statements you make in a Python docstring. It will use Z3 to try and find counterexamples for edge cases. These are useful to create functions that behave as expected, but also useful as a security researcher to find edge cases.

It is similar to the use of Z3 to solve statements, but much more flexible as it can directly integrate with the Python source code without having to be rewritten, which may change logic in the process. You can use it by defining pre: conditions it should expect, and post: conditions for it to disprove. You can play around with it on the live demo, here is an example (_ = return):

def make_bigger(n: int) -> int:
    '''
    post: _ > n
    '''
    return 2 * n + 10

error: false when calling make_bigger(-10) (which returns -10)

Here it finds the edge case where n is negative enough that the multiplication outweighs the addition, making the implied effect of always returning a larger value false. We could fix the code, or add a pre: n >= 0 line before the post: to tell CrossHair that the input value should never be negative during analysis.

The tool can be installed and run easily from the command line:

$ python3 -m pip install crosshair-tool
...
$ crosshair check main.py
error: false when calling make_bigger(-10) (which returns -10)

The check command has a fairly small default timeout per condition, but it can be increased by setting the --per_condition_timeout argument:

$ crosshair check test.py --per_condition_timeout 999999

Another useful feature for finding differences in functions is the diffbehavior tool. It takes two functions and compares the behavior of the two. Here, a refactor made an unrecognized response return None instead of False:

def version1(s: str) -> bool:
    if s in ('y', 'yes'):
        return True
    return False

def version2(s: str) -> bool:
    if s in ('y', 'yes'):
        return True
    if s in ('n', 'no'):
        return False
$ crosshair diffbehavior test.version1 test.version2
Given: (s='z\x00\x00'),
  test.version1 : returns False
  test.version2 : returns None

Also, see cover for a tool that can automatically generate test cases for all code paths!

Regular Expressions

One of the biggest improvements on Z3 is the fact that it understands regular expressions and that it can solve statements involving them to look for edge cases or bypasses.

If we have a regular expression for which we want to find any valid string, we can simply tell CrossHair there is none and it will try to find a counterexample:

def simple_regex(s: str) -> bool:
    """
    post: not _  # We say: return value will always be False
    """
    return re.fullmatch(r"a(b|c)d{2,4}", s)

error: false when calling simple_regex('abdd') (which returns <...>)

For a more complex example, it can find multiple conditions at once. Think of a first condition as passing through the checks and a second condition as being exploitable.

def intersect(s: str) -> bool:
    """
    post: not _
    """
    return re.fullmatch(r"a(b|c)d{2,4}", s) and \
           re.fullmatch(r"a(c|d)d{4,10}", s)  # Extra condition

error: false when calling intersect('acdddd') (which returns <...>)

Examples

Some examples of security research use cases to find edge cases and bypasses.

First, an RFC-compliant regex that shows it's possible to inject < characters when surrounding the name with " quotes:

Email address XSS
# Source: RFC-compliant - https://stackoverflow.com/a/201378/10508498
def is_email(s: str):
    """
    We tell it it's impossible for this regex to let through a "<"
    post: not (_ and "<" in s)
    """
    return bool(re.fullmatch(r"""(?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*|"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9]))\.){3}(?:(2(5[0-5]|[0-4][0-9])|1[0-9][0-9]|[1-9]?[0-9])|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])""", s))
$ crosshair check main.py --per_condition_timeout 999999
error: false when calling is_email('"<"@0.0') (which returns True)

Another shorter example where the IPv6 address allows any special characters:

2nd email address XSS
# Source: short-hand version - https://stackabuse.com/validate-email-addresses-with-regular-expressions-in-javascript/
def test(s: str) -> bool:
    """
    post: not (_ and "<" in s)
    """

    return bool(re.fullmatch(r"([!#-'*+/-9=?A-Z^-~-]+(\.[!#-'*+/-9=?A-Z^-~-]+)*|\"(\[\]!#-[^-~ \t]|(\\[\t -~]))+\")@([!#-'*+/-9=?A-Z^-~-]+(\.[!#-'*+/-9=?A-Z^-~-]+)*|\[[\t -Z^-~]*])", s))
$ crosshair check main.py --per_condition_timeout 999999
error: false when calling is_email('?@[<]') (which returns True)

When using multi-line regexes, it finds it is possible to bypass a $ restriction with a newline:

def multiline(s: str) -> bool:
    """
    pre: s
    post: not (_ and "<" in s)
    """
    return bool(re.match("^a(b|c)d$", s, re.MULTILINE))
$ crosshair check main.py --per_condition_timeout 999999
error: false when calling multiline('abd\n<') (which returns True)

Taken from a real CTF challenge where the password was given as a regular expression that should be matched. Only one string would be able to match, and it would be the password. This tool can solve it without having to do any reverse engineering!

def challenge(s: str) -> bool:
    """
    post: not _
    """
    return bool(re.match("(?:(?=[^\u6e0d-\u8ffb])[\x66]){0}?(?:(?=[^\u2f60-\u4bb9])[\x66]){0}?(?:(?=[^\ufcb8-\ufcc1])[\x75]){0}?(?:(?=[^\u7f87-\u99aa])[\x61]){0}?(?:(?=[^\u05c7-\ubcf7])[\x49]){1}?(?:(?=[^\ufa88-\ufc28])[\x65]){0}?(?:(?=[^\u9a98-\uc554])[\x76]){0}?(?:(?=[^\uf84d-\ufdd6])[\x70]){0}?(?:(?=[^\uf5e0-\uf711])[\x6e]){0}?(?:(?=[^\ufa45-\ufbeb])[\x61]){1}?(?:(?=[^\uf0ca-\uf28f])[\x73]){0}?(?:(?=[^\ue189-\uf7cb])[\x7a]){0}?(?:(?=[^\u2998-\u7c8b])[\x70]){0}?(?:(?=[^\u5fa8-\ufbb6])[\x6c]){0}?(?:(?=[^\ufef9-\uffa6])[\x4d]){1}?(?:(?=[^\ub312-\ueb5f])[\x6d]){0}?(?:(?=[^\u32bc-\ue435])[\x6d]){0}?(?:(?=[^\u45b2-\u736c])[\x6e]){0}?(?:(?=[^\u372d-\u96b1])[\x71]){0}?(?:(?=[^\ubeac-\uca7e])[\x74]){0}?(?:(?=[^\u9207-\ua598])[\x61]){1}?(?:(?=[^\ua32a-\uc32e])[\x63]){0}?(?:(?=[^\u10e2-\ufc58])[\x66]){0}?(?:(?=[^\ua3ff-\uc711])[\x7a]){0}?(?:(?=[^\u32b6-\u5fca])[\x77]){0}?(?:(?=[^\u3942-\ue7d8])[\x6d]){0}?(?:(?=[^\ud2dc-\uf3d7])[\x4d]){1}?(?:(?=[^\u7881-\ub2aa])[\x71]){0}?(?:(?=[^\u3173-\ub6b8])[\x74]){0}?(?:(?=[^\ua582-\ue3e7])[\x63]){0}?(?:(?=[^\u1f30-\u4a71])[\x7a]){0}?(?:(?=[^\ue799-\uf0ce])[\x65]){0}?(?:(?=[^\u6618-\u96f4])[\x64]){0}?(?:(?=[^\uc2dd-\uc3d3])[\x71]){0}?(?:(?=[^\ud05b-\ue4fc])[\x65]){0}?(?:(?=[^\udf4b-\ueec5])[\x61]){1}?(?:(?=[^\ue2aa-\uf6f6])[\x72]){0}?(?:(?=[^\u4da9-\uc4d6])[\x6e]){0}?(?:(?=[^\u7e7b-\ubb07])[\x69]){0}?(?:(?=[^\uc718-\uff10])[\x6d]){0}?(?:(?=[^\u6c84-\uac27])[\x6c]){1}?(?:(?=[^\ua4a0-\uf819])[\x66]){0}?(?:(?=[^\ue594-\uee75])[\x63]){0}?(?:(?=[^\uf8ca-\ufb79])[\x66]){0}?(?:(?=[^\u51a2-\u5817])[\x7a]){0}?(?:(?=[^\ucfd8-\uea6a])[\x6f]){0}?(?:(?=[^\u3118-\ud5d2])[\x6d]){0}?(?:(?=[^\uec3e-\ufdfd])[\x6f]){0}?(?:(?=[^\u0fb9-\u8106])[\x4c]){1}?(?:(?=[^\ud516-\udca6])[\x6f]){0}?(?:(?=[^\u24a3-\u8174])[\x6a]){0}?(?:(?=[^\u6110-\ueacf])[\x6b]){0}?(?:(?=[^\uf3be-\uf70f])[\x63]){0}?(?:(?=[^\u863c-\uedd6])[\x6b]){0}?(?:(?=[^\u1918-\ued3b])[\x70]){0}?(?:(?=[^\uccde-\udf61])[\x7a]){0}?(?:(?=[^\ub02e-\ue007])[\x61]){1}?(?:(?=[^\ue823-\uf2b1])[\x72]){0}?(?:(?=[^\u3493-\ub3d4])[\x74]){0}?(?:(?=[^\ue507-\ufc8a])[\x70]){0}?(?:(?=[^\ue249-\uf8b6])[\x72]){0}?(?:(?=[^\u9eb1-\ue0ed])[\x6e]){0}?(?:(?=[^\u8a39-\uefa3])[\x72]){1}?(?:(?=[^\u998b-\u9d4d])[\x74]){0}?(?:(?=[^\uf87c-\ufcd2])[\x72]){0}?(?:(?=[^\u4054-\u5fc4])[\x67]){0}?(?:(?=[^\ufbc4-\ufe79])[\x62]){0}?(?:(?=[^\uf57f-\uf6a1])[\x6c]){0}?(?:(?=[^\u5030-\u64bc])[\x6c]){0}?(?:(?=[^\u2371-\u4ee7])[\x44]){1}?(?:(?=[^\ud132-\ue943])[\x71]){0}?(?:(?=[^\ubef9-\uea29])[\x79]){0}?(?:(?=[^\ub7fa-\ubfa1])[\x69]){0}?(?:(?=[^\u71ce-\u8b83])[\x70]){0}?(?:(?=[^\u691a-\u7279])[\x6a]){0}?(?:(?=[^\u9d53-\ub24b])[\x21]){1}?(?:(?=[^\ud30d-\uf213])[\x72]){0}?(?:(?=[^\u40c2-\udd0b])[\x78]){0}?(?:(?=[^\u94a0-\ud814])[\x6a]){0}?(?:(?=[^\u3fe0-\u91c7])[\x66]){0}?(?:(?=[^\u61db-\ud519])[\x62]){0}?", s))
$ crosshair check main.py --per_condition_timeout 999999
error: false when calling challenge('IaMaMalLarD!') (which returns True)

Last updated