9-Feb-2017: Symbolic execution

(The following text has been moved to the article at https://yurichev.com/tmp/SAT_SMT_DRAFT.pdf.)

The following blog post will be part of bigger article. Meanwhile, readers are highly advised to read short introduction into symbolic mathematics: https://github.com/dennis714/SAT_SMT_article/blob/89c46ca2d9040dd46b309043f4937b8e1d49c8ed/symbolic/main.tex.

XOR swap

There is a well-known counterintuitive algorithm for swapping two variables using XOR operation without any additional memory/register:

X=X^Y
Y=Y^X
X=X^Y

How it works? It would be better to construct an expression at each step of execution.

#!/usr/bin/env python
class Expr:
    def __init__(self,s):
        self.s=s
    
    def __str__(self):
        return self.s

    def __xor__(self, other):
        return Expr("(" + self.s + "^" + other.s + ")")

def XOR_swap(X, Y):
    X=X^Y
    Y=Y^X
    X=X^Y
    return X, Y

new_X, new_Y=XOR_swap(Expr("X"), Expr("Y"))
print "new_X", new_X
print "new_Y", new_Y


It would work, because Python dynamicaly typed language, so the function doesn't care what to operate on, numerical values, or Expr() class.

Here is result:

new_X ((X^Y)^(Y^(X^Y)))
new_Y (Y^(X^Y))

You can remove double variables in your mind (since XORing by a value twice will result in nothing). At new_X we can drop two X-es and two Y-es, and single Y will left. At new_Y we can drop two Y-es, and single X will left.

Change endianness

What does this code do?

mov     eax, ecx
mov     edx, ecx
shl     edx, 16
and     eax, 0000ff00H
or      eax, edx
mov     edx, ecx
and     edx, 00ff0000H
shr     ecx, 16
or      edx, ecx
shl     eax, 8
shr     edx, 8
or      eax, edx

In fact, many reverse engineers play shell game a lot, keeping track of what is stored where, at each point of time.

( Hieronymus Bosch -- The Conjurer )

Again, we can build equivalent function which can take both numerical variables and Expr() objects. We also extend Expr() class to support many arithmetical and boolean operations. Also, Expr() methods would take both Expr() objects on input and integer values.

#!/usr/bin/env python
class Expr:
    def __init__(self,s):
        self.s=s
    
    def convert_to_Expr_if_int(self, n):
        if isinstance(n, int):
            return Expr(str(n))
        if isinstance(n, Expr):
            return n
        raise AssertionError # unsupported type

    def __str__(self):
        return self.s

    def __xor__(self, other):
        return Expr("(" + self.s + "^" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __and__(self, other):
        return Expr("(" + self.s + "&" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __or__(self, other):
        return Expr("(" + self.s + "|" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __lshift__(self, other):
        return Expr("(" + self.s + "<<" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __rshift__(self, other):
        return Expr("(" + self.s + ">>" + self.convert_to_Expr_if_int(other).s + ")")

# change endianness
ecx=Expr("initial_ECX") # 1st argument
eax=ecx             #  mov     eax, ecx
edx=ecx             #  mov     edx, ecx
edx=edx<<16         #  shl     edx, 16
eax=eax&0xff00      #  and     eax, 0000ff00H
eax=eax|edx         #  or      eax, edx
edx=ecx             #  mov     edx, ecx
edx=edx&0x00ff0000  #  and     edx, 00ff0000H
ecx=ecx>>16         #  shr     ecx, 16
edx=edx|ecx         #  or      edx, ecx
eax=eax<<8          #  shl     eax, 8
edx=edx>>8          #  shr     edx, 8
eax=eax|edx         #  or      eax, edx

print eax


I run it:

((((initial_ECX&65280)|(initial_ECX<<16))<<8)|(((initial_ECX&16711680)|(initial_ECX>>16))>>8))

Now this is something more readable, however, a bit LISPy at first sight. In fact, this is a function which change endianness in 32-bit word.

By the way, my Toy Decompiler can do this job as well, but operates on AST (Abstract Syntax Tree) instead of plain strings: https://yurichev.com/writings/toy_decompiler.pdf.

Fast Fourier transform

I've found one of the smallest possible FFT implementations on reddit:

#!/usr/bin/env python
from cmath import exp,pi

def FFT(X):
    n = len(X)
    w = exp(-2*pi*1j/n)
    if n > 1:
        X = FFT(X[::2]) + FFT(X[1::2])
        for k in xrange(n/2):
            xk = X[k]
            X[k] = xk + w**k*X[k+n/2]
            X[k+n/2] = xk - w**k*X[k+n/2]
    return X

print FFT([1,2,3,4,5,6,7,8])


Just interesting, what value has each element on output?

#!/usr/bin/env python
from cmath import exp,pi

class Expr:
    def __init__(self,s):
        self.s=s
    
    def convert_to_Expr_if_int(self, n):
        if isinstance(n, int):
            return Expr(str(n))
        if isinstance(n, Expr):
            return n
        raise AssertionError # unsupported type

    def __str__(self):
        return self.s

    def __add__(self, other):
        return Expr("(" + self.s + "+" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __sub__(self, other):
        return Expr("(" + self.s + "-" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __mul__(self, other):
        return Expr("(" + self.s + "*" + self.convert_to_Expr_if_int(other).s + ")")

    def __pow__(self, other):
        return Expr("(" + self.s + "**" + self.convert_to_Expr_if_int(other).s + ")")
    
def FFT(X):
    n = len(X)
    # cast complex value to string, and then to Expr
    w = Expr(str(exp(-2*pi*1j/n)))
    if n > 1:
        X = FFT(X[::2]) + FFT(X[1::2])
        for k in xrange(n/2):
            xk = X[k]
            X[k] = xk + w**k*X[k+n/2]
            X[k+n/2] = xk - w**k*X[k+n/2]
    return X

input=[Expr("input_%d" % i) for i in range(8)]
output=FFT(input)
for i in range(len(output)):
    print i, ":", output[i]


FFT() function left almost intact, the only thing I added: complex value is converted into string and then Expr() object is constructed.

0 : (((input_0+(((-1-1.22464679915e-16j)**0)*input_4))+(((6.12323399574e-17-1j)**0)*(input_2+(((-1-1.22464679915e-16j)**0)*input_6))))+(((0.707106781187-0.707106781187j)**0)*((input_1+(((-1-1.22464679915e-16j)**0)*input_5))+(((6.12323399574e-17-1j)**0)*(input_3+(((-1-1.22464679915e-16j)**0)*input_7))))))
1 : (((input_0-(((-1-1.22464679915e-16j)**0)*input_4))+(((6.12323399574e-17-1j)**1)*(input_2-(((-1-1.22464679915e-16j)**0)*input_6))))+(((0.707106781187-0.707106781187j)**1)*((input_1-(((-1-1.22464679915e-16j)**0)*input_5))+(((6.12323399574e-17-1j)**1)*(input_3-(((-1-1.22464679915e-16j)**0)*input_7))))))
2 : (((input_0+(((-1-1.22464679915e-16j)**0)*input_4))-(((6.12323399574e-17-1j)**0)*(input_2+(((-1-1.22464679915e-16j)**0)*input_6))))+(((0.707106781187-0.707106781187j)**2)*((input_1+(((-1-1.22464679915e-16j)**0)*input_5))-(((6.12323399574e-17-1j)**0)*(input_3+(((-1-1.22464679915e-16j)**0)*input_7))))))
3 : (((input_0-(((-1-1.22464679915e-16j)**0)*input_4))-(((6.12323399574e-17-1j)**1)*(input_2-(((-1-1.22464679915e-16j)**0)*input_6))))+(((0.707106781187-0.707106781187j)**3)*((input_1-(((-1-1.22464679915e-16j)**0)*input_5))-(((6.12323399574e-17-1j)**1)*(input_3-(((-1-1.22464679915e-16j)**0)*input_7))))))
4 : (((input_0+(((-1-1.22464679915e-16j)**0)*input_4))+(((6.12323399574e-17-1j)**0)*(input_2+(((-1-1.22464679915e-16j)**0)*input_6))))-(((0.707106781187-0.707106781187j)**0)*((input_1+(((-1-1.22464679915e-16j)**0)*input_5))+(((6.12323399574e-17-1j)**0)*(input_3+(((-1-1.22464679915e-16j)**0)*input_7))))))
5 : (((input_0-(((-1-1.22464679915e-16j)**0)*input_4))+(((6.12323399574e-17-1j)**1)*(input_2-(((-1-1.22464679915e-16j)**0)*input_6))))-(((0.707106781187-0.707106781187j)**1)*((input_1-(((-1-1.22464679915e-16j)**0)*input_5))+(((6.12323399574e-17-1j)**1)*(input_3-(((-1-1.22464679915e-16j)**0)*input_7))))))
6 : (((input_0+(((-1-1.22464679915e-16j)**0)*input_4))-(((6.12323399574e-17-1j)**0)*(input_2+(((-1-1.22464679915e-16j)**0)*input_6))))-(((0.707106781187-0.707106781187j)**2)*((input_1+(((-1-1.22464679915e-16j)**0)*input_5))-(((6.12323399574e-17-1j)**0)*(input_3+(((-1-1.22464679915e-16j)**0)*input_7))))))
7 : (((input_0-(((-1-1.22464679915e-16j)**0)*input_4))-(((6.12323399574e-17-1j)**1)*(input_2-(((-1-1.22464679915e-16j)**0)*input_6))))-(((0.707106781187-0.707106781187j)**3)*((input_1-(((-1-1.22464679915e-16j)**0)*input_5))-(((6.12323399574e-17-1j)**1)*(input_3-(((-1-1.22464679915e-16j)**0)*input_7))))))

We can see subexpressions in form like $x^0$ and $x^1$. We can eliminate them, since $x^0=1$ and $x^1=x$. Also, we can reduce subexpressions like $x \cdot 1$ to just $x$.

    def __mul__(self, other):
        op1=self.s
        op2=self.convert_to_Expr_if_int(other).s

        if op1=="1":
            return Expr(op2)
        if op2=="1":
            return Expr(op1)

        return Expr("(" + op1 + "*" + op2 + ")")

    def __pow__(self, other):
        op2=self.convert_to_Expr_if_int(other).s
        if op2=="0":
            return Expr("1")
        if op2=="1":
            return Expr(self.s)

        return Expr("(" + self.s + "**" + op2 + ")")
0 : (((input_0+input_4)+(input_2+input_6))+((input_1+input_5)+(input_3+input_7)))
1 : (((input_0-input_4)+((6.12323399574e-17-1j)*(input_2-input_6)))+((0.707106781187-0.707106781187j)*((input_1-input_5)+((6.12323399574e-17-1j)*(input_3-input_7)))))
2 : (((input_0+input_4)-(input_2+input_6))+(((0.707106781187-0.707106781187j)**2)*((input_1+input_5)-(input_3+input_7))))
3 : (((input_0-input_4)-((6.12323399574e-17-1j)*(input_2-input_6)))+(((0.707106781187-0.707106781187j)**3)*((input_1-input_5)-((6.12323399574e-17-1j)*(input_3-input_7)))))
4 : (((input_0+input_4)+(input_2+input_6))-((input_1+input_5)+(input_3+input_7)))
5 : (((input_0-input_4)+((6.12323399574e-17-1j)*(input_2-input_6)))-((0.707106781187-0.707106781187j)*((input_1-input_5)+((6.12323399574e-17-1j)*(input_3-input_7)))))
6 : (((input_0+input_4)-(input_2+input_6))-(((0.707106781187-0.707106781187j)**2)*((input_1+input_5)-(input_3+input_7))))
7 : (((input_0-input_4)-((6.12323399574e-17-1j)*(input_2-input_6)))-(((0.707106781187-0.707106781187j)**3)*((input_1-input_5)-((6.12323399574e-17-1j)*(input_3-input_7)))))

Cyclic redundancy check

I've always been wondering, which input bit affects which bit in the final CRC32 value.

From CRC theory (good and concise introduction: http://web.archive.org/web/20161220015646/http://www.hackersdelight.org/crc.pdf) we know that CRC is shifting register with taps.

We will track each bit rather than byte or word, which is highly inefficient, but serves our purpose better:

#!/usr/bin/env python
import sys

class Expr:
    def __init__(self,s):
        self.s=s
    
    def convert_to_Expr_if_int(self, n):
        if isinstance(n, int):
            return Expr(str(n))
        if isinstance(n, Expr):
            return n
        raise AssertionError # unsupported type

    def __str__(self):
        return self.s

    def __xor__(self, other):
        return Expr("(" + self.s + "^" + self.convert_to_Expr_if_int(other).s + ")")

BYTES=1

def crc32(buf):
    #state=[Expr("init_%d" % i) for i in range(32)]
    state=[Expr("1") for i in range(32)]
    for byte in buf:
        for n in range(8):
            bit=byte[n]
            to_taps=bit^state[31]
            state[31]=state[30]
            state[30]=state[29]
            state[29]=state[28]
            state[28]=state[27]
            state[27]=state[26]
            state[26]=state[25]^to_taps
            state[25]=state[24]
            state[24]=state[23]
            state[23]=state[22]^to_taps
            state[22]=state[21]^to_taps
            state[21]=state[20]
            state[20]=state[19]
            state[19]=state[18]
            state[18]=state[17]
            state[17]=state[16]
            state[16]=state[15]^to_taps
            state[15]=state[14]
            state[14]=state[13]
            state[13]=state[12]
            state[12]=state[11]^to_taps
            state[11]=state[10]^to_taps
            state[10]=state[9]^to_taps
            state[9]=state[8]
            state[8]=state[7]^to_taps
            state[7]=state[6]^to_taps
            state[6]=state[5]
            state[5]=state[4]^to_taps
            state[4]=state[3]^to_taps
            state[3]=state[2]
            state[2]=state[1]^to_taps
            state[1]=state[0]^to_taps
            state[0]=to_taps

    for i in range(32):
        print "state %d=%s" % (i, state[31-i])

buf=[[Expr("in_%d_%d" % (byte, bit)) for bit in range(8)] for byte in range(BYTES)]
crc32(buf)


Here are expressions for each CRC32 bit for 1-byte buffer:

state 0=(1^(in_0_2^1))
state 1=((1^(in_0_0^1))^(in_0_3^1))
state 2=(((1^(in_0_0^1))^(in_0_1^1))^(in_0_4^1))
state 3=(((1^(in_0_1^1))^(in_0_2^1))^(in_0_5^1))
state 4=(((1^(in_0_2^1))^(in_0_3^1))^(in_0_6^(1^(in_0_0^1))))
state 5=(((1^(in_0_3^1))^(in_0_4^1))^(in_0_7^(1^(in_0_1^1))))
state 6=((1^(in_0_4^1))^(in_0_5^1))
state 7=((1^(in_0_5^1))^(in_0_6^(1^(in_0_0^1))))
state 8=(((1^(in_0_0^1))^(in_0_6^(1^(in_0_0^1))))^(in_0_7^(1^(in_0_1^1))))
state 9=((1^(in_0_1^1))^(in_0_7^(1^(in_0_1^1))))
state 10=(1^(in_0_2^1))
state 11=(1^(in_0_3^1))
state 12=((1^(in_0_0^1))^(in_0_4^1))
state 13=(((1^(in_0_0^1))^(in_0_1^1))^(in_0_5^1))
state 14=((((1^(in_0_0^1))^(in_0_1^1))^(in_0_2^1))^(in_0_6^(1^(in_0_0^1))))
state 15=((((1^(in_0_1^1))^(in_0_2^1))^(in_0_3^1))^(in_0_7^(1^(in_0_1^1))))
state 16=((((1^(in_0_0^1))^(in_0_2^1))^(in_0_3^1))^(in_0_4^1))
state 17=(((((1^(in_0_0^1))^(in_0_1^1))^(in_0_3^1))^(in_0_4^1))^(in_0_5^1))
state 18=(((((1^(in_0_1^1))^(in_0_2^1))^(in_0_4^1))^(in_0_5^1))^(in_0_6^(1^(in_0_0^1))))
state 19=((((((1^(in_0_0^1))^(in_0_2^1))^(in_0_3^1))^(in_0_5^1))^(in_0_6^(1^(in_0_0^1))))^(in_0_7^(1^(in_0_1^1))))
state 20=((((((1^(in_0_0^1))^(in_0_1^1))^(in_0_3^1))^(in_0_4^1))^(in_0_6^(1^(in_0_0^1))))^(in_0_7^(1^(in_0_1^1))))
state 21=(((((1^(in_0_1^1))^(in_0_2^1))^(in_0_4^1))^(in_0_5^1))^(in_0_7^(1^(in_0_1^1))))
state 22=(((((1^(in_0_0^1))^(in_0_2^1))^(in_0_3^1))^(in_0_5^1))^(in_0_6^(1^(in_0_0^1))))
state 23=((((((1^(in_0_0^1))^(in_0_1^1))^(in_0_3^1))^(in_0_4^1))^(in_0_6^(1^(in_0_0^1))))^(in_0_7^(1^(in_0_1^1))))
state 24=((((((in_0_0^1)^(in_0_1^1))^(in_0_2^1))^(in_0_4^1))^(in_0_5^1))^(in_0_7^(1^(in_0_1^1))))
state 25=(((((in_0_1^1)^(in_0_2^1))^(in_0_3^1))^(in_0_5^1))^(in_0_6^(1^(in_0_0^1))))
state 26=(((((in_0_2^1)^(in_0_3^1))^(in_0_4^1))^(in_0_6^(1^(in_0_0^1))))^(in_0_7^(1^(in_0_1^1))))
state 27=((((in_0_3^1)^(in_0_4^1))^(in_0_5^1))^(in_0_7^(1^(in_0_1^1))))
state 28=(((in_0_4^1)^(in_0_5^1))^(in_0_6^(1^(in_0_0^1))))
state 29=(((in_0_5^1)^(in_0_6^(1^(in_0_0^1))))^(in_0_7^(1^(in_0_1^1))))
state 30=((in_0_6^(1^(in_0_0^1)))^(in_0_7^(1^(in_0_1^1))))
state 31=(in_0_7^(1^(in_0_1^1)))

For larger buffer, expressions gets increasing exponentially. This is 0th bit of the final state for 4-byte buffer:

state 0=((((((((((((((in_0_0^1)^(in_0_1^1))^(in_0_2^1))^(in_0_4^1))^(in_0_5^1))^(in_0_7^(1^(in_0_1^1))))^(in_1_0^(1^(in_0_2^1))))^
(in_1_2^(((1^(in_0_0^1))^(in_0_1^1))^(in_0_4^1))))^(in_1_3^(((1^(in_0_1^1))^(in_0_2^1))^(in_0_5^1))))^(in_1_4^(((1^(in_0_2^1))^
(in_0_3^1))^(in_0_6^(1^(in_0_0^1))))))^(in_2_0^((((1^(in_0_0^1))^(in_0_6^(1^(in_0_0^1))))^(in_0_7^(1^(in_0_1^1))))^(in_1_2^(((1^
(in_0_0^1))^(in_0_1^1))^(in_0_4^1))))))^(in_2_6^(((((((1^(in_0_0^1))^(in_0_1^1))^(in_0_2^1))^(in_0_6^(1^(in_0_0^1))))^(in_1_4^(((1^
(in_0_2^1))^(in_0_3^1))^(in_0_6^(1^(in_0_0^1))))))^(in_1_5^(((1^(in_0_3^1))^(in_0_4^1))^(in_0_7^(1^(in_0_1^1))))))^(in_2_0^((((1^
(in_0_0^1))^(in_0_6^(1^(in_0_0^1))))^(in_0_7^(1^(in_0_1^1))))^(in_1_2^(((1^(in_0_0^1))^(in_0_1^1))^(in_0_4^1))))))))^(in_2_7^(((((((1^
(in_0_1^1))^(in_0_2^1))^(in_0_3^1))^(in_0_7^(1^(in_0_1^1))))^(in_1_5^(((1^(in_0_3^1))^(in_0_4^1))^(in_0_7^(1^(in_0_1^1))))))^(in_1_6^
(((1^(in_0_4^1))^(in_0_5^1))^(in_1_0^(1^(in_0_2^1))))))^(in_2_1^((((1^(in_0_1^1))^(in_0_7^(1^(in_0_1^1))))^(in_1_0^(1^(in_0_2^1))))^
(in_1_3^(((1^(in_0_1^1))^(in_0_2^1))^(in_0_5^1))))))))^(in_3_2^(((((((((1^(in_0_1^1))^(in_0_2^1))^(in_0_4^1))^(in_0_5^1))^(in_0_6^(1^
(in_0_0^1))))^(in_1_2^(((1^(in_0_0^1))^(in_0_1^1))^(in_0_4^1))))^(in_2_0^((((1^(in_0_0^1))^(in_0_6^(1^(in_0_0^1))))^(in_0_7^(1^(in_0_1^
1))))^(in_1_2^(((1^(in_0_0^1))^(in_0_1^1))^(in_0_4^1))))))^(in_2_1^((((1^(in_0_1^1))^(in_0_7^(1^(in_0_1^1))))^(in_1_0^(1^(in_0_2^1))))^
(in_1_3^(((1^(in_0_1^1))^(in_0_2^1))^(in_0_5^1))))))^(in_2_4^(((((1^(in_0_0^1))^(in_0_4^1))^(in_1_2^(((1^(in_0_0^1))^(in_0_1^1))^(in_0_4^
1))))^(in_1_3^(((1^(in_0_1^1))^(in_0_2^1))^(in_0_5^1))))^(in_1_6^(((1^(in_0_4^1))^(in_0_5^1))^(in_1_0^(1^(in_0_2^1))))))))))

Expression for the 0th bit of the final state for 8-byte buffer has length of ~350KiB, which is, of course, can be reduced significantly (because this expression is basically XOR tree), but you can feel the weight of it.

Now we can process this expressions somehow to get a smaller picture on what is affecting what. Let's say, if we can find "in_2_3" substring in expression, this means that 3rd bit of 2nd byte of input affects this expression. But even more than that: since this is XOR tree (i.e., expression consisting only of XOR operations), if some input variable is occurring twice, it's annihilated, since $x \oplus x=0$. More than that: if some vairables occurred even number of times (2, 4, 8, etc), it's annihilated, but left if it's occurred odd number of times (1, 3, 5, etc).

    for i in range(32):
        #print "state %d=%s" % (i, state[31-i])
        sys.stdout.write ("state %02d: " % i)
        for byte in range(BYTES):
            for bit in range(8):
                s="in_%d_%d" % (byte, bit)
                if str(state[31-i]).count(s) & 1:
                    sys.stdout.write ("*")
                else:
                    sys.stdout.write (" ")
        sys.stdout.write ("\n")

( https://github.com/dennis714/yurichev.com/blob/master/blog/symbolic/4_CRC/2.py )

Now this how each input bit of 1-byte input buffer affects each bit of the final CRC state:

state 00:   *     
state 01: *  *    
state 02: **  *   
state 03:  **  *  
state 04: * **  * 
state 05:  * **  *
state 06:     **  
state 07: *    ** 
state 08:  *    **
state 09:        *
state 10:   *     
state 11:    *    
state 12: *   *   
state 13: **   *  
state 14:  **   * 
state 15:   **   *
state 16: * ***   
state 17: ** ***  
state 18: *** *** 
state 19:  *** ***
state 20:    ** **
state 21:   * ** *
state 22:   ** ** 
state 23:    ** **
state 24: * * ** *
state 25: **** ** 
state 26: ***** **
state 27:  * *** *
state 28: *   *** 
state 29: **   ***
state 30: **    **
state 31:  *     *

This is 8*8=64 bits of 8-byte input buffer:

state 00:  * ** *  ***  * ** **      *  * ***** ***       *   * **  *     
state 01: * * ** *  ***  * ** **      *  * ***** ***       *   * **  *    
state 02: ** * ** *  ***  * ** **      *  * ***** ***       *   * **  *   
state 03: *** * ** *  ***  * ** **      *  * ***** ***       *   * **  *  
state 04: **** * ** *  ***  * ** **      *  * ***** ***       *   * **  * 
state 05:  **** * ** *  ***  * ** **      *  * ***** ***       *   * **  *
state 06:  **  ***   ** **   *  ** ***  * * **     ** *** *   *  *    **  
state 07: * **  ***   ** **   *  ** ***  * * **     ** *** *   *  *    ** 
state 08:  * **  ***   ** **   *  ** ***  * * **     ** *** *   *  *    **
state 09:  *** ** *  *   ** *** *  *****  * * ** **   ** * * ** *        *
state 10:  **    *  *** *      * *  * **  * * ** * *   **   *  **   *     
state 11:   **    *  *** *      * *  * **  * * ** * *   **   *  **   *    
state 12:    **    *  *** *      * *  * **  * * ** * *   **   *  **   *   
state 13:     **    *  *** *      * *  * **  * * ** * *   **   *  **   *  
state 14:      **    *  *** *      * *  * **  * * ** * *   **   *  **   * 
state 15:       **    *  *** *      * *  * **  * * ** * *   **   *  **   *
state 16:  * ** ****** **   **         **  *  *  *  ** * **  *  *** ***   
state 17: * * ** ****** **   **         **  *  *  *  ** * **  *  *** ***  
state 18:  * * ** ****** **   **         **  *  *  *  ** * **  *  *** *** 
state 19: * * * ** ****** **   **         **  *  *  *  ** * **  *  *** ***
state 20:     ******  ** ** *** **   *  * *  *****   *  **** *  *    ** **
state 21: ** *** **  * *       * **  ** *** ** *      *  * **   *   * ** *
state 22:   ** *  * ***   ** ** * ** *****  *    **    *    *** *   ** ** 
state 23: *  ** *  * ***   ** ** * ** *****  *    **    *    *** *   ** **
state 24:    * *** * ***  *** *** * *  * *  **  *****    **    * ** * ** *
state 25:  * *   *** ***  * * **** *       **   *  ***     *  *  ***** ** 
state 26: * * *   *** ***  * * **** *       **   *  ***     *  *  ***** **
state 27: *   ***      * *****  ****    * ***   **   ***  *  **  * * *** *
state 28:    *** * ***      *    *****  ***   * *     *** **   ****   *** 
state 29:     *** * ***      *    *****  ***   * *     *** **   ****   ***
state 30: ** *** *  * *** ** *     ** ***    **  *      **  *** * **    **
state 31: * ** *  ***  * ** **      *  * ***** ***       *   * **  *     *

Linear congruential generator

This is popular PRNG from OpenWatcom CRT library: https://github.com/open-watcom/open-watcom-v2/blob/d468b609ba6ca61eeddad80dd2485e3256fc5261/bld/clib/math/c/rand.c.

What expression it generates on each step?

#!/usr/bin/env python
class Expr:
    def __init__(self,s):
        self.s=s
    
    def convert_to_Expr_if_int(self, n):
        if isinstance(n, int):
            return Expr(str(n))
        if isinstance(n, Expr):
            return n
        raise AssertionError # unsupported type

    def __str__(self):
        return self.s

    def __xor__(self, other):
        return Expr("(" + self.s + "^" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __mul__(self, other):
        return Expr("(" + self.s + "*" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __add__(self, other):
        return Expr("(" + self.s + "+" + self.convert_to_Expr_if_int(other).s + ")")

    def __and__(self, other):
        return Expr("(" + self.s + "&" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __rshift__(self, other):
        return Expr("(" + self.s + ">>" + self.convert_to_Expr_if_int(other).s + ")")

seed=Expr("initial_seed")

def rand():
    global seed
    seed=seed*1103515245+12345
    return (seed>>16) & 0x7fff

for i in range(10):
    print i, ":", rand()

0 : ((((initial_seed*1103515245)+12345)>>16)&32767)
1 : ((((((initial_seed*1103515245)+12345)*1103515245)+12345)>>16)&32767)
2 : ((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)
3 : ((((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)
4 : ((((((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)
5 : ((((((((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)
6 : ((((((((((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)
7 : ((((((((((((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)
8 : ((((((((((((((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)
9 : ((((((((((((((((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)

Now if we once got several values from this PRNG, like 4583, 16304, 14440, 32315, 28670, 12568..., how would we recover the initial seed? The problem in fact is solving a system of equations:

((((initial_seed*1103515245)+12345)>>16)&32767)==4583
((((((initial_seed*1103515245)+12345)*1103515245)+12345)>>16)&32767)==16304
((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)==14440
((((((((((initial_seed*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)*1103515245)+12345)>>16)&32767)==32315

As it turns out, Z3 can solve this system correctly using only two equations:

#!/usr/bin/env python
from z3 import *

s=Solver()

x=BitVec("x",32)

a=1103515245
c=12345
s.add((((x*a)+c)>>16)&32767==4583)
s.add((((((x*a)+c)*a)+c)>>16)&32767==16304)
#s.add((((((((x*a)+c)*a)+c)*a)+c)>>16)&32767==14440)
#s.add((((((((((x*a)+c)*a)+c)*a)+c)*a)+c)>>16)&32767==32315)

s.check()
print s.model()


[x = 11223344]

(Though, it takes ~20s on my ancient Intel Atom netbook.)

Path constraint

How to get weekday from UNIX timestamp?

#!/usr/bin/env python

input=...
SECS_DAY=24*60*60
dayno = input / SECS_DAY
wday = (dayno + 4) % 7
if wday==5:
    print "Thanks God, it's Friday!"

Let's say, we should find a way to run the block with print() call in it. What input value should be?

First, let's build expression of $wday$ variable:

#!/usr/bin/env python
class Expr:
    def __init__(self,s):
        self.s=s
    
    def convert_to_Expr_if_int(self, n):
        if isinstance(n, int):
            return Expr(str(n))
        if isinstance(n, Expr):
            return n
        raise AssertionError # unsupported type

    def __str__(self):
        return self.s

    def __div__(self, other):
        return Expr("(" + self.s + "/" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __mod__(self, other):
        return Expr("(" + self.s + "%" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __add__(self, other):
        return Expr("(" + self.s + "+" + self.convert_to_Expr_if_int(other).s + ")")

input=Expr("input")
SECS_DAY=24*60*60
dayno = input / SECS_DAY
wday = (dayno + 4) % 7
print wday
if wday==5:
    print "Thanks God, it's Friday!"


(((input/86400)+4)%7)

In order to execute the block, we should solve this equation: (((input/86400)+4)%7)==5

So far, this is easy task for Z3:

#!/usr/bin/env python
from z3 import *

s=Solver()

x=Int("x")

s.add(((x/86400)+4)%7==5)

s.check()
print s.model()


[x = 86438]

This is indeed correct UNIX timestamp for Friday:

% date --date='@86438'
Fri Jan  2 03:00:38 MSK 1970

Though the date back in year 1970, but it's still correct!

This is also called "path constraint", i.e., what constraint must be satisified to execute the path into specific block? Several tools has "path" in their names, like "pathgrind", Symbolic PathFinder, etc.

Like the shell game, this task is also often encounters in practice. You can see that something dangerous can be executed inside some basic block and you're trying to deduce, what input values can cause execution of it. It may be buffer overflow, etc. Input values are sometimes also called "inputs of death".

Many crackmes are solved in this way, all you need is find a path into block which prints "key is correct" or something like that.

We can extend this tiny example:

input=...
SECS_DAY=24*60*60
dayno = input / SECS_DAY
wday = (dayno + 4) % 7
print wday
if wday==5:
    print "Thanks God, it's Friday!"
else:
    print "Got to wait a little"

Now we have two blocks: for the first we should solve this equation: (((input/86400)+4)%7)==5. But for the second we should solve inverted equation: (((input/86400)+4)%7)!=5. By solving these equations, we will find two paths into both blocks.

KLEE (or similar tool) tries to find path to each [basic] block and produces "ideal" unit test. Hence, KLEE can find a path into the block which crashes everything, or reporting about correctness of the input key/license, etc. Surprisingly, KLEE can find backdoors in the very same manner.

KLEE is also called "KLEE Symbolic Virtual Machine" -- by that its creators mean that the KLEE is VM which executes a code symbolically rather than numerically.

Division by zero

If division by zero is unwrapped and exception isn't caught, it can crash process.

Let's calculate simple expression $\frac{x}{2y + 4z - 12}$. We can add a warning into __div__ method:

#!/usr/bin/env python
class Expr:
    def __init__(self,s):
        self.s=s
    
    def convert_to_Expr_if_int(self, n):
        if isinstance(n, int):
            return Expr(str(n))
        if isinstance(n, Expr):
            return n
        raise AssertionError # unsupported type

    def __str__(self):
        return self.s

    def __mul__(self, other):
        return Expr("(" + self.s + "*" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __div__(self, other):
        op2=self.convert_to_Expr_if_int(other).s
        print "warning: division by zero if "+op2+"==0"
        return Expr("(" + self.s + "/" + op2 + ")")
    
    def __add__(self, other):
        return Expr("(" + self.s + "+" + self.convert_to_Expr_if_int(other).s + ")")
    
    def __sub__(self, other):
        return Expr("(" + self.s + "-" + self.convert_to_Expr_if_int(other).s + ")")

"""
     x
------------
2y + 4z - 12

"""

def f(x, y, z):
    return x/(y*2 + z*4 - 12)

print f(Expr("x"), Expr("y"), Expr("z"))


... so it will report about dangerous condition:

warning: division by zero if (((y*2)+(z*4))-12)==0
(x/(((y*2)+(z*4))-12))

This equation is easy to solve, let's try Wolfram Mathematica this time:

In[]:= FindInstance[{(y*2 + z*4) - 12 == 0}, {y, z}, Integers]
Out[]= {{y -> 0, z -> 3}}

These values for $y$ and $z$ can also be called "inputs of death".

Merge sort

How merge sort works? I have copypasted Python code from rosettacode.com almost intact:

#!/usr/bin/env python
class Expr:
    def __init__(self,s,i):
        self.s=s
        self.i=i
    
    def __str__(self):
        # return both symbolic and integer:
        return self.s+" (" + str(self.i)+")"
   
    def __le__(self, other):
        # compare only integer parts:
        return self.i <= other.i

# copypasted from http://rosettacode.org/wiki/Sorting_algorithms/Merge_sort#Python
def merge(left, right):
    result = []
    left_idx, right_idx = 0, 0
    while left_idx < len(left) and right_idx < len(right):
        # change the direction of this comparison to change the direction of the sort
        if left[left_idx] <= right[right_idx]:
            result.append(left[left_idx])
            left_idx += 1
        else:
            result.append(right[right_idx])
            right_idx += 1
 
    if left_idx < len(left):
        result.extend(left[left_idx:])
    if right_idx < len(right):
        result.extend(right[right_idx:])
    return result

def tabs (t):
    return "\t"*t

def merge_sort(m, indent=0):
    print tabs(indent)+"merge_sort() begin. input:"
    for i in m:
        print tabs(indent)+str(i)

    if len(m) <= 1:
        print tabs(indent)+"merge_sort() end. returning single element"
        return m

    middle = len(m) // 2
    left = m[:middle]
    right = m[middle:]
 
    left = merge_sort(left, indent+1)
    right = merge_sort(right, indent+1)
    rt=list(merge(left, right))
    print tabs(indent)+"merge_sort() end. returning:"
    for i in rt:
        print tabs(indent)+str(i)
    return rt

# input buffer has both symbolic and numerical values:
input=[Expr("input1",22), Expr("input2",7), Expr("input3",2), Expr("input4",1), Expr("input5",8), Expr("input6",4)]
merge_sort(input)


But here is a function which compares elements. Obviously, it wouldn't work correctly without it.

So we can track both expression for each element and numerical value. Both will be printed finally. But whenever values are to be compared, only numerical parts will be used.

Result:

merge_sort() begin. input:
input1 (22)
input2 (7)
input3 (2)
input4 (1)
input5 (8)
input6 (4)
	merge_sort() begin. input:
	input1 (22)
	input2 (7)
	input3 (2)
		merge_sort() begin. input:
		input1 (22)
		merge_sort() end. returning single element
		merge_sort() begin. input:
		input2 (7)
		input3 (2)
			merge_sort() begin. input:
			input2 (7)
			merge_sort() end. returning single element
			merge_sort() begin. input:
			input3 (2)
			merge_sort() end. returning single element
		merge_sort() end. returning:
		input3 (2)
		input2 (7)
	merge_sort() end. returning:
	input3 (2)
	input2 (7)
	input1 (22)
	merge_sort() begin. input:
	input4 (1)
	input5 (8)
	input6 (4)
		merge_sort() begin. input:
		input4 (1)
		merge_sort() end. returning single element
		merge_sort() begin. input:
		input5 (8)
		input6 (4)
			merge_sort() begin. input:
			input5 (8)
			merge_sort() end. returning single element
			merge_sort() begin. input:
			input6 (4)
			merge_sort() end. returning single element
		merge_sort() end. returning:
		input6 (4)
		input5 (8)
	merge_sort() end. returning:
	input4 (1)
	input6 (4)
	input5 (8)
merge_sort() end. returning:
input4 (1)
input3 (2)
input6 (4)
input2 (7)
input5 (8)
input1 (22)

Extending Expr class

This is somewhat senseless, nevertheless, it's easy task to extend my Expr class to support AST trees instead of plain strings. It's also possible to add folding steps (like I demonstrated in https://yurichev.com/writings/toy_decompiler.pdf). Maybe someone will want to do this as an exercise.

Conclusion

For the sake of demonstration, I made things as simple as possible. But reality is always harsh and inconvenient, so all this shouldn't be taken as a silver bullet.

More Z3 and KLEE examples: https://yurichev.com/tmp/SAT_SMT_DRAFT.pdf.

The files used in this blog post: https://github.com/dennis714/yurichev.com/tree/master/blog/symbolic.

As you noticed in my simple examples, expression is represented as a plain string, for the sake of simplicity. Advanced symbolic execution engines uses AST (Abstract Syntax Tree), which are much better and efficient. AST in its simplest form is used in my toy decompiler: https://yurichev.com/writings/toy_decompiler.pdf. The toy decompiler can be used as simple symbolic engine as well, just feed all the instructions to it and it will track contents of each register.


Next part: https://yurichev.com/blog/crypto.


→ [list of blog posts, my twitter/facebook]

The page last updated on 29-April-2017