## 2-Jun-2017: Making smallest possible test suite using Z3

(The following text has been copypasted to the SAT/SMT by example.)

I once worked on rewriting large piece of code into pure C, and there were a tests, several thousands. Testing process was painfully slow, so I thought if the test suite can be minimized somehow.

What we can do is to run each test and get code coverage. Then the task is to make such test suite, where coverage is maximum, and number of tests is minimal.

In fact, this is set cover problem (also known as "hitting set problem"). While simpler algorithms exist (see Wikipedia), it is also possible to solve with SMT-solver.

First, I took LZSS compression/decompression code for the example, from Apple sources. Such routines are not easy to test. Here is my version of it: I've added random generation of input data to be compressed. Random generation is dependent of some kind of input seed. Standard srand()/rand() are not recommended to be used, but for such simple task as ours, it's OK. I'll generate 1000 tests with 0..999 seeds, that would produce random data to be compressed/decompressed/checked.

After the compression/decompression routine has finished its work, GNU gcov utility is executed, which produces result like this:

...
3395:  189:        for (i = 1; i < F; i++) {
3395:  190:            if ((cmp = key[i] - sp->text_buf[p + i]) != 0)
2565:  191:                break;
-:  192:        }
2565:  193:        if (i > sp->match_length) {
1291:  194:            sp->match_position = p;
1291:  195:            if ((sp->match_length = i) >= F)
#####:  196:                break;
-:  197:        }
2565:  198:    }
#####:  199:    sp->parent[r] = sp->parent[p];
#####:  200:    sp->lchild[r] = sp->lchild[p];
#####:  201:    sp->rchild[r] = sp->rchild[p];
#####:  202:    sp->parent[sp->lchild[p]] = r;
#####:  203:    sp->parent[sp->rchild[p]] = r;
#####:  204:    if (sp->rchild[sp->parent[p]] == p)
#####:  205:        sp->rchild[sp->parent[p]] = r;
...


Leftmost number is an execution count for each line. ##### means the line of code hasn't been executed at all. The second column is a line number.

Now the Z3Py script, which will parse all these 1000 gcov results and produce minimal hitting set:

#!/usr/bin/env python

import re, sys
from z3 import *

TOTAL_TESTS=1000

# read gcov result and return list of lines executed:
def process_file (fname):
lines=[]
f=open(fname,"r")

while True:
m = re.search('^ *([0-9]+):  ([0-9]+):.*\$', l)
if m!=None:
lines.append(int(m.group(2)))
if len(l)==0:
break

f.close()
return lines

# k=test number; v=list of lines executed
stat={}
for test in range(TOTAL_TESTS):
stat[test]=process_file("compression.c.gcov."+str(test))

# that will be a list of all lines in all tests:
all_lines=set()
# k=line, v=list of tests, which trigger that line:
tests_for_line={}

for test in stat:
all_lines|=set(stat[test])
for line in stat[test]:
tests_for_line[line]=tests_for_line.get(line, []) + [test]

# int variable for each test:
tests=[Int('test_%d' % (t)) for t in range(TOTAL_TESTS)]

# this is optimization problem, so Optimize() instead of Solver():
opt = Optimize()

# each test variable is either 0 (absent) or 1 (present):
for t in tests:

# we know which tests can trigger each line
# so we enumerate all tests when preparing expression for each line
# we form expression like "test_1==1 OR test_2==1 OR ..." for each line:
for line in list(all_lines):
expressions=[tests[s]==1 for s in tests_for_line[line]]
# expression is a list which unfolds as list of arguments into Z3's Or() function (see asterisk)
# that results in big expression like "Or(test_1==1, test_2==1, ...)"
# the expression is then added as a constraint:

# we need to find a such solution, where minimal number of all "test_X" variables will have 1
# "*tests" unfolds to a list of arguments: [test_1, test_2, test_3,...]
# "Sum(*tests)" unfolds to the following expression: "Sum(test_1, test_2, ...)"
# the sum of all "test_X" variables should be as minimal as possible:
h=opt.minimize(Sum(*tests))

print (opt.check())
m=opt.model()

# print all variables set to 1:
for t in tests:
if m[t].as_long()==1:
print (t)



And what it produces (on my old Intel Quad-Core Xeon E3-1220 3.10GHz):

% time python set_cover.py
sat
test_7
test_48
test_134
python set_cover.py  18.95s user 0.03s system 99% cpu 18.988 total


We need just these 3 tests to execute (almost) all lines in the code: looks impressive, given the fact, that it would be notoriously hard to pick these tests by hand! The result can be checked easily, again, using gcov utility.

This is sometimes also called MaxSAT/MaxSMT - the problem is just to find solution, but the solution where some variable is as maximal as possible, or as minimal as possible.

Also, the code gives incorrect results on Z3 4.4.1, but working correctly on Z3 4.5.0 (so please upgrade). This is relatively fresh feature in Z3, so probably it was not stable in previous versions?