This is a true story. I once worked with Xilinx Microblaze code and found this sequence of the addk instructions:
addk r3,r5,r5 addk r3,r3,r3 addk r3,r3,r3
The format is: addk dst, src1, src2
This is actually multiplication, but implemented via addition, because it's faster than multiplication instruction.
Also, Xilinx Microblaze can be compiled with multiplication instruction omitted. Ctrl-F: "C_USE_HW_MUL".
Let's rewrite it manually:
addk r3,r5,r5 addk r3,r3,r3 addk r3,r3,r3
addk r3,r5,r5 ; now r3 = r5+r5 = r5*2 addk r3,r3,r3 ; now r3 = r3*r3 = r5*2 + r5*2 = r5*4 addk r3,r3,r3 ; now r3 = r3*r3 = r5*4 + r5*4 = r5*8
Aha, now we see. Multiplication by 8.
Next example:
addk r3,r5,r5 addk r3,r3,r5
Let's rewrite:
addk r3,r5,r5 ; now r3 = r5+r5 = r5*2 addk r3,r3,r5 ; now r3 = r3+r5 = r5*2+r5 = r5*3
This is multiplication by 3.
Next example with rsubk instruction, which is: rsubk dst, x, y. And dst=y-x (hence 'r-' prefix, meaning 'reversed', like in MIPS).
addk r3,r5,r5 addk r3,r3,r3 addk r3,r3,r3 rsubk r3,r5,r3
Let's rewrite:
addk r3,r5,r5 ; now r3 = r5+r5 = r5*2 addk r3,r3,r3 ; now r3 = r3+r3 = r3*2 = r5*2*2 = r5*4 addk r3,r3,r3 ; now r3 = r3+r3 = r3*2 = r5*4*2 = r5*8 rsubk r3,r5,r3 ; now r3 = r3-r5 = r5*8-r5 = r5*7
This is multiplication by 7.
We start to feel this process is like a shell game (you have to keep in mind what is in which register) and can be automated.
This partial decompiler written in Python can do something already. We parse input assembly listings using regexps. Rewriting also happens using regexps.
regs=[("initial_r%d" % i) for i in range(32)]
regs[0]="0" # r0 always has zero, as in MIPS
def dump_regs():
global regs
for reg in range(32):
if regs[reg] != f"initial_r{reg}":
print (f"r{reg}", "=", rewrite(regs[reg]))
def regname_to_regno(reg):
return int(reg.removeprefix("r").removesuffix(","))
def parse_addk(l):
global regs
tmp=l.split(" ")
assert tmp[0]=="addk"
dst=regname_to_regno(tmp[1])
src1=regname_to_regno(tmp[2])
src2=regname_to_regno(tmp[3])
#print (dst, src1, src2)
if src1==src2:
regs[dst]=f"({regs[src1]}*2)"
else:
regs[dst]=f"({regs[src1]}+{regs[src2]})"
...
# these regexp subexpressions are unusual a bit
# ?P<name> means that subexpression will have a name:
# https://docs.python.org/3/howto/regex.html#non-capturing-and-named-groups
# ? at the end means non-greedy regexp:
# https://docs.python.org/3/howto/regex.html#greedy-versus-non-greedy
# non-greedy regexps are used here because we want to find smallest possible subexpression
# default is greedy -- largest possible subexpression is to be found, but we don't want this
re_X= "(?P<X>[a-z0-9_]+?)"
re_X1="(?P<X1>[a-z0-9_]+?)"
re_X2="(?P<X2>[a-z0-9_]+?)"
re_N= "(?P<N>[0-9]+?)"
re_N1="(?P<N1>[0-9]+?)"
re_N2="(?P<N2>[0-9]+?)"
# ((X*N1)*N2) -> X*(N1+N2)
def rewrite_two_mul(s):
x=re.compile("\\(\\("+re_X+"\\*"+re_N1+"\\)\\*"+re_N2+"\\)")
m=x.search(s)
if m==None:
return s
m_start, m_end = m.start(0), m.end(0)
X=m.group("X")
N1=int(m.group("N1"))
N2=int(m.group("N2"))
new_expr=f"({X}*{N1*N2})"
rt=s[0:m_start]+new_expr+s[m_end:] # replace substring
return rt
# test
assert rewrite_two_mul("((x*2)*2)")=="(x*4)"
assert rewrite_two_mul("y+((x*2)*2)+z")=="y+(x*4)+z"
...
def rewrite_one_step(s):
s=rewrite_two_mul(s)
s=rewrite_1(s)
s=rewrite_1_rvr(s)
s=rewrite_2(s)
s=rewrite_3(s)
s=rewrite_4(s)
return s
def rewrite(s):
#print (f"Before rewrite: {s}")
while True:
new_s=rewrite_one_step(s)
if new_s==s:
#print (f"After rewrite: {new_s}")
return new_s
s=new_s
for l in sys.stdin:
l=l.rstrip()
if l.startswith ("addk"):
parse_addk(l)
elif l.startswith ("rsubk"):
parse_rsubk(l)
else:
print ("panic, can't parse: ", l)
exit(0)
print (f"state of registers after {l}:")
dump_regs()
print ("final state of registers:")
dump_regs()
% ./decomp.py < f8 state of registers after addk r3, r5, r5: r3 = (initial_r5*2) state of registers after addk r3, r3, r3: r3 = (initial_r5*4) state of registers after addk r3, r3, r3: r3 = (initial_r5*8) final state of registers: r3 = (initial_r5*8)
All rewriting rules to be used here are:
((X*N1)*N2) -> X*(N1+N2) ((X*N)+X) -> X*(N+1) (X+(X*N)) -> X*(N+1) ((X*N)-X) -> X*(N-1) ((X*N1)+(X*N2)) -> X*(N1+N2) ((X*N1)-(X*N2)) -> X*(N1-N2)
And some tests from decomp.py:
assert rewrite_two_mul("((x*2)*2)")=="(x*4)"
assert rewrite_1("((x*2)+x)")=="(x*3)"
assert rewrite_1_rvr("(x+(x*2))")=="(x*3)"
assert rewrite_2("((x*10)-x)")=="(x*9)"
assert rewrite_3("((x*10)+(x*15))")=="(x*25)"
assert rewrite_4("((x*16)-(x*10))")=="(x*6)"
Now this can be 'decompiled':
addk r3, r5, r5 addk r3, r3, r5
% ./decomp.py < f3 state of registers after addk r3, r5, r5: r3 = (initial_r5*2) state of registers after addk r3, r3, r5: r3 = (initial_r5*3) final state of registers: r3 = (initial_r5*3)
And this:
addk r3, r5, r5 addk r3, r3, r3 addk r3, r3, r3 rsubk r3, r5, r3
% ./decomp.py < f7 state of registers after addk r3, r5, r5: r3 = (initial_r5*2) state of registers after addk r3, r3, r3: r3 = (initial_r5*4) state of registers after addk r3, r3, r3: r3 = (initial_r5*8) state of registers after rsubk r3, r5, r3: r3 = (initial_r5*7) final state of registers: r3 = (initial_r5*7)
This is multiplication by 86400 (number of seconds in day), but some intermediate results are left in r3 and r4 registers:
addk r3, r22, r22 addk r3, r3, r22 addk r4, r3, r3 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r4 rsubk r4, r3, r4 addk r3, r4, r4 addk r3, r3, r3 addk r3, r3, r3 addk r3, r3, r3 rsubk r3, r4, r3 addk r25, r3, r3 addk r25, r25, r25 addk r25, r25, r25 addk r25, r25, r25 addk r25, r25, r25 addk r25, r25, r25 addk r25, r25, r25
The result:
state of registers after addk r3, r22, r22: r3 = (initial_r22*2) state of registers after addk r3, r3, r22: r3 = (initial_r22*3) state of registers after addk r4, r3, r3: r3 = (initial_r22*3) r4 = (initial_r22*6) state of registers after addk r4, r4, r4: r3 = (initial_r22*3) r4 = (initial_r22*12) state of registers after addk r4, r4, r4: r3 = (initial_r22*3) r4 = (initial_r22*24) state of registers after addk r4, r4, r4: r3 = (initial_r22*3) r4 = (initial_r22*48) state of registers after rsubk r4, r3, r4: r3 = (initial_r22*3) r4 = (initial_r22*45) state of registers after addk r3, r4, r4: r3 = (initial_r22*90) r4 = (initial_r22*45) state of registers after addk r3, r3, r3: r3 = (initial_r22*180) r4 = (initial_r22*45) state of registers after addk r3, r3, r3: r3 = (initial_r22*360) r4 = (initial_r22*45) state of registers after addk r3, r3, r3: r3 = (initial_r22*720) r4 = (initial_r22*45) state of registers after rsubk r3, r4, r3: r3 = (initial_r22*675) r4 = (initial_r22*45) state of registers after addk r25, r3, r3: r3 = (initial_r22*675) r4 = (initial_r22*45) r25 = (initial_r22*1350) state of registers after addk r25, r25, r25: r3 = (initial_r22*675) r4 = (initial_r22*45) r25 = (initial_r22*2700) state of registers after addk r25, r25, r25: r3 = (initial_r22*675) r4 = (initial_r22*45) r25 = (initial_r22*5400) state of registers after addk r25, r25, r25: r3 = (initial_r22*675) r4 = (initial_r22*45) r25 = (initial_r22*10800) state of registers after addk r25, r25, r25: r3 = (initial_r22*675) r4 = (initial_r22*45) r25 = (initial_r22*21600) state of registers after addk r25, r25, r25: r3 = (initial_r22*675) r4 = (initial_r22*45) r25 = (initial_r22*43200) state of registers after addk r25, r25, r25: r3 = (initial_r22*675) r4 = (initial_r22*45) r25 = (initial_r22*86400) final state of registers: r3 = (initial_r22*675) r4 = (initial_r22*45) r25 = (initial_r22*86400)
Also, multiplication by 1103515245 (the constant used in simple PRNG) with one intermediate result left in r4:
addk r4, r5, r5 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r4 rsubk r4, r5, r4 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r5 addk r3, r4, r4 addk r3, r3, r3 addk r3, r3, r3 addk r3, r3, r3 addk r3, r3, r3 addk r3, r3, r3 addk r3, r3, r3 addk r3, r3, r3 addk r3, r3, r3 rsubk r3, r4, r3 addk r3, r3, r3 addk r3, r3, r5 addk r4, r3, r3 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r4 addk r4, r4, r4 addk r3, r3, r4 addk r3, r3, r3 addk r3, r3, r3 rsubk r3, r5, r3 addk r3, r3, r3 addk r3, r3, r3 addk r3, r3, r5
state of registers after addk r4, r5, r5: r4 = (initial_r5*2) state of registers after addk r4, r4, r4: r4 = (initial_r5*4) state of registers after addk r4, r4, r4: r4 = (initial_r5*8) state of registers after addk r4, r4, r4: r4 = (initial_r5*16) state of registers after addk r4, r4, r4: r4 = (initial_r5*32) state of registers after addk r4, r4, r4: r4 = (initial_r5*64) state of registers after addk r4, r4, r4: r4 = (initial_r5*128) state of registers after addk r4, r4, r4: r4 = (initial_r5*256) state of registers after addk r4, r4, r4: r4 = (initial_r5*512) state of registers after rsubk r4, r5, r4: r4 = (initial_r5*511) state of registers after addk r4, r4, r4: r4 = (initial_r5*1022) state of registers after addk r4, r4, r4: r4 = (initial_r5*2044) state of registers after addk r4, r4, r5: r4 = (initial_r5*2045) state of registers after addk r3, r4, r4: r3 = (initial_r5*4090) r4 = (initial_r5*2045) state of registers after addk r3, r3, r3: r3 = (initial_r5*8180) r4 = (initial_r5*2045) state of registers after addk r3, r3, r3: r3 = (initial_r5*16360) r4 = (initial_r5*2045) state of registers after addk r3, r3, r3: r3 = (initial_r5*32720) r4 = (initial_r5*2045) state of registers after addk r3, r3, r3: r3 = (initial_r5*65440) r4 = (initial_r5*2045) state of registers after addk r3, r3, r3: r3 = (initial_r5*130880) r4 = (initial_r5*2045) state of registers after addk r3, r3, r3: r3 = (initial_r5*261760) r4 = (initial_r5*2045) state of registers after addk r3, r3, r3: r3 = (initial_r5*523520) r4 = (initial_r5*2045) state of registers after addk r3, r3, r3: r3 = (initial_r5*1047040) r4 = (initial_r5*2045) state of registers after rsubk r3, r4, r3: r3 = (initial_r5*1044995) r4 = (initial_r5*2045) state of registers after addk r3, r3, r3: r3 = (initial_r5*2089990) r4 = (initial_r5*2045) state of registers after addk r3, r3, r5: r3 = (initial_r5*2089991) r4 = (initial_r5*2045) state of registers after addk r4, r3, r3: r3 = (initial_r5*2089991) r4 = (initial_r5*4179982) state of registers after addk r4, r4, r4: r3 = (initial_r5*2089991) r4 = (initial_r5*8359964) state of registers after addk r4, r4, r4: r3 = (initial_r5*2089991) r4 = (initial_r5*16719928) state of registers after addk r4, r4, r4: r3 = (initial_r5*2089991) r4 = (initial_r5*33439856) state of registers after addk r4, r4, r4: r3 = (initial_r5*2089991) r4 = (initial_r5*66879712) state of registers after addk r3, r3, r4: r3 = (initial_r5*68969703) r4 = (initial_r5*66879712) state of registers after addk r3, r3, r3: r3 = (initial_r5*137939406) r4 = (initial_r5*66879712) state of registers after addk r3, r3, r3: r3 = (initial_r5*275878812) r4 = (initial_r5*66879712) state of registers after rsubk r3, r5, r3: r3 = (initial_r5*275878811) r4 = (initial_r5*66879712) state of registers after addk r3, r3, r3: r3 = (initial_r5*551757622) r4 = (initial_r5*66879712) state of registers after addk r3, r3, r3: r3 = (initial_r5*1103515244) r4 = (initial_r5*66879712) state of registers after addk r3, r3, r5: r3 = (initial_r5*1103515245) r4 = (initial_r5*66879712) final state of registers: r3 = (initial_r5*1103515245) r4 = (initial_r5*66879712)
We successfully decompile (long) add/sub chains, given that the only integer data type is used, like 'int'.
Let's rewrite multiplication by 7 to pure C:
addk r3,r5,r5 addk r3,r3,r3 addk r3,r3,r3 rsubk r3,r5,r3
int f(int r5)
{
int r3;
r3=r5+r5;
r3=r3+r3;
r3=r3+r3;
r3=r3-r5;
return r3;
}
Let's try to compile this:
gcc -S -O3 -masm=intel test7.c
The output:
lea eax, 0[0+rdi*8]
sub eax, edi
ret
Alas, GCC can optimize chain of additions, but can't go further.
Full python code. (This Python program I used and cited is just under 200 SLOC.)
Working with (sub)expressions in form of strings -- absurdic. No (de-)compiler do this. Of course, I did this as a quick/dirty/experimental hack, for demonstration.
My second toy decompile works with structures akin to what (toy) compiler usually does. Ctrl-F: "Toy decompiler".
This post is like a 'prequel' to the toy decompiler in my SAT/SMT by Example.

Some time ago (before 24-Mar-2025) there was Disqus JS script for comments. I dropped it --- it was so motley, distracting, animated, with too much ads. I never liked it. Also, comments didn"t appeared correctly (Disqus was buggy). Also, my blog is too chamberlike --- not many people write comments here. So I decided to switch to the model I once had at least in 2020 --- send me your comments by email (don"t forget to include URL to this blog post) and I"ll copy&paste it here manually.
Let"s party like it"s ~1993-1996, in this ultimate, radical and uncompromisingly primitive pre-web1.0-style blog and website.