Integer factorization using regex (with backreferences)

(Previosly.)

gbacon at HN pointed to a method of integer factorization using regex.

(Unary encoding is "" for 0, "1" for 1, "11" for 2, "11111" for 5, etc.)

I simplified it a bit, becase the first part of ^1?$|^(11+?)\1+$ is just a check against an "1" string (which is prime) and emptry string (which is for 0) (^1?$), and I removed it for clarity:

#!/usr/bin/env python3

import re

#n=12300 # composite
#n=123001 # prime, 27s
#n=12300200 # composite
n=123023 # composite, one factor: 43
#n=123027 # composite, one factor: 3
#n=223099 # prime, 87s

regex=re.compile("^(11+?)\\1+$")
res=regex.match("1"*n)
if res==None:
    print ("prime")
else:
    x=len(res[1])
    y=n/x
    print ("composite: %d * %d = %d" % (x, y, n))

It can find factors for small numbers. And here is how it works. In plain English, we asking regex matcher to find such a string, that consists of some number (>=2) of "1"'s ((11+?)), which is glueled with the same string (\1) arbitrary number of times (+).

Of course it's extremely slow, and even worse than bruteforce. For ~87 seconds on my old 2GHz CPU it can find that 223099 is a prime.

But again, this is like a thought experiment. A reduction from one problem (integer factorization) to another (find equal substrings in a string). Find a better algorithm for strings or for regex with backreferences, better than bruteforce (with or without backtracking) and this will be a revolution in computer science.

You can even simplify it further by removing +. This will divide (unary) numbers by 2 or fail it the number is odd: ^(11+?)\1$.

#!/usr/bin/env python3

import re

n=45682

regex=re.compile("^(11+?)\\1$")
res=regex.match("1"*n)
if res==None:
    print ("even number")
else:
    x=len(res[1])
    y=n/x
    print ("%d * %d = %d" % (x, y, n))