## Integer factorization using regex (with backreferences)

(Previosly.)

*gbacon* at HN pointed to
a method of integer factorization using regex.

(Unary encoding is "" for 0, "1" for 1, "11" for 2, "11111" for 5, etc.)

I simplified it a bit, becase the first part of **^1?$|^(11+?)\1+$** is just a check against an "1" string
(which is prime) and emptry string (which is for 0) (**^1?$**),
and I removed it for clarity:

#!/usr/bin/env python3
import re
#n=12300 # composite
#n=123001 # prime, 27s
#n=12300200 # composite
n=123023 # composite, one factor: 43
#n=123027 # composite, one factor: 3
#n=223099 # prime, 87s
regex=re.compile("^(11+?)\\1+$")
res=regex.match("1"*n)
if res==None:
print ("prime")
else:
x=len(res[1])
y=n/x
print ("composite: %d * %d = %d" % (x, y, n))

It can find factors for small numbers.
And here is how it works.
In plain English, we asking regex matcher to find such a string, that consists of some number (>=2)
of "1"'s (**(11+?)**),
which is glueled with the same string (**\1**) arbitrary number of times (**+**).

Of course it's extremely slow, and even worse than bruteforce.
For ~87 seconds on my old 2GHz CPU it can find that 223099 is a prime.

But again, this is like a thought experiment. A reduction from one problem (integer factorization) to another
(find equal substrings in a string).
Find a better algorithm for strings or for regex with backreferences, better than bruteforce (with or without backtracking)
and this will be a revolution in computer science.

You can even simplify it further by removing **+**. This will divide (unary) numbers by 2 or fail it the number is odd: **^(11+?)\1$**.

#!/usr/bin/env python3
import re
n=45682
regex=re.compile("^(11+?)\\1$")
res=regex.match("1"*n)
if res==None:
print ("even number")
else:
x=len(res[1])
y=n/x
print ("%d * %d = %d" % (x, y, n))