In this post, you will see an approach to Optimize the Trial Division Algorithm. Starting with the 'Trial Division' algorithm and a twist, I ended with a new algorithm.
Background
This article is part of a set of articles. Each article highlights an aspect around Integer Factorization.
The xPrompt Download and prg Files
xPrompt.zip contains the binary of the xHarbour Scripting Engine for Windows. This is the app needed to run the prg code.
Usage:
- Copy prg file in the same directory than xPrompt
- Launch xPrompt
- Type "
do FileName.prg
" to run the code
xHarbour is freeware and can be dowloaded at xHarbour.org[^].
xHarbour is part of the xBase family languages: xBase - Wikipedia[^].
ub files
The .ub file are UBASIC programs files. UBASIC is a DOS executable, I use it in a VirtualBox
which runs a copy of my old DOS HDD.
Why do I use a basic interpreter?
Because UBASIC integers are up to 2600 digits, natively, no need of library, no nothing.
Introduction
When starting to play with Integer Factorization, trying all possible factors is the first idea, that algorithm is named Trial Division.
The problem is that with each division, if it doesn't lead to a factor, all the work done is lost. That annoyed me and I have searched another way.
Analyze of Situation
A study of Trial Division algorithm leads to a few observations:
- Basically, the algorithm is about testing all possible factors with a division.
- We don't care about the result of division.
- We don't care about the reminder, just if it is zero or not.
- A compound integer is the result of multiplication of at least 2 integers.
With divisions, the only optimization is about removing as many non primes as possible.
I was annoyed because an integer division was a lot of work to check a single possible factor, and I didn't like that all that work was lost after factor was checked. So I searched for ways to improve that part.
Then the idea came while studying the multiplication approach. Parts of the calculation can be mutualized across factors.
Reorder and Conquer
The twist is to take advantage of a feature of multiplication: Any least significant part of a product is fully defined by the same least significant part of both factors (multiplicand and multiplier). This mean that for any factor sharing the same least significant part, one can mutualize that least significant part calculation. One just has to work at the level of additions and shifts the multiplication is made of.
To take advantage of this, one has to reorder factors to group the factors with same left parts. And then, we are exploring a tree. As factors are reordered, sometimes we are lucky, and sometimes we are not. Trying to factorize a prime is also a worst case.
Equation Rewriting
The equation needs to be rewritten in order to show how it works.
A compound integer can be written as:
PQ = P * Q
In order to do the reverse multiplication, I rewrite it as:
PQ = PQh * Scale + Pl * Ql
where
PQh is ( PQ - Pl * Ql ) / Scale
Scale is a power of 2 (power of 10 for demonstration)
Pl is the least significant digits of P
Ql is the least significant digits of Q
| Multiplicand
Multiplier
----------
Partial Product
Partial Product
----------
Full Product
Integer to factorize
Remaining |
As one can see, any green part is reused in each yellow try. Note that this animation is using base 10 for demonstration, it complicates things.
Animation comes from RMDemo10.prg in downloads.
Unoptimized UBASIC code from SPLIT_00.UB, SPLIT_05.UB contain an optimized version; but code is much longer. Both versions are recursive.
1000
1010
1020 fnSpliter(Produit)
1030
1040
1050
1060 if Produit@2=0 then return(2)
1070 if Produit@5=0 then return(5)
1080
1090 return(fnSpliter_calc(Produit,0,0,1))
1100
1110
1120 fnSpliter_calc(Pq,P,Q,Coef)
1130 local Coefn,X,Y,Pn,Qn,PQn,Diviseur
1140 Coefn=Coef*10
1150 X=0
1160 Y=0
1170 while (Pq-(P*X+Q*Y+X*Y*Coef))@10<>0
1180 Y+=1:if Y>9 then goto 1300
1190 wend
1200 inc Tour:gosub *Disp(Pq,P,Q,Coef,X,Y)
1210 Pn=P+Y*Coef
1220 Qn=Q+X*Coef
1230 PQn=(Pq-(P*X+Qn*Y))\10
1240 if PQn>0 then
1250 :Diviseur=fnSpliter_calc(PQn,Pn,Qn,Coefn)
1260 :if Diviseur<>1 then return(Diviseur) endif
1270 :elseif PQn=0 then
1280 :if Pn<>1 and Qn<>1 then return(Pn) endif
1290 endif
1300 X+=1:if X<=9 then goto 1160
1310 return(1)
Base 2
Using base 2 simplify things, and the algorithm is a binary tree exploration.
Main Algorithm
The number to factorize is PQ.
- Check if PQ is 1: In this case, 1 is a factor.
- Check if PQ is even: In this case, 2 is a factor.
- The algorithm maintains the equation is: PQ = PQh * Scale + Pl * Ql
- Start by root node with setting PQh = (PQ-1)/2, Pl = 1, Ql = 1 and Scale = 2.
- For each node:
- Factor is found when PQh is 0 and Pl <> 1 and Ql <> 1.
- If PQh is lower than smallest of Pl and Ql, node is a dead end.
- If PQh is even, 2 new child nodes, new bits will be (0,0) and (1,1). Scale*2. PQh updated acordingly.
- If PQh is odd, 2 new child nodes, new bits will be (0,1) and (1,0). Scale*2. PQh updated acordingly.
- Number is a prime when exploration ends with no factor found.
Unfortunately, the code is rather long and small snipsets are not of great interest.
Factorization Examples
Program RMTree2.prg from downloads gives a log of factorization, and generates an XML file used to show exploration as a tree.
Integer to factorize: 53 = 0b110101
Status Action Parent NodeId Equation
Root Process 0 0 0b110101 * 0b1 + 0b0 * 0b0
New 11 Process 0 1 0b11010 * 0b10 + 0b1 * 0b1
New 11 Push 1 2 0b1011 * 0b100 + 0b11 * 0b11
New 00 Process 1 3 0b1101 * 0b100 + 0b01 * 0b01
New 01 Symetry 3 4 0b110 * 0b1000 + 0b001 * 0b101
New 10 Process 3 5 0b110 * 0b1000 + 0b101 * 0b001
Tail P Reject 5 6 0b11 * 0b10000 + 0b0101 * 0b0001
Tail Q Process 5 7 0b11 * 0b10000 + 0b0101 * 0b0001
Tail Q Process 7 8 0b1 * 0b10000 + 0b10101 * 0b0001
Tail Q Reject 8 9 0b0 * 0b10000 + 0b100101 * 0b0001
New 01 Symetry 2 10 0b100 * 0b1000 + 0b011 * 0b111
New 10 Process 2 11 0b100 * 0b1000 + 0b111 * 0b011
Tail Q Reject 11 12 0b10 * 0b10000 + 0b0111 * 0b0011
Factorization 53 = 1 * 53
Integer to factorize: 251 = 0b11111011
Status Action Parent NodeId Equation
Root Process 0 0 0b11111011 * 0b1 + 0b0 * 0b0
New 11 Process 0 1 0b1111101 * 0b10 + 0b1 * 0b1
New 01 Symetry 1 2 0b111110 * 0b100 + 0b01 * 0b11
New 10 Process 1 3 0b111110 * 0b100 + 0b11 * 0b01
New 11 Push 3 4 0b11011 * 0b1000 + 0b111 * 0b101
New 00 Process 3 5 0b11111 * 0b1000 + 0b011 * 0b001
New 01 Push 5 6 0b1111 * 0b10000 + 0b1011 * 0b0001
New 10 Process 5 7 0b1110 * 0b10000 + 0b1001 * 0b0011
Tail P Reject 7 8 0b111 * 0b100000 + 0b01001 * 0b00011
Tail Q Process 7 9 0b111 * 0b100000 + 0b01001 * 0b00011
Tail Q Reject 9 10 0b10 * 0b100000 + 0b101001 * 0b00011
Tail P Reject 6 11 0b10 * 0b100000 + 0b01011 * 0b10001
Tail Q Process 6 12 0b111 * 0b100000 + 0b11011 * 0b00001
Tail Q Process 12 13 0b11 * 0b100000 + 0b111011 * 0b00001
Tail Q Process 13 14 0b1 * 0b100000 + 0b1011011 * 0b00001
Tail Q Reject 14 15 0b0 * 0b100000 + 0b1111011 * 0b00001
New 01 Push 4 16 0b1011 * 0b10000 + 0b1111 * 0b0101
New 10 Process 4 17 0b1010 * 0b10000 + 0b1101 * 0b0111
Tail Q Reject 17 18 0b101 * 0b100000 + 0b01101 * 0b00111
Tail Q Reject 16 19 0b11 * 0b100000 + 0b11111 * 0b00101
Factorization 251 = 1 * 251
Graphs
To get the graphs, I use yEd, a freeware from yWorks at yEd - Graph Editor[^].
I like the neat little feature which is automatic placement, so that I don't have care of during generation.
Note that the files generated are over simplified and contain only necessary information to get the tree after beautifying with yEd.
Make the Generated Graph Readeable
- Open the generated file in directory of xPrompt MyFile.graphml
- Menu > Tools > Fit Node to Label > Ok
- Menu > Layout > Tree > Directed > Directed tab > Orientation = Top-Down > Ok
Benchmark
Here is how the Multiplication Reversion evolve with numbers. As one can see, the workload follows the Square Root curve.
For timed benchmark, I need to carry the algorithm to C++ or even assembler to take advantage of every optimization.
Number Nodes - >> & | Stack TD_SR
11 6 4 9 3 3 0 2
31 8 7 12 3 6 1 4
53 13 10 22 6 8 1 6
71 13 8 23 8 7 1 7
97 21 21 35 8 16 2 8
101 17 17 28 6 13 1 9
1009 53 55 98 26 47 7 30
2003 54 59 98 21 51 11 43
3001 86 73 158 39 61 13 53
4001 90 90 167 43 74 12 62
5003 86 84 160 35 71 19 69
6007 98 89 181 41 75 19 76
7001 123 120 219 42 98 22 82
8009 132 132 238 49 108 22 88
9001 135 133 243 52 107 21 93
10007 123 105 231 58 90 22 99
20011 174 156 323 73 130 34 140
30011 214 210 391 85 172 39 172
40009 268 259 495 117 213 42 199
49999 264 226 507 135 198 49 222
1000003 1193 990 2325 680 854 198 999
9000011 3548 3359 6614 1499 2845 719 2999
Columns are :
- Nodes are nodes in binary tree.
- - are subtractions.
- >> are Shifts.
- & are bitwise ANDs.
- | are bitwise ORs.
- Stack are stack usage in recursion.
- TD_SR are number of divisions in Trial Division (Square Root version).
Points of Interest
Starting with the 'Trial Division' algorithm and a twist, I ended with a new algorithm.
The algorithm uses only very fast processor operations like Add/Sub and shifts which scale very well with multi precision (Big Integers). Since the algorithm explores a binary tree, multi threading is efficient. If stack usage become a bottleneck, it is possible to rewrite code in a non recursive fashion.
This is an original work. I have not been able to find any reference about this algorithm, feel free to drop a comment if you know such work.
Links
History
- 1990s: Algorithm discovery, first experiments
- 16th January, 2021: First version
- 31st January, 2021: Corrections and improvement
- 30th September, 2023: Changed Images to svg