Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / programming / algorithm

Integer Factorization: Optimizing Small Factors Checking

5.00/5 (1 vote)
16 Jan 2021CPOL5 min read 12.9K   176  
How to optimize small factors checking
In this article, I show how I adapted a mental calculation technique to check some small factors faster than division.

Download

Background

This article is part of a set of articles. Each article highlight an aspect around Integer Factorization.

The xPrompt Download and prg files

xPrompt.zip contains the binary of the xHarbour Scripting Engine for Windows. This is the app needed to run the prg code.

Usage :
- Copy prg file in same directory than xPrompt
- Launch xPromt
- Type "do FileName.prg" to run the code

xHarbour is freeware and can be dowloaded at xHarbour.org[^]

xHarbour is part of the xBase family languages: xBase - Wikipedia[^]

Introduction

Division is a good general purpose algorithm to check if a prime is factor of an integer, but sometimes we can get faster.

Small primes: Faster than division

When dealing with integer factorization, it is common to have a first stage for checking small factors before starting main algorithm. It happen that it is possible to check some of those factors faster than with a division.

Division is an efficient general purpose algorithm, but faster algorithms exist for particular divisors. Those algorithms resort to Mental Calculation techniques.

Some mental calculation techniques allow you to check easily the divisibility of a number for specific factors. The same techniques can be used on computer to check a few factors faster than with divisions because those thechniques also works with base 2.

Checking the Base

Humans Counting

Base 10 is how we count. Everyone can tell instantly if a number is multiple of 10 without resorting to division because the unit digit is all what matters. And everyone can also deduce if multiple of 2 and 5 as they are factors of 10.

2 is a factor if unit is 0, 2, 4, 6, 8.
5 is a factor if unit is 0, 5.
Ex: 3141592653
unit is 3, so 2 and 5 are not factor of 3141592653.

Cost is constant time. O(n) = 1.

Computer Counting

Base 2 is how computers count. The unit allow to know if 2 is a factor of a number. The fast operation is a bitwise AND.

2 is a factor if unit is 0.
Ex: 3141592653 => 0b10111011010000001110011001001101
unit is 1, so 2 is not a factor of 3141592653.

Cost is constant time. O(n) = 1.

dBase
if (N & 1) = 0
    return(2)
endif
C++
if ((Cand & 1) == 0) {
    return 2;
}

Checking Other Factors

Humans Counting

Everyone having learned to do additions and multiplications by hand have also learned how to proof check those operations by using the casting out nines method. The method can be used to know if a number have 3 as factor, and also 11 and others with a little teaking. All this is possible because, 3 is a factor of 10 - 1 and 11 is a factor of 100 - 1.

The casting out nines method is about adding the digits of a number and reapply the method until the result is 1 digit.

Factor 3

The techbique is about checking 9 which is base 10 - 1, 3 being only a factor of 9.

3 is a factor if result is 3, 6 or 9.
Ex: 3141592653 => 3+1+4+1+5+9+2+6+5+3 => 39 => 3+9 => 12 => 1+2 => 3
Result is 3, so 3 is a factor of 3141592653.

Cost for n is log(n). O(n) = log(n).

Factor 11

The technique is about checking 11 which is base 10 + 1. We adapt the method with packets of 2 digits.

for 11, we use the fact that 99 (100-1) is a multiple of 11.
11 is a factor if result is 0, 11, 22, 33 ... 88, 99 or if difference of the 2 digits is 0.
3141592653 => 31+41+59+26+53 => 210 => 2+10 => 12
Result is 12, so 11 is not a factor of 3141592653 because 1<>2.

Cost for n is log(n). O(n) = log(n).

Factor 101

The technique is about checking 101 which is base 10^2 + 1. We adapt the method with packets of 4 digits.

for 101, we use the fact that 9999 (10000-1) is a multiple of 101.
101 is a factor if result is 0, 101, 202, 303 ... 9999. or if difference of pairs of digits is 0.
3141592653 => 31+4159+2653 => 6843
So 101 is not a factor of 3141592653 because 68<>43.

Cost for n is log(n). O(n) = log(n).

Cascade

Those factor check can be combined when the number of digits in a check is a multiple of the number of digit of another.

Because 10000 is a power of 100, the result for factor 101 can be used to check factor 11 and 9.
3141592653 => 31+4159+2653 => 6843 => 101 is not a factor

Because 10000 is a power of 100, the result for 10000 can be reused for 100.
6843 => 68+43 => 111 => 1+11 => 12 => 11 is not a factor

Because 100 is a power of 10, the result for 100 can be reused for 10.
12 => 1+2 => 3 => 3 is a factor

The cost for factors 11 and 3 are in constant time when done after 101.

Computer Counting

The principle of "casting out nines" can be applied to base 2. Processors have very efficient tools that can be used to butcher numbers, those tools are addition/subtraction, bitwise operation and shifting.

Interesting Small Primes

3	2^2-1=3=1*3
5	2^2+1=5=1*5
7	2^3-1=7=1*7
11	2^5+1=33=1*3*11
13	2^6+1=65=1*5*13
17	2^4+1=17=1*17
19	2^9+1=513=1*3*3*3*19
23	2^11-1=2047=1*23*89
29	2^14+1=16385=1*5*29*113
31	2^5-1=31=1*31
37	2^18+1=262145=1*5*13*37*109
41	2^10+1=1025=1*5*5*41
43	2^7+1=129=1*3*43
47	2^23-1=8388607=1*47*178481
53	2^26+1=67108865=1*5*53*157*1613
59	2^29+1=536870913=1*3*59*3033169

Cascade Factors 17, 5 and 3

Ex: 3141592653 to binary 0b10111011010000001110011001001101

Reduce by 2 ^ 16 as intermediate optimization
0b10111011010000001110011001001101 => 0b1011101101000000+0b1110011001001101
=> 0b11010000110001101 => 0b1+0b1010000110001101 => 0b1010000110001110

Check 17: Reduce by 2 ^ 8
0b10100001+0b10001110 => 0b1000101110 => 0b10+0b00101110 => 0b00110000 => 0b0011 0b0000
17 is not a factor of 3141592653 because 0b0011<>0b0000

Check 5: Reduce by 2 ^ 4
0b00110000 => 0b0011+0b0000 => 0b0011 => 0b00 0b11
5 is not a factor of 3141592653 because 0b00<>0b11

Check 3: Reduce by 2 ^ 2
0b0011 => 0b00+0b11 => 0b11
3 is a factor of 3141592653

Cost for n is log(n) + constant. O(n) = log(n).

Intermediate oprimization

In modern processors, registers are 64 bits and simple operation take 1 clockcycle, no matter the size of an operation, the cost is the same. When one want to reduce a 64 bits value to a 8 bits value, it takes 7 operations.

By using intermediate steps, we can divide the size of operation by 2 every time. A 64 bits value is first reduced to 32 bits, the 16 bits, and finally to 8 bits. That 3 operations.

Algorithm

There are 4 cases to handle:

  • Factor is a power of 2 minus 1
  • Reduce to power of 2.
    Compare result with factor.

  • Factor is a factor of a power of 2 minus 1
  • Reduce to power of 2.
    do a modulo.

  • Factor is a power of 2 plus 1
  • Reduce to double of power of 2.
    test if both half are same.

  • Factor is a factor of a power of 2 plus 1
  • Reduce to double of power of 2.
    do a modulo on the difference of both half.

and optimizations:

  • Cascade
  • Do a topological sort.

  • Intermediate optimization
  • Add intermediate reduction where needed.

Generator

As things can get complicated, It was easier to automate the code generation.

List of prime to integrate

dBase
*	List of: Prime, power of 2, +-1, is multiple
list={ {3, 2, -1, .F.}, {5, 2, +1, .F.}, {7, 3, -1, .F.}, {11, 5, +1, .T.}, ;
	{13, 6, +1, .T.}, {17, 4, +1, .F.}, {19, 9, +1, .T.}, {23, 11, -1, .T.}, ;
	{29, 14, +1, .T.}, {31, 5, -1, .F.}, {37, 18, +1, .T.}, {41, 10, +1, .T.}, ;
	{43, 7, +1, .T.}, {47, 23, -1, .T.}, {53, 26, +1, .T.}, {59, 29, +1, .T.}}

Topological sort for cascade

dBase
*	tri topologique
*	Optimization by chaining operations	for scan=2 to len(list)
	ptr= 0
	for place= 1 to scan-1
		if list[scan,5] % list[place,5] = 0 // scan multiple de place
			if list[scan,5] = list[place,5]
				tmp= list[scan]
				adel(list, scan, .T.)
				ains(list, place, tmp, .T.)
				ptr=0
				exit
			endif
			if ispower(list[scan,5] / list[place,5], 2)= 1
				tmp= list[scan]
				adel(list, scan, .T.)
				ains(list, place, tmp, .T.)
				ptr=0
				exit
			endif
		endif
		if list[place,5] % list[scan,5] = 0
			if ispower(list[place,5] / list[scan,5], 2)= 1
				ptr= place
			endif
		endif
	next
	if ptr != 0
		tmp= list[scan]
		adel(list, scan, .T.)
		ains(list, ptr+1, tmp, .T.)
	endif
next

Adding intermediate steps

dBase
*	pad list with intermediate opeartions
for scan=len(list) to 2 step -1
	while list[scan-1,5] % list[scan,5] = 0
		tmp= list[scan-1,5] / list[scan,5]
		tmpp= ispower(list[scan-1,5] / list[scan,5], 2)
		if tmp > 2 .and. tmpp=1
			ains(list, scan, {0,0,0,.F.,list[scan,5]*2}, .T.)
		else
			exit
		endif
	enddo
	while list[scan-1,5] < list[scan,5]
		if list[scan,5] < int_size/2
			ains(list, scan, {0,0,0,.F.,list[scan,5]*2}, .T.)
		else
			exit
		endif
	enddo
next
while list[1,5] < int_size/2
	ains(list, 1, {0,0,0,.F.,list[1,5]*2}, .T.)
enddo

Code generator

dBase
*	generate C code
? int_type+" IsSFactNonDiv("+int_type+" Cand) {"
? "  "+int_type+" Mask, Prime, tmp;"
? "  int Size;"
? "  // fact 2"
? "  if ((Cand & 1) == 0) {"
? "    return 2;"
? "  }"
?

for scan=1 to len(list)
	if list[scan,1] != 0
		? "  Prime="+str(list[scan,1],,,.T.)+";"
	endif
	if scan=1
		org= "Cand"
	elseif list[scan,5]*2=list[scan-1,5]
		org= "N"+str(list[scan,5]*2,,,.T.)
	else
		org= "Cand"
	endif
	if (scan=1 .or. list[scan,5] != list[scan-1,5])
		if list[scan,5] < int_size
			? "  Size="+str(list[scan,5],,,.T.)+";"
			? "  Mask= (0ULL-1) >> ("+str(int_size,,,.T.)+"-Size);"
	
			if list[scan,3]=0
				? "  "+int_type+" N"+str(list[scan,5],,,.T.)+"= ("+org+" >> Size) + ("+org+" & Mask);"
			else
				? "  "+int_type+" N"+str(list[scan,5],,,.T.)+"= FactFold("+org+", Size, Mask);"
			endif
		else
			? "  "+int_type+" N"+str(list[scan,5],,,.T.)+"= "+org+";"
		endif
	endif

	if list[scan,3]=  1 .and. list[scan,4] // P2+1 et multiple
		? "  Size="+str(list[scan,2],,,.T.)+";"
		? "  Mask= (0ULL-1) >> ("+str(int_size,,,.T.)+"-Size);"
		? "  tmp = abs(( N"+str(list[scan,5],,,.T.)+" >> Size) - ( N"+str(list[scan,5],,,.T.)+" & Mask));"
		? "  tmp = FactReduce(tmp, Prime);"
		? "  if (tmp == 0) {"
		? "    return Prime;"
		? "  }"
	elseif list[scan,3]= 1 // P2+1
		? "  Size="+str(list[scan,2],,,.T.)+";"
		? "  Mask= (0ULL-1) >> ("+str(int_size,,,.T.)+"-Size);"
		? "  if (( N"+str(list[scan,5],,,.T.)+" >> Size) == ( N"+str(list[scan,5],,,.T.)+" & Mask)) {"
		? "    return Prime;"
		? "  }"
	elseif list[scan,3]= -1 .and. list[scan,4] // P2-1 et multiple
		? "  tmp = FactReduce( N"+str(list[scan,5],,,.T.)+", Prime);"
		? "  if (tmp == 0) {"
		? "    return Prime;"
		? "  }"
	elseif list[scan,3]= -1 // P2-1
		? "  if ( N"+str(list[scan,5],,,.T.)+" == Prime) {"
		? "    return Prime;"
		? "  }"
	endif

	?
next
? "  return Cand;"
? "}"

Resulting code

dBase
	// Fast checking of small primes
function SmallPrimes(N)
    //  Check factor 2
    if (N & 1) = 0
        return(2)
    endif

    //  Check factor 257
    N16= Reduce(N, 16)
    if  (N16 >> 8) = (N16 & 255)
        return(257)
    endif
    //  Check factor 17
    N8= Reduce(N16, 8)
    if  (N8 >> 4) = (N8 & 15)
        return(17)
    endif
    //  Check factor 5
    N4= Reduce(N8, 4)
    if  (N4 >> 2) = (N4 & 3)
        return(5)
    endif
    //  Check factor 3
    N2= Reduce(N4, 2)
    if  N2 = 3
        return(3)
    endif

    // Check factor 13
    N12= Reduce(N, 12)
    Tmp= abs((N12 >> 6) - (N12 & 63))
    while Tmp >= 13
      Tmp= Tmp- 13
    enddo
    if  Tmp = 0
        return(13)
    endif
    //  Check factor 7
    N3= Reduce(N12, 3)
    if  N3 = 7
        return(7)
    endif
    
    //  Check factor 11
    N10= Reduce(N, 10)
    Tmp= abs((N10 >> 5) - (N10 & 31))
    while Tmp >= 11
      Tmp= Tmp- 11
    enddo
    if  Tmp = 0
        return(11)
    endif
    //  Check factor 31
    N5= Reduce(N10, 5)
    if  N5 = 31
        return(31)
    endif
    
    //  Check factor 127
    N7= Reduce(N, 7)
    if  N7 = 127
        return(127)
    endif
    
    return 1

    // Binary 'casting out nines'
function Reduce(N, Size)
    local Tmp, Mask
    Tmp= N
    Mask= (1 << Size) -1
    while (Tmp >> Size) <> 0
      tmp= (Tmp >> Size) + (Tmp & Mask)
    enddo
    return(Tmp)
C++
// My modulo routine
inline long long FactModulo(long long cand, long long fact) {
	// my modulo
	while (cand >= fact) {
		if (cand & 1) {
			cand = cand - fact;
		}
		cand = cand >> 1;
	}
	return cand;
}

// the reduction routine
inline long long FactFold(long long cand, long long shift, long long Mask) {
	// fold integer on itself
	do {
		cand = (cand >> shift) + (cand & Mask);
	} while (cand >> shift);
	return cand;
}

// My factor checker, result of automated code generation
long long IsSFactNonDiv(long long Cand) {
	long long Mask, Prime, tmp;
	int Size;
	// fact 2
	if ((Cand & 1) == 0) {
		return 2;
	}

	Size = 32;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N32 = (Cand >> Size) + (Cand & Mask);

	Size = 16;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N16 = (N32 >> Size) + (N32 & Mask);

	Prime = 17;
	Size = 8;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N8 = FactFold(N16, Size, Mask);
	Size = 4;
	Mask = (0ULL - 1) >> (64 - Size);
	if ((N8 >> Size) == (N8 & Mask)) {
		return Prime;
	}

	Prime = 5;
	Size = 4;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N4 = FactFold(N8, Size, Mask);
	Size = 2;
	Mask = (0ULL - 1) >> (64 - Size);
	if ((N4 >> Size) == (N4 & Mask)) {
		return Prime;
	}

	Prime = 3;
	Size = 2;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N2 = FactFold(N4, Size, Mask);
	if (N2 == Prime) {
		return Prime;
	}

	Size = 48;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N48 = (Cand >> Size) + (Cand & Mask);

	Size = 24;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N24 = (N48 >> Size) + (N48 & Mask);

	Prime = 13;
	Size = 12;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N12 = FactFold(N24, Size, Mask);
	Size = 6;
	Mask = (0ULL - 1) >> (64 - Size);
	tmp = abs((N12 >> Size) - (N12 & Mask));
	tmp = FactModulo(tmp, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Size = 6;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N6 = (N12 >> Size) + (N12 & Mask);

	Prime = 7;
	Size = 3;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N3 = FactFold(N6, Size, Mask);
	if (N3 == Prime) {
		return Prime;
	}

	Size = 40;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N40 = (Cand >> Size) + (Cand & Mask);

	Prime = 41;
	Size = 20;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N20 = FactFold(N40, Size, Mask);
	Size = 10;
	Mask = (0ULL - 1) >> (64 - Size);
	tmp = abs((N20 >> Size) - (N20 & Mask));
	tmp = FactModulo(tmp, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Prime = 11;
	Size = 10;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N10 = FactFold(N20, Size, Mask);
	Size = 5;
	Mask = (0ULL - 1) >> (64 - Size);
	tmp = abs((N10 >> Size) - (N10 & Mask));
	tmp = FactModulo(tmp, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Prime = 31;
	Size = 5;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N5 = FactFold(N10, Size, Mask);
	if (N5 == Prime) {
		return Prime;
	}

	Prime = 37;
	Size = 36;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N36 = FactFold(Cand, Size, Mask);
	Size = 18;
	Mask = (0ULL - 1) >> (64 - Size);
	tmp = abs((N36 >> Size) - (N36 & Mask));
	tmp = FactModulo(tmp, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Prime = 19;
	Size = 18;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N18 = FactFold(N36, Size, Mask);
	Size = 9;
	Mask = (0ULL - 1) >> (64 - Size);
	tmp = abs((N18 >> Size) - (N18 & Mask));
	tmp = FactModulo(tmp, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Prime = 23;
	Size = 11;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N11 = FactFold(Cand, Size, Mask);
	tmp = FactModulo(N11, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Size = 56;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N56 = (Cand >> Size) + (Cand & Mask);

	Prime = 29;
	Size = 28;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N28 = FactFold(N56, Size, Mask);
	Size = 14;
	Mask = (0ULL - 1) >> (64 - Size);
	tmp = abs((N28 >> Size) - (N28 & Mask));
	tmp = FactModulo(tmp, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Prime = 43;
	Size = 14;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N14 = FactFold(N28, Size, Mask);
	Size = 7;
	Mask = (0ULL - 1) >> (64 - Size);
	tmp = abs((N14 >> Size) - (N14 & Mask));
	tmp = FactModulo(tmp, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Size = 46;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N46 = (Cand >> Size) + (Cand & Mask);

	Prime = 47;
	Size = 23;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N23 = FactFold(N46, Size, Mask);
	tmp = FactModulo(N23, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Prime = 53;
	Size = 52;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N52 = FactFold(Cand, Size, Mask);
	Size = 26;
	Mask = (0ULL - 1) >> (64 - Size);
	tmp = abs((N52 >> Size) - (N52 & Mask));
	tmp = FactModulo(tmp, Prime);
	if (tmp == 0) {
		return Prime;
	}

	Prime = 59;
	Size = 58;
	Mask = (0ULL - 1) >> (64 - Size);
	long long N58 = FactFold(Cand, Size, Mask);
	Size = 29;
	Mask = (0ULL - 1) >> (64 - Size);
	tmp = abs((N58 >> Size) - (N58 & Mask));
	tmp = FactModulo(tmp, Prime);
	if (tmp == 0) {
		return Prime;
	}

	return Cand;
}

Benchmark

Runtime is so fast that timing is done on repeated execution of routines. Timing done on a i3 3GHz.

Timing
Integer				Factor				Division	My Method
6561				3					9 ms		6 ms
125					5					18.001 ms	4 ms
2401				7					39.002 ms	9 ms
14641				11					41.002 ms	20.001 ms
3601				13					42.002 ms	11 ms
83521				17					52.003 ms	3 ms
49999				49999				147.008 ms	59.003 ms
4611686018427387899	4611686018427387899	143.008 ms	55.003 ms
4611686018427387877	4611686018427387877	143.008 ms	55.003 ms

On last three numbers, my method is 2.5 times faster than division.

Points of Interest

Interest is limited if value is within register size, is value is big integer, the technique can be extended with more gains.

Unfortunately, it works only on a limited set of primes.

External Links:

History

  • 20190501: First draft
  • 20201230: Second version

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)