Factorization: November 2007

Friday, 30 November 2007

FFTs-part1

The Fourier Transform - what is it?
Well, suppose we have some sum or computation we wish to evaluate, for example to calculate a*b for some a and b, e.g. large integers (of several thousands of digits or more).
Sometimes it can be easier to map each of the inputs to a new domain via some transform, perform the multiplication (approximating suitably if necessary) and then apply the reverse transform to get the original answer. Such a transform is the "Fourier" transform (and its inverse), and because of the way it maps objects/events to wave functions (expressed in terms of sin's and cos's) (and back) its respective domains of operation are termed 'time' and 'frequency'.
http://en.wikipedia.org/wiki/Fourier_transform
http://mathworld.wolfram.com/FourierTransform.html
http://mathworld.wolfram.com/FourierSeries.html
Obviously, in order for the process to work, it depends on the reciprocity of the transforming function. This is achieved by the orthogonality of the underlying individual waveforms (Sturm-Liouville).
http://en.wikipedia.org/wiki/Sturm-Liouville_theory

Thursday, 29 November 2007

Plancherel

This is a (PD) picture of Michel Plancherel.
...which should lead me on to a discussion of Fourier Transforms.

Wednesday, 28 November 2007

C127_113_36

Well, I finally completed my factorization of C127_113_36 for the XYYXF project - and by the recommended method - ie sieving by GGNFS, followed by post-processing with msieve. Thanks to Greg Childers, Bob Backstrom and Hallstein Hansen, who helped me with this transfer.
Basically msieve needs three files:
1) worktodo.ini - just containing the input number
2) msieve.fb - with the details (roughly) transferred from the .poly file of GGNFS
3) msieve.dat - with all the actual relations from GGNFS, translated to msieve format by procrels w/ the following crucial command:
"procrels -fb C127_113_36.fb -prel rels.bin -dump"

After this it's just a case of running msieve w/
"msieve -nc -v"
In the end the sieving [Lintel] took me about 3 months(!), because of memory limitations, and the GGNFS 'bounce', while the postprocessing in msieve [PPC-Tiger] finished to schedule in less than 2 days.

Tuesday, 27 November 2007

Sieving Records

In fact, here is a nice page detailing historical records of factorizations using sieving methods:
http://www.crypto-world.com/FactorRecords.html

Monday, 26 November 2007

Quadratic Sieve

Also, here is some info on the Quadratic Sieve (QS or MPQS, or, with slight variation, SIQS) method, invented in 1981 by Carl Pomerance, as an improvement to Dixon's Method:
http://en.wikipedia.org/wiki/Quadratic_sieve
http://mathworld.wolfram.com/QuadraticSieve.html
From the former link:
"On April 2, 1994, the factorization of RSA-129 was completed using QS. It was a 129-digit number, the product of two large primes, one of 64 digits and the other of 65. The factor base for this factorization contained 524339 primes. The data collection phase took 5000 MIPS-years, done in distributed fashion over the Internet. The data collected totaled 2GB. The data processing phase took 45 hours on Bellcore's MasPar (massively parallel) supercomputer. This was the largest published factorization by a general-purpose algorithm, until NFS was used to factor RSA-130, completed April 10, 1996."

Dixon's Method

And here is some info about Dixon's method, which is related to CFRAC, and the precursor to most other modern sieving methods.
http://en.wikipedia.org/wiki/Dixon%27s_factorization_method
http://mathworld.wolfram.com/DixonsFactorizationMethod.html
This method was first published in 1981.

Sunday, 25 November 2007

CFRAC

Here are some links describing the CFRAC, or "continued fraction" factorization method:
http://en.wikipedia.org/wiki/Continued_fraction_factorization
http://mathworld.wolfram.com/ContinuedFractionFactorizationAlgorithm.html
Notable achievements of this method, first envisaged in 1931, include the factorization of F7, the seventh Fermat number, in 1970 by Morrison and Brillhart.

Saturday, 24 November 2007

Rho:x^2-2

Most polynomials work nicely with Pollard Rho.
However, f(x)=x^2 and f(x)=x^2-2 should be avoided.
Here's what John Pollard had to say by way of reason, in 1975 [BIT 15, P.333]:
"(i) that all polynomials x^2+b seem equally good in (1) except that x^2 and x^2-2 should not be used (whatever the starting value x0), the latter for reasons connected with its appearance in the Lucas-Lehmer test for primality of the Mersenne Numbers [3],"
while Knuth has this to say [TAOCP Vol.2 P.386]:
"In those rare cases where failure occurs for large N, we could try using f(x)=x^2+c for some c<>0 or 1. The value c=-2 should also be avoided, since the recurrence x_(m+1) = (x_m)^2-2 has solutions of the form x_m = r^(2^m) + r^-(2^m). Other values of c do not seem to lead to simple relationships mod p, and they should all be satisfactory when used with suitable starting values."

Friday, 23 November 2007

2007 chips

I recently came across the following page, which has some interesting speed comparisons of the various types of modern chips...
http://www.tomshardware.com/2007/07/16/cpu_charts_2007/page36.html

Thursday, 22 November 2007

Desktop2

This is Lenny's desktop.
It's currently running mprime255OSX on M+2's.
I like the new 'spaces' feature of Leopard, it's something that's been good about Linux desktops for a while.

Wednesday, 21 November 2007

LIM (Part 4) - Schonhage-Strassen

The Schonhage-Strassen (SSA) is an asymptotically fast multiplicative algorithm for large integers, developed in 1971.
http://en.wikipedia.org/wiki/Schönhage-Strassen_algorithm
It uses Fast Fourier transforms (FFTs) (more on these at a later date hopefully) and its run-time complexity is of order nlognloglogn. [Note that the FFT must be performed modulo 2^n+1 for a suitable n, but by choosing n large enough this equates to a regular multiplication]
This means that SSA outperforms Karatsuba or Toom-Cook for numbers with tens of thousands of digits or more. An example of its implementation is in GIMPS' Prime95/mprime software. A second example is the recent addition of SSA to the open-source math library GMP.

Tuesday, 20 November 2007

LIM (Part 3) - Toom-Cook

Another method of multiplication is called Toom-Cook, first described in 1963.
http://en.wikipedia.org/wiki/Toom-Cook_multiplication
This is basically a generalization of the Karatsuba Method, by splitting the input numbers into multiple parts at a time, rather than just two (as in Karatsuba) or 1 (ie no splitting) (as in classical long multiplication).
Toom-3 (3-way Toom-Cook) reduces 9 multiplications to 5, and runs in order n^(log5/log3) time.

Monday, 19 November 2007

WEP-M+2 milestone

The WEP-M+2 Project has just announced that it has reached the milestone of 1000 instances of finding the 12-digit factor of (M+2)2203. Thanks to everyone who has participated so far.
http://bearnol.is-a-geek.com/wanless2/
I estimate (I'm the project admin :) that those 1000 instances equate to about 27 CPU-years (modern CPU cores).

LIM (Part 2) - Karatsuba

A first improvement to long multiplication is the Karatsuba Algorithm.
http://en.wikipedia.org/wiki/Karatsuba_algorithm
This was invented in 1960, and has a time complexity of order n^(log_2(3)).
It relies on the observation that two-digit multiplication can be done with only 3, rather than 4, multiplications classically required. By "dividing and conquering" (ie splitting) the numbers to be multiplied this can be extended to larger numbers.

Sunday, 18 November 2007

Large Integer Multiplication (LIM) Part 1

Large Integer Multiplication is an algorithm to multiply two large integers together efficiently.
This is often used by factorization algorithms, both for exponentiation, for example - when it is combined with the Russian Peasant method for extra speed (see earlier) or in fact, often, for division (by multiplying by the inverse of a number as the dividend)
"Long Multiplication" is the obvious, and naive, method, but the time complexity of this is of order n^2 for two n-digit integers. So a number of improvements have been suggested. These will be examined in later parts of this series. For the moment read all about LIM on Wikipedia:
http://en.wikipedia.org/wiki/Multiplication_algorithm

Saturday, 17 November 2007

Integer Factorization Records

"Integer factorization records" on Wikipedia has a summary of the current, and recent, state-of-play for the biggest non-trivial numbers yet factored:
http://en.wikipedia.org/wiki/Integer_factorization_records
Note that these have all been achieved with some form of Number Field Sieve, either General (for numbers of no especial form), or Special (for the two Mersenne numbers cited).

Friday, 16 November 2007

Msieve v1.30

New version [primarily a bugfix] of msieve (by Jason Papadopoulos) now available.
Download from:
http://www.boo.net/~jasonp/qs.html

The Magic Words are Squeamish Ossifrage

Here is a rather magnificent looking ossifrage [picture from Wikipedia] (though I don't know whether he's especially squeamish! :)
The connection with factorization?
Well the first ever RSA challenge, posed by Martin Gardner in 1977, had this phrase as its encrypted solution - deciphered in 1994 for the $100 prize:
http://en.wikipedia.org/wiki/The_Magic_Words_are_Squeamish_Ossifrage

Thursday, 15 November 2007

SQUFOF

Shanks' square forms factorization was devised as an improvement on Fermat's method.

Here is its entry on Wikipedia:
http://en.wikipedia.org/wiki/SQUFOF

and here is its implementation in superfac9:
BigInteger factorizeshanks(BigInteger n) {
BigInteger a = new BigInteger("0");
BigInteger f = new BigInteger("0");
BigInteger h1 = new BigInteger("0");
BigInteger h2 = new BigInteger("0");
BigInteger k = new BigInteger("0");
BigInteger p = new BigInteger("0");
BigInteger pp = new BigInteger("0");
BigInteger q = new BigInteger("0");
BigInteger qq = new BigInteger("0");
BigInteger qqq = new BigInteger("0");
BigInteger r = new BigInteger("0");
BigInteger te = new BigInteger("0");
BigInteger i = new BigInteger("0");
BigInteger count = new BigInteger("0");

k = sqrt(n);

if (fastsquareQ(n)) return k;

a=k; h1=k; h2=ONE; pp=ZERO; qq=ONE; qqq=n; r=ZERO;

for (count=ONE;count.compareTo(TENTHOUSAND)<0;count=count.add(ONE)) {
p=k.subtract(r);
q=qqq.add(a.multiply(pp.subtract(p)));
a=(p.add(k)).divide(q);
r=(p.add(k)).remainder(q);
te=(a.multiply(h1)).add(h2);
h2=h1;
h1=te;
pp=p;
qqq=qq;
qq=q;
te = sqrt(q);
i=i.add(ONE);
if ((i.remainder(TWO).compareTo(ZERO))!=0 || !fastsquareQ(q)) continue;
te=h2.subtract(te);
f=n.gcd(te);
if (f.compareTo(ONE) > 0 && f.compareTo(n) < 0)
return f;
}

return f;
}

Wednesday, 14 November 2007

RSA-155

RSA-155 (a 155 digit semiprime) was factored on August 22, 1999 by GNFS.
RSA-155 = 102639592829741105772054196573991675900716567808038066803341933521790711307779
* 106603488380168454820927220360012878679207958575989291522270608237193062808643

Read more about it at Wikipedia:
http://en.wikipedia.org/wiki/RSA-155
and on the official announcement:
http://listserv.nodak.edu/cgi-bin/wa.exe?A2=ind9908&L=nmbrthry&P=1905

Tuesday, 13 November 2007

snfspoly

There is also 'snfspoly' - the equivalent of 'phi' for XYYXF composites, which can be difficult to find - search for it on the yahoo XYYXF mailing list, or email me...

Monday, 12 November 2007

Phi

Alex Kruppa has written a small (but growing) program, licensed under the GPL, called 'phi', for generating SNFS polynomials for use with GGNFS or msieve. It can currently produce correct polys for cyclotomic numbers (eg Cunningham Project) and 'Homogeneous' Cunninghams. Search for the source code (written in 'C', and using GMP) in the factoring section of the mersenneforum.

Sunday, 11 November 2007

Prime Numbers and Computer Methods for Factorization

http://www.amazon.com/Numbers-Computer-Factorization-Progress-Mathematics/dp/0817637435/ref=pd_bbs_sr_1/002-0912694-6296825?ie=UTF8&s=books&qid=1193823716&sr=8-1
The above is a book I'd like to get!

Saturday, 10 November 2007

What does the term "embarrassingly parallel" mean?

An algorithm that can be run (very) profitably on many threads simultaneously. For example, trial-division, with a different random seed in each thread as the potential factor, to avoid repeating work.

Friday, 9 November 2007

FireStream

I imagine these new chips from AMD would be great for embarrassingly parallel apps like most factorization methods...
http://www.pcworld.com/article/id,139413-c,amd/article.html
Here is some more on this series of chips, from Wikipedia:
http://en.wikipedia.org/wiki/AMD_Stream_Processor

Thursday, 8 November 2007

Velocity Engine

Apple has produced a sample application, to demonstrate their "Velocity Engine", which uses the vector capability of PPC chips, G4 & G5.
http://en.wikipedia.org/wiki/AltiVec
It happens to be a factorization program! :)
http://developer.apple.com/samplecode/VelEng_Multiprecision/index.html
The program proceeds by trial-division, then rho, and finally ECM - and is quite fast...

Wednesday, 7 November 2007

Lenny

Well I've only been and gone and done it! Bought a new MacBook that is, complete with Leopard. It's called "Lenny"
I've already got it running George Woltman's mprime 25.5 for Mac OSX (beta), on medium-sized M+2 numbers, and have high expectations of finding a new factor or two :)

Tuesday, 6 November 2007

Richard Brent

Here is a picture of Richard Brent, from his homepage:
http://wwwmaths.anu.edu.au/~brent/
(image linked to in situ)

Monday, 5 November 2007

Brent's Method

Brent's improvement (in 1980) to Pollard's rho method is to extend the core idea by generalizing the cycle by powers of two.
Full details here, with due reference to Floyd -
http://web.comlab.ox.ac.uk/oucl/work/richard.brent/pd/rpb051i.pdf

The following from Wikipedia:
Input: n, the integer to be factored; x0, such that 0 ≤ x0 ≤ n; m such that m > 0; and f(x), a pseudo-random function modulo n.
Output: a non-trivial factor of n, or failure.
1. y ← x0, r ← 1, q ← 1.
2. Do:
1. x ← y
2. For i = 1 To r:
1. y ← f(y)
3. k ← 0
4. Do:
1. ys ← y
2. For i = 1 To min(m, r − k):
1. y ← f(y)
2. q ← (q × |x − y|) mod n
3. g ← GCD(q, n)
4. k ← k + m
5. Until (k ≥ r or g > 1)
6. r ← 2r
3. Until g > 1
4. If g = n then
1. Do:
1. ys ← f(ys)
2. g ← GCD(|x − ys|, n)
2. Until g > 1
5. If g = n then return failure, else return g

Also, the pseudocode immediately below is taken from a PD document by Connelly Barnes of Oregon State University
http://oregonstate.edu/~barnesc/documents/factoring.pdf

function brentFactor(N)
# Initial values x(i) and x(m) for i = 0.
xi := 2
xm := 2
for i from 1 to infinity
# Find x(i) from x(i-1).
xi := (xi ^ 2 + 1) % N
s := gcd(xi - xm, N)
if s <> 1 and s <> N then
return s, N/s
end if
if integralPowerOf2(i) then
xm := xi
end if
end do
end function

Here is its Java implementation from superfac9:

BigInteger factorizebrent(BigInteger n) {
BigInteger k = new BigInteger("1");
BigInteger r = new BigInteger("1");
BigInteger i = new BigInteger("1");
BigInteger m = new BigInteger("1");
BigInteger iter = new BigInteger("1");
BigInteger z = new BigInteger("1");
BigInteger x = new BigInteger("1");
BigInteger y = new BigInteger("1");
BigInteger q = new BigInteger("1");
BigInteger ys = new BigInteger("1");

m=TEN;
r=ONE;
iter=ZERO;
z=ZERO;
y=z;
q=ONE;

do {
x=y;
for (i=ONE;i.compareTo(r)<=0;i=i.add(ONE)) y=((y.multiply(y)).add(THREE)).mod(n);
k=ZERO;
do {
iter=iter.add(ONE);
// System.out.print("iter=" + iter.toString() + '\r');
ys=y;
for (i=ONE;i.compareTo(mr_min(m,r.subtract(k)))<=0;i=i.add(ONE)) {
y=((y.multiply(y)).add(THREE)).mod(n);
q=((y.subtract(x)).multiply(q)).mod(n);
}
z=n.gcd(q);
k=k.add(m);
} while (k.compareTo(r)<0 && z.compareTo(ONE)==0);
r=r.multiply(TWO);
} while (z.compareTo(ONE)==0 && iter.compareTo(TENTHOUSAND)<0);

if (z.compareTo(n)==0) do {
ys=((ys.multiply(ys)).add(THREE)).mod(n);
z=n.gcd(ys.subtract(x));
} while (z.compareTo(ONE)==0);

return z;
}

Achievements of this method include the factorization, in 1980, of the eighth Fermat number:
http://wwwmaths.anu.edu.au/~brent/pub/pub061.html

Sunday, 4 November 2007

Pollard P-1 Method

There is also the Pollard "p-1" method, invented in 1974. It relies on Fermat's Little Theorem.
Here is a Wikipedia article about this method:
http://en.wikipedia.org/wiki/Pollard%27s_p_-_1_algorithm

And here is its description on the mersenneforum:
http://mersennewiki.org/index.php/P-1_Factorization_Method

Also, the pseudocode immediately below is taken from a PD document by Connelly Barnes of Oregon State University
http://oregonstate.edu/~barnesc/documents/factoring.pdf

function pollard_p1(N)
# Initial value 2^(k!) for k = 0.
two_k_fact := 1
for k from 1 to infinity
# Calculate 2^(k!) (mod N) from 2^((k-1)!).
two_k_fact := modPow(two_k_fact, k, N)
rk := gcd(two_k_fact - 1, N)
if rk <> 1 and rk <> N then
return rk, N/rk
end if
end for
end function

Saturday, 3 November 2007

A Random Factorization

Below is the factorization of a random 100-digit number.

First in Pari/GP: (which uses Rho, ECM and MPQS)
[NB the time quoted is process-time, rather than real-time]
? #
timer = 1 (on)
? factor(905771525917281232131519213461223147373627632478259763073719184206592688398458994971036043749073482)
time = 12mn, 17,641 ms.
%1 =
[2 1]

[3 1]

[11 1]

[18701 1]

[111977 1]

[122016508135030794072521 1]

[3174449800530489735869567 1]

[16919752823495547077187437987066464785943 1]

...and then with superfac9:
tiggatoo:~/math james$ time java superfac9 < random100d.txt
[905771525917281232131519213461223147373627632478259763073719184206592688398458994971036043749073482]
wanless...
brutep: 2
wanless...
wanless...
brutep: 3
brutep: 11
ecm...
ecm...
aprtcle: 18701
aprtcle: 111977
ecm...
aprtcle: 122016508135030794072521
siqs...
aprtcle: 3174449800530489735869567
aprtcle: 16919752823495547077187437987066464785943
duration: 27490 seconds

Exception in thread "main" java.lang.StringIndexOutOfBoundsException: String index out of range: 0
at java.lang.String.charAt(String.java:558)
at superfac9.main(superfac9.java:192)

real 458m11.888s
user 59m52.469s
sys 0m29.617s

Friday, 2 November 2007

Trial Division

Perhaps I should have started with this, but...
Trial division is the simplest and most naive of factorization methods.
It _is_ however, the fastest method for easily eliminating very small factors from composite inputs.
Note also, though, interestingly, that in fact it is also just about the only method capable of finding small (or indeed any) factors of really large numbers of specific algebraic form(s), because those algebraic forms can then be calculated modulo the suspected factor, keeping the required calculations small.

http://en.wikipedia.org/wiki/Trial_division

Also, the pseudocode immediately below is taken from a PD document by Connelly Barnes of Oregon State University
http://oregonstate.edu/~barnesc/documents/factoring.pdf

function trialDivision(N)
for s from 2 to floor(sqrt(N))
if s divides N then
return s, N/s
end if
end for
end function

Here is its Java implementation (as 'brute' - standing for 'brute force') from superfac9:

BigInteger factorizebrute(BigInteger n) {
BigInteger a = new BigInteger("2");

while (a.compareTo(TENTHOUSAND) < 0 && a.multiply(a).compareTo(n) <= 0) {
if (n.remainder(a).compareTo(ZERO) == 0)
return a;
else
a = a.add(ONE);
}

return n;
}

Thursday, 1 November 2007

P+1 p60

This just in... Alex Kruppa is reporting a new, record-size factor, of 60 digits, found with the P+1 method.
Read all about it here:
http://mersenneforum.org/showthread.php?p=117442#post117442

Wave-particle duality

Today I (also) have some musings on the wave-particle duality for you, and its effect on factorization.
Clearly there is a mapping (according to QT), from the wave-to-particle domains, or very equivalently, the time-to-space domains.
As integers get larger, the required precision to express that integer increases - leading to a more efficient representation/manipulation in the wave domain, rather than the particle. Hence FFT-methods for eg Large-integer-multiplication (more on that some other time hopefully).
Note that a so-called quantum computer, would also be operating naturally largely in the wave-domain.
This leads me to the suggestion, that maybe a QC would operate best on native representations of FFTs...

Russian Peasant

'Russian Peasant' is a method for fast exponentiation. It is also known as 'exponentiation by squaring'. The basic idea is to exploit knowledge of the binary expansion of the exponent, by using selective repeated squaring. Depending on the factorization algorithm, and the size of the input number, this can lead to significant performance enhancement.

http://en.wikipedia.org/wiki/Exponentiation_by_squaring

Here is an on-line demo:
http://www.math.umbc.edu/~campbell/NumbThy/Class/BasicNumbThy.html#Modular-PowMod

The classic description is in Knuth (The Art of Computer Programming) 4.6.3
Java code as below:
static BigInteger power(BigInteger x, BigInteger n, BigInteger mod) {  // Knuth 4.6.3 - computes x^n modulo mod   BigInteger N = new BigInteger("0");   BigInteger Y = new BigInteger("0");   BigInteger Z = new BigInteger("0");
N=n;   Y=one;   Z=x;
while (true) {   if (N.remainder(two).compareTo(zero) > 0) {   N=N.divide(two);   Y=Z.multiply(Y).remainder(mod);   if (N.compareTo(zero) == 0)   return Y;   }   else {   N=N.divide(two);   }   Z=Z.multiply(Z).remainder(mod);   }   }