Discussion:
Pollard revisited
none) (albert
2022-12-07 14:17:57 UTC
Permalink
In Forth Dimensions volume 6, number 6, Grossman published an article
about Pollard's method of factoring numbers.
See the excellent explanation via
http://soton.mpeforth.com/flag/fd/index.html

He was obliged to implement quad-precision operators.
The example 4225910033 was at the limit, gaining a factor of
2 from using unsigned numbers. This took 40 seconds.

The same example takes less than 400 µs on my 64-bit AMD machine.
Moreover, this implementation can handle double the number
of digits (64-bit signed), even though it has no need to resort
to double precision beyond the standard UM/MOD and UM* .

\ ----------------------------------
\ Pollard (rho) factoring method

: GCD ( a b -- gcd ) BEGIN OVER MOD DUP WHILE SWAP REPEAT DROP ;

VARIABLE n
VARIABLE b 1 b !
VARIABLE x 6 x !

: next ( x -- x' ) DUP UM* n @ UM/MOD DROP b @ + ;  \ x' = x*x mod n, plus b

: try ( x0 x1 -- x0' x1' ) SWAP next SWAP next next 2DUP = ABORT" failure" ;

: factorize ( u -- factor ) DUP n ! x @ DUP next
BEGIN 2DUP - ABS n @ GCD DUP 1 = WHILE DROP try REPEAT
NIP NIP GCD ;
\ REGRESS 4294967297 factorize S: 641
\ REGRESS 4225910033 factorize S: 65011
\ REGRESS 1,000,000,007 DUP * factorize S: 1,000,000,007

\ ----------------------------------

The REGRESS lines are tests or examples.
This is a Monte Carlo program: you may not succeed with
the given values of x and b. You can change them, as long
as b is not 0 or -2 .
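For those who want to cross-check the algorithm without a Forth system, here is my sketch of the same iteration in Python (a transcription, not the original code: `next` becomes f(x) = x*x mod n + b, with the same start values x = 6 and b = 1):

```python
from math import gcd

def factorize(n, x0=6, b=1):
    """Pollard rho as in the Forth code above: f(x) = x*x mod n + b,
    Floyd cycle detection, gcd of the difference against n."""
    f = lambda x: (x * x) % n + b
    x, y = x0, f(x0)            # x steps once, y twice per round
    while True:
        g = gcd(abs(x - y), n)
        if g != 1:
            return g            # a nontrivial factor (or n itself on bad luck)
        x, y = f(x), f(f(y))
        if x == y:
            raise RuntimeError("failure")   # mirrors ABORT" failure"

# the REGRESS examples from above
for n in (4294967297, 4225910033):
    p = factorize(n)
    print(n, "=", p, "*", n // p)
```

The REGRESS lines above predict the factors 641 and 65011 for these two inputs.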

Groetjes Albert
--
"in our communism country Viet Nam, people are forced to be
alive and in the western country like US, people are free to
die from Covid 19 lol" duc ha
***@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Marcel Hendrix
2022-12-07 19:15:51 UTC
Permalink
On Wednesday, December 7, 2022 at 3:18:01 PM UTC+1, none albert wrote:
[..]
Post by none) (albert
This is a Monte Carlo program. You may not succeed with
the values of x and b given. You can change them as long
as b is not 0 or -2 .
FORTH> 1111 factorize . 11 ok
FORTH> 1111 11 /mod . . 101 0 ok
FORTH> 11111 dup factorize /mod . . 271 0 ok
FORTH> 111111 dup factorize /mod . . 10101 0 ok
FORTH> 1111111 dup factorize /mod . . 4649 0 ok
FORTH> 11111111 dup factorize /mod . . 1010101 0 ok
FORTH> 111111111 dup factorize /mod . . 12345679 0 ok
FORTH> 1111111111 dup factorize /mod . . 101010101 0 ok
FORTH> 11111111111 dup factorize /mod . . 513239 0 ok
FORTH> 111111111111 dup factorize /mod . . 10101010101 0 ok
FORTH> 1111111111111 dup factorize /mod . . 20964360587 0 ok
FORTH> 11111111111111 dup factorize /mod . . 1010101010101 0 ok
FORTH> 111111111111111 dup factorize /mod . . 3584229390681 0 ok
FORTH> 1111111111111111 dup factorize /mod . . 101010101010101 0 ok
FORTH> 11111111111111111 dup factorize /mod . . 5363222357 0 ok
FORTH> 111111111111111111 dup factorize /mod . . 10101010101010101 0 ok
FORTH> 1111111111111111111 dup factorize /mod . .

Hmmm, apparently it's not only 0 and -2 that are special ...

-marcel
lehs
2022-12-08 09:07:18 UTC
Permalink
[..]
Post by Marcel Hendrix
[..]
FORTH> 1111111111111111111 dup factorize /mod . .
Hmmm, apparently it's not only 0 and -2 that are special ...
-marcel
Pollard rho will always find a factor if you increase the start value after each failure.

https://forthmath.blogspot.com/2020/01/about-pollard-rho.html
minf...@arcor.de
2022-12-08 09:27:23 UTC
Permalink
Post by lehs
Pollard rho will always give an answer if increasing start values at failure.
https://forthmath.blogspot.com/2020/01/about-pollard-rho.html
Thanks! I did not see the forest for the trees. I now gather that, with the
function (x^2+1) mod n unchanged, simply stepping up x0 after each failure
guarantees a factorization. So after at most 2 rounds there is a result, correct?

BTW I like your web page, I did not know it existed.
lehs
2022-12-08 09:47:53 UTC
Permalink
Post by ***@arcor.de
[..]
So after max 2 rounds there is a result, correct?
Thanks! You may have to increase it several times.

Here is the code I use (all singles unsigned):


2 value start

: func \ x numb -- (x*x+1) mod numb
swap dup um* 1 0 d+ rot um/mod drop ;

: pollard1 \ numb start -- pfac flag ( numb is an odd composite )
swap locals| numb | dup 1
begin drop numb func swap
numb func numb func swap
2dup - abs
numb ugcd dup numb = \ gcd flag
if 2drop false exit
then dup 1 = 0= \ gcd<>1
until nip nip true ;

: pollard2 \ numb -- prime ( numb>1 )
dup 1 and 0= if drop 2 exit then
dup isprime if exit then
dup sqrtf dup * over =
if sqrtf exit then -1 start \ numb limit start
do dup i pollard1 \ numb pfac flag
if leave \ numb pfac
then drop \ numb
loop nip ; \ pfac
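The retry strategy (pollard2 stepping the start value up until pollard1 succeeds) can be sketched in Python; the primality and perfect-square screening of the Forth version is elided, and the names are merely stand-ins for the Forth words, not a transcription:

```python
from math import gcd

def pollard1(n, c):
    """One rho attempt with f(x) = (x*x + 1) % n, starting at c.
    Returns (factor, True) on success, (None, False) when the walk
    closes its cycle without exposing a factor."""
    f = lambda x: (x * x + 1) % n
    x, y = c, c
    while True:
        x, y = f(x), f(f(y))
        g = gcd(abs(x - y), n)
        if g == n:
            return None, False      # this start value failed
        if g != 1:
            return g, True

def pollard2(n, start=2):
    """Step the start value up after each failure, as in the Forth loop."""
    c = start
    while True:
        factor, ok = pollard1(n, c)
        if ok:
            return factor
        c += 1

print(pollard2(8051))   # 8051 = 83 * 97, the textbook rho example
```

Note that the Forth pollard2 screens out primes and perfect squares before the retry loop; n = p^2 is a notoriously bad case for rho with this f, which is presumably why it is special-cased with sqrtf.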
minf...@arcor.de
2022-12-08 08:42:24 UTC
Permalink
Post by none) (albert
[..]
This is a Monte Carlo program. You may not succeed with
the values of x and b given. You can change them as long
as b is not 0 or -2 .
Your Forth code is very concise, congrats!

A long while ago I experimented with Brent's algorithm with
128-bit integers, but did not pursue it because I did not find
good ways to select modified start parameters and function
(your next) after a factorization failure.

You are probably more versed in mathematics than I am,
so what do you suggest: simply stepping parameters up, random
generation, or?
Paul Rubin
2022-12-08 10:16:55 UTC
Permalink
Post by ***@arcor.de
You are probably more versed in mathematics than I am,
so what do you suggest: simply stepping parameters up, random
generation, or?
Pollard's rho is conceptually simple but runs out of gas sooner than
other, fancier algorithms do. I think the satisfying thing to do is
implement the fancier algorithms. I confess I've implemented Pollard
Rho (not in Forth) but haven't gone further.

Neal Koblitz's book "A Course in Number Theory and Cryptography" has a
good introduction to the subject, up through the Quadratic Sieve (QS).
It unfortunately doesn't discuss the Number Field Sieve.
lehs
2022-12-08 10:40:23 UTC
Permalink
Post by Paul Rubin
Pollard's rho is conceptually simple but runs out of gas sooner than
other, fancier algorithms do.
[..]
I guess 64-bit singles are close to the limit for Pollard rho.
none) (albert
2022-12-08 11:26:38 UTC
Permalink
Post by Paul Rubin
Pollard's rho is conceptually simple but runs out of gas sooner than
other, fancier algorithms do. I think the satisfying thing to do is
implement the fancier algorithms.
[..]
We (Forth fig) had a quadratic sieve running in 1996 on 4 transputers.
The charm of the simple algorithm presented here is that it is short and easy
to understand. I can't fathom that you think I present it as the
end-all of factorization algorithms.

Implementing fancier algorithms is not satisfying at all.
You soon realize that you need to be a good mathematician (I
think I am), but also that you need to invest years of your life to get
to the frontier.

I challenge you to present an algorithm that has a better
performance-to-conciseness ratio than what I presented here.

Groetjes Albert
Paul Rubin
2022-12-10 06:39:07 UTC
Permalink
Post by none) (albert
I can't fathom that you think that I present here [Pollard rho] the
end-all of factorization algorithms.
Of course I don't think that. It's a nice simple way to factor numbers
that are a little too big for trial division.
Post by none) (albert
Implementing fancier algorithm is not satisfying at all. You soon
realize that you need to be a good mathematician (I think I am) but
that you need invest years of your life to get at the frontier.
The book I mentioned discusses ECM and the Quadratic Sieve in addition
to the rho method and a few others. Those aren't the frontier and don't
take years to understand or implement, given a reasonable math
background (a long way from the frontier). The remaining algorithm is
the NFS which I don't understand, but which doesn't seem to be alien
technology either.
Post by none) (albert
I challenge you to present an algorithm that has a better
performance to conciseness ratio than what I presented here.
I agree that the fancier methods take more code. But they have better
asymptotic performance so you can factor bigger numbers with them.
minf...@arcor.de
2022-12-10 08:07:12 UTC
Permalink
Post by Paul Rubin
[..]
I agree that the fancier methods take more code. But they have better
asymptotic performance so you can factor bigger numbers with them.
The question is: when do numbers start becoming "bigger numbers"?
When I start thinking about elliptic-curve methods for factorization, e.g.
https://github.com/sethtroisi/gmp-ecm
and doing it in Forth, the lack of bignum math becomes apparent. Instead
of reinventing wheels, bignum library support a la GMP is required,
plus an adequate I/O and API interface on the Forth system side.
But well, this would still be far from the alleged frontier, I fear .. ;-)
Marcel Hendrix
2022-12-10 11:34:21 UTC
Permalink
On Saturday, December 10, 2022 at 9:07:15 AM UTC+1, ***@arcor.de wrote:
[..]
Post by ***@arcor.de
The question is, when do numbers start becoming "bigger numbers"?
When I start thinking about elliptic curve methods for factorization e.g.
https://github.com/sethtroisi/gmp-ecm
and doing it in Forth, the lack of bignum math becomes apparent. Instead
of reinventing wheels, bignum library support a la GMP MP is required
plus an adequate I/O and API interface on Forth system side.
But well, this would still be far away from the alleged frontier I fear .. ;-)
iForth has three bignum math libraries (one of them in pure Forth), plus
interfaces to gmp, mpc, and mpfr. I have to admit I never felt the need for
numbers with more than 2048 decimal places, but feel free.

I started out with iSPICE having arbitrary precision, but found out that there
are fundamental numerical decisions made in the SPICE foundations
that make this not a solution to anything. It is still possible to set a
compile-time constant to get 80-bit precision (the native FPU format), but the
only noticeable difference is a 2x slowdown.

-marcel
minf...@arcor.de
2022-12-10 13:47:38 UTC
Permalink
Post by Marcel Hendrix
[..]
It is still possible to set a compile-time
constant to get 80-bit precision (native FPU format), but the only noticeable
difference is a 2x slowdown.
With that slowdown one could also use _float128 libraries like libquadmath.
But of course it would not help with arbitrary-length integer factorization.

BTW there is also that funny double-double format that uses pairs of normal
_float64 numbers - e.g. available for Julia and Python. It might be an
interesting exercise for Forth too, because it could be implemented with the
already existing fp-stack infrastructure.
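The building block of that double-double format is the error-free transform: two doubles s and e with a + b == s + e exactly. A minimal sketch in Python (whose floats are IEEE binary64), using Knuth's TwoSum; this illustrates the idea only and is not any particular library's API:

```python
def two_sum(a, b):
    """Knuth's TwoSum: s = fl(a + b), e = the rounding error,
    so that a + b == s + e exactly."""
    s = a + b
    bb = s - a                        # the part of b that made it into s
    e = (a - (s - bb)) + (b - bb)     # what rounding threw away
    return s, e

# 1e-17 is below the rounding step near 1.0, so a plain sum drops it entirely:
assert 1.0 + 1e-17 == 1.0
# ...while the double-double pair keeps it as the low word:
s, e = two_sum(1.0, 1e-17)
print(s, e)   # 1.0 1e-17
```

A full double-double add chains a few of these transforms plus a renormalization step; on a Forth fp stack each TwoSum is six F+/F- operations and some stack shuffling.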
Marcel Hendrix
2022-12-10 14:35:16 UTC
Permalink
[..]
Post by ***@arcor.de
BTW there is also that funny double-double format that uses pairs of normal
_float64 numbers - f.ex. available for Julia and Python. It might present an interesting
exercise for Forth too because it could be implemented with already existing
fp-stack infrastructure.
I have that, too. Julian Noble introduced it, and I added a 112-bit FP library.
(The iForth vector-matrix module supports it as a plugin.) There are quite
a few c.l.f. postings about this variant.

I'm hoping that I can drop the exotics when hardware quad precision
becomes available (although of course then somebody will publish a
240-bit package :--)

-marcel
Paul Rubin
2022-12-12 00:45:13 UTC
Permalink
Post by ***@arcor.de
The question is, when do numbers start becoming "bigger numbers"?
When I start thinking about elliptic curve methods for factorization e.g.
https://github.com/sethtroisi/gmp-ecm
To factor the number N by trial division, you need around sqrt(N)
operations. The rho method takes around N^(1/4), so you can probably
get out to 35 or 40 decimal digits with it if you can let it run for a
while. State-of-the-art methods can do maybe 150 digits on a small
cluster of workstations. I believe the record was a big distributed
computation that took some months to factor a number of around 230
digits.
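Those growth rates are easy to tabulate. A back-of-the-envelope sketch (illustrative operation counts only, taking cost sqrt(N) for trial division and N^(1/4) for rho as stated above):

```python
# rough operation counts for factoring a d-digit number N ~ 10**d
for digits in (20, 30, 40):
    n = 10.0 ** digits
    trial = n ** 0.5      # ~ sqrt(N) trial divisions
    rho = n ** 0.25       # ~ N**(1/4) rho iterations
    print(f"{digits} digits: trial ~ {trial:.0e} ops, rho ~ {rho:.0e} ops")
```

At 40 digits rho needs on the order of 1e10 iterations, which matches "if you can let it run for a while".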

I agree that trying to do it in Forth probably gets in the way of the
math. You could look at the MIRACL library, which implements a bunch of
those algorithms, up through the Quadratic Sieve (QS) but not the Number
Field Sieve (NFS), iirc. https://github.com/miracl/MIRACL
none) (albert
2022-12-12 10:46:04 UTC
Permalink
Post by Paul Rubin
[..]
You could look at the MIRACL library, that implements a bunch of
those algorithms, up through the Quadratic Sieve (QS) but not the Number
Field Sieve (NFS), iirc. https://github.com/miracl/MIRACL
We implemented the Multiple Polynomial Quadratic Sieve on
transputers (1996).
It was more a demonstration of parallelism, showing
the solutions found by the transputers in a triangular display
indicated by colors. However, the primes involved in the sieve
are relatively small and can be handled by 32-bit processors
such as the transputers.
To be honest, I borrowed the polynomials from ucbasic and understood
them poorly.
(3 transputers sieving in parallel.)

Groetjes Albert
Marcel Hendrix
2022-12-12 12:42:02 UTC
Permalink
On Monday, December 12, 2022 at 11:46:07 AM UTC+1, none albert wrote:
[..]
Post by none) (albert
We have implemented the Multiple Polynomial Quadratic Sieve on
transputers. (1996)
[..]
According to the original source code, mpqss/config.frt,

\ The following defines the number of primes contained in the bit table
\ For a 32 bit processor multiply by 32.
\ The number of primes is related to the maximum prime MaxP
#40 =: MaxB \ Length in CELLs of a bit array

\ >> This may no longer be applicable to the new Vnumber system. <<
\ We want to handle 100 digit numbers, Maxv =20 allows squaring
\ #20 =: MaxV \ Length in CELLs of Vnumber
#40 =: MaxV \ Length in CELLs of Vnumber

So we used 1280 bits arbitrary precision to factor 100-digit numbers.

-marcel
minf...@arcor.de
2022-12-12 13:15:18 UTC
Permalink
Post by Marcel Hendrix
[..]
So we used 1280 bits arbitrary precision to factor 100-digit numbers.
This is really impressive !!

Msieve has some info about number field sieves in its Readme files:
https://sourceforge.net/projects/msieve/files/msieve/Msieve%20v1.53/msieve153_src.tar.gz/download
An interesting and sobering confirmation that I will
never get beyond advanced dilettantism in this numeric discipline.

And I don't own spare Nvidia cards for number crunching. ;-)
Bitcoin miners plundered the market...
none) (albert
2022-12-12 15:11:20 UTC
Permalink
Post by Marcel Hendrix
[..]
So we used 1280 bits arbitrary precision to factor 100-digit numbers.
You can download the code via
https://home.hccnet.nl/a.w.m.van.der.horst/transputer.html
I was mistaken, this was 1993. It won a (first?) prize at the HCC-DAGEN,
the legendary Dutch computer fair, where it was demonstrated using a
24 kg, 24-inch CRT.
This was before the ISO 93 standard was stable:
VALUEs get no initialisation, and there are
tons of idiosyncrasies:
REVISION
/* comment
=:
:ABOUT

The example used was the 8th Fermat number, 1 bit out of reach for
the Pollard example, with a 274... factor.

There is a ton of comments; start reading with LINKS.FRT and MAIN.FRT.
Groetjes Albert
minf...@arcor.de
2022-12-12 11:10:19 UTC
Permalink
Post by Paul Rubin
[..]
You could look at the MIRACL library, that implements a bunch of
those algorithms, up through the Quadratic Sieve (QS) but not the Number
Field Sieve (NFS), iirc. https://github.com/miracl/MIRACL
According to Wikipedia, "In number theory, the general number field sieve (GNFS)
is the most efficient classical algorithm known for factoring integers
larger than 10^100." This is a bit heavy for Forth.

But there is a language with a similar name: Fortran... ;-)