mpz version of cofactorization methods are very very slow

Imported issue: Initially reported by Cyril Bouvier in https://gforge.inria.fr/tracker/?group_id=2065&aid=21649

All methods (P-1, P+1, ECM) for the cofactorization comes in 4 flavours (ul, 15ul, 2ul2, mpz) to deal with the different sizes of the input numbers.

It seems that the mpz versions is much slower, even for numbers that fit in 3 ul. Some data for ecm:

ul

$ ./build/zaphod/sieve/ecm/testbench -cof 4294967311 -p -ecm 315 5355 11 3 10000000

Strategy has 1 method(s)

Of 664578 tries there were 422826 with a factor found

Ratio: 0.636232

Total time: 19.93 s, per call: 29.992482 usec, per factor: 47.140772 usec

15ul

$ ./build/zaphod/sieve/ecm/testbench -cof 18446744073709551629 -p -ecm 315 5355 11 3 10000000

Strategy has 1 method(s)

Of 664578 tries there were 422826 with a factor found

Ratio: 0.636232

Total time: 38.86 s, per call: 58.480199 usec, per factor: 91.916424 usec

2ul2

$ ./build/zaphod/sieve/ecm/testbench -cof 79228162514264337593543950397 -p -ecm 315 5355 11 3 10000000

Strategy has 1 method(s)

Of 664578 tries there were 422826 with a factor found

Ratio: 0.636232

Total time: 40.43 s, per call: 60.836562 usec, per factor: 95.620045 usec

mpz

$ ./build/zaphod/sieve/ecm/testbench -cof 340282366920938463463374607431768211507 -p -ecm 315 5355 11 3 10000000

Strategy has 1 method(s)

Of 664578 tries there were 422826 with a factor found

Ratio: 0.636232

Total time: 582.84 s, per call: 877.004117 usec, per factor: 1378.433781 usec

After investigation, it seems it is due to the fact that most functions use mod_init/mod_clear function. These functions are noop for ul, 15ul, 2ul2 but not for mpz where they allocate and free memory. They are called thousands of time for a simple ecm computation, which corresponds to thousands of small malloc and free.

Is it a big problem ? Not really because even for very large numbers the input numbers of the cofactorization step fit on 2ul (in params.c320, mfb1 is 105).

Solutions:

Write a 3ul arith
Rewrite every functions in the cofact to minimize the numbers of allocated variables, for example by passing temporary variables to the function.
could we used mpfq for arith in the cofact ?

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information