mpz version of cofactorization methods are very very slow
Imported issue: Initially reported by Cyril Bouvier in https://gforge.inria.fr/tracker/?group_id=2065&aid=21649
All methods (P-1, P+1, ECM) for the cofactorization comes in 4 flavours (ul, 15ul, 2ul2, mpz) to deal with the different sizes of the input numbers.
It seems that the mpz versions is much slower, even for numbers that fit in 3 ul. Some data for ecm:
ul
$ ./build/zaphod/sieve/ecm/testbench -cof 4294967311 -p -ecm 315 5355 11 3 10000000
Strategy has 1 method(s)
Of 664578 tries there were 422826 with a factor found
Ratio: 0.636232
Total time: 19.93 s, per call: 29.992482 usec, per factor: 47.140772 usec
15ul
$ ./build/zaphod/sieve/ecm/testbench -cof 18446744073709551629 -p -ecm 315 5355 11 3 10000000
Strategy has 1 method(s)
Of 664578 tries there were 422826 with a factor found
Ratio: 0.636232
Total time: 38.86 s, per call: 58.480199 usec, per factor: 91.916424 usec
2ul2
$ ./build/zaphod/sieve/ecm/testbench -cof 79228162514264337593543950397 -p -ecm 315 5355 11 3 10000000
Strategy has 1 method(s)
Of 664578 tries there were 422826 with a factor found
Ratio: 0.636232
Total time: 40.43 s, per call: 60.836562 usec, per factor: 95.620045 usec
mpz
$ ./build/zaphod/sieve/ecm/testbench -cof 340282366920938463463374607431768211507 -p -ecm 315 5355 11 3 10000000
Strategy has 1 method(s)
Of 664578 tries there were 422826 with a factor found
Ratio: 0.636232
Total time: 582.84 s, per call: 877.004117 usec, per factor: 1378.433781 usec
After investigation, it seems it is due to the fact that most functions use mod_init/mod_clear function. These functions are noop for ul, 15ul, 2ul2 but not for mpz where they allocate and free memory. They are called thousands of time for a simple ecm computation, which corresponds to thousands of small malloc and free.
Is it a big problem ? Not really because even for very large numbers the input numbers of the cofactorization step fit on 2ul (in params.c320, mfb1 is 105).
Solutions:
-
Write a 3ul arith
-
Rewrite every functions in the cofact to minimize the numbers of allocated variables, for example by passing temporary variables to the function.
-
could we used mpfq for arith in the cofact ?