Newer
Older
2019-08-16 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpn/generic/brootinv.c: Shorten computations, using even exponent.
* mpn/generic/powlo.c: Avoid copies with a flipflop.
Speed support for gcd_22. Calls mpn_gcd_22(al, al, bl, bl), so
that B+1 is a common factor.
* tune/speed.h (SPEED_ROUTINE_MPN_GCD_22): New macro.
* tune/speed.c (routine): Add mpn_gcd_22.
* tune/common.c (speed_mpn_gcd_22): New function.
* mpn/generic/gcd.c (gcd_2): Moved to gcd_22.c below.
(mpn_gcd): Adapt for calling gcd_22.
* mpn/generic/gcd_22.c (mpn_gcd_22): New file and function.
* gmp-impl.h (mp_double_limb_t): New (typedef) struct.
* configure.ac (gmp_mpn_functions): Added gcd_22.
* tests/mpn/t-gcd_22.c: New test.
* tests/mpn/Makefile.am (check_PROGRAMS): Add t-gcd_22.
* tests/refmpz.c (refmpz_gcd): New function (plain binary gcd).
2019-08-13 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64: Add more gcd_11 variants of of x86_64 gcd_11.asm and
tweak existing ones.
2019-08-13 Marco Bodrato <bodrato@mail.dm.unipi.it>
From Seth Troisi:
* doc/gmp.texi: Update mpz_millerrabin documentation.
* mpn/x86_64/bd2/gcd_11.asm: Micro-optimisation.
* doc/gmp.texi: Further update in mpz_millerrabin.
* tests/misc.c: Silence a warning.
* tests/mpz/t-pprime_p.c (const primes): One more prime in the list.
* mpz/millerrabin.c: Return 2 for surely prime numbers (BPSW checked).
2019-08-08 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86/gcd_11.asm: New file.
* config.sub: Make arm cpu types match what config.guess returns.
2019-08-08 Niels Möller <nisse@lysator.liu.se>
* tests/refmpn.c (refmpn_gcd_11): New function, based on refmpn_gcd_1.
(refmpn_gcd_1): Use it.
* tests/mpn/t-gcd_11.c: New file, test mpn_gcd_11.
* tests/mpn/Makefile.am (check_PROGRAMS): Add t-gcd_11.
2019-08-07 Torbjörn Granlund <tg@gmplib.org>
* mpn/alpha/ev67/gcd_11.asm: New file, mostly extracted from gcd_1.asm.
* mpn/arm/v5/gcd_11.asm: Likewise.
* mpn/arm/v6t2/gcd_11.asm: Likewise.
* mpn/arm64/gcd_11.asm: Likewise.
* mpn/powerpc64/mode64/gcd_11.asm: Likewise.
* mpn/powerpc64/mode64/p7/gcd_11.asm: Likewise.
* mpn/powerpc64/mode64/p9/gcd_11.asm: Likewise.
* mpn/sparc64/gcd_11.asm: Likewise.
* mpn/x86/k7/gcd_11.asm: Likewise.
* mpn/x86/p6/gcd_11.asm: Likewise.
* mpn/x86_64/bd2/gcd_11.asm: Likewise.
* mpn/x86_64/core2/gcd_11.asm: Likewise.
* mpn/x86_64/gcd_11.asm: Likewise.
* tune/common.c (speed_mpn_gcd_11): New function.
* tune/speed.h (speed_mpn_gcd_11): Declare it.
(SPEED_ROUTINE_MPN_GCD_11): New macro.
* tune/speed.c (routine): Add mpn_gcd_11.
* configure.ac (gmp_mpn_functions): Added gcd_11. Also add
HAVE_NATIVE_mpn_gcd_11.
* mpn/generic/gcd_11.c (mpn_gcd_11): New file and function,
extracted from mpn_gcd_1.
* gmp-h.in (mpn_gcd_11): Declare it.
* mpn/generic/gcd_1.c (mpn_gcd_1): Adapted to call mpn_gcd_11.
2019-08-04 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/bt2/gcd_1.asm: New grabber file.
* mpn/x86_64/zen/gcd_1.asm: Grab from "bd2" directory, was "core2".
2019-08-02 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/bd2/gcd_1.asm: New file.
2019-08-01 Torbjörn Granlund <tg@gmplib.org>
* tests/mpf/t-conv.c: Add several more fixed test cases.
* mpf/set_str.c: Ignore leading zeros including ones after radix point
to avoid invalid output formats.
2019-07-30 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc64/mode64/p9/gcd_1.asm: New file.
2019-07-30 Niels Möller <nisse@lysator.liu.se>
From Seth Troisi:
* doc/gmp.texi (Jacobi Symbol): Update algorithm documentation.
* tests/mpz/t-jac.c: Comment update.
2019-07-13 Torbjörn Granlund <tg@gmplib.org>
* configure.ac (arm): Generalise arm a72 pattern to match a73...a79.
2019-07-08 Torbjörn Granlund <tg@gmplib.org>
* mpn/arm/arm-defs.m4 (ASM_START): Rewrite (fix broken error handling).
2019-07-02 Torbjörn Granlund <tg@gmplib.org>
* acinclude.m4 (GMP_C_DOUBLE_FORMAT): Compile conftest.c to executable
in order to trigger final compile in case of LTO.
2019-06-17 Torbjörn Granlund <tg@gmplib.org>
* config.guess: Work around upstream configfsf.guess's regression wrt
mips vs mips64.
2019-06-14 Torbjörn Granlund <tg@gmplib.org>
* longlong.h (mips64): Provide r6 asm code as default expression yields
libcall.
* configure.ac (mips64): Use separate paths for r6 and non-r6 as these
architectures are mutually incompatible.
* mpn/mips64/{addmul_1,mul_1,sqr_diagonal,submul_1,umul}.asm:
Move into hilo subdir.
2019-05-28 Torbjörn Granlund <tg@gmplib.org>
* config.sub: Fixes to which cpu types end with a "*".
2019-04-20 Niels Möller <nisse@lysator.liu.se>
* doc/gmp.texi (References): Link to paper on subquadratic GCD.
* mpn/x86_64/bd1/hamdist.asm: Really make 2017-06-01 change: Use
3-operand DEF_OBJECT.
* mpn/x86_64/invert_limb.asm: Simplify mpn_invert_limb_table ref.
* mpn/x86_64/x86_64-defs.m4 (LEA): Use rip addressing for non-PIC.
2019-04-17 Niels Möller <nisse@lysator.liu.se>
* mpn/generic/jacobi.c (mpn_jacobi_n): Use JACOBI_DC_THRESHOLD,
not GCD_DC_THRESHOLD. Inconsistency spotted by Seth Troisi.
2019-04-02 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/fat/fat.c (__gmpn_cpuvec_init):
Split out silvermont handling, add handling of goldmont.
* configure.ac: Setup distinct paths for silvermont and goldmont.
(fat_path): Add missing x86_64/goldmont.
* config.guess: Recognise "Goldmont Plus".
2018-12-09 Torbjörn Granlund <tg@gmplib.org>
* mpn/generic/mul_fft.c (mpn_fft_add_sub_modF): New function.
(mpn_fft_fft, mpn_fft_fftinv): Use it.
2018-11-30 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc64/mode64/p9/gmp-mparam.h: New file.
* mpn/powerpc64/mode64/p9/add_n_sub_n.asm: New file.
2018-11-28 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc64/mode64/p9/sqr_basecase.asm: New file.
* mpn/powerpc64/mode64/p9/mul_1.asm: New file.
2018-11-18 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc64/mode64/p9/mul_basecase.asm: New file.
* mpn/powerpc64/mode64/p9/addmul_1.asm: Optimise.
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
2018-11-12 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc64/mode64/p9/aorsmul_1.asm: New file, providing fast
submul_1 (and redundant addmul_1).
2018-11-11 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc64/mode64/p9/addmul_1.asm: Tweak for slightly better
speed.
* mpn/powerpc32/powerpc-defs.m4: Define addex.
* mpn/powerpc64/mode64/p9/mul_2.asm: Use it.
* mpn/powerpc64/mode64/p9/addmul_2.asm: Likewise.
2018-11-08 Torbjörn Granlund <tg@gmplib.org>
* tests/devel/cnd_aors_n.c: New file.
* mpn/arm/neon/lorrshift.asm: Declare use of neon insns.
* mpn/arm/neon/lshiftc.asm: Likewise + cleanup.
* tests/devel/Makefile.am (EXTRA_PROGRAMS): Add missing files.
* mpn/powerpc64/mode64/p9/mul_2.asm: New file.
* mpn/powerpc64/mode64/p9/addmul_2.asm: New file.
2018-11-07 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpz/lucnum2_ui.c: Use mpn_rsblsh1_n if available.
* tests/mpz/t-nextprime.c: Add one more interval.
* tests/mpz/t-pprime_p.c (check_fermat_mersenne): New tests.
* mpn/generic/mod_1_3.c: typo in a comment.
* mpn/generic/fib2m.c: New file, Fibonacci numbers modulo.
* configure.ac (gmp_mpn_functions): Add it.
* gmp-impl.h: Declare mpn_fib2m.
* tests/mpn/t-fib2m.c: New file, tests for mpn_fib2m.
* tests/mpn/Makefile.am (check_PROGRAMS): Add t-fib2m.
* mpn/generic/mod_34lsub1.c: Initialise c[012] once.
* tests/mpz/t-pprime_p.c (check_primes): Two more primes.
* tests/mp?: Use TESTS_REPS in many files.
* mpn/generic/strongfibo.c: New file, Fibonacci primality test.
* configure.ac (gmp_mpn_functions): Add it.
* gmp-impl.h: Declare mpn_strongfibo.
* mpz/stronglucas.c: New file, strong Lucas primality test.
* Makefile.am (MPZ_OBJECTS): Add it.
* mpz/Makefile.am (libmpz_la_SOURCES): Add it.
* gmp-impl.h: Declare mpz_stronglucas.
* mpz/millerrabin.c: Implement BPSW test for primality.
2018-11-07 Torbjörn Granlund <tg@gmplib.org>
* configure.ac (arm): Support a12 and a17.
* config.sub: Generalise arm matching.
* config.guess: Recognise additional arm CPUs.
* mpn/arm/arm-defs.m4 (ASM_START): Provide local definition.
2018-10-30 Torbjörn Granlund <tg@gmplib.org>
* mpn/arm/v7a/cora17/mod_34lsub1.asm: New file.
* mpn/arm/v7a/cora17/gmp-mparam.h: New file.
* mpn/arm/v7a/cora17/mul_1.asm: New grabber file.
* mpn/arm/v7a/cora17/addmul_1.asm: Likewise.
* mpn/arm/v7a/cora17/submul_1.asm: Likewise.
2018-10-18 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpn/generic/fib2_ui.c: Simplify the possible -2 case.
2018-07-19 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpz/millerrabin.c (mod_eq_m1): New function, equality with -1.
* mpz/powm_ui.c: Small optimisations.
2018-07-03 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/lshift.asm: Remove cnt = 1 special code.
* mpn/x86_64/silvermont/popcount.asm: Add missing ABI_SUPPORT decls.
* mpn/x86_64/silvermont/hamdist.asm: Likewise.
* mpn/x86_64/zen/mul_1.asm: Likewise.
* mpn/x86_64/fastsse/lshift.asm: Support DOS64.
* mpn/x86_64/fastsse/lshiftc.asm: Likewise.
* mpn/x86_64/pentium4/gmp-mparam.h: Retune.
2018-07-01 Torbjörn Granlund <tg@gmplib.org>
* lshift.asm: Replace with grabber file.
* lshiftc.asm: Replace with grabber file.
* x86_64/pentium4/addmul_2.asm: New grabber file.
* x86_64/pentium4/aorsmul_1.asm: New grabber file.
* x86_64/pentium4/mul_1.asm: New grabber file.
* x86_64/pentium4/mul_2.asm: New grabber file.
* x86_64/pentium4/mul_basecase.asm: New grabber file.
* x86_64/pentium4/mullo_basecase.asm: New grabber file.
* x86_64/pentium4/redc_1.asm: New grabber file.
* x86_64/pentium4/sqr_basecase.asm: New grabber file.
2018-06-13 Niels Möller <nisse@lysator.liu.se>
* mpn/generic/gcd_1.c (mpn_gcd_1): Delete unused code variant for
GCD_1_METHOD == 1, and delete GCD_1_METHOD macro. Simplify the
structure of the remaining code variant, without gotos to the
mid-loop strip_u_maybe label.
2018-05-30 Torbjörn Granlund <tg@gmplib.org>
* configure.ac (x86): Provide goldmont specific path.
* mpn/x86_64/goldmont/gmp-mparam.h: New file.
2018-05-29 Torbjörn Granlund <tg@gmplib.org>
* configure.ac (x86): Pass more exact arch/tune options for nehalem.
2018-05-28 Niels Möller <nisse@lysator.liu.se>
* mpn/generic/gcd_1.c (mpn_gcd_1) [USE_ZEROTAB]: Delete unused code
variant for USE_ZEROTAB != 0.
2018-05-20 Marco Bodrato <bodrato@mail.dm.unipi.it>
* bootstrap.c: Define DONT_USE_FLOAT_H before including mini-gmp.
2018-05-14 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpn/generic/dcpi1_bdiv_q.c (mpn_dcpi1_bdiv_q_n): Decl. static.
(mpn_dcpi1_bdiv_q_n_itch): Declare static.
* mpn/generic/dcpi1_divappr_q.c (mpn_dcpi1_divappr_q_n): static.
* mpn/generic/matrix22_mul.c (mpn_matrix22_mul_strassen): static.
* mpn/generic/mu_div_qr.c (mpn_mu_div_qr_choose_in): static.
* mpn/generic/mu_divappr_q.c (mpn_preinv_mu_divappr_q): static.
(mpn_mu_divappr_q_choose_in): static.
* gmp-impl.h: Remove declaration of previous functions.
* mpn/generic/get_d.c: Enhance generic code using DBL_MANT_DIG.
* printf/repl-vsnprintf.c: Better handling floating-point
specifiers "EeGgFf" (Thanks Vincent Lefevre).
2018-05-04 Marco Bodrato <bodrato@mail.dm.unipi.it>
* doc/gmp.texi (mpq_*_str): Document the full base allowed range.
* mpq/get_str.c: Make all bases either work or return an error.
* doc/gmp.texi (Integer Internals): Lazy allocation and read-only.
2018-04-27 Niels Möller <nisse@lysator.liu.se>
* mpn/generic/div_q.c (mpn_div_q): Replace dead code with ASSERT.
Spotted by Paul Zimmermann and Raphaël Rieu-Hleft.
* tests/mpn/t-div.c (main): Fill quotient area with junk before
calling mpn_div_q.
2018-04-26 Marco Bodrato <bodrato@mail.dm.unipi.it>
* Makefile.am (EXTRA_DIST): Add mini-gmp/mini-mpq.[ch].
2018-04-23 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpn/generic/toom2_sqr.c: Handle the cy=-1 branch slightly faster.
* mpn/generic/toom22_mul.c: Likewise. (Thanks Paul and Raphaël!)
2018-04-22 Niels Möller <nisse@lysator.liu.se>
From Martin Storsjö:
* configure.ac (aarch64): Just as on windows/x86_64, "long" still
is 32 bit on aarch64. To distinguish between 32-bit and 64-bit
ABI, test sizeof(void*) instead of sizeof(long). Use long long for
mp_limb_t for mingw targets.
* acinclude.m4 (GMP_C_TEST_SIZEOF): Allow '*' in the type name,
e.g., void*.
2018-04-18 Marc Glisse <marc.glisse@inria.fr>
* mpq/clear.c: Handle lazy numerator.
* mpq/clears.c: Likewise.
* mpq/init.c: Likewise.
* mpq/set_si.c: Likewise.
* mpq/set_ui.c: Likewise.
* tests/cxx/t-ops2z.cc: Add parentheses to quiet a warning.
2018-03-28 Torbjörn Granlund <tg@gmplib.org>
* configure.ac (mips): Recognise "mipsisa64*".
2018-03-23 Torbjörn Granlund <tg@gmplib.org>
* mpn/generic/sec_powm.c: Remove unused macros.
Simplify code for choosing between redc_1 and redc_2.
Compute power table with squaring for even powers.
2018-02-28 Marco Bodrato <bodrato@mail.dm.unipi.it>
* gmpxx.h (__gmp_binary_plus): Special case for mpq + 1.
(__gmp_binary_minus): Special case for mpq - 1.
(__gmp_binary_equal): Optimised comparison mpq == integer.
* tests/cxx/t-ops2qf.cc (checkqf): Some check for +/- 1, +/- 0.
2018-02-18 Marco Bodrato <bodrato@mail.dm.unipi.it>
* tune/Makefile.am: Disallow parallel make (thanks Vincent Lefevre).
* mpq/swap.c: Use *_SWAP_* macros.
* mpq/cmp_ui.c: One more little shortcut, comparing fractions to 1.
* mpq/get_d.c: Compare (zeros > 0) once, replace tdiv_qr with div_q.
* mpq/equal.c: Check size early.
* printf/obprintf.c: Adda dummy typedef to avoid empty unit.
* printf/obvprintf.c: Likewise.
* printf/obprntffuns.c: Likewise.
* printf/repl-vsnprintf.c: Move #ifdef after #include gmp-impl.h .
2018-02-08 Marco Bodrato <bodrato@mail.dm.unipi.it>
* printf/snprntffuns.c: Report -1 as an error.
* acinclude.m4 (GMP_FUNC_VSNPRINTF): Refuse -1 as return value.
* mpz/bin_uiui.c (mpz_smallk_bin_uiui): One more shortcut for small k.
* gmp-impl.h (popc_limb): Use fewer constants (GMP_LIMB_BITS == 16).
* mpz/divegcd.c (mpz_divexact_limb): Use MPN_DIVREM_OR_DIVEXACT_1.
* primesieve.c (fill_bitpattern): Use MPN_FILL.
2018-02-01 Marc Glisse <marc.glisse@inria.fr>
* longlong.h (i586): Remove assert.
2018-01-30 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpz/and.c: Rearrange the 3 cases, both <0, both >=0, one and one.
* mpz/ior.c: Likewise.
* mpz/xor.c: Likewise.
* mpz/bin_uiui.c (mul[4-8]): Reduce the number of multiplications.
* printf/doprnt.c: Use __GMP_FREE_FUNC_TYPE.
* printf/doprntf.c: Likewise.
* printf/snprntffuns.c: Likewise, and use size_t instead of int.
2018-01-15 Marco Bodrato <bodrato@mail.dm.unipi.it>
* tests/mpz/t-bin.c: Extended tests for bin_ui and uint border cases.
2018-01-10 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86/fat/fat.c (__gmpn_cpuvec_init): Fix old pentium recog.
* config.guess: Likewise.
* mpn/x86/fat/fat.c (__gmpn_cpuvec_init): Add many 64-bit CPUs.
2018-01-07 Torbjörn Granlund <tg@gmplib.org>
* mpn/arm: Revert last change, it hides a symbol needed for testing.
2018-01-04 Torbjörn Granlund <tg@gmplib.org>
* bdiv_q_1.asm (binvert_limb_table): Declare as ".hidden".
* v7a/cora8/bdiv_q_1.asm: Likewise.
* dive_1.asm: Likewise.
* v6/dive_1.asm: Likewise.
* mode1o.asm (binvert_limb_table): Remove ".protected", add ".hidden".
* v6/mode1o.asm: Likewise.
2018-01-02 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc64/mode64/divrem_2.asm: Use different rlwinm variant (to
2018-01-01 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc32/powerpc-defs.m4: Define maddld, maddhdu, popcntd, and
divdeu.
* mpn/powerpc64/mode64/p8/invert_limb.asm: Use new insn defs.
* mpn/powerpc64/mode64/p9/addmul_1.asm: Use new insn defs.
* mpn/powerpc64/p7/hamdist.asm: Use new insn defs.
* mpn/powerpc64/p7/popcount.asm: Use new insn defs.
2017-12-31 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc64/mode64/p9/addmul_1.asm: Moved from
mpn/powerpc64/p9/addmul_1.asm.
2017-12-30 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpz/bin_ui.c: Rewrite, using Fredrik Johansson's suggestions.
2017-12-27 Niels Möller <nisse@lysator.liu.se>
* longlong.h (arm32/arm64): Leave COUNT_LEADING_ZEROS_0 undefined,
since we use gcc's __builtin_clzl, which doesn't allow zero inputs.
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
2017-12-27 Torbjörn Granlund <tg@gmplib.org>
* mpn/powerpc64/mode64/bdiv_q_1.asm: Use 64-bit cmp for sizes.
* mpn/powerpc64/p9/addmul_1.asm: New file.
* configure.ac: Separate handling of POWER8 and POWER9.
* config.guess: Recognise POWER9 and more variants of POWER8.
Reorder recog code to favour PVR over proc/cpuinfo.
2017-12-14 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/fastsse/com.asm: Adhere to DOS64 xmm callee-saves rules.
* mpn/x86_64/fastsse/com-palignr.asm: Likewise.
* mpn/x86_64/fastsse/copyd.asm: Likewise.
* mpn/x86_64/fastsse/copyi.asm: Likewise.
* mpn/x86_64/fastsse/lshiftc.asm: Likewise.
* mpn/x86_64/fastsse/sec_tabselect.asm: Likewise.
2017-12-11 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/bd1/aorrlsh_n.asm: New grabber file.
2017-12-10 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/bd1/aors_n.asm: New grabber file.
* mpn/x86_64/bd4/aorrlsh_n.asm: New grabber file.
2017-08-31 Torbjörn Granlund <tg@gmplib.org>
* mpn/sparc32/sparc-defs.m4 (LEA64): Rewrite for both PIC and non-PIC.
* mpn/sparc64/ultrasparct3/cnd_aors_n.asm: Allow arbitrary cnd arg.
2017-08-29 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/silvermont/gmp-mparam.h: Disable mul_2 and addmul_2.
2017-08-28 Torbjörn Granlund <tg@gmplib.org>
* mpn/sparc64: Revert 2017-07-23 PIC changes.
* mpn/powerpc32/divrem_2.asm: Avoid bc+ insn form to accommodate clang.
* mpn/generic/sbpi1_bdiv_qr.c: Correct ASSERT.
* mpn/generic/sbpi1_bdiv_q.c: Likewise.
* mpn/generic/sbpi1_bdiv_r.c: Likewise.
2017-08-18 Torbjörn Granlund <tg@gmplib.org>
* acinclude.m4 (X86_64_PATTERN): Match zen*.
* configure.ac (x86): Support AVX challenged systems for Zen.
2017-07-24 Torbjörn Granlund <tg@gmplib.org>
* mpn/generic/sbpi1_bdiv_q.c: Add ASSERT for inverse correctness.
* mpn/generic/sbpi1_bdiv_qr.c: Likewise.
* tests/mpn/t-bdiv.c (main): Amend last change.
* tests/devel/try.c (choice_array): Amend 2013-05-03 change to include
more functions.
* mpn/sparc64/gcd_1.asm: Enforce PIC as GNU/Linux toolchain workaround.
* mpn/sparc64/ultrasparct3/bdiv_q_1.asm: Likewise.
* mpn/sparc64/ultrasparct3/dive_1.asm: Likewise.
* mpn/sparc64/ultrasparct3/invert_limb.asm: Likewise.
* mpn/sparc64/ultrasparct3/mode1o.asm: Likewise.
* gmp-impl.h: Reorganise foolshC_ip1 -> foolshC -> foolsh chain to make
it transitive.
2017-07-23 Niels Möller <nisse@lysator.liu.se>
* longlong.h: Purge definitions of obsolete UMUL_TIME and
UDIV_TIME constants. Also mentioned (but unused) in
mpn/cray/gmp-mparam.h and tune/common.c.
2017-07-21 Torbjörn Granlund <tg@gmplib.org>
* tests/mpn/t-bdiv.c (main): Test mpn_sbpi1_bdiv_r.
* tune/common.c (speed_mpn_sbpi1_bdiv_r): New function.
* tune/speed.h: Declare it.
(SPEED_ROUTINE_MPN_PI1_BDIV_R): New macro.
* tune/speed.c (routine): Add mpn_sbpi1_bdiv_r.
* gmp-impl.h (mpn_sbpi1_bdiv_r): Declare.
* mpn/asm-defs.m4 (define_mpn): Add sbpi1_bdiv_q, sbpi1_bdiv_qr,
sbpi1_bdiv_r.
* mpn/generic/sbpi1_bdiv_r.c: New file.
* mpn/x86_64/zen/sbpi1_bdiv_r.asm: New file.
2017-07-19 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86/p6/sse2/submul_1.asm: Get pentium4 code instead of k6 code
for better speed on modern Intel P6 cores.
2017-07-02 Torbjörn Granlund <tg@gmplib.org>
* tune/tuneup.c (tune_mullo): For MULLO_BASECASE_THRESHOLD start at 2.
(tune_sqrlo): Likewise.
2017-06-28 Torbjörn Granlund <tg@gmplib.org>
* tests/Makefile.am tests/*/Makefile.am tune/Makefile.am (AM_LDFLAGS):
Define. (Thanks to Emmanuel Thomé and Vincent Lefevre.)
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
2017-06-27 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/zen/sqr_basecase.asm: Expand to use 4 addmul_1 loops.
* mpn/x86_64/x86_64-defs.m4 (sarx): New macro.
2017-06-26 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/coreibwl/sqr_basecase.asm: Rewrite to do 2x and limb
squaring in main loop.
2017-06-20 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/atom/cnd_add_n.asm: New grabber file.
* mpn/x86_64/atom/cnd_sub_n.asm: Likewise.
* mpn/x86_64/coreihwl/aorrlsh_n.asm: New grabber file.
2017-06-16 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/zen/mul_basecase.asm: Do overlapped software pipelining.
* mpn/x86_64/silvermont/mul_basecase.asm: New grabber file.
* mpn/x86_64/silvermont/sqr_basecase.asm: Likewise.
* mpn/x86_64/silvermont/mullo_basecase.asm: Likewise.
2017-06-14 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/zen/mullo_basecase.asm: New file.
* mpn/x86_64/coreibwl/mullo_basecase.asm: New file.
2017-06-11 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/popham.asm: Crossjump for code size and mix lead-in insns
for lower overhead.
2017-06-09 Torbjörn Granlund <tg@gmplib.org>
* configure.ac: Set GMP_NONSTD_ABI protecting against dots in the abi.
(hppa): Remove old GNU/Linux restriction to 32-bit ABI.
2017-06-07 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/bd1/addmul_2.asm: New file.
* mpn/x86_64/x86_64-defs.m4 (c4_helper): New macro.
(mulx, shlx, shrx): Use c4_helper.
* mpn/x86_64/zen/mul_basecase.asm: Use 8-bit imm operands for "test".
* mpn/x86_64/zen/aorrlsh1_n.asm: New grabber file.
* mpn/x86_64/zen/sublsh1_n.asm: Likewise.
* mpn/x86_64/zen/aorrlsh_n.asm: New file
2017-06-04 Torbjörn Granlund <tg@gmplib.org>
* configure.ac: Use bt1/bt2 for bobcat and jaguar dirs.
(fat_path): Add x86_64/bt2.
* mpn/x86_64/fat/fat.c (__gmpn_cpuvec_init): Adapt to bt1/bt2 changes.
Make zen path correspond to non-fat path.
* mpn/x86/fat/fat.c (__gmpn_cpuvec_init): Adapt to bt1/bt2 changes.
* mpn/x86_64/bt2/copyi.asm: New grabber file.
* mpn/x86_64/bt2/copyd.asm: New grabber file.
* mpn/x86_64/bt2/com.asm: New grabber file.
2017-06-03 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/bd1/popcount.asm: Expand some instructions as .byte
sequences.
* mpn/x86_64/bd1/hamdist.asm: Likewise.
2017-06-02 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/x86_64-defs.m4 (DEF_OBJECT): Fix quoting (amends recent
change).
(JUMPTABSECT): Get rid of spurious "w".
2017-06-02 Marc Glisse <marc.glisse@inria.fr>
* gmpxx.h (mpf_class::operator bool): Use mpf_sgn to access _mp_size.
2017-06-02 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/bd1/popcount.asm: Use both SSE and XOP trickery, and
plain popcnt insn.
* mpn/x86_64/bd1/hamdist.asm: Likewise.
2017-06-01 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/x86_64-defs.m4 (DEF_OBJECT): Allow 3rd argument defining
section, while making alignment argument non-optional.
* mpn/x86_64/core2/popcount.asm: Use 3-operand DEF_OBJECT.
* mpn/x86_64/core2/hamdist.asm: Likewise.
* mpn/x86_64/bd1/popcount.asm: Likewise.
* mpn/x86_64/bd1/hamdist.asm: Likewise.
* configure.ac (GMP_AVX_NOT_REALLY_AVAILABLE): New m4 define.
* mpn/x86_64/bd1/popcount.asm: Use GMP_AVX_NOT_REALLY_AVAILABLE.
* mpn/x86_64/bd1/hamdist.asm: Likewise.
* mpn/x86_64/silvermont/popcount.asm: Reinstate, grabbing nehalem code.
* mpn/x86_64/silvermont/hamdist.asm: Replace with grabber.
* mpn/x86_64/core2/logops_n.asm: New file.
2017-05-30 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/coreisbr/popcount.asm: Remove.
2017-05-29 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/coreinhm/popcount.asm: Replace grabber code with
implementation proper.
* mpn/x86_64/coreinhm/hamdist.asm: Likewise.
* mpn/x86_64/bd1/popcount.asm: Likewise.
* mpn/x86_64/bd1/hamdist.asm: Likewise.
* mpn/x86_64/core2/popcount.asm: Likewise.
* mpn/x86_64/core2/hamdist.asm: New file.
2017-05-22 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/core2/com.asm: New grabber file.
* mpn/x86_64/core2/lshift.asm: Rewrite.
* mpn/x86_64/core2/rshift.asm: Rewrite.
* mpn/x86_64/core2/lshiftc.asm: Rewrite.
2017-05-16 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/zen/mul_1.asm: Port to DOS64.
* mpn/x86_64/zen/aorsmul_1.asm: Likewise.
* mpn/x86_64/coreibwl/addmul_1.asm: Likewise.
2017-05-16 Niels Möller <nisse@lysator.liu.se>
* mpn/generic/divis.c (mpn_divisible_p): Updated the divisibility
test; bdiv now returns R = D rather than R = 0 when D divides a
non-zero U.
* mpn/generic/binvert.c (mpn_binvert): Negate bdiv quotient.
* mpn/generic/divexact.c (mpn_divexact): Likewise.
* mpn/generic/remove.c (mpn_remove): Likewise.
* mpz/bin_uiui.c (mpz_bdiv_bin_uiui): Likewise.
* tests/mpn/t-bdiv.c (check_one): Updated to new convention,
B^{qn} R = U + QD.
* mpn/generic/sbpi1_bdiv_qr.c (mpn_sbpi1_bdiv_qr): Reimplemented,
for new bdiv convention.
* mpn/generic/sbpi1_bdiv_q.c (mpn_sbpi1_bdiv_q): Likewise.
* mpn/generic/dcpi1_bdiv_q.c (mpn_dcpi1_bdiv_q_n)
(mpn_dcpi1_bdiv_q): Adapted to new bdiv convention.
* mpn/generic/dcpi1_bdiv_qr.c (mpn_dcpi1_bdiv_qr_n)
(mpn_dcpi1_bdiv_qr): Likewise.
* mpn/generic/mu_bdiv_qr.c (mpn_mu_bdiv_qr): Adapted to new bdiv
convention, using a wrapper calling the old function.
* mpn/generic/mu_bdiv_q.c (mpn_mu_bdiv_q): Likewise.
2017-05-03 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/coreisbr/cnd_aors_n.asm: New file.
* mpn/x86_64/coreisbr/cnd_add_n.asm: New file.
* mpn/x86_64/core2/aorsmul_1.asm: Tune.
* mpn/x86_64/silvermont/mul_1.asm: New file, grabbing another asm file.
* mpn/x86_64/silvermont/aorsmul_1.asm: Likewise.
* mpn/x86_64/silvermont/aorrlsh1_n.asm: Likewise.
* mpn/x86_64/silvermont/aorrlsh2_n.asm: Likewise.
* mpn/x86_64/silvermont/lshift.asm: Likewise.
* mpn/x86_64/silvermont/rshift.asm: Likewise.
* mpn/x86_64/silvermont/lshiftc.asm: Likewise.
* mpn/x86_64/zen/mul_basecase.asm: Split outer loop into 4 loops.
2017-05-02 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/zen/sqr_basecase.asm: Use .byte for encoding all mulx.
Misc tuning.
2017-04-27 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/zen/sqr_basecase.asm: Rewrite to do 2x and limb squaring
in main loop.
2017-04-25 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/fat/fat_entry.asm: Allocate correct DOS64 frame.
* mpn/x86_64/divrem_2.asm: Likewise.
* mpn/x86_64/mod_1_2.asm: Likewise.
* mpn/x86_64/mod_1_4.asm: Likewise.
* mpn/x86_64/mod_1_1.asm: Likewise.
2017-04-23 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/coreisbr/mul_1.asm: Rewrite feed-in code and add mul_1c
entry point.
* mpn/x86_64/fat/fat.c (__gmpn_cpuvec_init): Amend last change.
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
* mpn/x86_64/zen/mul_basecase.asm: New file.
* mpn/x86_64/zen/sqr_basecase.asm: New file.
2017-04-16 Torbjörn Granlund <tg@gmplib.org>
* mpn/generic/sqr_basecase.c (addmul_1 variant): Rewrite to compute
in-place and to avoid a re-computation.
* configure.ac: Remove k8, k10 from zen path.
2017-04-15 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/zen/gmp-mparam.h: New file.
* mpn/x86_64/zen/com.asm: New file, grabbing another asm file.
* mpn/x86_64/zen/copyd.asm: Likewise.
* mpn/x86_64/zen/copyi.asm: Likewise.
* mpn/x86_64/zen/gcd_1.asm: Likewise.
* mpn/x86_64/zen/hamdist.asm: Likewise.
* mpn/x86_64/zen/lshift.asm: Likewise.
* mpn/x86_64/zen/lshiftc.asm: Likewise.
* mpn/x86_64/zen/popcount.asm: Likewise.
* mpn/x86_64/zen/rshift.asm: Likewise.
* config.guess: Recognise AMD zen.
* acinclude.m4 (X86_64_PATTERN): Add zen.
* config.sub: Corresponding changes.
* configure.ac: Corresponding changes.
* mpn/x86_64/fat/fat.c: Corresponding changes.
2017-03-27 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpz/oddfac_1.c (limb_apprsqrt): Better approximation.
* mpz/bin_uiui.c (limb_apprsqrt): Likewise.
2017-03-12 Torbjörn Granlund <tg@gmplib.org>
* tests/mpf/t-get_d_2exp.c (check_data): Rewrite of check_onebit.
2017-03-08 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpn/generic/sqrtrem.c: Direct use of sqrtrem2 when n==2.
* mpn/generic/div_qr_2.c (udiv_qr_4by2): Replace add_csaac with add_sssaaaa.
2017-03-07 Torbjörn Granlund <tg@gmplib.org>
* mpf/get_d_2exp.c: Return negative value for negative input.
2017-03-06 Torbjörn Granlund <tg@gmplib.org>
* longlong.h (aarch64): Provide asm-free umul_ppmm.
* longlong.h (powerpc64): Enable asm-free umul_ppmm.
2017-03-04 Torbjörn Granlund <tg@gmplib.org>
* mpn/arm64/sqr_diag_addlsh1.asm: Complete rewrite.
2017-02-27 Torbjörn Granlund <tg@gmplib.org>
* longlong.h (arm32/arm64): Remove useless comparison to 0 introduced
in last change (spotted by Marco).
2017-02-26 Marco Bodrato <bodrato@mail.dm.unipi.it>
* mpn/generic/pow_1.c: Use umul_ppmm for a single limb.
* configure.ac: Allow MP_SIZE_T_MAX for thresholds exported to
config.m4.
* mpn/x86_64/gcd_1.asm: Handle BMOD_1_TO_MOD_1_THRESHOLD=MP_SIZE_T_MAX.
Streamline non-reduction path.
* mpn/x86_64/core2/gcd_1.asm: Streamline small operands cases similarly
to top-level code.
2017-02-24 Torbjörn Granlund <tg@gmplib.org>
* longlong.h (arm32/arm64 add_ssaaaa): Use "subs" for some immediates.
* longlong.h (arm32/arm64 sub_ssaaaa): Use "adds" for some immediates.
* mpn/arm64/copyi.asm: Avoid branching on flags.
* mpn/arm64/copyd.asm: Likewise.
* mpn/generic/div_qr_2.c (aarch64 add_sssaaaa): New.
* mpn/generic/div_qr_1n_pi2.c: Same.
* mpn/generic/div_qr_1u_pi2.c: Same.
* mpn/generic/div_qr_2.c (powerpc add_sssaaaa): Fix typo.
* mpn/generic/div_qr_1n_pi2.c: Same.
* mpn/generic/div_qr_1u_pi2.c: Same.
2017-02-23 Marco Bodrato <bodrato@mail.dm.unipi.it>
* tests/devel/sqrtrem_1_2.c: New exhaustive test for sqrtrem_[12].
* tests/devel/Makefile.am (EXTRA_PROGRAMS): add it.
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
2017-02-23 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/silvermont/gmp-mparam.h: New file.
2017-02-22 Torbjörn Granlund <tg@gmplib.org>
* mpn/arm64/aorsmul_1.asm: Rewrite.
* mpn/arm64/lshiftc.asm: New file.
2017-02-21 Torbjörn Granlund <tg@gmplib.org>
* mpn/arm64/lshift.asm: Rewrite.
* mpn/arm64/rshift.asm: Rewrite.
* mpn/arm64/rsh1aors_n.asm: New file.
2017-02-19 Torbjörn Granlund <tg@gmplib.org>
* mpn/arm64/mul_1.asm: Rewrite.
* mpn/arm64/xgene1/mul_1.asm: Remove.
2017-02-17 Torbjörn Granlund <tg@gmplib.org>
* mpn/arm64/cora53/cnd_aors_n.asm: Moved from "..".
* mpn/arm64/xgene1/cnd_aors_n.asm: Remove file since default code
performs better.
* mpn/arm64/cnd_aors_n.asm: Rewrite.
* mpn/arm64/logops_n.asm: Rewrite based on new aors_n.asm.
2017-02-16 Pedro Gimeno <pggimeno@wanadoo.es>
* rand/randmt.c (__gmp_randiset_mt): Set generator functions from
source.
2017-02-16 Torbjörn Granlund <tg@gmplib.org>
* mpn/arm64/aorsorrlshC_n.asm: New file.
* mpn/arm64/aorsorrlsh2_n.asm: New file.
* mpn/arm64/aorsorrlsh1_n.asm: New file.
* mpn/arm64/xgene1/aors_n.asm: Remove file since default code now
performs similarly.
* mpn/arm64/aors_n.asm: Rewrite to use 4x unrolling.
2017-02-15 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/silvermont/hamdist.asm: New file, based on k10 code.
* mpn/x86_64/silvermont/popcount.asm: Grab coreisbr/popcount.asm.
2017-02-14 Torbjörn Granlund <tg@gmplib.org>
* mpn/x86_64/silvermont/aors_n.asm: New file, grabbing coreisbr code.
* mpn/x86_64/atom/aors_n.asm: Replace coreisbr grabbing code with
code based on Marco's x64/atom/aors_n.asm.
* mpn/powerpc64/aix.m4 (AIX): New define.
* mpn/powerpc64/mode64/bdiv_q_1.asm: For AIX, don't jump from
mpn_bdiv_q_1 to middle of mpn_pi1_bdiv_q_1. Streamline.
* mpn/powerpc64/darwin.m4 (LEA): Put code in lea_list instead of in
EPILOGUE_cpu.
(EPILOGUE_cpu): Output lea_list, the zap it.
* mpn/sparc64/ultrasparct3/bdiv_q_1.asm: New file, based on dive_1.asm.