improving reduce_plattice
There are several things here. My goals are:
- to get rid of the crippled asm code entirely
- to behave more correctly when p is close to 2^32 (current code has tons of bugs, in fact).
- to correctly implement all corner cases (projective, powers, etc.).
- possibly draw some performance benefit.
- to use that as a base for the work on bucket-sieving powers (!29); I think it's better to keep the two efforts separate.
A possible performance benefit might come from the SIMD variants of reduce_plattice.
Note that this should mostly supersede !3.
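
To illustrate the kind of problem that can show up when p is close to 2^32, here is a minimal sketch. It is not taken from the existing code: the helper names `neg_p_narrow` and `neg_p_wide` are hypothetical, and the assumption is only that a lattice reduction of this type starts from the vector (-p, 0) and that a 32-bit signed intermediate is used somewhere along the way. The sketch also assumes the usual two's complement wrap-around on narrowing conversions (guaranteed since C++20).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes -p through a 32-bit signed intermediate.
// For p >= 2^31 the value does not fit in int32_t, so the cast wraps
// and the sign of the result is wrong.
inline int64_t neg_p_narrow(uint32_t p) {
    int32_t narrowed = static_cast<int32_t>(p);  // wraps when p >= 2^31
    return -static_cast<int64_t>(narrowed);
}

// Hypothetical helper: same computation, widening to 64 bits first.
// Every 32-bit p is representable, so -p is always exact.
inline int64_t neg_p_wide(uint32_t p) {
    return -static_cast<int64_t>(p);
}

// Small self-check covering a small prime and a prime just below 2^32.
inline void check_neg_p_examples() {
    const uint32_t small_p = 65537u;
    const uint32_t large_p = 4294967291u;  // 2^32 - 5

    // Both variants agree while p < 2^31.
    assert(neg_p_narrow(small_p) == neg_p_wide(small_p));

    // Close to 2^32, only the widened variant yields -p.
    assert(neg_p_wide(large_p) == -4294967291LL);
    assert(neg_p_narrow(large_p) != neg_p_wide(large_p));
}
```

Widening before negating costs nothing on 64-bit targets, which is one reason to prefer plain C++ with 64-bit intermediates over hand-written 32-bit asm in the scalar path.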