rsa250/README.md
...

This is exactly similar to RSA-240.
The polynomial selection size-optimization
step was performed with cado-nfs in client/server mode, using a MySQL database:

```shell
./cado-nfs.py params.rsa250 database=db:mysql://USERNAME:PASSWORD@localhost:3306/rsa250
```
with the file [`params.rsa250`](params.rsa250).
The root-optimization step was performed on the best 100 size-optimized
polynomials (those with the smallest exp_E value) with the following command:

```shell
$CADO_BUILD/polyselect/polyselect_ropt -inputpolys candidates -Bf 2.1e9 -Bg 1.8e9 -area 2.4e19 -ropteffort 10 -t 100
```
And the winner is:

```shell
cat > rsa250.poly << EOF
n: 2140324650240744961264423072839333563008614715144755017797754920881418023447140136643345519095804679610992851872470914587687396261921557363047454770520805119056493106687691590019759405693457452230589325976697471681738069364894699871578494975937497937
poly0: 3256571715934047438664355774734330386901,185112968818638292881913
...
EOF
```
For estimating the number of relations produced by a set of parameters:

- We compute the corresponding factor bases. In our case, side 0 is
  rational, so there is no need to precompute the factor base.

- We create a "hint" file where we write which strategy to use for which
  special-q size.

- We random-sample in the global q-range, using sieving but without
  batch; this produces the same relations. This is slower, but `batch` is
  currently incompatible with online (on-the-fly) duplicate removal.

Here is what it gives with the final parameters used in the computation:
```shell
$CADO_BUILD/sieve/makefb -poly rsa250.poly -side 1 -lim 2147483647 -maxbits 16 -t 8 -out rsa250.fb1.gz
```
The hint file has a weird format. The following basically says "Three
algebraic large primes for special-q less than 2^31, and two otherwise."

```shell
cat > rsa250.hint << EOF
30@1 1.0 1.0 A=33 2147483647,36,72,2.2 2147483647,37,111,3.2
31@1 1.0 1.0 A=33 2147483647,36,72,2.2 2147483647,37,111,3.2
...
EOF
```
We can now sieve for random-sampled special-q, removing duplicate
relations on-the-fly.

```shell
$CADO_BUILD/sieve/las -poly rsa250.poly -fb1 rsa250.fb1.gz -lim0 2147483647 -lim1 2147483647 -lpb0 36 -lpb1 37 -q0 1e9 -q1 12e9 -dup -dup-qmin 0,1000000000 -sqside 1 -A 33 -mfb0 72 -mfb1 111 -lambda0 2.2 -lambda1 3.2 -random-sample 1024 -t auto -bkmult 1,1l:1.25975,1s:1.5,2s:1.1 -v -bkthresh1 80000000 -adjust-strategy 2 -fbc /tmp/fbc -hint-table rsa250.hint
```
...
special-q in the global q-range. The latter can be precisely estimated
using the logarithmic integral function as an approximation of the number
of degree-1 prime ideals below a bound. In Sagemath, this gives:
```python
# [sage]
ave_rel_per_sq = 12.7 ## pick value output by las
number_of_sq = log_integral(12e9) - log_integral(1e9)
tot_rels = ave_rel_per_sq * number_of_sq
print(tot_rels)
# we obtain 6.23205306433878e9
```
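The same arithmetic can be cross-checked without Sage. The `log_integral` below is our own standard-library sketch (Simpson's rule after the substitution u = log t, plus the constant li(2)); it is a stand-in for Sage's function, not part of cado-nfs:

```python
import math

def log_integral(x, steps=100_000):
    """Approximate li(x) = PV integral_0^x dt/ln(t), for x > 2.

    Integrates e^u / u over u in [ln 2, ln x] (Simpson's rule) and
    adds li(2) ~= 1.04516378.
    """
    a, b = math.log(2), math.log(x)
    h = (b - a) / steps  # steps must be even for Simpson's rule
    f = lambda u: math.exp(u) / u
    s = f(a) + f(b)
    s += 4 * sum(f(a + i * h) for i in range(1, steps, 2))
    s += 2 * sum(f(a + i * h) for i in range(2, steps, 2))
    return s * h / 3 + 1.04516378

ave_rel_per_sq = 12.7  # value output by las
number_of_sq = log_integral(12e9) - log_integral(1e9)
tot_rels = ave_rel_per_sq * number_of_sq
print(tot_rels)  # close to the 6.23e9 quoted above
```

This reproduces the estimate to within the accuracy of the quadrature.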
This estimate of 6.2e9 can be made more precise by increasing the number of
special-q that are sampled for sieving. It is also possible to have
...
to start a first run and interrupt it as soon as the cache is written.
In order to measure the cost of sieving in the special-q subrange where
sieving is used on both sides, the typical command line is as follows:
```shell
time $CADO_BUILD/sieve/las -poly rsa250.poly -fb1 rsa250.fb1.gz -lim0 2147483647 -lim1 2147483647 -lpb0 36 -lpb1 37 -q0 1e9 -q1 4e9 -sqside 1 -A 33 -mfb0 72 -mfb1 111 -lambda0 2.2 -lambda1 3.2 -random-sample 1024 -t auto -bkmult 1,1l:1.25975,1s:1.5,2s:1.1 -v -bkthresh1 80000000 -adjust-strategy 2 -fbc /tmp/fbc
```
...
case, since there are 32 cores and we sieved 1024 special-q's, this gives
...

Finally, it remains to multiply by the number of special-q in this
subrange. We get (in Sagemath):
```python
# [sage]
cost_in_core_sec = (log_integral(4e9) - log_integral(1e9)) * (128*60+8.1) * 32 / 1024
cost_in_core_hours = cost_in_core_sec / 3600
cost_in_core_years = cost_in_core_hours / 24 / 365
print(cost_in_core_hours, cost_in_core_years)
# (9.28413860267641e6, 1059.83317382151)
```
With this experiment, we therefore get about 1060 core.years for this subrange.
...
#### Cost of 1-sided sieving + batch in the q-range [4e9,12e9]
For special-q's larger than 4e9, since we are using batch smoothness
detection on side 0, we have to precompute the rsa250.batch0 file that
contains the product of all primes to be extracted. (Note the `batch1`
option is mandatory, even if for our parameters, no file is produced on
side 1.)
```shell
$CADO_BUILD/sieve/ecm/precompbatch -poly rsa250.poly -lim0 0 -lim1 2147483647 -batch0 rsa250.batch0 -batch1 rsa250.batch1 -batchlpb0 31 -batchlpb1 30
```
Then, we can use the [`sieve-batch.sh`](sieve-batch.sh) shell script given in this
repository. This launches:

- one instance of `las`, that does the sieving on side 1 and prints the
  survivors to files;

- 6 instances of the `finishbatch` program. Those instances process the
  files as they are produced, do the batch smoothness detection, and
  produce relations. They take as input on the command line two arguments
  `-q0 xxx -q1 xxx` describing the range of special-q to process.
In order to run it on your own machine, there are some variables to
...
can also be adjusted depending on the number of cores available on the
machine.

When the paths are properly set, here is a typical invocation:
```shell
./sieve-batch.sh -q0 4000000000 -q1 4000100000
```
The script prints on stdout the start and end date, and in the output of
`las` that can be found in `$result_dir/log/las.${q0}-${q1}.out`, the number
of special-q that have been processed can be found. From this one can
again deduce the cost in core.seconds to process one special-q and then
the overall cost of sieving the q-range [4e9,12e9].

The design of this script requires a rather long range of
special-q to handle for each run of `sieve-batch.sh`. Indeed, during the
last minutes, the `finishbatch` jobs need to take care of the last survivor
files while `las` is no longer running, so that the node is not fully
occupied. If the `sieve-batch.sh` job takes a few hours, this fade-out
phase takes negligible time. Both for the benchmark and in production it
is then necessary to have jobs taking at least a few hours.
On our sample machine, here is an example of a benchmark:
```shell
./sieve-batch.sh -q0 4000000000 -q1 4000100000 > /tmp/sieve-batch.out
# [ wait ... ]
start=$(date -d "`grep "^Starting" /tmp/sieve-batch.out | head -1 | awk -F " at " '//{print $2}'`" +%s)
end=$(date -d "`grep "^End" /tmp/sieve-batch.out | tail -1 | awk -F " at " '//{print $2}'`" +%s)
nb_q=`grep "# Discarded 0 special-q's out of" /tmp/log/las.4000000000-4000100000.out | awk '{print $(NF-1)}'`
echo -n "Cost in core.sec per special-q: "; echo "($end-$start)/$nb_q*32" | bc -l
Cost in core.sec per special-q: 116.51841746248294679392
```

```python
# [sage]
cost_in_core_sec = (log_integral(12e9) - log_integral(4e9)) * 116.5
cost_in_core_hours = cost_in_core_sec / 3600
cost_in_core_years = cost_in_core_hours / 24 / 365
print(cost_in_core_hours, cost_in_core_years)
# (1.13780852428233e7, 1298.86817840449)
```
With this experiment, we get 116.5 core.sec per special-q, and therefore
we obtain about 1300 core.years for this subrange.

...

## Simulating the filtering output
Just use the script
[`scripts/estimate_matsize.sh`](https://gitlab.inria.fr/cado-nfs/cado-nfs/-/blob/master/scripts/estimate_matsize.sh)
available in the cado-nfs repository.
This feature is still experimental and we do not claim full
reproducibility for this part of our work, which was useful only to help
us choose the parameters, but not for the proper computation.

...
The benchmark command lines above can be used almost as-is for
reproducing the full computation. It is just necessary to remove the
`-random-sample` option and to adjust `-q0` and `-q1` to create many small
work units that in the end cover exactly the global q-range.
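For illustration, the work-unit bookkeeping can be sketched in a few lines of Python. The helper below is hypothetical (not a cado-nfs tool), and the 100000 width simply mirrors the benchmark invocations above:

```python
# Hypothetical helper: split [q_min, q_max) into consecutive work units
# of the given width, covering the global q-range exactly, no overlap.
def work_units(q_min, q_max, width):
    q0 = q_min
    while q0 < q_max:
        q1 = min(q0 + width, q_max)  # clamp the last unit to q_max
        yield q0, q1
        q0 = q1

units = list(work_units(4_000_000_000, 12_000_000_000, 100_000))
print(len(units))  # 80000 units of width 1e5
print(units[0])    # (4000000000, 4000100000), as in the benchmark above
```

Each `(q0, q1)` pair would then be handed to one `sieve-batch.sh` (or `las`) run.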
Since we do not expect anyone to spend again as much computing resources
...
for the estimates in previous sections, and extrapolate.
Several filtering experiments were done during the sieving phase.
The final one can be reproduced as follows, with revision f59dcf48f:
```shell
$CADO_BUILD/filter/purge -out purged3.gz -nrels 6132671469 -keep 160 -col-min-index 0 -col-max-index 8460769956 -t 56 -required_excess 0.0 files
```
where `files` is the list of files with unique relations (output of `dup2`).

This took about 9.3 hours on the machine wurst (Xeon E7-4850 v3 @ 2.20GHz, 56
physical cores, 112 virtual cores, 4 NUMA nodes, 1.5TB memory), with 138GB of peak memory.
The merge step can be reproduced as follows (revision eaeb2053d):

```shell
$CADO_BUILD/filter/merge -out history5 -t 112 -target_density 250 -mat purged3.gz -skip 32
```
and took about 3 hours on the machine wurst, with a peak memory of 1500GB.

Finally the replay step can be reproduced as follows:

```shell
$CADO_BUILD/filter/replay -purged purged3.gz -his history5 -out rsa250.matrix.250.bin
```
## Estimating linear algebra time more precisely, and choosing parameters
...
## Reproducing the characters step

Let **W** be the kernel vector computed by the linear algebra step.
The characters step transforms this kernel vector into dependencies.
We used the following command on the machine `wurst`:
```shell
$CADO_BUILD/linalg/characters -poly rsa250.poly -purged purged3.gz -index rsa250.index.gz -heavyblock rsa250.matrix.250.dense.bin -out rsa250.kernel -ker K.sols0-64.0 -lpb0 36 -lpb1 37 -nchar 50 -t 56
```
After about 2 hours, this gave 23 dependencies.
...
The following command line can be used to produce the actual dependencies
(rsa250.dep.000.gz to rsa250.dep.022.gz) from the `rsa250.kernel` file:

```shell
$CADO_BUILD/sqrt/sqrt -poly rsa250.poly -prefix rsa250.dep.gz -purged purged3.gz -index rsa250.index.gz -ker rsa250.kernel -t 56 -ab
```
Then the following command line can be used to compute the square root on the
rational side of dependency `nnn`:

```shell
$CADO_BUILD/sqrt/sqrt -poly rsa250.poly -prefix rsa250.dep -dep nnn -side0
```
and similarly on the algebraic side:

```shell
$CADO_BUILD/sqrt/sqrt -poly rsa250.poly -prefix rsa250.dep -dep nnn -side1
```
Then we can simply compute `gcd(x-y, N)` where `x` and `y` are the square
roots on the rational and algebraic sides respectively, and `N` is RSA-250.
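As a toy illustration of this final step (tiny numbers chosen for the example; RSA-250 itself obviously does not fit here):

```python
import math

# Toy congruence of squares: N = 91 = 7 * 13, and x = 10, y = 3
# satisfy x^2 = y^2 (mod N) with x != +/-y (mod N), so gcd(x - y, N)
# reveals a nontrivial factor of N.
N = 91
x, y = 10, 3
assert (x * x - y * y) % N == 0   # 100 - 9 = 91
p = math.gcd(x - y, N)
q = N // p
print(p, q)  # 7 13
```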