cado-nfs / records · Commits · 5053df90

Commit 5053df90, authored Jun 09, 2020 by ZIMMERMANN Paul

    another pass of proofreading

parent 8e6ec640 · 3 changed files
rsa240/README.md
...
...
@@ -60,7 +60,7 @@ debian 9 or debian 10). Typical software used were the GNU C compilers
 versions 6 to 9, or Open MPI versions 4.0.1 to 4.0.3.
 
 Most (if not all) information boxes in this document rely on two shell
-variables, `CADO_BUILD` and `DATA`, be set and `export`-ed to shell
+variables, `CADO_BUILD` and `DATA`, be set and exported to shell
 subprocess (as with `export CADO_BUILD=/blah/... ; export
 DATA=/foo/...`). The `CADO_BUILD` variable is assumed to be the path to
 a successful cado-nfs build directory. The `DATA` variable, which is
...
...
@@ -68,7 +68,7 @@ used by some scripts, should point to a directory with plenty of storage,
 possibly on some shared filesystem. Storage is also needed to store the
 temporary files with collected relations. Overall, a full reproduction of
 the computation would need in the vicinity of 10TB of storage. All
-scripts provided in this script expect to be run from the directory where
+scripts provided in this document expect to be run from the directory where
 they are placed, since they are also trying to access companion data
 files.
...
...
@@ -127,7 +127,7 @@ to compute on the `grvingt` computers.
 
 The hint file is [`rsa240.hint`](rsa240.hint), and has a weird format.
-The following basically says "Three algebraic large primes for special-q
+The following basically says "Three algebraic large primes for special-q's
 less than 2^31, and two otherwise."
 
 ```shell
...
...
@@ -139,7 +139,7 @@ cat > rsa240.hint <<EOF
 EOF
 ```
 
-We can now sieve for random-sampled special-q, and remove duplicate
+We can now sieve for random-sampled special-q's, and remove duplicate
 relations on the fly. In the output of the command line below, only the
 number of unique relations per special-q matters. The timing does not
 matter.
...
...
@@ -157,7 +157,7 @@ order to vary the random picks.)
 
 In order to derive an estimate of the total number of (deduplicated)
 relations, it is necessary to multiply the average number of relations per
 special-q as obtained during the sample sieving by the number of
-special-q in the global q-range. The latter can be precisely estimated
+special-q's in the global q-range. The latter can be precisely estimated
 using the logarithmic integral function as an approximation of the number
 of degree-1 prime ideals below a bound. In
 [Sagemath](https://www.sagemath.org/) code, this gives:
...
...
@@ -171,9 +171,9 @@ print (tot_rels)
 # 5.88556387364565e9
 ```
 
 This estimate (5.9G relations) can be made more precise by increasing the
-number of special-q that are sampled for sieving. It is also possible to
+number of special-q's that are sampled for sieving. It is also possible to
 have different nodes sample different subranges of the global range to
-get the result faster. Sampling 1024 special-qs should be
+get the result faster. Sampling 1024 special-q's should be
 enough to get a reliable estimate.
 
 ## Estimating the cost of sieving
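The Sagemath snippet itself is elided from this diff, but the estimate it prints can be reproduced with stdlib-only Python. Everything past the `li()` helper is an assumption for illustration: the global q-range endpoints [8e8, 7.4e9] and the 19.6 relations per special-q are hypothetical inputs, chosen only because they are consistent with the 5.885e9 total quoted in the hunk above.

```python
import math

def li(x, n=200000):
    # offset logarithmic integral: integral of dt/log(t) from 2 to x,
    # computed with the trapezoid rule after the substitution t = e^u
    a, b = math.log(2.0), math.log(x)
    h = (b - a) / n
    s = 0.5 * (math.exp(a) / a + math.exp(b) / b)
    for i in range(1, n):
        u = a + i * h
        s += math.exp(u) / u
    return s * h

# assumed global q-range [8e8, 7.4e9]; li approximates the count of
# degree-1 prime ideals (hence special-q's) below each bound
number_of_sq = li(7.4e9) - li(8e8)

# hypothetical average from sample sieving (whatever the sampling step
# actually printed); 19.6 is consistent with the quoted 5.885e9 total
ave_rel_per_sq = 19.6
tot_rels = ave_rel_per_sq * number_of_sq
print(number_of_sq, tot_rels)   # about 3.0e8 and 5.9e9
```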
...
...
@@ -231,10 +231,10 @@ sys 43m15.877s
 ```
 
 Then the `75m54.351s=4554.3s` value must be appropriately scaled in order
 to convert it into physical core-seconds. For instance, in our case,
-since there are 32 physical cores and we sieved 1024 special-qs, this
+since there are 32 physical cores and we sieved 1024 special-q's, this
 gives `4554.3*32/1024=142.32` core.seconds per special-q.
 
-Finally, we need to to multiply by the number of special-q in this
+Finally, we need to multiply by the number of special-q's in this
 subrange. We get (in Sagemath):
 
 ```python
...
...
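The scaling step described in the hunk above is plain arithmetic and can be checked in a few lines of Python. The special-q count 6.17e7 is an assumed value for the subrange (it would come from the same logarithmic-integral computation as before, not from this diff), used here only to show the numbers are mutually consistent.

```python
# time for 1024 special-q's on one 32-core node, from the log above
wall_seconds = 75 * 60 + 54.351          # 75m54.351s
per_sq = wall_seconds * 32 / 1024        # core.seconds per special-q
print(per_sq)                            # about 142.3

# assumed count of special-q's in this subrange (from li(), not stated
# in the hunk above)
number_of_sq = 6.17e7

core_years = per_sq * number_of_sq / (3600 * 24 * 365)
print(core_years)                        # close to the "about 279" figure
```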
@@ -250,7 +250,7 @@ With this experiment, we estimate about 279 core.years for this subrange.
 
 #### Cost of 1-sided sieving + batch in the q-range [2.1e9,7.4e9]
 
-For special-qs larger then 2.1e9, since we are using batch smoothness
+For special-q's larger than 2.1e9, since we are using batch smoothness
 detection on side 0, we have to precompute the `rsa240.batch0` file which
 contains the product of all primes to be extracted. (Note that the
 `-batch1` option is mandatory, even if for our parameters, no file is
...
...
@@ -285,13 +285,13 @@ is as follows:
 ```shell
 ./rsa240-sieve-batch.sh -q0 2100000000 -q1 2100100000
 ```
 
 The script prints the start and end date on stdout. The number
-of special-qs that have been processed can be found in the output of
+of special-q's that have been processed can be found in the output of
 `las`, which is written to `$DATA/log/las.${q0}-${q1}.out`. One can again
 deduce the cost in core-seconds to process one special-q from this
 information, and then the overall cost of sieving the q-range [2.1e9,7.4e9].
 
 The design of this script imposes a rather long range of
-special-q to handle for each run of `rsa240-sieve-batch.sh`. Indeed,
+special-q's to handle for each run of `rsa240-sieve-batch.sh`. Indeed,
 during the final minutes, the `finishbatch` jobs need to take care of the
 last survivor files while `las` is no longer running, so the node is
 not fully occupied. If the `rsa240-sieve-batch.sh` job takes a few hours,
...
...
@@ -363,7 +363,7 @@ takes only a few minutes). Within the script
 several implementation-level parameters are set, and should probably be
 adjusted to the users' needs. Along with the `DATA` and `CADO_BUILD`
 variables, the script below also requires the `MPI` shell variable to
-be set and `export`-ed, so that `$MPI/bin/mpiexec` can actually run MPI
+be set and exported, so that `$MPI/bin/mpiexec` can actually run MPI
 programs. In all likelihood, this script needs to be tweaked depending on
 the specifics of how MPI programs should be run on the target platform.
 
 ```shell
...
...
@@ -375,7 +375,7 @@ inaccuracy, this experiment is sufficient to build confidence that the
 time per iteration in the Krylov (a.k.a. "sequence") step of block
 Wiedemann is about 1.2 to 1.5 seconds per iteration (handling 64-bit wide
 vectors). The time per iteration in the Mksol (a.k.a. "evaluation") step
-is in the same ballpark. The time for krylov+mksol can then be estimated
+is in the same ballpark. The time for Krylov+Mksol can then be estimated
 as the product of this timing with `(1+n/m+64/n)*(N/64)`, with `N` the
 number of rows, and `m` and `n` the block Wiedemann parameters (we chose
 `m=512` and `n=256`). Applied to our use case, this gives an anticipated
...
...
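The `(1+n/m+64/n)*(N/64)` formula is easy to instantiate. In the sketch below, the row count `N = 2.8e8` and the 1.4 s/iteration figure are illustrative assumptions (the text only gives the 1.2 to 1.5 s range and does not restate `N` in this hunk); with them, the per-sequence Krylov time lands near the "about 25 days" per sequence mentioned further on in the README.

```python
# block Wiedemann parameters stated in the document
m, n = 512, 256

# illustrative assumptions, not from the diff: matrix row count and
# per-iteration time within the quoted 1.2-1.5 s range
N = 2.8e8
sec_per_iter = 1.4

# total Krylov+Mksol iterations, per the formula in the text
iters = (1 + n/m + 64/n) * (N / 64)

# the Krylov part, (1+n/m)*(N/64) iterations, is shared by n/64 = 4
# independent sequences running concurrently
krylov_days_per_seq = (1 + n/m) * (N / 64) / (n / 64) * sec_per_iter / 86400
print(iters, krylov_days_per_seq)
```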
@@ -396,7 +396,7 @@ option and to adjust the `q0` and `q1` to create many small work units
 that in the end cover exactly the global q-range.
 
 Since we do not expect anyone to spend as many computing resources
-to perform again exactly the same computation again, we provide the count
+to perform exactly the same computation again, we provide the count
 of how many (non-unique) relations were produced for each 100M special-q
 subrange in the [`rsa240-rel_count`](rsa240-rel_count) file.
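Carving the global q-range into work units of the kind described above is a short shell loop. This is only a sketch: the endpoints [8e8, 7.4e9] and the 100M unit width are taken from the surrounding text, and the echoed line is a placeholder for whatever invocation (with its `q0`/`q1` arguments) fits the target batch system.

```shell
# sketch: split an assumed global q-range [8e8, 7.4e9] into
# 100M-wide work units that exactly cover it
q0=800000000
qmax=7400000000
step=100000000
units=0
while [ "$q0" -lt "$qmax" ]; do
    q1=$((q0 + step))
    if [ "$q1" -gt "$qmax" ]; then q1=$qmax; fi
    units=$((units + 1))
    echo "work unit: q0=$q0 q1=$q1"
    q0=$q1
done
```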
...
...
@@ -433,7 +433,7 @@ information on these steps.
 
 The filtering output is controlled by a wealth of tunable parameters.
 However on the very coarse-grain level we focus on two of them:
-* _when_ we decide to stop relation collection.
+* _when_ we decide to stop relation collection,
 * _how dense_ we want the final matrix to be.
 
 Sieving more is expected to have a beneficial impact on the matrix size,
...
...
@@ -444,7 +444,7 @@ related concerns.
 
 We did several filtering experiments based on the RSA-240 data set, as
 relations kept coming in. For each of these experiments, we give the
-number of raw relations, the number of relations after the initial
+number of unique relations, the number of relations after the initial
 "pruning" step of filtering (called "purge" in cado-nfs), as well as the
 number of rows of the final matrix after "merge", for target densities
 d=100, d=150, and d=200.
...
...
@@ -506,7 +506,7 @@ where the last 4 lines (steps `3-krylov`) correspond to the 4 "sequences"
 
 These sequences can be run concurrently on different sets of nodes, with
 no synchronization needed. Each of these 4 sequences needs about 25 days
 to complete. Jobs can be interrupted, and can simply be restarted
-exactly from the point where they left off. E.g., if the latest of the
+exactly from the point where they were left off. E.g., if the latest of the
 `V64-128.*` files in `$DATA` is `V64-128.86016`, then the job for
 sequence 1 can be restarted with:
 
 ```shell
...
...
@@ -548,8 +548,8 @@ export MPI
 
 All steps `8-mksol.sh` above can be run in parallel (they use the `V*`
 files produced in steps `3-krylov` above as a means to jumpstart the
 computation in the middle). Each uses 8 nodes and takes about 13 hours to
-complete (1.43 seconds per iteration). Note that in order to bench the
-mksol timings ahead of time, it is possible to create fake files named as
+complete (1.43 seconds per iteration). Note that in order to benchmark the
+Mksol timings ahead of time, it is possible to create fake files named as
 follows
 ```
 -rw-r--r-- 1 ethome users 564674048 Nov 20 21:47 F.sols0-64.0-64
...
...
rsa240/filtering.md
...
...
@@ -7,17 +7,17 @@ so-called "renumber table", as follows.
 ```
 $CADO_BUILD/sieve/freerel -poly rsa240.poly -renumber $DATA/rsa240.renumber.gz -lpb0 36 -lpb1 37 -out $DATA/rsa240.freerel -t 32
 ```
-where `-t 32` specifies the number of thread. This was done with revision
+where `-t 32` specifies the number of threads. This was done with revision
 `30a5f3eae` of cado-nfs, and takes several hours. (Note that newer
 versions of cado-nfs changed the format of this file.)
 
 ## Duplicate removal
 
 Duplicate removal was done with revision `50ad0f1fd` of cado-nfs.
-cado-nfs proceeds through two passes. We used the default cado-nfs
+Cado-nfs proceeds through two passes. We used the default cado-nfs
 setting which, on the first pass, splits the input into `2^2=4`
-independent slices, with no overlap. cado-nfs supports doing this step
-in an incremental way, so that we assume below the the shell variable
+independent slices, with no overlap. Cado-nfs supports doing this step
+in an incremental way, so that we assume below the shell variable
 `EXP` expands to an integer indicating the filtering experiment number.
 In the command below, `$new_files` is expected to expand to a file
 containing a list of file names of new relations (relative to `$DATA`) to
...
...
@@ -40,7 +40,7 @@ for i in {0..3} ; do
   $CADO_BUILD/filter/dup2 -nrels $nrels -renumber $DATA/rsa240.renumber.gz $DATA/dedup/$i/dedup*gz > $DATA/dup2.$EXP.$i.stdout 2> $DATA/dup2.$EXP.$i.stderr
 done
 ```
 
-(Note: in newer versions of cado-nfs, after june 2020, the `dup2`
+(Note: in newer versions of cado-nfs, after June 2020, the `dup2`
 program also requires the argument `-poly rsa240.poly`.)
 
 ## The "purge" step, a.k.a. singleton and "clique" removal.
...
...
rsa240/polyselect.md
...
...
@@ -15,8 +15,7 @@ for admin from 0 to 2000000000000 by 2500000:
 ```
 
 We found 39890071 size-optimized polynomials, we kept the 104 most promising
-ones (i.e., the ones with the smallest `exp_E` value). The best `exp_E` was
-57.78, the worst `exp_E` was 59.49.
+ones (i.e., the ones with the smallest `exp_E` value). Among them, the best
+`exp_E` was 57.78, the worst `exp_E` was 59.49.
 
 We used the following command line for root optimization (still with cado-nfs
 revision `52ac92746`), where `candidates` is the file containing all candidates
...
...