Commit 6da5748e authored by ZIMMERMANN Paul

another pass of proof-reading

parent 5053df90
@@ -115,7 +115,7 @@ To estimate the number of relations produced by a set of parameters:
- We compute the corresponding factor bases.
- We randomly sample in the global q-range, using sieving instead of batch:
  this produces the same relations. This is slower but `-batch` is
-  incompatible (with the version of cado-nfs we used) with on-line duplicate removal.
+  incompatible (with the version of cado-nfs we used) with on-the-fly duplicate removal.
Here is the result with the final parameters used in the computation. Here,
`-t 16` specifies the number of threads (more is essentially useless, since
@@ -110,7 +110,7 @@ To estimate the number of relations produced by a set of parameters:
  special-q size.
- We random sample in the global q-range, using sieving and not batch:
  this produces the same relations. This is slower but `-batch` is
-  currently incompatible with on-line (on-the-fly) duplicate removal.
+  incompatible (with the version of cado-nfs we used) with on-the-fly duplicate removal.
Here is the result with the parameters that were used in the computation.
@@ -28,7 +28,7 @@ This is exactly similar to
We also repeat this important paragraph from the RSA-240 documentation.
Most (if not all) information boxes in this document rely on two shell
-variables, `CADO_BUILD` and `DATA`, be set and `export`-ed to shell
+variables, `CADO_BUILD` and `DATA`, be set and exported to shell
subprocess (as with `export CADO_BUILD=/blah/... ; export
DATA=/foo/...`). The `CADO_BUILD` variable is assumed to be the path to
a successful cado-nfs build directory. The `DATA` variable, which is
@@ -36,7 +36,7 @@ used by some scripts, should point to a directory with plenty of storage,
possibly on some shared filesystem. Storage is also needed to store the
temporary files with collected relations. Overall, a full reproduction of
the computation would need in the whereabouts of 10TB of storage. All
-scripts provided in this script expect to be run from the directory where
+scripts provided in this document expect to be run from the directory where
they are placed, since they are also trying to access companion data
files.
@@ -79,7 +79,7 @@ To estimate the number of relations produced by a set of parameters:
  special-q size.
- We random-sample in the global q-range, using sieving and not batch:
  this produces the same relations. This is slower but `-batch` is
-  currently incompatible with on-line (on-the-fly) duplicate removal.
+  incompatible (with the version of cado-nfs we used) with on-the-fly duplicate removal.
Here is what it gives with the parameters that were used in the computation.
@@ -94,8 +94,8 @@ The file has size 804409710 bytes, and takes less than 4 minutes
to compute on the `grvingt` computers.
The hint file is [`rsa250.hint`](rsa250.hint), and has a weird format.
-The following basically says "Three algebraic large primes for special-q
-less than 2^31, and two otherwise."
+The following basically says "Three algebraic large primes for special-q's
+less than 2^32, and two otherwise."
```shell
cat > rsa250.hint <<EOF
@@ -107,7 +107,7 @@ cat > rsa250.hint <<EOF
EOF
```
-We can now sieve for random-sampled special-q, and remove duplicate
+We can now sieve for random-sampled special-q's, and remove duplicate
relations on-the-fly. In the output of the command line below, only the
number of unique relations per special-q matters. The timing does not
matter.
@@ -125,7 +125,7 @@ order to vary the random picks.)
In order to deduce an estimate of the total number of (de-duplicated)
relations, it remains to multiply the average number of relations per
special-q as obtained during the sample sieving by the number of
-special-q in the global q-range. The latter can be precisely estimated
+special-q's in the global q-range. The latter can be precisely estimated
using the logarithmic integral function as an approximation of the number
of degree-1 prime ideals below a bound. In
[Sagemath](https://www.sagemath.org/) code, this gives:
@@ -139,9 +139,9 @@ print (tot_rels)
# 6.23205306433878e9
```
This estimate (6.2G relations) can be made more precise by increasing the
-number of special-q that are sampled for sieving. It is also possible to
+number of special-q's that are sampled for sieving. It is also possible to
have different nodes sample different sub-ranges of the global range to
-get the result faster. We consider that sampling 1024 special-qs is
+get the result faster. We consider that sampling 1024 special-q's is
enough to get a reliable estimate.
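For readers who want to replay this estimate without Sagemath, here is a minimal plain-Python sketch of the same computation, with mpmath's `li` standing in for Sagemath's `log_integral`. The q-range endpoints and the average yield per special-q below are illustrative assumptions, not values measured in the computation.
```python
# Hedged sketch of the total-relation estimate via the logarithmic integral.
# q0, q1 and ave_rels_per_sq are assumptions for illustration only.
from mpmath import li

q0, q1 = 1e9, 12e9               # assumed global special-q range
ave_rels_per_sq = 12.7           # hypothetical average from sample sieving
number_of_sq = li(q1) - li(q0)   # ~ number of degree-1 prime ideals in range
tot_rels = ave_rels_per_sq * number_of_sq
print(tot_rels)                  # lands in the vicinity of the 6.2e9 quoted above
```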
## Estimating the cost of sieving
@@ -199,10 +199,10 @@ sys 70m15.469s
```
Then the `128m8.106=7688.1s` value must be appropriately scaled in order
to convert it into physical core-seconds. For instance, in our case,
-since there are 32 physical cores and we sieved 1024 special-qs, this
+since there are 32 physical cores and we sieved 1024 special-q's, this
gives `(128*60+8.1)*32/1024=240.25` core.seconds per special-q.
-Finally, it remains to multiply by the number of special-q in this
+Finally, it remains to multiply by the number of special-q's in this
subrange. We get (in Sagemath):
```python
@@ -219,7 +219,7 @@ With this experiment, we get therefore about 1060 core.years for this sub-range.
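As a cross-check, that arithmetic can be replayed in plain Python; the sub-range endpoints [1e9,4e9] are our assumption for the sieving-only range, and `li` again replaces Sagemath's `log_integral`.
```python
# Hedged sketch: measured cost per special-q scaled to core.years.
# The sub-range [1e9, 4e9] is an assumption for illustration.
from mpmath import li

cost_per_sq = (128*60 + 8.1) * 32 / 1024   # 240.25 core.seconds, as above
number_of_sq = li(4e9) - li(1e9)           # special-q's in the assumed sub-range
core_years = cost_per_sq * number_of_sq / (3600*24*365)
print(core_years)                          # close to the ~1060 quoted above
```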
#### Cost of 1-sided sieving + batch in the q-range [4e9,12e9]
-For special-qs larger than 4e9, since we are using batch smoothness
+For special-q's larger than 4e9, since we are using batch smoothness
detection on side 0, we have to precompute the `rsa250.batch0` file which
contains the product of all primes to be extracted. (Note that the
`-batch1` option is mandatory, even if for our parameters, no file is
@@ -238,7 +238,7 @@ shell script given in this repository. This launches:
produce relations.
The script takes two command line arguments `-q0 xxx` and `-q1 xxx`,
-which describe the range of special-q to process. Temporary files are put
+which describe the range of special-q's to process. Temporary files are put
in the `/tmp` directory by default.
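The batch smoothness detection that `finishbatch` performs boils down to stripping, from each survivor cofactor, every prime that divides the big product stored in `rsa250.batch0`. Below is a minimal, self-contained sketch of that idea; the real code amortizes the work over many survivors with product and remainder trees, and every name in the sketch is ours, not cado-nfs's.
```python
# Simplified sketch of batch smoothness detection.
# P plays the role of the product of all side-0 primes (rsa250.batch0);
# cofactor_bound stands in for the allowed large-prime leftover.
from math import gcd

def batch_smooth_survivors(cofactors, P, cofactor_bound):
    """Keep indices whose non-smooth part is at most cofactor_bound."""
    kept = []
    for i, n in enumerate(cofactors):
        g = gcd(n, P)
        while g > 1:        # repeatedly strip primes shared with P
            n //= g
            g = gcd(n, g)
        if n <= cofactor_bound:
            kept.append(i)
    return kept

# Toy usage with P = 2*3*5*7: the first cofactor reduces to 97 (kept),
# the second leaves the prime 10007 (rejected).
print(batch_smooth_survivors([2**5 * 3 * 97, 5 * 7 * 10007], 210, 100))  # [0]
```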
In order to run [`rsa250-sieve-batch.sh`](rsa250-sieve-batch.sh) on your
@@ -255,12 +255,12 @@ is as follows:
```
The script prints on stdout the start and end date, and in the output of
`las`, which can be found in `$DATA/log/las.${q0}-${q1}.out`, the
-number of special-q that have been processed can be found. From this
+number of special-q's that have been processed can be found. From this
information one can again deduce the cost in core.seconds to process one
special-q and then the overall cost of sieving the q-range [4e9,12e9].
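Concretely, this is the same arithmetic as for the sieving-only sub-range; in the sketch below the per-special-q cost is a pure placeholder, since the measured value is not repeated here.
```python
# Hedged sketch: overall cost of the batch q-range [4e9, 12e9].
# cost_per_sq is a placeholder, NOT a value measured in the computation.
from mpmath import li

cost_per_sq = 200.0                  # hypothetical core.seconds per special-q
number_of_sq = li(12e9) - li(4e9)    # special-q's in [4e9, 12e9]
print(cost_per_sq * number_of_sq / (3600*24*365))   # core.years
```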
The design of this script imposes to have a rather long range of
-special-q to handle for each run of `rsa250-sieve-batch.sh`. Indeed,
+special-q's to handle for each run of `rsa250-sieve-batch.sh`. Indeed,
during the last minutes, the `finishbatch` jobs need to take care of the
last survivor files while `las` is no longer running, so that the node is
not fully occupied. If the `rsa250-sieve-batch.sh` job takes a few hours,
@@ -318,7 +318,7 @@ the full computation. It is just necessary to remove the `-random-sample`
option and to adjust the `-q0` and `-q1` to create many small work units
that in the end cover exactly the global q-range.
-Since we do not expect anyone to spend again as much computing resources
+Since we do not expect anyone to spend as much computing resources
to perform again exactly the same computation, we provide in the
[`rsa250-rel_count`](rsa250-rel_count) file the count of how many (non-unique)
relations were produced for each 100M special-q sub-range.
@@ -373,7 +373,7 @@ Again, the general description of the [RSA-240
case](../rsa240/README.md#estimating-the-linear-algebra-time-more-precisely-and-choosing-parameters)
applies identically to RSA-250.
-In order to bench the time per iteration for the selected matrix, we can
+In order to benchmark the time per iteration for the selected matrix, we can
use the following script.
```shell
export matrix=$DATA/rsa250.matrix.250.bin
@@ -437,7 +437,7 @@ We used the following command on the machine `wurst`:
```shell
$CADO_BUILD/linalg/characters -poly rsa250.poly -purged $DATA/purged3.gz -index $DATA/rsa250.index.gz -heavyblock $DATA/rsa250.matrix.250.dense.bin -out $DATA/rsa250.kernel -ker $DATA/W -lpb0 36 -lpb1 37 -nchar 50 -t 56
```
-This gave after about 2 hours 23 dependencies.
+This gave 23 dependencies after about 2 hours.
## Reproducing the square root step