Better estimate filesize
Knowing filesize is important to accurately estimate the number of sequences in the input file and, thus, use a correct multiplier for the e-value computation.
However, the filesize currently used is the size on disk, whether the file is compressed or not. Thus, with a compressed file, the number of sequences is underestimated, the multiplier is too small, and the e-value filtering is too permissive.
This leads to #5158 in at least one case (https://health.vidjil.org/13296-2 ind119).
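
One possible direction, sketched here in Python only to illustrate the idea (this is not the current implementation; the function name `estimate_uncompressed_size` and the `sample_bytes` parameter are hypothetical): detect gzip input from its magic bytes, inflate a sample from the start of the file, and scale the on-disk size by the observed compression ratio. Only gzip is handled in this sketch, and the ratio of the first chunk is assumed to be representative of the whole file.

```python
import os
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def estimate_uncompressed_size(path, sample_bytes=1 << 20):
    """Estimate the uncompressed size of `path`, in bytes.

    Hypothetical helper: for a gzip file, inflate the first `sample_bytes`
    of compressed data and extrapolate the ratio to the whole file.
    """
    disk_size = os.path.getsize(path)
    with open(path, "rb") as f:
        sample = f.read(sample_bytes)

    if not sample.startswith(GZIP_MAGIC):
        # Not gzip: the on-disk size is taken as the real size.
        return disk_size

    inflater = zlib.decompressobj(wbits=31)  # 31 = expect gzip header/trailer
    inflated = inflater.decompress(sample)
    # Bytes after the end of the first gzip member were not consumed.
    consumed = len(sample) - len(inflater.unused_data)
    if consumed == 0 or not inflated:
        return disk_size  # nothing usable in the sample, fall back

    ratio = len(inflated) / consumed
    return int(disk_size * ratio)
```

The estimated size could then replace the on-disk size wherever the number of sequences is derived. Sampling avoids decompressing the whole file, at the cost of an approximate result when the compression ratio varies along the file.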