AIRR and read counts: duplicate_count after all?
I was about to send the email to AIRR when I came across this earlier email discussion again:
(Vidjil) Note that we focus on clones throughout the whole Vidjil platform, not on individual reads. We plan to use the "consensus_count" key of the AIRR format to encode the number of reads belonging to a clone. Is this the right way to go?
(JVH, AIRR) For counting clones, the duplicate_count field would be more appropriate; consensus_count is for UMI consensus read annotation. However, if you want a clonotype summary report (e.g., count of unique CDR3s without V/J annotations), then the Rearrangement format isn't really suitable for that. This might be a format we have to consider designing, if there is enough demand for it. (This is a grey area though, because it's more of a custom analysis output than something we can standardize.)
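
To make the suggestion concrete, here is a minimal sketch of what storing a clone's read count in duplicate_count could look like in a tab-separated, Rearrangement-style export. It uses only the Python standard library; the clone data and the write_clones helper are hypothetical, and the column set is deliberately reduced (a real AIRR Rearrangement file carries further required columns that are omitted here).

```python
import csv
import io

# Hypothetical clones: (identifier, representative sequence, number of reads).
clones = [
    ("clone-1", "TGTGCCAGCAGT", 1250),
    ("clone-2", "TGTGCTAGAGAT", 87),
]


def write_clones(clones, handle):
    """Write one row per clone, with its read count in duplicate_count.

    consensus_count is intentionally not written: per the reply above,
    it is meant for UMI consensus read annotation, not clone sizes.
    """
    writer = csv.DictWriter(
        handle,
        fieldnames=["sequence_id", "sequence", "duplicate_count"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    for clone_id, sequence, reads in clones:
        writer.writerow(
            {
                "sequence_id": clone_id,
                "sequence": sequence,
                "duplicate_count": reads,
            }
        )


# Write to an in-memory buffer so the sketch has no side effects on disk.
buffer = io.StringIO()
write_clones(clones, buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
```

Each row stands for one clone, so duplicate_count here carries the number of reads assigned to that clone rather than the number of identical copies of a single read; whether that reading is acceptable is exactly the question raised above.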