Reading to do
A start (to be added to a Zotero library):
-
Thi-Tuyet-Hai Nguyen, Adam Jatowt, Mickaël Coustaty, Nhu-Van Nguyen, Antoine Doucet. Deep Statistical Analysis of OCR Errors for Effective Post-OCR Processing. 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL), Jun 2019, Champaign, United States. pp.29-38, ⟨10.1109/jcdl.2019.00015⟩. ⟨hal-02519302⟩ -
Rebora, Simone. "A Digital Edition between Stylometry and OCR: The Klagenfurter Ausgabe of Robert Musil." Textual Cultures 12, no. 2 (2019): 71-90. Accessed March 17, 2021. doi:10.2307/26821537. -
Ahmed Hamdi, Axel Jean-Caurant, Nicolas Sidère, Mickaël Coustaty, Antoine Doucet. An Analysis of the Performance of Named Entity Recognition over OCRed Documents. 2019 ACM/IEEE Joint Conference on Digital Libraries (JCDL), Jun 2019, Champaign, United States. pp.333-334, ⟨10.1109/JCDL.2019.00057⟩. ⟨hal-02364693⟩ -
Marie-Laure Massot, Arianna Sforzini, Vincent Ventresque. Transcribing Foucault’s handwriting with Transkribus. Journal of Data Mining and Digital Humanities, Episciences.org, 2019, Atelier Digit_Hum. ⟨hal-01913435v3⟩ -
Marie-Laure Massot, Jean-Philippe Moreux, Vincent Ventresque. Expérimenter Transkribus sur les fiches de lecture de Michel Foucault. Colloque de clôture du projet ANR Foucault Fiches de lecture Seconde partie « Editer Michel Foucault (1994-2021) », Sep 2020, Paris, France. ⟨hal-02974811⟩ -
Chloé Artaud, Nicolas Sidère, Antoine Doucet, Jean-Marc Ogier, Vincent Poulain D'andecy. Find it! Fraud Detection Contest Report. 24th International Conference on Pattern Recognition (ICPR 2018), Aug 2018, Beijing, China. pp. 13-18. ⟨hal-02316399⟩ -
Thi Tuyet Hai Nguyen, Adam Jatowt, Nhu-Van Nguyen, Mickael Coustaty, and Antoine Doucet. 2020. Neural Machine Translation with BERT for Post-OCR Error Detection and Correction. In Proceedings of the ACM/IEEE Joint Conference on Digital Libraries in 2020 (JCDL ’20), August 1–5, 2020, Virtual Event, China. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3383583.3398605 -
Régis Schlagdenhauffen. Optical Recognition Assisted Transcription with Transkribus: The Experiment concerning Eugène Wilhelm's Personal Diary (1885-1951). Journal of Data Mining and Digital Humanities, Episciences.org, 2020, Atelier Digit_Hum. ⟨hal-02520508v3⟩ -
T. Nguyen, A. Jatowt, M. Coustaty, N. Nguyen and A. Doucet, "Post-OCR Error Detection by Generating Plausible Candidates," 2019 International Conference on Document Analysis and Recognition (ICDAR), Sydney, NSW, Australia, 2019, pp. 876-881, doi: 10.1109/ICDAR.2019.00145. -
Mark J Hill, Simon Hengchen, Quantifying the impact of dirty OCR on historical text analysis: Eighteenth Century Collections Online as a case study, Digital Scholarship in the Humanities, Volume 34, Issue 4, December 2019, Pages 825–843, https://doi.org/10.1093/llc/fqz024 🔑 -
E. Rusakov, L. Rothacker, H. Mo and G. A. Fink, "A Probabilistic Retrieval Model for Word Spotting Based on Direct Attribute Prediction," 2018 16th International Conference on Frontiers in Handwriting Recognition (ICFHR), Niagara Falls, NY, USA, 2018, pp. 38-43, doi: 10.1109/ICFHR-2018.2018.00016. -
Gurjar, Neha, Sebastian Sudholt, and Gernot A. Fink. "Learning Deep Representations for Word Spotting Under Weak Supervision." arXiv:1712.00250 [cs], January 26, 2018. http://arxiv.org/abs/1712.00250. -
Dahl, Christian M., Torben Johansen, Emil N. Sørensen, and Simon Wittrock. "HANA: A HAndwritten NAme Database for Offline Handwritten Text Recognition." arXiv:2101.10862 [cs, econ], January 22, 2021. http://arxiv.org/abs/2101.10862. -
Tanti Kristanti, Laurent Romary. DeLFT and entity-fishing : Tools for CLEF HIPE 2020 Shared Task. CLEF 2020 - Conference and Labs of the Evaluation Forum, Sep 2020, Thessaloniki / Virtual, Greece. ⟨hal-02974946⟩ -
Luca Foppiano, Laurent Romary. Entity-fishing: a DARIAH entity recognition and disambiguation service. Journal of the Japanese Association for Digital Humanities, Japanese Association for Digital Humanities, 2020, 5 (1), pp.22-60. ⟨10.17928/jjadh.5.1_22⟩. ⟨hal-01812100v2⟩ -
Charles Riondet, Luca Foppiano. History Fishing When engineering meets History. Text as a Resource. Text Mining in Historical Science #dhiha7, Institut Historique Allemand (Paris), Jun 2017, Paris, France. ⟨hal-01830713⟩ -
Dominique Stutzmann, Jean-François Moufflet, Sébastien Hamel. Full Text Search in Medieval Manuscripts: Issues and Perspectives of the HIMANIS Project for Electronic Publishing. Medievales -Paris-, Puv, 2017, 73 (73), pp.67 - 96. ⟨10.4000/medievales.8198⟩. ⟨hal-01854949⟩ -
Théodore Bluche, Sébastien Hamel, Christopher Kermorvant, Joan Puigcerver, Dominique Stutzmann, et al. Preparatory KWS Experiments for Large-Scale Indexing of a Vast Medieval Manuscript Collection in the HIMANIS Project. 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Nov 2017, Kyoto, Japan. ⟨10.1109/ICDAR.2017.59⟩. ⟨halshs-01853682⟩ -
The DIGITARIUM as a research corpus: New approaches to extracting and linking named entities from historical newspapers (Nina C. Rastinger and Matthias Schlögl) - https://twitter.com/schambers3/status/1372141931652857857?s=19 + https://twitter.com/schambers3/status/1372140883479498752?s=19 (watch the presentation again as soon as it is online)
-
NewsEye Session 3
-> readings to discuss with Alix
Edited by Hugo Scheithauer