Notes on the annotation of the text in Transkribus
As discussed, here is a short review/summary of how text annotation works in Transkribus. There is room for improvement, but there are good ideas too.
Usage on the desktop platform
- Load a collection/document and a page
- Open the Metadata/Textual panel to display the list of available tags and the portions of text already annotated. In the transcription panel, annotated portions of text are underlined in the color assigned to the tag.
- There are two ways to add a tag:
  - in the transcription panel, select the portion of text and hit “+” next to the tag in the list of available tags.
  - in the transcription panel, select the portion of text, right-click and choose "All tags" and the desired tag.
- To create, edit or delete a tag, hit “Customize”. From there, properties can be added to a tag (equivalent to key/value pairs or to XML attribute/value pairs), a default value can be set for a property, the color associated with the tag can be changed, etc.
- To remove an annotation, simply delete the item from the list of annotations, or select the annotated portion, right-click and hit "Delete all tags for current selection."
If an annotation spans several lines, Transkribus treats the part on each line as a separate portion, which has its downsides.
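To illustrate the per-line splitting, here is a minimal Python sketch of how such portions could be stitched back together after the fact. Everything in it is hypothetical: the Portion class, its fields and the merge_portions helper are not part of Transkribus, and the rule used (same tag, previous portion ends at the end of its line, next portion starts at offset 0 of the following line) is only an assumption about what "the same annotation" means.

```python
from dataclasses import dataclass


@dataclass
class Portion:
    """One annotated portion, limited to a single line (hypothetical model)."""
    tag: str
    line: int    # index of the line within the page
    offset: int  # start of the portion within the line
    length: int  # number of annotated characters


def continues(prev, nxt, line_lengths):
    """True if nxt looks like the continuation of prev on the following line."""
    return (
        prev.tag == nxt.tag
        and nxt.line == prev.line + 1
        and nxt.offset == 0
        and prev.offset + prev.length == line_lengths[prev.line]
    )


def merge_portions(portions, line_lengths):
    """Group per-line portions that appear to form one multi-line annotation."""
    groups = []
    for portion in sorted(portions, key=lambda p: (p.line, p.offset)):
        for group in groups:
            if continues(group[-1], portion, line_lengths):
                group.append(portion)
                break
        else:
            # No existing group ends where this portion starts: open a new one.
            groups.append([portion])
    return groups


# Made-up example: a "person" tag running over a line break, plus a "place" tag.
line_lengths = [20, 15]
portions = [
    Portion("person", 0, 15, 5),
    Portion("person", 1, 0, 6),
    Portion("place", 1, 10, 4),
]
groups = merge_portions(portions, line_lengths)
```

With this made-up input, the two "person" portions end up in one group and the "place" portion in another. The heuristic would also merge two distinct annotations that happen to touch a line break, so it is only a starting point.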
On the online app (https://transkribus.eu/r/read/library/), select the text in the transcription panel, right-click and add the tag. It is possible to turn off the display of the annotations, but it is not possible (to my knowledge) to create new tags online, whereas it is possible in the desktop app.
Remarks on the tags
The tags put style annotation, semantic annotation and diplomatic annotation on the same level. Tags are associated with a user: if a custom tag is created in a document, it will be available in all the other documents for that user. On the other hand, unless a tag is already used in a document shared with another user, it does not seem possible to share, say, a set of custom tags.
Exporting the annotations
- As a side note, the page2tei XSL used by Transkribus uses the values in this attribute to report the annotations as more or less proper tags in the TEI XML export. See: https://github.com/dariok/page2tei/blob/master/page2tei-0.xsl#L572
- the annotations are not exported in ALTO XML
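To make the export point more concrete, here is a minimal Python sketch of reading tags back from an exported attribute value. It rests on assumptions that are not stated in these notes: that the attribute in question is the "custom" attribute of a text line in the PAGE XML export, and that its value looks like `tagname {key:value; key:value;}` with offset and length properties locating the portion in the line. The sample line, the sample value and the function names are made up for illustration.

```python
import re

# Assumed shape of the attribute value (not taken from these notes):
#   readingOrder {index:0;} person {offset:0; length:11;}
TAG_PATTERN = re.compile(r"(\w+)\s*\{([^}]*)\}")


def parse_custom(custom):
    """Return a list of (tag_name, properties) pairs found in the value."""
    tags = []
    for name, body in TAG_PATTERN.findall(custom):
        props = {}
        for item in body.split(";"):
            if ":" in item:
                key, value = item.split(":", 1)
                props[key.strip()] = value.strip()
        tags.append((name, props))
    return tags


def annotated_text(line_text, props):
    """Cut the annotated portion out of the line using offset and length."""
    start = int(props["offset"])
    return line_text[start:start + int(props["length"])]


# Made-up sample data.
line = "Jean Dupont, notary in Lyon"
custom = (
    "readingOrder {index:0;} "
    "person {offset:0; length:11;} "
    "place {offset:23; length:4;}"
)

tags = parse_custom(custom)
# Keep only the entries that locate a portion of text.
spans = [
    (name, annotated_text(line, props))
    for name, props in tags
    if "offset" in props and "length" in props
]
```

This sketch does not handle property values that themselves contain ";" or "}", so it should be checked against a real export before being relied on.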