eScriptorium issues
https://gitlab.inria.fr/scripta/escriptorium/-/issues
2019-05-12T16:39:48+02:00

https://gitlab.inria.fr/scripta/escriptorium/-/issues/78
text find and replace
2019-05-12T16:39:48+02:00
STOKL BEN EZRA Daniel
Find and
find and replace textual values (if possible with regex).

https://gitlab.inria.fr/scripta/escriptorium/-/issues/74
one RTL issue in the transcription page (not box) returned (not urgent)
2019-05-13T09:45:03+02:00
STOKL BEN EZRA Daniel
One of the old RTL issues persists here for the last http://ns342141.ip-5-196-76.eu/document/68/part/4768/edit/
![image](/uploads/9dfea4ff49d848e20fa2186be7654cb8/image.png)

https://gitlab.inria.fr/scripta/escriptorium/-/issues/76
segment manually but transcribe automatically
2019-05-13T12:07:08+02:00
STOKL BEN EZRA Daniel
I just uploaded a binarized image, manually segmented two lines, and wanted to run the transcriber on it. This turned out to be impossible. The system automatically rebinarized and resegmented it (completely wrong), and then the transcription is of course crap.

https://gitlab.inria.fr/scripta/escriptorium/-/issues/77
transcription box does not take correct coordinates
2019-05-13T12:08:22+02:00
STOKL BEN EZRA Daniel
I think sometimes the transcription box takes wrong coordinates, e.g. here
http://ns342141.ip-5-196-76.eu/document/70/part/5244/edit/
among the first lines: the box shows only the very top, but the automatic transcription goes in the right direction and so has to be based on other coordinates, which are shown correctly by the white rectangle on the left, on the manuscript image.
If I manually enlarge the box on the manuscript, the transcription box does not change coordinates. Does it have a fixed size or something like that?
![image](/uploads/a0ca598ece4e504146045eba42f69ca5/image.png)

https://gitlab.inria.fr/scripta/escriptorium/-/issues/79
no export for regions only
2019-05-15T13:45:50+02:00
STOKL BEN EZRA Daniel
I imported regions to http://ns342141.ip-5-196-76.eu/document/66 with altoXML import.
I then corrected the coordinates. Worked nicely (I have some ideas for ergonomics, but it is already really functional).
BUT THEN: the export is empty (probably it focuses on lines, and when there are no lines, there are also no regions...).
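The suspected cause could be avoided by iterating over regions first and attaching lines to them, rather than deriving regions from lines. The following is only a minimal sketch of that idea, not eScriptorium's exporter: `export_page` and the dictionary keys are hypothetical, and the ALTO namespace and header are left out for brevity.

```python
import xml.etree.ElementTree as ET


def export_page(regions, lines):
    """Build a minimal ALTO-like <Page> element.

    regions: list of dicts with keys id, x, y, w, h (hypothetical shape).
    lines:   list of dicts with keys region_id, x, y, w, h, text.
    """
    page = ET.Element("Page")
    print_space = ET.SubElement(page, "PrintSpace")

    # Loop over regions first, so a region is written even when
    # no line belongs to it.
    for region in regions:
        block = ET.SubElement(print_space, "TextBlock", {
            "ID": str(region["id"]),
            "HPOS": str(region["x"]),
            "VPOS": str(region["y"]),
            "WIDTH": str(region["w"]),
            "HEIGHT": str(region["h"]),
        })
        for line in lines:
            if line["region_id"] != region["id"]:
                continue
            text_line = ET.SubElement(block, "TextLine", {
                "HPOS": str(line["x"]),
                "VPOS": str(line["y"]),
                "WIDTH": str(line["w"]),
                "HEIGHT": str(line["h"]),
            })
            ET.SubElement(text_line, "String", {"CONTENT": line["text"]})
    return page
```

With one region and an empty list of lines, the result still contains one `TextBlock` and no `TextLine`.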
See below for file-content:
[export-parma_3173-2019-05-12T17_37.xml](/uploads/041195e67e989e696525e35e26d40d64/export-parma_3173-2019-05-12T17_37.xml)

https://gitlab.inria.fr/scripta/escriptorium/-/issues/81
need folders for documents to distinguish projects
2019-05-15T13:46:44+02:00
STOKL BEN EZRA Daniel
We will need folders to organize documents into different projects. Especially if we work on thousands of single pages, they should not fill the screen but should be organizable in n levels.

https://gitlab.inria.fr/scripta/escriptorium/-/issues/72
Import doesn't change image name when there are duplicates
2019-05-27T10:16:08+02:00
Robin Tissot
So if one deletes one of the images, the other one breaks.
Try to use the Django storage facilities to deal with this.

https://gitlab.inria.fr/scripta/escriptorium/-/issues/68
strange image upload bug
2019-05-27T10:44:27+02:00
STOKL BEN EZRA Daniel
I tried to upload a new Hebrew document for Princeton and everything seemed to work. It loads 271 images but shows neither the small tabs nor the big images, i.e. http://ns342141.ip-5-196-76.eu/media/documents/52/00000006.png
STOKL BEN EZRA Daniel

https://gitlab.inria.fr/scripta/escriptorium/-/issues/85
IIIF Import fails
2019-05-29T10:38:36+02:00
Peter Stokes
IIIF import fails for manuscripts from Cambridge University Library, giving `value too long for type character varying(512)` error. Some sample manifests are
* https://cudl.lib.cam.ac.uk/iiif/MS-KK-00003-00018
* https://cudl.lib.cam.ac.uk/iiif/MS-HH-00001-00010
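The error message suggests that some value taken from the manifest is longer than a 512-character database column. Which value overflows is not stated here, so the sketch below simply lists every string in a parsed manifest that exceeds a given length. `find_long_strings` is a hypothetical helper, not part of eScriptorium.

```python
def find_long_strings(node, limit=512, path="$"):
    """Yield (path, length) for every string longer than `limit`
    in a JSON-like structure of dicts, lists and scalars."""
    if isinstance(node, str):
        if len(node) > limit:
            yield path, len(node)
    elif isinstance(node, dict):
        for key, value in node.items():
            yield from find_long_strings(value, limit, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            yield from find_long_strings(value, limit, f"{path}[{index}]")
```

Running it on the result of `json.load` for one of the manifests above would show which labels, identifiers or URLs are candidates for the overflow.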
Import also fails for Trinity College Cambridge, giving `HTTPSConnectionPool(host='trin-digital-library.trin.cam.ac.uk', port=443): Max retries exceeded with url: /iiif/2/B.15.34%2F001_B.15.34_front-cover.jpg/full/full/0/default.jpg (Caused by SSLError(SSLError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:852)'),))` (or similar)
Sample manifests:
* https://trin-sites-pub.trin.cam.ac.uk/manuscripts/B.15.34.json
* https://trin-sites-pub.trin.cam.ac.uk/manuscripts/B.10.4.json

https://gitlab.inria.fr/scripta/escriptorium/-/issues/84
Option to reduce file size on IIIF import
2019-05-29T10:39:55+02:00
Peter Stokes
Taking the full-res images on the IIIF import is often unnecessary and makes everything much slower (e.g. https://dms-data.stanford.edu/data/manifests/Parker/pz542dy6146/manifest.json has been running over a day). An option to import scaled images would be very helpful (e.g. by percentage).

https://gitlab.inria.fr/scripta/escriptorium/-/issues/21
Export
2019-05-29T10:41:11+02:00
Robin Tissot
Decide where the button should be.
For now a very simple XML file with only lines would be enough; it is needed for the first milestone.

https://gitlab.inria.fr/scripta/escriptorium/-/issues/92
Clicking on arrow in edit window to go to previous or next page does not change page image shown
2019-06-03T12:37:20+02:00
STOKL BEN EZRA Daniel

https://gitlab.inria.fr/scripta/escriptorium/-/issues/90
v0.5 issues
2019-06-03T12:37:30+02:00
Robin Tissot
* ~~importing images doesn't update the selected counter in the images tab~~
* ~~error 500 when trying to train on non-binarised images!~~
* ~~very thin lines boundaries are wrong in the transcription popup because there is a minimum height for the image and input - regression?!~~
* Transcription progress (the percentage in the card) seems broken.

https://gitlab.inria.fr/scripta/escriptorium/-/issues/91
further small issues with version 0.5
2019-06-03T14:53:51+02:00
STOKL BEN EZRA Daniel
http://ns342141.ip-5-196-76.eu/document/147/images/?select=9131 is binarized but the binarization icon continues to blink.
I cannot select it for transcription.
There is a bug for the regions: ![image](/uploads/bc1a5383a1e9a506154ece8adff784b7/image.png)

https://gitlab.inria.fr/scripta/escriptorium/-/issues/97
transcribe in page text view not only in text box
2019-06-24T09:28:06+02:00
STOKL BEN EZRA Daniel
If the text has few mistakes it may actually be more comfortable to correct directly on the page text view.

https://gitlab.inria.fr/scripta/escriptorium/-/issues/93
training: visualization of progress needed
2019-06-24T09:37:25+02:00
STOKL BEN EZRA Daniel
It would be good if the training option gave some feedback about the state of things. I launched it 48 hours ago. It should have finished a long time ago. I would suggest
1. a sign "training data creation finished" --> begin training
2. a progress report every epoch, as in kraken, giving the CER on the test set and perhaps visualizing the current transcription of 5-10 row images so that one can check whether things are going in a good direction.

https://gitlab.inria.fr/scripta/escriptorium/-/issues/71
(small) annotation validation parser
2019-06-28T12:22:24+02:00
STOKL BEN EZRA Daniel
To avoid later chaos, it would be essential to have a very small annotation validation parser that would do the following:
Each time the user opens a page with a new tag, which all begin with <, the algorithm checks
1. Whether the symbol chain is in the list of allowed chains
2. Whether the tag is closed on that page: it shows a warning symbol including ! and the tag symbol in the top right corner (or, nice dream, next to the text on the right line) until the tag is closed on that page.
If either one is not the case, there is a warning somewhere (e.g. a red ! in the top right corner), and clicking on it opens a textbox listing the tag (and its place?) which has not been closed or which is not in the list.
One can get the same kind of warning on the document tab and get a list of all tags (and their page numbers), which are not closed on that page.
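The two checks described above can be sketched in a few lines. This is only an illustration under assumptions: the issue says no more than that tags begin with `<`, so the `<name>` / `</name>` syntax, the `TAG_PATTERN` regex and the `validate_tags` function are all hypothetical.

```python
import re

# Assumed syntax: <name> opens a tag and </name> closes it.
TAG_PATTERN = re.compile(r"<(/?)([^<>\s/]+)>")


def validate_tags(page_text, allowed):
    """Return (unknown, unclosed) tag names for one page of transcription.

    unknown:  tag names that are not in the `allowed` collection.
    unclosed: allowed tags that were opened but not closed on this page.
    """
    unknown = []
    open_tags = []
    for match in TAG_PATTERN.finditer(page_text):
        is_closing = match.group(1) == "/"
        name = match.group(2)
        if name not in allowed:
            if name not in unknown:
                unknown.append(name)
            continue
        if not is_closing:
            open_tags.append(name)
        elif name in open_tags:
            # Close the most recently opened tag with this name.
            index = len(open_tags) - 1 - open_tags[::-1].index(name)
            del open_tags[index]
        # A closing tag without an opening one is ignored in this sketch.
    return unknown, open_tags
```

A non-empty result for either list would trigger the warning symbol; running the same function over every page would give the document-level list of unclosed tags with their page numbers.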
There can of course be situations in which a tag is only closed on the next page, like a placename that begins at the bottom of one page and continues at the top of the next.

https://gitlab.inria.fr/scripta/escriptorium/-/issues/60
Ability to transcribe rotated box and / or polygons
2019-06-28T12:22:52+02:00
STOKL BEN EZRA Daniel
It would be very important for the upcoming presentation to be able to transcribe lines that are not completely straight, because most manuscript pages contain them. So it would be good to be able to have a more precise display for cases like:
![image](/uploads/273407395afebf9f3f8a1727a51af055/image.png)
and the corresponding transcription window
![image](/uploads/6d1481dcc3bb3c512bff9e429dd23ba2/image.png)
Even only for manual markup or via altoXML import which allows printspace, textblocks, textlines etc with a @rotation attribute; or polygons with the @shape attribute.
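One way to treat rotated boxes and polygons uniformly would be to convert every rotated box into a four-point polygon. The sketch below assumes that the rotation is given in degrees around the box center and that image coordinates have y growing downward, so positive angles turn clockwise on screen. `rotated_box_to_polygon` is a hypothetical helper, and how an importer should actually interpret the rotation value is not settled here.

```python
import math


def rotated_box_to_polygon(cx, cy, width, height, angle_degrees):
    """Return the four corners of a box rotated about its center (cx, cy).

    Corners are listed starting from the unrotated top-left corner,
    then top-right, bottom-right and bottom-left.
    """
    angle = math.radians(angle_degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    half_w, half_h = width / 2, height / 2
    corners = []
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        # Standard 2D rotation of the offset from the center.
        x = cx + dx * cos_a - dy * sin_a
        y = cy + dx * sin_a + dy * cos_a
        corners.append((x, y))
    return corners
```

With an angle of 0 the function returns the ordinary axis-aligned corners, so straight lines need no special case.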
Ben, can you already tell us how kraken's new segmenter will deal with them? I assume it is already clear to you, right?
KIESSLING Benjamin

https://gitlab.inria.fr/scripta/escriptorium/-/issues/31
feature request: alternating color for pair-impair lines
2019-06-28T12:24:20+02:00
STOKL BEN EZRA Daniel
Alternating colors for even and odd (pair-impair) lines, to quickly check the correctness of the line segmentation.
Robin Tissot

https://gitlab.inria.fr/scripta/escriptorium/-/issues/45
Populate typologies and metadata keys
2019-06-28T12:24:33+02:00
Robin Tissot
Can be found at:
/admin/core/metadata/
/admin/core/typology/
And should we let users freely create more of those?