... | ... | @@ -71,8 +71,31 @@ if you faced an error message like this |
|
|
|
|
|
Update the `xmlns` and `schemaLocation` attributes of `<PcGts>` to a supported version as described above.
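
As a rough illustration of that attribute update, here is a minimal sketch that rewrites the dated PAGE namespace wherever it appears in the file text, which covers both `xmlns` and `schemaLocation`. The function name `retarget_page_version` is hypothetical, and the version string you pass must be one of the supported versions listed above; this sketch does not know which ones those are. It only changes the declared version and does not convert the content, so elements that do not exist in the target schema may still be rejected.

```python
import re

# Matches any dated PAGE content namespace, e.g. .../pagecontent/2013-07-15
PAGE_NS = re.compile(
    r"http://schema\.primaresearch\.org/PAGE/gts/pagecontent/\d{4}-\d{2}-\d{2}"
)


def retarget_page_version(xml_text: str, version: str) -> str:
    """Point xmlns and xsi:schemaLocation at the given PAGE schema version.

    `version` is a date string such as "2013-07-15" (assumed to be supported).
    """
    target = f"http://schema.primaresearch.org/PAGE/gts/pagecontent/{version}"
    # Every occurrence is replaced, so the .xsd URL in schemaLocation follows too.
    return PAGE_NS.sub(target, xml_text)
```

Read the file as UTF-8 text, pass it through the function, and write the result back before importing.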
|
|
|
By default, the segmentation for the selected images, both regions and lines, will be deleted. You can disable this behavior by unchecking 'Override existing segmentation.', in which case the system will try to match the lines and regions by their `ID` attribute. The old content of matching lines is then stored in their history, and new lines/regions are created when no matching existing element is found.
|
|
|
The Baseline tag is optional in PageXML, and a TextRegion has a list of coordinates of the form x1,y1...xn,yn that describes a polygon.
|
|
|
Example of a PageXML file:
|
|
|
The `TextRegion` tag has a list of coordinates of the form `x1,y1 x2,y2...xn,yn` that describes a polygon. The `Baseline` tag is optional in PageXML.
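
To make the coordinate format concrete, here is a small sketch that turns such a `points` string into a list of polygon vertices. The helper name `parse_points` is hypothetical, and the sketch assumes integer pixel coordinates as in the examples in this section.

```python
def parse_points(points: str) -> list[tuple[int, int]]:
    """Turn 'x1,y1 x2,y2 ... xn,yn' into [(x1, y1), ..., (xn, yn)]."""
    polygon = []
    for pair in points.split():  # split() also ignores a trailing space
        x, y = pair.split(",")
        polygon.append((int(x), int(y)))
    return polygon


print(parse_points("150,64 346,60 425,81 "))
# prints [(150, 64), (346, 60), (425, 81)]
```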
|
|
|
The content can be stored in the `TextLine` itself or separately in each `Word`, for example:
|
|
|
```xml
<TextLine id="r2l1" custom="readingOrder {index:0;}">
  <Coords points="150,64 346,60 425,81 "/>
  <Baseline points="155,55 180,55 206,55 231,55 257,55 283,55 308,55"/>
  <TextEquiv>
    <Unicode>ܡ ܗܘܡ ܐܘ ܥܒ</Unicode>
  </TextEquiv>
</TextLine>
```
|
|
|
```xml
<TextLine id="l1">
  <Coords points="1550,422 1555,422"/>
  <Word id="w122" language="Hebrew" primaryScript="Hebr - Hebrew" readingDirection="right-to-left">
    <Coords points="926,424 926,426"/>
    <TextEquiv>
      <Unicode>ע"י</Unicode>
    </TextEquiv>
  </Word>
  <Word id="w45" language="Hebrew" primaryScript="Hebr - Hebrew" readingDirection="right-to-left">
    <Coords points="531,464 687,464 "/>
    <TextEquiv>
      <Unicode>הוט</Unicode>
    </TextEquiv>
  </Word>
</TextLine>
```
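
The two storage styles can be read with one routine. The following is a reader-side sketch only, not a description of how the importer works: it takes the line-level `TextEquiv` when present and otherwise joins the `Word` transcriptions. The function names are hypothetical, and joining words with a single space in document order is an assumption.

```python
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip a '{namespace}' prefix so this works with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def unicode_of(element):
    """Return the text of the element's direct TextEquiv/Unicode child, or None."""
    for child in element:
        if local(child.tag) == "TextEquiv":
            for sub in child:
                if local(sub.tag) == "Unicode":
                    return sub.text or ""
    return None


def line_text(text_line) -> str:
    """Prefer the line-level transcription, else join the Word transcriptions."""
    text = unicode_of(text_line)
    if text is not None:
        return text
    words = [unicode_of(w) for w in text_line if local(w.tag) == "Word"]
    # Assumption: words are joined with one space, in document order.
    return " ".join(w for w in words if w)
```

For example, `line_text(ET.fromstring(xml_string))` can be called on a single `<TextLine>` element parsed from a string, or on each `TextLine` found while iterating over a full document.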
|
|
|
|
|
|
Example of a full PageXML file:
|
|
|
```xml
|
|
|
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
|
|
|
<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15 http://schema.primaresearch.org/PAGE/gts/pagecontent/2013-07-15/pagecontent.xsd">
|
... | ... | |