ocrd_models.ocrd_page module¶
API to PAGE-XML, generated with generateDS from XML schema.
- ocrd_models.ocrd_page.parse(inFileName, silence=False, print_warnings=True)[source]¶
Parse a file, create the object tree, and export it.
- Parameters:
inFileName (str) – Name of the file to parse.
print_warnings (boolean) – Whether to print warnings collected during parsing.
- Returns:
The root object in the tree.
- ocrd_models.ocrd_page.parseEtree(inFileName, silence=False, print_warnings=True, mapping=None, nsmap=None)[source]¶
Parse a file, create the object tree, and export it. Return tree and mappings, too.
- Parameters:
inFileName (str) – Name of the file to parse.
print_warnings (boolean) – Whether to print warnings collected during parsing.
- Returns:
- A tuple of
The root object in the tree.
The full node tree.
A mapping from object IDs to tree nodes.
A reverse mapping from tree nodes to object IDs.
- ocrd_models.ocrd_page.parseString(inString, silence=False, print_warnings=True)[source]¶
Parse a string, create the object tree, and export it.
- Parameters:
inString (str) – The XML string to parse.
- Returns:
The root object in the tree.
- class ocrd_models.ocrd_page.AdvertRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, orientation=None, bgColour=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
Regions containing advertisements.
orientation – The angle by which the rectangle encapsulating the region has to be rotated clockwise to correct the present skew (negative values indicate anti-clockwise rotation). Range: -179.999,180.
bgColour – The background colour of the region.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='AdvertRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='AdvertRegionType')[source]¶
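The documented orientation range (-179.999,180) can be read as the half-open interval (-180, 180]. The following is a minimal sketch, assuming that reading, of folding an arbitrary angle in degrees into the interval. `normalize_orientation` is a hypothetical helper for illustration and is not part of ocrd_models.

```python
def normalize_orientation(angle: float) -> float:
    """Fold an angle in degrees into the half-open interval (-180, 180]."""
    # Python's % returns a non-negative result for a positive modulus,
    # so (180 - angle) % 360 lies in [0, 360) and the result in (-180, 180].
    return 180.0 - ((180.0 - angle) % 360.0)
```

With this convention, a skew of 190 degrees clockwise is stored as -170, i.e. 170 degrees anti-clockwise.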
- class ocrd_models.ocrd_page.AlternativeImageType(filename=None, comments=None, conf=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
conf – Confidence value (between 0 and 1).
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='AlternativeImageType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='AlternativeImageType')[source]¶
- class ocrd_models.ocrd_page.BaselineType(points=None, conf=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
conf – Confidence value (between 0 and 1).
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- validate_PointsType_patterns_ = [['^(([0-9]+,[0-9]+ )+([0-9]+,[0-9]+))$']]¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='BaselineType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='BaselineType')[source]¶
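The pattern in `validate_PointsType_patterns_` accepts two or more `x,y` pairs of non-negative integers separated by single spaces. A small sketch of checking a string against it with the standard `re` module follows; `is_valid_points` is a hypothetical helper, not a function of this module.

```python
import re

# Pattern copied from validate_PointsType_patterns_ above.
POINTS_PATTERN = re.compile(r'^(([0-9]+,[0-9]+ )+([0-9]+,[0-9]+))$')


def is_valid_points(points: str) -> bool:
    """Return True if the string matches the PointsType pattern."""
    return POINTS_PATTERN.match(points) is not None
```

Note that a single pair such as `0,0` does not match, because the pattern requires at least one space-terminated pair before the final one.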
- class ocrd_models.ocrd_page.BorderType(Coords=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Border of the actual page (if the scanned image contains parts not belonging to the page).
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='BorderType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='BorderType')[source]¶
- exportChildren(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='BorderType', fromsubclass_=False, pretty_print=True)[source]¶
- set_Coords(Coords)[source]¶
Set coordinate polygon by given CoordsType object. Moreover, invalidate self’s pc:AlternativeImage elements (because they will have been cropped with a bbox of the previous polygon).
- class ocrd_models.ocrd_page.ChartRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, orientation=None, type_=None, numColours=None, bgColour=None, embText=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
Regions containing charts or graphs of any type should be marked as chart regions.
orientation – The angle by which the rectangle encapsulating the region has to be rotated clockwise to correct the present skew (negative values indicate anti-clockwise rotation). Range: -179.999,180.
type_ – The type of chart in the region.
numColours – An approximation of the number of colours used in the region.
bgColour – The background colour of the region.
embText – Specifies whether the region also contains text.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='ChartRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='ChartRegionType')[source]¶
- class ocrd_models.ocrd_page.ChemRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, orientation=None, bgColour=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
Regions containing chemical formulas.
orientation – The angle by which the rectangle encapsulating the region has to be rotated clockwise to correct the present skew (negative values indicate anti-clockwise rotation). Range: -179.999,180.
bgColour – The background colour of the region.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='ChemRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='ChemRegionType')[source]¶
- class ocrd_models.ocrd_page.CoordsType(points=None, conf=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Polygon outline of the element as a path of points. No points may lie outside the outline of its parent, which in the case of Border is the bounding rectangle of the root image. Paths are closed by convention, i.e. the last point logically connects with the first (and at least 3 points are required to span an area). Paths must be planar (i.e. must not self-intersect).
conf – Confidence value (between 0 and 1).
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- validate_PointsType_patterns_ = [['^(([0-9]+,[0-9]+ )+([0-9]+,[0-9]+))$']]¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='CoordsType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='CoordsType')[source]¶
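The polygon conventions described for CoordsType can be illustrated with plain Python. The sketch below splits a points string into integer tuples, checks the minimum of three points, and computes a bounding box. All three helpers are hypothetical and only mirror the description; they are not the module's own implementation.

```python
from typing import List, Tuple

Point = Tuple[int, int]


def parse_points(points: str) -> List[Point]:
    """Split an 'x1,y1 x2,y2 ...' string into a list of (x, y) tuples."""
    result = []
    for pair in points.split():
        x, y = pair.split(',')
        result.append((int(x), int(y)))
    return result


def has_enough_points(polygon: List[Point]) -> bool:
    """A closed path needs at least 3 points to span an area."""
    return len(polygon) >= 3


def bounding_box(polygon: List[Point]) -> Tuple[int, int, int, int]:
    """Return (min_x, min_y, max_x, max_y) of the polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)
```

The point-count check is only a necessary condition. Testing that a path is planar, or that it lies inside its parent, needs a geometry library and is left out here.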
- class ocrd_models.ocrd_page.CustomRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, type_=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
Regions containing content that is not covered by the default types (text, graphic, image, line drawing, chart, table, separator, maths, map, music, chem, advert, noise, unknown).
type_ – Information on the type of content represented by this region.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='CustomRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='CustomRegionType')[source]¶
- class ocrd_models.ocrd_page.GlyphType(id=None, ligature=None, symbol=None, script=None, production=None, custom=None, comments=None, AlternativeImage=None, Coords=None, Graphemes=None, TextEquiv=None, TextStyle=None, UserDefined=None, Labels=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
script – The script used for the glyph.
production – Overrides the production attribute of the parent word / text line / text region.
custom – For generic use.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='GlyphType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='GlyphType')[source]¶
- exportChildren(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='GlyphType', fromsubclass_=False, pretty_print=True)[source]¶
- invalidate_AlternativeImage(feature_selector=None)[source]¶
Remove derived images from this segment (due to changed coordinates).
If feature_selector is not None, remove only images with matching @comments, e.g. feature_selector=cropped,deskewed.
- set_Coords(Coords)[source]¶
Set coordinate polygon by given CoordsType object. Moreover, invalidate self’s pc:AlternativeImage elements (because they will have been cropped with a bbox of the previous polygon).
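The interplay of set_Coords and invalidate_AlternativeImage can be sketched with stand-in classes. This is not the generated code: `Image`, `Segment`, `invalidate_images` and `set_points` are hypothetical names. The sketch also assumes that an image "matches" when its comma-separated comments contain any of the selected features.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Image:
    """Stand-in for a derived image; comments lists its features."""
    filename: str
    comments: str = ""


@dataclass
class Segment:
    """Stand-in for a segment such as a glyph (not the generated class)."""
    points: str
    images: List[Image] = field(default_factory=list)

    def invalidate_images(self, feature_selector: Optional[str] = None) -> None:
        if not feature_selector:
            # No selector: drop every derived image.
            self.images = []
            return
        selected = set(feature_selector.split(","))
        # Assumption: an image matches if it carries any selected feature.
        self.images = [
            img for img in self.images
            if not selected & set(img.comments.split(","))
        ]

    def set_points(self, points: str) -> None:
        # Changing the outline makes previously cropped images stale.
        self.points = points
        self.invalidate_images()
```

Dropping the images on every coordinate change is the conservative choice: an image cropped to the old bounding box can no longer be trusted to align with the new polygon.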
- class ocrd_models.ocrd_page.GraphemeBaseType(id=None, index=None, ligature=None, charType=None, custom=None, comments=None, TextEquiv=None, extensiontype_=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Base type for graphemes, grapheme groups and non-printing characters.
index – Order index of grapheme, group, or non-printing character within the parent container (graphemes or glyph or grapheme group).
charType – Type of character represented by the grapheme, group, or non-printing character element.
custom, comments – For generic use.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='GraphemeBaseType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='GraphemeBaseType')[source]¶
- class ocrd_models.ocrd_page.GraphemeGroupType(id=None, index=None, ligature=None, charType=None, custom=None, comments=None, TextEquiv=None, Grapheme=None, NonPrintingChar=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GraphemeBaseType
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
GraphemeBaseType
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='GraphemeGroupType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='GraphemeGroupType')[source]¶
- class ocrd_models.ocrd_page.GraphemeType(id=None, index=None, ligature=None, charType=None, custom=None, comments=None, TextEquiv=None, Coords=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GraphemeBaseType
Represents a sub-element of a glyph. Smallest graphical unit that can be assigned a Unicode code point.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
GraphemeBaseType
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='GraphemeType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='GraphemeType')[source]¶
- class ocrd_models.ocrd_page.GraphemesType(Grapheme=None, NonPrintingChar=None, GraphemeGroup=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Container for graphemes, grapheme groups and non-printing characters.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='GraphemesType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='GraphemesType')[source]¶
- class ocrd_models.ocrd_page.GraphicRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, orientation=None, type_=None, numColours=None, embText=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
Regions containing simple graphics, such as a company logo, should be marked as graphic regions.
orientation – The angle by which the rectangle encapsulating the region has to be rotated clockwise to correct the present skew (negative values indicate anti-clockwise rotation). Range: -179.999,180.
type_ – The type of graphic in the region.
numColours – An approximation of the number of colours used in the region.
embText – Specifies whether the region also contains text.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='GraphicRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='GraphicRegionType')[source]¶
- class ocrd_models.ocrd_page.GridPointsType(index=None, points=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Points with x,y coordinates.
index – The grid row index.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- validate_PointsType_patterns_ = [['^(([0-9]+,[0-9]+ )+([0-9]+,[0-9]+))$']]¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='GridPointsType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='GridPointsType')[source]¶
- class ocrd_models.ocrd_page.GridType(GridPoints=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Matrix of grid points defining the table grid on the page.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='GridType', pretty_print=True)[source]¶
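A grid is described as a matrix built from rows of points, each row carrying its own index. As a rough sketch under that description, the rows can be ordered by index and split into coordinate tuples. `build_grid` and its dictionary input are hypothetical and stand in for a list of GridPointsType objects.

```python
from typing import Dict, List, Tuple


def build_grid(rows: Dict[int, str]) -> List[List[Tuple[int, ...]]]:
    """Turn {row_index: 'x1,y1 x2,y2 ...'} into a matrix of (x, y) points."""
    matrix = []
    # Rows may arrive in any order; the row index defines their position.
    for index in sorted(rows):
        row = [tuple(int(v) for v in pair.split(','))
               for pair in rows[index].split()]
        matrix.append(row)
    return matrix
```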
- class ocrd_models.ocrd_page.ImageRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, orientation=None, colourDepth=None, bgColour=None, embText=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
An image is considered to be more intricate and complex than a graphic. These can be photos or drawings.
orientation – The angle by which the rectangle encapsulating the region has to be rotated clockwise to correct the present skew (negative values indicate anti-clockwise rotation). Range: -179.999,180.
colourDepth – The colour bit depth required for the region.
bgColour – The background colour of the region.
embText – Specifies whether the region also contains text.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='ImageRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='ImageRegionType')[source]¶
- class ocrd_models.ocrd_page.LabelType(value=None, type_=None, comments=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Semantic label.
value – The label / tag (e.g. ‘person’). Can be an RDF resource identifier (e.g. object of an RDF triple).
type_ – Additional information on the label (e.g. ‘YYYY-mm-dd’ for a date label). Can be used as predicate of an RDF triple.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='LabelType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='LabelType')[source]¶
- class ocrd_models.ocrd_page.LabelsType(externalModel=None, externalId=None, prefix=None, comments=None, Label=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
externalModel – Reference to external model / ontology / schema.
externalId – E.g. an RDF resource identifier (to be used as subject or object of an RDF triple).
prefix – Prefix for all labels (e.g. first part of a URI).
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='LabelsType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='LabelsType')[source]¶
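The RDF hints in the label descriptions suggest one possible reading: externalId as subject, the label's type as predicate and its value as object. The sketch below follows that reading only as an assumption. `Label` and `labels_to_triples` are hypothetical stand-ins, and the prefix attribute is deliberately left out because the description does not say how it combines with the labels.

```python
from typing import List, NamedTuple, Tuple


class Label(NamedTuple):
    """Stand-in for a semantic label: a value plus its type."""
    value: str
    type_: str


def labels_to_triples(external_id: str,
                      labels: List[Label]) -> List[Tuple[str, str, str]]:
    """Build (subject, predicate, object) tuples from a group of labels."""
    # Assumed reading: externalId is the subject, type the predicate,
    # value the object of each triple.
    return [(external_id, label.type_, label.value) for label in labels]
```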
- class ocrd_models.ocrd_page.LayerType(id=None, zIndex=None, caption=None, RegionRef=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='LayerType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='LayerType')[source]¶
- class ocrd_models.ocrd_page.LayersType(Layer=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Can be used to express the z-index of overlapping regions. An element with a greater z-index is always in front of another element with lower z-index.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='LayersType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='LayersType')[source]¶
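Because an element with a greater z-index is in front of one with a lower z-index, sorting layers by z-index in ascending order yields a back-to-front painting order. A minimal sketch with a stand-in `Layer` class follows; the class and `back_to_front` are hypothetical, not the generated LayerType.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Layer:
    """Stand-in for a layer: an id, a z-index and referenced region ids."""
    id: str
    z_index: int
    region_refs: List[str]


def back_to_front(layers: List[Layer]) -> List[str]:
    """List region ids in painting order: lowest z-index first."""
    ordered = sorted(layers, key=lambda layer: layer.z_index)
    return [ref for layer in ordered for ref in layer.region_refs]
```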
- class ocrd_models.ocrd_page.LineDrawingRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, orientation=None, penColour=None, bgColour=None, embText=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
A line drawing is a single colour illustration without solid areas.
orientation – The angle by which the rectangle encapsulating the region has to be rotated clockwise to correct the present skew (negative values indicate anti-clockwise rotation). Range: -179.999,180.
penColour – The pen (foreground) colour of the region.
bgColour – The background colour of the region.
embText – Specifies whether the region also contains text.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='LineDrawingRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='LineDrawingRegionType')[source]¶
- class ocrd_models.ocrd_page.MapRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, orientation=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
Regions containing maps.
orientation – The angle by which the rectangle encapsulating the region has to be rotated clockwise to correct the present skew (negative values indicate anti-clockwise rotation). Range: -179.999,180.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='MapRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='MapRegionType')[source]¶
- class ocrd_models.ocrd_page.MathsRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, orientation=None, bgColour=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
Regions containing equations and mathematical symbols should be marked as maths regions.
orientation – The angle by which the rectangle encapsulating the region has to be rotated clockwise to correct the present skew (negative values indicate anti-clockwise rotation). Range: -179.999,180.
bgColour – The background colour of the region.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='MathsRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='MathsRegionType')[source]¶
- class ocrd_models.ocrd_page.MetadataItemType(type_=None, name=None, value=None, date=None, Labels=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
type_ – Type of metadata (e.g. author).
name – E.g. imagePhotometricInterpretation.
value – E.g. RGB.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='MetadataItemType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='MetadataItemType')[source]¶
- class ocrd_models.ocrd_page.MetadataType(externalRef=None, Creator=None, Created=None, LastChange=None, Comments=None, UserDefined=None, MetadataItem=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
externalRef – External reference of any kind.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15" xmlns:None="http://www.w3.org/2001/XMLSchema" ', name_='MetadataType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='MetadataType')[source]¶
- class ocrd_models.ocrd_page.MusicRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, orientation=None, bgColour=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
Regions containing musical notations.
orientation – The angle by which the rectangle encapsulating the region has to be rotated clockwise to correct the present skew (negative values indicate anti-clockwise rotation). Range: -179.999,180.
bgColour – The background colour of the region.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='MusicRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='MusicRegionType')[source]¶
- class ocrd_models.ocrd_page.NoiseRegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
RegionType
Noise regions are regions where no real data lies, only false data created by artifacts on the document or scanner noise.
- member_data_items_ = []¶
- subclass = None¶
- superclass¶
alias of
RegionType
- export(outfile, level, namespaceprefix_='', namespacedef_='', name_='NoiseRegionType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='NoiseRegionType')[source]¶
- class ocrd_models.ocrd_page.NonPrintingCharType(id=None, index=None, ligature=None, charType=None, custom=None, comments=None, TextEquiv=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GraphemeBaseType
A glyph component without visual representation but with Unicode code point. Non-visual / non-printing / control character. Part of grapheme container (of glyph) or grapheme sub group.
- member_data_items_ = []¶
- subclass = None¶
- superclass¶
alias of
GraphemeBaseType
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='NonPrintingCharType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='NonPrintingCharType')[source]¶
- class ocrd_models.ocrd_page.OrderedGroupIndexedType(id=None, regionRef=None, index=None, caption=None, type_=None, continuation=None, custom=None, comments=None, UserDefined=None, Labels=None, RegionRefIndexed=None, OrderedGroupIndexed=None, UnorderedGroupIndexed=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Indexed group containing ordered elements. Optional link to a parent region of nested regions. The parent region doubles as reading order group. Only the nested regions should be allowed as group members. Position (order number) of this item within the current hierarchy level. Is this group a continuation of another group (from previous column or page, for example)? For generic use.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='OrderedGroupIndexedType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='OrderedGroupIndexedType')[source]¶
- get_AllIndexed(classes=None, index_sort=True)[source]¶
Get all indexed children sorted by their @index.
- Parameters:
classes (list) – Type of children (sans Indexed) to return. Default: ['RegionRef', 'OrderedGroup', 'UnorderedGroup']
index_sort (boolean) – Whether to sort by @index
- Returns:
a list of RegionRefIndexedType, OrderedGroupIndexedType, and UnorderedGroupIndexedType
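As a rough illustration of the selection and sorting described above, the following sketch uses plain stand-in objects rather than the generated PAGE classes. The `Child` dataclass and the `all_indexed` helper are hypothetical names introduced here; only the `classes` and `index_sort` parameters and the default class list are taken from the description.

```python
from dataclasses import dataclass

# Hypothetical stand-in for RegionRefIndexedType, OrderedGroupIndexedType
# and UnorderedGroupIndexedType; not part of ocrd_models.
@dataclass
class Child:
    kind: str   # 'RegionRef', 'OrderedGroup' or 'UnorderedGroup' (sans 'Indexed')
    index: int  # value of the @index attribute
    ref: str    # id of the referenced region or group

DEFAULT_CLASSES = ['RegionRef', 'OrderedGroup', 'UnorderedGroup']

def all_indexed(children, classes=None, index_sort=True):
    """Select children by kind and optionally sort them by index."""
    if classes is None:
        classes = DEFAULT_CLASSES
    selected = [c for c in children if c.kind in classes]
    if index_sort:
        selected.sort(key=lambda c: c.index)
    return selected

children = [
    Child('RegionRef', 2, 'r3'),
    Child('OrderedGroup', 0, 'g1'),
    Child('RegionRef', 1, 'r2'),
]
sorted_refs = [c.ref for c in all_indexed(children)]
only_region_refs = [c.ref for c in all_indexed(children, classes=['RegionRef'])]
unsorted_refs = [c.ref for c in all_indexed(children, index_sort=False)]
```

With `index_sort=False` the children come back in the order in which they were stored, which is why the last call leaves the list unsorted.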
- extend_AllIndexed(elements, validate_continuity=False)[source]¶
Add all elements in list elements, respecting @index order. With validate_continuity, check that all new elements come after all old elements (or raise an exception). Otherwise, ensure this condition silently (by increasing @index accordingly).
- class ocrd_models.ocrd_page.OrderedGroupType(id=None, regionRef=None, caption=None, type_=None, continuation=None, custom=None, comments=None, UserDefined=None, Labels=None, RegionRefIndexed=None, OrderedGroupIndexed=None, UnorderedGroupIndexed=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Numbered group (contains ordered elements). Optional link to a parent region of nested regions. The parent region doubles as reading order group. Only the nested regions should be allowed as group members. Is this group a continuation of another group (from previous column or page, for example)? For generic use.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='OrderedGroupType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='OrderedGroupType')[source]¶
- get_AllIndexed(classes=None, index_sort=True)[source]¶
Get all indexed children sorted by their @index.
- Parameters:
classes (list) – Type of children (sans Indexed) to return. Default: ['RegionRef', 'OrderedGroup', 'UnorderedGroup']
index_sort (boolean) – Whether to sort by @index
- Returns:
a list of RegionRefIndexedType, OrderedGroupIndexedType, and UnorderedGroupIndexedType
- extend_AllIndexed(elements, validate_continuity=False)[source]¶
Add all elements in list elements, respecting @index order. With validate_continuity, check that all new elements come after all old elements (or raise an exception). Otherwise, ensure this condition silently (by increasing @index accordingly).
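The continuity rule can be pictured with a small stand-alone sketch. It works on plain `(index, ref)` tuples instead of the generated classes, and the helper name `extend_indexed` is hypothetical. How the library chooses the increased index is not stated above; bumping a conflicting index to one past the highest index seen so far is an assumption of this sketch.

```python
def extend_indexed(existing, new, validate_continuity=False):
    """Append new (index, ref) items after the existing ones.

    Assumption: a conflicting index is raised to highest + 1.
    """
    highest = max((index for index, _ in existing), default=-1)
    merged = list(existing)
    for index, ref in sorted(new, key=lambda item: item[0]):
        if index <= highest:
            if validate_continuity:
                raise ValueError(
                    f"index {index} of {ref!r} does not come after {highest}")
            index = highest + 1  # silently repair the ordering
        highest = index
        merged.append((index, ref))
    return merged

existing = [(0, 'r1'), (1, 'r2')]
new = [(0, 'r3'), (5, 'r4')]
merged = extend_indexed(existing, new)
```

Here `r3` collides with the existing indexes and is moved to index 2, while `r4` already comes later and keeps index 5. Calling the helper with `validate_continuity=True` on the same input raises `ValueError` instead.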
- class ocrd_models.ocrd_page.PageType(imageFilename=None, imageWidth=None, imageHeight=None, imageXResolution=None, imageYResolution=None, imageResolutionUnit=None, custom=None, orientation=None, type_=None, primaryLanguage=None, secondaryLanguage=None, primaryScript=None, secondaryScript=None, readingDirection=None, textLineOrder=None, conf=None, AlternativeImage=None, Border=None, PrintSpace=None, ReadingOrder=None, Layers=None, Relations=None, TextStyle=None, UserDefined=None, Labels=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, MapRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Contains the image file name including the file extension. Specifies the width of the image. Specifies the height of the image. Specifies the image resolution in width. Specifies the image resolution in height. Specifies the unit of the resolution information referring to a standardised unit of measurement (pixels per inch, pixels per centimeter or other). For generic use. The angle the rectangle encapsulating the page (or its Border) has to be rotated in clockwise direction in order to correct the present skew (negative values indicate anti-clockwise rotation). (The rotated image can be further referenced via “AlternativeImage”.) Range: -179.999,180. The type of the page within the document (e.g. cover page). The primary language used in the page (lower-level definitions override the page-level definition). The secondary language used in the page (lower-level definitions override the page-level definition). The primary script used in the page (lower-level definitions override the page-level definition). The secondary script used in the page (lower-level definitions override the page-level definition). The direction in which text within lines should be read (order of words and characters), in addition to “textLineOrder” (lower-level definitions override the page-level definition). The order of text lines within a block, in addition to “readingDirection” (lower-level definitions override the page-level definition). Confidence value for whole page (between 0 and 1).
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ 
object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='PageType', pretty_print=True)[source]¶
- exportChildren(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='PageType', fromsubclass_=False, pretty_print=True)[source]¶
- property id¶
- get_AllRegions(classes=None, order='document', depth=0)[source]¶
Get all the *Region elements, or only those provided by classes. Return in document order, unless order is reading-order.
- Parameters:
classes (list) – Classes of regions that shall be returned, e.g. ['Text', 'Image']
order ("document"|"reading-order"|"reading-order-only") – Whether to return regions sorted by document order (document, default) or by reading order with regions not in the reading order at the end of the returned list (reading-order) or regions not in the reading order omitted (reading-order-only)
depth (int) – Recursive depth to look for regions at, set to 0 for all regions at any depth. Default: 0
- Returns:
a list of TextRegionType, ImageRegionType, LineDrawingRegionType, GraphicRegionType, TableRegionType, ChartRegionType, MapRegionType, SeparatorRegionType, MathsRegionType, ChemRegionType, MusicRegionType, AdvertRegionType, NoiseRegionType, UnknownRegionType, and/or CustomRegionType
For example, to get all text anywhere on the page in reading order, use:
'\n'.join(line.get_TextEquiv()[0].Unicode for region in page.get_AllRegions(classes=['Text'], depth=0, order='reading-order') for line in region.get_TextLine())
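The three order modes can be pictured with a flat sketch that works on region ids only. It is not the library's implementation: `order_regions` is a hypothetical helper, and nesting, depth and the classes filter are left out on purpose.

```python
def order_regions(document_ids, reading_order_ids, order='document'):
    """Arrange region ids according to the requested order mode."""
    if order == 'document':
        return list(document_ids)
    # Regions referenced by the reading order, in reading order.
    in_order = [rid for rid in reading_order_ids if rid in document_ids]
    if order == 'reading-order-only':
        return in_order
    if order == 'reading-order':
        # Regions missing from the reading order go to the end,
        # keeping their document order.
        rest = [rid for rid in document_ids if rid not in reading_order_ids]
        return in_order + rest
    raise ValueError(f"unknown order: {order!r}")

document_ids = ['r1', 'r2', 'r3', 'r4']
reading_order_ids = ['r3', 'r1']

by_document = order_regions(document_ids, reading_order_ids)
by_reading_order = order_regions(document_ids, reading_order_ids, 'reading-order')
reading_order_only = order_regions(document_ids, reading_order_ids, 'reading-order-only')
```

In this sample, `r2` and `r4` are not part of the reading order, so they trail the list in `reading-order` mode and disappear in `reading-order-only` mode.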
- get_AllAlternativeImages(page=True, region=True, line=True, word=True, glyph=True)[source]¶
Get all the pc:AlternativeImage in a document.
- Parameters:
page (boolean) – Get images on pc:Page level
region (boolean) – Get images on pc:*Region level
line (boolean) – Get images on pc:TextLine level
word (boolean) – Get images on pc:Word level
glyph (boolean) – Get images on pc:Glyph level
- Returns:
a list of AlternativeImageType
- invalidate_AlternativeImage(feature_selector=None)[source]¶
Remove derived images from this segment (due to changed coordinates).
If feature_selector is not None, remove only images with matching @comments, e.g. feature_selector=cropped,deskewed.
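A minimal sketch of the selection step, on plain `(filename, comments)` tuples rather than AlternativeImageType objects. What counts as "matching" is not spelled out above; this sketch assumes that an image matches when any of the comma-separated features in feature_selector occurs among the comma-separated features in its @comments. The helper name `split_by_selector` is hypothetical.

```python
def split_by_selector(images, feature_selector=None):
    """Split images into (kept, removed) according to the selector.

    Assumption: any shared feature between selector and comments matches.
    """
    if not feature_selector:
        # No selector: every derived image is removed.
        return [], list(images)
    wanted = {feature for feature in feature_selector.split(',') if feature}
    kept, removed = [], []
    for filename, comments in images:
        features = set((comments or '').split(','))
        if wanted & features:
            removed.append((filename, comments))
        else:
            kept.append((filename, comments))
    return kept, removed

images = [
    ('a.png', 'cropped'),
    ('b.png', 'cropped,deskewed'),
    ('c.png', 'binarized'),
    ('d.png', None),
]
kept, removed = split_by_selector(images, 'deskewed')
kept_all, removed_all = split_by_selector(images)
```

Images without any @comments are never matched by a selector in this sketch, so they are only removed when no selector is given.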
- set_Border(Border)[source]¶
Set coordinate polygon by given BorderType object. Moreover, invalidate self’s pc:AlternativeImage elements (because they will have been cropped with a bbox of the previous polygon).
- get_AllTextLines(region_order='document', respect_textline_order=True)[source]¶
Return all the TextLine elements in the document.
- Parameters:
region_order ("document"|"reading-order"|"reading-order-only") – Whether to return regions sorted by document order (document, default) or by reading order with regions not in the reading order at the end of the returned list (reading-order) or regions not in the reading order omitted (reading-order-only)
respect_textline_order (boolean) – Whether to respect @textLineOrder attribute
- Returns:
a list of TextLineType
- class ocrd_models.ocrd_page.PcGtsType(pcGtsId=None, Metadata=None, Page=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='PcGtsType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='PcGtsType')[source]¶
- exportChildren(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='PcGtsType', fromsubclass_=False, pretty_print=True)[source]¶
- property id¶
- get_AllAlternativeImagePaths(page=True, region=True, line=True, word=True, glyph=True)[source]¶
Get all the pc:AlternativeImage/@filename paths referenced in the PAGE-XML document.
- Parameters:
page (boolean) – Get images on pc:Page level
region (boolean) – Get images on pc:*Region level
line (boolean) – Get images on pc:TextLine level
word (boolean) – Get images on pc:Word level
glyph (boolean) – Get images on pc:Glyph level
- Returns:
a list of image filename strings
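To show which attributes this method refers to, the following sketch collects the same pc:AlternativeImage/@filename values with the standard library's ElementTree instead of ocrd_models. The sample document is abbreviated and not schema-valid (no Metadata, no Coords), and `alternative_image_paths` is a hypothetical helper. The order of the returned paths is simply document order here, which is not guaranteed to be what the library returns.

```python
import xml.etree.ElementTree as ET

PC = 'http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15'

# Abbreviated sample; a real PAGE-XML file carries Metadata, Coords etc.
SAMPLE = f"""
<pc:PcGts xmlns:pc="{PC}">
  <pc:Page imageFilename="page.tif" imageWidth="100" imageHeight="200">
    <pc:AlternativeImage filename="page.bin.png" comments="binarized"/>
    <pc:TextRegion id="r1">
      <pc:AlternativeImage filename="r1.crop.png" comments="binarized,cropped"/>
      <pc:TextLine id="l1">
        <pc:AlternativeImage filename="l1.png" comments="binarized,cropped"/>
      </pc:TextLine>
    </pc:TextRegion>
  </pc:Page>
</pc:PcGts>
"""

def alternative_image_paths(root, page=True, region=True, line=True,
                            word=True, glyph=True):
    """Collect AlternativeImage filenames on the selected hierarchy levels."""
    image_tag = f'{{{PC}}}AlternativeImage'
    paths = []
    for parent in root.iter():
        level = parent.tag.split('}')[-1]  # local name without namespace
        wanted = ((page and level == 'Page')
                  or (region and level.endswith('Region'))
                  or (line and level == 'TextLine')
                  or (word and level == 'Word')
                  or (glyph and level == 'Glyph'))
        if wanted:
            # findall with a bare tag only looks at direct children.
            paths.extend(img.get('filename') for img in parent.findall(image_tag))
    return paths

root = ET.fromstring(SAMPLE)
all_paths = alternative_image_paths(root)
region_paths = alternative_image_paths(root, page=False, line=False)
```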
- class ocrd_models.ocrd_page.PrintSpaceType(Coords=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Determines the effective area on the paper of a printed page. Its size is equal for all pages of a book (exceptions: title page, multipage pictures). It contains all living elements (except marginals) like body type, footnotes, headings, running titles. It does not contain the page number (if not part of the running title), marginals, signature mark, preview words.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='PrintSpaceType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='PrintSpaceType')[source]¶
- class ocrd_models.ocrd_page.ReadingOrderType(conf=None, OrderedGroup=None, UnorderedGroup=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Definition of the reading order within the page. To express a reading order between elements they have to be included in an OrderedGroup. Groups may contain further groups. Confidence value (between 0 and 1)
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='ReadingOrderType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='ReadingOrderType')[source]¶
- class ocrd_models.ocrd_page.RegionRefIndexedType(index=None, regionRef=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
Numbered region. Position (order number) of this item within the current hierarchy level.
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='RegionRefIndexedType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='RegionRefIndexedType')[source]¶
- class ocrd_models.ocrd_page.RegionRefType(regionRef=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶
- export(outfile, level, namespaceprefix_='', namespacedef_='xmlns:pc="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"', name_='RegionRefType', pretty_print=True)[source]¶
- exportAttributes(outfile, level, already_processed, namespaceprefix_='', name_='RegionRefType')[source]¶
- class ocrd_models.ocrd_page.RegionType(id=None, custom=None, comments=None, continuation=None, AlternativeImage=None, Coords=None, UserDefined=None, Labels=None, Roles=None, TextRegion=None, ImageRegion=None, LineDrawingRegion=None, GraphicRegion=None, TableRegion=None, ChartRegion=None, SeparatorRegion=None, MathsRegion=None, ChemRegion=None, MusicRegion=None, AdvertRegion=None, NoiseRegion=None, UnknownRegion=None, CustomRegion=None, extensiontype_=None, gds_collector_=None, **kwargs_)[source]¶
Bases:
GeneratedsSuper
For generic use. Is this region a continuation of another region (in previous column or page, for example)?
- member_data_items_ = [<ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>, <ocrd_models.ocrd_page_generateds.MemberSpec_ object>]¶
- subclass = None¶
- superclass = None¶