Guidelines for the Ground Truth Transcription
The Ground-Truth-Guidelines
Conventions for these Guidelines
Transcription
Level 1
Level 2
Level 3
Fundamentals of the Transcription
How to Transcribe in Level 1
How to Transcribe in Level 2
How to Transcribe in Level 3
Spellings and Symbols
Distinction between I and J
Level 1
Level 2 and 3
Distinction between u/f, u/v and v/u
Level 1
Level 2 and 3
s-Graphemes
Level 1
Level 2 and 3
r-Graphemes
Level 1
Level 2 and 3
Ligatures
Level 1
Level 2
Level 3
Umlauts
Level 1
Level 2
Level 3
Abbreviation Lines
Level 1
Level 2
Level 3
Diacritics
Level 1
Level 2
Level 3
Hyphenation
Level 1
Level 2 and 3
Numbers
Fractions
Roman Numerals
Superscript Numbers
Subscript Numbers
Tables
Handwritten annotations
Punctuation Basic Rules
Dash
Quotation Marks
Spaces
Level 1 and Level 2
Level 3
Comparison between Level 1, 2 and Level 3
Overviews and Examples
OCR-D Coordination Project Coding
Alphabets, Abbreviations and Special Characters
Ligatures
Layout and Structure
General Information
Print Space
Page Margin
ReadingOrder
Typographical Peculiarities
Typeface (TextStyle)
Ligatures (Level 1 and 2)
First Step: The Page Types
Second Step: Page Regions
Level 1
Level 2
Relations
TextRegion
Paragraph
Heading
Column header (header)
Page-number
Marginalia
Footnote (footnote / footnote-continued / endnote)
Initial (drop-capital)
Signature mark
Catch-word
Floating Elements in the Print Space (floating)
Table of Contents (TOC-entry)
Illustrations, photos (ImageRegion)
Book decoration, drawings (GraphicRegion)
Separation Lines, Separators (SeparatorRegion)
Level 1
Level 2
Tables (TableRegion)
Mathematical characters (MathsRegion)
Chemical symbols (ChemRegion)
Music Notation (MusicRegion)
Advertisement (AdvertRegion)
Damage, Dirt, Stains, Noise (NoiseRegion)
Other (UnknownRegion)
Documentation of the OCR-D Structure Ground Truth
Definition
Background
Overview and Concordance
Structure concordance METS and PAGE
Documentation of the PAGE XML Format for Page Content
Main Schema
Main schema pagecontent.xsd
Element
Element pc:PcGts
Complex Type
Complex Type pc:PcGtsType
Complex Type pc:MetadataType
Complex Type pc:UserDefinedType
Complex Type pc:UserAttributeType
Complex Type pc:MetadataItemType
Complex Type pc:LabelsType
Complex Type pc:LabelType
Complex Type pc:PageType
Complex Type pc:AlternativeImageType
Complex Type pc:BorderType
Complex Type pc:CoordsType
Complex Type pc:PrintSpaceType
Complex Type pc:ReadingOrderType
Complex Type pc:OrderedGroupType
Complex Type pc:RegionRefIndexedType
Complex Type pc:OrderedGroupIndexedType
Complex Type pc:UnorderedGroupIndexedType
Complex Type pc:RegionRefType
Complex Type pc:UnorderedGroupType
Complex Type pc:LayersType
Complex Type pc:LayerType
Complex Type pc:RelationsType
Complex Type pc:RelationType
Complex Type pc:TextRegionType
Complex Type pc:RegionType
Complex Type pc:RolesType
Complex Type pc:TableCellRoleType
Complex Type pc:ImageRegionType
Complex Type pc:LineDrawingRegionType
Complex Type pc:GraphicRegionType
Complex Type pc:TableRegionType
Complex Type pc:ChartRegionType
Complex Type pc:SeparatorRegionType
Complex Type pc:MathsRegionType
Complex Type pc:ChemRegionType
Complex Type pc:MusicRegionType
Complex Type pc:AdvertRegionType
Complex Type pc:NoiseRegionType
Complex Type pc:UnknownRegionType
Complex Type pc:CustomRegionType
Complex Type pc:GridType
Complex Type pc:GridPointsType
Complex Type pc:TextLineType
Complex Type pc:BaselineType
Complex Type pc:WordType
Complex Type pc:GlyphType
Complex Type pc:GraphemesType
Complex Type pc:GraphemeType
Complex Type pc:GraphemeBaseType
Complex Type pc:TextEquivType
Complex Type pc:NonPrintingCharType
Complex Type pc:GraphemeGroupType
Complex Type pc:TextStyleType
Complex Type pc:MapRegionType
Simple Type
Simple Type pc:ConfSimpleType
Simple Type pc:PointsType
Simple Type pc:GroupTypeSimpleType
Simple Type pc:ColourSimpleType
Simple Type pc:ChartTypeSimpleType
Simple Type pc:GraphicsTypeSimpleType
Simple Type pc:ColourDepthSimpleType
Simple Type pc:TextDataTypeSimpleType
Simple Type pc:ScriptSimpleType
Simple Type pc:ProductionSimpleType
Simple Type pc:LanguageSimpleType
Simple Type pc:ReadingDirectionSimpleType
Simple Type pc:TextTypeSimpleType
Simple Type pc:TextLineOrderSimpleType
Simple Type pc:AlignSimpleType
Simple Type pc:PageTypeSimpleType
Page XML Extensions
Exif-PageXML Concordance
Imprint