OCR-D: DFG-funded Initiative for Optical Character Recognition Development

Spaces

Spaces are used, for example, to format text and to separate words or parts of words. The rules for setting spaces differ between the transcription levels.

  • Level 1 and Level 2
  • Level 3
  • Comparison between Level 1, 2 and Level 3
Published by OCR-D.

The guidelines for Ground Truth transcription are based on the OCR-D specs v3.4.0.
