OCR-D: DFG-funded Initiative for Optical Character Recognition Development


Punctuation

All punctuation marks (question mark, exclamation mark, period, comma, semicolon, colon, virgule) are transcribed as printed. Punctuation is not normalized to present-day conventions.

For punctuation characters, the Unicode block General Punctuation (U+2000–U+206F) applies.
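As an illustration of the rule above, the following minimal Python sketch lists the punctuation characters found in a transcribed line, with their code points and Unicode names, and notes whether each one lies in the General Punctuation block. It is not part of the OCR-D tooling; the function name `punctuation_report` and the sample line are assumptions made for this example. The sketch only reports what is present and does not normalize anything, in keeping with the guideline.

```python
import unicodedata

# Code point range of the Unicode block "General Punctuation" (U+2000..U+206F).
GENERAL_PUNCTUATION = range(0x2000, 0x2070)


def punctuation_report(text):
    """Describe every punctuation character in `text`.

    Hypothetical helper for illustration only. Returns a list of tuples:
    (character, code point, Unicode name, is in General Punctuation block).
    """
    report = []
    for ch in text:
        # Unicode general categories beginning with "P" denote punctuation.
        if unicodedata.category(ch).startswith("P"):
            report.append((
                ch,
                f"U+{ord(ch):04X}",
                unicodedata.name(ch, "UNKNOWN"),
                ord(ch) in GENERAL_PUNCTUATION,
            ))
    return report


# Sample line (an assumption, not taken from the guidelines).
sample = "Wer da? rief er, \u201eHalt!\u201c"
for entry in punctuation_report(sample):
    print(entry)
```

A report like this can help a transcriber see which code point was actually entered, for example whether a quotation mark is the printed low-9 form or a keyboard substitute.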

  • Dash
  • Quotation Marks
  • Spaces
Parent topic: Guidelines for the Ground Truth Transcription
Published by OCR-D.

The guidelines for Ground Truth transcription are based on the OCR-D specs v3.4.0.
