OCR-D: DFG-funded Initiative for Optical Character Recognition Development

Umlauts

In the originals, umlauts can appear in different forms: either the forms used today (Ä, Ö, Ü, ä, ö, ü) or a base vowel with a superscript e. How each form is transcribed is defined separately for each transcription level.

  • Level 1
  • Level 2
  • Level 3
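
As an illustration of the two umlaut forms mentioned above, the following minimal Python sketch distinguishes a modern umlaut from a vowel with a superscript e at the Unicode level. It assumes the superscript e is encoded as U+0364 (COMBINING LATIN SMALL LETTER E); the function names `classify_umlauts` and `modernize` are hypothetical and not part of any OCR-D tool. Whether a given level keeps or normalizes the superscript e is defined on the level pages, so `modernize` only shows what such a normalization could look like.

```python
import unicodedata

COMBINING_DIAERESIS = "\u0308"  # the two dots of a modern umlaut, e.g. a + U+0308
COMBINING_SMALL_E = "\u0364"    # assumed encoding of the superscript e
UMLAUT_VOWELS = {"a", "o", "u"}


def classify_umlauts(text):
    """Return (base_letter, form) pairs for every umlaut-like sequence in text."""
    # NFD splits precomposed letters such as U+00E4 into base letter + U+0308,
    # so both encodings of a modern umlaut are handled the same way.
    decomposed = unicodedata.normalize("NFD", text)
    found = []
    for prev, ch in zip(decomposed, decomposed[1:]):
        if prev.lower() not in UMLAUT_VOWELS:
            continue
        if ch == COMBINING_DIAERESIS:
            found.append((prev, "modern"))
        elif ch == COMBINING_SMALL_E:
            found.append((prev, "superscript-e"))
    return found


def modernize(text):
    """Hypothetical normalization: replace a superscript e on a, o, u with a diaeresis."""
    decomposed = unicodedata.normalize("NFD", text)
    out = []
    for i, ch in enumerate(decomposed):
        follows_vowel = i > 0 and decomposed[i - 1].lower() in UMLAUT_VOWELS
        if ch == COMBINING_SMALL_E and follows_vowel:
            out.append(COMBINING_DIAERESIS)
        else:
            out.append(ch)
    # NFC recomposes base letter + diaeresis into the precomposed umlaut.
    # Note that this also recomposes any other decomposed characters in the text.
    return unicodedata.normalize("NFC", "".join(out))


if __name__ == "__main__":
    print(classify_umlauts("Mu\u0364ller"))
    print(modernize("Mu\u0364ller"))
```

A check like `classify_umlauts` could be used to verify that a transcription contains only the umlaut forms permitted at the chosen level before it is accepted as Ground Truth.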
Published by OCR-D.

The guidelines for Ground Truth transcription are based on the OCR-D specs v3.4.0.
