Transcription Guidelines for Ground Truth
OCR-D: DFG-funded Initiative for Optical Character Recognition Development
How to Transcribe in Level 1
1. If the text to be transcribed can be recorded with Unicode characters, these
must be used exclusively. Apart from the vowel ligatures (such as æ and œ),
all ligatures are split.
2. If a character can only be formed by combining two characters, this
combination must be used.
3. If a character cannot be formed by combining several characters and a MUFI
(Medieval Unicode Font Initiative) equivalent exists, the MUFI character is used.
4. If options 1, 2, and 3 are not possible, a code definition shall be used in
consultation with the OCR-D coordination project, following the joint agreements
reached in major international projects such as IMPACT, EEBO, and ECCO.
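
The rules above can be illustrated with a minimal Python sketch. It is not part of the guidelines or of any OCR-D tool. The function names `split_ligatures` and `classify`, the set `VOWEL_LIGATURES`, and the returned labels are assumptions made for illustration. The sketch also assumes that characters encoded outside the Unicode standard, as in MUFI, appear as Private Use Area code points, and it does not check whether a MUFI equivalent actually exists.

```python
import unicodedata

# Assumed set of vowel ligatures that stay unsplit (illustrative only).
VOWEL_LIGATURES = {"æ", "Æ", "œ", "Œ"}


def split_ligatures(text: str) -> str:
    """Split ligature code points into their parts, keeping vowel ligatures."""
    out = []
    for ch in text:
        decomp = unicodedata.decomposition(ch)  # e.g. "<compat> 0066 0069"
        is_ligature = "LIGATURE" in unicodedata.name(ch, "")
        if ch not in VOWEL_LIGATURES and is_ligature and decomp.startswith("<compat>"):
            # Use only the first decomposition level, so a long s stays a long s.
            out.append("".join(chr(int(cp, 16)) for cp in decomp.split()[1:]))
        else:
            out.append(ch)
    return "".join(out)


def classify(glyph: str) -> str:
    """Guess which rule a transcribed glyph falls under (hypothetical labels)."""
    # Prefer a single precomposed Unicode character where one exists.
    glyph = unicodedata.normalize("NFC", glyph)
    categories = [unicodedata.category(ch) for ch in glyph]
    if "Co" in categories:
        # Private Use Area: candidate for the MUFI rule (assumption, not verified).
        return "private-use"
    if len(glyph) == 1:
        return "unicode"
    if all(cat.startswith("M") for cat in categories[1:]):
        # Base character followed only by combining marks.
        return "combination"
    return "other"


print(split_ligatures("\ufb01nden"))   # U+FB01 (fi ligature) is split
print(split_ligatures("C\u00e6sar"))   # the vowel ligature is kept
print(classify("u\u0364"))             # u + combining small e above
```

In this sketch, `u` followed by U+0308 is classified as `unicode` because NFC normalization yields the precomposed `ü`, whereas `u` followed by U+0364 has no precomposed form and falls under the combination rule. Anything labeled `other` would need a manual decision, including the agreed code definition for characters that none of the first three rules cover.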