The technical printing conditions are reproduced, and signs are interpreted according to their use in the language and writing system. For ligatures, for example, only independent graphemes (vowel ligatures) that have their own code point are reproduced using standardized codes (Unicode). Ligatures that result from the printing technology (consonant ligatures) are generally split up. The information that a ligature was present is documented in the Ground Truth as a formatting specification (like bold or italic).
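The ligature rule can be illustrated with a minimal Python sketch. The mapping below is an assumption for illustration (it covers only the Unicode Latin presentation-form ligatures U+FB00 to U+FB06), and the function name split_ligatures and the returned span list are hypothetical. The guidelines themselves do not prescribe this mapping or any particular way of storing the ligature information.

```python
# Hypothetical mapping of typographic (consonant) ligatures to their parts.
# The set actually used in a project may differ.
CONSONANT_LIGATURES = {
    "\uFB00": "ff",
    "\uFB01": "fi",
    "\uFB02": "fl",
    "\uFB03": "ffi",
    "\uFB04": "ffl",
    "\uFB05": "\u017Ft",  # long s + t
    "\uFB06": "st",
}


def split_ligatures(text):
    """Split typographic ligatures and record where they occurred.

    Returns the transcription text plus a list of (start, end) offsets
    into that text, one per split ligature, so the information can be
    kept as a formatting annotation.
    """
    pieces = []
    spans = []
    offset = 0
    for ch in text:
        replacement = CONSONANT_LIGATURES.get(ch, ch)
        if replacement != ch:
            spans.append((offset, offset + len(replacement)))
        pieces.append(replacement)
        offset += len(replacement)
    return "".join(pieces), spans


# "Æ" is an independent grapheme with its own code point and is kept;
# the "fi" ligature (U+FB01) is split into "f" + "i".
text, ligature_spans = split_ligatures("\u00C6\uFB01x")
```

In this sketch the vowel ligature is left untouched because it is simply absent from the mapping. The offsets stand in for whatever formatting annotation the Ground Truth format provides.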
If a character can only be formed from a combination of two characters within the Unicode standard, this combination shall be used in preference to the encodings defined by the coordinating project.
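One way to check whether a character has its own code point or exists only as a combination is Unicode normalization. The sketch below uses Python's standard unicodedata module. The helper name has_single_code_point is hypothetical, and using NFC composition as the test is an assumption, not something the guidelines specify.

```python
import unicodedata


def has_single_code_point(base, mark):
    """Return True if base + combining mark composes to one code point.

    Assumption: NFC composition is used as the test for whether a
    precomposed character exists in Unicode.
    """
    composed = unicodedata.normalize("NFC", base + mark)
    return len(composed) == 1


# "a" + COMBINING DIAERESIS composes to the single code point U+00E4.
umlaut = unicodedata.normalize("NFC", "a\u0308")

# "a" + COMBINING LATIN SMALL LETTER E (U+0364) has no precomposed form,
# so the two-character combination is the encoding to use.
a_with_e_above = unicodedata.normalize("NFC", "a\u0364")
```

Under this assumption, a sequence that stays two code points long after NFC is one of the cases the rule describes, and the combination is transcribed as it is.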
Spaces are only reproduced as separators of words.
Punctuation marks are always attached to the preceding word.
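The two spacing rules above can be sketched as a small normalization step. The function name normalize_spacing and the set of punctuation marks are assumptions for illustration. The sketch does not treat opening quotation marks or brackets, which the rules above do not address.

```python
import re

# Assumed set of punctuation marks that attach to the preceding word.
TRAILING_PUNCTUATION = ".,;:!?"


def normalize_spacing(text):
    """Keep spaces only as word separators and attach punctuation."""
    # Collapse any run of whitespace into a single space.
    text = re.sub(r"\s+", " ", text).strip()
    # Drop the space before a punctuation mark so that the mark is
    # attached to the preceding word.
    pattern = r" ([" + re.escape(TRAILING_PUNCTUATION) + r"])"
    return re.sub(pattern, r"\1", text)


line = normalize_spacing("Hello   world , this  is it .")
# line is now "Hello world, this is it."
```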