Text in tables is always transcribed true to the original in Unicode (encoded as UTF-8), following the version of the Unicode standard in force at the time of transcription. Characters that are not available on the keyboard are written either
- as a Unicode hexadecimal entity or
- as the character itself.
Mixing different Unicode notations should be avoided. Nothing should be modernized, and printing errors are to be kept as they appear in the original.
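The two ways of writing a character that is not on the keyboard can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: "hexadecimal entity" is taken to mean an XML character reference of the form `&#x....;`, "not on the keyboard" is approximated as "outside ASCII", and the helper names `to_hex_entity` and `uses_mixed_notation` are hypothetical, not part of any tool named in these guidelines.

```python
import html
import re

# Assumption: a "Unicode hexadecimal entity" is an XML character
# reference such as &#x017F; (LATIN SMALL LETTER LONG S).
HEX_ENTITY = re.compile(r"&#x([0-9A-Fa-f]+);")


def to_hex_entity(ch: str) -> str:
    """Return the hexadecimal character reference for one character."""
    return f"&#x{ord(ch):04X};"


def uses_mixed_notation(text: str) -> bool:
    """Return True if the text contains both hexadecimal entities and
    literal non-ASCII characters (a rough stand-in for "characters
    that are not on the keyboard")."""
    has_entity = HEX_ENTITY.search(text) is not None
    has_literal = any(ord(c) > 127 for c in text)
    return has_entity and has_literal


long_s = "\u017f"                    # the literal character
entity = to_hex_entity(long_s)       # the same character as an entity

# html.unescape resolves the reference back to the literal character.
assert html.unescape(entity) == long_s
```

A check such as `uses_mixed_notation` could be run over each transcribed cell to flag places where both notations occur side by side; which of the two notations a project settles on is left open here.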
Exceptions and deviations are discussed in detail in this document.
The structure of the table is specified as attribute values in the TableRegion element. For more information, see the documentation for the PageXML format.
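As a rough sketch of how the attribute values of a TableRegion element could be read, the following Python snippet parses a small hand-written PageXML fragment with the standard library. The namespace URI (the 2019-07-15 PAGE schema version), the attribute names `rows`, `columns` and `lineSeparators`, the sample values, and the helper name `table_structure` are assumptions made for illustration; the PageXML documentation remains the authority on which attributes are actually allowed.

```python
import xml.etree.ElementTree as ET

# Assumption: the document uses the 2019-07-15 PAGE schema namespace.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"

# Hand-written sample; the attribute values are invented for illustration.
SAMPLE = f"""<PcGts xmlns="{PAGE_NS}">
  <Page imageFilename="example.png" imageWidth="1000" imageHeight="1000">
    <TableRegion id="t1" rows="3" columns="4" lineSeparators="true">
      <Coords points="0,0 100,0 100,100 0,100"/>
    </TableRegion>
  </Page>
</PcGts>"""


def table_structure(xml_text: str) -> list[dict[str, str]]:
    """Collect the attributes of every TableRegion element as dicts."""
    root = ET.fromstring(xml_text)
    tag = f"{{{PAGE_NS}}}TableRegion"
    return [dict(region.attrib) for region in root.iter(tag)]


tables = table_structure(SAMPLE)
```

Here `tables` holds one dictionary per TableRegion, with all attribute values as strings, so numeric values such as the row count would still need to be converted before use.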