Where possible, diacritics are encoded using characters from the Unicode block Combining Diacritical Marks, e.g. the superscript o over u or U (ͦ, U+0366, COMBINING LATIN SMALL LETTER O).
When using combining characters, write the base character followed "seamlessly" by the diacritic, i.e. with nothing in between (e.g. uͦ, Uͦ).
The c with cedilla (ç, U+00E7, LATIN SMALL LETTER C WITH CEDILLA), the e caudata standing for ae (ę, U+0119, LATIN SMALL LETTER E WITH OGONEK), and the e with diaeresis (ë, U+00EB, LATIN SMALL LETTER E WITH DIAERESIS) are written as precomposed characters, without Combining Diacritical Marks.
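The two rules above can be illustrated with a minimal Python sketch that uses only the standard `unicodedata` module. The helper names `with_superscript_o` and `to_precomposed` are hypothetical and not part of these guidelines. Using NFC normalization to obtain the precomposed forms is an assumption about how a transcriber might check their text, not a step these guidelines prescribe.

```python
import unicodedata

# U+0366 COMBINING LATIN SMALL LETTER O, from the Combining Diacritical Marks block.
COMBINING_SMALL_O = "\u0366"


def with_superscript_o(base: str) -> str:
    """Return the base character immediately followed by U+0366 (hypothetical helper)."""
    return base + COMBINING_SMALL_O


def to_precomposed(text: str) -> str:
    """Apply NFC so that base + combining mark becomes a precomposed
    character wherever Unicode defines one (hypothetical helper)."""
    return unicodedata.normalize("NFC", text)


u_o = with_superscript_o("u")
# The result is two code points: the base letter, then the combining mark.
print([f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in u_o])

# c + COMBINING CEDILLA, e + COMBINING OGONEK, e + COMBINING DIAERESIS
# each collapse to a single precomposed character under NFC.
for decomposed in ("c\u0327", "e\u0328", "e\u0308"):
    composed = to_precomposed(decomposed)
    print(f"U+{ord(composed):04X} {unicodedata.name(composed)}")

# Unicode has no precomposed "u with superscript o", so NFC leaves it as two code points.
print(len(to_precomposed(u_o)))
```

Under these assumptions, NFC yields exactly the three precomposed characters named above, while uͦ stays a sequence of base character plus combining mark.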
The transcription is based on the German or Latin character set. Characters from other alphabets (Greek, Cyrillic, Hebrew, etc.) are encoded using their corresponding Unicode characters. The version of the Unicode Standard in force at the time of transcription applies. The Unicode code charts, which cover a large number of different cases, can be found at http://www.unicode.org/charts/.
The most important code charts at a glance:
- Standard Latin characters (Controls and Basic Latin)
- Supplement to the Latin character set (Controls and Latin-1 Supplement)
- Greek character set (Greek and Coptic)
- Extended Greek character set (Greek Extended)
- Cyrillic character set (Cyrillic)
- Combining diacritical marks (Combining Diacritical Marks)
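To check which of the blocks listed above a transcribed character belongs to, a small lookup table is enough. The following Python sketch hard-codes the code point ranges of those six blocks. The names `BLOCKS` and `block_of` are hypothetical, and the table deliberately covers only the blocks named in this list, so any other character is reported as `None`.

```python
from typing import Optional

# Code point ranges (inclusive) of the Unicode blocks listed above.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0300, 0x036F, "Combining Diacritical Marks"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x1F00, 0x1FFF, "Greek Extended"),
]


def block_of(ch: str) -> Optional[str]:
    """Return the name of the listed block containing ch, or None
    if ch lies outside the six blocks in the table (hypothetical helper)."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return None


# Report the block for each code point of a short mixed sample.
for ch in "u\u0366çαж":
    print(f"U+{ord(ch):04X}", block_of(ch))
```

A character such as ę (U+0119) returns `None` here because it belongs to a block that is not in the list above. A fuller check would need the complete block data published with the Unicode Standard.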