WackoWiki: Transliteration of Non-Roman Scripts

https://wackowiki.org/doc     Version: 4 (06.02.2022 11:30)

Unicode

Since R6.0, WackoWiki uses the Intl Transliterator class[link1] for the transliteration[link2] of file names and the normalization[link3] of page tags.


Creating a new language schema for WackoWiki: /lang/lang.xy.php

I am not entirely sure about note #4. I fixed lang.el.php according to the ISO charset of my language, but I am not
quite sure how WackoWiki uses these values. I am still trying to get to the bottom of this. You can edit the notes below (or even delete them) as you please.

Keep the following notes in mind when creating the lang.xy.php file for your language schema.

  1. The lang.xy.php file for your language schema must be saved in the charset of your language, not in UTF-8. For example, the Greek ISO charset is iso-8859-7, so the file lang.el.php must be encoded in iso-8859-7.

  2. To find the unicode_entities for your language, read the appropriate Unicode chart for your script at http://www.unicode.org/charts. The charts list code points in hexadecimal, so you must convert them to decimal numeric values.

  3. There are five values for transliteration (you can find more about this at Transliteration[link7]):
    • TranslitLettersFrom
    • TranslitLettersTo
    • TranslitBiLetters
    • TranslitCaps
    • TranslitSmall

TranslitLettersFrom and TranslitLettersTo must be equal in length, so that each character in the first has a counterpart at the same position in the second. WackoWiki converts from TranslitLettersFrom to TranslitLettersTo, so the URLs of wiki pages are in transliterated form. TranslitBiLetters is an array for converting letters that transliterate to digraphs.

My personal suggestion for TranslitLettersTo is to use only Latin-1 letters.
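
The mapping described above can be sketched in a few lines. This is only an illustration in Python, not WackoWiki's actual code: the sample letter strings, the digraph table and the function name `transliterate` are all hypothetical, and a real lang.xy.php defines its own values.

```python
# Hypothetical sample values for illustration only.
TRANSLIT_LETTERS_FROM = "αβγδεζικλμνοπρστυ"
TRANSLIT_LETTERS_TO = "abgdeziklmnoprstu"

# Letters that map to two Latin letters (digraphs).
TRANSLIT_BI_LETTERS = {"θ": "th", "ψ": "ps", "χ": "ch"}


def transliterate(text: str) -> str:
    """Convert text using the digraph table, then the one-to-one table."""
    # Both strings must be equal in length: position i maps to position i.
    if len(TRANSLIT_LETTERS_FROM) != len(TRANSLIT_LETTERS_TO):
        raise ValueError("letter tables must be equal in length")

    # Replace digraph letters first.
    for source, target in TRANSLIT_BI_LETTERS.items():
        text = text.replace(source, target)

    # Then map the remaining letters one to one.
    table = str.maketrans(TRANSLIT_LETTERS_FROM, TRANSLIT_LETTERS_TO)
    return text.translate(table)


if __name__ == "__main__":
    print(transliterate("ψαρι"))
```

Characters that appear in neither table are left unchanged in this sketch.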

  4. The UPPER_P, LOWER_P and ALPHA_P values must follow the ISO charset of your language. I found the correct values on Wikipedia, e.g. for ISO-8859-7: http://en.wikipedia.org/wiki/ISO-8859-7

Each value is written as \x followed by the hexadecimal position of the character in the charset table (row, then column). For example:
    • GREEK CAPITAL LETTER ALPHA = \xc1
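
Both conversions mentioned in the notes (Unicode chart value to decimal, and character to its \x position in the charset) can be checked with a short script. The following is a sketch in Python; the helper names `unicode_hex_to_decimal` and `charset_escape` are made up for this example and are not part of WackoWiki.

```python
def unicode_hex_to_decimal(hex_value: str) -> int:
    """Convert a code point from the Unicode charts (hex) to decimal."""
    return int(hex_value, 16)


def charset_escape(char: str, charset: str = "iso8859_7") -> str:
    """Return the \\x escape of a character in a single-byte charset."""
    raw = char.encode(charset)
    if len(raw) != 1:
        raise ValueError("character is not a single byte in this charset")
    return "\\x" + raw.hex()


if __name__ == "__main__":
    # GREEK CAPITAL LETTER ALPHA is U+0391 in the Unicode charts.
    print(unicode_hex_to_decimal("0391"))
    # Its position in ISO-8859-7 is row C, column 1.
    print(charset_escape("\u0391"))
```

For another language, pass the matching Python codec name (for example "iso8859_2") as the `charset` argument.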