Transliteration of Non-Roman Scripts
Unicode
Since R6.0, WackoWiki uses the Intl Transliterator class to transliterate file names and normalize page tags.
Creating a new language schema for WackoWiki
/lang/lang.xy.php
I am not entirely sure about note #4. I even adjusted lang.el.php to match my language's ISO charset, but I am not
quite sure how WackoWiki uses these values. I am trying to get to the bottom of this. You can edit the notes below (or even delete them) as you please.
Keep the following notes in mind when creating the lang.xy.php file for your language schema.
- The lang.xy.php file for your language schema must be saved in your language's own charset and not in UTF-8. For example, the Greek ISO charset is ISO-8859-7, so the file lang.el.php must be encoded as ISO-8859-7.
- To find the unicode_entities for your language, read the appropriate language chart at http://www.unicode.org/charts. The charts give the values in hexadecimal, so you must convert them to decimal numeric values.
  - e.g. GREEK CAPITAL LETTER ALPHA (Α) is 391 (hex), which I converted to 913 (dec) in lang.el.php.
  - Unicode Entity Codes for Greek
  - How to Convert a Hex Number to a Decimal Number
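The hex-to-decimal step can be checked with a few lines of Python. This is only a sketch: the helper name `hex_to_dec` is made up for illustration, and the only value taken from the notes above is the Greek capital Alpha (391 hex, 913 dec).

```python
# Convert a hexadecimal code point, as printed in the Unicode charts,
# to the decimal value mentioned in the note above.
# hex_to_dec is a hypothetical helper name, not part of WackoWiki.
def hex_to_dec(hex_code: str) -> int:
    """Return the decimal value of a hex code point such as '391' or '0391'."""
    return int(hex_code, 16)


# GREEK CAPITAL LETTER ALPHA: 391 (hex) is 913 (dec).
alpha_dec = hex_to_dec("391")
print(alpha_dec)

# Cross-check against Python's own view of the character U+0391.
print(ord("\u0391") == alpha_dec)
```

Running the same function over every letter in your language's chart gives the decimal values without converting each one by hand.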
- There are five values for transliteration (you can find more about this at Transliteration):
  - TranslitLettersFrom
  - TranslitLettersTo
  - TranslitBiLetters
  - TranslitCaps
  - TranslitSmall
  - NpjLettersFrom and NpjLettersTo must match in length and value. WackoWiki converts from TranslitLettersFrom to TranslitLettersTo, so the URLs of WackoWiki pages are in transliterated form. TranslitBiLetters is an array for converting digraph letters.
  - My personal suggestion for TranslitLettersTo is to use only Latin-1 letters.
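To make the relationship between the letter tables concrete, here is a rough Python sketch of a position-by-position mapping plus a digraph table. The sample values, the constant names and the order of the two steps are all assumptions for illustration; they are not copied from any real lang.xy.php file, and WackoWiki's actual implementation may differ.

```python
# Hypothetical sample tables; a real language file defines its own values.
TRANSLIT_LETTERS_FROM = "αβγδ"   # source letters
TRANSLIT_LETTERS_TO = "abgd"     # Latin replacements, same length, same order
TRANSLIT_BI_LETTERS = {"θ": "th", "ψ": "ps"}  # letters that become digraphs


def transliterate(text: str) -> str:
    """Map each source letter to its Latin counterpart (assumed order: digraphs first)."""
    if len(TRANSLIT_LETTERS_FROM) != len(TRANSLIT_LETTERS_TO):
        raise ValueError("from/to letter tables must have the same length")
    # Replace the digraph letters first, then the one-to-one letters.
    for src, dst in TRANSLIT_BI_LETTERS.items():
        text = text.replace(src, dst)
    return text.translate(str.maketrans(TRANSLIT_LETTERS_FROM, TRANSLIT_LETTERS_TO))


print(transliterate("βαθ"))
```

The length check illustrates why the two letter strings have to line up: a one-to-one mapping by position is only defined when both strings contain the same number of characters.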
- The UPPER_P, LOWER_P and ALPHA_P values must follow your language's ISO charset. I found the correct values on Wikipedia, e.g. ISO-8859-7: http://en.wikipedia.org/wiki/ISO-8859-7
  - The values must be written as \x followed by the hex value from the charset table (row/col).
  - e.g. GREEK CAPITAL LETTER ALPHA (Α) = \xc1
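Instead of reading each row/column value off the charset table by hand, the \x escapes can be derived by encoding a character in the target charset. The following Python sketch assumes the ISO-8859-7 example used above; `charset_escape` is a hypothetical helper name, and how WackoWiki combines these escapes inside UPPER_P, LOWER_P and ALPHA_P is not covered here.

```python
# Derive the \x escape of a character in a single-byte charset.
# charset_escape is a hypothetical helper name, not part of WackoWiki.
def charset_escape(ch: str, charset: str = "iso-8859-7") -> str:
    """Return the \\xNN escape for one character in the given charset."""
    encoded = ch.encode(charset)  # raises UnicodeEncodeError if not representable
    if len(encoded) != 1:
        raise ValueError("expected a single-byte charset")
    # High nibble is the table row, low nibble is the column.
    return f"\\x{encoded[0]:02x}"


# GREEK CAPITAL LETTER ALPHA (U+0391) in ISO-8859-7
print(charset_escape("\u0391"))
```

For the Greek capital Alpha this prints \xc1, matching the value given in the note.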