1. ((http://doc.cat-v.org/plan_9/4th_edition/papers/utf The original paper from Bell Labs on UTF-8))
2. **((http://jkorpela.fi/chars.html A tutorial on character code issues))** - the MUST READ
3. ((https://en.wikipedia.org/wiki/Mojibake Mojibake))
2. ((https://web.archive.org/web/20070622045515/http://www.phpwact.org/php/i18n/charsets Character Sets / Character Encoding Issues))
3. ((https://web.archive.org/web/20070622045515/http://www.phpwact.org/php/i18n/utf-8 Handling UTF-8 with PHP))
1. ((http://www.w3.org/International/articles/unicode-migration/ Migrating to Unicode))
4. ((http://en.wikipedia.org/wiki/UTF-8 UTF-8))
5. ((https://en.wikipedia.org/wiki/Unicode_block Unicode block))
5. **((/Dev/Guidelines/UnicodeCheatSheet Unicode Cheat Sheet))**
5. ((http://htmlpurifier.org/docs/enduser-utf8.html UTF-8: The Secret of Character Encoding))
1. ((http://kunststube.net/encoding/ What every programmer absolutely, positively needs to know about encodings and character sets to work with text))
6. ((https://www.sitepoint.com/bringing-unicode-to-php-with-portable-utf8/ Bringing Unicode to PHP with Portable UTF-8))
7. W3C: ((http://www.w3.org/TR/charmod-norm/ Character Model for the World Wide Web: String Matching))
7. ((https://unicode.org Unicode Consortium))
6. ((https://unicode.org/reports/tr18/ Unicode Regular Expressions))
7. ((https://unicode.org/reports/tr9/ Unicode Bidirectional Algorithm))
9. ((https://unicode.org/reports/tr36/tr36-9.html#visual_spoofing Unicode Security Considerations))
10. ((http://unicode.org/reports/tr15/ Unicode Normalization Forms))
7. ((http://www.utf8everywhere.org/ UTF-8 Everywhere))
2. ((Quotes Quotes))
%%(info type="example")
̲ᴛ̲ʜ̲ᴇ̲ʀ̲ᴇ̲ ̲ɪ̲s̲ ̲ɴ̲ᴏ̲ ̲U̲ɴ̲ɪ̲ᴄ̲ᴏ̲ᴅ̲ᴇ̲ ̲ᴍ̲ᴀ̲ɢ̲ɪ̲ᴄ̲ ̲ʙ̲ᴜ̲ʟ̲ʟ̲ᴇ̲ᴛ̲ ̲
💩 𝔸 𝕤 𝕤 𝕦 𝕞 𝕖 𝔹 𝕣 𝕠 𝕜 𝕖 𝕟 𝕟 𝕖 𝕤 𝕤 💩
😈 ¡ƨdləɥ ƨᴉɥʇ ədoɥ puɐ ʻλɐp əɔᴉu ɐ əʌɐɥ ʻʞɔnl poo⅁ 😈
%%
===PHP 7.x/8.x===
* Strings in PHP are still fundamentally sequences of bytes; the language itself attaches no character encoding to them.
* It is up to developers to handle character encoding explicitly, using ((https://www.php.net/manual/en/book.mbstring.php mbstring)), ((https://www.php.net/manual/en/book.iconv.php iconv)), ((https://www.php.net/manual/en/class.uconverter.php UConverter)), etc.
* The ((https://www.php.net/manual/en/book.intl.php intl extension)) wraps ICU and provides in PHP 7/8 much of the functionality that was originally planned for PHP 6, such as normalization, collation, and grapheme-aware string functions.
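* PHP 7 helps by adding the Unicode codepoint escape syntax ##\u{[0-9A-Fa-f]+}##, which works in double-quoted strings and heredocs and is emitted as UTF-8 bytes, and the ((https://www.php.net/manual/en/class.intlchar.php IntlChar class)) for querying Unicode character properties.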
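The points above can be illustrated with a minimal sketch. It assumes the mbstring and intl extensions are loaded; the sample string and variable names are chosen for illustration only.

%%(php)
<?php
// "é" written with the PHP 7+ codepoint escape; PHP stores it as UTF-8 bytes.
$s = "caf\u{E9}";

var_dump(strlen($s));              // int(5): strlen() counts bytes, "é" takes two
var_dump(mb_strlen($s, 'UTF-8'));  // int(4): mb_strlen() counts code points
var_dump(bin2hex("\u{E9}"));       // string(4) "c3a9"

// Byte-oriented functions can cut a multi-byte character in half.
var_dump(mb_check_encoding(substr($s, 0, 4), 'UTF-8')); // bool(false)
var_dump(mb_substr($s, 0, 4, 'UTF-8') === $s);          // bool(true)

// Converting between encodings is always an explicit step.
$latin1 = mb_convert_encoding($s, 'ISO-8859-1', 'UTF-8');
var_dump(strlen($latin1));         // int(4): one byte per character in Latin-1

// Code points are not user-perceived characters: "e" + COMBINING ACUTE ACCENT.
$decomposed = "e\u{301}";
var_dump(mb_strlen($decomposed, 'UTF-8')); // int(2)
var_dump(grapheme_strlen($decomposed));    // int(1), from the intl extension
var_dump(Normalizer::normalize($decomposed, Normalizer::FORM_C) === "\u{E9}"); // bool(true)

// IntlChar exposes Unicode character properties.
var_dump(IntlChar::ord("\u{E9}"));     // int(233)
var_dump(IntlChar::isalpha("\u{E9}")); // bool(true)
%%

The encoding is passed explicitly to every ##mb_*## call here so that the result does not depend on the ##mbstring.internal_encoding## or ##default_charset## settings of a particular installation.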
===Unicode & IDNA===
<[RFC 3986 supports neither IDNA nor non-ASCII characters. WHATWG URL supports both IDNA and Unicode characters, and it explicitly suggests that browsers should render the host component by displaying Unicode characters.
The recommendation is not just about user-friendliness: it is necessary for security, because it reduces the chance that a person is deceived by a misleading host name. E.g. ~“xn--google.com” could lead an unsuspecting reader to believe that it is a Google domain, but the IDNA domain in fact decodes to “䕮䕵䕶䕱.com”.]>
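As a rough sketch of how the ASCII and Unicode forms of a host relate in PHP, the intl functions ##idn_to_ascii()## and ##idn_to_utf8()## can be used. The helper ##displayHost()## and the sample host ##bücher.example## are hypothetical and only serve the illustration. The fallback to the ASCII form is a design assumption of this sketch, not something the WHATWG URL standard prescribes.

%%(php)
<?php
// Requires the intl extension.

// Hypothetical helper: return the Unicode form of a host for display,
// falling back to the unchanged ASCII form when decoding fails.
function displayHost(string $asciiHost): string
{
    $unicode = idn_to_utf8($asciiHost, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
    return $unicode === false ? $asciiHost : $unicode;
}

// Unicode to ASCII (Punycode) form, as sent over the wire.
$ascii = idn_to_ascii('bücher.example', IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
var_dump($ascii);                // string(21) "xn--bcher-kva.example"

// ASCII form back to what a user should see.
var_dump(displayHost($ascii));   // the UTF-8 string "bücher.example"

// The host from the quote above: shown decoded, the label no longer reads "google".
echo displayHost('xn--google.com'), "\n";
%%

Showing the decoded form lets the reader see which characters a label really contains, which is the point of the recommendation quoted above.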