𝔘𝔫𝔦𝔠𝔬𝔡𝔢 Resources
- The original paper from Bell Labs on UTF-8
- A tutorial on character code issues - the MUST READ
- Mojibake
- Character Sets / Character Encoding Issues
- Handling UTF-8 with PHP
- Migrating to Unicode
- http://en.wikipedia.org/wiki/UTF-8
- Unicode block
- Unicode Cheat Sheet
- http://htmlpurifier.org/docs/enduser-utf8.html
- What every programmer absolutely, positively needs to know about encodings and character sets to work with text
- https://www.sitepoint.com/brin[...]-with-portable-utf8/
- https://www.utf8-chartable.de/
- W3C: Character Model for the World Wide Web: String Matching
- https://unicode.org
- http://www.utf8everywhere.org/
- Quotes
̲ᴛ̲ʜ̲ᴇ̲ʀ̲ᴇ̲ ̲ɪ̲s̲ ̲ɴ̲ᴏ̲ ̲U̲ɴ̲ɪ̲ᴄ̲ᴏ̲ᴅ̲ᴇ̲ ̲ᴍ̲ᴀ̲ɢ̲ɪ̲ᴄ̲ ̲ʙ̲ᴜ̲ʟ̲ʟ̲ᴇ̲ᴛ̲ ̲
💩 𝔸 𝕤 𝕤 𝕦 𝕞 𝕖 𝔹 𝕣 𝕠 𝕜 𝕖 𝕟 𝕟 𝕖 𝕤 𝕤 💩
😈 ¡ƨdləɥ ƨᴉɥʇ ədoɥ puɐ ʻλɐp əɔᴉu ɐ əʌɐɥ ʻʞɔnl poo⅁ 😈
PHP 7.x/8.x
- String literals in PHP are still fundamentally sequences of bytes.
- It is up to developers to handle character encoding themselves, using mbstring, iconv, UConverter, etc.
- The intl extension provides much of the functionality that was originally planned for PHP 6 and makes it available in PHP 7.x/8.x.
- PHP 7 helps by adding the Unicode codepoint escape syntax \u{[0-9A-Fa-f]+} for double-quoted strings and heredocs, and the IntlChar class.
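
The points above can be illustrated with a minimal sketch. It assumes UTF-8 is the working encoding; the variable names are illustrative only, and the mbstring and intl calls are guarded because those extensions may not be installed.

```php
<?php
// "é" written with the PHP 7+ codepoint escape (U+00E9).
// The escape is only interpreted in double-quoted strings and heredocs.
$s = "caf\u{E9}";

// strlen() counts bytes, not characters: "é" is two bytes in UTF-8.
$byteLength = strlen($s);        // 5
$hex = bin2hex("\u{E9}");        // "c3a9"

// mb_strlen() counts code points for the given encoding (requires mbstring).
$charLength = function_exists('mb_strlen') ? mb_strlen($s, 'UTF-8') : null;

// IntlChar exposes Unicode character properties (requires intl).
$name = class_exists('IntlChar') ? IntlChar::charName("\u{E9}") : null;

var_dump($byteLength, $hex, $charLength, $name);
```

With mbstring loaded, $charLength is 4 while $byteLength is 5. That gap is why byte-oriented functions such as strlen() or substr() can miscount or split multi-byte characters.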