utf8mb4

1. Check the maximum length of columns and index keys


This is probably the most tedious part of the whole upgrading process.


When converting from utf8 to utf8mb4, the maximum length of a column or index key is unchanged in terms of bytes. Therefore, it is smaller in terms of characters, because the maximum length of a character is now four bytes instead of three.


For example, a TINYTEXT column can hold up to 255 bytes, which correlates to 85 three-byte or 63 four-byte characters. Let’s say you have a TINYTEXT column that uses utf8 but must be able to contain more than 63 characters. Given this requirement, you can’t convert this column to utf8mb4 unless you also change the data type to a longer type such as TEXT — because if you’d try to fill it with four-byte characters, you’d only be able to enter 63 characters, but not more.


The same goes for index keys. The InnoDB storage engine has a maximum index length of 767 bytes, so for utf8 or utf8mb4 columns, you can index a maximum of 255 or 191 characters, respectively. If you currently have utf8 columns with indexes longer than 191 characters, you will need to index a smaller number of characters when using utf8mb4. (Because of this, I had to change some indexed VARCHAR(255) columns to VARCHAR(191).)

1.1. innodb_large_prefix

Enable this option to allow index key prefixes longer than 767 bytes (up to 3072 bytes) for InnoDB tables that use DYNAMIC or COMPRESSED row format. (Creating such tables also requires the option values innodb_file_format=barracuda and innodb_file_per_table=true.)


Barracuda is the newest file format, it supports index key prefixes up to 3072 bytes.

[mysqld]
innodb_large_prefix=1
innodb_file_format=barracuda
innodb_file_per_table=true


https://dev.mysql.com/doc/refm[..]code-conversion.html

2. Types


Type Size Description
CHAR[Length] Length bytes A fixed-length field from 0 to 255 characters long
VARCHAR[Length] String length + 1 or 2 bytes A variable-length field from 0 to 65,535 characters long
TINYTEXT String length + 1 bytes A string with a maximum length of 255 characters
TEXT String length + 2 bytes A string with a maximum length of 65,535 characters
MEDIUMTEXT String length + 3 bytes A string with a maximum length of 16,777,215 characters
LONGTEXT String length + 4 bytes A string with a maximum length of 4,294,967,295 characters

types bytes utf8 utf8mb4
TINYTEXT, CHAR 255 85 64
TEXT, VARCHAR 65.535 21.845 16.384
MEDIUMTEXT 16.777.215 5.592.405 4.194.304

3. Settings


'database_charset' => 'utf8mb4',
'database_collation' => 'utf8mb4_general_ci',


The recommended collation 'utf8mb4_general_ci' will work best for sites using a
Latin-1 character set, sites with non-Latin-1 character sets may wish to instead
use 'utf8mb4_unicode_ci'.

4. Issues

4.1. page

CREATE TABLE wacko_page (
    page_id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    version_id INT(10) UNSIGNED NOT NULL DEFAULT '1',
    owner_id INT(10) UNSIGNED NOT NULL DEFAULT '0',
    user_id INT(10) UNSIGNED NOT NULL DEFAULT '0',
    title VARCHAR(250) NOT NULL DEFAULT '',
    tag VARCHAR(250) NOT NULL DEFAULT '',
    supertag VARCHAR(250) NOT NULL DEFAULT '',
    menu_tag VARCHAR(250) NOT NULL DEFAULT '',
    depth INT(10) UNSIGNED NOT NULL DEFAULT '0',
    parent_id INT(10) UNSIGNED NOT NULL DEFAULT '0',
    created DATETIME NULL DEFAULT NULL,
    modified DATETIME NULL DEFAULT NULL,
    body MEDIUMTEXT NOT NULL,
    body_r MEDIUMTEXT NOT NULL,
    body_toc TEXT NOT NULL,
    formatting VARCHAR(20) NOT NULL DEFAULT 'wacko',
    edit_note VARCHAR(200) NOT NULL DEFAULT '',
    minor_edit TINYINT(1) UNSIGNED DEFAULT '0',
    page_size INT(10) UNSIGNED NOT NULL DEFAULT '0',
    license_id INT(10) UNSIGNED NOT NULL DEFAULT '0',
    reviewed TINYINT(1) UNSIGNED NOT NULL DEFAULT '0',
    reviewed_time DATETIME NULL DEFAULT NULL,
    reviewer_id INT(10) UNSIGNED NOT NULL DEFAULT '0',
    ip VARCHAR(15) NOT NULL DEFAULT '',
    latest TINYINT(1) UNSIGNED DEFAULT '1',
    handler VARCHAR(30) NOT NULL DEFAULT 'page',
    comment_on_id INT(10) UNSIGNED NOT NULL DEFAULT '0',
    comments INT(4) UNSIGNED NOT NULL DEFAULT '0',
    files INT(4) UNSIGNED NOT NULL DEFAULT '0',
    revisions INT(10) UNSIGNED NOT NULL DEFAULT '0',
    hits INT(10) UNSIGNED NOT NULL DEFAULT '0',
    theme VARCHAR(20) DEFAULT NULL,
    page_lang VARCHAR(2) NOT NULL DEFAULT '',
    commented DATETIME NULL DEFAULT NULL,
    description VARCHAR(250) NOT NULL DEFAULT '',
    keywords VARCHAR(250) BINARY NOT NULL DEFAULT '',
    footer_comments TINYINT(1) UNSIGNED NULL DEFAULT NULL,
    footer_files TINYINT(1) UNSIGNED NULL DEFAULT NULL,
    footer_rating TINYINT(1) UNSIGNED NULL DEFAULT NULL,
    hide_toc TINYINT(1) UNSIGNED NULL DEFAULT NULL,
    hide_index TINYINT(1) UNSIGNED NULL DEFAULT NULL,
    tree_level TINYINT(1) UNSIGNED NOT NULL DEFAULT '0',
    show_menu_tag TINYINT(1) UNSIGNED NOT NULL DEFAULT '0',
    allow_rawhtml TINYINT(1) UNSIGNED NULL DEFAULT NULL,
    disable_safehtml TINYINT(1) UNSIGNED NULL DEFAULT NULL,
    noindex TINYINT(1) UNSIGNED NULL DEFAULT '0',
    deleted TINYINT(1) UNSIGNED NULL DEFAULT '0',
    PRIMARY KEY (page_id),
    KEY idx_user_id (user_id),
    KEY idx_owner_id (owner_id),
    FULLTEXT KEY body (body),
    UNIQUE KEY idx_tag (tag),
    KEY idx_supertag (supertag),
    KEY idx_depth(depth),
    KEY idx_created (created),
    KEY idx_modified (modified),
    KEY idx_minor_edit (minor_edit),
    KEY idx_deleted (deleted),
    KEY idx_reviewed (reviewed),
    KEY idx_comment_on_id (comment_on_id),
    KEY idx_commented (commented),
    KEY idx_title (title)
) ENGINE=InnoDB COMMENT='' DEFAULT CHARSET=utf8mb4


#1071 – Specified key was too long; max key length is 767 bytes


title VARCHAR(191) NOT NULL DEFAULT '',
tag VARCHAR(191) NOT NULL DEFAULT '',
supertag VARCHAR(191) NOT NULL DEFAULT '',

4.2. file

CREATE TABLE wacko_file (
    file_id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
    page_id INT(10) UNSIGNED NOT NULL DEFAULT '0',
    user_id INT(10) UNSIGNED NOT NULL DEFAULT '0',
    file_name VARCHAR(250) NOT NULL DEFAULT '',
    file_lang VARCHAR(2) NOT NULL DEFAULT '',
    file_description VARCHAR(250) NOT NULL DEFAULT '',
    caption TEXT NOT NULL,
    license_id INT(10) UNSIGNED NOT NULL DEFAULT '0',
    uploaded_dt DATETIME NULL DEFAULT NULL,
    modified_dt DATETIME NULL DEFAULT NULL,
    file_size INT(10) UNSIGNED NOT NULL DEFAULT '0',
    picture_w INT(10) UNSIGNED NOT NULL DEFAULT '0',
    picture_h INT(10) UNSIGNED NOT NULL DEFAULT '0',
    file_ext VARCHAR(10) NOT NULL DEFAULT '',
    mime_type VARCHAR(255) NOT NULL DEFAULT '',
    hits INT(10) UNSIGNED NOT NULL DEFAULT '0',
    deleted TINYINT(1) UNSIGNED NULL DEFAULT '0',
    PRIMARY KEY (file_id),
    UNIQUE idx_page_id (page_id, file_name),
    KEY idx_page_id_2 (page_id, uploaded_dt),
    KEY idx_deleted (deleted),
    KEY idx_user_id (user_id)
) ENGINE=InnoDB COMMENT='' DEFAULT CHARSET=utf8mb4


#1071 – Specified key was too long; max key length is 767 bytes


file_name VARCHAR(191) NOT NULL DEFAULT '',