mysql:configure_utf8mb4
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
mysql:configure_utf8mb4 [2016/07/03 09:43] – peter | mysql:configure_utf8mb4 [2020/07/15 09:30] (current) – external edit 127.0.0.1 | ||
---|---|---|---|
Line 3: | Line 3: | ||
UTF-8 in MySQL isn't fully supported by the **utf8** character set, it only supports a maximum of 3-byte characters. | UTF-8 in MySQL isn't fully supported by the **utf8** character set, it only supports a maximum of 3-byte characters. | ||
+ | These instructions show how to convert the entire database to be **utf8mb4**. | ||
+ | **IMPORTANT: | ||
+ | The following instructions assume that a case-sensitive database is required throughout. | ||
+ | |||
+ | * For case-sensitive text set the character set to **utf8mb4_bin**. | ||
+ | * For case-insensitive text, the character set can be set to **utf8mb4_unicode_ci** instead. | ||
+ | |||
+ | |||
+ | |||
+ | ===== Back up your data ===== | ||
+ | |||
+ | Before making any change to a database it is always recommended that a full backup of the database is taken. | ||
+ | |||
+ | |||
+ | ===== Update Database ===== | ||
+ | |||
+ | Update the default character set and collation for your database. | ||
+ | |||
+ | <code mysql> | ||
+ | ALTER DATABASE `mydatabase` CHARACTER SET utf8mb4 COLLATE utf8mb4_bin; | ||
+ | </ | ||
+ | |||
+ | |||
+ | |||
+ | ===== Update Tables ===== | ||
+ | |||
+ | Change the default character set of the table. | ||
+ | |||
+ | <code mysql> | ||
+ | ALTER TABLE `mytable1` DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_bin; | ||
+ | </ | ||
+ | |||
+ | ===== Update Table Columns ===== | ||
+ | |||
+ | Now, you can convert one column at a time, and this may be what you wish to do if you require different character sets for your CHAR, VARCHAR, and TEXT columns, here's how you do that: | ||
+ | |||
+ | <code mysql> | ||
+ | ALTER TABLE `mytable1` CHANGE `mycolumn` `mycolumn` VARCHAR(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_bin NOT NULL DEFAULT | ||
+ | </ | ||
+ | |||
+ | Ensure you convert to the same column type and length, etc, the above is for example only and if you copy/paste it, you may screw up your column schema. | ||
+ | |||
+ | If on the other hand you just want to change the entire the entire table at once, you can do: | ||
+ | |||
+ | <code mysql> | ||
+ | ALTER TABLE `mytable1` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin; | ||
+ | </ | ||
+ | |||
+ | |||
+ | ===== Update Table Indices ===== | ||
+ | |||
+ | When changing the character type, you may run into this on InnoDB: | ||
+ | |||
+ | <code mysql> | ||
+ | ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes | ||
+ | </ | ||
+ | |||
+ | or this on MyISAM: | ||
+ | |||
+ | <code mysql> | ||
+ | ERROR 1071 (42000): Specified key was too long; max key length is 1000 bytes | ||
+ | </ | ||
+ | |||
+ | |||
+ | If you're using InnoDB on MySQL 5.6.3 or higher you can enable **innodb_large_prefix** in your MySQL config file (more information in manual http:// | ||
+ | |||
+ | * Make note of the conflicting index, it's going to likely be one which is something like VARCHAR(255) or an index across multiple columns which includes VARCHAR. | ||
+ | |||
+ | * utf8mb4 uses 4 bytes to store every character, therefore the maximum length of an index can be ~767/4=191. | ||
+ | |||
+ | <code mysql> | ||
+ | ALTER TABLE `mytable1` DROP INDEX `theindex`; | ||
+ | </ | ||
+ | |||
+ | Then recreate it with the offending column(s) limited to 191: | ||
+ | |||
+ | <code mysql> | ||
+ | ALTER TABLE `mytable1` ADD INDEX `theindex` (`mycolumn` (191)); | ||
+ | </ | ||
+ | |||
+ | |||
+ | or if across multiple columns (assuming mycolumnb is not utf8 for example): | ||
+ | |||
+ | <code mysql> | ||
+ | ALTER TABLE `mytable1` ADD INDEX `theindex` (`mycolumn` (191), | ||
+ | </ | ||
+ | |||
+ | As long as the indices are the same, and in the same column order, you should receive the same benefits for the indices without worrying about redoing your queries. | ||
+ | |||
+ | |||
+ | How does one find the offending index? | ||
+ | |||
+ | <code mysql> | ||
+ | mysql> DROP INDEX foo_1234 on foo; | ||
+ | Query OK, 0 rows affected (0.00 sec) | ||
+ | Records: 0 Duplicates: 0 Warnings: 0 | ||
+ | mysql> ALTER TABLE foo CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin; | ||
+ | Query OK, 0 rows affected (0.11 sec) | ||
+ | Records: 0 Duplicates: 0 Warnings: 0 | ||
+ | CREATE INDEX foo_1234 ON foo (`baz`(191)); | ||
+ | </ | ||
+ | |||
+ | ===== Additional Notes and Considerations ===== | ||
+ | |||
+ | If a column is not being used for search and case insensitivity isn't an issue, instead of using CHAR or VARCHAR, I suggest using BINARY and VARBINARY. | ||
+ | |||
+ | Additionally instead of using TEXT, use BLOB, for the same reasons, but also realize the same limitations apply, such as no fulltext searching. | ||
+ | |||
+ | In summation, if you don't need case sensitivity and you don't need fulltext search, consider BINARY, VARBINARY, and BLOB over CHAR, VARCHAR, and TEXT, it'll be a lot easier to deal with when it comes to Unicode. | ||
+ | |||
+ | You can learn more about this on my MySQL performance, | ||
+ | |||
+ | |||
+ | |||
+ | ===== Database Connections ===== | ||
+ | |||
+ | Depending on your programming language, you may need to specify when connecting which chartype to use (you can also, in most cases, specify this on configuration, | ||
+ | |||
+ | <code mysql> | ||
+ | SET NAMES utf8mb4; | ||
+ | </ | ||
+ | |||
+ | |||
+ | |||
+ | ===== Configuration ===== | ||
+ | |||
+ | You can edit your my.cnf (or my.ini on Windows) and make these changes to the appropriate sections of the configuration file (applicable to MySQL 5.6, older versions may need adjusted configuration): | ||
+ | |||
+ | <code mysql> | ||
+ | [client] | ||
+ | default-character-set = utf8mb4 | ||
+ | |||
+ | [mysql] | ||
+ | default-character-set = utf8mb4 | ||
+ | |||
+ | [mysqld] | ||
+ | character-set-client-handshake = FALSE | ||
+ | character-set-server = utf8mb4 | ||
+ | collation-server = utf8mb4_bin | ||
+ | </ |
mysql/configure_utf8mb4.1467539011.txt.gz · Last modified: 2020/07/15 09:30 (external edit)