User Tools

Site Tools


mysql:configure_utf8mb4

This is an old revision of the document!


MySQL - Configure utf8mb4

UTF-8 in MySQL isn't fully supported by the utf8 character set, it only supports a maximum of 3-byte characters. To have full UTF-8 support in MySQL then the utf8mb4 character set needs to be used.

These instructions show how to convert the entire database to be utf8mb4.

Back up your data first.

Update Database

Update the default character set and collation for your database.

For case-sensitive text set the character set to utf8mb4_bin.

ALTER DATABASE `mydatabase` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_bin;

For case-insensitive text, then the character set can be set to utf8mb4_unicode_ci instead.

ALTER DATABASE `mydatabase` CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Update Tables

Change the default character set of the table. This way when you add new columns in the future you don't need to worry about adding all of the character set specification.

ALTER TABLE `mytable1` DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_bin;

Update Table Columns

Now, you can convert one column at a time, and this may be what you wish to do if you require different character sets for your CHAR, VARCHAR, and TEXT columns, here's how you do that:

ALTER TABLE `mytable1` CHANGE `mycolumn` `mycolumn` VARCHAR(64) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_bin NOT NULL DEFAULT  '';

Now obviously you're going to want to make sure that you're converting to the same column type and length, etc, the above is for example only and if you copy/paste it, you may screw up your column schema. Essentially you're just using ALTER TABLE CHANGE on the column in order to change the character set to utf8mb4 and collation to utf8mb4_unicode_bin.

If on the other hand you just want to change the entire the entire table at once, you can do:

ALTER TABLE `mytable1` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_bin;

Update Table Indices

When changing the character type, you may run into this on InnoDB:

ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes

or this on MyISAM:

ERROR 1071 (42000): Specified key was too long; max key length is 1000 bytes

If you're using InnoDB on MySQL 5.6.3 or higher you can enable innodb_large_prefix in your MySQL config file (more information in manual http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix), but if you aren't you can take a few steps to work it out the old way:

  1. Make note of the conflicting index, it's going to likely be one which is something like VARCHAR(255) or an index across multiple columns which includes VARCHAR. Make note of the index name, type, and which column(s) it crosses.
  2. In my own scenario, I had a lot of columns which included some sort of VARCHAR(254) and ID which was binary(20). Now it seems like 254+20 = 274, and hey that's less than 767 (or 1000) so what's the deal? Well, MySQL doesn't count literal bytes in VARCHAR when it comes to Unicode, rather potential Unicode bytes are themselves counted as a byte (wait, what?). So if the column is 254 and it's utf8 that means the actual potential length is literally (254 * 3) bytes, and with utf8mb4 it's (254 * 4). So really the length of the key you're trying to create is 1)
1)
254 * 4) + 20). InnoDB only allows a maximum of 255 bytes for the column in an index with utf8 and 191 bytes for utf8mb4. So if you need the entire column indexed, you aren't going to want to change the character set for that column(s), and instead I recommend changing all others one by one (as seen in the Table Columns section) rather than trying to convert the entire table. However if you do not need the entire column to be index, and in certain cases I did not. Drop the index:
ALTER TABLE `mytable1` DROP INDEX `theindex`;
Then recreate it with the offending column(s) limited to 191:
ALTER TABLE `mytable1` ADD INDEX `theindex` (`mycolumn` (191));
or if across multiple columns (assuming mycolumnb is not utf8 for example):
ALTER TABLE `mytable1` ADD INDEX `theindex` (`mycolumn` (191),`mycolumnb`);
As long as the indices are the same, and in the same column order, you should receive the same benefits for the indices without worrying about redoing your queries.
mysql/configure_utf8mb4.1467540373.txt.gz · Last modified: 2020/07/15 09:30 (external edit)

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki