WordPress 5.7 Upgrade Breaks Posts with Emoji in the Title?
Not sure what might be going on here, but after upgrading K-Squared Ramblings to WordPress 5.7, some posts started timing out.
Viewing the post…
Searching in the Posts interface for anything that was in the post…
Trying to list posts that had its tags…
I could search for it on the blog, but not in the admin area. Error logs on the web server showed a connection timeout. The only way I could find them was through the Search Regex plugin, and it turned out I could still edit the post. Though I couldn’t save any changes without getting the same timeout.
Both posts that I’d been alerted to had emoji in the title. Hmmm…
I tried removing the emoji from the title and saving it. The post loaded again! And I could search for keywords and tags that matched it!
Just for kicks, I pasted the emoji back in. Saved. No problem.
I don’t know what changed, but apparently older versions of WordPress were storing emoji in titles differently or something, and re-saving it fixes it?
I checked another blog that updated from 5.6 to 5.7 and had emoji – in some cases the same symbols – in titles, but it wasn’t affected by the same problem. I’d guess it’s either a combination with some plugin, or something left over from the fact that the site has been updated continuously from some very old versions of WordPress.
Anyway, that seems to work, so I’m posting this here for anyone else with the same issue to see what worked for me.
Tracking it Down
May update: It happened again on K2R, possibly after the upgrade to 5.7.1. This time I decided to dig into it in more detail and checked out the database. Because this is A Very Old Blog™, all the post columns (among others) were still using the Latin-1 character set, even though WordPress is working in Unicode now. So MySQL thinks it has Latin-1 data, but it’s actually UTF-8.
But you can’t just change the character set on live data. If the old charset matches the data, you need to convert it. If the old charset doesn’t match, you need to not convert it. Chances are you’ll either end up with double-encoded or broken data, like I got when I tried updating a column with PhpMyAdmin (after running a backup, of course!), and it deleted everything after the first emoji in each post!
Alex King’s post on converting mysql from latin1 to utf8 helped me solve it. Basically I took a copy of the mysqldump file I’d just created, searched for latin1 in the table definitions and replaced it with utf8, and re-imported the SQL file. So far everything I’ve spot-checked looks OK! Here’s hoping this actually fixed it right this time!
Or will I need to re-do the process with utf8mb4…? Update: yes.