Success!
It took way longer than I anticipated, but I finally solved the problem I described in my last post: Unicode characters being turned into question marks. (Now that the problem is solved, the Chinese characters appear as 張駿 and no longer as ??.)
As I suspected, the underlying MySQL database was set up such that Unicode characters disappeared when they were saved to the database. Specifically, the character set used was Latin1 when it should have been UTF-8. This appears to be a problem when Fantastico is used to install WordPress . . . the process that I illustrated and advocated in one of my training videos. I have to run a few more tests, but I suspect that doing a command-line install of WordPress will avoid all of these issues. Look for a revised training video with step-by-step instructions.
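To see why a Latin1 column loses these characters, here is a minimal Python sketch. It is only an analogy: Python's codecs stand in for the database's character set conversion, and the variable names are made up for illustration.

```python
# Illustration only: Python's codecs stand in for the database's conversion.
name = "張駿"

# Latin-1 has no code points for these characters, so each is replaced by "?".
as_latin1 = name.encode("latin-1", errors="replace")
print(as_latin1)  # b'??'

# UTF-8 can represent them, so the round trip is lossless.
as_utf8 = name.encode("utf-8")
print(as_utf8.decode("utf-8") == name)  # True
```

Once the question marks have been written, the original characters cannot be recovered from the stored bytes, which is why the character set has to be right before the data is saved.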
So, without going into great detail, here’s the gist of what I had to do today:
1. Back up WordPress’s MySQL database, always a good idea before a major update such as this.
2. Place a notice that the blog is out of service. I did this by uploading an index.html page, which overrode the index.php page that WordPress uses.
3. Run 49 ALTER TABLE statements to turn text and character columns into their blob and binary counterparts.
4. Run an ALTER DATABASE statement to change the default character set for the database to UTF-8. (I probably didn’t have to do this, as the character set was already UTF-8 at the database level. Irritatingly, it had been overridden at the table level to Latin1. Grrrrr.)
5. Run 10 ALTER TABLE statements to change the default character set for each table to UTF-8.
6. Run 49 ALTER TABLE statements to turn blob and binary columns back into their text and character counterparts, at the same time specifying character set UTF-8 at the column level.
7. Place the blog back online. I did this by simply renaming index.html to indexMAINT.html.
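The three kinds of ALTER TABLE statements in the list above can be generated rather than typed by hand. The following is a rough sketch, not the exact statements used here: the helper names are hypothetical, the sample columns are only examples, and the generated SQL omits attributes such as NOT NULL and DEFAULT, which a real MODIFY clause would need to repeat to preserve them.

```python
# Text/character types and the binary types that hold the same bytes.
TO_BINARY = {
    "CHAR": "BINARY",
    "VARCHAR": "VARBINARY",
    "TINYTEXT": "TINYBLOB",
    "TEXT": "BLOB",
    "MEDIUMTEXT": "MEDIUMBLOB",
    "LONGTEXT": "LONGBLOB",
}

def split_type(column_type):
    """Split 'varchar(200)' into ('VARCHAR', '(200)')."""
    base, paren, rest = column_type.partition("(")
    return base.strip().upper(), paren + rest

def to_binary_sql(table, column, column_type):
    """Turn a text/character column into its binary counterpart."""
    base, length = split_type(column_type)
    return f"ALTER TABLE `{table}` MODIFY `{column}` {TO_BINARY[base]}{length};"

def table_default_sql(table):
    """Change the table's default character set to UTF-8."""
    return f"ALTER TABLE `{table}` DEFAULT CHARACTER SET utf8;"

def to_utf8_sql(table, column, column_type):
    """Turn the column back into its original type, now declared as UTF-8."""
    base, length = split_type(column_type)
    return f"ALTER TABLE `{table}` MODIFY `{column}` {base}{length} CHARACTER SET utf8;"

# Example input only; the real list would come from the database's schema.
columns = [
    ("wp_posts", "post_content", "longtext"),
    ("wp_posts", "post_name", "varchar(200)"),
]

statements = [to_binary_sql(*col) for col in columns]
statements += [table_default_sql(t) for t in sorted({c[0] for c in columns})]
statements += [to_utf8_sql(*col) for col in columns]
print("\n".join(statements))
```

Printing the statements instead of executing them leaves room to review the SQL before running it against a backed-up database.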
Would that it had been this simple. When I inspected the blog posts after step 6 (converting the columns back to text), I discovered that all entries were truncated at the first special character, which is precisely the situation I had been led to believe these steps were designed to avoid. So I spent a good two hours manually fixing the 11 blog entries (out of 13) that had special characters.
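One possible explanation for the truncation, offered as a guess that nothing in this post confirms: if the special characters had been stored as genuine Latin1 bytes, relabeling those bytes as UTF-8 would produce invalid sequences, and text may be cut off at the first one. The Python sketch below mimics that effect with a made-up sample string and a hypothetical helper function.

```python
def valid_utf8_prefix(raw: bytes) -> str:
    """Return the text up to the first byte sequence that is not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first invalid byte.
        return raw[:err.start].decode("utf-8")

# Hypothetical sample: "é" stored as the single Latin-1 byte 0xE9.
stored = "caf\u00e9 menu".encode("latin-1")
print(valid_utf8_prefix(stored))  # caf
```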
But the job is done. Now all I have to do is make another backup of the successfully updated database.
Stay tuned for more WordPress tips as I encounter them.

