I want to set up MySQL server and all to it connections to it to use
utf8. For work, we are using MySQL server 5.5 and RHEL 6.5 (or the
equivalent CentOS on our Vagrant VMs).
Step 1. What is it currently using?
Step 2. Changing the server’s character set.
The my.cnf file has a number of sections:
Settings in the [mysqld] section apply to the server. Settings in
the [mysql] section apply to the mysql command line client. And
settings in the [client] section apply to all clients. Since I am
mainly trying to change how the server and our python (or ruby)
clients interact with the database, I want to make changes in the
[mysqld] and [client] sections.
If I add default-character-set = utf8 to the [mysqld] section, and
restart the server, my settings are now:
When I create a new database without specifying a character set, it is
created with server’s default, UTF8. That’s good. But the client and
connection character set still say latin1.
Step 3a. Setting the connection characterset explicitly
I can explicitly set the character set for character_set_client,
character_set_connection, and character_set_results after I have
logged into the database but doing either:
mysql> SET NAMES 'utf8';
OR
mysql> charset utf8
In either case, all of my character set variables are now utf8. With
‘set names’, the character set reverts to the default ‘latin1’ if the
client has to reconnect but with the ‘charset utf8’ command, the
change survives needing to reconnect. (Note the syntax differences
betweeen the 2 commands - set names requires quoting ‘utf8’ and a
semicolon to execute. Charset does not require either.)
But both of those commands have to be issued after the connection is
established - and need to be repeated each time I connect. I want
something more automatic.
Step 3b. Changing the client / connection character set.
It looked like I should be able to bet the client settings working by
adding the same line, default-character-set = utf8, to the
[client] section. But in my initial trials, that didn’t seem to
change anything. I also tried adding default-character-set = utf8 to
both the [client] and [mysql] sections. But I still get:
Some of the things I was reading indicated that my problem might be
that connections for superusers (like root) might behave differently
than connections for normal users. I tried creating a normal user and
then retesting various combinations of parameters - with the same
results.
I finally found something that consistently gave me utf8 connection
parameters without having to set them expliclty (see digression
below). In this article
about converting to a better version of MySQL utf8, utf8mb4, I see
this directive: character-set-client-handshake = FALSE. Adding that to
my [mysqld] section gives me consistent utf8 connection parameters
with root and other users. However, I don’t think I really want to do
that. I may want to be able to override the character set information
for some connections (for example if I am connecting to legacy
databases that are not in utf8). I think most of the frameworks I use
already pass a character set parameter with the connection - at least
Rails has a character set option in it’s database.yml config file.