Perl, MySQL and UTF-8
One of the mysteries of Perl to me is why DBD::mysql still has no UTF-8 support, even though the issue has been discussed on the msql-mysql-modules list since at least 2003 (according to the MARC archives) and MySQL itself supports UTF-8.
When I first looked into this, I found a few articles on the subject:
- utf-8 and DBD::mysql by Pedro Melo
- Movable Type, MySQL, Perl, Unicode by Zakaria "Zack" Ajmal: provides a patch for Movable Type 3.2
Pedro's article mentions that the reason this hasn't been done for DBD::mysql is that the DBI and DBD::mysql folks cannot decide where the UTF-8 implementation belongs, i.e. in DBI itself or in the DBD drivers. As a result, there is still no built-in support. To get around this, numerous patches have been produced. Andrew Forrest even put together UTF-8 versions of DBI and CGI.pm (the link seems broken at the moment). However, some of these patches seem to have problems, and all of them are non-standard.
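Without a patched driver, the usual fallback is to convert by hand at the database boundary with the core Encode module. The following is only a minimal sketch of that idea, not something taken from the articles above: the string in $raw is a stand-in for a value fetched through an unpatched DBD::mysql, and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Stand-in for a column value returned by a driver with no UTF-8 support:
# the UTF-8 encoded bytes of "café" (5 bytes, UTF-8 flag off).
my $raw = "caf\xc3\xa9";

# Decode the bytes into a Perl character string after fetching.
my $text = decode('UTF-8', $raw);

printf "bytes: %d, characters: %d\n", length($raw), length($text);

# Encode back to bytes before binding the value in an INSERT or UPDATE.
my $out = encode('UTF-8', $text);
```

This prints "bytes: 5, characters: 4". The catch is that every fetch and every bind has to be wrapped this way, which is why driver-level support is so much nicer.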
If you prefer to use an ORM, DBIx::Class and Class::DBI get around this by implementing UTF-8 support in their own libraries with DBIx::Class::UTF8Columns and Class::DBI::utf8 respectively. I'd recommend DBIx::Class over Class::DBI since it has more functionality (e.g. built-in JOIN support) and is supposed to generate more efficient SQL.
The interesting thing is that DBD::Pg for PostgreSQL has had built-in UTF-8 support for some time. While this is not an issue with the MySQL database itself, the state of UTF-8 support in the Perl driver is something to consider when choosing between MySQL and PostgreSQL.
Update: Thanks to Dominic Mitchell for mentioning that the latest developer release, DBD::mysql 3.0007_1, released on 8 Sep 2006, has integrated UTF-8 support. It's a developer release but good things are finally happening!
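In DBD::mysql the UTF-8 support is exposed as the mysql_enable_utf8 connection attribute. Here is a rough sketch of how a connection might be set up with it. The helper utf8_connect_args, the database name, the credentials and the MYSQL_UTF8_DEMO environment variable are all hypothetical, and I'm assuming a DBD::mysql version that knows the attribute. Depending on the version you may also need to run SET NAMES utf8 yourself, and the tables and columns have to be declared as utf8 on the MySQL side.

```perl
use strict;
use warnings;

# Hypothetical helper: builds the argument list for DBI->connect.
sub utf8_connect_args {
    my (%opt) = @_;
    my $dsn  = "DBI:mysql:database=$opt{database};host=$opt{host}";
    my %attr = (
        RaiseError        => 1,
        mysql_enable_utf8 => 1,  # ask DBD::mysql to treat text columns as UTF-8
    );
    return ($dsn, $opt{user}, $opt{password}, \%attr);
}

# Placeholder connection details, not real ones.
my @connect_args = utf8_connect_args(
    database => 'blog',
    host     => 'localhost',
    user     => 'app',
    password => 'secret',
);

# Only connect when asked to, since this needs DBI, DBD::mysql and a live server.
if ($ENV{MYSQL_UTF8_DEMO}) {
    require DBI;
    my $dbh = DBI->connect(@connect_args);
    $dbh->do('SET NAMES utf8');  # harmless if the driver already did this
    $dbh->disconnect;
}
```

With the attribute set, text fetched from the database should come back as Perl character strings, so no manual decoding is needed.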