Perl, MySQL and UTF-8

Posted Mon, 02 Oct 2006 15:35:00 GMT

One of the mysteries of Perl to me is why DBD::mysql still has no UTF-8 support, even though the issue has been discussed on the msql-mysql-modules list since at least 2003 (according to the MARC archives) and MySQL itself does have UTF-8 support.

When I first looked into this, I found a few articles on the subject:

Pedro's article mentions that the reason this hasn't been done for DBD::mysql is that the DBI and DBD::mysql folks cannot decide where to put the UTF-8 implementation, i.e. in DBI itself or in the DBD drivers. As a result, there is still no built-in support. To get around this, numerous patches have been produced. Andrew Forrest even put together UTF-8 versions of DBI and CGI.pm (link seems broken atm). However, some of these patches seem to have problems and are non-standard.

If you prefer to use an ORM, DBIx::Class and Class::DBI get around this by implementing UTF-8 support in their own libraries with DBIx::Class::UTF8Columns and Class::DBI::utf8 respectively. I'd recommend DBIx::Class over Class::DBI since it has more functionality (e.g. built-in JOIN support) and is supposed to generate more efficient SQL.

The interesting thing is that DBD::Pg for PostgreSQL has had built-in UTF-8 support for some time. While not an issue specific to the MySQL database itself, the UTF-8 Perl driver issue is something to consider when choosing between MySQL and PostgreSQL.

Update: Thanks to Dominic Mitchell for mentioning that the latest developer release, DBD::mysql 3.0007_1, released on 8 Sep 2006, has integrated UTF-8 support. It's a developer release but good things are finally happening!
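
For anyone who wants to try it, here is a minimal sketch of a connection that turns the new support on. It assumes the `mysql_enable_utf8` handle attribute that Dominic describes in the comments; the database name, the credentials, and the `utf8_connect_args` and `connect_utf8` helpers are hypothetical names of mine. The explicit `SET NAMES` is the workaround from the comments, included in case the driver build does not set the connection character set itself.

```perl
use strict;
use warnings;

# Build the argument list for DBI->connect. Kept separate from the actual
# connect so the settings can be inspected without a running server.
# The database name and credentials passed in are placeholders.
sub utf8_connect_args {
    my ($database, $user, $password) = @_;
    my $dsn  = "DBI:mysql:database=$database";
    my %attr = (
        RaiseError        => 1,    # die on errors instead of returning undef
        mysql_enable_utf8 => 1,    # assumed attribute: mark valid UTF-8 text as characters
    );
    return ($dsn, $user, $password, \%attr);
}

# Hypothetical helper: open the connection and also tell the server which
# character set the client speaks.
sub connect_utf8 {
    require DBI;    # loaded lazily so this file compiles without the driver
    my $dbh = DBI->connect(utf8_connect_args(@_));
    $dbh->do("SET NAMES 'utf8'");
    return $dbh;
}
```

With a real server this would be called as `my $dbh = connect_utf8('test', 'someuser', 'secret');`. Whether the attribute behaves identically across driver versions is something I have not verified, so treat it as a starting point.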

8 comments

Comments

  1. Dominic Mitchell said about 4 hours later:

    Wait no longer! DBD::mysql 3.007_01 has now got the same level of utf8 support as DBD::Pg. I sent the patch to Patrick Galbraith and it got in! That version is lacking tests, but they should be in the next version.

    It’s basically the same idea as in DBD::Pg. You set $dbh->{mysql_enable_utf8} and any text columns come back as UTF8 if they’re valid.

    I also managed to get the support into DBD::Pg a couple of years back. This was quite lucky as the patches are very similar. :-)

  2. John Wang said about 5 hours later:

    Wow, that’s exciting news! Thanks for posting about the developer release. Looks like I may try it out soon.

  3. Robbie Bow said 2 days later:

    Just a small correction: MySQL 5.0 does have UTF8 support. I’m not sure, but I think 4.1 also does.

  4. John Wang said 2 days later:

    Robbie: I think you misread the article since it says “MySQL does have UTF-8 support” with the issue being UTF-8 support in the DBD::mysql driver, not the MySQL database.

  5. Artem Russakovskii said about 1 year later:

    In case you need utf8 support in DBD::mysql before the aforementioned 3.007 release, you can use

    $dbh->do("SET NAMES 'utf8'");

    after getting the db handle.

  6. admin said about 1 year later:

    MySQL optimization

  7. Herb Chenault said about 1 year later:

    (this may accidentally have been posted twice..)

    Aug 14, 2008, with Perl 5.8.0, MySQL 5 and DBI 1.32, the saga continues: I did all of the above but had no success when writing data from the db out to utf8 files. After much searching and seeing all the confusion… I found an article on calling utf8::decode($$ref[$i]) on each row/column (where $ref comes from $sth->fetchrow_arrayref()), and it seemed to work, as several articles stated… Note that everything worked fine without utf8::decode when inserting data into the database from utf8/xml/parser. (I confirmed the insertion by using the MySQL browser and setting terminal emulation correctly to verify the utf8 symbols…)

    I continue to read that 5.8.6/8 is a better Perl version to use, and that later DBI/DBD versions are supposed to address this problem..

    I love Perl however this utf8 stuff is a mess..

  8. Herb Chenault said about 1 year later:

    Herb Chenault’s second post-> Concerning the DBD::mysql “mysql_enable_utf8”, it seems the DBD::mysql guys are strongly recommending version DBD::mysql 4.03/.04 because these latest versions have important utf8 bug fixes….

    just trying to help people w/ Perl utf8 since others are helping me…
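
The manual decoding workaround from comment 7 can be sketched as follows. This is a minimal illustration, assuming rows arrive as undecoded UTF-8 byte strings (as they would from a driver without UTF-8 support); the `decode_row` helper and the sample data are hypothetical, and no database is involved.

```perl
use strict;
use warnings;

# Decode every defined column of a fetched row in place.
# utf8::decode is a core function: it turns a UTF-8 byte string into a
# character string and returns false if the bytes are not valid UTF-8.
sub decode_row {
    my ($row) = @_;                   # array ref, e.g. from fetchrow_arrayref
    for my $value (@$row) {
        next unless defined $value;   # leave NULL columns alone
        utf8::decode($value);         # $value aliases the array element
    }
    return $row;
}

# Stand-in for a fetched row: "café" as raw UTF-8 bytes, plus a NULL.
my $row = [ "caf\xC3\xA9", undef ];
print length($row->[0]), "\n";    # 5 bytes before decoding
decode_row($row);
print length($row->[0]), "\n";    # 4 characters after decoding
```

Decoding by hand like this should only be needed when the driver is not already returning character strings; decoding a value that has already been decoded may fail or corrupt it.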