[Koha-devel] Investigations on Perl, MySQL & UTF-8

Pierrick LE GALL pierrick at koha-fr.org
Fri Mar 10 12:49:11 CET 2006


Hi koha-devel,

Because the story of Perl, MySQL, UTF-8 and Koha is becoming more and
more complicated, I've decided to start my tests outside of Koha or any
web server. I wanted to check that Perl and MySQL could communicate
with UTF-8 data.

What I did:

1. copy some UTF-8 strings from
http://www.columbia.edu/kermit/utf8-t1.html and paste them into a UTF-8
text file, utf8.txt (open/paste in a UTF-8 console, with Vim having
:set encoding=utf-8)

2. create a UTF-8 database with a simple table having a TEXT field

$ mysql --user=root --password=xxx
mysql> CREATE DATABASE `utf8_test` CHARACTER SET utf8;
mysql> connect utf8_test
mysql> create table strings (id int, value text);
mysql> quit

(no need to set the connection character set to UTF-8 for this step;
the default latin1 is fine)

Note: my MySQL server is latin1...

$ mysql --user=root --password=xxx utf8_test
mysql> status
Server characterset:    latin1
Db     characterset:    utf8
Client characterset:    latin1
Conn.  characterset:    latin1
mysql> set names 'UTF8';
mysql> status
Server characterset:    latin1
Db     characterset:    utf8
Client characterset:    utf8
Conn.  characterset:    utf8

3. write and execute a Perl script which reads the UTF-8 text file,
inserts the UTF-8 strings into the database, retrieves them from the
database, and prints them to STDOUT. See details in the attached file
readfile_insertdb.pl. Important note: "set names 'UTF8';" is mandatory.
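
Since the attachment is not reproduced in the archive, here is a minimal sketch of what such a script could look like. It is not the attached readfile_insertdb.pl: the DSN, host and credentials are assumptions taken from the mysql session above, and the file is simply treated as raw bytes, which are already UTF-8 encoded.

```perl
#!/usr/bin/perl
# Sketch only: NOT the attached readfile_insertdb.pl.
use strict;
use warnings;
use DBI;

# Assumed connection details (database from step 2); adjust to your setup.
my $dbh = DBI->connect(
    'DBI:mysql:database=utf8_test;host=localhost',
    'root', 'xxx',
    { RaiseError => 1, AutoCommit => 1 },
);

# Tell the server that this connection sends and expects UTF-8.
$dbh->do("SET NAMES 'utf8'");

# Read the file as raw bytes and insert one row per non-empty line.
open(my $fh, '<', 'utf8.txt') or die "Cannot open utf8.txt: $!";
my $insert = $dbh->prepare('INSERT INTO strings (id, value) VALUES (?, ?)');
my $id = 0;
while (my $line = <$fh>) {
    chomp $line;
    next if $line eq '';
    $insert->execute(++$id, $line);
}
close($fh);

# Fetch the strings back and print the bytes unchanged to STDOUT.
my $select = $dbh->prepare('SELECT id, value FROM strings ORDER BY id');
$select->execute();
while (my ($row_id, $value) = $select->fetchrow_array()) {
    print "$row_id: $value\n";
}

$dbh->disconnect();
```

Without the SET NAMES statement the connection stays in latin1 (see the status output above), so the server would interpret the incoming UTF-8 bytes as latin1 characters.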

Everything is *working fine*. My output is in UTF-8, I'm 100% sure of
it.

DBD::mysql : 2.9007
      Perl : 5.8.7
     MySQL : 4.1.12-Debian_1ubuntu3.1-log
       DBI : 1.48

(find your local versions with attached script versions.pl)
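
A rough sketch of how these four versions can be collected, for those without the attachment. This is not the attached versions.pl; the connection details are again assumptions, and the MySQL version is obtained by asking the server with SELECT VERSION().

```perl
#!/usr/bin/perl
# Sketch only: NOT the attached versions.pl.
use strict;
use warnings;
use DBI;
use DBD::mysql;

# Assumed connection details; adjust to your setup.
my $dbh = DBI->connect(
    'DBI:mysql:database=utf8_test;host=localhost',
    'root', 'xxx',
    { RaiseError => 1 },
);

# Ask the server for its own version string.
my ($mysql_version) = $dbh->selectrow_array('SELECT VERSION()');

printf "DBD::mysql : %s\n",  $DBD::mysql::VERSION;
printf "      Perl : %vd\n", $^V;
printf "     MySQL : %s\n",  $mysql_version;
printf "       DBI : %s\n",  $DBI::VERSION;

$dbh->disconnect();
```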

I suspect that Paul's data stored in MySQL are not truly UTF-8. Maybe
I'm missing the point, but it seems Perl, MySQL and UTF-8 do not work
so badly together.
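
One way to test that suspicion on the Perl side is to check whether the bytes coming back from the database are well-formed UTF-8. The following is a small sketch using the Encode module; the helper name is_valid_utf8 is made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK);

# Hypothetical helper: returns 1 if the byte string is well-formed UTF-8,
# 0 otherwise.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # Decode a copy, because decode() with a CHECK value may modify its input.
    my $copy = $bytes;
    return eval { decode('UTF-8', $copy, FB_CROAK); 1 } ? 1 : 0;
}

# UTF-8 bytes for "café" (é is C3 A9)
print is_valid_utf8("caf\xC3\xA9") ? "valid\n" : "invalid\n";
# latin1 bytes for "café" (é is E9)
print is_valid_utf8("caf\xE9") ? "valid\n" : "invalid\n";
```

This prints "valid" for the first string and "invalid" for the second. Note that it only catches data that is not UTF-8 at all: a string that was encoded to UTF-8 twice is still well-formed and passes the check. In that case the stored bytes give it away, for example "é" shows up as C383C2A9 instead of C3A9 in the output of SELECT HEX(value).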

Cheers,

-- 
Pierrick LE GALL
INEO media system
-------------- next part --------------
A non-text attachment was scrubbed...
Name: readfile_insertdb.pl
Type: application/x-perl
Size: 1234 bytes
Desc: not available
URL: </pipermail/koha-devel/attachments/20060310/039b7f42/attachment-0004.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: versions.pl
Type: application/x-perl
Size: 658 bytes
Desc: not available
URL: </pipermail/koha-devel/attachments/20060310/039b7f42/attachment-0005.bin>

