[Koha-devel] Koha 3.0 and UTF-8

Paul POULAIN paul.poulain at free.fr
Wed Jan 4 16:53:44 CET 2006


utf8 is a go for beta testing in HEAD.

Some explanations of what I've done:

- updater/updatedatabase => will convert all tables to InnoDB (not 
related to utf8, just to warn you) AND set their default character set 
and collation to utf8 / utf8_general_ci. The SQL command is: ALTER TABLE 
tablename DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci.
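
To make the step above concrete, here is a minimal sketch of how the two statements per table could be generated. It is not the code from updater/updatedatabase (which is Perl); the function name and the example table list are assumptions for illustration only. Note also that, in MySQL, the DEFAULT CHARACTER SET form sets the table default used for new columns; rewriting existing text columns is a separate operation (ALTER TABLE ... CONVERT TO CHARACTER SET).

```python
def build_migration_statements(tables, charset="utf8", collation="utf8_general_ci"):
    """Return the ALTER TABLE statements for each table name.

    Hypothetical helper: it only builds SQL strings, it does not
    connect to a database.
    """
    statements = []
    for table in tables:
        # Quote the identifier with backticks, doubling any embedded backtick.
        quoted = "`" + table.replace("`", "``") + "`"
        # Storage engine change (unrelated to utf8, as the post warns).
        statements.append(f"ALTER TABLE {quoted} ENGINE=InnoDB")
        # The command quoted in the post.
        statements.append(
            f"ALTER TABLE {quoted} DEFAULT CHARACTER SET {charset} COLLATE {collation}"
        )
    return statements


if __name__ == "__main__":
    # "biblioitems" is mentioned in the post; a real run would list all tables.
    for statement in build_migration_statements(["biblioitems"]):
        print(statement + ";")
```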

- *-top.inc will show the pages in utf8

- THE HARD THING: for me, mysql-client and mysql-server were set up to 
communicate in iso8859-1, whatever the mysql collation! Thus, pages 
were displayed improperly, as the data was transmitted in iso8859-1! 
After a full day of investigation, someone on usenet pointed me to "SET 
NAMES 'utf8'", which tells the server that I want utf8 on this 
connection. I could put this in my.cnf, but if I do that, ALL databases 
will "speak" utf8, and that's not what we want. Thus, I added a line in 
Context.pm: every time a DB handle is opened, the communication is set 
to utf8.
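
The symptom described above can be reproduced without a database. The sketch below is only an illustration of the encoding mismatch, not the Context.pm change itself: the function name is made up, and the "wire" is simulated by encoding and decoding a string. It shows what a reader sees when bytes sent in iso8859-1 are displayed on a page declared as utf8, and what changes once both sides agree (which is what issuing SET NAMES 'utf8' on each new handle achieves for the connection).

```python
def as_seen_by_reader(text, wire_encoding, page_encoding):
    """Encode text as the DB connection would, then decode it as the page does.

    Hypothetical helper for illustration; undecodable bytes become U+FFFD.
    """
    raw = text.encode(wire_encoding)
    return raw.decode(page_encoding, errors="replace")


if __name__ == "__main__":
    title = "Bibliothèque"
    # Connection still in iso8859-1, page in utf8: the accented letter is lost.
    print(as_seen_by_reader(title, "iso-8859-1", "utf-8"))
    # Connection switched to utf8: the text survives intact.
    print(as_seen_by_reader(title, "utf-8", "utf-8"))
```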

- how to deal with MARC records? MARC records are in MARC-8 encoding, 
stored in biblioitems.marc in binary format, which is not modified by 
the alter table (fortunately, as it's a binary format!). But... I had 
created a marcxml column in this table, containing the XML output of 
the MARC record (a duplicate of the raw MARC record). I did not know 
what it would be used for, but now I know: the utf8 move converts the 
marcxml column, and the catalogue is moved to utf8 (with MARCgetbiblio 
using marcxml instead of raw marc)! The last question being: is 
biblioitems.marc (raw MARC record) still useful? I think not:
- iso2709 is limited to 99999 characters, tied to MARC-8 encoding, and binary.
- XML has none of those limitations.
Thus, my opinion is that we should get rid of iso2709 and use XML 
everywhere, except when exporting data in iso2709 format. But 
internally, we should use only XML.
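
For context on the 99999 limit mentioned above: an ISO 2709 record starts with a leader whose first five positions hold the total record length as zero-padded decimal digits, so no larger value can be written. The sketch below only illustrates that constraint; the function and constant names are assumptions, and it is not how Koha or any MARC library builds a leader.

```python
MAX_ISO2709_LENGTH = 99999  # largest value that fits in five decimal digits


def record_length_field(record_length):
    """Format a record length as the five-digit field at the start of the leader.

    Hypothetical helper: raises ValueError when the record cannot be
    represented in ISO 2709 because its length does not fit.
    """
    if record_length < 0:
        raise ValueError("record length cannot be negative")
    if record_length > MAX_ISO2709_LENGTH:
        raise ValueError(
            f"record of length {record_length} exceeds the ISO 2709 limit "
            f"of {MAX_ISO2709_LENGTH}"
        )
    return f"{record_length:05d}"


if __name__ == "__main__":
    print(record_length_field(1234))
    try:
        record_length_field(100000)
    except ValueError as error:
        print("cannot export:", error)
```

An XML serialisation has no such fixed-width length field, which is the contrast the post draws.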

Let me know if it works completely, correctly, poorly or not at all for you!
-- 
Paul POULAIN
Independent free software consultant
French-language coordinator for Koha (free ILS http://www.koha-fr.org)




