[Koha-devel] UTF-8 problems : a summary and some solutions

Joshua Ferraro jmf at liblime.com
Wed Aug 23 18:04:37 CEST 2006


On Wed, Aug 23, 2006 at 03:44:20PM +0200, Henri-Damien LAURENT wrote:
> OK for claiming MARC compliance, as soon as it is for ANY MARC flavor
> would it be UNIMARC or else.
Not sure I understand what you mean ...

> But I am not OK with saying that a pure UTF-8 Koha is already possible,
> since unfortunately it is not.
> I reported a display error and showed how you can reproduce these errors
> with a simple script.
>
> Both DBI and CGI are buggy in their UTF-8 handling, even though it is
> true they do not harm UTF-8 data. But if Perl is to cope with UTF-8 data,
> it has to be aware of that and encode things properly.
> Maybe it works for you because you do not have UTF-8 in *both* your
> Zebra records *and* your framework or other MySQL data.
> But for us it IS a blocking problem and we HAVE to cope with it.
> As long as we are working only with MySQL or only with Zebra, there are
> no problems, as I said.
> But we are not.
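
One common way this kind of display error arises is that UTF-8 bytes from
one source are treated as Latin-1 somewhere along the way and then encoded
again. The thread does not show the actual script, so the following is only
a sketch of that general failure mode, written in Python for
self-containment. The function names are hypothetical and nothing here is
taken from Koha, DBI, or CGI.

```python
# Hypothetical illustration of the bytes-versus-characters mismatch.
# Nothing here is Koha, DBI, or CGI code.

def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as if they were Latin-1.

    This mimics a layer that receives UTF-8 bytes but is not told so.
    """
    return text.encode("utf-8").decode("latin-1")


def decode_once(raw: bytes) -> str:
    """Decode UTF-8 bytes exactly once, at the boundary where they arrive."""
    return raw.decode("utf-8")


title = "Bibliothèque"
garbled = misread_as_latin1(title)  # the accented letter becomes two characters
clean = decode_once(title.encode("utf-8"))

print(garbled)
print(clean == title)
```

The stored bytes are never damaged in this scenario. Only the interpretation
differs, which fits the observation that the data is not harmed while the
display is wrong.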

http://wipoopac.liblime.com is pure UTF-8 throughout (Apache, MySQL
database, tables, columns, etc.), and it works just fine. The MARC data
was imported using the MARC::* suite and converted from MARC-8 to UTF-8.

What specific problems are you having?

> And we must be aware of that.
> But you gave me at least two pennies to think about :
> 1) LEADER HAS TO be well integrated (in MARChtml2xml in rel 3_0, it is
> not even generated)
> 2) MARC::Charset->ignore_errors can be used but is not the best solution
> since some data could be lost without notice.
The only data that is lost is the one invalid character that causes
the problem. For instance, if you are using MARC21 records, and your
record doesn't have a leader (or the leader claims to be MARC-8 encoded),
and you have an invalid latin1 character in the data, it will lose that
one character. But that character should not have been there in the first
place!
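
To make the two points above concrete, here is a minimal sketch in Python.
It is an analogy only: it is not MARC::Charset, and it uses UTF-8 decoding
because the Python standard library has no MARC-8 codec. The function names
are hypothetical. The one fact it relies on is that position 09 of a MARC21
leader is 'a' for UCS/Unicode and blank for MARC-8.

```python
from typing import Tuple


def declared_encoding(leader: str) -> str:
    """Read the character coding scheme from a MARC21 leader.

    Position 09 is 'a' for UCS/Unicode and blank for MARC-8. A missing or
    short leader is treated as MARC-8 here, which is an assumption of this
    sketch.
    """
    if len(leader) > 9 and leader[9] == "a":
        return "utf-8"
    return "marc-8"


def decode_dropping_invalid(raw: bytes) -> Tuple[str, int]:
    """Decode UTF-8 bytes, silently dropping invalid sequences.

    Returns the decoded text and the number of replacement markers a
    non-silent decode would have produced, so the loss can be reported.
    The count assumes the input holds no legitimate U+FFFD characters.
    """
    text = raw.decode("utf-8", errors="ignore")
    dropped = raw.decode("utf-8", errors="replace").count("\ufffd")
    return text, dropped


# 0xE9 is Latin-1 for "é" but is not valid UTF-8 in this position.
text, dropped = decode_dropping_invalid(b"caf\xe9 noir")
print(text, dropped)
```

Only the offending byte disappears and the rest of the field survives.
Counting the dropped sequences is one way to address the concern about data
being lost without notice.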

The third thing to be aware of is the UNIMARC flag that you can pass
to MARC::File::XML ... for some reason, I've mentioned this about
10 times to the UNIMARC guys, but so far no one has tested it!

Cheers,

-- 
Joshua Ferraro                       SUPPORT FOR OPEN-SOURCE SOFTWARE
President, Technology       migration, training, maintenance, support
LibLime                                Featuring Koha Open-Source ILS
jmf at liblime.com |Full Demos at http://liblime.com/koha |1(888)KohaILS




