[Koha-devel] Extending MARC::Charset::Table ?

Thu Jun 5 00:49:01 CEST 2014

Hi,

On Wed, Jun 4, 2014 at 11:56 AM, Philippe Blouin <
philippe.blouin at inlibro.com> wrote:

> We're using the MARC library for some migration, as usual, but we
> encountered some new issue with some arabic title: the key code 703
>   0x02BF 703 MODIFIER LETTER LEFT HALF RING ʿ   is not part of the Table
> db, which cause the whole subfield to disappear and causing us headaches.
>

What is the source character encoding of the records?  If the records are
already in UTF-8, then it is not necessary to transcode them to MARC-8 and
then back to UTF-8 for loading into Koha.  Adding the following line to
whatever code you're using to pre-process the records might help:

MARC::Charset->assume_unicode(1);

As an alternative, you could change the records to use 0x02bb rather
than 0x02bf.  I'm assuming that the strings in question are transliterated
Arabic following the ALA-LC Arabic romanization.  If so, back in 1999, the
mapping of the "ayn" character was changed from 0x02bf to 0x02bb. [1]
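
A minimal sketch of that substitution in plain Perl, assuming the field
data has already been decoded into Perl character strings.  The name
normalize_ayn is a hypothetical helper, not part of MARC::Charset, and the
sample title is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MARC::Charset): return a copy of the
# string with every U+02BF (MODIFIER LETTER LEFT HALF RING) replaced by
# U+02BB (MODIFIER LETTER TURNED COMMA).
sub normalize_ayn {
    my ($text) = @_;
    $text =~ tr/\x{02BF}/\x{02BB}/;   # one-to-one character replacement
    return $text;
}

# Made-up sample string that starts with U+02BF.
my $title = "\x{02BF}Abd al-Rahman";

binmode(STDOUT, ':encoding(UTF-8)');
print normalize_ayn($title), "\n";
```

You would call something like this on each subfield value before any
conversion step; how you walk the fields depends on the pre-processing
code you already have.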

[1] http://www.loc.gov/marc/marbi/2005/2005-05.html

Regards,

Galen
-- 
Galen Charlton
Manager of Implementation
Equinox Software, Inc. / The Open Source Experts
email:  gmc at esilibrary.com
direct: +1 770-709-5581
cell:   +1 404-984-4366
skype:  gmcharlt
web:    http://www.esilibrary.com/
Supporting Koha and Evergreen: http://koha-community.org &
http://evergreen-ils.org