[Koha-bugs] [Bug 9579] Facets truncation broken for multi-byte characters

Fri Jan 31 17:41:34 CET 2014

http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=9579

--- Comment #25 from Galen Charlton <gmcharlt at gmail.com> ---
Created attachment 24959
  -->
http://bugs.koha-community.org/bugzilla3/attachment.cgi?id=24959&action=edit
Bug 9579: fix truncation of facets containing multi-byte characters

We seem to be relying on whatever Zoom::Results->render returns, and
Perl doesn't treat that as Unicode character data unless told to. That's
why CORE::substr (and probably CORE::length too) counts bytes and cuts in
the middle of multi-byte characters.

This patch just decodes the UTF-8 data that render() returns, so Perl
treats it as a character string and truncates on character boundaries.

It uses Encode::decode_utf8 which is already a dependency for the current
stable Koha releases.
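
For illustration, here is a minimal sketch of the difference between
truncating raw UTF-8 octets and truncating a decoded string. It is not
the code from the patch: truncate_label() is a hypothetical helper, the
sample label is made up, and the limit of 5 is only chosen to keep the
example short.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 encode_utf8);

# Hypothetical helper, not Koha's code: truncate a label to $limit characters.
sub truncate_label {
    my ($octets, $limit) = @_;
    my $label = decode_utf8($octets);    # UTF-8 octets -> character string
    return length($label) > $limit
        ? substr($label, 0, $limit)
        : $label;
}

# Six Arabic letters; each one takes two bytes in UTF-8.
my $chars  = "\x{0645}\x{0643}\x{062A}\x{0628}\x{0627}\x{062A}";
my $octets = encode_utf8($chars);        # stands in for what render() returns

my $limit = 5;    # illustrative only; the test plan assumes 20

# Byte-wise cut: keeps two whole letters plus the first byte of the third.
my $cut_bytes = substr($octets, 0, $limit);

# Character-wise cut: keeps five whole letters.
my $cut_chars = truncate_label($octets, $limit);

printf "undecoded length: %d\n", length($octets);
printf "decoded length:   %d\n", length(decode_utf8($octets));
printf "byte-wise cut keeps %d bytes, character-wise cut keeps %d characters\n",
    length($cut_bytes), length($cut_chars);
```

The byte-wise cut ends in half a character, which is what a browser
shows as a replacement character. The decoded string still has to be
encoded again on output, which this sketch leaves out.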

REVISED TEST PLAN
-----------------
1) Import the attached sample records.
2) Rebuild your indexes.
3) In OPAC search for يكيمكتبات : قبسي ، كرم
-- The facets will show replacement characters (diamonds with question
   marks) where a label was cut in the middle of a character.
4) Apply the patch.
5) Search again.
-- The names will be properly truncated.
NOTE: This test assumes FacetLabelTruncationLength = 20.

Sponsored-by: Universidad Nacional de Cordoba

Signed-off-by: Mark Tompsett <mtompset at hotmail.com>
Signed-off-by: Katrin Fischer <Katrin.Fischer.83 at web.de>
Passes all tests and the QA script.
Works as described. Tested with several German and English records and
the Arabic test record. Arabic strings now display correctly and no
regression was found.

Signed-off-by: Galen Charlton <gmc at esilibrary.com>

I've reviewed it and approve its inclusion in 3.14.x and earlier.  I
will use the patches for bug 11096, once they pass QA, for the master
branch.

Signed-off-by: Galen Charlton <gmc at esilibrary.com>
