[Koha-bugs] [Bug 10939] ICU does not transliterate polish special characters

Wed Nov 6 09:47:33 CET 2013

http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=10939

Jacek Ablewicz <abl at biblos.pk.edu.pl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |abl at biblos.pk.edu.pl

--- Comment #12 from Jacek Ablewicz <abl at biblos.pk.edu.pl> ---
I think you only need those two lines in etc/words-icu.xml:

<transliterate rule="{ ł > l "/>
<transliterate rule="{ Ł > l "/>

as all other Polish diacritics should behave correctly out of the box.
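
A rough way to see why only "ł" needs an explicit rule is to look at Unicode
decomposition. This is a minimal Python sketch, not what Zebra/ICU does
internally. It assumes the indexing chain folds accents by decomposing
characters and dropping combining marks, which may not match a given
words-icu.xml. The names strip_marks, EXTRA and fold are hypothetical and used
only for illustration.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Decompose to NFD and drop combining marks (decomposition-based folding)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical extra mapping that mirrors the two transliterate rules above
# (both the lowercase and the uppercase letter are mapped to lowercase "l").
EXTRA = str.maketrans({"ł": "l", "Ł": "l"})


def fold(text: str) -> str:
    """Apply the explicit l-with-stroke mapping, then strip combining marks."""
    return strip_marks(text.translate(EXTRA))


polish = "ąćęłńóśźż"
print(strip_marks(polish))  # acełnoszz  ("ł" has no decomposition, so it survives)
print(fold(polish))         # acelnoszz
```

Every Polish letter except "ł" has a canonical decomposition into a base
letter plus a combining mark, so generic accent stripping handles those and
leaves "ł, Ł" as the only ones needing a dedicated rule.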

The Polish "l with stroke" is indeed somewhat special (though it shouldn't be:
the general consensus is that "ł" is NOT a very different letter from "l").
The problem is that, a long time ago, someone at the Unicode Consortium
involved in UCA (Unicode Collation Algorithm) development made a rather
questionable decision not to treat "ł, Ł" as variants of "l, L". It has been a
major PITA from then on: for example, it also affects the MySQL
utf8_general_ci and utf8_unicode_ci collations, to name just a few side
effects.

BTW, I believe this was finally corrected in later UCA revisions (in v5.2.0+,
AFAIRC), so this workaround may no longer be necessary in Koha installations
with more recent libicu versions (?).
