[Koha-bugs] [Bug 14759] Replacement for Text::Unaccent

Fri Dec 11 01:21:22 CET 2015

http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=14759

--- Comment #19 from David Cook <dcook at prosentient.com.au> ---
(In reply to Katrin Fischer from comment #17)
> Thx Yuval! It looks to me like using the new method would be a big step in
> the right direction.

I agree.

Text::Unaccent and Text::Unaccent::PurePerl don't appear to be comprehensive
enough to deal with many languages. While they seem to handle Latin and Greek
characters, they don't do so well with Arabic and Hebrew.
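
To make the difference concrete, here is a minimal sketch of one common
Unicode-based way to unaccent text: decompose to NFD, then drop nonspacing
combining marks. It is in Python purely for illustration (Koha itself is Perl,
where Unicode::Normalize offers the same normalization forms). It is not
Text::Unaccent's algorithm, and I'm only assuming it resembles the new method
under discussion. The helper name strip_marks is made up for this example.

```python
import unicodedata


def strip_marks(text):
    """Decompose to NFD, drop nonspacing marks (category Mn), recompose to NFC."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = [ch for ch in decomposed if unicodedata.category(ch) != "Mn"]
    return unicodedata.normalize("NFC", "".join(kept))


# Sample strings written with escapes so the combining marks are explicit.
latin = "caf\u00e9"                                     # e with acute
greek = "\u0391\u03b8\u03ae\u03bd\u03b1"                # eta with tonos
hebrew = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd"   # letters plus niqqud
arabic = "\u0645\u064f\u062d\u064e\u0645\u0651\u064e\u062f"  # letters plus harakat

for label, sample in [("latin", latin), ("greek", greek),
                      ("hebrew", hebrew), ("arabic", arabic)]:
    print(label, sample, "->", strip_marks(sample))
```

Because the Hebrew vowel points and Arabic short-vowel marks are ordinary
combining characters, this approach treats them the same way as Latin and Greek
accents, without needing a per-script lookup table.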

Note that nothing seems to happen with the (Japanese?) ideograms that Galen
tested. I wonder if accents are even a thing with CJK languages... I've asked a
friend who knows Chinese for her input on that one. Oh, I know some people with
Japanese experience as well... I should ask them.

I think we should also ask Vietnamese users, as Vietnamese has a lot of
diacritics... and I think they might actually be quite significant.
(https://en.wikipedia.org/wiki/Vietnamese_alphabet#Tone_marks)

I'll update the listserv to ask for people with Vietnamese knowledge too... as
that could potentially answer Galen's question about whether or not we should
even be unaccenting userid values...
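
As a rough illustration of why that question matters, the sketch below (Python,
for illustration only) shows how Vietnamese syllables that differ only in their
tone marks all collapse to the same string once marks are stripped. The userid
list and the helper names strip_marks and find_collisions are hypothetical, and
the stripping approach (NFD plus removal of combining marks) is an assumption,
not necessarily what Koha does or will do.

```python
import unicodedata
from collections import defaultdict


def strip_marks(text):
    """Decompose to NFD, drop nonspacing marks (category Mn), recompose to NFC."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = [ch for ch in decomposed if unicodedata.category(ch) != "Mn"]
    return unicodedata.normalize("NFC", "".join(kept))


def find_collisions(userids):
    """Group userids by their unaccented form; keep groups with more than one member."""
    groups = defaultdict(list)
    for uid in userids:
        groups[strip_marks(uid)].append(uid)
    return {key: uids for key, uids in groups.items() if len(uids) > 1}


# Hypothetical userids: six differ only by Vietnamese tone marks.
userids = ["ma", "má", "mà", "mả", "mã", "mạ", "lan"]
collisions = find_collisions(userids)

for key, uids in collisions.items():
    print(key, "<-", uids)
```

If userids were unaccented before comparison, all six of those values would be
treated as the same login, which may or may not be what Vietnamese users expect.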
