[Koha-patches] [PATCH] (bug #4856) fix rebuild zebra to delete NSB/NSE chars
Frederic Demians
frederic at tamil.fr
Tue Jun 8 21:03:11 CEST 2010
> line 283
> It should remove the non sorting blocks (we read from docs.)
I didn't catch it. Thanks. My approach works for me: adding NSB/NSE as
word separators in the 'space' directive. Didn't it work for you? Did you
test it?
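A minimal Python sketch of what "NSB/NSE as word separators" means for tokenization. It is only an illustration, not Zebra code: it assumes NSB and NSE are the single characters \x88 and \x89, and the name tokenize is hypothetical.

```python
import re

# Assumption: NSB/NSE are the single characters \x88 and \x89.
NSB = "\x88"
NSE = "\x89"

# Whitespace, NSB and NSE all count as word separators.
SEPARATORS = re.compile("[\\s" + NSB + NSE + "]+")

def tokenize(text):
    """Split text into words, dropping the empty strings left by separators."""
    return [token for token in SEPARATORS.split(text) if token]

print(tokenize(NSB + "The " + NSE + "Lord of the Rings"))
# ['The', 'Lord', 'of', 'the', 'Rings']
```

With this approach the words inside the non-sorting block stay searchable; only the marker characters vanish from the tokens.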
> But that does not fit and when analysing the index process, it does not
> remove them.
Could this directive be wrong? It has two closing parentheses:
map (\x88.*\x89)) @
Have you tried:
map {\x88} @
map {\x89} @
or:
map <88> @
map <89> @
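The two kinds of directive aim at different results: the first tries to drop the whole non-sorting block, the alternatives drop only the two marker characters. A minimal Python sketch of that difference follows. It is not Zebra syntax and says nothing about how Zebra parses these directives; it assumes NSB and NSE are the single characters \x88 and \x89, and both function names are hypothetical.

```python
import re

# Assumption: NSB/NSE are the single characters \x88 and \x89.
NSB = "\x88"
NSE = "\x89"

# Non-greedy match, so two blocks in one field are removed separately.
BLOCK = re.compile(NSB + ".*?" + NSE)

def drop_markers(text):
    """Delete only the NSB/NSE characters; the enclosed words are kept."""
    return text.replace(NSB, "").replace(NSE, "")

def drop_blocks(text):
    """Delete every NSB...NSE block together with the text it encloses."""
    return BLOCK.sub("", text)

title = NSB + "The " + NSE + "Lord of the Rings"
print(drop_markers(title))  # The Lord of the Rings
print(drop_blocks(title))   # Lord of the Rings
```

A greedy `.*` would swallow everything between the first NSB and the last NSE in a field, which is why the sketch uses `.*?`.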
> Solution could be ICU. But using ICU also means losing some
> truncation attributes like fuzzy, or left truncation. Moreover ICU is
> quite picky about the way tokens are analyzed.
Zebra ICU should be explored further in order to alleviate Koha's weak
support for non-Latin characters.
> We also want to be able to propose some solution to those who are not
> willing to use ICU and install yet another dependency.
Your solution for the ICU-allergic is useful for everybody else too. A lot
of UNIMARC libraries want to keep NSB/NSE characters.
--
Frédéric DEMIANS
http://www.tamil.fr/u/fdemians.html