[Koha-bugs] [Bug 27299] New: Zebra phrase register is incorrectly tokenized
bugzilla-daemon at bugs.koha-community.org
Tue Dec 22 23:56:39 CET 2020
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27299
Bug ID: 27299
Summary: Zebra phrase register is incorrectly tokenized
Change sponsored?: ---
Product: Koha
Version: master
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P5 - low
Component: Searching - Zebra
Assignee: koha-bugs at lists.koha-community.org
Reporter: dcook at prosentient.com.au
Recently, I noticed issues with "exact" matching for authority linking when
using Zebra ICU.
I've documented those issues upstream on the idzebra project on GitHub:
https://github.com/indexdata/idzebra/issues/24
Adam Dickmeiss and I are still working through this issue, but I think the
most likely cause is that we are tokenizing strings for the "p" register when
we should not be.
In the Zebra CHR configuration, the "p" register is not tokenized. According to Zebra's
own documentation
(https://software.indexdata.com/zebra/doc/querymodel-zebra.html#querymodel-pqf-apt-mapping-structuretype),
the "p" register is supposed to be "Character normalized, but not tokenized
index for phrase matches".
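
To illustrate why that distinction matters for "exact" matching, here is a toy
Python sketch. It is not Zebra's or ICU's actual code: the function names and
the sample heading are hypothetical, and the normalization is deliberately
simplistic. It only models the difference between a register that keeps the
whole normalized string as one key and one that splits it into words.

```python
import re


def normalize(text):
    """Character-normalize: lowercase, replace punctuation, collapse spaces."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def phrase_keys(text):
    """Untokenized phrase register (hypothetical): one key per field value."""
    return [normalize(text)]


def word_keys(text):
    """Tokenized register (hypothetical): one key per word."""
    return normalize(text).split()


def exact_match(query, keys):
    """An 'exact' match succeeds only if the whole query equals a stored key."""
    return normalize(query) in keys


heading = "Cook, David"  # made-up heading, for illustration only

# Untokenized: only the complete heading matches exactly.
print(exact_match("Cook, David", phrase_keys(heading)))
print(exact_match("Cook", phrase_keys(heading)))

# Tokenized: a single word now "exactly" matches the heading.
print(exact_match("Cook", word_keys(heading)))
```

In this model, tokenizing the phrase register lets a partial heading satisfy an
"exact" match, which is the kind of mismatch that would affect authority
linking.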
I'm still waiting for Adam to confirm my solution, but I've opened this bug
report to track things on the Koha side, and to include a patch which I hope
will resolve these problems.
--
You are receiving this mail because:
You are watching all bug changes.
You are the assignee for the bug.