[Koha-devel] Fixing exact search in Zebra ICU (formerly LinkBibHeadingsToAuthorities not working as expected)

Fridolin SOMERS fridolin.somers at biblibre.com
Mon Jan 4 09:49:38 CET 2021


Whooooo
Thanks a lot for your time and your detective expertise ;)

This is such an old problem that I thought it was an unsolved bug inside Zebra.
Just one line in the configuration 🤯
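
For anyone skimming the thread, here is a toy Python model of why a tokenizing step in the phrase chain could make an "exact" search match on single words. It is only a sketch under assumptions: it is not Zebra's or ICU's actual code, and the function names (normalize, index_tokenized, index_whole_phrase, exact_search) and the sample headings are made up for illustration.

```python
def normalize(text):
    """Lowercase and collapse whitespace (a stand-in for the real normalization)."""
    return " ".join(text.lower().split())


def index_tokenized(headings):
    """Hypothetical phrase index that still splits headings into word tokens."""
    index = {}
    for rec_id, heading in enumerate(headings):
        for token in normalize(heading).split():
            index.setdefault(token, set()).add(rec_id)
    return index


def index_whole_phrase(headings):
    """Hypothetical phrase index that keeps each heading as one single key."""
    index = {}
    for rec_id, heading in enumerate(headings):
        index.setdefault(normalize(heading), set()).add(rec_id)
    return index


def exact_search(index, query):
    """Look the normalized query up as one complete key."""
    return index.get(normalize(query), set())


# Sample data: record 0 merely contains the word "the", record 1 is exactly "The".
headings = ["Xenophon the Historian", "The", "Quebec"]

tokenized = index_tokenized(headings)
whole = index_whole_phrase(headings)

print(exact_search(tokenized, "the"))  # matches every heading containing "the"
print(exact_search(whole, "the"))      # matches only the heading that is exactly "The"
print(exact_search(whole, "the Q"))    # no heading is exactly "the Q"
```

In this model the tokenized index returns "Xenophon the Historian" for the query "the", which resembles the snippet quoted further down in the thread, while the whole-phrase index only returns headings equal to the full query.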

I'm testing right away.

Best regards and Happy new year David :)

On 23/12/2020 at 00:16, dcook at prosentient.com.au wrote:
> Thanks, Fridolin.
> 
> I've opened up https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27299 to track this topic in Koha. I've added a See Also to 18017.
> 
> I've added a patch to 27299 with the change I proposed in https://github.com/indexdata/idzebra/issues/24.
> 
> I've tested it outside of Koha and it works as I expected/hoped. I just want to get Indexdata's blessing and then I'll push for it in Koha.
> 
> David Cook
> Software Engineer
> Prosentient Systems
> Suite 7.03
> 6a Glen St
> Milsons Point NSW 2061
> Australia
> 
> Office: 02 9212 0899
> Online: 02 8005 0595
> 
> -----Original Message-----
> From: Fridolin SOMERS <fridolin.somers at biblibre.com>
> Sent: Tuesday, 22 December 2020 10:11 PM
> To: dcook at prosentient.com.au; koha-devel at lists.koha-community.org
> Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected
> 
> Hi,
> 
> I've pushed on
> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=18017
> 
> I copied over from MARC21 the use of index_heading and index_match_heading, which was sorely missing.
> 
> I've seen and subscribed to:
> https://github.com/indexdata/idzebra/issues/24
> 
>   > If you agree, I'll remove the token element from phrases-icu.xml and that would probably be case closed.
> Whooo, which change is this?
> 
> Best regards
> 
> On 17/12/2020 at 03:31, dcook at prosentient.com.au wrote:
>> Thanks, Fridolin!
>>
>> I've reported it to Indexdata but they say they can't reproduce the problem...
>>
>> David Cook
>>
>> -----Original Message-----
>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On
>> Behalf Of Fridolin SOMERS
>> Sent: Wednesday, 16 December 2020 7:54 PM
>> To: koha-devel at lists.koha-community.org
>> Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as
>> expected
>>
>> Thanks a lot for your tests.
>>
>> I will try on our Bionic env with Zebra 2.2.1
>>
>> On 11/12/2020 at 06:56, dcook at prosentient.com.au wrote:
>>> This one is driving me a bit crazy, so I’ve logged an issue with
>>> Indexdata and I’m hoping for the best:
>>> https://github.com/indexdata/idzebra/issues/24
>>>
>>> David Cook
>>>
>>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of dcook at prosentient.com.au
>>> Sent: Friday, 11 December 2020 4:31 PM
>>> To: 'Koha Devel' <koha-devel at lists.koha-community.org>
>>> Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected
>>>
>>> Using koha-testing-docker set up for ICU and with my fix from
>>> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27198:
>>>
>>> querytype ccl2rpn
>>> set_cclfile /etc/koha/zebradb/ccl.properties
>>> format xml
>>> elements zebra::snippet
>>>
>>> Z> f Match-heading,phr,ext,do-not-truncate="the Q"
>>> Sent searchRequest.
>>> Received SearchResponse.
>>> Search was a success.
>>> Number of hits: 34, setno 24
>>> SearchResult-1: term=the cnt=34
>>> records returned: 0
>>> Elapsed: 0.000315
>>>
>>> Z> show
>>> Sent presentRequest (1+1).
>>> Records: 1
>>> Record type: XML
>>> <record xmlns="http://www.indexdata.com/zebra/">
>>>      <snippet name="Match-heading" type="p" fields="Match">Xenophon<s>the</s>Historian</snippet>
>>> </record>nextResultSetPosition = 2
>>> Elapsed: 0.003864
>>>
>>> So that’s special… admittedly that’s Zebra 2.0.59…
>>>
>>> On Zebra 2.1.4 with ICU:
>>>
>>> Z> f Match-heading,phr,ext,do-not-truncate="the Q"
>>> Sent searchRequest.
>>> Received SearchResponse.
>>> Search was a success.
>>> Number of hits: 0, setno 2
>>> SearchResult-1: term=the  cnt=5383, term=Q cnt=10
>>> records returned: 0
>>> Elapsed: 0.009691
>>>
>>> Z> f Match-heading,phr,ext,do-not-truncate="the"
>>> Sent searchRequest.
>>> Received SearchResponse.
>>> Search was a success.
>>> Number of hits: 85, setno 3
>>> SearchResult-1: term=the cnt=85
>>> records returned: 0
>>> Elapsed: 0.002209
>>>
>>> However, 85 results is still too many. It should be 0 results.
>>>
>>> I can’t add the diagnostics from
>>> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27198
>>> right now though… so will have to get back to this one another day probably.
>>>
>>> But maybe someone else using Zebra with ICU can look into this problem too.
>>>
>>> It’s leading to lots of duplicate authorities it seems…
>>>
>>> David Cook
>>>
>>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of dcook at prosentient.com.au
>>> Sent: Friday, 11 December 2020 4:22 PM
>>> To: 'Koha Devel' <koha-devel at lists.koha-community.org>
>>> Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected
>>>
>>> Ugh yeah no… I’m reproducing it with koha-testing-docker too.
>>>
>>> Looks like yet another ICU bug…
>>>
>>> David Cook
>>>
>>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of dcook at prosentient.com.au
>>> Sent: Friday, 11 December 2020 3:50 PM
>>> To: 'Koha Devel' <koha-devel at lists.koha-community.org>
>>> Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected
>>>
>>> Then again… I can’t reproduce this problem on koha-testing-docker,
>>> but I can on a prod Koha running Zebra 2.1.4 with… a very strange
>>> /etc/koha/zebradb/etc/default.idx file…
>>>
>>> David Cook
>>>
>>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of dcook at prosentient.com.au
>>> Sent: Friday, 11 December 2020 3:32 PM
>>> To: 'Koha Devel' <koha-devel at lists.koha-community.org>
>>> Subject: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected
>>>
>>> Hi all,
>>>
>>> I’m still investigating, but it seems to me that the
>>> C4::Linker::Default in C4::Biblio::LinkBibHeadingsToAuthorities isn’t
>>> searching accurately using C4::Heading::authorities.
>>>
>>> I’m looking at C4::AuthoritiesMarc::SearchAuthorities and at a glance
>>> it looks OK, but in practice I think that my search queries are
>>> getting way too many results.
>>>
>>> By hand, if I try the following query:
>>>
>>> Match-heading,phr,ext,do-not-truncate="e"
>>>
>>> I get a huge number of results, which is odd, since that should be an
>>> “exact” match.
>>>
>>> I’ve opened
>>> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27198
>>> because I need to improve the diagnostics available for a Zebra
>>> authorities database, but yeah… not good. Hopefully I’ll know more soon.
>>>
>>> David Cook
>>>
>>>
>>> _______________________________________________
>>> Koha-devel mailing list
>>> Koha-devel at lists.koha-community.org
>>> https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
>>> website : http://www.koha-community.org/ git :
>>> http://git.koha-community.org/ bugs : http://bugs.koha-community.org/
>>>
>>
> 
> --
> Fridolin SOMERS <fridolin.somers at biblibre.com> Software and system maintainer 🦄
> BibLibre, France
> 
> 

-- 
Fridolin SOMERS <fridolin.somers at biblibre.com>
Software and system maintainer 🦄
BibLibre, France

