[Koha-devel] Fixing exact search in Zebra ICU (formerly LinkBibHeadingsToAuthorities not working as expected)

dcook at prosentient.com.au dcook at prosentient.com.au
Wed Dec 23 00:16:35 CET 2020


Thanks, Fridolin.

I've opened up https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27299 to track this topic in Koha. I've added a See Also to 18017.

I've added a patch to 27299 with the change I proposed in https://github.com/indexdata/idzebra/issues/24. 

I've tested it outside of Koha and it works as I expected/hoped. I just want to get Indexdata's blessing and then I'll push for it in Koha. 
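
For anyone following along, here is a toy sketch of the behaviour expected from a query like Match-heading,phr,ext,do-not-truncate="the Q". It is plain Python and not Zebra's code; the function names (normalize, exact_field_match, any_token_match) and the sample headings are made up for illustration, and the "token" model is only a rough guess at what the snippet output in the earlier message suggests.

```python
import re


def normalize(text):
    """Lowercase and collapse whitespace. A stand-in only, NOT the real ICU chain."""
    return re.sub(r"\s+", " ", text.strip().lower())


def exact_field_match(query, heading):
    """Expected behaviour: the whole query must equal the whole heading."""
    return normalize(query) == normalize(heading)


def any_token_match(query, heading):
    """Assumed approximation of the buggy behaviour: one shared token is enough."""
    heading_tokens = set(normalize(heading).split(" "))
    return any(tok in heading_tokens for tok in normalize(query).split(" "))


# Hypothetical sample data, loosely based on the snippet quoted in this thread.
headings = ["Xenophon the Historian", "The Q"]
query = "the Q"

exact_hits = [h for h in headings if exact_field_match(query, h)]
token_hits = [h for h in headings if any_token_match(query, h)]

print("exact:", exact_hits)
print("token:", token_hits)
```

With an exact match, "Xenophon the Historian" should never be returned for "the Q"; with token-level matching it is, which is the kind of over-matching that seems to be producing the duplicate authorities.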

David Cook
Software Engineer
Prosentient Systems
Suite 7.03
6a Glen St
Milsons Point NSW 2061
Australia

Office: 02 9212 0899
Online: 02 8005 0595

-----Original Message-----
From: Fridolin SOMERS <fridolin.somers at biblibre.com> 
Sent: Tuesday, 22 December 2020 10:11 PM
To: dcook at prosentient.com.au; koha-devel at lists.koha-community.org
Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected

Hi,

I've pushed a patch on
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=18017

I copied from MARC21 the use of index_heading and index_match_heading, which was cruelly missing.

I've seen and subscribed to:
https://github.com/indexdata/idzebra/issues/24

 > If you agree, I'll remove the token element from phrases-icu.xml and that would probably be case closed.
Whoa, which change is this?

Best regards

Le 17/12/2020 à 03:31, dcook at prosentient.com.au a écrit :
> Thanks, Fridolin!
> 
> I've reported it to Indexdata but they say they can't reproduce the problem...
> 
> -----Original Message-----
> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On 
> Behalf Of Fridolin SOMERS
> Sent: Wednesday, 16 December 2020 7:54 PM
> To: koha-devel at lists.koha-community.org
> Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as 
> expected
> 
> Thanks a lot for your tests.
> 
> I will try on our Bionic env with Zebra 2.2.1
> 
> Le 11/12/2020 à 06:56, dcook at prosentient.com.au a écrit :
>> This one is driving me a bit crazy, so I’ve logged an issue with 
>> Indexdata and I’m hoping for the best:
>> https://github.com/indexdata/idzebra/issues/24
>>
>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of dcook at prosentient.com.au
>> Sent: Friday, 11 December 2020 4:31 PM
>> To: 'Koha Devel' <koha-devel at lists.koha-community.org>
>> Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected
>>
>> Using koha-testing-docker set up for ICU and with my fix from
>> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27198:
>>
>> querytype ccl2rpn
>> set_cclfile /etc/koha/zebradb/ccl.properties
>> format xml
>> elements zebra::snippet
>> Z> f Match-heading,phr,ext,do-not-truncate="the Q"
>> Sent searchRequest.
>> Received SearchResponse.
>> Search was a success.
>> Number of hits: 34, setno 24
>> SearchResult-1: term=the cnt=34
>> records returned: 0
>> Elapsed: 0.000315
>> Z> show
>> Sent presentRequest (1+1).
>> Records: 1
>> Record type: XML
>> <record xmlns="http://www.indexdata.com/zebra/">
>>     <snippet name="Match-heading" type="p" fields="Match">Xenophon<s>the</s>Historian</snippet>
>> </record>nextResultSetPosition = 2
>> Elapsed: 0.003864
>>
>> So that’s special… admittedly that’s Zebra 2.0.59…
>>
>> On Zebra 2.1.4 with ICU:
>>
>> Z> f Match-heading,phr,ext,do-not-truncate="the Q"
>> Sent searchRequest.
>> Received SearchResponse.
>> Search was a success.
>> Number of hits: 0, setno 2
>> SearchResult-1: term=the  cnt=5383, term=Q cnt=10
>> records returned: 0
>> Elapsed: 0.009691
>>
>> Z> f Match-heading,phr,ext,do-not-truncate="the"
>> Sent searchRequest.
>> Received SearchResponse.
>> Search was a success.
>> Number of hits: 85, setno 3
>> SearchResult-1: term=the cnt=85
>> records returned: 0
>> Elapsed: 0.002209
>>
>> However, 85 results is still too many. It should be 0 results.
>>
>> I can’t add the diagnostics from
>> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27198
>> right now though… so will have to get back to this one another day probably.
>>
>> But maybe someone else using Zebra with ICU can look into this problem too.
>>
>> It’s leading to lots of duplicate authorities it seems…
>>
>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of dcook at prosentient.com.au
>> Sent: Friday, 11 December 2020 4:22 PM
>> To: 'Koha Devel' <koha-devel at lists.koha-community.org>
>> Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected
>>
>> Ugh yeah no… I’m reproducing it with koha-testing-docker too.
>>
>> Looks like yet another ICU bug…
>>
>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of dcook at prosentient.com.au
>> Sent: Friday, 11 December 2020 3:50 PM
>> To: 'Koha Devel' <koha-devel at lists.koha-community.org>
>> Subject: Re: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected
>>
>> Then again… I can’t reproduce this problem on koha-testing-docker, 
>> but I can on a prod Koha running Zebra 2.1.4 with… a very strange 
>> /etc/koha/zebradb/etc/default.idx file…
>>
>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of dcook at prosentient.com.au
>> Sent: Friday, 11 December 2020 3:32 PM
>> To: 'Koha Devel' <koha-devel at lists.koha-community.org>
>> Subject: [Koha-devel] LinkBibHeadingsToAuthorities not working as expected
>>
>> Hi all,
>>
>> I’m still investigating, but it seems to me that the 
>> C4::Linker::Default in C4::Biblio::LinkBibHeadingsToAuthorities isn’t 
>> searching accurately using C4::Heading::authorities.
>>
>> I’m looking at C4::AuthoritiesMarc::SearchAuthorities and at a glance 
>> it looks OK, but in practice I think that my search queries are 
>> getting way too many results.
>>
>> By hand, if I try the following query:
>>
>> Match-heading,phr,ext,do-not-truncate="e"
>>
>> I get a huge number of results, which is odd, since that should be an 
>> “exact” match.
>>
>> I’ve opened
>> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27198
>> because I need to improve the diagnostics available for a Zebra 
>> authorities database, but yeah… not good. Hopefully I’ll know more soon.
>>
>> _______________________________________________
>> Koha-devel mailing list
>> Koha-devel at lists.koha-community.org
>> https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
>> website : http://www.koha-community.org/ git :
>> http://git.koha-community.org/ bugs : http://bugs.koha-community.org/
>>
> 

--
Fridolin SOMERS <fridolin.somers at biblibre.com> Software and system maintainer 🦄
BibLibre, France



