[Koha-devel] How is relevance weighted in zebra search?

David Cook dcook at prosentient.com.au
Mon Apr 18 06:29:55 CEST 2016


I think that ByWater post from 2012 is actually quite misleading and
downright wrong in some cases, but Ian’s comments in the listserv are good.

 

Koha’s C4::Search code needs to die in a fire.
 

 

François is right about QueryAutoTruncate. If I recall correctly, it
actually silently turns off relevance searching.

 

1)      The most important thing to remember is that
C4::Search::_build_weighted_query() adds a lot to the mix. The query the
user makes is very different from the one that is sent to Zebra. Add some
debug logging to Koha or monitor your Zebra server to see what queries are
actually being sent to Zebra. That ByWater post says that some MARC tags are
more relevant than others. That’s not really true per se. However, I believe
Koha adds several title indexes to the query and gives them high ranking
weights. So it’s not that some MARC fields are ranked higher than others in
general. Rather, Search.pm just loads up the query to favour title indexes
(which are not the same thing as MARC tags anyway).

2)      I’d review
http://www.indexdata.com/zebra/doc/administration-ranking.html. The query is
broken down into atomic parts, those atomic queries are run, a hit list is
created for each atomic query, the documents in the hit lists are scored,
and then the hit lists are merged into a master hit list and the total score
for each document is calculated. As far as I know, it just uses term
frequency, and not proximity (unless you tell it to also use proximity).
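
To make the two points above a bit more concrete, here is a toy Python model
of that pipeline. It is only a sketch and not Zebra's actual ranking code:
the index names ("title", "summary"), the weights and the sample records are
all made up for illustration, and the score is simply term frequency
multiplied by a per-index weight.

```python
from collections import defaultdict

def run_atomic_query(records, index, term, weight):
    """Hit list for one atomic query: {biblionumber: weight * term frequency}."""
    hits = {}
    for biblionumber, fields in records.items():
        tf = fields.get(index, "").lower().split().count(term.lower())
        if tf:
            hits[biblionumber] = weight * tf
    return hits

def merge_hit_lists(hit_lists):
    """Sum the per-query scores and sort by total score, highest first."""
    totals = defaultdict(float)
    for hits in hit_lists:
        for biblionumber, score in hits.items():
            totals[biblionumber] += score
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))

def rank(records, atomic_queries):
    """Run every (index, term, weight) atomic query and merge the hit lists."""
    return merge_hit_lists(
        [run_atomic_query(records, index, term, weight)
         for index, term, weight in atomic_queries]
    )

# Hypothetical sample data, not taken from any real catalogue.
records = {
    1: {"title": "Golden Age", "summary": "A history of comics"},
    2: {"title": "Comics history", "summary": "The golden age and the silver age"},
}

# Same terms searched in both indexes; title hits get a higher (assumed) weight.
weighted = [("title", "golden", 3.0), ("title", "age", 3.0),
            ("summary", "golden", 1.0), ("summary", "age", 1.0)]
# Same atomic queries with every weight set to 1.0.
unweighted = [(index, term, 1.0) for index, term, _ in weighted]

print(rank(records, weighted))    # record 1 (title match) comes first
print(rank(records, unweighted))  # record 2 (summary repeats "age") comes first
```

With the weights flattened, the record that merely repeats the words in its
summary outranks the one with the matching title, which is roughly the
symptom described in the original question.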

 

What query did you use in yaz-client? I suspect that you didn’t include
ranking? If you did, it probably wasn’t the same query used by Koha, unless
the query weighting was being turned off (say by QueryAutoTruncate). 
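
For reference, as far as I understand the Zebra documentation, ranking is
only applied when the query asks for it with the relevance relation attribute
(@attr 2=102). The small Python helper below is a hypothetical illustration
that builds a PQF title search with or without that attribute; the function
name is mine, and a query that Koha generates will look quite different.

```python
def pqf_title_query(phrase, ranked=True):
    """Build a simple PQF title search, optionally asking for relevance ranking.

    Illustration only: assumes the Bib-1 attribute set, where use attribute 4
    is Title and relation attribute 102 is relevance.
    """
    if '"' in phrase:
        raise ValueError("this sketch does not handle embedded double quotes")
    attrs = ["@attr 1=4"]            # search the title index
    if ranked:
        attrs.append("@attr 2=102")  # request relevance ranking
    return " ".join(attrs) + ' "' + phrase + '"'

print(pqf_title_query("golden age"))
print(pqf_title_query("golden age", ranked=False))
```

Pasting each string after "find" in yaz-client should show whether the result
order changes when the ranking attribute is present.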

 

When in doubt, figure out what Koha is sending to Zebra. That should almost
always indicate what’s actually going on. 

 

Check the search results to see if they’re actually in relevance order. Just
look at the biblionumbers for the first 5 results. If they’re in perfect
ascending/descending order, chances are that your relevance is silently
turned off. 
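
That eyeball check is easy to script. A minimal Python sketch, assuming you
have already pulled the biblionumbers out of the first page of results (the
function name and the sample numbers are made up):

```python
def looks_unranked(biblionumbers, sample_size=5):
    """Return True if the first few biblionumbers are strictly ascending or
    strictly descending, which hints that relevance ranking is not applied."""
    sample = list(biblionumbers)[:sample_size]
    if len(sample) < 3:
        return False  # too few results to say anything useful
    pairs = list(zip(sample, sample[1:]))
    ascending = all(a < b for a, b in pairs)
    descending = all(a > b for a, b in pairs)
    return ascending or descending

print(looks_unranked([3, 8, 15, 21, 40]))   # True: perfectly ascending
print(looks_unranked([15, 3, 40, 8, 21]))   # False: mixed order
```

It is only a heuristic: a genuinely ranked result list can happen to be in
biblionumber order, so treat a True as a reason to look at the query being
sent to Zebra rather than as proof.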

 

David Cook

Systems Librarian

 

Prosentient Systems

72/330 Wattle St

Ultimo, NSW 2007

 

Office: 02 9212 0899

Direct: 02 8005 0595

 

From: koha-devel-bounces at lists.koha-community.org On Behalf Of Francois
Charbonnier
Sent: Tuesday, 22 March 2016 2:17 AM
To: koha-devel at lists.koha-community.org
Subject: Re: [Koha-devel] How is relevance weighted in zebra search?

 

Hi,

I suggest you read these two posts:
http://bywatersolutions.com/2012/11/27/a-little-bit-about-relevance-in-koha/
https://lists.katipo.co.nz/public/koha/2012-February/031909.html

Both helped me to understand a bit more how relevance works in Koha.

Also, try deactivating the QueryAutoTruncate system preference. It doesn't
mix well with relevance ranking.



François Charbonnier,
Professional librarian / Product manager

Tel.: (888) 604-2627
francois.charbonnier at inLibro.com

inLibro | pour esprit libre | www.inLibro.com

On 2016-03-21 09:06, Barton Chittenden wrote:

Hi, 

 

We just upgraded a library from Koha 3.0 to 3.22. After the upgrade, the
library complained that the title "Golden Age" was at the bottom of their
OPAC search results.

 

I checked the obvious first:

 

OPACdefaultSortField had been set to 'relevance' with OPACdefaultSortOrder
set to 'ascending', which would definitely be backwards (least relevant
first)... I changed it to 'descending', and got exactly the same results.

 

Restarting memcache didn't help.

 

When I searched using yaz-client, I found that the results that were coming
first in the list had the phrase "golden age" in 520$a.

 

Here's the version of zebra:

 

    $ idzebra-config --version

    2.0.59

 

At this point, I have two major questions:

 

1) Where do I check which fields are ranked more highly in a relevance
search? (I would think that title should be ranked higher than 520$a --
summary).

2) Why didn't the sort order change when I changed OPACdefaultSortOrder from
ascending to descending?







 
