[Koha-bugs] [Bug 11232] Retrieve facets from Zebra

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Fri Mar 21 19:47:51 CET 2014


http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=11232

--- Comment #23 from Tomás Cohen Arazi <tomascohen at gmail.com> ---
I'd like to add that, by default, facets are limited to 20 terms.
The Index Data folks explained to me that a higher count can be hardcoded in
the configuration file. For instance:

 <retrieval syntax="xml" name="zebra::facet::subject:p:100"/>

will set the limit to 100.
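
As a rough sketch (not Koha code), the element set name seems to follow the
pattern zebra::facet::<index>:<type>:<count>, judging from the examples in this
comment. The function name below is made up for illustration, and the form
without a count is an assumption based on the default limit mentioned above:

```python
def facet_element_set(index, index_type="0", limit=None):
    """Build a Zebra facet element set name such as zebra::facet::subject:0:100.

    Hypothetical helper: the pattern is inferred from the examples in this
    comment. Omitting the count is assumed to leave Zebra's default limit.
    """
    name = f"zebra::facet::{index}:{index_type}"
    if limit is not None:
        name += f":{limit}"
    return name


def retrieval_line(index, index_type="0", limit=None):
    """Render the matching <retrieval> configuration line."""
    name = facet_element_set(index, index_type, limit)
    return f'<retrieval syntax="xml" name="{name}"/>'


print(retrieval_line("subject", "p", 100))
```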

It is worth mentioning that Zebra tokenizes data for indexing purposes and
also lowercases it. Thus, to have facets work as we use them now (they show
the exact data that is on the record), :0 should be used.

With :p:

<record xmlns="http://www.indexdata.com/zebra/">
  <facet type="p" index="subject">
    <term coccur="1" occur="49">sang</term>
    <term coccur="1" occur="45">tegneserier amerikanske 25458100</term>
    <term coccur="1" occur="42">02</term>
    <term coccur="1" occur="42">klaver</term>
    <term coccur="1" occur="28">spillefilmer amerikanske 28231100</term>
    <term coccur="1" occur="17">filmer komedie 2040551600</term>
    <term coccur="1" occur="16">sang klaver 02</term>
    <term coccur="1" occur="12">filmer romantisk 2040551900</term>
    <term coccur="1" occur="5">tidsskrifter norske 13171900</term>
    <term coccur="1" occur="3">barnehager tidsskrifter 372 2105 15283900</term>
    <term coccur="1" occur="3">basketball 796 323 d5 10329200</term>
    <term coccur="1" occur="3">litteratur historie og kritikk 809 d5 10045100</term>
    <term coccur="1" occur="3">malerkunst historie generelt 759 d5 750 z 24632900</term>
    <term coccur="1" occur="3">sporreboker 793 73 d5 12683600</term>
    <term coccur="1" occur="3">sprak 400 d5 2041875500</term>
    <term coccur="1" occur="3">tegneserier science fiction bs 26382800</term>
  </facet>
</record>

So we should define facets like this:

 <retrieval syntax="xml" name="zebra::facet::subject:0:100"/>

The element set in the call should be the same, and it will return these results:

Z> elem zebra::facet::subject:0:100
Z> s
Sent presentRequest (1+1).
Records: 1
Record type: XML
<record xmlns="http://www.indexdata.com/zebra/">
  <facet type="0" index="subject">
    <term coccur="1" occur="49">Sang</term>
    <term coccur="1" occur="45">Tegneserier, Amerikanske 25458100</term>
    <term coccur="1" occur="42">02</term>
    <term coccur="1" occur="42">Klaver</term>
    <term coccur="1" occur="28">Spillefilmer, Amerikanske 28231100</term>
    <term coccur="1" occur="17">Filmer Komedie 2040551600</term>
    <term coccur="1" occur="16">Sang Klaver 02</term>
    <term coccur="1" occur="12">Filmer Romantisk 2040551900</term>
    <term coccur="1" occur="5">Tidsskrifter, Norske 13171900</term>
    <term coccur="1" occur="3">Barnehager Tidsskrifter 372.2105 15283900</term>
    <term coccur="1" occur="3">Basketball 796.323 d5 10329200</term>
    <term coccur="1" occur="3">Litteratur Historie og kritikk 809 d5 10045100</term>
    <term coccur="1" occur="3">Malerkunst Historie Generelt 759 d5 750 z 24632900</term>
    <term coccur="1" occur="3">Språk 400 d5 2041875500</term>
    <term coccur="1" occur="3">Spørrebøker 793.73 d5 12683600</term>
    <term coccur="1" occur="3">Tegneserier Science fiction BS 26382800</term>
  </facet>
</record>
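
For anyone who wants to play with a response like this outside of Koha, here is
a minimal sketch of how the facet record could be parsed. It is not the code
from the patch; the function name and the trimmed sample are only for
illustration, and it uses nothing but the Python standard library:

```python
import xml.etree.ElementTree as ET

# Namespace used by the facet records shown in this comment.
ZEBRA_NS = {"z": "http://www.indexdata.com/zebra/"}


def parse_facets(xml_text):
    """Return {index name: [(term, occurrences), ...]} for one facet record."""
    root = ET.fromstring(xml_text)
    facets = {}
    for facet in root.findall("z:facet", ZEBRA_NS):
        terms = []
        for term in facet.findall("z:term", ZEBRA_NS):
            # Collapse whitespace in case a term was wrapped across lines.
            label = " ".join((term.text or "").split())
            terms.append((label, int(term.get("occur", "0"))))
        facets[facet.get("index")] = terms
    return facets


# Trimmed copy of the :0 response, for illustration only.
SAMPLE = """<record xmlns="http://www.indexdata.com/zebra/">
  <facet type="0" index="subject">
    <term coccur="1" occur="49">Sang</term>
    <term coccur="1" occur="45">Tegneserier, Amerikanske 25458100</term>
    <term coccur="1" occur="3">Språk 400 d5 2041875500</term>
  </facet>
</record>"""

facets = parse_facets(SAMPLE)
for label, count in facets["subject"]:
    print(f"{label} ({count})")
```

Terms come back in the order Zebra sent them, so no re-sorting is done here.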


---
And I'm sure we'll be getting better results on the encoding front, because
Zebra should be returning the exact data we sent for indexing.
