[Koha-bugs] [Bug 13665] New: Retrieve facets from zebra is slow

bugzilla-daemon at bugs.koha-community.org
Wed Feb 4 12:14:22 CET 2015


http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=13665

            Bug ID: 13665
           Summary: Retrieve facets from zebra is slow
 Change sponsored?: ---
           Product: Koha
           Version: master
          Hardware: All
                OS: All
            Status: NEW
          Severity: major
          Priority: P5 - low
         Component: Searching
          Assignee: gmcharlt at gmail.com
          Reporter: jonathan.druart at biblibre.com
        QA Contact: testopia at bugs.koha-community.org
        Depends on: 11232

On an installation with 1 million biblio records (MARC21 + DOM), a search is
very slow if facets are retrieved from zebra.

Debugging, I found where the processing time is spent: 
In C4::Search::_get_facet_from_result_set, the line
    my $facet = $rs->record( 0 )->raw;
can take up to 3 seconds!
Actually it's the ->record call, not the ->raw.
In ZOOM->record, the time is spent in
  my $_rec = Net::Z3950::ZOOM::resultset_record($this->_rs(), $which);

I stopped tracing at this point.

So for instance, with the holdingbranch facet (FacetMaxCount set to 20), the
element "zebra::facet::su-to:0:20" is set, $rs->size returns 962076, and the
total execution time of _get_facet_from_result_set (for this facet alone!) is
3.2 sec.

If I set FacetMaxCount to 1, it takes 0.3 sec; with 10, it takes 1.67 sec. So
the time grows roughly linearly with FacetMaxCount.
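
The "roughly linear" reading of the three timings above can be checked with a
quick least-squares fit. This is only a sketch in Python (it is not Koha
code); the variable names are made up for illustration, and three data points
from a slow test machine are a thin basis for any firm conclusion.

```python
# Timings quoted in this report: FacetMaxCount -> seconds spent in
# _get_facet_from_result_set for a single facet.
timings = {1: 0.3, 10: 1.67, 20: 3.2}

xs = list(timings.keys())
ys = list(timings.values())
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least-squares line: seconds = intercept + slope * FacetMaxCount
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Largest gap between a measured timing and the fitted line.
max_residual = max(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys))

print(f"per-term cost: {slope:.3f} sec, fixed cost: {intercept:.3f} sec")
print(f"largest deviation from the line: {max_residual:.3f} sec")
```

With these numbers the fit comes out at roughly 0.15 sec per requested facet
term plus a similar fixed cost, and all three points sit within a few
milliseconds of the line.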

Let's compare with yaz-client:
Z> open unix:/home/koha/var/run/zebradb/bibliosocket
Connecting...OK.
Sent initrequest.
Connection accepted by v3 target.
ID     : 81
Name   : Zebra Information Server/GFS/YAZ
Version: 4.2.30 98864b44c654645bc16b2c54f822dc2e45a93031
Options: search present delSet triggerResourceCtrl scan sort extendedServices
namedResultSets
Elapsed: 0.009150
Z> base biblios
Z> format xml
Z> elem zebra::facet::holdingbranch:0:20
Z> f d
Sent searchRequest.
Received SearchResponse.
Search was a success.
Number of hits: 809470, setno 1
SearchResult-1: term=d cnt=809470
records returned: 0
Elapsed: 0.078905
Z> s 1+1
Sent presentRequest (1+1).
Records: 1
Record type: XML
<record xmlns="http://www.indexdata.com/zebra/">
  <facet type="0" index="holdingbranch">
    <term coccur="941" occur="103144">br1</term>
    [...]
  </facet>
</record>
nextResultSetPosition = 2
Elapsed: 1.393694
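
For reference, the facet record shown above has a simple structure: one
<facet> element per index, holding <term> elements with "occur" and "coccur"
attributes. The following is a minimal Python sketch of reading it; it is not
how C4::Search does it, the function name parse_facet_record is hypothetical,
and the sample contains only the one term printed in full above.

```python
import xml.etree.ElementTree as ET

# Facet record as returned by yaz-client above, reduced to the one term
# that is shown in full (the elided "[...]" terms are deliberately left out).
FACET_XML = """\
<record xmlns="http://www.indexdata.com/zebra/">
  <facet type="0" index="holdingbranch">
    <term coccur="941" occur="103144">br1</term>
  </facet>
</record>
"""

# Namespace taken from the xmlns attribute of the record above.
ZEBRA_NS = {"z": "http://www.indexdata.com/zebra/"}


def parse_facet_record(xml_text):
    """Return (index name, [(term, occur, coccur), ...]) for one facet record."""
    root = ET.fromstring(xml_text)
    facet = root.find("z:facet", ZEBRA_NS)
    terms = [
        (term.text, int(term.get("occur")), int(term.get("coccur")))
        for term in facet.findall("z:term", ZEBRA_NS)
    ]
    return facet.get("index"), terms


index_name, terms = parse_facet_record(FACET_XML)
print(index_name, terms)
```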

To compare with the old facet method, I measured the time spent in
C4::Search::GetFacets (for 20 facets):
From zebra (new): 9.3sec
From records (old): 0.16sec (with maxRecordsForFacets = 20)
From records (old): 1.85sec (with maxRecordsForFacets = 100)
From records (old): 15.1sec (with maxRecordsForFacets = 1000)

Note that the machine is a test machine (VE) and is quite slow.
Firebug tells me that with the new method (20 facets) the total page load time
is ~30sec, while with the old method (20 facets calculated from 20 records) it
is 8-9sec.



More information about the Koha-bugs mailing list