[Koha-zebra] zebra config problem (still 0, yes, really 0 !)

Paul POULAIN paul.poulain at free.fr
Thu Feb 9 15:18:58 CET 2006


Mike Taylor wrote:
>>* benchmarking: it seems the zebraidx update is faster than lightning
>>(400 biblios/sec: 10,000 biblios in 25 seconds), while ZOOM indexing is
>>slow (something like 25 biblios/second). More benchmarking could be done.
> That is a surprising difference, since as you no doubt know, "ZOOM
> indexing" is merely the use of ZOOM to pass the records to Zebra for
> indexing.  I flatly refuse to believe that the communication layer is
> responsible for a slow-down by a factor of 16, so something else is
> going on here.  My best guess is that "zebraidx update" is making use
> of caching mechanisms that ZOOM's update requests are not benefiting
> from.  There may be a way to have ZOOM request that caching: Adam will
> be able to tell us.

Just a bet:
If I hear my SCSI disk correctly and read the logs accordingly, it seems that
zebraidx update reads all the records and indexes them all at once,
while ZOOM indexes them one by one.
Thus, with ZOOM you get a lot of useless SCSI writes.
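
A toy cost model of this bet, as a minimal Python sketch. It is not a
measurement of Zebra: the function name, the split into a per-record cost
and a per-flush (disk write) cost, and the fitted values are all assumptions,
chosen only so that the two rates quoted above (400/s and 25/s) come out.

```python
def indexing_time(n_records, per_record_s, flush_s, batch_size):
    """Hypothetical model: each record costs per_record_s seconds to index,
    and each flush to disk costs flush_s seconds, once per batch."""
    flushes = -(-n_records // batch_size)  # ceiling division
    return n_records * per_record_s + flushes * flush_s


N = 10_000

# Rates taken from the figures quoted in this thread.
bulk_rate = N / 25               # zebraidx update: 10,000 biblios in 25 s
zoom_rate = 25                   # ZOOM indexing: about 25 biblios/s
slowdown = bulk_rate / zoom_rate

# Assumption: the bulk run is dominated by per-record work, and the extra
# time per record in the ZOOM run is one flush per record.
per_record_s = 1 / bulk_rate
flush_s = 1 / zoom_rate - per_record_s

one_by_one = indexing_time(N, per_record_s, flush_s, batch_size=1)
all_at_once = indexing_time(N, per_record_s, flush_s, batch_size=N)

print(f"slowdown factor: {slowdown:.0f}")
print(f"one by one:  {one_by_one:.1f} s")
print(f"all at once: {all_at_once:.1f} s")
```

If the bet is right, the gap comes from the number of flushes rather than
from the communication layer, so grouping several records per flush should
move ZOOM indexing towards the zebraidx figure.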

The main question here is that we haven't decided yet whether we will
store item status in the zebra DB or just in SQL. (If we store status in
zebra, then z3950 queries could be 100% complete.) With a 25/s indexing
speed, I think we could afford it. But I haven't run a true benchmark,
it's just a first measurement!

(Note: I won't play the lottery if I win my bet, as the Euro Millions jackpot
was won last week: 183 000 000 EUR! We're back to a small 15 000 000,
increased by something like 15 every week until someone wins!)

-- 
Paul POULAIN and Henri Damien LAURENT
Independent consultants
in free software and library science (http://www.koha-fr.org)




