[Koha-zebra] Re: [Koha-devel] Building zebradb

Wed Mar 15 18:18:54 CET 2006

Tümer Garip wrote:
> Hi,

Hello Tümer,

> We have now put Zebra into production-level systems, so here is some
> experience to share.
> Building the Zebra database from single records is a veeeeery looong
> process (100K records, 150K items).
> 
> Best method we found:
> 
> 1- Change zebra.cfg file to include
> 
> iso2079.recordType:grs.marcxml.collection
> recordType:grs.xml.collection

If I understand correctly, you now have two types of records in your DB
(or two different representations of a record)?

> 2- Write a script (or hack export.pl) to export all the MARC records as
> one big chunk to the correct directory with the extension .iso2079, then
> make a system call to
> "zebraidx -g iso2079 -d <dbnamehere> update records -n".

Could you send us the code for export.pl?
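
In the meantime, here is a minimal sketch of the batch step as I understand it, written in Python rather than Perl purely for illustration. It assumes the records are already available as raw ISO 2709 byte strings (each ending with the 0x1D record terminator, so they can simply be concatenated). The helper names (write_batch, zebraidx_command, reindex) and the file name export.iso2079 are hypothetical; only the zebraidx arguments come from the command quoted above.

```python
import subprocess
from pathlib import Path


def write_batch(records, target_dir, name="export.iso2079"):
    """Concatenate raw ISO 2709 records (bytes) into one big file.

    The .iso2079 extension matches the group name used in the
    zebra.cfg lines quoted above; the file name itself is made up.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    path = target / name
    with open(path, "wb") as out:
        for raw in records:
            out.write(raw)
    return path


def zebraidx_command(database, record_dir, group="iso2079"):
    """Build the argument list for the zebraidx call quoted above."""
    return ["zebraidx", "-g", group, "-d", database,
            "update", str(record_dir), "-n"]


def reindex(records, database, record_dir):
    """Write one batch file, then index the whole directory in one call."""
    write_batch(records, record_dir)
    # check=True raises CalledProcessError if zebraidx exits non-zero.
    subprocess.run(zebraidx_command(database, record_dir), check=True)
```

The point of the design is that zebraidx is started once for the whole export instead of once per record, which is presumably where the speed gain comes from.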

> This ensures that Zebra knows it is reading MARC records rather than XML
> and builds 100K+ records at zooming speed.
> Your ZOOM module still uses the grs.xml filter, while you can update or
> reindex any big chunk of the database at any time, as long as you have
> MARC records.

Great, I think I understand.

> 3- We are still using the old API, so we read the XML and use
> MARC::Record->new_from_xml( $xmldata )
> Note that we did not have to upgrade MARC::Record or MARC::Charset
> at all. Any MARC record created within Koha is UTF-8, and any MARC record
> imported into Koha (old marc_subfield_tables) was correctly decoded to
> UTF-8 with char_decode of biblio.

Would it be possible to use this zebra.cfg to handle ISO 2709 through
Perl-ZOOM?
If so, we could avoid the marc => xml => zoom and zoom => xml => marc
transformations.
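
To make clear which step I mean, here is a rough sketch of the xml => marc decoding that every retrieved record currently goes through, again in Python for illustration only. It assumes the XML coming back is MARCXML in the standard http://www.loc.gov/MARC21/slim namespace, which may not match what your grs.xml setup actually returns; fields_from_marcxml is a hypothetical helper, not part of Koha or MARC::Record.

```python
import xml.etree.ElementTree as ET

# Standard MARCXML namespace (assumed to be what the server returns).
MARC_NS = "{http://www.loc.gov/MARC21/slim}"


def fields_from_marcxml(xml_text):
    """Return [(tag, [(subfield_code, value), ...]), ...] for one record."""
    root = ET.fromstring(xml_text)
    # Accept either a bare <record> or a <collection> wrapping one.
    if root.tag == MARC_NS + "record":
        record = root
    else:
        record = root.find(MARC_NS + "record")
    fields = []
    for datafield in record.findall(MARC_NS + "datafield"):
        subfields = [
            (sf.get("code"), sf.text or "")
            for sf in datafield.findall(MARC_NS + "subfield")
        ]
        fields.append((datafield.get("tag"), subfields))
    return fields
```

If ZOOM could hand back the ISO 2709 record directly, this parse (and the matching serialisation on the way in) would disappear.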

> 4- We modified circ2.pm and the items table to add an item onloan field
> and mapped it to the MARC holdings data. Now our OPAC search does not
> call MySQL except for the branchname.

Could you send us/me that code too?

> 5- We average about 2000 updates per day (circulation + cataloguers).
> ZOOM search slows down during a commit operation, but I can say the
> slowdown is acceptable considering the speed gain we have on the
> search.
> 
> 6- Zebra behaves very well with searches but is very temperamental with
> updates. A queue of updates sometimes crashes the zebraserver. When the
> database crashes we cannot save anything, even though we are using shadow
> files. I'll be reporting on this issue once we can isolate the problems.

You're definitely a gem too ;-)

-- 
Paul POULAIN and Henri Damien LAURENT
Independent consultants
in free software and library science (http://www.koha-fr.org)