[Koha-devel] Zebra config problem (still 1)

Paul POULAIN paul.poulain at free.fr
Wed Feb 8 12:31:13 CET 2006


Adam Dickmeiss wrote:
(my answer to Adam's question is at the end)

I want to describe my whole history with Zebra, so that you are aware 
of everything I did, and maybe understand why I am really beginning to 
feel *discouraged*:
* just in case you don't know: I've been the Koha Release Manager for 
versions 2.0 and 2.2. I'm the main (almost the only) author of the MARC 
support in Koha.
* when Joshua was nominated as the 3.0 Release Manager, he suggested 
adopting Zebra. At first, I was not very happy with this proposal, as it 
adds a new tool to Koha and makes installation more complex. But other 
arguments convinced me it was the way to go.
* So I set up Zebra on my computer and began to move the MARC stuff to 
Zebra. I succeeded in getting something that worked correctly after 
about a week of work. The problem was that the Zebra indexing was done 
through a Perl exec() of zebraidx.
So I waited very impatiently for Perl-ZOOM, leaving the code as it was 
for some months (2-3?).
When Perl-ZOOM arrived, I was very, very happy.
But now I'm not happy at all any more, as I have run into many, many 
problems and feel quite stuck and alone with them.
I don't want to count how many days I've spent on Koha/Zebra without 
success, but it's something like 6-7 full days, probably more :-(

Here is a summary of all my problems:
* at first, I tried to set up an ISO 2709 (full MARC) DB. I ran into 
"Error updating 10002 => Encoding failed". After investigating and 
asking this list 
(http://lists.gnu.org/archive/html/koha-zebra/2006-01/msg00015.html and 
the following thread), it appeared that ISO 2709 support was problematic 
and that I had better go with XML.
That seemed a good idea to me, as XML is much easier to understand and 
an appealing technology ;-)
* So I changed some code in Koha to use the MARC-XML package 
(http://search.cpan.org/~esummers/MARC-XML-0.81/lib/MARC/File/XML.pm)
* But I still ran into the "Error updating 10002". After investigating a 
little more, Adam finally caught the culprit 
(http://lists.gnu.org/archive/html/koha-zebra/2006-01/msg00034.html). 
This time it was a compilation problem!!!
* Could that be my last problem? No, unfortunately. I ran into the 2 
recent problems: searching was impossible, and indexing with RecordId 
failed.
* Mike finally found 
(http://lists.gnu.org/archive/html/koha-zebra/2006-01/msg00038.html) 
that the search features were not in the official yaz package, and a new 
package has been released!
* I'm still stuck with the indexing problem. I really thought I wanted 
to do something simple: index MARCXML data (produced by Ed's package) 
into Zebra. Why it does not work is NOT clear to me.
I solved one problem by renaming marc21.abs to collection.abs, but I 
did not see this documented anywhere, and if Tümer had not spotted it, I 
would not have found it myself! (And it's still unclear to me why you 
have a marc21.abs when MARCXML uses a <collection> tag.)

Now I'm afraid there is still something undocumented somewhere, or 
buggy, or unreleased, or something like that.
I am really beginning to feel discouraged and alone.
Many thanks to Tümer, who pointed out some problems to me, but who seems 
as stuck as I am :-(

I end with an answer to Adam's suggestion to run zebraidx -s update 
testrec.xml:

 >> <?xml version="1.0" encoding="UTF-8"?>
 >> <collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 >> xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.l ...
 > Your root element is collection. Not record. I don't think melm will
 > match that. Had you used record as root element - it should do it.
 >
 > It's always a good idea to try things out with
 >   zebraidx -s update testrec.xml
 > and see what gets matched.. (Look for the Idx: lines).
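
One way to try Adam's point about the root element is to unwrap each <record> from its <collection> before handing it to zebraidx. The following is only a minimal sketch in Python, not something Koha or Zebra does: split_collection is a hypothetical helper, the sample is a shortened copy of the test record, and it assumes the MARC21 slim namespace is declared as the default namespace.

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

def split_collection(xml_bytes):
    """Return each <record> of a MARCXML <collection> as a standalone XML string.

    Hypothetical helper for experimenting with 'record' as the root element.
    """
    # Serialize the MARC namespace as the default namespace (no ns0: prefix).
    ET.register_namespace("", MARC_NS)
    root = ET.fromstring(xml_bytes)
    records = root.findall(f"{{{MARC_NS}}}record")
    return [ET.tostring(rec, encoding="unicode") for rec in records]

# Shortened sample based on the test record in this mail.
sample = b"""<?xml version="1.0" encoding="UTF-8"?>
<collection xmlns="http://www.loc.gov/MARC21/slim">
  <record>
    <leader>00543     2200181   4500</leader>
    <controlfield tag="001">19</controlfield>
  </record>
</collection>"""

standalone = split_collection(sample)
print(standalone[0])
```

Each returned string could then be written to its own file and tried with zebraidx -s update, to compare what gets matched with and without the <collection> wrapper.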

For this XML:
 > <?xml version="1.0" encoding="UTF-8"?>
 > <collection xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 >   xsi:schemaLocation="http://www.loc.gov/MARC21/slim
 >   http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd"
 >   xmlns="http://www.loc.gov/MARC21/slim">
 > <record>
 > 	<leader>00543     2200181   4500</leader>
 > 	<controlfield tag="001">19</controlfield>
 > 	<datafield tag="010" ind1=" " ind2=" ">
 > 		<subfield code="a">2010140001</subfield>
 > 		<subfield code="d">45 F</subfield>
 > 	</datafield>
 > 	<datafield tag="090" ind1=" " ind2=" ">
 > 		<subfield code="9">16</subfield>
 > 		<subfield code="a">16</subfield>
 > 	</datafield>
 > 	<datafield tag="100" ind1=" " ind2=" ">
 > 		<subfield code="a">1995                y0fre 0103    ba</subfield>
 > 	</datafield>
 > 	<datafield tag="101" ind1=" " ind2=" ">
 > 		<subfield code="a">fre</subfield>
 > 	</datafield>
 > 	<datafield tag="105" ind1=" " ind2=" ">
 > 		<subfield code="a">y       00  y</subfield>
 > 	</datafield>
 > 	<datafield tag="200" ind1="1" ind2=" ">
 > 		<subfield code="a">Pour l'honneur de l'esprit humain</subfield>
 > 		<subfield code="b">LIVR</subfield>
 > 		<subfield code="e">Les mathematiques aujourd'hui</subfield>
 > 		<subfield code="f">Jean DIEUDONNE</subfield>
 > 	</datafield>
 > 	<datafield tag="995" ind1=" " ind2=" ">
 > 		<subfield code="b">CDI</subfield>
 > 		<subfield code="c">CDI</subfield>
 > 		<subfield code="e">SL</subfield>
 > 		<subfield code="f">Non inventorie</subfield>
 > 		<subfield code="j">000006</subfield>
 > 		<subfield code="o">2</subfield>
 > 		<subfield code="9">27</subfield>
 > 	</datafield>
 > </record>
 > </collection>

with zebraidx -s update testrec.xml I get (many lines snipped; the 
complete log is at the end of this mail):
 > Record type: 'collection'
 >     Local tag: 'collection'
 >          tag=collection/
 >                 Local tag: 'subfield'
 >                      tag=subfield/datafield/record/collection/
 >                     Data: '16'
 >               Idx: [w]bib1:Identifier-standard [1007] data XData:"16"
 >               Idx: [p]bib1:Identifier-standard [1007] data XData:"16"
 >                         Idx: [w]bib1:Any [1016] data XData:"16"
 >                      tag=subfield/datafield/record/collection/
 >                 Data: '
 >                 '
 >             Local tag: 'datafield'
 >                  tag=datafield/record/collection/
 >                 Data: '
 >                         '
 >                 Local tag: 'subfield'
 >                      tag=subfield/datafield/record/collection/
 >                     Data: 'Pour l'honneur de l'esprit humain'
 > Idx: [w]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
 > Idx: [p]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
 > Idx: [w]bib1:Any [1016] data XData:"Pour l'honneur de l'esprit humain"
 >                      tag=subfield/datafield/record/collection/
 >                 Data: '
 >                         '
 > 11:31:48-08/02 zebraidx(26418) [log] zebra_end_trans
 > 11:31:48-08/02 zebraidx(26418) [log] sorting section 1
 > 11:31:48-08/02 zebraidx(26418) [log] Iterations . . .     42
 > 11:31:48-08/02 zebraidx(26418) [log] Distinct words .     20
 > 11:31:48-08/02 zebraidx(26418) [log] Updates. . . . .     17
 > 11:31:48-08/02 zebraidx(26418) [log] Deletions. . . .      1
 > 11:31:48-08/02 zebraidx(26418) [log] Insertions . . .      2
 > 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_register_close p=0x8106c70
 > 11:31:48-08/02 zebraidx(26418) [log] Records:       0 i/u/d 0/0/0
 > 11:31:48-08/02 zebraidx(26418) [log] user/system: 0/0
 > 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_stop
 > 11:31:48-08/02 zebraidx(26418) [log] zebraidx times:  0.06  0.00  0.00
 > [paul at bureau unimarc]$


If I read this correctly, Identifier-standard [1007] is correctly 
detected, but things still do not work.
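
Because the zebraidx -s output is long, it can help to pull out just the Idx: lines and group them by attribute. The sketch below is an assumption on my side about the line layout, based only on the lines shown in this mail; summarize_idx is a hypothetical helper and not part of Zebra.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Idx: [w]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
# The layout is inferred from the log in this mail only.
IDX_LINE = re.compile(r'Idx: \[(\w)\](\w+):([\w-]+) \[(\d+)\] data XData:"(.*)"')

def summarize_idx(log_text):
    """Map (attribute set, attribute name, attribute number) to the indexed values."""
    hits = defaultdict(set)
    for line in log_text.splitlines():
        m = IDX_LINE.search(line)
        if m:
            _index_type, attr_set, name, number, value = m.groups()
            hits[(attr_set, name, int(number))].add(value)
    return dict(hits)

# A few lines copied from the zebraidx output above.
log = '''
 > Idx: [w]bib1:Identifier-standard [1007] data XData:"16"
 > Idx: [p]bib1:Identifier-standard [1007] data XData:"16"
 > Idx: [w]bib1:Any [1016] data XData:"16"
 > Idx: [w]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
'''

summary = summarize_idx(log)
for key in sorted(summary):
    print(key, sorted(summary[key]))
```

Run against the complete log, this would list every attribute that received data, which makes it easier to spot a field that was expected but never matched.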






The complete log from zebraidx :
==========================================================
> Record type: 'collection'
>     Local tag: 'collection'
>          tag=collection/
>         Data: '
> 
>         '
>         Local tag: 'record'
>              tag=record/collection/
>             Data: '
>                 '
>             Local tag: 'leader'
>                  tag=leader/record/collection/
>                 Data: '00543     2200181   4500'
>                  tag=leader/record/collection/
>             Data: '
>                 '
>             Local tag: 'controlfield'
>                  tag=controlfield/record/collection/
>                 Data: '19'
>                  tag=controlfield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '2010140001'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '45 F'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '16'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '16'
>                         Idx: [w]bib1:Identifier-standard [1007] data XData:"16"
>                         Idx: [p]bib1:Identifier-standard [1007] data XData:"16"
>                         Idx: [w]bib1:Any [1016] data XData:"16"
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '1995                y0fre 0103    ba'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'fre'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'y       00  y'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'Pour l'honneur de l'esprit humain'
>                         Idx: [w]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
>                         Idx: [p]bib1:Title [4] data XData:"Pour l'honneur de l'esprit humain"
>                         Idx: [w]bib1:Any [1016] data XData:"Pour l'honneur de l'esprit humain"
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'LIVR'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'Les mathematiques aujourd'hui'
>                         Idx: [w]bib1:Title [4] data XData:"Les mathematiques aujourd'hui"
>                         Idx: [p]bib1:Title [4] data XData:"Les mathematiques aujourd'hui"
>                         Idx: [w]bib1:Any [1016] data XData:"Les mathematiques aujourd'hui"
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'Jean DIEUDONNE'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>                 '
>             Local tag: 'datafield'
>                  tag=datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'CDI'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'CDI'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'SL'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: 'Non inventorie'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '000006'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '2'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                         '
>                 Local tag: 'subfield'
>                      tag=subfield/datafield/record/collection/
>                     Data: '27'
>                      tag=subfield/datafield/record/collection/
>                 Data: '
>                 '
>                  tag=datafield/record/collection/
>             Data: '
>         '
>              tag=record/collection/
>         Data: '
> '
>          tag=collection/
> -------------
> 
> 11:31:48-08/02 zebraidx(26418) [log] zebra_end_trans
> 11:31:48-08/02 zebraidx(26418) [log] sorting section 1
> 11:31:48-08/02 zebraidx(26418) [log] Iterations . . .     42
> 11:31:48-08/02 zebraidx(26418) [log] Distinct words .     20
> 11:31:48-08/02 zebraidx(26418) [log] Updates. . . . .     17
> 11:31:48-08/02 zebraidx(26418) [log] Deletions. . . .      1
> 11:31:48-08/02 zebraidx(26418) [log] Insertions . . .      2
> 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_register_close p=0x8106c70
> 11:31:48-08/02 zebraidx(26418) [log] Records:       0 i/u/d 0/0/0
> 11:31:48-08/02 zebraidx(26418) [log] user/system: 0/0
> 11:31:48-08/02 zebraidx(26418) [log][app2] zebra_stop
> 11:31:48-08/02 zebraidx(26418) [log] zebraidx times:  0.06  0.00  0.00
> [paul at bureau unimarc]$                                                      

-- 
Paul POULAIN and Henri Damien LAURENT
Independent consultants
in free software and library science (http://www.koha-fr.org)




