[Koha-devel] Bugs related to Zebra indexing for Unimarc

Tue Aug 26 11:21:30 CEST 2014

Hi Mathieu,

On 26/08/2014 10:08, Mathieu Saby wrote:
> He was about to sign the 2d patch about 7XX fields (authors)
> http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=9352
> But there is an issue that I am not sure to be able to fix:
> The patch indexes only the relevant subfields in the author index (so no
> longer the profession of the author, their dates, or their address, because
> that information is not there to be searched by users, but to disambiguate
> authors).
> As a result, each subfield is indexed separately in the "phr" and "word"
> indexes. So the complete name is no longer searchable with the "au:phr" index.
> Ex:
> in master, you can retrieve a record with "Doe John" in the "au:phr" index
> if you have 700$aDoe$bJohn
> with the patch, you get 0 records
>
> I don't want to spend days on that... so do you know if the problem
> occurs also for MARC21?

I think that a solution is possible, but it is not easy.

As a first step, for au:phr we need:

<index_subfields
xmlns="http://www.koha-community.org/schemas/index-defs" tag="700"
subfields="abdg">
...
</index_subfields>

This is a separate entry from the one for au:word, which is:
<index_subfields
xmlns="http://www.koha-community.org/schemas/index-defs" tag="700"
subfields="bdg">
...
</index_subfields>

Then we need an XSLT procedure that creates one phrase instead of
separate subunits.

In fact, DOM indexing is done with XSLT.

In MARC21, all subfields in 100 (the equivalent of 700 in Unimarc) are
indexed in one shot.

In biblio-koha-indexdefs.xml:
<index_data_field 
xmlns="http://www.koha-community.org/schemas/index-defs" tag="100">
     <target_index>Author:w</target_index>
     <target_index>Author:p</target_index>
     <target_index>Author:s</target_index>
     <target_index>Author-title:w</target_index>
     <target_index>Author-name-personal:w</target_index>
     <target_index>Name:w</target_index>
     <target_index>Name-and-title:w</target_index>
     <target_index>Personal-name:w</target_index>
</index_data_field>

The result in biblio-zebra-indexdefs.xsl is:

<xslo:template mode="index_data_field" match="marc:datafield[@tag='100']">
   <z:index name="Author:w Author:p Author:s Author-title:w 
Author-name-personal:w Name:w Name-and-title:w Personal-name:w">
     <xslo:variable name="raw_heading">
       <xslo:for-each select="marc:subfield">
         <xslo:if test="position() > 1">
           <xslo:value-of select="substring(' ', 1, 1)"/>
         </xslo:if>
         <xslo:value-of select="."/>
       </xslo:for-each>
     </xslo:variable>
     <xslo:value-of select="normalize-space($raw_heading)"/>
   </z:index>
</xslo:template>

> Is there a way to resolve it by using some
> (new?) template in *koha-indexdefs-to-zebra.xsl ?

Probably there is, but we need to write a new template.
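
As a rough, untested sketch of what the output in
biblio-zebra-indexdefs.xsl might look like for Unimarc, here is the
template for 100 above, adapted to 700 and restricted to subfields
a, b, d and g. It is written by hand, not generated by
koha-indexdefs-to-zebra.xsl, and the choice of Author:p as the only
target index is my assumption:

<!-- Hypothetical: hand-written, not produced by the current generator -->
<xslo:template mode="index_data_field" match="marc:datafield[@tag='700']">
   <z:index name="Author:p">
     <xslo:variable name="raw_heading">
       <!-- keep only the subfields whose code is a, b, d or g -->
       <xslo:for-each select="marc:subfield[contains('abdg', @code)]">
         <xslo:if test="position() > 1">
           <xslo:value-of select="' '"/>
         </xslo:if>
         <xslo:value-of select="."/>
       </xslo:for-each>
     </xslo:variable>
     <!-- one phrase for the whole heading, not one per subfield -->
     <xslo:value-of select="normalize-space($raw_heading)"/>
   </z:index>
</xslo:template>

With 700$aDoe$bJohn this should put the single phrase "Doe John" in
the index, which is what au:phr needs. The open question is how to
make koha-indexdefs-to-zebra.xsl generate something like this from
the index_subfields entry.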

Bye
Zeno Tajoli

-- 
Dr. Zeno Tajoli
Soluzioni per la Ricerca Istituzionale - Automazione Biblioteche
z.tajoli at cineca.it
fax +39 02 2135520
CINECA - Sede operativa di Segrate