[Koha-patches] [PATCH] Bug 9352 : More specific indexing of UNIMARC 7XX fields in Zebra

Mathieu Saby mathieu.saby at univ-rennes2.fr
Sat Jan 5 13:27:20 CET 2013


Problem :
Zebra indexes all subfields of UNIMARC 7XX fields in the author index, including $9, $3, $4 (function code), $f (dates of authors), $c (additions other than dates), $p (affiliation/address).
This causes Koha to return too many results.
For example, if an author was born in 1984 and is a teacher, searching for "1984" or "teacher" in simple search will return all works by this author if this information is recorded in the 7XX field. This is not how most ILSs work, and it should be corrected.

Solution :
This patch makes the indexing of UNIMARC 7XX fields more specific. For each field, subfields that are not useful for author searches are no longer indexed.
70X : Do not index $f (dates), $c (additions other than dates), $p (affiliation/address), $3, $4. Index $9 only in Koha-Auth-Number.
710-712 : Do not index $p (affiliation/address), $3, $4. Index $9 only in Koha-Auth-Number. (I kept all other subfields : some may be useless, but I am not sure of it.)
716 : Do not index $f (dates), $c (additions other than dates), $3, $4. Index $9 only in Koha-Auth-Number.
72X : Do not index $f (dates), $3, $4. Index $9 only in Koha-Auth-Number.
730 : Do not index $4. Index $9 only in Koha-Auth-Number.
Additionally, this patch indexes 205$f/$g in the Author index (authors of the edition of the work).

Testing :
Create a record with :
700 $a Doe $b John $f 1950 $c teacher $4 070
716 $a Trademark $f 1960 $c great $4 340
720 $a Family $f 1980 $4 651
205 $a 1st edition $f By some guy $g And other guys

a/ Before applying patch :
Search in simple search and author search : "teacher", "great", "1950", "070", "340", "651" => you will see the record among the results
Search in simple search and author search : "Doe", "John Doe", "Trademark", "Family"        => you will see the record among the results
Search in simple search and author search : "guy", "guys"                                   => you will not see the record among the results


b/ After applying patch :
Search in simple search and author search : "teacher", "great", "1950", "070", "340", "651" => you will not see the record among the results
Search in simple search and author search : "Doe", "John Doe", "Trademark", "Family"        => you will see the record among the results
Search in simple search and author search : "guy", "guys"                                   => you will see the record among the results
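
The expected results above can be checked against a small model of the rules. The sketch below is only an illustration in Python, not Zebra's actual matching logic: the names AUTHOR_SUBFIELDS, RECORD and author_words are hypothetical, it covers only the Author index and only the tags used in the test record, and it assumes that before the patch every subfield of a 7XX field reached the Author index while 205 did not.

```python
# Subfields that reach the Author index after the patch, for the tags used
# in the test record. Hypothetical table that mirrors the rules listed above.
AUTHOR_SUBFIELDS = {
    "205": {"f", "g"},
    "700": {"a", "b", "d", "g"},
    "716": {"a"},
    "720": {"a"},
}

# The test record from the plan above, as (tag, subfield code, value).
RECORD = [
    ("700", "a", "Doe"), ("700", "b", "John"), ("700", "f", "1950"),
    ("700", "c", "teacher"), ("700", "4", "070"),
    ("716", "a", "Trademark"), ("716", "f", "1960"),
    ("716", "c", "great"), ("716", "4", "340"),
    ("720", "a", "Family"), ("720", "f", "1980"), ("720", "4", "651"),
    ("205", "a", "1st edition"), ("205", "f", "By some guy"),
    ("205", "g", "And other guys"),
]


def author_words(record, patched):
    """Return the lowercased words that would land in the Author index."""
    words = set()
    for tag, code, value in record:
        if patched:
            # After the patch: only the listed subfields are indexed.
            indexed = code in AUTHOR_SUBFIELDS.get(tag, set())
        else:
            # Assumed pre-patch behaviour: whole 7XX fields, 205 ignored.
            indexed = tag.startswith("7")
        if indexed:
            words.update(value.lower().split())
    return words


if __name__ == "__main__":
    before = author_words(RECORD, patched=False)
    after = author_words(RECORD, patched=True)
    for term in ["teacher", "great", "1950", "070", "doe", "family", "guy"]:
        print(term, "before:", term in before, "after:", term in after)
```

Running the script prints, for each search term, whether the modelled Author index contains it before and after the patch, which can be compared with steps a/ and b/.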


---
  etc/zebradb/marc_defs/unimarc/biblios/record.abs |   89 ++++++++++++++++------
  1 file changed, 67 insertions(+), 22 deletions(-)

diff --git a/etc/zebradb/marc_defs/unimarc/biblios/record.abs b/etc/zebradb/marc_defs/unimarc/biblios/record.abs
index 44a5bbe..e7fe909 100644
--- a/etc/zebradb/marc_defs/unimarc/biblios/record.abs
+++ b/etc/zebradb/marc_defs/unimarc/biblios/record.abs
@@ -125,39 +125,84 @@ melm 116$a     Graphics-type:w:range(data,0,1),Graphics-support:w:range(data,1,1
  melm 200$f		Author:w,Author:p
  # other Authors
  melm 200$g		Author:w,Author:p
+
+# main author of the edition
+melm 205$f        Author:w,Author:p
+# other authors of the edition
+melm 205$g        Author:w,Author:p
+
  # physical Author
+# Do not index $f (dates),$c (additions other than dates),$p (affiliation/address),$3,$4. Index $9 only in Koha-Auth-Number.
  melm 700$9      Koha-Auth-Number,Koha-Auth-Number:n
  melm 700$a      Author,Personal-name,Author:p,Personal-name:p,Personal-name,Author:s
-melm 700        Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 700$b      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 700$d      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 700$g      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+
  melm 701$9      Koha-Auth-Number,Koha-Auth-Number:n
-melm 701        Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 701$a      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 701$b      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 701$d      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 701$g      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+
  melm 702$9      Koha-Auth-Number,Koha-Auth-Number:n
-melm 702        Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 702$a      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 702$b      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 702$d      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
+melm 702$g      Author,Personal-name,Author:p,Personal-name:p,Personal-name:p
  
  # collective Author
-melm 710$9        Koha-Auth-Number,Koha-Auth-Number:n
-melm 710    Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
-melm 711$9        Koha-Auth-Number,Koha-Auth-Number:n
-melm 711    Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
-
-melm 712$9        Koha-Auth-Number,Koha-Auth-Number:n
-melm 712    Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+# Do not index $p (affiliation/address),$3,$4. Index $9 only in Koha-Auth-Number.
+melm 710$9      Koha-Auth-Number,Koha-Auth-Number:n
+melm 710$a      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 710$b      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 710$c      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 710$d      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 710$e      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 710$f      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 710$g      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 710$h      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+
+melm 711$9      Koha-Auth-Number,Koha-Auth-Number:n
+melm 711$a      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 711$b      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 711$c      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 711$d      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 711$e      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 711$f      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 711$g      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 711$h      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+
+melm 712$9      Koha-Auth-Number,Koha-Auth-Number:n
+melm 712$a      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 712$b      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 712$c      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 712$d      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 712$e      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 712$f      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 712$g      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
+melm 712$h      Author,Author-name-corporate,Author-name-conference,Corporate-name,Conference-name,Author:p,Author-name-corporate:p,Author-name-conference:p,Corporate-name:p,Conference-name:p
  
  # trademark Author : 716
-melm 716$9        Koha-Auth-Number,Koha-Auth-Number:n
-melm 716    Author:w,Author:p
+# Do not index $f (dates),$c (additions other than dates),$3,$4. Index $9 only in Koha-Auth-Number.
+melm 716$9    Koha-Auth-Number,Koha-Auth-Number:n
+melm 716$a    Author:w,Author:p
  
  # family Author : 72X
-melm 720$9        Koha-Auth-Number,Koha-Auth-Number:n
-melm 720    Author:w,Author:p
-melm 721$9        Koha-Auth-Number,Koha-Auth-Number:n
-melm 721    Author:w,Author:p
-melm 722$9        Koha-Auth-Number,Koha-Auth-Number:n
-melm 722    Author:w,Author:p
-
-# name-responsabily Author
-melm 730$9        Koha-Auth-Number,Koha-Auth-Number:n
-melm 730    Author:w,Author:p
+# Do not index $f (dates),$3,$4. Index $9 only in Koha-Auth-Number.
+melm 720$9    Koha-Auth-Number,Koha-Auth-Number:n
+melm 720$a    Author:w,Author:p
+
+melm 721$9    Koha-Auth-Number,Koha-Auth-Number:n
+melm 721$a    Author:w,Author:p
+
+melm 722$9    Koha-Auth-Number,Koha-Auth-Number:n
+melm 722$a    Author:w,Author:p
+
+# name-responsabily Author : 730
+# Do not index $4. Index $9 only in Koha-Auth-Number.
+melm 730$9    Koha-Auth-Number,Koha-Auth-Number:n
+melm 730$a    Author:w,Author:p
  
  # 740-742 = uniform and conventional headings for legal and religious texts. Use not recommended in France (503 used instead, see http://multimedia.bnf.fr/unimarcb_trad/B7XX-6-2011.pdf )
  
-- 
1.7.9.5

-- 
Mathieu Saby
Service d'Informatique Documentaire
Service Commun de Documentation
Université Rennes 2
Téléphone : 02 99 14 12 65
Courriel : mathieu.saby at univ-rennes2.fr


