[Koha-bugs] [Bug 9828] Zebra indexes useless subfields in UNIMARC 6XX

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Sat Mar 16 22:07:44 CET 2013


http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=9828

--- Comment #2 from mathieu saby <mathieu.saby at univ-rennes2.fr> ---
Created attachment 16179
  -->
http://bugs.koha-community.org/bugzilla3/attachment.cgi?id=16179&action=edit
[PATCH] Bug 9828: More specific indexing of UNIMARC 6XX fields

Most 6XX fields may contain a $2 that identifies the system used for indexing.
It should not be indexed.
In French libraries, $2 contains "rameau". So searching for books about the
composer "Rameau" retrieves thousands of records!
For some 6XX fields, other subfields should not be indexed either, for example
dates of persons and families, or addresses.
In the UNIMARC guide, 600$t, 601$t and 602$t are said to exist but to be
"not used". I keep them indexed.
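
A toy Python sketch of the $2 problem described above. It is not how Zebra indexes records internally; the field layout, the sample values and the helper names index_terms and matches are all hypothetical:

```python
# Toy model of a UNIMARC 606 field: a list of (subfield code, value) pairs.
# The values are made up for the example.
field_606 = [("a", "Opera"), ("y", "France"), ("2", "rameau")]


def index_terms(field, skip=()):
    """Return the lowercased words of every subfield whose code is not in `skip`."""
    terms = set()
    for code, value in field:
        if code in skip:
            continue
        terms.update(value.lower().split())
    return terms


def matches(field, query, skip=()):
    """True if `query` is one of the indexed words of the field."""
    return query.lower() in index_terms(field, skip)


# Every subfield indexed: a search for the composer hits a record that
# merely names the thesaurus in $2.
print(matches(field_606, "Rameau"))                # True (false positive)

# $2 skipped: the record is no longer returned.
print(matches(field_606, "Rameau", skip=("2",)))   # False
```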

Additionally, subject indexing could be improved by using specific indexes for
each 6XX where possible :
In ccl.properties :
- su-to, su-geo and su-ut are defined as aliases of Subject.
- a specific index is defined, but not used in record.abs :
Subject-name-personal, alias su-na
We can use this index and create new specific indexes by using existing bib1
attributes.
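
To make the alias idea concrete, here is a small Python sketch of the lookup. It only models the mapping; the index names, aliases and bib1 use attributes are the ones this patch defines in ccl.properties (listed further down), while the dictionaries and the resolve helper are hypothetical:

```python
# Before the patch: these aliases all point at the generic Subject index.
ALIASES_BEFORE = {"su-to": "Subject", "su-geo": "Subject", "su-ut": "Subject"}

# After the patch: specific indexes and their bib1 use attribute (1=...).
INDEXES_AFTER = {
    "Subject-name-conference": 1073,
    "Subject-name-corporate": 1074,
    "Subject-genre-form": 1075,
    "Subject-geographical": 1076,
    "Subject-chronological": 1077,
    "Subject-title": 1078,
    "Subject-topical": 1079,
}
ALIASES_AFTER = {
    "su-conf": "Subject-name-conference",
    "su-corp": "Subject-name-corporate",
    "su-genre": "Subject-genre-form",
    "su-form": "Subject-genre-form",
    "su-geo": "Subject-geographical",
    "su-chrono": "Subject-chronological",
    "su-ut": "Subject-title",
    "su-ti": "Subject-title",
    "su-to": "Subject-topical",
}


def resolve(alias):
    """Return (index name, bib1 use attribute) for an alias after the patch."""
    index = ALIASES_AFTER[alias]
    return index, INDEXES_AFTER[index]


print(ALIASES_BEFORE["su-geo"])   # Subject
print(resolve("su-geo"))          # ('Subject-geographical', 1076)
```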

We could also index the $j, $x, $y and $z subdivisions in specific indexes.


This patch makes the following changes :
1) Add comments in record.abs
2) For all 6XX : Not indexing $2 (LCSH, Rameau...), $3 and $5
3) Not indexing specific subfields
600 : Personal name used as a subject // see Marc21 600
not indexing c (additional elements),f (dates),p (address/affiliation)
602 : Family name used as a subject // see Marc21 600 3X
not indexing f (dates)
616 : Trademark
not indexing c,f

4) For all 6XX : index $j,$x,$y,$z in several indexes in addition to the
specific index for their 6XX field:
# 6XX$j : Genre/form                  : indexed in Subject,
#   Subject-subdivision, Subject-genre-form
# 6XX$x : Subject                     : indexed in Subject, Subject-subdivision
#   (could be topical subject or genre/form subject, so don't index in
#   Subject-topical)
# 6XX$y : Geographical subject        : indexed in Subject,
#   Subject-subdivision, Subject-name-geographical
# 6XX$z : Chronological subject       : indexed in Subject,
#   Subject-subdivision, Subject-chronological

5) Define in ccl.properties some specific indexes :
Subject-name-conference 1=1073 => alias su-conf
Subject-name-corporate 1=1074 => alias su-corp
Subject-genre-form 1=1075 => alias su-genre and su-form
Subject-geographical 1=1076 => alias su-geo
Subject-chronological 1=1077 => alias su-chrono
Subject-title 1=1078 => alias su-ut and su-ti
Subject-topical 1=1079 => alias su-to

6) Adding new aliases in Search.pm :
su-chrono, su-form, su-genre, su-corp, su-conf, su-ti

7) Using these new indexes in record.abs for
600 : whole field in Subject and Subject-Personal-Name
all subfields except subdivisions in Personal-name
601 : whole field in Subject, Subject-name-conference and Subject-name-corporate
all subfields except subdivisions in Corporate-name and Conference-name
602 : same as 600 but could be improved later
604 : whole field in Subject and Subject-title ; $a in Subject-Personal-Name
all subfields except subdivisions in Name-and-Title
605 : whole field in Subject and Subject-title
606 : whole field in Subject and Subject-topical
607 : whole field in Subject and Subject-geographical
all subfields except subdivisions in Name-geographic
608 : whole field in Subject and Subject-genre-form
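
The subfield exclusions and subdivision rules listed above can be summed up as two lookup tables. This is only a Python sketch of the rules as written in this description, not the record.abs syntax and not the patch itself; the table and function names are hypothetical, and the index names are copied as they appear in the list:

```python
# Subfields that are not indexed in any 6XX field.
SKIP_ALL_6XX = {"2", "3", "5"}

# Additional subfields that are not indexed in specific fields.
SKIP_BY_TAG = {
    "600": {"c", "f", "p"},  # additional elements, dates, address/affiliation
    "602": {"f"},            # dates
    "616": {"c", "f"},
}

# Indexes fed by the subdivision subfields, in addition to the specific
# index of the 6XX field they belong to.
SUBDIVISION_INDEXES = {
    "j": ["Subject", "Subject-subdivision", "Subject-genre-form"],
    "x": ["Subject", "Subject-subdivision"],
    "y": ["Subject", "Subject-subdivision", "Subject-name-geographical"],
    "z": ["Subject", "Subject-subdivision", "Subject-chronological"],
}


def is_indexed(tag, code):
    """True if subfield `code` of 6XX field `tag` is still indexed."""
    if code in SKIP_ALL_6XX:
        return False
    return code not in SKIP_BY_TAG.get(tag, set())


def subdivision_indexes(code):
    """Indexes a subdivision subfield goes to (empty list if not a subdivision)."""
    return SUBDIVISION_INDEXES.get(code, [])


print(is_indexed("600", "f"))    # False: dates of a person are skipped
print(is_indexed("606", "f"))    # True: no specific rule for 606
print(subdivision_indexes("z"))
```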

To test :

A. In a GRS-1 environment
1) Apply the patch
2) Rebuild zebra
3) Create a record A with
- the string "bz9828" in 600$c 600$f 600$p, 602$f, 616$c, 616$f
- the string "subform" in 602$j
4) Create a record B with the string "subgeo" in 606$y
5) Create a record C with the string "subdate" in 606$z
6) Try to search "su:bz9828". You should have no results
7) Try to search "su-geo:subgeo". You should have 1 result : record B
8) Try to search "su-genre:subform". You should have 1 result : record A
9) Try to search "su-chrono:subdate". You should have 1 result : record C
10) On existing records, try the su-ut, su-to, su-na, su-form, su-corp and
su-geo indexes, and check whether the results are relevant

B. In a DOM environment
same operations


M. Saby
