[Koha-bugs] [Bug 32916] [Bug 30280 follow-up] Problems in linking authorities to biblio fields (MARC 21)

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Fri Feb 10 23:30:46 CET 2023


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=32916

Janusz Kaczmarek <januszop at gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|koha-bugs at lists.koha-commun |januszop at gmail.com
                   |ity.org                     |
             Status|NEW                         |Needs Signoff

--- Comment #1 from Janusz Kaczmarek <januszop at gmail.com> ---
Created attachment 146523
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=146523&action=edit
[PATCH] Bug 32916: [Bug 30280 follow-up] Problems in linking authorities to
biblio fields (MARC 21)

After applying the patch from bug 30280:

1. Koha does not link any headings when Zebra is the search engine.

2. Koha does not properly link headings other than 6XX when
Elasticsearch (ES) is the search engine.

3. Koha does not link subject headings when the 6XX second indicator
is ‘4’, or when the auth record has 008/11 = ‘z’ but 040 $f is not
defined. The latter is legal according to the MARC 21 documentation
(https://www.loc.gov/marc/authority/ad008.html): “A MARC code for the
conventions used to formulate the heading may be contained in
subfield $f (Subject heading/thesaurus conventions) in field 040
(Cataloging Source).” Note ‘may’, not ‘should’.
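
To make case 3 concrete, here is a minimal sketch of a thesaurus
compatibility check between a bibliographic 6XX field and an authority
record. It is not Koha's actual code: the function name, its parameters
and the exact matching rules (in particular, treating second indicator
‘4’ as "no constraint" and a missing 040 $f as acceptable) are
assumptions made for illustration. The indicator and 008/11 code values
come from the MARC 21 documentation.

```python
# Illustrative sketch only; not Koha's implementation. All names are hypothetical.

# MARC 21 bibliographic 6XX second indicator -> authority 008/11 code.
IND2_TO_008_11 = {
    "0": "a",  # Library of Congress Subject Headings
    "1": "b",  # LC subject headings for children's literature
    "2": "c",  # Medical Subject Headings
    "3": "d",  # National Agricultural Library subject authority file
    "5": "k",  # Canadian Subject Headings
    "6": "v",  # Repertoire de vedettes-matiere
}


def thesaurus_compatible(ind2, bib_sf2, auth_008_11, auth_040f):
    """Return True if the authority record may be linked to the 6XX field.

    ind2        -- second indicator of the bibliographic 6XX field
    bib_sf2     -- value of 6XX $2, or None if absent
    auth_008_11 -- character at authority 008/11
    auth_040f   -- value of authority 040 $f, or None if absent
    """
    if ind2 == "4":
        # Source not specified: assume no thesaurus constraint applies.
        return True
    if ind2 == "7":
        # Source is named in 6XX $2; the authority is coded 'z' (other).
        if auth_008_11 != "z":
            return False
        # 040 $f is optional, so only compare it when it is present.
        return auth_040f is None or auth_040f == bib_sf2
    # Indicators 0-3, 5, 6 map directly to one 008/11 code.
    return IND2_TO_008_11.get(ind2) == auth_008_11
```

With these assumed rules, a 6XX field with indicator ‘7’ and a
hypothetical $2 value "dbn" is compatible with an authority record
coded 008/11 = ‘z’ whether or not 040 $f is present, but not with one
whose 040 $f names a different thesaurus.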

A possible solution is to make ES emulate the Zebra indexing of auth
008/11 and to correct C4::Heading::_search.  Some minor corrections
in other places were also needed.
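
As a rough illustration of what indexing auth 008/11 involves, the
sketch below extracts the character at position 11 of an authority 008
field and keeps it only if it is a code defined by MARC 21. The
function names and the decision to index the bare one-character code
are assumptions; the actual Zebra and ES index definitions in the patch
may differ.

```python
# Illustrative sketch; function and variable names are hypothetical.

# Authority 008/11 (subject heading system/thesaurus) codes per MARC 21.
THESAURUS_CODES = {
    "a": "Library of Congress Subject Headings",
    "b": "LC subject headings for children's literature",
    "c": "Medical Subject Headings",
    "d": "National Agricultural Library subject authority file",
    "k": "Canadian Subject Headings",
    "n": "Not applicable",
    "r": "Art and Architecture Thesaurus",
    "s": "Sears List of Subject Headings",
    "v": "Repertoire de vedettes-matiere",
    "z": "Other",
    "|": "No attempt to code",
}


def thesaurus_code(field_008):
    """Return the character at 008/11, or None if the field is too short."""
    if field_008 is None or len(field_008) < 12:
        return None
    return field_008[11]


def thesaurus_index_value(field_008):
    """Return the value to index: the 008/11 code if it is a known one."""
    code = thesaurus_code(field_008)
    return code if code in THESAURUS_CODES else None


# Example: a 40-character 008 built so that position 11 holds 'z'.
sample_008 = " " * 11 + "z" + " " * 28
print(thesaurus_index_value(sample_008))
```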

This solution requires a new ES field type ('thesaurus') and therefore
also a modification of the database: an expansion of the ENUM type for
search_field.type
(ALTER TABLE search_field MODIFY COLUMN `type`
ENUM('','string','date','number','boolean','sum','isbn','stdno','year','callnumber','thesaurus')
NOT NULL COMMENT 'what type of data this holds, relevant when storing
it in the search engine';)

The new strict behaviour should be controlled (on/off) by a new
preference (e.g. LinkerStrictAuthInfo), since not every library has
well-formatted data.  The semantics of this preference could be
expanded in the future to also take into account, for instance,
008/14-16 of an auth record.
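
The on/off behaviour can be pictured with a small sketch in which the
thesaurus constraint is only added to the authority search when the
preference is enabled. This is not how C4::Heading::_search is written;
the function name and the dictionary keys are hypothetical stand-ins
for whatever query structure the search engine expects.

```python
# Illustrative sketch; the function and the query keys are hypothetical.

def build_authority_query(heading, thesaurus_code, strict):
    """Build a simple search description for an authority lookup.

    heading        -- normalized heading text to match
    thesaurus_code -- expected authority 008/11 code, or None if unknown
    strict         -- value of a preference such as LinkerStrictAuthInfo
    """
    query = {"match-heading": heading}
    if strict and thesaurus_code is not None:
        # Only constrain by thesaurus when strict matching is enabled.
        query["thesaurus"] = thesaurus_code
    return query
```

With strict matching off, only the heading text is used, which is the
more forgiving option for catalogues whose 008/11 values are
unreliable.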

Test plan:
==========

1. Have a clean master (or 22.11.0[0-2]) installation

2. Load the provided data sample (bulkmarcimport.pl -d -a/b -file ...)

3. Reindex with Zebra and ES

4. Run link_bibs_to_authorities.pl -v -t -l with Zebra; you
should get:

Linked headings (from most frequent to least):
-------------------------------------------------------

Unlinked headings (from most frequent to least):
-------------------------------------------------------
Feminism:       3 occurrences
Author 01:      1 occurrences
Author 01. Second work title:   1 occurrences
feminism:       1 occurrences
Person DBN:     1 occurrences
Series entry:   1 occurrences
Subject DBN without non-mandatory 040f: 1 occurrences
Subject lcsh:   1 occurrences
Subject with thesaurus not specified:   1 occurrences

5. Run link_bibs_to_authorities.pl -v -t -l with ES; you
should get:

Linked headings (from most frequent to least):
-------------------------------------------------------
Feminism:       2 occurrences
feminism:       1 occurrences
Person DBN:     1 occurrences
Subject lcsh:   1 occurrences

Unlinked headings (from most frequent to least):
-------------------------------------------------------
Author 01:      1 occurrences
Author 01. Second work title:   1 occurrences
Feminism:       1 occurrences
Series entry:   1 occurrences
Subject DBN without non-mandatory 040f: 1 occurrences
Subject with thesaurus not specified:   1 occurrences

6. Apply the patch (pay attention to the location of the
authority-zebra-indexdefs.xsl file in your test environment).
Add a system preference LinkerStrictAuthInfo = 1 and perform
the database modification (ALTER TABLE search_field MODIFY
COLUMN `type`
ENUM('','string','date','number','boolean','sum','isbn','stdno','year','callnumber','thesaurus')
NOT NULL COMMENT 'what type of data this holds, relevant
when storing it in the search engine';)

7. Do a full reindex with Zebra (koha-rebuild-zebra --full --force
-a -b) and ES (koha-elasticsearch --rebuild -r -d -a -b)

8. Run link_bibs_to_authorities.pl -v -t -l with
Zebra and ES; you should get in both cases:

Linked headings (from most frequent to least):
-------------------------------------------------------
Feminism:       2 occurrences
Author 01:      1 occurrences
Author 01. Second work title:   1 occurrences
feminism:       1 occurrences
Person DBN:     1 occurrences
Series entry:   1 occurrences
Subject DBN without non-mandatory 040f: 1 occurrences
Subject lcsh:   1 occurrences
Subject with thesaurus not specified:   1 occurrences

Unlinked headings (from most frequent to least):
-------------------------------------------------------
Feminism:       1 occurrences

9. Check the results in Koha: all the heading
fields should be properly linked to the appropriate
auth records.
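
Since step 8 expects the same report from both search engines, the two
outputs can be compared mechanically. The sketch below parses a saved
report into linked and unlinked counts. It assumes the report looks
exactly like the listings in this test plan ("Linked headings" and
"Unlinked headings" section titles followed by "heading: N occurrences"
lines); the function name is hypothetical and other lines printed by
the script are simply ignored.

```python
import re

# Matches lines such as "Author 01. Second work title:   1 occurrences".
LINE_RE = re.compile(r"^(.*):\s+(\d+) occurrences$")


def parse_linker_report(text):
    """Parse a linker report into {'linked': {...}, 'unlinked': {...}}."""
    sections = {"linked": {}, "unlinked": {}}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Linked headings"):
            current = "linked"
        elif stripped.startswith("Unlinked headings"):
            current = "unlinked"
        elif current is not None:
            match = LINE_RE.match(stripped)
            if match:
                sections[current][match.group(1)] = int(match.group(2))
    return sections


def reports_agree(zebra_text, es_text):
    """Return True if both reports list the same headings and counts."""
    return parse_linker_report(zebra_text) == parse_linker_report(es_text)
```

Saving the output of each run to a file and passing the two file
contents to reports_agree gives a quick yes/no answer before checking
individual records in step 9.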

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are watching all bug changes.

