[Koha-bugs] [Bug 997] New: No MARC subfield sequence specification or subfield repeatability

bugzilla-daemon at wilbur.katipo.co.nz bugzilla-daemon at wilbur.katipo.co.nz
Mon Jun 27 22:57:03 CEST 2005


http://bugs.koha.org/cgi-bin/bugzilla/show_bug.cgi?id=997

           Summary: No MARC subfield sequence specification or subfield
                    repeatability
           Product: Koha
           Version: unspecified
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: critical
          Priority: P2
         Component: MARC
        AssignedTo: paul.poulain at free.fr
        ReportedBy: koha at alinto.com
         QAContact: koha-bugs at lists.sourceforge.net


GENERAL PROBLEM

The Koha MARC data model has a design mistake that is more than a mere
implementation bug and results in data loss for affected records.  The
cataloguing forms have no means for subfield sequence to be specified correctly
by the cataloguer when the sequence required by the item to be catalogued
differs from the Koha default subfield order.  Even when the correct subfield
order is imported from an external record, the Koha MARC framework has no means
to preserve the order of subfields in data storage when the order differs from
the Koha default order.  Furthermore, while the cataloguing forms have a means
to enter repeated fields, they have no means to enter repeated subfields within
the same field when required by the item to be catalogued.

This problem never appears in many common and simple cases, but it does exist
in Koha for many fields whenever the item requires a different data model than
Koha uses.  This is a terrible problem for the intent of MARC records to be
shared between systems, where the already corrupt data pool would become
further corrupted by these records.  The problem may be most acute for subject
fields where the complete form of the subject field with subdivisions should
match the form found on other remote systems to properly locate other items with
the same complete subject.

Even where externally created records are imported into Koha, such as the former
records of an institution prior to adopting Koha, the data will be corrupted and
information will be lost.  The information conveyed by the subfield sequence is
lost for the problematic records.  No automated process could recover the lost
sequence information for most cases without referencing the original uncorrupted
records; or some nonexistent intelligent system actually understanding the
meaning of the subfield content and reordering the subfields according to real
world knowledge of how they should properly be ordered.


PROBLEMATIC EXAMPLES

Some examples of the most problematic effects this problem has on the creation
and interpretation of records follow.  Many other examples may be taken from the
same and other fields but these seem especially problematic.

260  ##$aParis :$bGauthier-Villars ;$aChicago :$bUniversity of Chicago Press,$c1955.
Multiple publishers cannot be entered using the cataloguing forms.  An
externally catalogued record imported into Koha would be transformed as
    260  ##$aChicago :$aParis :$bGauthier-Villars ;$bUniversity of Chicago Press,$c1955.
Note that not only are the place of publication and the publisher no longer
adjacent, but each place of publication would be paired with the wrong
publisher if the subfields were reordered from this point.

490  1#$aDepartment of State publication ;$v10846.$aDepartment and Foreign Service series ;$v12128
Multiple series titles cannot be entered using the cataloguing forms.  An
externally catalogued record imported into Koha would be transformed as
    490  1#$aDepartment and Foreign Service series ;$aDepartment of State publication ;$v10846.$v12128
Note that not only are the series title and the volume number no longer
adjacent, but each series title would be paired with the wrong volume number
if the subfields were reordered from this point.

505  00$tQuark models /$rJ. Rosner --$tIntroduction to gauge theories of the
strong, weak, and electromagnetic interactions /$rC. Quigg --$tDeep inelastic
lepton-nucleon scattering /$rD.H. Perkins --$tJet phenomena /$rM. Jacob --$tAn
accelerator design study /$rR.R. Wilson --$tLectures in accelerator theory /$rM.
Month.
Formatted contents notes cannot be entered using the cataloguing forms.  An
externally catalogued record imported into Koha would be transformed as
    505  00$rC. Quigg --$rD.H. Perkins --$rJ. Rosner --$rM. Jacob --$rM. Month.$rR.R.
    Wilson --$tAn accelerator design study /$tDeep inelastic lepton-nucleon
    scattering /$tIntroduction to gauge theories of the strong, weak, and
    electromagnetic interactions /$tJet phenomena /$tLectures in accelerator theory
    /$tQuark models /
Note that not only are the title and the statement of responsibility no longer
adjacent, but each title would be paired with the wrong statement of
responsibility if the subfields were reordered from this point.

650  #0$aArchitecture$zIllinois$zChicago$xHistory$vPictorial works.
Repeated subject subdivisions cannot be entered using the cataloguing forms.
An externally catalogued record imported into Koha would be transformed as
    650  #0$aArchitecture$vPictorial works.$xHistory$zChicago$zIllinois
Note that to reorder the subfields correctly from this point, an automated
system would need to know the subsidiary relationship of Chicago to Illinois.
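
The transformed forms in the examples above are all consistent with the
subfields being re-sorted by subfield code and then by value.  That is an
inference from the examples, not a confirmed description of the Koha storage
code.  A minimal Python sketch under that assumption (the function names
koha_like_reorder and render are hypothetical) reproduces the 650 example and
shows why the original sequence cannot be recovered afterwards: two different
source orders collapse to the same stored form.

```python
# Subfields as an ordered list of (code, value) pairs, in source-record order.
original_650 = [
    ("a", "Architecture"),
    ("z", "Illinois"),
    ("z", "Chicago"),
    ("x", "History"),
    ("v", "Pictorial works."),
]


def koha_like_reorder(subfields):
    """Hypothetical model of the observed behaviour: sort by code, then value."""
    return sorted(subfields, key=lambda pair: (pair[0], pair[1]))


def render(subfields):
    """Render (code, value) pairs in the $-delimited notation used above."""
    return "".join(f"${code}{value}" for code, value in subfields)


stored = koha_like_reorder(original_650)
print(render(stored))
# prints $aArchitecture$vPictorial works.$xHistory$zChicago$zIllinois

# A source record with the two $z subdivisions swapped stores identically,
# so the stored form cannot tell us which order the cataloguer intended.
swapped = [original_650[0], original_650[2], original_650[1], *original_650[3:]]
print(koha_like_reorder(swapped) == stored)
```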


SOLUTION APPROACHES

It should be obvious why so many cataloguing systems have one free-entry text
field for all subfields.  Nevertheless, I find the guided data entry approach
that Koha uses for cataloguing preferable for its self-validating effects.

A means to enter repeated subfields should be incorporated into the
cataloguing forms, similar to the means for repeated fields.  The subfield to be
entered at a particular point in the subfield list should be alterable from a
drop-down selection list, both for fields that may have a variant subfield
sequence and for records that may not have a variant subfield.  The coded value
submitted for each subfield should include the position of that subfield in
the sequence of subfields, so that the subfield sequence can be maintained in
the Koha database.  No implementation should prevent the use of a recommended
or required value list or thesaurus for the subfield contents.  If the page
needs to be rewritten when the cataloguer alters the default subfield sequence,
in order to accommodate a value list or thesaurus in the cataloguing form
corresponding to the altered subfield sequence, then the page needs to be
rewritten.  The default cataloguing subfield sequence should provide the most
usually correct subfield order for each field, reflecting the different usage
of different fields.
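
As a rough illustration of carrying the position with each subfield, the
following Python sketch rebuilds the cataloguer's sequence from form rows that
may arrive in any order.  The SubfieldEntry type and the ordered_subfields
function are hypothetical names for illustration only; nothing here reflects
the actual Koha form handling.

```python
from typing import NamedTuple


class SubfieldEntry(NamedTuple):
    position: int  # place in the cataloguer's chosen sequence
    code: str      # subfield code picked from the drop-down list
    value: str     # subfield content as typed or chosen from a value list


def ordered_subfields(entries):
    """Return (code, value) pairs in cataloguer order, skipping blank values."""
    kept = [e for e in entries if e.value.strip()]
    return [(e.code, e.value) for e in sorted(kept, key=lambda e: e.position)]


# Position, not subfield code, decides the sequence, so repeated $a and $b
# subfields keep their pairing.
submitted = [
    SubfieldEntry(3, "a", "Chicago :"),
    SubfieldEntry(1, "a", "Paris :"),
    SubfieldEntry(5, "c", "1955."),
    SubfieldEntry(2, "b", "Gauthier-Villars ;"),
    SubfieldEntry(4, "b", "University of Chicago Press,"),
]
print(ordered_subfields(submitted))
```

Because the same code may appear at several positions, this representation
also covers repeated subfields within one field.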

A parallel data structure to the existing data structure can be introduced to
preserve the subfield sequence if there is no easy means to augment the existing
data structure with subfield sequence information.  At least newly catalogued
and newly imported or reimported records from an uncorrupted source should have
their subfield sequence preserved in the data structure after a suitable
implementation is developed to correct this problem.
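
One possible shape for such a parallel structure is sketched below in Python
with an in-memory SQLite database.  The table name subfield_sequence, its
columns, and the save_field and load_field helpers are assumptions made for
this sketch and are not taken from the Koha schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE subfield_sequence (
           record_id        INTEGER NOT NULL,
           tag              TEXT    NOT NULL,
           field_occurrence INTEGER NOT NULL,  -- which repeat of the field
           position         INTEGER NOT NULL,  -- place of the subfield in the field
           code             TEXT    NOT NULL,
           value            TEXT    NOT NULL,
           PRIMARY KEY (record_id, tag, field_occurrence, position)
       )"""
)


def save_field(record_id, tag, occurrence, subfields):
    """Store (code, value) pairs together with their 1-based position."""
    rows = [
        (record_id, tag, occurrence, pos, code, value)
        for pos, (code, value) in enumerate(subfields, start=1)
    ]
    conn.executemany(
        "INSERT INTO subfield_sequence VALUES (?, ?, ?, ?, ?, ?)", rows
    )


def load_field(record_id, tag, occurrence):
    """Read the subfields back in their original sequence."""
    cur = conn.execute(
        "SELECT code, value FROM subfield_sequence "
        "WHERE record_id = ? AND tag = ? AND field_occurrence = ? "
        "ORDER BY position",
        (record_id, tag, occurrence),
    )
    return cur.fetchall()


series = [
    ("a", "Department of State publication ;"),
    ("v", "10846."),
    ("a", "Department and Foreign Service series ;"),
    ("v", "12128"),
]
save_field(1, "490", 1, series)
print(load_field(1, "490", 1) == series)
```

Keying on position rather than subfield code is what allows both repeated
subfields and a non-default order to survive a round trip.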






