[Koha-bugs] [Bug 8746] rebuild_zebra_sliced.sh don't work where Record length of 106041 is larger than the MARC spec allows

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Thu Sep 20 14:29:34 CEST 2012


http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=8746

--- Comment #1 from Julian Maurice <julian.maurice at biblibre.com> ---
Created attachment 12386
  -->
http://bugs.koha-community.org/bugzilla3/attachment.cgi?id=12386&action=edit
Bug 8746: rebuild_zebra_sliced.sh now exports/indexes records as MARCXML

This avoids indexing failures due to "bad offset" or "bad length" errors
with the ISO2709 format

+ minor improvements:
  -  --length parameter is optional. If not given, the script runs an
     SQL query to find the number of records to index
  -  new parameter --reset-index. If set, the index is reset before indexing


--

Exporting multiple records in the same ISO2709 file can cause problems. If one
record is invalid (for example, a record longer than 99999 bytes), none of the
following records will be parsed correctly. For example, yaz-marcdump
refuses to read records that follow a malformed record.
This causes problems in rebuild_zebra_sliced.sh because yaz-marcdump is used to
count the number of records contained in one file.
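
To illustrate why one bad record can hide all the records after it: an ISO2709
record starts with a 5-digit length field, so a reader that trusts that field
loses its place as soon as the value is wrong. The Python sketch below is a toy
model only, not yaz-marcdump's actual logic. make_record and count_records are
hypothetical helpers, the fake records have no real leader or directory, and
truncating the length to 5 digits is an assumption about what a writer might
emit for an oversized record.

```python
RECORD_TERMINATOR = b"\x1d"  # ISO2709 record terminator


def make_record(payload_size: int) -> bytes:
    """Build a fake record: 5-digit length field, filler bytes, terminator.

    Not a valid MARC record; it only models the length field.
    """
    total = 5 + payload_size + 1
    # Assumption: keep only the last 5 digits when the length does not fit.
    length_field = f"{total:05d}"[-5:].encode("ascii")
    return length_field + b"x" * payload_size + RECORD_TERMINATOR


def count_records(data: bytes) -> int:
    """Count records by trusting each length field; stop at the first error."""
    count = 0
    pos = 0
    while pos < len(data):
        field = data[pos:pos + 5]
        if not field.isdigit():
            break  # lost sync: the remaining records are unreachable
        length = int(field)
        end = pos + length
        if length < 6 or end > len(data):
            break
        if data[end - 1:end] != RECORD_TERMINATOR:
            break  # the length field pointed into the middle of a record
        count += 1
        pos = end
    return count
```

With three small records, count_records(make_record(100) * 3) returns 3. If the
middle record is 106041 bytes long, as in the bug title, only the first record
is counted, although the file still contains three.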

So the workaround used here is to export in MARCXML. yaz-marcdump can't split
MARCXML files, so a piece of Perl code is used instead (I did not manage to
do this with POSIX tools).
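
The patch does the splitting in Perl. As a rough illustration of the same idea,
here is a Python sketch; it is not the code from the attachment. It assumes the
export is a stream of <record> elements, with or without the MARC21 "slim"
namespace, and the names iter_records and split_records are made up for this
example.

```python
import io
import xml.etree.ElementTree as ET

MARCXML_NS = "http://www.loc.gov/MARC21/slim"
# Serialize MARCXML elements with a default namespace instead of "ns0:".
ET.register_namespace("", MARCXML_NS)


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def iter_records(source):
    """Yield each complete <record> element from a MARCXML file or stream."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) == "record":
            yield elem


def split_records(source, chunk_size: int):
    """Yield lists of at most chunk_size records, each serialized as XML text."""
    chunk = []
    for record in iter_records(source):
        chunk.append(ET.tostring(record, encoding="unicode"))
        record.clear()  # drop the record's content once it has been copied
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:
        yield chunk


if __name__ == "__main__":
    sample = (
        '<collection xmlns="%s">' % MARCXML_NS
        + "".join(
            '<record><controlfield tag="001">%d</controlfield></record>' % i
            for i in range(5)
        )
        + "</collection>"
    )
    for number, chunk in enumerate(split_records(io.BytesIO(sample.encode()), 2)):
        print(number, len(chunk))
```

Because each record is a self-contained element, a malformed or oversized
record cannot shift the boundaries of the records around it, and the number of
records in a slice is simply the length of the chunk.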

To test, you can create a biblio record longer than 99999 bytes and try to index
a range of biblionumbers that contains it.
On master, indexing this record should fail; with this patch, it should
succeed.
(Records longer than 99999 bytes do not seem to affect the indexing of the
records that follow them. I have run into that problem with badly encoded
records.)
