[Koha-bugs] [Bug 29440] Refactor/clean up bulkmarcimport.pl

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Wed Nov 8 13:53:31 CET 2023


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=29440

--- Comment #72 from M <schodkowy.omegi-0r at icloud.com> ---
I rebased this patch and Bug 25539 in our library and successfully used them
to update a lot of biblio and auth records. My feedback is that it generally
works, and MARC overlay rules are also applied.

The biggest issue I ran into was with the -match option for auths. For
biblios, I could use `-match Other-control-number,035a` perfectly fine. But
for auths, `-match LC-card-number,010a` just didn't work, so I had to resort
to `-match Any,010a`. That works, but it isn't ideal.

After playing with yaz-client and other tools, I can say this doesn't seem to
be an indexing issue. I also ran:
> xsltproc /etc/koha/zebradb/marc_defs/marc21/authorities/authority-zebra-indexdefs.xsl  <(sudo koha-mysql biblioteka <<< "select  marcxml from auth_header where ExtractValue( marcxml, '//datafield[@tag=010]/subfield[@code=\"a\"]' ) != '' limit 1\\G" | sed -n 's/marcxml: //;2,$p' ) | grep 2005134806
>   <z:index name="LC-card-number:w LC-card-number:p">n 2005134806</z:index>
>   <z:index name="Any:w Any:p">n 2005134806</z:index>
>   <z:index name="Any:w Any:p">http://nukat.edu.pl/aut/n 2005134806</z:index>

It seems that the control number is indexed as "LC-card-number", but using it
with the script always yielded no matches, and with "-update" a warning that
insertion of a new record was skipped. From my memory of debugging this a few
days ago, the $query variable had a value of `LC-card-number="n 2005134806"`.

I also have two smaller pieces of feedback:

- In the `if` block commented as `#Skip if authority in database is the same
or newer than the incoming record`, the record is skipped with `next;` without
any feedback in the logs: there is no warning and no "skipped" entry
analogous to "updated"/"inserted". I think it would be a good idea to add a
log line here.

- The final log line says `n MARC records done in x seconds`. The `n` stands
for the number of all records read from the file, including those that were
skipped. I think it'd be nice if it said something like `n/n MARC records
done`, where the first `n` would be the number of records actually inserted or
updated in the database, while the second `n` would remain the current
number.
This improvement alone would also partially address the problem described in
the first item above.
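
A minimal sketch of the summary line suggested above, written in Python for
brevity rather than the script's Perl. The function name `summarize` and the
outcome labels are hypothetical and do not come from the patch; the trailing
"skipped" count is an extra assumption on top of the `n/n` format:

```python
from collections import Counter


def summarize(outcomes, seconds):
    """Build the closing log line from per-record outcomes.

    `outcomes` is an iterable of hypothetical labels, one per record read
    from the file: "inserted", "updated" or "skipped".
    """
    counts = Counter(outcomes)
    # Records that actually changed the database.
    changed = counts["inserted"] + counts["updated"]
    # Every record read from the file, skipped ones included.
    total = sum(counts.values())
    return (f"{changed}/{total} MARC records done in {seconds} seconds "
            f"({counts['skipped']} skipped)")


print(summarize(["inserted", "updated", "skipped", "updated"], 12))
# prints: 3/4 MARC records done in 12 seconds (1 skipped)
```

Keeping one tally per outcome would cover both items at once: the skip branch
only needs to record "skipped" for the final line to reflect it.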

All in all, though, it's a very nice changeset. We appreciate that it can load
biblio records while respecting MARC overlay rules, and it already works for
us pretty much as-is. It'd be nice if it could be merged for the 23.11
release, with the small details improved later on, without blocking this.

The way I see it, this changeset already makes this script MUCH better, fixes
broken behavior in it, and doesn't make anything more broken than it already
was before, so I definitely vouch for it...

Not sure if I have any authority to review code or sign off, but we have now
used it to import/update thousands of biblio and auth records without noticing
any issues (I paid special attention to MARC export of biblio before/after).
