[Koha-bugs] [Bug 24544] Add a script for inserting persistent identifiers to MARC records

Wed Jul 8 04:39:33 CEST 2020

https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=24544

--- Comment #16 from David Cook <dcook at prosentient.com.au> ---
(In reply to Marcel de Rooy from comment #13)
> See also comment9. I chose here to not add stuff directly to core Koha
> routines. But work my way thru the data with a cron job. Surely this would
> be a next step. Problem with a cataloguing plugin is that you just know the
> record number only after saving  it. Although a PID generator might
> theoretically not need it, many implementations, including my own, do use it.
> 

I could see the utility of that cronjob for you, but maybe not for all Koha
users?

That's a good point about the PID generator. 

> Not sure how much you saw from the patches, but this patch set provides an
> interface via plugins to an external PID service.
> 

I skimmed through that code, but found it a bit difficult to read.

> No Koha should not mint its own PIDs. 

Ok excellent.

> Formally the PID generator
> may have its own PID lookup table. We do not really care here. (My local
> generator does not, since it is based on a Koha identifier. Its result can
> be found with a Standard-identifier index in Koha or even another future
> ILS. In that way actually turning my ES or Zebra index into a PID lookup
> table..)

I'm not sure that I understand this part. 

So you use the Koha identifier to mint a PID with a local non-Koha generator,
then you store that in the Koha record and index it. 

Your organisation's resolver then forwards to a local non-Koha resolver, which
then queries Zebra/ES to get the record that matches... I assume not a full URL
but a partial path?

> But even with a full resolver having its own table, I would still argue to
> save a copy of the PID in the MARC record too for optimization, while
> respecting the lookup table as authoritative.
> As a side note: Could you give me another example of vital data on biblio
> level that we do not store in MARC? Not meaning optimization or calculated
> aggregates etc.

I don't know what you mean by "vital data" in this case, but some standouts are
biblio.frameworkcode, biblio.datecreated (debatable), biblio_metadata.format,
biblio_metadata.schema. 

To be honest, I think that I see where you're coming from. In the past, I
wanted to store OAI-PMH identifiers in the 024 field for imported MARC
bibliographic records and then look them up with a Zebra index search, but then
I realized that was problematic for me. While those MARC fields exist, the data in
those fields aren't descriptive metadata about the record. They're metadata
about the metadata record. In my case, I ended up putting the OAI-PMH
identifier in the relational database, as it was much more robust than putting
it in the MARC record and indexing it into Zebra (especially since there can be
a lag for index updates).
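
As a rough illustration of the relational approach described above, here is a
minimal sketch. It uses Python and SQLite purely for illustration; the table
name oai_identifier_map, the function names, and the sample identifier are all
hypothetical, and none of this is Koha's actual schema or the code that was
actually deployed. The point is only that a lookup against a table is
consistent as soon as the write is committed, with no wait for an index update.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) mapping table."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS oai_identifier_map (
            oai_identifier TEXT PRIMARY KEY,   -- identifier from the remote repository
            biblionumber   INTEGER NOT NULL    -- local record number
        )
        """
    )
    return conn


def link_record(conn, oai_identifier, biblionumber):
    """Store or update the mapping; committed when the 'with' block exits."""
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO oai_identifier_map "
            "(oai_identifier, biblionumber) VALUES (?, ?)",
            (oai_identifier, biblionumber),
        )


def find_biblionumber(conn, oai_identifier):
    """Return the local record number, or None if the identifier is unknown."""
    row = conn.execute(
        "SELECT biblionumber FROM oai_identifier_map WHERE oai_identifier = ?",
        (oai_identifier,),
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = open_store()
    link_record(conn, "oai:example.org:123", 42)
    # The mapping is visible immediately after the commit.
    print(find_biblionumber(conn, "oai:example.org:123"))
```

A MARC-plus-index approach would need the record to be saved and reindexed
before the same lookup could succeed.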
