[Koha-bugs] [Bug 10662] Build OAI-PMH Harvesting Client

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Fri Sep 14 19:59:23 CEST 2018


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=10662

--- Comment #180 from David Cook <dcook at prosentient.com.au> ---
(In reply to Christopher Davis from comment #179)
> David,
> 
> I am impressed with your work on this- so much done in so little time. I
> think that enabling Koha to harvest metadata is a grand idea, so I have been
> following this bug. I heard that Koha's new indexer, Elastic Search, can
> index non-MARC metadata- is this true? If this is true, then I hope that you
> do not mind me asking you a question: this bug patch would have Koha's
> OAI-PMH harvester ingest only metadata which has been crosswalked to
> MARCXML; however, why stick with only MARCXML when Elastic Search can index
> the incoming metadata in its native schema (Dublin Core, TEI, EAD, etc.)?
> Please pardon my ignorance, but am I missing something?
> 
> Thank you,
> 
> Christopher Davis

Hi Christopher, 

It feels like a very long time to me, but thank you very much! 

Neither ElasticSearch nor Zebra requires MARC per se, so either of
them *could* index non-MARC metadata. However, Koha itself is a MARC-driven
system, so the limitation you've observed with the OAI-PMH harvester is because
of Koha's internals (and how we've chosen to index Koha's metadata).

Indexing is certainly one part of the issue. I'm not 100% familiar with the
implementation of ElasticSearch, but I think we're using MARC with that as
well. I think the solution would be to have a generic schema that we use for
indexing/searching so that we could map any metadata format to Koha's generic
index schema. Then we'd have a mapping from any metadata format to a generic
Koha display format.
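
As a rough illustration of the generic index schema idea, here is a minimal Python sketch in which each metadata format gets its own mapper into one shared set of index fields. Everything in it is an assumption made for illustration: the field names ("title", "author"), the schema keys ("dc", "marc21"), and the function names are hypothetical and are not Koha's actual index configuration or code.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for Dublin Core elements and MARCXML.
DC_NS = "http://purl.org/dc/elements/1.1/"
MARC_NS = "http://www.loc.gov/MARC21/slim"


def index_dublin_core(xml_text):
    """Map a Dublin Core record to the (hypothetical) generic index fields."""
    root = ET.fromstring(xml_text)
    return {
        "title": [e.text for e in root.iter(f"{{{DC_NS}}}title")],
        "author": [e.text for e in root.iter(f"{{{DC_NS}}}creator")],
    }


def index_marcxml(xml_text):
    """Map a MARCXML record to the same generic index fields."""
    root = ET.fromstring(xml_text)

    def subfields(tag, code):
        values = []
        for field in root.iter(f"{{{MARC_NS}}}datafield"):
            if field.get("tag") != tag:
                continue
            for sf in field.iter(f"{{{MARC_NS}}}subfield"):
                if sf.get("code") == code:
                    values.append(sf.text)
        return values

    return {"title": subfields("245", "a"), "author": subfields("100", "a")}


# One mapper per schema; supporting a new format means adding one entry here.
MAPPERS = {"dc": index_dublin_core, "marc21": index_marcxml}


def to_generic_index(schema, xml_text):
    """Dispatch on the record's schema and return a generic index document."""
    return MAPPERS[schema](xml_text)
```

The point of the design is that the indexer and the search code only ever see the generic fields, so they would not need to know which format a record arrived in.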

The other issue would be how Koha handles "records" for other purposes:
deleting bibs, modifying bibs, acquisitions, subscriptions, etc. But really the
first step is just being able to store a non-MARC bib, index it, and display
it.

Basically... lots of work needs to be done and someone just needs to start it.
I think the first step will be creating non-MARC bibliographic records which
store non-MARC metadata in the biblio_metadata table. And the first step
to doing that would be changing the "marcflavour" column in that table to
"schema".

There's a lot of work to do there, but I think it's very important work!
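
To make the biblio_metadata step concrete, here is a minimal sketch using Python's built-in sqlite3 module. It assumes a simplified, hypothetical version of the table in which the column is already called "schema". The column list, the schema values ("MARC21", "DC", "RDF"), and the helper functions are illustrative assumptions, not Koha's real table definition or API.

```python
import sqlite3


def open_store():
    """Create an in-memory database with a simplified, hypothetical table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE biblio_metadata (
            id           INTEGER PRIMARY KEY,
            biblionumber INTEGER NOT NULL,
            format       TEXT NOT NULL,   -- serialization, e.g. 'marcxml'
            "schema"     TEXT NOT NULL,   -- metadata schema, e.g. 'MARC21', 'DC'
            metadata     TEXT NOT NULL    -- the record itself
        )
        """
    )
    return conn


def store_record(conn, biblionumber, fmt, schema, metadata):
    """Store one record together with the schema it is expressed in."""
    conn.execute(
        'INSERT INTO biblio_metadata (biblionumber, format, "schema", metadata) '
        "VALUES (?, ?, ?, ?)",
        (biblionumber, fmt, schema, metadata),
    )


def biblionumbers_for_schema(conn, schema):
    """Return the biblionumbers of all records stored in the given schema."""
    rows = conn.execute(
        'SELECT biblionumber FROM biblio_metadata WHERE "schema" = ? '
        "ORDER BY biblionumber",
        (schema,),
    )
    return [row[0] for row in rows]
```

With a schema-neutral column name, MARC and non-MARC records can sit side by side in the same table, and downstream code can pick the right parser by looking at that column.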

And it's work which would be great to do, since the Swedish Union Catalogue is
using RDF. It would be great if we could just download the Swedish Union
Catalogue metadata in its native format and work with that for indexing and
display.


