[Koha-devel] Fwd: Call for testers: next release of Net::OAI::Harvester Perl module may break legacy custom handlers

David Cook dcook at prosentient.com.au
Mon Jan 18 23:29:35 CET 2016


Yeah, I saw that in the perl4lib email, but we don't actually use
Net::OAI::Harvester in Koha. We use HTTP::OAI for the OAI-PMH server, and
I'm using HTTP::OAI for the harvester as well (although it certainly has
some issues...).

David Cook
Systems Librarian
Prosentient Systems
72/330 Wattle St, Ultimo, NSW 2007

> -----Original Message-----
> From: koha-devel-bounces at lists.koha-community.org
> [mailto:koha-devel-bounces at lists.koha-community.org] On Behalf Of Tajoli Zeno
> Sent: Friday, 15 January 2016 11:03 PM
> To: koha-devel <koha-devel at lists.koha-community.org>
> Subject: [Koha-devel] Fwd: Call for testers: next release of
> Net::OAI::Harvester Perl module may break legacy custom handlers
> 
> 
> 
> 
> -------- Forwarded Message --------
> Subject: Call for testers: next release of Net::OAI::Harvester Perl module
> may break legacy custom handlers
> Date: Fri, 15 Jan 2016 12:38:17 +0100
> From: Thomas Berger <ThB at Gymel.com>
> Reply-To: ThB at Gymel.com
> Organization: Gymel.com
> To: Perl for libraries <perl4lib at perl.org>, Using XML in libraries
> <xml4lib at listserv.nd.ed>
> 
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> [Apologies for cross-posting]
> 
> The Perl module Net::OAI::Harvester implements a client framework for the
> OAI Protocol for Metadata Harvesting (OAI-PMH) and was authored and
> originally maintained by Ed Summers. It has been available on CPAN
> since 2003, and its last stable version, 1.15, was released almost four
> years ago:
>           < http://search.cpan.org/~thb/OAI-Harvester-1.15/ >.
> 
> Since one of the repositories used for testing vanished from the web some
> time ago, and this is breaking the test suite, a new version has to be
> released fairly soon.
> 
> Over time I have tackled various minor issues and published developer
> releases on CPAN; cf. the list at the end of this mail or the Changes
> document linked from the CPAN page for the current release 1.16_12:
> < http://search.cpan.org/~thb/OAI-Harvester-1.16_12/ >
> 
> However, the sum of these changes is not negligible, and specifically their
> impact on "custom metadata handlers" (which are to be used when
> processing metadata formats other than oai_dc) may affect applications
> using the module:
> 
> >>>>>
> Up to version 1.15 the metadataHandler was inconsistently fed with input:
> 
>   - GetRecord exposed the almost complete XML response to the Handler
>     (including start_document/end_document events)
>   - ListRecords exposed the (OAI)record element (header, metadata and
>     optional about containers) but did not propagate start_document or
>     end_document events.
> 
> In both cases the events for the header tag itself and for the optional
> setSpec subelements were not forwarded.
> 
> Version 1.20 introduces a modified behavior for metadataHandler and an
> additional recordHandler:
>   - a metadataHandler will see only the (single) subelement of the OAI
>     metadata element (so for a deleted record it might never be invoked
>     at all)
>   - a recordHandler will see the OAI record element and its subelements
> 
> Therefore a metadataHandler will now be confined to the metadata
> fragment(s) of the response, and the new recordHandler approximates the
> old behavior of ListRecords. However, OAI-PMH:identifier and
> OAI-PMH:datestamp will now be properly encapsulated within their
> OAI-PMH:header element.
> 
> Additionally, two new methods responseDate() and request() allow access to
> the corresponding top-level OAI-PMH elements in all response types.
> A SAX filter of class Net::OAI::Record::DocumentHelper may be used to
> inject start_document and end_document events into the chain if they are
> needed.
> 
> As a temporary measure, you may set
>     $Net::OAI::Harvester::OLDmetadataHandler = 1
> to change the behavior of handlers passed as "metadataHandler" into that
> of a recordHandler.
> <<<<<<
> 
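To make the handler change above concrete, here is a minimal sketch in plain Perl. It does not use Net::OAI::Harvester at all: the event list, the route_events() function and all variable names are hypothetical, and the element names are a simplified stand-in for one OAI-PMH record. It only models which start-element events a recordHandler and a metadataHandler would see under the 1.20 semantics as described in the quoted text.

```perl
use strict;
use warnings;

# Hypothetical, simplified stand-in for the SAX events of one OAI-PMH
# <record>: [ event type, element name ] pairs.
my @events = (
    [ start => 'record' ],
    [ start => 'header' ],
    [ start => 'identifier' ], [ end => 'identifier' ],
    [ start => 'datestamp' ],  [ end => 'datestamp' ],
    [ end   => 'header' ],
    [ start => 'metadata' ],
    [ start => 'oai_dc:dc' ],
    [ start => 'dc:title' ],   [ end => 'dc:title' ],
    [ end   => 'oai_dc:dc' ],
    [ end   => 'metadata' ],
    [ end   => 'record' ],
);

# Return two array refs: the start-element names a recordHandler would
# see, and those a metadataHandler would see (assumed 1.20 behavior).
sub route_events {
    my @stream = @_;
    my ( @record_seen, @metadata_seen );
    my ( $in_record, $in_metadata ) = ( 0, 0 );

    for my $ev (@stream) {
        my ( $type, $name ) = @$ev;

        # the recordHandler's view includes the <record> element itself
        $in_record = 1 if $type eq 'start' && $name eq 'record';

        # leaving <metadata> closes the metadataHandler's fragment
        $in_metadata = 0 if $type eq 'end' && $name eq 'metadata';

        if ( $type eq 'start' ) {
            push @record_seen,   $name if $in_record;
            push @metadata_seen, $name if $in_metadata;
        }

        # the metadataHandler's view starts *below* <metadata>
        $in_metadata = 1 if $type eq 'start' && $name eq 'metadata';
        $in_record   = 0 if $type eq 'end'   && $name eq 'record';
    }
    return ( \@record_seen, \@metadata_seen );
}

my ( $record_seen, $metadata_seen ) = route_events(@events);
print "recordHandler:   @$record_seen\n";
print "metadataHandler: @$metadata_seen\n";
```

For a deleted record (header only, no metadata element) the second list comes back empty, which matches the remark that a metadataHandler might never be invoked at all.
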
> Obviously the change of semantics for a metadataHandler to deal with the
> "metadata" elements of the response instead of the "record" elements is a
> design decision and may be questioned by users of the module.
> 
> The current version also contains several changes which solve deficiencies
> of Net::OAI::Harvester 1.15 but possibly break existing workarounds for
> these deficiencies. For example, officially (per documentation) you never
> could access the responseDate of the OAI-PMH result, but due to a sloppy
> implementation of processing for the identify verb it was possible to
> extract it in this case by an undocumented method. The current version
> supplies a dedicated responseDate() accessor for all verbs but at the same
> time fixes the behavior in the identify case.
> 
> I may be overly optimistic, but my impression is that the changes between
> the current 1.15 and the coming version (most probably numbered 1.20) do
> actually fix many issues. Still, the fear is realistic (I experienced that
> myself with an old application of mine using the module) that these fixes
> may conflict with workarounds introduced by users to make things work
> before.
> 
>   ***
> 
> So please, if you are currently using Net::OAI::Harvester *and* have been
> forced to introduce workarounds or tweak internals of the module, perform
> thorough testing before upgrading to the coming stable version, preferably
> already now with the developer version 1.16_12.
> 
> And, please, please: provide feedback if you should run into trouble,
> either via the CPAN request tracker for the module at
> < https://rt.cpan.org/Public/Dist/Display.html?Name=OAI-Harvester > or by
> direct mail.
> 
> Sorry for the inconvenience
> Kind regards
> Thomas Berger
> 
> 
> Changes to Net::OAI::Harvester since version 1.15
> 
> 1.16_12  Tue, Jan 12 00:20:05 CET 2016
> - dealing with CPANTS Kwalitee issues, esp. version number mess
> - new filter class Net::OAI::Record::DocumentHelper for tweaking
> 
> 1.16_11  Tue, Jan 12 00:20:05 CET 2016
> - minor cleanup
> 
> 1.16_10  Mon, Jan 11 01:29:46 CET 2016
> - renamed alldata() method for accessing recordHandler results
>   to recorddata()
> - better propagation of namespace prefix mapping events
> - Net::OAI::NamespaceFilter with a result() method
> - Net::OAI::NamespaceFilter tested with XML::SAX::Writer
> - AUTHOR formatting
> 
> 1.16_09  Sun, Feb 14 17:29:39 CET 2014
> - Net::OAI::NamespaceFilter as kind of generic metadata handler
> - Queries are now constructed based on a copy of the Harvester's
>   baseURL
> - pass parameters to URI->query_form() more reproducibly
>   (esp. "verb" should now always be first to accommodate some
>   allegedly broken repositories)
> - temporary? tests for correctness of LWP operations
> 
> 1.16_07  Tue, Apr 30 01:26:40 CEST 2013
> - added new methods: response(), responseDate(), error()
> - Smoke tests still failed on 'Bad Host' tests (wrong error codes
>   induced by HTTP proxies?)
> - aligned behavior of metadataHandler for listRecords() and
>   getRecord()
> - introducing alternative recordHandler for listRecords() and
>   getRecord()
> - removed erroneous resumptionToken handling for identify()
> 
> 1.16_04  Fri Dec  7 09:49:03 CET 2012
> - consider HTTP proxies in design of t/003.error.t
> - 'Bad Host' tests failing b/c error code 500 is not the expected
>   code 404 (due to some recent change in LWP)?
> 
> 1.16_01  Mon Apr  2 23:14:35 CEST 2012
> - Modules were not namespace aware.
> - Add HTTPRetryAfter() method (catches HTTP Retry-After header)
> - Check responses for Content-Type and charset before parsing
> - Net::OAI::Header handed up (empty) header elements and other stuff
>   to the request's metadataHandler
> - SKIP tests when HTTP errors are encountered
> 
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1
> Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/
> 
> iJwEAQECAAYFAlaY2iYACgkQYhMlmJ6W47MMCwP/Yhij11TfEL1dfYtimdXG8h
> kf
> FYLvvXECzECPxKbHIC0dKvf5v4myW8oedlK3B+oOzIjjOY60pT7pdC4KB/xgU+a
> 1
> N1djewSgT4hJ3IoacmUkLpnh81NSM1oA0osw48qVco4qpxDOY2HrR3bdBZksK
> BcI
> lQH10kIYqo/TZYGXHYQ=
> =v03A
> -----END PGP SIGNATURE-----
> 



