[Koha-devel] Replace Catmandu indexing code with pure perl and eventually drop Catmandu as a Koha dependency

Mon May 21 10:40:09 CEST 2018

I'm referring to a book return / bookdrop machine, where books are
returned when patrons put them in. With Catmandu there will be a
significant delay between each return.

Compile step was perhaps the wrong term to use. I haven't dug that deep into
what causes this, but I would guess that it's the Fix language
parsing/conversion to Perl code that has this overhead. I don't think Plack
would help. It would perhaps be possible to cache the resulting Perl code if
this is the culprit, but this would not improve the time of a full reindex
significantly since biblios are indexed in batches, and the startup
overhead will only occur once per batch.
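
To make the batching argument concrete, here is a rough cost model in Python (not Koha code). The 2-second start-up figure stands in for the "couple of seconds" mentioned in this thread; the per-record cost and batch size are made-up numbers chosen only for illustration.

```python
import math


def indexing_time(n_records, batch_size, startup_s, per_record_s):
    """Total seconds when the start-up cost is paid once per batch."""
    n_batches = math.ceil(n_records / batch_size)
    return n_batches * startup_s + n_records * per_record_s


# Assumed figures, for illustration only.
STARTUP_S = 2.0      # stands in for "a couple of seconds" of start-up overhead
PER_RECORD_S = 0.01  # hypothetical per-record indexing cost
BATCH_SIZE = 5000    # hypothetical batch size for a full reindex


def startup_share(n_records, batch_size):
    """Fraction of the total time that is spent on start-up overhead."""
    total = indexing_time(n_records, batch_size, STARTUP_S, PER_RECORD_S)
    without = indexing_time(n_records, batch_size, 0.0, PER_RECORD_S)
    return (total - without) / total


# One returned book: a "batch" of a single record.
single_return = startup_share(1, 1)
# Full reindex of a hypothetical one million biblios.
full_reindex = startup_share(1_000_000, BATCH_SIZE)

print(f"start-up share, single return: {single_return:.1%}")
print(f"start-up share, full reindex:  {full_reindex:.1%}")
```

Under these assumptions the start-up cost dominates a single return but is a small fraction of a full reindex, which is why caching the generated code would mostly help the single-document case.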

I actually used NYTProf when developing the patch and the benchmarks can be
found as attachments in the bugzilla issue.

David

On Monday, 21 May 2018, David Cook <dcook at prosentient.com.au> wrote:

> When you say “machine where you return books”, are you referring to a
> self-checkout machine or a computer where staff are checking in books
> manually?
>
>
>
> When you say “Catmandu has some kind of compile step”, what do you mean by
> “compile”? If you’re using Plack, surely we should be pre-loading Catmandu
> and thus any compilation will already have happened? Admittedly I don’t use
> Plack with Koha, so I wouldn’t know if that’s how they’re doing it, but I
> use Plack with other systems and preload all the time-consuming modules to
> speed things up.
>
>
>
> If you want to see what’s the problem exactly, I’d suggest using
> https://wiki.koha-community.org/wiki/Profiling_with_Devel::NYTProf. That
> should show you where you are losing time.
>
>
>
> David Cook
>
> Systems Librarian
>
> Prosentient Systems
>
> 72/330 Wattle St
>
> Ultimo, NSW
> 2007
>
> Australia
>
>
>
> Office: 02 9212 0899
>
> Direct: 02 8005 0595
>
>
>
> *From:* David Gustafsson [mailto:glasklas at gmail.com]
> *Sent:* Friday, 18 May 2018 7:43 PM
> *To:* David Cook <dcook at prosentient.com.au>
> *Cc:* Koha-devel at lists.koha-community.org
> *Subject:* Re: [Koha-devel] Replace Catmandu indexing code with pure perl
> and eventually drop Catmandu as a Koha dependency
>
>
>
> “the book drop machine” = machine where you return books. It does not
> matter whether Plack is used or not; Catmandu has some kind of compile step
> or similar that has a startup time of a couple of seconds. So every time one
> returned a book (and the biblio was updated and indexed) there was a delay
> of a couple of seconds, which is a major issue when returning multiple books.
>
>
>
> Best Regards
>
> David
>
>
>
> 2018-05-18 2:33 GMT+02:00 David Cook <dcook at prosentient.com.au>:
>
> I don’t do anything with Elastic or Catmandu at the moment, so I won’t
> comment about that.
>
>
>
> But you mention the overhead of Catmandu start-up. Can you speak more to
> that? What’s “the book drop machine”? Why isn’t Catmandu running in a
> persistent process?*
>
> *I say as someone who still uses Koha using CGI rather than Plack…
>
>
>
> David Cook
>
> Systems Librarian
>
> Prosentient Systems
>
> 72/330 Wattle St
>
> Ultimo, NSW
> 2007
>
> Australia
>
>
>
> Office: 02 9212 0899
>
> Direct: 02 8005 0595
>
>
>
> *From:* koha-devel-bounces at lists.koha-community.org [mailto:
> koha-devel-bounces at lists.koha-community.org] *On Behalf Of *David
> Gustafsson
> *Sent:* Thursday, 17 May 2018 11:57 PM
> *To:* Koha-devel at lists.koha-community.org
> *Subject:* [Koha-devel] Replace Catmandu indexing code with pure perl and
> eventually drop Catmandu as a Koha dependency
>
>
>
> Hi all!
>
> I have been working on replacing Catmandu-dependent indexing code with a
> simpler and faster Koha-specific one using the Search::Elasticsearch
> package (which Catmandu uses internally):
> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=19893
>
>
>
> Some of the benefits would be:
>
>
>
> 1) Increased indexing performance (about twice as fast, six times as fast
> if comparing time spent in update_index()), due to more efficient
> JSON conversion and fewer Elasticsearch requests.
>
> 2) With Catmandu, indexing speed decreases as more mappings are added; with
> the alternative algorithm, indexing speed stays more or less constant no
> matter how many mappings you add.
>
> 3) Negligible indexing start-up time. Especially noticeable when indexing
> a single document. For example we have an issue with the book drop machine,
> each return taking a couple of seconds because of the Catmandu start-up
> overhead (or when saving biblios in staff client).
>
> 4) More transparent code and less complexity compared with Catmandu
> (admittedly partly subjective statement) should lead to improved
> maintainability and increased stability.
>
> 5) No need for new developers to learn the Fix language
>
> 6) Closer to the metal so easier to perform even more Koha-specific
> optimizations and customizations which might not be feasible with Catmandu
> in the way
>
>
>
> The proposed patch only addresses the indexing logic, but the remaining
> Catmandu-dependent code (mainly for searching) should be pretty trivial to
> replace with a Search::Elasticsearch implementation, which can be done as a
> next step.
>
>
>
> Would be wonderful if this could be raised for discussion at the next
> developers meeting.
>
>
>
> Best regards
>
> David Gustafsson
>
>
>