[Koha-bugs] [Bug 21872] Elasticsearch indexing faster by making it multi-threaded

bugzilla-daemon at bugs.koha-community.org
Wed Nov 28 11:39:08 CET 2018


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=21872

--- Comment #19 from David Gustafsson <glasklas at gmail.com> ---
(In reply to David Cook from comment #15)
> (In reply to Joonas Kylmälä from comment #2)
> > (In reply to David Cook from comment #1)
> > > I'll just split hairs and mention that multithreading in Perl is not
> > > recommended and never really done, but you could achieve the thing by
> > > forking workers. 
> > 
> > Thanks for making the distinction.
> > 
> > > 
> > > In #10662, I use the following modules to perform rapid event-driven
> > > processing of job queues:
> > > 
> > > https://metacpan.org/pod/POE::Component::JobQueue
> > > https://metacpan.org/pod/POE::Wheel::Run
> > 
> > Parallel::ForkManager is also already used in Koha, so it would be worth
> > taking a look at whether it could be used with the indexing code, as it
> > looks super simple!
> 
> Parallel::ForkManager is only used in the tests at the moment and it's
> marked as a non-required dependency, but... it is marked as a dependency in
> Koha and I do see it in the debian/control file as well, so I suppose a
> person could use it. 
> 
> The nice thing about POE::Wheel::Run is that it uses bilateral communication
> channels between the parent and children, so you can fork off X number of
> workers and then continue to send data to the workers. Plus the event-driven
> nature of POE means that things happen really quickly. You can have the
> parent manage the queue, and have it fire off data to the children workers. 
> 
> There's even a POE::Component::* module for non-blocking HTTP requests. I
> haven't played with it myself yet, but it could also speed up indexing
> into Elasticsearch, although that would probably require not using
> Catmandu (which I think is Ere's plan in the long run anyway?).

Isn't POE event-loop based, and so runs in a single thread? If so, it would
not help at all in speeding up the indexing process (except perhaps for
committing to Elasticsearch in parallel, since that does not run in Perl).
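
To make the distinction concrete, here is a minimal sketch of the forking
approach using only core Perl (fork, pipe, waitpid). It is not Koha code:
convert_record and process_in_forked_workers are hypothetical names, and the
conversion is a trivial stand-in for the CPU-bound step. Each worker is a
separate process, so the conversion work can be spread over several cores,
which a single event loop would not do by itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the CPU-bound step (for example, converting a
# record into an Elasticsearch document). Not Koha code.
sub convert_record {
    my ($record) = @_;
    return uc $record;
}

# Split @$records into $workers slices and process each slice in its own
# forked child. Returns the total number of records the children reported.
sub process_in_forked_workers {
    my ($records, $workers) = @_;
    my @children;

    for my $w (0 .. $workers - 1) {
        # Worker $w takes every $workers-th record, starting at index $w.
        my @slice = @{$records}[ grep { $_ % $workers == $w } 0 .. $#{$records} ];

        pipe(my $reader, my $writer) or die "pipe failed: $!";
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # Child: do the work, report a count to the parent, then exit.
            close $reader;
            my $done = 0;
            for my $record (@slice) {
                convert_record($record);
                # A real worker would send the converted document to
                # Elasticsearch at this point.
                $done++;
            }
            print {$writer} "$done\n";
            close $writer;
            exit 0;
        }

        # Parent: keep the read end and go on to fork the next worker.
        close $writer;
        push @children, [ $pid, $reader ];
    }

    # Collect one count per child and reap each child process.
    my $total = 0;
    for my $child (@children) {
        my ($pid, $reader) = @$child;
        my $count = <$reader>;
        $count = 0 unless defined $count;
        chomp $count;
        close $reader;
        waitpid($pid, 0);
        $total += $count;
    }
    return $total;
}

my @records   = map { "record-$_" } 1 .. 1000;
my $processed = process_in_forked_workers(\@records, 4);
print "processed $processed records\n";
```

Parallel::ForkManager would hide most of the fork/waitpid bookkeeping above;
the sketch only spells it out to show where the parallelism comes from.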
