[Koha-devel] Task schedulers and message queues for Koha

David Cook dcook at prosentient.com.au
Mon Feb 20 02:01:41 CET 2017


Hi all,

 

In 2016, I worked on a Koha task scheduler for downloading and importing
records via OAI-PMH. I have code which works, but it's lacking test coverage
and I'm unsure that it will make it through QA and be accepted.

 

I recall Chris Cormack suggesting I look at Gearman (http://gearman.org/)
instead, and it looks pretty good at a glance, although Andreas and I had
talked about having more control over the workers than Gearman seems to
offer. Plus, it would add another dependency to Koha, where people already
struggle with dependencies.

 

Martin suggested that there were a lot of other implementations out there,
but I haven't really found much else that provides everything we want out of
the box. The one I have found is Celery (http://www.celeryproject.org/),
which I think looks like exactly what I want, but it requires Python for its
server and workers, and it requires a message broker like RabbitMQ or Redis.
That's a lot of extra dependencies.

 

Since my goal is to get our code into Koha, I really want to know what will
work for people. Are people happy with a home-grown solution? It's not
really that complicated.

 

My current version is essentially a task scheduler which forks a worker on
demand when it's time to run a task (up to a configurable max of X tasks so
you don't kill your server). It lets you submit tasks, tell the scheduler to
start/schedule them, and even tell in-progress tasks to stop: the scheduler
tells the worker to stop, and the worker decides where in its task it checks
for stop commands from the scheduler.
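
To make the behaviour above concrete, here is a rough sketch of the idea in
Python rather than Perl. It is not the actual Koha code: the names
TaskScheduler, submit, tick, stop and harvest are all made up for
illustration, and threads stand in for forked worker processes to keep the
example short. It shows a cap on concurrent workers and a stop request that
the worker checks at points of its own choosing.

```python
import threading
import time


class TaskScheduler:
    """Starts one worker per due task, up to max_workers at a time.

    Hypothetical sketch: threads stand in for forked worker processes.
    """

    def __init__(self, max_workers=2):
        self.max_workers = max_workers
        self.pending = []   # (run_at, name, func) tuples waiting to start
        self.running = {}   # name -> (thread, stop_event)

    def submit(self, name, func, run_at=0.0):
        self.pending.append((run_at, name, func))

    def tick(self, now=None):
        """Start due tasks while worker slots are free; return their names."""
        now = time.time() if now is None else now
        # Forget workers that have already finished.
        self.running = {n: (t, e) for n, (t, e) in self.running.items()
                        if t.is_alive()}
        self.pending.sort(key=lambda item: item[0])
        started, remaining = [], []
        for run_at, name, func in self.pending:
            if run_at <= now and len(self.running) < self.max_workers:
                stop = threading.Event()
                thread = threading.Thread(target=func, args=(stop,),
                                          daemon=True)
                self.running[name] = (thread, stop)
                thread.start()
                started.append(name)
            else:
                remaining.append((run_at, name, func))
        self.pending = remaining
        return started

    def stop(self, name):
        """Ask a running task to stop, then wait for it to exit."""
        thread, stop = self.running[name]
        stop.set()
        thread.join()


def harvest(stop):
    """Hypothetical task that checks for a stop request between batches."""
    for _batch in range(1000):
        if stop.is_set():
            return
        time.sleep(0.01)  # stand-in for downloading one batch of records
```

With max_workers=2 and three submitted tasks, the first tick starts two of
them and leaves the third pending until a slot frees up. The worker, not the
scheduler, decides how often it looks at the stop flag, which is the
trade-off described above.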

 

As I try to add test coverage and make this scheduler more palatable, I find
myself thinking about the code more in terms of Koha::Scheduler and
Koha::Queue, and using the more scalable worker model used by others. The
scheduler daemon would listen on a socket for tasks and create a
Koha::Scheduler instance, which would enqueue each task once its scheduled
time was met or exceeded. Depending on the architecture, you could have a
separate daemon, or the same daemon, holding a Koha::Queue instance. It would
accept tasks/messages from the scheduler and dequeue them to available
workers, which are separate processes that have previously registered against
particular queues. In this way, you can have an oaipmh-download queue, an
oaipmh-import queue, an email-report queue, etc.
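
A minimal in-process sketch of that split might look like the following. It
is written in Python purely for illustration; the Scheduler and Queue classes
and their methods are hypothetical stand-ins for Koha::Scheduler and
Koha::Queue, not an existing API, and plain functions stand in for the
separate worker processes. Sockets and daemons are left out entirely.

```python
import heapq
from collections import defaultdict, deque


class Queue:
    """Named queues; workers register against the queue they serve."""

    def __init__(self):
        self.tasks = defaultdict(deque)   # queue name -> waiting payloads
        self.workers = defaultdict(list)  # queue name -> registered workers

    def register(self, queue_name, worker):
        self.workers[queue_name].append(worker)

    def enqueue(self, queue_name, payload):
        self.tasks[queue_name].append(payload)

    def dispatch(self):
        """Hand each waiting task to a worker registered for its queue."""
        results = []
        for queue_name, waiting in self.tasks.items():
            workers = self.workers.get(queue_name, [])
            turn = 0
            # Tasks on a queue with no registered worker simply keep waiting.
            while waiting and workers:
                worker = workers[turn % len(workers)]  # simple round robin
                results.append(worker(waiting.popleft()))
                turn += 1
        return results


class Scheduler:
    """Holds tasks until their run time, then passes them to the queue."""

    def __init__(self, queue):
        self.queue = queue
        self.heap = []
        self.counter = 0  # tie-breaker so payloads are never compared

    def schedule(self, run_at, queue_name, payload):
        heapq.heappush(self.heap,
                       (run_at, self.counter, queue_name, payload))
        self.counter += 1

    def tick(self, now):
        """Enqueue every task whose run time is met or exceeded."""
        moved = 0
        while self.heap and self.heap[0][0] <= now:
            _, _, queue_name, payload = heapq.heappop(self.heap)
            self.queue.enqueue(queue_name, payload)
            moved += 1
        return moved
```

Here a worker registered for oaipmh-download only ever sees tasks from that
queue, and a task scheduled for a queue with no registered worker (say
email-report) just waits until one registers.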

 

I suspect that we could make use of Koha::Scheduler and Koha::Queue
throughout much of Koha for doing background tasks. If we don't want to
reinvent the wheel with Koha::Queue, we could use something like RabbitMQ.
But I think we need *something*.

 

I'm open to ideas. I already have the OAI-PMH download and OAI-PMH import
handled. That's the easy part. Any worker can do that. The hard part is
figuring out how the Koha Community will take up a task scheduler. 

 

David Cook

Systems Librarian

Prosentient Systems

72/330 Wattle St

Ultimo, NSW 2007

Australia

 

Office: 02 9212 0899

Direct: 02 8005 0595

 
