[Koha-bugs] [Bug 22417] Delegate background jobs execution

bugzilla-daemon at bugs.koha-community.org
Thu May 16 03:53:27 CEST 2019


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=22417

--- Comment #26 from David Cook <dcook at prosentient.com.au> ---
(In reply to Juan Romay Sieira from comment #25)
> * Use version 3.5.3 or later of RabbitMQ. The one I have installed with
> Debian Jessie is 3.3.5. From version 3.5.3 you can use a plugin for delayed
> messages, so that they can be consumed later rather than immediately,
> which is how RabbitMQ normally works. I do not know how Net::RabbitFoot
> will behave in this case ...
> 

This sounds like it's probably not an option due to the version issue.

> * Use an intermediate table (or the background_jobs table, with a new column
> called exec_on) for future messages, and have a producer (cronjob) to send
> the messages to RabbitMQ at the time they need to be consumed.
> 

If you're going to use a cronjob as a producer for future messages, maybe it's
worthwhile to use a cronjob as a producer for all messages. That way there's
just one method for enqueuing messages (a rough sketch of such a producer
follows below).
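To make that concrete, here's a minimal sketch of what that cronjob producer
could look like, building on the exec_on idea from comment 25. It assumes a
background_jobs table with exec_on and status columns, a queue named
koha_tasks, and local RabbitMQ credentials; none of those details are settled
here, so treat it as illustration only.

#!/usr/bin/perl
# Hypothetical cronjob producer: publish any background_jobs rows whose
# exec_on time has arrived. The table/column names follow comment 25; the
# queue name, DSN, status values and payload shape are assumptions.
use Modern::Perl;
use DBI;
use JSON qw(encode_json);
use Net::RabbitFoot;

my $dbh = DBI->connect( 'DBI:mysql:database=koha', 'koha', 'password',
    { RaiseError => 1 } );

my $mq = Net::RabbitFoot->new()->load_xml_spec()->connect(
    host => 'localhost', port => 5672,
    user => 'guest',     pass => 'guest', vhost => '/',
);
my $channel = $mq->open_channel();
$channel->declare_queue( queue => 'koha_tasks', durable => 1 );

# Pick up jobs that are due and not yet sent to the broker.
my $jobs = $dbh->selectall_arrayref(
    q{SELECT id, type FROM background_jobs
       WHERE status = 'scheduled' AND exec_on <= NOW()},
    { Slice => {} }
);

for my $job (@$jobs) {
    $channel->publish(
        exchange    => '',
        routing_key => 'koha_tasks',
        body        => encode_json( { job_id => $job->{id}, type => $job->{type} } ),
    );
    $dbh->do( q{UPDATE background_jobs SET status = 'enqueued' WHERE id = ?},
        undef, $job->{id} );
}

$mq->close();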

My only reluctance about using a cronjob is that it relies on a system
administrator setting it up properly, and it's hard to tell from the web
interface whether it has been set up correctly, so the user experience could
suffer greatly. How would users know that the cronjob is running and that
their task will actually be run? (One possible heartbeat check is sketched
below.)
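One way to answer that would be for the cronjob to record a heartbeat that
the staff interface checks before promising the user their task will run. The
table name (task_scheduler_heartbeat) and the 15-minute threshold below are
invented for this sketch.

# Hypothetical heartbeat check: the cronjob updates a timestamp on each run,
# and the staff interface warns when that timestamp looks stale.
use Modern::Perl;
use DBI;

my $dbh = DBI->connect( 'DBI:mysql:database=koha', 'koha', 'password',
    { RaiseError => 1 } );

# In the cronjob, after a successful run:
$dbh->do(q{
    INSERT INTO task_scheduler_heartbeat (id, last_run)
    VALUES (1, NOW())
    ON DUPLICATE KEY UPDATE last_run = NOW()
});

# In the web interface, before telling the user their task is queued:
my ($minutes_since_last_run) = $dbh->selectrow_array(q{
    SELECT TIMESTAMPDIFF(MINUTE, last_run, NOW())
    FROM task_scheduler_heartbeat WHERE id = 1
});
if ( !defined $minutes_since_last_run || $minutes_since_last_run > 15 ) {
    warn "Task scheduler cronjob does not appear to be running";
}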

As a result, it seems better to run a task scheduler service, but then that
gets us back to using existing tools like Minion, Celery, etc. I suppose we
could write our own task scheduler though. 

That reminds me of something that Fridolin said to me at Kohacon18 though. From
a vendor/system administrator perspective, we don't necessarily want libraries
to be in charge of their own scheduling. For instance, say a vendor is running
30 Koha instances with a shared database server, and the librarians at each of
the 30 Koha libraries all schedule a very intensive report to run at 7pm. Or
say they schedule a task that makes API calls, and an upstream API server
rejects every call after the 10th due to rate limiting by domain.

I mention this because I already wrote my own task queue/task scheduler
(https://bugs.koha-community.org/bugzilla3/page.cgi?id=splinter.html&bug=10662&attachment=85224),
and Fridolin at BibLibre pointed out that, since each instance ran its own
task scheduler and could schedule a task for 7pm, that design could cause
exactly these problems.

I suppose in this case the timers would just be for the enqueuing of tasks. The
dequeuing and execution would still be handled by RabbitMQ and its workers, and
if you ran a low number of workers you'd be less likely to overwhelm your
systems (see the worker sketch below). If you needed better performance, you
could add workers, and add database servers for the report scenario. The rate
limiting scenario is trickier though. I don't know how to solve that one; my
only ideas involve complex home-baked solutions, and I don't know the best
practice there.
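For the "low number of workers" point, here's roughly what a worker could look
like, following the Net::RabbitFoot pattern from the RabbitMQ Perl tutorials.
With prefetch_count set to 1, each worker handles one unacknowledged message
at a time, so total concurrency is simply however many worker processes you
start. The queue name and the run_job() helper are assumptions.

#!/usr/bin/perl
# Minimal worker sketch: each worker takes one message at a time
# (prefetch_count => 1), so load on the database is bounded by how many
# worker processes you choose to run.
use Modern::Perl;
use AnyEvent;
use JSON qw(decode_json);
use Net::RabbitFoot;

my $mq = Net::RabbitFoot->new()->load_xml_spec()->connect(
    host => 'localhost', port => 5672,
    user => 'guest',     pass => 'guest', vhost => '/',
);
my $channel = $mq->open_channel();
$channel->declare_queue( queue => 'koha_tasks', durable => 1 );

# Don't hand this worker a new message until it has acked the current one.
$channel->qos( prefetch_count => 1 );

$channel->consume(
    no_ack     => 0,
    on_consume => sub {
        my ($msg) = @_;
        my $job = decode_json( $msg->{body}->{payload} );
        run_job($job);    # hypothetical: look up and execute the job
        $channel->ack(
            delivery_tag => $msg->{deliver}->{method_frame}->{delivery_tag} );
    },
);

AnyEvent->condvar->recv;    # block forever, handling one job at a time

sub run_job { my ($job) = @_; warn "Running job $job->{job_id}\n"; }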

Anyway, mostly just playing devil's advocate and trying to think of edge cases.

-- 
You are receiving this mail because:
You are watching all bug changes.

