[Koha-bugs] [Bug 32481] Rabbit times out when too many jobs are queued and the response takes too long

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Thu Dec 22 00:25:50 CET 2022


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=32481

--- Comment #11 from David Cook <dcook at prosentient.com.au> ---
It's an interesting topic though...

https://docs.celeryq.dev/en/latest/faq.html#should-i-use-retry-or-acks-late

It looks like Celery workers acknowledge messages as soon as they receive them,
and then they do the work.

They provide an "acks_late" option for tasks where crashes would be a problem.
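As a minimal sketch of that option (the app name, broker URL, and task names
here are hypothetical, not from Koha or the bug report):

```python
from celery import Celery

app = Celery("tasks", broker="amqp://localhost")

# Default behaviour: the message is acked when the worker receives it,
# before the task body runs, so a worker crash loses the message.
@app.task
def quick_job():
    ...

# acks_late=True: the message is acked only after the task completes,
# so a crash mid-task causes redelivery. The task should therefore be
# idempotent, since it may run more than once.
@app.task(acks_late=True)
def critical_job():
    ...
```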

--

Maybe it is better to ack early so that we can handle longer-running tasks, and
then treat failure scenarios more as edge cases...

It's probably more likely that you'll have a long-running task than an
unexpected crash. 

I think if there is a crash (say a VM power outage rather than a fatal error
executing a Perl function), the job should get stuck in "started" state as
well? That would allow sysadmins to deal with the situation, or we could have a
cronjob that "times out" jobs that have been in "started" state for longer than
X time.
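A sketch of what such a cleanup cronjob could do, assuming jobs are simple
records with an id, status, and start timestamp (in Koha these would be rows in
the background jobs table; the field names and the 4-hour threshold here are
illustrative assumptions):

```python
from datetime import datetime, timedelta, timezone

# "X time" from the comment above; an assumed threshold, not a Koha default.
STALE_AFTER = timedelta(hours=4)

def time_out_stale_jobs(jobs, now=None):
    """Mark jobs stuck in 'started' longer than STALE_AFTER as failed.

    Returns the ids of the jobs that were timed out so the cronjob
    can log them for sysadmins.
    """
    now = now or datetime.now(timezone.utc)
    timed_out = []
    for job in jobs:
        if job["status"] == "started" and now - job["started_on"] > STALE_AFTER:
            job["status"] = "failed"
            job["note"] = "timed out by cleanup cronjob"
            timed_out.append(job["id"])
    return timed_out
```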

--

In that way, we still have timeouts, but we're putting those timeouts into the
application rather than relying on RabbitMQ's...

-- 
You are receiving this mail because:
You are watching all bug changes.
You are the assignee for the bug.
