[Koha-bugs] [Bug 30654] Even with RabbitMQ enabled, we should poll the database for jobs at worker startup

bugzilla-daemon at bugs.koha-community.org
Thu Jul 28 04:05:35 CEST 2022


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=30654

--- Comment #16 from David Cook <dcook at prosentient.com.au> ---
(In reply to Tomás Cohen Arazi from comment #13)
> (In reply to David Cook from comment #10)
> > 4. Send message to a durable RabbitMQ queue
> 
> What is the 'durable RabbitMQ queue' in this context?

"Durable (the queue will survive a broker restart)"
(https://www.rabbitmq.com/queues.html)
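
To make that concrete, here's a rough sketch of declaring a durable queue and
publishing a persistent message with Net::AMQP::RabbitMQ (illustration only:
Koha actually talks to RabbitMQ over STOMP via Net::Stomp, and the queue name
here is made up):

    use strict;
    use warnings;
    use Net::AMQP::RabbitMQ;

    my $mq = Net::AMQP::RabbitMQ->new();
    $mq->connect( 'localhost', { user => 'guest', password => 'guest' } );
    $mq->channel_open(1);

    # durable => 1: the queue definition survives a broker restart
    $mq->queue_declare( 1, 'koha_background_jobs',
        { durable => 1, auto_delete => 0 } );

    # delivery_mode => 2: the message itself is persisted to disk as well
    $mq->publish( 1, 'koha_background_jobs', '{"job_id":42}',
        { exchange => '' }, { delivery_mode => 2 } );

    $mq->disconnect();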

> In order for using RabbitMQ to make sense, I feel like we need:
> 
> - A task queue manager (we have the table, just missing a process that polls
> it and sends 'the message')
> - A way to dispatch the message to notify workers about it (Yay, RabbitMQ)
> - A worker that reads the message and acts accordingly
> 
> Questions:
> 
> - how does a worker report back if it failed? Probably directly to the DB?
> This is a case for a task queue manager, which would read it and make the
> decision that it is time to retry.
> - if we had 2 workers waiting for 'index' jobs through the mq, how do we
> pick which process takes the job? how does the system know it is being
> processed by whom? what if the process dies and we need to assign a new
> worker? That's where I was going with the PID thing. Because I was thinking
> about things locally (wrong), but the case still stands with this question.
> Who is running what?

- If a worker fails, it needs to record that in the "result store" (which in
this case is the database). 
- Why do you need to know which worker processes which message? It shouldn't
matter. If the worker dies, the MQ broker would detect the broken connection
and give that message to the next worker that connects and asks for a message. 
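
In other words, the worker only acks a message once it has dealt with it, and
anything unacked when the connection drops goes back on the queue. A rough
Net::Stomp sketch of that loop (process_job() and mark_job_failed() are made-up
helpers, not existing Koha code):

    use strict;
    use warnings;
    use Net::Stomp;

    my $stomp = Net::Stomp->new( { hostname => 'localhost', port => 61613 } );
    $stomp->connect( { login => 'guest', passcode => 'guest' } );

    # ack => 'client': the broker keeps the message until we explicitly ack it
    $stomp->subscribe( { destination => '/queue/koha-long_tasks', ack => 'client' } );

    while (1) {
        my $frame = $stomp->receive_frame;
        next unless $frame;

        my $ok = eval { process_job( $frame->body ); 1 };
        # Record any failure in the "result store" (the database)
        mark_job_failed( $frame->body, $@ ) unless $ok;

        # Ack either way; if this process dies before acking, the broker sees
        # the broken connection and redelivers to the next worker that asks
        $stomp->ack( { frame => $frame } );
    }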

> I personally don't see a case in which RabbitMQ is useful right now. At
> least not while we still haven't implemented a way for tasks that failed
> to get notified through the mq to be retried, scheduled, etc.

It's not the job of the message broker to handle scheduling. There needs to be
a separate scheduler. That can be a cronjob or a daemon; it's whatever we want
it to be. (The advantage of the Minion daemon is that it already exists and we
wouldn't need to write it.)
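
As a sketch of what that separate scheduler could look like as a cronjob (the
table, column names and statuses below are assumptions for illustration, not
necessarily the current background_jobs schema):

    use strict;
    use warnings;
    use DBI;
    use Net::Stomp;

    my $dbh = DBI->connect( 'dbi:mysql:database=koha', 'koha_user', 'password',
        { RaiseError => 1 } );
    my $stomp = Net::Stomp->new( { hostname => 'localhost', port => 61613 } );
    $stomp->connect( { login => 'guest', passcode => 'guest' } );

    # Pick up jobs that failed (or never got a message) and re-enqueue them
    my $jobs = $dbh->selectall_arrayref(
        q{SELECT id, queue FROM background_jobs WHERE status = 'failed'},
        { Slice => {} }
    );

    for my $job (@$jobs) {
        $stomp->send( { destination => "/queue/koha-$job->{queue}",
                        body        => $job->{id} } );
        $dbh->do( q{UPDATE background_jobs SET status = 'new' WHERE id = ?},
            undef, $job->{id} );
    }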

Are you having problems with retries? RabbitMQ should automatically retry. Or
are you referring to this scenario:
https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=30654#c0

RabbitMQ is useful right now because it provides a standard interface for
passing messages between separate processes. It's the only reason we're able to
process asynchronous jobs in a reasonable way. (Forking CGI processes for the
old-school background jobs was a major barrier to using Plack 100%.)

(In reply to Martin Renvoize from comment #0)
> Our worker currently starts up and immediately tries to listen for jobs
> being passed via STOMP.  However, if RabbitMQ wasn't running when the tasks
> were enqueued, then the worker will never know about them.

Why wouldn't RabbitMQ be running? I'd argue that we shouldn't be saving jobs in
the database if the message broker (which is a component of the overall Koha
system) is unavailable. 

To me, it sounds like a Koha design problem rather than a RabbitMQ problem. If
the database weren't available, we wouldn't try saving a change to the file
system on our own. We'd throw an exception saying that we can't save the
record. 

I'm in the process of implementing a background job system using RabbitMQ on
another Perl app, and that's the process I'll be following. First I check the
RabbitMQ connection; if it's good, I insert the job into the DB and commit it
(which is necessary to avoid a race condition), then send the message to
RabbitMQ. If the message sends, there's nothing more to do but tell the user
that their job is in progress. If it doesn't send, I update the job in the
database as failed (or you could delete the job) and tell the user that there
was an error. 
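
As a rough sketch of that flow (the helper names here are hypothetical, not
Koha's actual API):

    sub enqueue_job {
        my ( $dbh, $stomp, $job_type, $payload ) = @_;

        # 1. Check the broker connection first; refuse to enqueue if it's down
        die "Message broker unavailable\n" unless broker_is_alive($stomp);

        # 2. Insert the job row and commit *before* sending the message, so a
        #    fast worker can't receive the message and find no job row yet
        my $job_id = insert_job( $dbh, $job_type, $payload );
        $dbh->commit;

        # 3. Send the message; on failure, mark the job failed (or delete it)
        my $sent = eval {
            $stomp->send( { destination => "/queue/$job_type", body => $job_id } );
            1;
        };
        return $job_id if $sent;    # job is in progress, nothing more to do

        mark_job_failed( $dbh, $job_id, $@ );
        die "Could not enqueue job $job_id: $@\n";
    }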

I'm not saying that's necessarily the only right way to do it, or that we have
to keep RabbitMQ; that's just my interpretation of the situation. 

Totally not opposed to yanking out RabbitMQ and replacing it with Minion. I
think we'd gain a lot of functionality for free by doing that, but I suppose I
want to make sure we're not unfairly condemning RabbitMQ either, when I think
we're the reason RabbitMQ might not be working the way we want it to.

-- 
You are receiving this mail because:
You are watching all bug changes.

