[Koha-devel] The many failings of background_jobs_worker.pl
Philippe Blouin
philippe.blouin at inlibro.com
Wed Dec 21 14:58:40 CET 2022
Good evening, David,
Thanks for the response. Yours and David's and Michael's. I feel less
alone...
I validated, and yes, all the patches you refer to are in our pile. And
until the problems arose, there were no customizations around that code.
So yeah, even at 22.05.06, I get the JSON error and the race condition
(we use ES). And the _abandoned_ children. So I surmise, or dare I
say postulate, that those issues are not as resolved as some would presume.
I will revert background_jobs_worker.pl to its default, and shut down MQ
everywhere, for now. :(
Philippe Blouin,
Directeur de la technologie
Tél. : (833) 465-4276, poste 230
philippe.blouin at inLibro.com
inLibro | pour esprit libre | www.inLibro.com <http://www.inLibro.com>
On 2022-12-20 17:55, David Cook wrote:
>
> Salut Philippe,
>
> That first issue should’ve been resolved in 22.05.00 by
> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=30172. I
> haven’t had any problems like that since applying that patch. Are you
> running Koha with or without customizations?
>
> As you say, bug 30654 discusses that second issue. And I obviously
> have my own opinion on that one 😉.
>
> That JSON issue should be fixed by Bug 31351 in Koha 22.05.06 as well
> I believe: https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=31351
>
> --
>
> The only issue I’ve had with the background jobs has been the one
> covered by Bug 30172. Otherwise, it’s been all fine for me, although I
> use Zebra rather than Elasticsearch. I think part of the reason I
> haven’t had issues is that I haven’t had many people using the
> background jobs either though.
>
> I’m actually planning on writing a background job system based on
> RabbitMQ for a different non-Koha system. The main difference is that
> I’ll reject or fail tasks where messages aren’t sent to RabbitMQ. I
> think that’ll make my system a bit more robust than Koha’s.
>
> The problem with the background jobs at the moment is that we haven’t
> fully committed to RabbitMQ. We’re trying to do this weird hybrid with
> the database fallback which is not the right direction in my mind. We
> should do one or the other but not try to do both.
>
> But that’s just my 2 cents.
>
> David Cook
> Senior Software Engineer
> Prosentient Systems
> Suite 7.03
> 6a Glen St
> Milsons Point NSW 2061
> Australia
>
> Office: 02 9212 0899
> Online: 02 8005 0595
>
> *From:* Koha-devel <koha-devel-bounces at lists.koha-community.org> *On
> Behalf Of* Philippe Blouin
> *Sent:* Wednesday, 21 December 2022 6:13 AM
> *To:* koha-devel at lists.koha-community.org
> *Subject:* [Koha-devel] The many failings of background_jobs_worker.pl
>
> Howdy!
>
> Since moving a lot of our users to 22.05.06, we've installed the
> worker everywhere. But the number of issues encountered is staggering.
>
> The first one was
>
> Can't call method "process" on an undefined value
>
> where the id received from MQ was not found in the DB, and the process
> is going straight to process_job and failing. Absolutely no idea how
> that occurs, seems completely counterintuitive (the ID comes from the
> DB after all), but here it is. Hacked the code to add a "sleep 1" to
> fix most of that one.
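
For illustration, a bounded retry is a somewhat safer variant of the "sleep 1" hack described above: it polls a few times for the job instead of pausing once unconditionally. This is only a sketch, not the actual change. `fetch_with_retry` and the `$lookup` callback are hypothetical names; in the worker, the callback would wrap whatever performs the database lookup by job id.

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);    # core module; allows fractional-second sleeps

# Hypothetical helper: call $lookup->() until it returns a defined value or
# $max_tries attempts have been made, pausing $delay seconds between attempts.
sub fetch_with_retry {
    my ( $lookup, $max_tries, $delay ) = @_;

    for my $attempt ( 1 .. $max_tries ) {
        my $job = $lookup->();
        return $job if defined $job;

        # No pause after the final attempt; just give up.
        sleep($delay) if $attempt < $max_tries;
    }

    return;    # the caller decides what to do with a job that never appeared
}
```

If the job still cannot be found after the last attempt, the helper returns undef, so the caller can log and skip the message rather than call a method on an undefined value.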
>
> Then came the fact that stored events were not checked if the
> connection to MQ was successful at startup. Bug 30654 refers to it.
> Hacked a little "$init" in there to clear that up at startup.
>
> Then came the
>
> malformed UTF-8 character in JSON string, at character offset 296
> (before "\x{e9}serv\x{e9} au ...")
>
> at decode_json that crashes the whole process. And for some reason,
> it never gets over it, gets the same problem at every restart, like
> the event is never "eaten" from the queue. Hacked an eval then a
> try-catch over it...
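
A minimal sketch of the kind of guard mentioned above, so that one undecodable message is logged and skipped instead of crashing the worker. It uses the core JSON::PP module to stay self-contained (an assumption; the worker's own JSON handling may differ). `safe_decode_body` is a hypothetical name, and the Latin-1 fallback is only a guess based on the `\x{e9}` bytes in the error message.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);    # core module, used to keep the sketch self-contained
use Encode qw(decode encode);

# Hypothetical helper: returns the decoded structure, or undef if the
# message body cannot be parsed at all.
sub safe_decode_body {
    my ($body) = @_;
    my $data;

    # First attempt: treat the body as UTF-8 encoded JSON bytes.
    return $data if eval { $data = decode_json($body); 1 };
    my $first_error = $@;

    # Second attempt (assumption): the bytes are Latin-1, so convert them
    # to UTF-8 before parsing again.
    my $utf8_body = eval { encode( 'UTF-8', decode( 'ISO-8859-1', $body ) ) };
    return $data
        if defined $utf8_body && eval { $data = decode_json($utf8_body); 1 };

    warn "Skipping undecodable message: $first_error";
    return;
}
```

A body that still cannot be decoded yields undef. The caller would presumably still have to acknowledge that message, otherwise it could be redelivered at every restart, as described above.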
>
> After coding a monitor to alert when a background_jobs row has been "new"
> for over 5 minutes in the DB, I was inundated by messages. There's always
> one elasticsearch_update that escapes among the flurry, and they
> slowly add up.
>
> At this point, the only viable solution is to run the workers but
> disable RabbitMQ everywhere. Are we really the only ones experiencing
> that?
>
> Regards,
>
> PS Our servers are well-above-average Debian 11 machines with lots of
> firepower (ram, cpu, i/o...).
>
> --
>
> Philippe Blouin,
> Directeur de la technologie
>
> Tél. : (833) 465-4276, poste 230
> philippe.blouin at inLibro.com
>
> inLibro | pour esprit libre | www.inLibro.com <http://www.inLibro.com>
>