[Koha-devel] The many failings of background_jobs_worker.pl

Philippe Blouin philippe.blouin at inlibro.com
Wed Dec 21 17:53:35 CET 2022


My understanding is that all tasks go to the DB right now, so shutting 
down MQ doesn't lose anything.

Right now.

But having MQ running _does_ lose me some things, since not all tasks 
are processed, and the process can even get stuck in limbo with the UTF8 
issue.
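
To illustrate the limbo: a message body containing a raw Latin-1 byte such as \xe9 is not valid UTF-8, so strict JSON decoding fails before the job is ever looked up, and if the message is never acknowledged it presumably comes back at every restart. Below is a minimal sketch of decoding defensively and acknowledging the bad message anyway. It is in Python only to stay self-contained (the worker itself is Perl), and `decode_body`, `handle_frame`, `ack` and `process` are hypothetical names, not Koha or Net::Stomp APIs.

```python
import json


def decode_body(body: bytes):
    """Return the decoded JSON payload, or None if the body is not valid UTF-8 JSON."""
    try:
        return json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None


def handle_frame(body: bytes, ack, process) -> bool:
    """Handle one message and always acknowledge it.

    `ack` and `process` are hypothetical callbacks standing in for the
    broker acknowledgement and the job handler. Returns True if the
    payload was decoded and processed, False if it was skipped.
    """
    payload = decode_body(body)
    try:
        if payload is None:
            return False  # undecodable body: skip it, but still ack below
        process(payload)
        return True
    finally:
        ack()  # runs in every case, so a bad body cannot be redelivered forever


if __name__ == "__main__":
    good = json.dumps({"job_id": 1}).encode("utf-8")
    # Same kind of text as in the error message, but encoded as Latin-1.
    bad = '{"job_id": 2, "note": "réservé"}'.encode("latin-1")
    for body in (good, bad):
        ok = handle_frame(body, ack=lambda: None, process=print)
        print("processed" if ok else "skipped undecodable message")
```

Whether a skipped message should also mark the job as failed in the DB is a separate design choice that this sketch does not cover.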

With tens of installations to monitor, some putting thousands of jobs on 
the queue each day, I'm really looking for a quick fix before the 
Holidays...
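
For anyone wanting the same kind of alert, the check described in my original post (a job still "new" after 5 minutes) can be sketched roughly as follows. This is Python against an in-memory SQLite stand-in, not our actual monitor. The table and column names (`background_jobs`, `status`, `type`, `enqueued_on`) and the sample rows are assumptions for illustration and should be checked against the real schema.

```python
import sqlite3
from datetime import datetime, timedelta


def stale_new_jobs(conn, now, max_age=timedelta(minutes=5)):
    """Return (id, type) for jobs still 'new' that were enqueued more than max_age ago."""
    # Timestamps are stored as 'YYYY-MM-DD HH:MM:SS' text, which sorts chronologically.
    cutoff = (now - max_age).strftime("%Y-%m-%d %H:%M:%S")
    rows = conn.execute(
        "SELECT id, type FROM background_jobs "
        "WHERE status = 'new' AND enqueued_on < ? ORDER BY id",
        (cutoff,),
    )
    return rows.fetchall()


# In-memory stand-in for the real database, with made-up rows (assumed schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE background_jobs "
    "(id INTEGER PRIMARY KEY, type TEXT, status TEXT, enqueued_on TEXT)"
)
conn.executemany(
    "INSERT INTO background_jobs VALUES (?, ?, ?, ?)",
    [
        (1, "elasticsearch_update", "finished", "2022-12-21 09:00:00"),
        (2, "elasticsearch_update", "new", "2022-12-21 09:01:00"),
        (3, "other_job", "new", "2022-12-21 09:58:00"),
    ],
)

now = datetime(2022, 12, 21, 10, 0, 0)
for job_id, job_type in stale_new_jobs(conn, now):
    print(f"job {job_id} ({job_type}) has been 'new' for more than 5 minutes")
```

Passing `now` in explicitly keeps the check easy to test; a real monitor would use the current time and run once per installation.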

Philippe Blouin,
Directeur de la technologie

Tél.  : (833) 465-4276, poste 230
philippe.blouin at inLibro.com

inLibro | pour esprit libre | www.inLibro.com <http://www.inLibro.com>
On 2022-12-21 09:15, David Schmidt wrote:
> I don't think you can shut down rabbitmq and expect things to work.
>
> rabbitmq needs to be running and this command should return 4 
> processes (at least it does on our systems)
>
> ps ax | grep 'background_jobs_worker\.pl.*--queue.*\(default\|long_tasks\)'
> 1016012 ?        S      0:00 daemon --name=inlibro-koha-worker 
> --errlog=/var/log/koha/inlibro/worker-error.log 
> --stdout=/var/log/koha/inlibro/worker.log 
> --output=/var/log/koha/inlibro/worker-output.log 
> --pidfiles=/var/run/koha/inlibro/ --verbose=1 --respawn --delay=30 
> --user=inlibro-koha.inlibro-koha -- 
> /usr/share/koha/bin/background_jobs_worker.pl --queue default
> 1016014 ?        S      0:37 /usr/bin/perl 
> /usr/share/koha/bin/background_jobs_worker.pl --queue default
> 1016040 ?        S      0:00 daemon 
> --name=inlibro-koha-worker-long_tasks 
> --errlog=/var/log/koha/inlibro/worker-error.log 
> --stdout=/var/log/koha/inlibro/worker.log 
> --output=/var/log/koha/inlibro/worker-output.log 
> --pidfiles=/var/run/koha/inlibro/ --verbose=1 --respawn --delay=30 
> --user=inlibro-koha.inlibro-koha -- 
> /usr/share/koha/bin/background_jobs_worker.pl --queue long_tasks
> 1016042 ?        S      0:01 /usr/bin/perl 
> /usr/share/koha/bin/background_jobs_worker.pl --queue long_tasks
>
>
> regards
> david
>
> On Wed, 21 Dec 2022, at 3:04 PM, Philippe Blouin wrote:
>>
>> Although, I should add, I now get:
>>
>> Cannot connect to broker Failed to connect: Error connecting to localhost:61613: Connection refused at /usr/share/perl5/Net/Stomp.pm line 27.; giving up at /usr/share/perl5/Net/Stomp.pm line 27.
>>
>>
>> So shutting down MQ has its own issues....
>>
>> On 2022-12-21 08:58, Philippe Blouin wrote:
>>>
>>> Good evening, David,
>>>
>>> Thanks for the response.  Yours and David's and Michael's. I feel 
>>> less alone...
>>>
>>> I checked, and yes, all the patches you refer to are in our pile.  And 
>>> until the problems arose, there were no customizations around that 
>>> code.
>>>
>>> So yeah, even at 22.05.06, I get the JSON error and the race 
>>> condition (we use ES).  And the _abandoned_ children.  So I 
>>> surmise, or dare I say postulate, that those issues are not as 
>>> resolved as some would presume.
>>>
>>> I will revert background_jobs_worker.pl to its default, and shut down 
>>> MQ everywhere, for now.  :(
>>>
>>> On 2022-12-20 17:55, David Cook wrote:
>>>>
>>>> Salut Philippe,
>>>>
>>>>
>>>> That first issue should’ve been resolved in 22.05.00 by 
>>>> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=30172. 
>>>> I haven’t had any problems like that since applying that patch. Are 
>>>> you running Koha with or without customizations?
>>>>
>>>>
>>>> As you say, bug 30654 discusses that second issue. And I obviously 
>>>> have my own opinion on that one 😉.
>>>>
>>>>
>>>> That JSON issue should be fixed by Bug 31351 in Koha 22.05.06 as 
>>>> well, I believe: 
>>>> https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=31351
>>>>
>>>>
>>>> --
>>>>
>>>>
>>>> The only issue I’ve had with the background jobs has been the one 
>>>> covered by Bug 30172. Otherwise, it’s been all fine for me, 
>>>> although I use Zebra rather than Elasticsearch. I think part of the 
>>>> reason I haven’t had issues is that I haven’t had many people using 
>>>> the background jobs either though.
>>>>
>>>>
>>>> I’m actually planning on writing a background job system based on 
>>>> RabbitMQ for a different non-Koha system. The main difference is 
>>>> that I’ll reject or fail tasks where messages aren’t sent to 
>>>> RabbitMQ. I think that’ll make my system a bit more robust than Koha’s.
>>>>
>>>>
>>>> The problem with the background jobs at the moment is that we 
>>>> haven’t fully committed to RabbitMQ. We’re trying to do this weird 
>>>> hybrid with the database fallback which is not the right direction 
>>>> in my mind. We should do one or the other but not try to do both.
>>>>
>>>>
>>>> But that’s just my 2 cents.
>>>>
>>>>
>>>> David Cook
>>>>
>>>> Senior Software Engineer
>>>>
>>>> Prosentient Systems
>>>>
>>>> Suite 7.03
>>>>
>>>> 6a Glen St
>>>>
>>>> Milsons Point NSW 2061
>>>>
>>>> Australia
>>>>
>>>>
>>>> Office: 02 9212 0899
>>>>
>>>> Online: 02 8005 0595
>>>>
>>>>
>>>> From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On Behalf Of Philippe Blouin
>>>> Sent: Wednesday, 21 December 2022 6:13 AM
>>>> To: koha-devel at lists.koha-community.org
>>>> Subject: [Koha-devel] The many failings of background_jobs_worker.pl
>>>>
>>>>
>>>> Howdy!
>>>>
>>>> Since moving a lot of our users to 22.05.06, we've installed the 
>>>> worker everywhere.  But the number of issues encountered is staggering.
>>>>
>>>> The first one was
>>>>
>>>> Can't call method "process" on an undefined value
>>>>
>>>> where the id received from MQ was not found in the DB, and the 
>>>> process is going straight to process_job and failing.  Absolutely 
>>>> no idea how that occurs, seems completely counterintuitive (the ID 
>>>> comes from the DB after all), but here it is.  Hacked the code to 
>>>> add a "sleep 1" to fix most of that one.
>>>>
>>>> Then came the fact that stored events were not checked if the 
>>>> connection to MQ was successful at startup.  Bug 30654 covers it.  
>>>> Hacked a little "$init" in there to clear that up at startup.
>>>>
>>>> Then came the
>>>>
>>>> malformed UTF-8 character in JSON string, at character offset 296 
>>>> (before "\x{e9}serv\x{e9} au ...")
>>>>
>>>> at decode_json that crashes the whole process.  And for some 
>>>> reason, it never gets over it, gets the same problem at every 
>>>> restart, like the event is never "eaten" from the queue.  Hacked an 
>>>> eval then a try-catch over it...
>>>>
>>>> After coding a monitor to alert when a background_jobs row has been 
>>>> "new" for over 5 minutes in the DB, I was inundated by messages.  
>>>> There's always one elasticsearch_update that escapes among the 
>>>> flurry, and they slowly add up.
>>>>
>>>> At this point, the only viable solution is to run the workers but 
>>>> disable RabbitMQ everywhere.  Are we really the only ones 
>>>> experiencing that?
>>>>
>>>> Regards,
>>>>
>>>> PS Our servers are well-above-average Debian 11 machines with a lot 
>>>> of firepower (ram, cpu, i/o...).
>>>>
>>>>
>> _______________________________________________
>> Koha-devel mailing list
>> Koha-devel at lists.koha-community.org
>> https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
>> website : https://www.koha-community.org/
>> git : https://git.koha-community.org/
>> bugs : https://bugs.koha-community.org/
>>
>
>