[Koha-bugs] [Bug 15032] [Plack] Scripts that fork (like stage-marc-import.pl) don't work as expected

bugzilla-daemon at bugs.koha-community.org
Mon Oct 29 14:22:04 CET 2018


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=15032

--- Comment #34 from Tomás Cohen Arazi <tomascohen at gmail.com> ---
(In reply to David Cook from comment #33)
> (In reply to Jonathan Druart from comment #26)
> > I do not think your Koha::Daemon will work for the background jobs.
> > 
> > To me, the best way to fix this would be to provide a Koha daemon instead.
> > It would watch a DB table, which would let us build a view on top of it to
> > manage the different jobs (and their history).
> > 
> > We could also have different daemons for different needs (a param of the
> > daemon matching a column in the DB table).
> > 
> > The question behind that is security: we will have to list the different
> > operations the daemon is allowed to process (and so we will lose some
> > flexibility).
> > 
> > I have spent a *lot* of time trying to fix this issue by forking, etc., and
> > my conclusion is that it's not feasible (see the discussions and attempts on
> > related bug reports).
> > 
> > See also bug 1993.
> 
> I just realized that I misread your comment, Jonathan...
> 
> Take a look at line 106 of "Harvester.pm" at
> https://bugs.koha-community.org/bugzilla3/page.cgi?id=splinter.html&bug=10662&attachment=79039.
> 
> Using POE::Component::JobQueue, I check a DB table called
> "oai_harvester_import_queue" for "new" import jobs. 
> 
> You could do the same thing. 
> 
> In my case, the job in "oai_harvester_import_queue" actually has a field
> containing a JSON array of record identifiers, and the import task fetches
> those records from the file system and then works on them as a batch. Going
> record by record would be far too slow (both in terms of retrieving tasks
> and importing records).
> 
> But yeah, the web interface could upload the records, store their location
> in a job, and write that job to the database. The import daemon would poll
> the database (which isn't super efficient, but efficient enough for Koha's
> purposes, I imagine) and then work on batches as it could.
> 
> --
> 
> (One thing to keep in mind here is import matching rules that use Zebra...
> if importing is fast, Zebra's indexing won't be able to keep up, which means
> you could wind up with duplicate records if someone imports the same record
> twice in quick succession...)

This is worth discussing on koha-devel, at an IRC meeting, or in Marseille,
but I think we should have a job queue along the lines of what David proposes
(a very rough sketch below). I would also add ZeroMQ, perhaps, to notify
other services (also sketched below).
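
To make the table concrete, here is a very rough sketch. Every name in it is
hypothetical and just for discussion: the 'queue' column is the daemon param
Jonathan mentions (so different daemons can watch different kinds of jobs),
and the status/timestamp columns are what the management/history view would
be built on:

use DBI;

my $dbh = DBI->connect( 'dbi:mysql:database=koha', 'koha_user', 'koha_pass',
    { RaiseError => 1 } );

$dbh->do(<<'SQL');
CREATE TABLE background_jobs (
    id          INT(11)     NOT NULL AUTO_INCREMENT,
    queue       VARCHAR(32) NOT NULL,  -- which daemon handles it, e.g. 'import'
    status      ENUM('new','started','finished','failed')
                            NOT NULL DEFAULT 'new',
    data        LONGTEXT,              -- payload, e.g. a JSON array of record ids
    enqueued_on TIMESTAMP   NOT NULL DEFAULT CURRENT_TIMESTAMP,
    started_on  TIMESTAMP   NULL,
    ended_on    TIMESTAMP   NULL,
    PRIMARY KEY (id),
    KEY queue_status (queue, status)
) ENGINE=InnoDB;
SQL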
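
The daemon side could then be little more than a polling loop, close to what
David describes. Again just a sketch: process_batch() is made up, and the
claim/update dance would need more care in real code:

use Modern::Perl;
use DBI;
use JSON;

my $dbh = DBI->connect( 'dbi:mysql:database=koha', 'koha_user', 'koha_pass',
    { RaiseError => 1, AutoCommit => 1 } );

while (1) {
    # Look for the oldest 'new' job on this daemon's queue.
    my $job = $dbh->selectrow_hashref(q{
        SELECT id, data FROM background_jobs
        WHERE queue = 'import' AND status = 'new'
        ORDER BY enqueued_on
        LIMIT 1
    });

    unless ($job) {
        sleep 5;    # polling is not super efficient, but efficient enough
        next;
    }

    # Claim the job atomically so several workers can run side by side.
    my $claimed = $dbh->do(q{
        UPDATE background_jobs
        SET status = 'started', started_on = NOW()
        WHERE id = ? AND status = 'new'
    }, undef, $job->{id});
    next unless $claimed == 1;

    # The payload is a JSON array of record identifiers; the records
    # themselves sit on the file system and are imported as one batch,
    # since going record by record would be far too slow.
    my $identifiers = decode_json( $job->{data} );
    process_batch($identifiers);    # hypothetical import routine

    $dbh->do(q{
        UPDATE background_jobs
        SET status = 'finished', ended_on = NOW()
        WHERE id = ?
    }, undef, $job->{id});
}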
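
For the notification part, something like this is what I have in mind,
assuming the ZMQ::FFI module from CPAN (the endpoint and topic are invented
for the example). The enqueueing side would publish, and daemons would
subscribe, so they could wake up immediately instead of waiting for the next
poll:

use ZMQ::FFI;
use ZMQ::FFI::Constants qw(ZMQ_PUB);

my $ctx = ZMQ::FFI->new();
my $pub = $ctx->socket(ZMQ_PUB);
$pub->bind('tcp://127.0.0.1:5555');

# Right after inserting the row in background_jobs, tell whoever is
# listening that the 'import' queue has new work.
$pub->send('background_jobs.import');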

-- 
You are receiving this mail because:
You are watching all bug changes.

