[Koha-bugs] [Bug 15032] [Plack] Scripts that fork (like stage-marc-import.pl) don't work as expected

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Thu Oct 25 01:57:15 CEST 2018


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=15032

--- Comment #33 from David Cook <dcook at prosentient.com.au> ---
(In reply to Jonathan Druart from comment #26)
> I do not think your Koha::Daemon will work for the background jobs.
> 
> To me, the best way to fix this would be to provide a Koha daemon, instead.
> But it would watch a DB table, which would allow us to build a view on top
> of it to manage the different jobs (and history).
> 
> We could also have different daemons for different needs (a param of the
> daemon matching a column in the DB table).
> 
> The question behind that is security, we will have to list the different
> operations the daemon is allowed to process (and so we will loose
> flexibility).
> 
> I have spent a *lot* of time trying to fix this issue by forking, etc. And
> my conclusion is: it's not feasible (see discussions and tries on related
> bug reports)
> 
> See also bug 1993.

I just realized that I misread your comment, Jonathan...

Take a look at line 106 of "Harvester.pm" at
https://bugs.koha-community.org/bugzilla3/page.cgi?id=splinter.html&bug=10662&attachment=79039. 

Using POE::Component::JobQueue, I check a DB table called
"oai_harvester_import_queue" for "new" import jobs. 

You could do the same thing. 
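The pattern is simple enough to sketch. The real code is Perl (POE::Component::JobQueue in Harvester.pm); this is a minimal Python illustration of the "poll a table for 'new' jobs and claim them" idea. The table name and "new" status come from the email; the other column names are my own:

```python
import sqlite3

def claim_new_jobs(conn):
    """Fetch jobs marked 'new' and flip them to 'in_progress'
    so no other worker picks them up."""
    cur = conn.execute(
        "SELECT id, result FROM oai_harvester_import_queue WHERE status = 'new'"
    )
    jobs = cur.fetchall()
    for job_id, _ in jobs:
        conn.execute(
            "UPDATE oai_harvester_import_queue "
            "SET status = 'in_progress' WHERE id = ?",
            (job_id,),
        )
    conn.commit()
    return jobs

# Demo with an in-memory database standing in for Koha's MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE oai_harvester_import_queue "
    "(id INTEGER PRIMARY KEY, status TEXT, result TEXT)"
)
conn.execute(
    "INSERT INTO oai_harvester_import_queue (status, result) "
    "VALUES ('new', '[\"oai:1\",\"oai:2\"]')"
)
jobs = claim_new_jobs(conn)
print(len(jobs))  # 1
```

In the POE version this claim step is what the job-queue component calls to hand work to a worker session.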

In my case, the job in "oai_harvester_import_queue" actually has a field
containing a JSON array of record identifiers, and the import task fetches
those records from the file system and then works on them as a batch.
Processing records one by one would be far too slow (both in terms of
retrieving tasks and importing records). 
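Decoding that field and splitting it into batches is the whole trick, e.g. (sketch; the field name "result" and the batch size are assumptions, not Koha's actual code):

```python
import json

def batch_identifiers(result_field, batch_size=100):
    """Decode the JSON array of record identifiers stored on the
    job row and split it into batches, so records are imported in
    bulk rather than one at a time."""
    identifiers = json.loads(result_field)
    return [
        identifiers[i:i + batch_size]
        for i in range(0, len(identifiers), batch_size)
    ]

print(batch_identifiers('["a", "b", "c"]', batch_size=2))
# [['a', 'b'], ['c']]
```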

But yes, the web interface could upload the records, store their location in
a job, and write that job to the database. The import daemon would poll the
database (which isn't super efficient, but efficient enough for Koha's
purposes, I imagine) and work through the batches as it could. 
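The enqueue side of that could be as small as this (again a hedged sketch, not actual Koha code; the "import_jobs" table, its columns, and the file path are all illustrative):

```python
import json
import sqlite3

def enqueue_import(conn, file_paths):
    """Web-interface side: record where the uploaded files live as a
    single 'new' job row for the daemon to poll and pick up later."""
    conn.execute(
        "INSERT INTO import_jobs (status, payload) VALUES ('new', ?)",
        (json.dumps(file_paths),),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE import_jobs "
    "(id INTEGER PRIMARY KEY, status TEXT, payload TEXT)"
)
enqueue_import(conn, ["/tmp/upload/batch-001.marcxml"])

# The daemon side would then loop forever: claim 'new' rows,
# import each batch, mark the row done, sleep, repeat.
```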

--

(One thing to keep in mind here is import matching rules that use Zebra... if
you import at high speed, Zebra's indexing won't be able to keep up, which
means you could wind up with duplicate records if someone imports the same
record twice in quick succession...)

-- 
You are receiving this mail because:
You are watching all bug changes.
