[Koha-bugs] [Bug 21872] Elasticsearch indexing faster by making it multi-threaded

bugzilla-daemon at bugs.koha-community.org
Thu Nov 22 16:49:04 CET 2018


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=21872

--- Comment #4 from David Gustafsson <glasklas at gmail.com> ---
A really simple way to achieve this could be to use GNU parallel to run
multiple instances of the rebuild_elastic_search.pl script, if it were to
accept --start-biblionum and --end-biblionum options.

I have already written a script that generates batches; we use it for a
parallel export:
https://github.com/ub-digit/Koha/blob/gub-dev-record-batches-script/misc/record_batches.pl

I might rewrite this script a bit since I think there are better ways to
produce batches, but it works.

It could then be used in a wrapper script for parallel running of
rebuild_elastic_search.pl like:

$KOHA_ROOT/misc/record_batches.pl | parallel --colsep ' ' -j"$CONCURRENCY_LEVEL" \
    $KOHA_ROOT/misc/search_tools/rebuild_elastic_search.pl --start-biblionum={1} \
    --end-biblionum={2}

The above is just pseudo-code; a real wrapper would still have to forward
options to rebuild_elastic_search.pl and so on, but I think this would be a
pretty easy and efficient way to implement parallel indexing.
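
To illustrate the kind of input the pipeline above expects, here is a minimal
sketch of a batch generator that prints one "start end" pair per line. The
function name make_batches and its arguments are hypothetical, and it assumes
contiguous biblionumbers; it is not how record_batches.pl works, which would
more likely look up the biblionumbers that actually exist in the database.

```shell
#!/bin/sh
# Hypothetical helper (not part of Koha): print "start end" pairs that
# cover FIRST..LAST in chunks of at most SIZE biblionumbers, one per line.
# Assumption: biblionumbers are contiguous; gaps only make batches uneven.
make_batches() {
    first=$1
    last=$2
    size=$3

    # Refuse a non-positive batch size, which would loop forever.
    if [ "$size" -le 0 ]; then
        echo "make_batches: SIZE must be positive" >&2
        return 1
    fi

    start=$first
    while [ "$start" -le "$last" ]; do
        end=$((start + size - 1))
        # Clamp the final batch to the last biblionumber.
        if [ "$end" -gt "$last" ]; then
            end=$last
        fi
        echo "$start $end"
        start=$((end + 1))
    done
}
```

Something like "make_batches 1 100000 5000" could then take the place of
record_batches.pl on the left-hand side of the pipe, with each output line
split by --colsep ' ' into {1} and {2}.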


