[Koha-bugs] [Bug 35086] Koha::SearchEngine::Elasticsearch::Indexer->update_index needs to commit in batches

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Tue Dec 26 22:28:06 CET 2023


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35086

David Nind <david at davidnind.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Text to go in the|                            |This enables breaking large
      release notes|                            |Elasticsearch or OpenSearch
                   |                            |indexing requests into
                   |                            |smaller chunks (for
                   |                            |example, from batch
                   |                            |modifications). It adds a
                   |                            |chunk_size configuration to
                   |                            |the elasticsearch section
                   |                            |in koha-conf.xml (for
                   |                            |example:
                   |                            |<chunk_size>250</chunk_size>).
                   |                            |So instead of sending a
                   |                            |single background request
                   |                            |for indexing, which could
                   |                            |exceed the limits of the
                   |                            |search server or take up
                   |                            |too many resources, this
                   |                            |limits index update
                   |                            |requests to a more
                   |                            |manageable size.
                   |                            |
                   |                            |NOTE:
                   |                            |This doesn't change the
                   |                            |command line indexing
                   |                            |script, as this already
                   |                            |allows passing a commit
                   |                            |size defining how many
                   |                            |records to send.
                 CC|                            |david at davidnind.com

--- Comment #5 from David Nind <david at davidnind.com> ---
I've signed off, as everything in the test plan worked.

However, I did note one thing when updating authority records:

1. Starting fresh: patch applied, KTD shut down and started up again so that
there are no previous jobs, <chunk_size>250</chunk_size> added, and everything
restarted.

2. After updating all the authority records by adding text to 680$i, there are
1320 entries for jobs:
   - the last 7 job entries (1314-1320) are for the Elasticsearch index
updates, split into chunks of 250 records (1320 is for the remaining 206
record updates)
   - the first job entry (1) is for the batch authority record modification
(1706 modifications)
   - the rest of the job entries (2-1313) are for Elasticsearch index updates
to individual bibliographic records (from a sample I checked)

3. I'm assuming that the individual bibliographic record updates happen because
those records are linked to the updated authority terms.

4. I don't know whether this is expected, or whether there are plans to chunk
the subsequent individual bibliographic record updates triggered by authority
term changes, or whether that is even possible.

5. Irrespective of that, this is still a great improvement!
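
As a sanity check on the job counts in point 2, here is a minimal Python sketch
of the chunking arithmetic. It is not Koha's Perl code: the chunked() helper
and the record_ids list are hypothetical, and only the figures 1706 and 250
come from the test above.

```python
def chunked(items, chunk_size):
    """Yield successive slices of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


# Hypothetical stand-in for the 1706 modified authority record ids.
record_ids = list(range(1, 1707))

# One index update job per chunk, assuming <chunk_size>250</chunk_size>.
chunk_sizes = [len(chunk) for chunk in chunked(record_ids, 250)]

print(len(chunk_sizes), chunk_sizes[-1])  # prints: 7 206
```

That matches the 7 chunked index update jobs observed (1314-1320), with the
last one covering 206 records.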

Testing notes (using KTD):

1. I tested using ES8 (ktd --es8 up).

2. For the modification of bibliographic records, I updated the 'z - Public
note' with some text.

3. For the modification of authority records, I had a rule to add some text to
the 680$i subfield.
