[Koha-bugs] [Bug 35799] New: Loading svc/cataloguing/framework bottlenecks advanced cataloging editor

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Fri Jan 12 21:29:19 CET 2024


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=35799

            Bug ID: 35799
           Summary: Loading svc/cataloguing/framework bottlenecks advanced
                    cataloging editor
 Change sponsored?: ---
           Product: Koha
           Version: master
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5 - low
         Component: Cataloging
          Assignee: koha-bugs at lists.koha-community.org
          Reporter: phil at chetcolibrary.org
        QA Contact: testopia at bugs.koha-community.org
                CC: m.de.rooy at rijksmuseum.nl, nick at bywatersolutions.com

The advanced editor loads
/cgi-bin/koha/svc/cataloguing/framework?frameworkcode=&callback=define with
RequireJS, with a 30-second timeout and no failure handling. That gives you 30
seconds for request, generation, and download; otherwise the advanced editor
simply won't load and will sit with a progress indicator spinning forever (and
in versions before bug 34275 landed, there was no way to edit bibs without
editing your cookie to get back to the basic editor).
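A minimal sketch (not Koha's actual code) of the loading pattern described
above, plus the error callback the current code lacks; the endpoint URL is
real, the waitSeconds value and handler bodies are illustrative:

    // RequireJS load with an explicit timeout and an error callback, so a
    // timeout surfaces a message instead of an endless spinner.
    declare const requirejs: {
      (deps: string[],
       onLoad: (...mods: unknown[]) => void,
       onError?: (err: Error & { requireModules?: string[] }) => void): void;
      config(opts: { waitSeconds?: number }): void;
    };

    requirejs.config({ waitSeconds: 30 }); // the 30-second window described above

    requirejs(
      ['/cgi-bin/koha/svc/cataloguing/framework?frameworkcode=&callback=define'],
      (framework) => {
        // normal path: hand the framework definition to the editor
        console.log('framework loaded', framework);
      },
      (err) => {
        // failure path: tell the cataloguer instead of spinning forever
        console.error('framework did not load in time', err.requireModules);
      }
    );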

There are two ways (plus their combination) to make it fail to load:
generation or network.

Network is easy enough to reproduce even in koha-testing-docker, since
Firefox's devtools have network throttling that goes down to "Regular 2G",
which will certainly time it out.

Generation is a bit harder to reproduce without a decent-sized database: with
300K bibs and 600K authorities in my production database, it's easy either to
write reports that call ExtractValue(biblio_metadata.metadata...) and run
several at once, or to notice that authority search results actually support
&resultsperpage=1000 despite having no UI to set it, and then open a few pages
of a thousand authority records each, every one of which requires a search to
count the bibs using that authority.

svc/cataloguing/framework is a JSON representation of the requested framework
plus all authorized values.

Frameworks are incredibly compressible, consisting of things like 4273
instances of value_builder, of which 4253 are "value_builder":"", but the
response is sent uncompressed: 1.8MB for something that gzips to 160KB.
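A rough illustration (not Koha code, with a synthetic payload) of why that
ratio isn't surprising; run under Node, it fakes a framework-shaped structure
full of near-identical subfield entries:

    // Demonstrates how repetitive framework-style JSON compresses under gzip.
    import { gzipSync } from 'node:zlib';

    // Fake a framework-like payload: thousands of subfields, almost all with
    // an empty value_builder, much like the real numbers quoted above.
    const subfields = Array.from({ length: 4273 }, (_, i) => ({
      tagfield: String(100 + (i % 900)),
      tagsubfield: 'a',
      value_builder: i < 20 ? 'some_plugin.pl' : '',
    }));
    const json = JSON.stringify({ framework: subfields });

    const gzipped = gzipSync(Buffer.from(json));
    console.log(`raw: ${json.length} bytes, gzipped: ${gzipped.length} bytes`);
    // gzip typically shrinks this kind of repetition by an order of magnitude or more.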

It's sent with Cache-Control: no-cache and Pragma: no-cache headers, so even
though the download keeps going after the RequireJS timeout, waiting for it to
finish (I've had a painfully large number of times where it would consistently
load in 31 or 32 seconds) and then reloading does no good. Not caching doesn't
really do much to ensure freshness either, since if you stick to one framework
in an open instance of the editor you never reload svc/cataloguing/framework
anyway; it does at least mean that if you know you've changed a framework and
don't see the change, you can just open the editor in a new tab without having
to clear the browser cache.
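For comparison, a sketch of what a cache-friendlier variant could look like
from the browser side. The URL is real; the ETag/304 behaviour is hypothetical,
since svc/cataloguing/framework doesn't currently send a validator, and the
response is treated as plain JSON for illustration:

    // Today: Cache-Control: no-cache + Pragma: no-cache with no validator means
    // every open of the editor re-downloads the full payload from scratch.
    // With an ETag (or Last-Modified), a later open could revalidate instead.
    async function loadFrameworkWithRevalidation(): Promise<unknown> {
      const url = '/cgi-bin/koha/svc/cataloguing/framework?frameworkcode=';

      const first = await fetch(url);
      const body: unknown = await first.json();

      // Hypothetical: if the server sent an ETag, an unchanged framework would
      // cost one small conditional request instead of ~1.8MB.
      const etag = first.headers.get('ETag');
      const second = await fetch(url, {
        headers: etag ? { 'If-None-Match': etag } : {},
      });
      return second.status === 304 ? body : second.json();
    }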

The authorized values are too small a percentage of the file size to make much
difference there, but they don't vary with the framework, so there's no reason
to reload them on each change of framework, and they account for 3/4 of the
database connections (one for authorized values, one for itemtypes, one for
classification sources). Loading them once, separately, would probably help
some generation-limited cases and would certainly help the combination of
generation- and network-limited cases: getting three db connections and then
loading a tiny file in less than 30 seconds, then getting one db connection and
loading a large file in less than a separate 30 seconds, is easier than getting
four connections and a large file in one 30-second period.
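A sketch of that split, with a hypothetical endpoint name
(/svc/cataloguing/authorised_values does not exist today; it stands in for
whatever shape a separated service would take) and the responses treated as
plain JSON:

    // Authorized values don't vary by framework, so fetch them once per editor
    // session and reuse the promise; only the per-framework part is refetched
    // when the cataloguer switches frameworks.
    let authorisedValuesPromise: Promise<unknown> | null = null;

    function loadAuthorisedValues(): Promise<unknown> {
      authorisedValuesPromise ??=
        fetch('/cgi-bin/koha/svc/cataloguing/authorised_values') // hypothetical
          .then(r => r.json());
      return authorisedValuesPromise;
    }

    async function loadFramework(frameworkcode: string) {
      // Two requests, each with its own timeout, instead of one request that
      // has to finish four database conversations and a large download inside
      // a single 30-second window.
      const [authorisedValues, framework] = await Promise.all([
        loadAuthorisedValues(),
        fetch('/cgi-bin/koha/svc/cataloguing/framework?frameworkcode='
              + encodeURIComponent(frameworkcode)).then(r => r.json()),
      ]);
      return { authorisedValues, framework };
    }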

Beyond that, there's no clear easy answer. Solving for my sketchy internet at
home is easy: just gzip the JSON after generating it, but that adds server
(though not db server) load. Okay, generate an already-gzipped file once and
serve that; after all, I typically only change my frameworks twice a year, so
how fresh do they need to be? But editing frameworks consists of hundreds of
tiny saves, one per subfield of each tag, so making each save regenerate a
static file is going to be painful. Okay, let the browser cache the output so
that at least a reload works after getting it to load in 35 seconds. Cache it
for how long?
