[Koha-devel] Optimizing Starman startup

dcook at prosentient.com.au dcook at prosentient.com.au
Wed Apr 28 06:29:56 CEST 2021


Based on my request, Jan has added an option to skip validation (https://github.com/jhthorsen/mojolicious-plugin-openapi/commit/673079d19f827ce8c8ab3a2943a4abc798fa1e18). 

However, he doesn't seem to want to implement my cache idea. He has said that he's willing to consider a PR, so I'll look at doing that some evening after work.
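
For readers following along, the cache idea (described in the quoted message below as combining Mojo::JSON and Digest::MD5 to get reproducible checksums) could look roughly like the following. This is only a sketch under assumptions: it uses core JSON::PP in canonical mode as a stand-in for Mojo::JSON so it runs on a stock Perl, and the helper names spec_checksum and validate_once, the in-memory %validated hash, and the sample specs are all hypothetical, not part of JSON::Validator.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use JSON::PP ();
use Digest::MD5 qw(md5_hex);

# Canonical (sorted-key) encoder, so equal data always serialises identically.
# Assumption: JSON::PP stands in here for Mojo::JSON.
my $json = JSON::PP->new->canonical(1)->utf8(1);

# Hypothetical helper: checksum of a decoded spec (a hash reference).
sub spec_checksum {
    my ($spec) = @_;
    return md5_hex( $json->encode($spec) );
}

# Hypothetical in-memory cache keyed by checksum. A real cache would have to
# live somewhere that separate processes can share (a file, memcached, ...).
my %validated;

sub validate_once {
    my ( $spec, $validate ) = @_;
    my $key = spec_checksum($spec);

    # Run the expensive validation callback only on a cache miss.
    $validated{$key} //= $validate->($spec);
    return $validated{$key};
}

my $calls  = 0;
my $spec_a = { swagger => '2.0', paths => { '/patrons' => {} } };
my $spec_b = { paths => { '/patrons' => {} }, swagger => '2.0' };    # same data

validate_once( $spec_a, sub { $calls++; return 'ok' } );
validate_once( $spec_b, sub { $calls++; return 'ok' } );

print "validation ran $calls time(s)\n";
```

Because the encoder sorts hash keys, the two structurally identical specs produce the same checksum and the callback runs only once. An in-process hash would not help separate Starman workers by itself, so the storage layer is the open design question.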

He did propose freezing a JSON::Validator object and passing that between processes, but I don't really like that idea. 
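
To make the freezing proposal concrete, here is a minimal sketch of serialising a data structure in one process and restoring it in another with the core Storable module. Everything in it is an assumption made for illustration: the $validated hash is a hypothetical stand-in rather than a real JSON::Validator object, and the pipe plus fork merely simulates the process boundary. Whether an actual JSON::Validator instance survives this round trip has not been checked; Storable does not serialise code references by default, which could matter if the object holds any.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Hypothetical stand-in for an already-validated schema object.
my $validated = {
    checksum => 'abc123',
    schema   => {
        swagger     => '2.0',
        definitions => { error => { type => 'object' } },
    },
};

# Parent side: serialise the structure once.
my $frozen = freeze($validated);

# A pipe and fork stand in for "passing that between processes".
pipe( my $reader, my $writer ) or die "pipe failed: $!";
binmode $_ for $reader, $writer;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child side: read the frozen bytes and rebuild the structure
    # without repeating any validation work.
    close $writer;
    local $/;    # slurp the whole stream
    my $bytes = <$reader>;
    my $copy  = thaw($bytes);
    exit( $copy->{schema}{definitions}{error}{type} eq 'object' ? 0 : 1 );
}

close $reader;
print {$writer} $frozen;
close $writer;
waitpid( $pid, 0 );

my $child_ok = ( $? >> 8 ) == 0;
print $child_ok ? "child restored the schema\n" : "child failed\n";
```

The trade-off compared with a checksum cache is that the frozen blob is tied to the in-memory layout of whatever module version produced it, so it would need to be regenerated after upgrades.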

David Cook
Software Engineer
Prosentient Systems
Suite 7.03
6a Glen St
Milsons Point NSW 2061
Australia

Office: 02 9212 0899
Online: 02 8005 0595

-----Original Message-----
From: Ere Maijala <ere.maijala at helsinki.fi> 
Sent: Tuesday, 27 April 2021 3:28 PM
To: dcook at prosentient.com.au; 'Renvoize, Martin' <martin.renvoize at ptfs-europe.com>
Cc: 'Koha Devel' <koha-devel at lists.koha-community.org>
Subject: Re: [Koha-devel] Optimizing Starman startup

Thanks for digging up all the information. I think adding a caching validator would provide a nice improvement, so I'm in favor of it, and if I were to decide, I'd also include it in Koha.

Still, getting it upstream would be great in the long run, so I'd support that as well.

--Ere

dcook at prosentient.com.au wrote on 27.4.2021 at 5.34:
> After more experimenting, I have concluded that
> JSON::Validator->validate() is the culprit in terms of CPU time and 
> memory usage.
> 
> Fortunately, I’ve determined that Mojo::JSON and Digest::MD5 can be 
> used together to create consistent reproducible checksums, which could 
> be used for caching validated schemas.
> 
> Of course, a solution would involve changes to JSON::Validator (and 
> possibly Mojolicious::Plugin::OpenAPI depending on the chosen 
> solution), and then we’d have to wait for the new and improved version 
> to come downstream, so we wouldn’t see the benefit of this for years.
> 
> That said… we could always roll our own JSON::Validator. And if we 
> don’t want to do it as a community, I could always just do it myself.
> 
> In terms of testing… with 18 CPUs I can restart 60 instances (120
> processes) and get through the app setup in about 60 seconds with 
> significant server load, when using validation. Without validation, I 
> can do it in 20 seconds without significant server load (beyond a few 
> short-lived CPU spikes).
> 
> I’m thinking about writing a patch and sending a pull request for 
> JSON::Validator, but also really thinking about implementing it 
> locally too at least in the meantime.
> 
> I haven’t heard from the author of JSON::Validator for a little while 
> now, but I hope I do hear back from him. I think it would be a great 
> addition to the library.
> 
> David Cook
> 
> *From:*Koha-devel <koha-devel-bounces at lists.koha-community.org> *On 
> Behalf Of *dcook at prosentient.com.au
> *Sent:* Monday, 26 April 2021 7:01 PM
> *To:* 'Renvoize, Martin' <martin.renvoize at ptfs-europe.com>
> *Cc:* 'Koha Devel' <koha-devel at lists.koha-community.org>
> *Subject:* Re: [Koha-devel] Optimizing Starman startup
> 
> After some more experimenting, it’s clear that the problem isn’t 
> JSON::Validator::OpenAPI::Mojolicious or 
> Koha::REST::Plugin::PluginRoutes. If you exclude 
> Mojolicious::Plugin::OpenAPI, the startup is very fast. It’s 30 
> seconds start to finish to restart 60 instances and each instance 
> restarts very quickly.
> 
> When using Mojolicious::Plugin::OpenAPI, it takes about 3 minutes and 
> there’s a fair bit of downtime during that time.
> 
> When I do a strace, I’m noticing that a process can spend 30 seconds 
> just allocating memory for Mojolicious::Plugin::OpenAPI, but it only 
> happens once you hit a certain volume of processes. If you’re just 
> starting up 1 or 2, then it’s only a couple of seconds. But if you have, 
> say, 60-120 processes, it can take up to 30 seconds for 
> Mojolicious::Plugin::OpenAPI to do its work. I’m putting 10 CPUs to 
> this work, but clearly that’s not enough. I imagine there may be other 
> bottlenecks accessing the memory as well.
> 
> Has anyone profiled Mojolicious before? I’m guessing maybe Martin?
> 
> I suspect that this is just a problem that I’m going to have to live 
> with but maybe it is a case where I can find a way to optimize 
> Mojolicious::Plugin::OpenAPI.
> 
> David Cook
> 
> *From:*Koha-devel <koha-devel-bounces at lists.koha-community.org> *On 
> Behalf Of *dcook at prosentient.com.au
> *Sent:* Monday, 26 April 2021 5:12 PM
> *To:* 'Renvoize, Martin' <martin.renvoize at ptfs-europe.com>
> *Cc:* 'Koha Devel' <koha-devel at lists.koha-community.org>
> *Subject:* Re: [Koha-devel] Optimizing Starman startup
> 
> So I just tried the following…
> 
> --
> 
> root@kohadevbox:koha(master)$ npm install -g swagger-cli
> /usr/bin/swagger-cli -> /usr/lib/node_modules/swagger-cli/swagger-cli.js
> npm WARN @apidevtools/swagger-parser@10.0.2 requires a peer of openapi-types@>=7 but none is installed. You must install peer dependencies yourself.
> + swagger-cli@4.0.4
> added 46 packages from 27 contributors in 8.203s
> 
> --
> 
> root@kohadevbox:koha(master)$ time swagger-cli bundle api/v1/swagger/swagger.json --outfile api/v1/swagger/openapi.json --type json
> Created api/v1/swagger/openapi.json from api/v1/swagger/swagger.json
> 
> real    0m0.296s
> user    0m0.346s
> sys     0m0.032s
> 
> openapi.json is 10891 lines long but it actually contains 741 $ref 
> lines like  "$ref": "#/definitions/error" and "$ref":
> "#/definitions/patron_extended_attribute".
> 
> --
> 
> Now to do some benchmarking… I ran the following code:
> 
> #!/usr/bin/perl
> 
> use strict;
> use warnings;
> 
> use JSON::Validator::OpenAPI::Mojolicious;
> 
> my $validator = JSON::Validator::OpenAPI::Mojolicious->new;
> 
> # Resolve the $ref entries and return the spec as a single data structure.
> my $spec = $validator->bundle({
>     replace => 1,
>     schema  => "api/v1/swagger/swagger.json",
> });
> 
> The first time I ran it, it took 1.343 seconds. The second and 
> subsequent times it took 0.354 seconds. (That’s using Ubuntu 20.04 and 
> JSON::Validator 3.14.) That suggests caching, although I’m not sure where.
> I don’t see anything obvious in /usr/share/perl5/JSON/Validator/cache.
> 
> Trying with openapi.json yields 0.280 seconds instead of 0.354 seconds. 
> It’s faster, but not significantly.
> 
> So that suggests that the problem is actually with 
> Koha::REST::Plugin::PluginRoutes or Mojolicious::Plugin::OpenAPI more 
> specifically…
> 
> David Cook
> 
> *From:*dcook at prosentient.com.au <dcook at prosentient.com.au>
> *Sent:* Monday, 26 April 2021 11:24 AM
> *To:* 'Renvoize, Martin' <martin.renvoize at ptfs-europe.com>
> *Cc:* 'Koha Devel' <koha-devel at lists.koha-community.org>
> *Subject:* RE: [Koha-devel] Optimizing Starman startup
> 
> I think that I accidentally offended him, as he hasn’t responded to me 
> since his initial response.
> 
> I do wonder if reducing the number of references would help, although 
> I wonder how easy that would be to do in practice. It looks like we 
> have about 5767 lines of JSON all up as is… so it would probably get 
> even bigger if we dereferenced them.
> 
> Oh… here’s a thought. Why don’t we compile it? According to 
> https://davidgarcia.dev/posts/how-to-split-open-api-spec-into-multiple-files/, 
> you can maintain many different files, and then use 
> something like swagger-cli to create a single built/compiled OpenAPI 
> file.
> 
> That way JSON::Validator wouldn’t need to resolve any references for 
> the core API. I don’t know if the plugins have any $ref in them but 
> I’m guessing not (just based on Coverflow). So that could be a big win.
> 
> I’m working on other things at the moment, but I’m going to put that 
> on my eternal list.
> 
> David Cook
> 
> *From:*Renvoize, Martin <martin.renvoize at ptfs-europe.com>
> *Sent:* Friday, 23 April 2021 5:24 PM
> *To:* David Cook <dcook at prosentient.com.au>
> *Cc:* Koha Devel <koha-devel at lists.koha-community.org>
> *Subject:* Re: [Koha-devel] Optimizing Starman startup
> 
> Jan's code is certainly challenging to read and understand at times, I 
> agree. I used to contribute to the plugin a number of years ago now, 
> but the project that gave me time to play with that has since been 
> sold on, so I'm not involved at the level I used to be. He uses lots 
> of Perl foo which often takes me a long time to wrap my head around.
> 
> As for the refs, I think we split our spec up too much, in all honesty. 
> Even the swagger spec suggests we went too far. I think I might have 
> been unclear when I first pushed for a split from one massive file. 
> We could/should certainly reduce that somewhat. It'll be interesting 
> to see if it makes much difference; that could be a fairly quick win.
> 
> On Thu, 22 Apr 2021, 12:32 am, <dcook at prosentient.com.au> wrote:
> 
>     Hi Ere,
> 
>     I think you're right about the refs. While they get resolved by the
>     OpenAPI plugin, you probably have to resolve them before trying to
>     dynamically inject the routes from plugins.
> 
>     Jan Thorsen (the author of Mojolicious::Plugin::OpenAPI and
>     JSON::Validator) thinks that the ref resolution is actually what's
>     taking so long. I looked it up and I think we have over 400
>     different references in the main OpenAPI spec alone. I haven't
>     profiled it but something to think about.
> 
>     At some point, I'm going to have a play with newer versions of the
>     modules. I'm going to look at Ubuntu 20.04 and newer Debian versions
>     to see what I can get away with in terms of newness. Needs more
>     investigation, but I am really hoping that this is an issue that can
>     be solved by just upgrading the OS.
> 
>     I find Jan's code to be unnecessarily opaque (could use more
>     descriptive comments and function naming) but... I'll investigate.
>     Probably not right away as I have a bunch of other priorities that I
>     have to address but... this is on my mind.
> 
>     Starman startup time is probably the thing about Koha annoying me
>     the most right now and probably the most practical thing I can
>     improve at the moment...
> 
>     David Cook
> 
>     -----Original Message-----
>     From: Ere Maijala <ere.maijala at helsinki.fi>
>     Sent: Wednesday, 21 April 2021 6:31 PM
>     To: dcook at prosentient.com.au; koha-devel at lists.koha-community.org
>     Subject: Re: [Koha-devel] Optimizing Starman startup
> 
>     Hi David,
> 
>     I wish I'd remember all the details, but my memory fails me. I think
>     not using JSON had something to do with how the refs are resolved.
>     That may or may not have been the reason, but if everything works
>     with JSON module, I can't think of a reason not to use it.
> 
>     Thanks for taking a look!
> 
>     --Ere
> 
>     dcook at prosentient.com.au wrote on 21.4.2021 at 3.28:
>      > Hi Ere,
>      >
>      > Thanks for your reply. 24700 looks much better. I'll look at
>      > backporting it locally.
>      >
>      > Although I'm looking at JSON::Validator::OpenAPI::Mojolicious at
>      > https://metacpan.org/pod/release/JHTHORSEN/Mojolicious-Plugin-OpenAPI-2.19/lib/JSON/Validator/OpenAPI/Mojolicious.pm
>      > and it says "Do not use this module directly. Use
>      > Mojolicious::Plugin::OpenAPI instead." I notice that you're using
>      > the "bundle" method. Do we really need that there? Why don't we just
>      > load the JSON using the JSON module, merge with the plugin spec
>      > files, and then pass it to the OpenAPI plugin? Shouldn't the plugin
>      > take care of the $ref replacement?
>      >
>      > Hmm... I didn't realize until now that the OpenAPI plugin was
>      > doing a validate behind the scenes. That's tricky.
>      >
>      > At a glance, we might be able to pre-load the app into the Starman
>      > master process pre-fork. There are warnings about doing that with open
>      > database connections, so we'd need to review plack.psgi, but a quick
>      > glance suggests it might be OK. (Alternatively, I have wondered about
>      > running the REST API as a separate process apart from Starman using
>      > hypnotoad. According to
>      > https://docs.mojolicious.org/Mojolicious/Guides/Cookbook,
>      > Mojo::Server::Prefork preloads the application in the manager/master
>      > process, and Hypnotoad is based off that, so that would help.)
>      >
>      > It does seem like changes to the OpenAPI plugin would be needed
>      > for caching.
>      >
>      > I'm going to try backporting your change and try pre-loading and
>      > see how far that gets me.
>      >
>      > David Cook
>      >
>      > -----Original Message-----
>      > From: Koha-devel <koha-devel-bounces at lists.koha-community.org> On
>      > Behalf Of Ere Maijala
>      > Sent: Tuesday, 20 April 2021 4:48 PM
>      > To: koha-devel at lists.koha-community.org
>      > Subject: Re: [Koha-devel] Optimizing Starman startup
>      >
>      > Hi,
>      >
>      > I did some work on improving it here:
>      >
>      > https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=24700
>      >
>      > That shaved a good bit of time from it, but it's still a heavy
>      > operation, and it would make sense to
>      >
>      > 1.) avoid doing it too often
>      >
>      > 2.) cache the results and avoid doing it if results are cached
>      >
>      > If you could address the first one, that'd go a long way. I'm
>      > afraid the second one would require changes to the OpenAPI plugin to
>      > support caching.
>      >
>      > --Ere
>      >
>      > dcook at prosentient.com.au wrote on 20.4.2021 at 6.15:
>      >> Hi all,
>      >>
>      >> Do you despair when you see the following periodically in “top” when
>      >> a starman worker is recreated?
>      >>
>      >>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
>      >>    9529 my-koha   20   0  460108 197212  17172 R 100.0  0.4   0:03.41 /usr/share/koha/api/v1/app.pl
>      >>
>      >> Or the following in top when you install koha-common package or
>      >> restart the koha-common service?
>      >>
>      >>   11101 1-koha    20   0  447232 193320  16076 R  10.6  0.4   0:09.09 /usr/share/koha/api/v1/app.pl
>      >>   11168 1-koha    20   0  447240 193264  16056 R  10.6  0.4   0:08.72 /usr/share/koha/api/v1/app.pl
>      >>   11306 2-koha    20   0  447220 193148  16000 R  10.6  0.4   0:08.07 /usr/share/koha/api/v1/app.pl
>      >>   11543 2-koha    20   0  447232 193036  15828 R  10.6  0.4   0:07.07 /usr/share/koha/api/v1/app.pl
>      >>   11784 3-koha    20   0  441536 189664  16172 R  10.6  0.4   0:06.04 /usr/share/koha/api/v1/app.pl
>      >>   11830 3-koha    20   0  439548 187212  15748 R  10.6  0.4   0:05.82 /usr/share/koha/api/v1/app.pl
>      >>   11831 4-koha    20   0  438620 186344  15748 R  10.6  0.4   0:05.81 /usr/share/koha/api/v1/app.pl
>      >>   11853 4-koha    20   0  437680 185672  16000 R  10.6  0.4   0:05.79 /usr/share/koha/api/v1/app.pl
>      >>
>      >> Well, I still have a lot of investigation left to do, but I notice
>      >> that one place where a lot of time is taken is here (per worker):
>      >>
>      >>       my $validator = JSON::Validator::OpenAPI::Mojolicious->new;
>      >>
>      >>       $validator->load_and_validate_schema(
>      >>           $self->home->rel_file("api/v1/swagger/swagger.json"),
>      >>           {
>      >>               allow_invalid_ref => 1,
>      >>           }
>      >>       );
>      >>
>      >>       push @{$self->routes->namespaces}, 'Koha::Plugin';
>      >>
>      >>       my $spec = $validator->schema->data;
>      >>
>      >>       $self->plugin(
>      >>           'Koha::REST::Plugin::PluginRoutes' => {
>      >>               spec      => $spec,
>      >>               validator => $validator
>      >>           }
>      >>       );
>      >>
>      >>       $self->plugin(
>      >>           OpenAPI => {
>      >>               spec  => $spec,
>      >>               route => $self->routes->under('/api/v1')->to('Auth#under'),
>      >>               allow_invalid_ref =>
>      >>                 1,    # required by our spec because $ref directly under
>      >>                       # Paths-, Parameters-, Definitions- & Info-object
>      >>                       # is not allowed by the OpenAPI specification.
>      >>           }
>      >>       );
>      >>
>      >> Anyone have ideas for improving this? Do we have to validate the
>      >> schema every time? Can we move the schema validation into a different
>      >> module and preload it into Starman using the -M flag so that it’s
>      >> done 1 time per Starman master instance rather than 1 time per
>      >> Starman worker instance?
>      >>
>      >> I find “/usr/share/koha/api/v1/app.pl” to be the bane of deployments,
>      >> as it puts a massive load on a server, when you have multiple Koha
>      >> instances on the server.
>      >>
>      >> David Cook
>      >>
>      >> _______________________________________________
>      >> Koha-devel mailing list
>      >> Koha-devel at lists.koha-community.org
>      >> https://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
>      >> website : https://www.koha-community.org/
>      >> git : https://git.koha-community.org/
>      >> bugs : https://bugs.koha-community.org/
>      >>
>      >
>      > --
>      > Ere Maijala
>      > Kansalliskirjasto / The National Library of Finland
>      >
> 
>     --
>     Ere Maijala
>     Kansalliskirjasto / The National Library of Finland
> 
> 
> 

--
Ere Maijala
Kansalliskirjasto / The National Library of Finland



