[Koha-devel] Koha and Qvarn, a suggestion (also too long for anyone to finish reading)

Mon Nov 28 09:06:45 CET 2016

Hi Lars,
We are moving - slowly - away from SQL all over the place with DBIx and Koha::Object[s].
At the same time moving code from C4 to these Koha objects.
This is a huge operation.
Another ongoing project is the extension of the RESTful API (using Mojolicious and Swagger).

So, things are already moving.

Marcel

-----Original message-----
From: koha-devel-bounces at lists.koha-community.org [mailto:koha-devel-bounces at lists.koha-community.org] On behalf of Lars Wirzenius
Sent: Friday 25 November 2016 15:56
To: koha-devel at lists.koha-community.org
Subject: [Koha-devel] Koha and Qvarn, a suggestion (also too long for anyone to finish reading)

Hi, Koha developers,

some of you may remember me from the Debian packaging of Koha I made in 2010 for Catalyst IT. I've never looked much at the actual code base of Koha, but one thing that even a quick grep reveals is that SQL code is sprinkled all over the code base. To me, this indicates that the application code ("business logic") and the way data is stored are closely coupled, and that's generally not a good thing. Among other things, if there's a need to change the database schema, or database engine, the whole code base might be affected.

In my current job we develop a storage system (roughly, "database, but with better access control") with an emphasis on privacy. Our software is free software released under the Affero GPL, version 3 or later. I suggest that our software (called Qvarn) would be worth considering for Koha.

A summary of my impression of the situation:

* Koha stores very sensitive information about people's reading
  habits.

* In the EU the new General Data Protection Regulation (GDPR) is in
  effect (with a transition period until some time in 2018), which
  requires everyone who collects or stores personal information to
  take good care so the data doesn't leak and violate people's
  privacy. Koha, or at least libraries using Koha, would seem to fall
  under this (IANAL, TINLA, AYOL).

* At minimum, a "privacy impact assessment" needs to be made.
  Basically, think about what happens to the people whose data it
  is that gets leaked. (Best to assume it will be.)

* Koha stores all its data (including the personal stuff) in
  MySQL/MariaDB. There is no clean abstraction layer, which means SQL
  is all over the code base, making things harder to change. I think
  it would be good for Koha to add an abstraction layer even if Koha
  stays with MySQL.

* Qvarn was designed from the ground up with privacy and security in
  mind. It is meant to be the least weak link in the chain to protect
  privacy of the data it stores. It's still under active development,
  and doesn't have all the planned features yet. It's backed by my new
  company, QvarnLabs, and is in production use already, storing around
  a million personal identities.

* Qvarn provides a RESTful HTTP(S) JSON API, meaning there's no
  per-client/per-user session state on the Qvarn side. Makes it really
  easy to use, from the client, in my opinion. All data is in JSON
  form, which also makes things easier to use, in my opinion. It's
  conceptually JSON objects, not tables/columns. Qvarn uses Postgres
  behind the scenes, at least for now, but that's not visible to API
  clients (read: we could replace it with individual files on disk and
  you'd not notice).

* Qvarn requires the use of Gluu (see gluu.org) for authentication and
  authorization. Gluu is an identity management server, which is also
  free software. We aim to hide Gluu entirely behind the Qvarn API,
  and already mostly do that, so it's not something you need to know
  much about, except that operationally it's a bit tricky, since it's
  harder to deploy than Qvarn itself (Qvarn comes as a .deb).
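
To make the "JSON objects, not tables/columns" point a bit more concrete, here is a minimal sketch in Python. The record layout and the field names (id, name, loans) are made up for illustration; they are not Qvarn's actual resource types and not Koha's schema. The sketch only shows that one nested object carries what a relational layout would spread over two tables.

```python
import json

# Hypothetical patron record as a single nested JSON object.
# Field names are illustrative only, not a real Qvarn or Koha schema.
patron = {
    "id": "patron-1",
    "name": "Example Person",
    "loans": [
        {"item": "book-42", "due": "2016-12-09"},
        {"item": "book-77", "due": "2016-12-16"},
    ],
}


def to_rows(obj):
    """Flatten the object into the rows a two-table layout would need."""
    patron_row = (obj["id"], obj["name"])
    loan_rows = [(obj["id"], loan["item"], loan["due"]) for loan in obj["loans"]]
    return patron_row, loan_rows


def from_rows(patron_row, loan_rows):
    """Rebuild the nested object from the flattened rows."""
    patron_id, name = patron_row
    loans = [
        {"item": item, "due": due}
        for owner, item, due in loan_rows
        if owner == patron_id
    ]
    return {"id": patron_id, "name": name, "loans": loans}


# The JSON text is what an API client would send or receive in one request.
body = json.dumps(patron)
```

With the object form the client handles one document per patron; with the row form it has to know about both tables and how they join.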

Here are my thoughts about Koha using Qvarn:

* First step would be to change Koha to have a suitable abstraction
  layer for storing data. I imagine this will be a big job.

* Second step would be to model the data Koha stores in Qvarn resource
  types, i.e., decide which types of records should exist and what they
  should contain.

* Third step is ...

* Fourth step is profit.
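
As a rough idea of what the first step could look like, here is a minimal sketch of a storage abstraction layer, written in Python for brevity (Koha itself is Perl). Everything in it is hypothetical: ResourceStore, InMemoryStore and check_out are names invented for this example, not existing Koha or Qvarn code. The point is only that the business logic talks to an interface and never sees SQL, so the backend can be swapped without touching it.

```python
from abc import ABC, abstractmethod
import copy


class ResourceStore(ABC):
    """Hypothetical storage interface; business logic depends only on this."""

    @abstractmethod
    def put(self, resource_type, resource_id, data):
        """Store a JSON-like dict under (resource_type, resource_id)."""

    @abstractmethod
    def get(self, resource_type, resource_id):
        """Return the stored dict for (resource_type, resource_id)."""


class InMemoryStore(ResourceStore):
    """Trivial backend for tests; an SQL or HTTP backend would share the interface."""

    def __init__(self):
        self._data = {}

    def put(self, resource_type, resource_id, data):
        # Copy on write so callers cannot mutate stored state by accident.
        self._data[(resource_type, resource_id)] = copy.deepcopy(data)

    def get(self, resource_type, resource_id):
        # Copy on read for the same reason.
        return copy.deepcopy(self._data[(resource_type, resource_id)])


def check_out(store, patron_id, item_id):
    """Business logic: record a loan, with no knowledge of how data is stored."""
    patron = store.get("patron", patron_id)
    patron.setdefault("loans", []).append(item_id)
    store.put("patron", patron_id, patron)
    return patron
```

A test suite could run check_out against InMemoryStore, which is where the "better test suite as a side effect" would come from.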

Benefits to Koha as far as I can see:

* Abstraction layer: cleaner code, easier database related changes in
  the future, possibly a better test suite as a side effect.

* Qvarn: more privacy protection for user data, maybe (this is my
  highly personal and biased opinion) a nicer way to store/access the
  data.

What do you think? Is this something that the Koha community might want? I don't want to waste your time or anyone else's time if it's not interesting. However, if it is interesting, QvarnLabs is willing to work on making sure Qvarn is good for Koha.

Note: though both Qvarn and Gluu are developed/supported by companies, they're free software and also free to use. This is not unlike Koha itself.

I'm happy to answer any questions you may have, preferably on the koha-devel list.
_______________________________________________
Koha-devel mailing list
Koha-devel at lists.koha-community.org
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : http://www.koha-community.org/ git : http://git.koha-community.org/ bugs : http://bugs.koha-community.org/