[Koha-devel] Dedicated Zebra Server

Wed Jul 19 04:09:15 CEST 2017

A sizeable portion of koha-conf.xml is actually Zebra configuration. An interesting task would be to separate out all the Zebra-specific settings and put them in their own file, so that koha-conf.xml is less intimidating and perhaps easier to manage.

For searching, Koha is just using a Z39.50 connection, so Zebra could be accessed over TCP on a different server.
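
As a rough illustration of that point, and assuming Zebra has been set up to listen on TCP, an Ansible task run against the Koha web server could confirm that the remote listener is reachable. The variables `zebra_host` and `zebra_biblio_port` are hypothetical names, not something Koha or this thread defines:

```yaml
# Sketch only: zebra_host and zebra_biblio_port are assumed variables.
- name: Check that the remote Zebra listener accepts TCP connections
  wait_for:
    host: "{{ zebra_host }}"
    port: "{{ zebra_biblio_port }}"
    timeout: 10
```

This only shows that the port is open from the web server's side; it does not run an actual Z39.50 search.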

For indexing, the indexing needs to happen on the Zebra server, and to index you need records from the Koha database. There are a number of ways you could handle that, I suppose… but it is currently closely tied to the Koha application monolith. In theory, if your MySQL database was on yet another server, you’d just need an indexing script and the credentials to access the MySQL database.

I suppose there are some system preferences that get taken into account as well… but that’s still coming from the database.
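
To make the indexing idea a bit more concrete, here is a minimal sketch of a cron entry on the Zebra server, written as an Ansible task. It assumes the Koha Perl libraries and a koha-conf.xml holding the database credentials are already present on that server. The variables `koha_conf_path`, `koha_perl5lib` and `rebuild_zebra_path` are hypothetical; only `rebuild_zebra.pl` and its `-b`, `-a` and `-z` options come from Koha itself:

```yaml
# Sketch only: all three variables are assumptions for illustration.
- name: Index queued biblio and authority changes from the remote database
  cron:
    name: koha-zebra-incremental
    minute: "*/5"
    user: koha
    # -b biblios, -a authorities, -z process only records in the zebraqueue table
    job: "KOHA_CONF={{ koha_conf_path }} PERL5LIB={{ koha_perl5lib }} {{ rebuild_zebra_path }} -b -a -z"
  become: yes
```

The script reads the database host and credentials from the file named in KOHA_CONF, so the Zebra server needs network access to MySQL but no local database.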

In theory, I would think the “koha-common” package would contain the libraries needed for anything interfacing with Koha and then a person could use Ansible to keep the remote servers all consistently configured. 

Anyway, all that would require some work, which I know I’m not planning to do. At this point, it’s probably more worthwhile to focus on Elasticsearch development, although I think the plan is to keep Zebra around anyway, so decoupling it a bit from the Koha monolith wouldn’t necessarily be a bad idea…

David Cook

Systems Librarian

Prosentient Systems

72/330 Wattle St

Ultimo, NSW 2007

Australia

Office: 02 9212 0899

Direct: 02 8005 0595

From: koha-devel-bounces at lists.koha-community.org [mailto:koha-devel-bounces at lists.koha-community.org] On Behalf Of Michael Hafen
Sent: Wednesday, 19 July 2017 2:45 AM
To: Kivilahti Olli-Antti <olli-antti.kivilahti at jns.fi>
Cc: koha-devel at lists.koha-community.org
Subject: Re: [Koha-devel] Dedicated Zebra Server

I'm thinking of doing something like this too.  I'd really like to see the Zebra variables from your koha-conf.xml too, if possible.  Thanks.

On Tue, Jul 18, 2017 at 8:37 AM, Kivilahti Olli-Antti <olli-antti.kivilahti at jns.fi> wrote:

Hi!

We install Koha in a cluster of LXC containers:

One haproxy for DNS redirection and SSL offloading
One container for MariaDB
One for OPAC
One for staff client
One for Zebra
One for misc stuff, like the SIP2-server
One for ElasticSearch
One for Memcached
+containers for other support services

With haproxy it is rather straightforward to start load balancing multiple "web bricks" if you choose to. We don't need to, since we chose to buy a lot of hardware. Hardware is less expensive than labour where we live.
The point of this clustering is simply to increase robustness and security.
It has actually saved our bacon a few times over the past years, when we got unidentifiable hard-drive freezes in the SIP2 container,
or when some cronjob went AWOL and hogged all the system resources available to it.
Without containerization we would have had downtime.
It also helps with simple DoS attacks, which can happen by accident.

I recommend this type of setup.

Unfortunately the Debian packages add a lot of unnecessary cruft to each container, so we use dev-install instead.

Getting Zebra to work in a separate container:
Zebra is really, really tightly wound into the make toolchain Koha uses, and there is no convenient way of separating it. Maybe you could just bind-mount the zebradb directory between containers. That might be enough to run Zebra separately from Koha's source code.

Since mass storage is very cheap, we just run two identical Koha instances from the same source code repo. We make several identical Koha-installations via the dev-install.
The Zebra-one only has Zebra and Koha sources (no Apache2, MariaDB, etc packages needed by a standalone Koha).
You can configure Zebra to listen on a TCP socket via $KOHA_CONF. You can then conveniently distribute the same $KOHA_CONF to all the Koha containers, so they all find the same Zebra by its IP, and the same MariaDB. This is best done with Ansible (since the community has already started working towards Ansible).
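
A minimal sketch of that $KOHA_CONF change as an Ansible task, assuming the file still contains an active line of the form `<listen id="biblioserver" >unix:...</listen>`. The variables `koha_conf_path`, `zebra_host` and `zebra_biblio_port` are hypothetical and are not part of our actual role:

```yaml
# Sketch only: variable names are assumptions; check the regexp against
# your own koha-conf.xml before relying on it.
- name: Switch the biblioserver listener from a unix socket to TCP
  replace:
    path: "{{ koha_conf_path }}"
    regexp: '(<listen id="biblioserver"\s*>)unix:[^<]*(</listen>)'
    replace: '\1tcp:{{ zebra_host }}:{{ zebra_biblio_port }}\2'
  become: yes
```

The `replace` module leaves the file untouched when the pattern no longer matches, so the task can be rerun safely. The authorityserver listener would need the same treatment with its own port.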

Separating Zebra into an isolated (and easily monitorable) environment is entirely possible.

Here is a snippet from our Ansible infrastructure definitions where Koha is set up with capability: zebra

These should be most of the Ansible steps needed together with Koha-source to get Zebra running standalone:

ansible at hephaestus:~/KSAnsible$ cat roles/koha/tasks/capabilities/zebra.yml

---

- name: Install zebrasrv and zebraidx
  apt:
    name: idzebra-2.0
  become: yes

- name: Create the zebra socket-dir and permissions
  file:
    path: "{{zebra_run_dir}}"
    owner: koha
    group: koha
    state: directory
  become: yes

- name: Create the zebra lock-dir and permissions
  file:
    path: "{{zebra_lock_dir}}"
    owner: koha
    group: koha
    state: directory
  become: yes

- name: Create the zebra data-dir and permissions
  file:
    path: "{{zebra_data_dir}}"
    owner: koha
    group: koha
    state: directory
  become: yes

- name: Configure Zebra stripes
  lineinfile:
    path:   "{{item.path}}"
    regexp: "{{item.regexp}}"
    line:   "{{item.line}}"
    owner:  koha
  with_items:
    - path:   "{{koha_dev_path}}/etc/zebradb/zebra-biblios-dom.cfg"
      regexp: "^memMax:"
      line:    "memMax: {{zebra_index_mem}}"
    - path:   "{{koha_dev_path}}/etc/zebradb/zebra-biblios-dom.cfg"
      regexp: "^register:"
      line:    "register: {{zebra_data_dir}}/biblios/register:{{zebra_index_size}}"
    - path:   "{{koha_dev_path}}/etc/zebradb/zebra-biblios-dom.cfg"
      regexp: "^shadow:"
      line:    "shadow: {{zebra_data_dir}}/biblios/shadow:{{zebra_index_size}}"
    - path:   "{{koha_dev_path}}/etc/zebradb/zebra-biblios-dom.cfg"
      regexp: "^sortmax:"
      line:    "sortmax: {{zebra_sort_max}}"
    - path:   "{{koha_dev_path}}/etc/zebradb/zebra-biblios-dom.cfg"
      regexp: "^facetNumRecs:"
      line:    "facetNumRecs:{{zebra_facet_num_records}}"
  become: yes
  notify: Restart Zebra

- name: Zebra - Link Zebra-service
  file:
    src:   "{{koha_dev_path}}/bin/koha-zebra-ctl.sh"
    dest:  "/etc/init.d/koha-zebra-daemon"
    owner: root
    group: root
    state: "link"
  become: yes

- name: Zebra - Enable Zebra-service
  systemd:
    name: koha-zebra-daemon
    enabled: yes
    daemon_reload: yes
    state: started
  become: yes

- name: Zebra-index - Link Zebra-index-service
  file:
    src:   "{{koha_dev_path}}/bin/koha-index-daemon-ctl.sh"
    dest:  "/etc/init.d/koha-index-daemon"
    owner: root
    group: root
    state: "link"
  become: yes

- name: Zebra-index - Enable Zebra-index-service
  systemd:
    name: koha-index-daemon
    enabled: yes
    daemon_reload: yes
    state: started
  become: yes

- name: Deploy Zebra cronjobs
  template:
    owner: "{{koha_user}}"
    src:   "etc_cron.d_koha.j2"
    dest:  "/etc/cron.d/koha-zebra"
  vars:
    cronjobs:
      rbza:
        comment: '#Trigger full Zebra reindexing daily'
        prefix:  ' '
        timing:  '46 10 * * *'
        user:    'koha'
        command: 'migration_tools/rebuild_zebra.pl -b -a -r -v -x'
  become: yes
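
The "Configure Zebra stripes" task notifies a handler named "Restart Zebra" that is not included in the snippet. A minimal guess at what such a handler could look like, assuming it lives in roles/koha/handlers/main.yml and restarts the koha-zebra-daemon service linked above:

```yaml
# Assumed handler; the real one was not posted to the list.
- name: Restart Zebra
  service:
    name: koha-zebra-daemon
    state: restarted
  become: yes
```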

  _____  

From: koha-devel-bounces at lists.koha-community.org [koha-devel-bounces at lists.koha-community.org] on behalf of Mansur Ali [mansuralih at gmail.com]
Sent: Tuesday, July 18, 2017 3:11 PM
To: Tajoli Zeno
Cc: koha-devel at lists.koha-community.org
Subject: Re: [Koha-devel] Dedicated Zebra Server

Thank you for your valuable information.

On Tue, Jul 18, 2017 at 2:59 PM, Tajoli Zeno <z.tajoli at cineca.it> wrote:

Hi,

On 18/07/2017 13:28, Mansur Ali wrote:

Hello,

I am using the Koha ILS. Can I use a dedicated Zebra server that is installed on another server? Could you please advise whether this is possible?

With the present standard installation, the answer is no, it is not possible.

In theory, if:
1) You set up Koha to use PazPar instead of Zebra [an old option that has not been tested for years] AND
2) You rewrite the updating scripts and daemon

you can try to do it.

I suggest trying this only if you are very skilled with Zebra and other Index Data tools.

Bye
Zeno Tajoli

-- 
Zeno Tajoli
/SVILUPPO PRODOTTI CINECA/ - Automazione Biblioteche
Email: z.tajoli at cineca.it  Fax: 051/6132198
*CINECA* Consorzio Interuniversitario - Sede operativa di Segrate (MI)

-- 

Mansur Ali

+966568978721 | +919061635954 | mansuralih at gmail.com

_______________________________________________
Koha-devel mailing list
Koha-devel at lists.koha-community.org
http://lists.koha-community.org/cgi-bin/mailman/listinfo/koha-devel
website : http://www.koha-community.org/
git : http://git.koha-community.org/
bugs : http://bugs.koha-community.org/

-- 

Michael Hafen

Washington County School District Technology Department

Systems Analyst
