[Koha-bugs] [Bug 12478] Elasticsearch support for Koha
bugzilla-daemon at bugs.koha-community.org
Mon Aug 31 07:20:20 CEST 2015
http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=12478
--- Comment #91 from Robin Sheat <robin at catalyst.net.nz> ---
(In reply to Jonathan Druart from comment #81)
> Well, it's defined yes, but does not work at all (the marc21 mappings are
> used) :)
> It is caused by some errors in the sql file. Patch's coming.
Ah, ta.
>
> Note the following:
> MariaDB [koha_es_unimarc]> insert into search_field (name, type) select
> distinct mapping, type from elasticsearch_mapping;
> Query OK, 73 rows affected, 57 warnings (0.05 sec)
> Records: 73 Duplicates: 0 Warnings: 57
>
> MariaDB [koha_es_unimarc]> show warnings;
> +---------+------+--------------------------------------------+
> | Level   | Code | Message                                    |
> +---------+------+--------------------------------------------+
> | Warning | 1265 | Data truncated for column 'type' at row 1  |
Hmm, I remember that, but I'm not 100% sure it mattered. Could be wrong though.
> Yes of course, but I am not a real tester, I am a developer, and it would be
> useful to share info on specific data.
> I am fine to use the sandbox DB, if it's ok for you.
Fair point. Let me see if I can tidy the database some for uploading somewhere.
Here it is:
http://elasticsearch.koha.catalystdemo.net.nz/files/koha_es_marc21.sql.bz2
it's not the best data, but it's good enough for messing about with.
> > > 2/ The number of tests provided is very low.
> > Yes, I've been meaning to go back and add a pile more.
> OK, I'll leave that to you :)
Oh, you don't have to. I don't mind if you go and write them all for me :)
> Patch is coming.
> Patch is coming.
> Patch is coming.
> Patch is coming.
Thanks!
>
> Yes it has:
> "title":[["Dollhouse"],["Seasons one & two."]]
>
> 245$a Dollhouse
> 490$a Seasons one & two.
>
> But 245$a should be used for sorting :)
Yes, that's something I'm trying to fix at the moment :)
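
To make the expected behaviour concrete, here is a minimal Python sketch
(Koha itself is Perl, and this is not the actual fix). It assumes a record is
available as (tag, subfield code, value) triples; that input shape and the
name title_sort_key are hypothetical, chosen only for illustration. The sort
key is taken from 245$a alone, so a 490$a value never takes part in sorting.

```python
from typing import List, Optional, Tuple

# Hypothetical input shape: (tag, subfield code, value) triples for one record.
Subfield = Tuple[str, str, str]


def title_sort_key(subfields: List[Subfield]) -> Optional[str]:
    """Return the first 245$a value, or None if the record has none."""
    for tag, code, value in subfields:
        if tag == "245" and code == "a":
            return value
    return None


# The two values from the record discussed above, deliberately out of order.
record = [
    ("490", "a", "Seasons one & two."),
    ("245", "a", "Dollhouse"),
]
print(title_sort_key(record))
```

The search index can still hold both values for the title field; only the
sort key is restricted to 245$a.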
> The item is a "Visual Materials" which has an itemtype.notforloan flag set.
Good to know, I've not tested that case yet.
> Ouch, not sure how I could find that easily.
Probably easiest to construct a case manually.
> It comes from the 008
> > "Pictura murală*" has "pubdate":"||||" (/_search?q=_id:39&pretty)
> 008 090409|||||||||xx |||||||||||||| ||und||
> > The Korean Go Association's learn to play go "pubdate":"uuuu"
> 008 971030muuuu9999nyua 000 0 eng
>
> But the index should not contain an invalid date.
Hmm. I don't know if we can put validation into the fixer rules; I'll have to
explore that further. Telling ES that this field must be a number might also
cause bad data to get rejected, but it may reject the whole record. I'm not
sure.
Do you happen to know how zebra handles that?
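
As a rough illustration of the kind of check being discussed, here is a
minimal Python sketch (not Catmandu fixer syntax, and not existing Koha code).
It assumes the publication date is read from MARC21 008 positions 07-10
(Date 1) and treats anything other than four ASCII digits as "no date". The
function name pubdate_from_008 is hypothetical.

```python
import re
from typing import Optional

# Exactly four ASCII digits; "||||", "uuuu" and partial years like "19uu" fail.
_YEAR = re.compile(r"[0-9]{4}")


def pubdate_from_008(field_008: str) -> Optional[int]:
    """Return Date 1 (008 positions 07-10) as an int, or None if not a year."""
    date1 = field_008[7:11]
    if _YEAR.fullmatch(date1):
        return int(date1)
    return None


# The two 008 values quoted above (only the first 15 characters matter here).
print(pubdate_from_008("090409|||||||||xx"))    # Date 1 is "||||"
print(pubdate_from_008("971030muuuu9999nyua"))  # Date 1 is "uuuu"

# A made-up 008 with a usable year, for contrast.
print(pubdate_from_008("971030s1997    nyua"))
```

Returning None means the pubdate field can be left out of the indexed
document instead of sending a bad value to ES, so the rest of the record would
still be indexed.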
> For Solr (you can find the code on the BibLibre repo at
> https://git.biblibre.com/biblibre/koha_biblibre/commits/dev/solr Browse
> C4/Search/), we used a system of plugins. And there is a Date plugin
> (https://git.biblibre.com/biblibre/koha_biblibre/blob/
> bd38ce1811289fcfbd75a37ec99fc4cd3c5d37f4/C4/Search/Plugins/Date.pm) which
> does this job.
> A plugin can be linked to a mapping.
We probably can't directly reuse that. At present we're using Catmandu to do
the data conversion and interfacing with ES for the most part. But it's
possible I can hook something in somewhere.
> Just a note: I know nobody has ever had a look at the Solr code, but it is
> used in production by several (4 or 5) customers for more than 4 years now.
> And I have already had all the issues and problems you will encounter.
I'm sure I'll encounter some exciting new ones :)
> I will try and see if I can find some time and propose something here, if you
> want some help.
Sure, anything is welcome.
(In reply to Jonathan Druart from comment #90)
> Something else, there is a sort issue in the facets:
>
> [Some entries]
> Zeitoun, Ariel,
> Ó Cadhain, Máirtín.
> Ślez, Ts..
>
> Ó should be after O, not after Z.
Line 573 of opac/opac-search.pl does a sort with cmp, which isn't very
Unicode-aware. I'm putting that in the not-my-problem bin as it's in upstream :)
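
For anyone wondering why a plain string comparison puts Ó after Z, here is a
small Python sketch of the effect (the real code is Perl, so this is only an
analogy). A default sort compares code points, and Ó (U+00D3) and Ś (U+015A)
both come after Z (U+005A). The fold_key helper is hypothetical: it strips
combining accents as a rough approximation, which is not real collation. In
Perl a proper fix would more likely use something like Unicode::Collate.

```python
import unicodedata
from typing import List


def fold_key(name: str) -> str:
    """Approximate sort key: decompose, drop combining marks, case-fold."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# The three facet entries from the report, written with escapes so the
# accented letters are definitely the precomposed code points.
facets: List[str] = [
    "Zeitoun, Ariel,",
    "\u00d3 Cadhain, M\u00e1irt\u00edn.",  # Ó Cadhain, Máirtín.
    "\u015alez, Ts..",                      # Ślez, Ts..
]

by_code_point = sorted(facets)           # what a plain comparison gives
by_folded = sorted(facets, key=fold_key)  # accents ignored for ordering

print(by_code_point)
print(by_folded)
```

The first list reproduces the order seen in the facets (Zeitoun, then Ó
Cadhain, then Ślez). The second puts Ó Cadhain and Ślez before Zeitoun.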