[Koha-bugs] [Bug 27153] ElasticSearch should search keywords apostrophe blind

bugzilla-daemon at bugs.koha-community.org
Thu Sep 1 01:53:22 CEST 2022


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=27153

--- Comment #18 from David Cook <dcook at prosentient.com.au> ---
I tried to understand this without applying the patches, but it's too hard
without the context, so let's see...

I assume that the second patch "POC" is an alternative patch and not an
additional patch...

First patch:
- Adds an apostrophe filter to "analyzer_standard", which is used for all
default searches...
- The apostrophe filter strips out apostrophes

Second patch:
- Adds a "punc_removed" field (which uses the analyzer_stdno analyzer, which
already has a punctuation filter) to "default" under "search". If I understood
Elasticsearch and Koha's integration better, I would probably understand this,
but I don't currently.
- Reading through Koha/SearchEngine/Elasticsearch.pm and
https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html
it looks like "phrase", "raw", "ci_raw" and "punc_removed" are multi-fields
that are only used when queried specifically.
- In Koha/SearchEngine/Elasticsearch/QueryBuilder.pm, it adds the
title.punc_removed field to all Elastic queries. We do something similar in
Zebra in C4::Search::_build_weighted_query(), where we add title fields to the
search.
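
To make the second patch's idea concrete for myself, here is a rough sketch in
Python of how I understand a multi-field like this to work. It is not taken
from the patch: the analyzer name "punc_removed_analyzer", the char filter name
"strip_punctuation", the regex, and the build_query() helper are all made up
for illustration, and approx_punc_removed() is only a pure-Python
approximation of what such an analyzer would do to a string.

```python
import string

# Hypothetical index settings (NOT Koha's real config): a pattern_replace
# char filter deletes ASCII punctuation before the text is tokenized.
settings = {
    "analysis": {
        "char_filter": {
            "strip_punctuation": {
                "type": "pattern_replace",
                "pattern": "\\p{Punct}",
                "replacement": "",
            }
        },
        "analyzer": {
            "punc_removed_analyzer": {
                "type": "custom",
                "char_filter": ["strip_punctuation"],
                "tokenizer": "whitespace",
                "filter": ["lowercase"],
            }
        },
    }
}

# Hypothetical mapping: "title" is indexed normally, and a second time as
# the multi-field "title.punc_removed" using the analyzer above.
mappings = {
    "properties": {
        "title": {
            "type": "text",
            "fields": {
                "punc_removed": {
                    "type": "text",
                    "analyzer": "punc_removed_analyzer",
                }
            },
        }
    }
}


def build_query(user_input):
    """Build a query_string body that names the multi-field explicitly.

    A multi-field is only searched when it is listed (or matched by a
    wildcard), so it has to be added to "fields" by the query builder.
    """
    return {
        "query": {
            "query_string": {
                "query": user_input,
                "fields": ["title", "title.punc_removed"],
            }
        }
    }


def approx_punc_removed(text):
    """Pure-Python approximation of the hypothetical analyzer above.

    Only ASCII punctuation is removed, so a typographic apostrophe such as
    U+2019 would survive both here and with the regex used in the settings.
    """
    stripped = text.translate(str.maketrans("", "", string.punctuation))
    return [token.lower() for token in stripped.split()]
```

With something like this, a title indexed as "O'Reilly's guide" and a search
for "OReillys guide" should reduce to the same tokens in the punc_removed
field, which is the apostrophe-blind behaviour the bug asks for, while the
normally analyzed "title" field is left alone.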

I wonder a bit if adding title.punc_removed there will have unintended
consequences but it seems similar to Zebra so might not be a big drama.

Regarding Zebra, I don't think we can get full feature parity here. However, we
could potentially add a Title-punc_removed index and update
./etc/zebradb/xsl/koha-indexdefs-to-zebra.xsl to strip punctuation for it, and
then add that into C4::Search::_build_weighted_query() (or elsewhere). 

--

I think the "POC" patch would need some testing to make sure there aren't any
unintended consequences, but overall it sounds like a reasonable proposition.
