[Koha-bugs] [Bug 28884] ElasticSearch: Question mark in title search returns no results

bugzilla-daemon at bugs.koha-community.org
Tue Apr 18 20:51:58 CEST 2023


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=28884

--- Comment #10 from Janusz Kaczmarek <januszop at gmail.com> ---
(In reply to Katrin Fischer from comment #9)
> (In reply to Janusz Kaczmarek from comment #8)
> > (In reply to Nick Clemens from comment #7)
> > > This was "solved" with a hack on bug 31213
> > > 
> > > If I understand what this patch is doing, you are escaping the question mark
> > > so it no longer operates as a special character?
> > > 
> > > This is not a bad idea, however, it needs to be documented with a comment in
> > > the code, as well as in the system preference. I think it would actually be
> > > better as its own system preference.
> > 
> > Nick, I'm aware of your 'hack'--now, I'm unable to verify if quickly, but as
> > far as I remember it didn't solve this case.  If it did, there would be no
> > point for me to dig into this.
> > 
> > If I remember well, the patch intends to treat the question mark, in the
> > function _query_regex_escape_process, in exactly the same way as it is done
> > with slash: instead of escaping/unescaping only '/': (?=/) it escapes also
> > '?': (?=[/\?]).  Maybe I could extend the existing comment, but I don't feel
> > like this is a candidate for a new syspref...?
> 
> I think it would be nice if we could keep the truncation feature without
> requiring to escape it. It's easy to explain to people to use ?, but much
> harder to explain \? - also quite a change in behavior at this point maybe?
> 
> I was wondering, if ? replaces one or more characters... why does it not
> work as a wild card if there really is a question mark?

Well, IMHO it will not be so easy to explain to an ordinary user how '?'
functions with ES: such a user usually will not read complicated introductory
texts, and what we are discussing is a very special case.  Moreover, I expect
that an ordinary user will more often type '?' as a literal '?' (from
copy-paste) than as a wildcard.  So I expect (and have already experienced)
complaints from users who cannot find a book in the local catalog, even though
it is there, because its title, copied from another data source, contains '?'.

If I understand correctly, in an ES query_string query '?' stands for exactly
one character.  At the same time, normal text analysis of a string with a
question mark attached to the end of a word removes the '?' from what is
stored in the index.  So the query 'are?' will not find the original string
'are?' (which is indexed as 'are'), but it would find, for instance, 'ares',
'arer', 'area', etc.
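
To illustrate the mismatch, here is a rough Python sketch.  It is not ES or
Koha code: analyze() is only a toy stand-in for a standard-style analyzer, and
wildcard_matches() is a hypothetical helper that mimics '?' as "exactly one
character" and '*' as "any run of characters".

```python
import re


def analyze(text):
    """Toy analyzer (assumption): lowercase and keep only word characters."""
    return re.findall(r"\w+", text.lower())


def wildcard_to_regex(term):
    """Translate a wildcard term: '?' is one character, '*' is any run."""
    parts = []
    for ch in term:
        if ch == "?":
            parts.append(".")
        elif ch == "*":
            parts.append(".*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def wildcard_matches(term, indexed_tokens):
    """Return the indexed tokens that the whole wildcard term matches."""
    pattern = wildcard_to_regex(term.lower())
    return [tok for tok in indexed_tokens if pattern.fullmatch(tok)]


# The '?' after "you" is dropped at index time.
indexed = analyze("Who are you? Area, ares, arer")

# 'are?' needs one more character after 'are', so the token 'are' is skipped.
print(wildcard_matches("are?", indexed))  # ['area', 'ares', 'arer']
# 'you?' finds nothing, although the source text contains "you?".
print(wildcard_matches("you?", indexed))  # []
```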

(Similarly, '/' has a special meaning in ES, but an ordinary user can
unknowingly type it as part of a query string; hence, if I get it right, the
QueryRegexEscapeOptions option for escaping it.)
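
The escaping idea can be sketched roughly as follows, again in Python and only
as an illustration.  This is not the Koha patch (which is a Perl regex
substitution in _query_regex_escape_process); escape_specials() is a
hypothetical helper, and it deliberately ignores the handling of quoted
phrases.

```python
def escape_specials(query, specials="/?"):
    """Backslash-escape unescaped special characters in a query string.

    Simplified illustration: already-escaped sequences are left alone, and
    quoted phrases get no special treatment.
    """
    out = []
    i = 0
    while i < len(query):
        ch = query[i]
        if ch == "\\" and i + 1 < len(query):
            # Copy an existing escape sequence (backslash + next char) as is.
            out.append(query[i:i + 2])
            i += 2
            continue
        if ch in specials:
            out.append("\\")
        out.append(ch)
        i += 1
    return "".join(out)


print(escape_specials("Who are you?"))  # Who are you\?
print(escape_specials("AC/DC"))         # AC\/DC
```

With such escaping applied, '?' would be passed to the query parser as a
literal character instead of a single-character wildcard.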
