[Koha-bugs] [Bug 24807] Add "year" type to improve sorting by publication date

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Mon Sep 28 18:51:28 CEST 2020


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=24807

--- Comment #56 from Nick Clemens <nick at bywatersolutions.com> ---
Created attachment 110884
  -->
https://bugs.koha-community.org/bugzilla3/attachment.cgi?id=110884&action=edit
Bug 24807: [20.05.x] Add "year" type to improve sorting behaviour

Add a "year" search field type. Fields of this type retain only
values that look like years, so invalid values such as whitespace
or word characters are not indexed.
This improves, for instance, the behaviour when sorting by
"date-of-publication". If all values are indexed, records with
junk data instead of valid years appear first among the search
results, drowning out more relevant hits. If this field is assigned
the "year" type, these records instead always appear last,
regardless of sort order.
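
The effect described above can be illustrated with a minimal sketch. It is not the code from this patch: the names `year_values` and `sort_by_year`, the record layout, and the "exactly four digits" rule are assumptions made for illustration only. It shows why dropping non-year values before indexing puts records without a usable year at the end in both sort directions.

```python
import re

# Assumption: a "year" is exactly four ASCII digits. The real patch may accept more forms.
YEAR_RE = re.compile(r"[0-9]{4}")


def year_values(values):
    """Keep only the values that look like years; the rest would not be indexed."""
    kept = []
    for value in values:
        value = value.strip()
        if YEAR_RE.fullmatch(value):
            kept.append(value)
    return kept


def sort_by_year(records, descending=False):
    """Sort records by their first indexed year; records with none go last."""
    with_year = [r for r in records if year_values(r["date"])]
    without_year = [r for r in records if not year_values(r["date"])]
    with_year.sort(key=lambda r: int(year_values(r["date"])[0]), reverse=descending)
    return with_year + without_year


# Hypothetical records: id 1 has junk in place of a year.
records = [
    {"id": 1, "date": ["uuuu"]},
    {"id": 2, "date": ["1999"]},
    {"id": 3, "date": ["2005"]},
]
ascending_ids = [r["id"] for r in sort_by_year(records)]
descending_ids = [r["id"] for r in sort_by_year(records, descending=True)]
```

In this sketch the record with "uuuu" ends up last in both `ascending_ids` and `descending_ids`, which mirrors how a search engine typically places documents with a missing sort value.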

To test:

1) Have at least two biblios, one with a valid year in 008 (pos 7-10)
and another with an invalid one ("uuuu" for example)
2) Perform a wildcard search (*) and sort results by publication date.
3) The record with the invalid year of publication in 008 should appear first
4) Apply patch and run database updates
5) Reindex Elasticsearch
6) Perform the same search as in 2)
7) The record with the invalid year should now appear last

Signed-off-by: Nick Clemens <nick at bywatersolutions.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83 at web.de>

Bug 24807: Add database update script

Signed-off-by: Nick Clemens <nick at bywatersolutions.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83 at web.de>

Bug 24807: Update tests

Signed-off-by: Nick Clemens <nick at bywatersolutions.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83 at web.de>

Bug 24807: Add support for uncertain fields and ranges

To test:
1 - Have some records with uncertain dates in the 008
    19uu, 195u, etc.
2 - Index them in Elasticsearch
3 - Do a search that will return them
4 - Sort results by publication/copyright date
5 - Note odd results
6 - Apply patch
7 - Reindex
8 - Sorting should be improved
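
One plausible way to make uncertain dates such as "19uu" or "195u" sortable is to replace the unknown positions with "0". The sketch below is only an illustration under that assumption. The commit message does not state how the patch handles these values, the function name `normalize_uncertain_year` is hypothetical, and ranges are not covered. Spaces are treated like "u" because a later follow-up in this series mentions spaces as unknown characters.

```python
import re

# Assumption: four characters, leading digits followed by "u" or space placeholders.
UNCERTAIN_YEAR_RE = re.compile(r"[0-9]+[u ]*")


def normalize_uncertain_year(value):
    """Return a sortable four-digit string, or None if the value is not year-like."""
    if len(value) != 4 or not UNCERTAIN_YEAR_RE.fullmatch(value):
        return None
    # Assumption: unknown positions sort as "0", so "195u" sorts as 1950.
    return re.sub(r"[u ]", "0", value)


examples = ["19uu", "195u", "1987", "19  ", "uuuu", "abcd"]
normalized = [normalize_uncertain_year(v) for v in examples]
```

With this rule a fully unknown value such as "uuuu" still yields no year at all, so such a record would have nothing to sort on.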

Signed-off-by: Nick Clemens <nick at bywatersolutions.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83 at web.de>

Bug 24807: Refactor using tokenize_callbacks

Signed-off-by: Nick Clemens <nick at bywatersolutions.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83 at web.de>

Bug 24807: Simplify with new and improved value_callbacks

Signed-off-by: Nick Clemens <nick at bywatersolutions.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83 at web.de>

Bug 24807: (follow-up) Fix spelling

Signed-off-by: Nick Clemens <nick at bywatersolutions.com>

Signed-off-by: Katrin Fischer <katrin.fischer.83 at web.de>

Bug 24807: (follow-up) Add support for spaces as unknown characters

Signed-off-by: Katrin Fischer <katrin.fischer.83 at web.de>

Bug 24807: (QA follow-up) Remove unnecessary tests

These tests now fail: the code in Indexer.pm expects a real response from ES,
but these tests mock 'bulk' and so don't have the necessary fields.

The same code is tested above, so we can just add the _id == biblionumber
test.
