[Koha-bugs] [Bug 12897] New: Enhance date ranges in ccl.properties with Zebra

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Wed Sep 10 08:03:22 CEST 2014


http://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=12897

            Bug ID: 12897
           Summary: Enhance date ranges in ccl.properties with Zebra
 Change sponsored?: ---
           Product: Koha
           Version: master
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P5 - low
         Component: Z39.50 / SRU / OpenSearch Servers
          Assignee: gmcharlt at gmail.com
          Reporter: dcook at prosentient.com.au
        QA Contact: testopia at bugs.koha-community.org
                CC: m.de.rooy at rijksmuseum.nl

At the moment, we're fairly inconsistent in how we handle dates in Zebra
indexing and searching.


1) _INDEXING_

The "registers" that we typically want to use for dates are "year" (y -
Non-tokenized and non-normalized 4 digit numbers) and "date" (d - Non-tokenized
and non-normalized ISO date strings). See
http://www.indexdata.com/zebra/doc/querymodel-zebra.html for details. Sometimes
we might need to use "numeric" if the date format isn't supported by "year"
(YYYY) or "date" (which, it seems, has to be in YYYY-MM-DD format).
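
As a rough illustration of the register choice described above, here is a
minimal Python sketch. The helper name suggest_register is hypothetical (it is
not part of Koha or Zebra), and treating any all-digit value, including
yyyymmddhhmmss.f, as suitable for "numeric" is an assumption that would need
checking against Zebra itself.

```python
import re
from datetime import datetime


def suggest_register(value: str) -> str:
    """Guess which register fits a sample field value.

    Hypothetical helper for illustration only; the rules mirror the
    descriptions in this report, not Zebra's actual implementation.
    """
    # "year": exactly four digits (YYYY).
    if re.fullmatch(r"\d{4}", value):
        return "year"
    # "date": an ISO date string (YYYY-MM-DD) that is also a real date.
    if re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        try:
            datetime.strptime(value, "%Y-%m-%d")
            return "date"
        except ValueError:
            return "word"
    # "numeric": other digit-only values, optionally with a fraction
    # (assumption: Zebra's numeric register copes with these).
    if re.fullmatch(r"\d+(\.\d+)?", value):
        return "numeric"
    # Anything else can only be searched as words.
    return "word"


if __name__ == "__main__":
    for sample in ("1999", "2014-09-10", "20140910080322.0", "c1999"):
        print(sample, "->", suggest_register(sample))
```

Under these assumptions, a value such as "c1999" falls back to "word", which
is one reason a field like copydate may need normalizing before a "year"
register is useful.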

However, at the moment, "pubdate" is the only index with a "year" register, and
"acqdate"/"Date-of-acquisition" is the only index with a "date" register. The
numeric register is also used sub-optimally. We use it for "onloan" and
"pubdate", but both of those have better registers to use. Numeric (via
st-numeric) works with "pubdate" because 1900-2000 is a straightforward
numerical range. (The more I think about it, the more I suspect the difference
between "year" and "numeric" might not matter too much. However, "date" offers
benefits when it comes to range searching.)

Here are some other date indexes we have:

1) Date/time-last-modified (only has a 'word' register) - could maybe have a
'numeric' register as it has a format of yyyymmddhhmmss.f, but I'm not 100%
sure.
2) date-entered-on-file ('word' and 'sort') - could have a 'numeric' register
as it should be in YYMMDD format. (Neither 'year' nor 'date' would work, but
'numeric' could.)

3) copydate (only has a 'word' and 'sort' register) - could have a 'year'
register as it should be in YYYY format.
4) onloan ('numeric' and 'word') - could have a 'date' register as it's in ISO
format
5) datelastseen ('word') - could have a 'date' register as it's in ISO format
6) datelastborrowed ('word') - could have a 'date' register as it's in ISO
format
7) replacementpricedate ('word') - could have a 'date' register as it's in ISO
format

['pubdate' has 'word', 'numeric', 'year', and 'sort' registers.]
['acqdate'/'Date-of-acquisition' has 'word', 'date', and 'sort' registers.]
['tpubdate' exists in ccl.properties but isn't indexed for our MARC21.]



2) _SEARCHING_

Zebra has a built-in ability to do range searches. It relies on two special
attributes to accomplish this: r=o and r=r
(http://www.indexdata.com/yaz/doc/tools.html#ccl.qualifiers).

r=o
Allows ranges and the operators greater-than, less-than, ... equals. This sets
Bib-1 relation attribute accordingly (relation ordered). A query construct is
only treated as a range if dash is used and that is surrounded by white-space.
So -1980 is treated as term "-1980" not <= 1980. If - 1980 is used, however,
that is treated as a range.

r=r
Similar to r=o but assumes that terms are non-negative (not prefixed with -).
Thus, a dash will always be treated as a range. The construct 1980-1990 is
treated as a range with r=r but as a single term "1980-1990" with r=o. The
special attribute r=r is available in YAZ 2.0.24 or later.
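
To make the difference concrete, here is a minimal Python sketch that models
the two behaviours as quoted above. It is not YAZ's parser: the function name
interpret is hypothetical, the open-ended "1980 -" form is an assumption by
symmetry with "- 1980", and splitting on the first dash under r=r is also an
assumption. The point is only that an ISO date stops being a single term
under r=r.

```python
import re


def interpret(term: str, mode: str):
    """Simplified model of how a term is read under r=o or r=r.

    Returns ("range", low, high) or ("term", text); low or high is None
    when that end of the range is open. Illustration only, not YAZ code.
    """
    term = term.strip()
    if mode == "r=o":
        # A dash is a range operator only when set off by white-space.
        if term.startswith("- "):
            return ("range", None, term[2:].strip())
        if term.endswith(" -"):  # assumption: open upper bound
            return ("range", term[:-2].strip(), None)
        match = re.fullmatch(r"(\S+)\s+-\s+(\S+)", term)
        if match:
            return ("range", match.group(1), match.group(2))
        return ("term", term)
    if mode == "r=r":
        # Any dash is a range operator (assumption: split on the first one).
        if "-" in term:
            low, _, high = term.partition("-")
            return ("range", low.strip() or None, high.strip() or None)
        return ("term", term)
    raise ValueError("mode must be 'r=o' or 'r=r'")


if __name__ == "__main__":
    print(interpret("-1980", "r=o"))      # ('term', '-1980')
    print(interpret("- 1980", "r=o"))     # ('range', None, '1980')
    print(interpret("1980-1990", "r=o"))  # ('term', '1980-1990')
    print(interpret("1980-1990", "r=r"))  # ('range', '1980', '1990')
    print(interpret("2014-01-01 - 2014-12-31", "r=o"))
    print(interpret("2014-01-01", "r=r"))
```

In this model, "2014-01-01 - 2014-12-31" under r=o comes out as a range
between two whole ISO dates, while "2014-01-01" under r=r is broken apart at
its first hyphen.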

Uses of r=r:

1) copydate
2) pubdate
3) lex
4) arl
5) arp

Uses of r=o:

1) st-numeric (this is a qualifier or "structure attribute" which tells Zebra
to use the "numeric" register)

So... looking back at the _INDEXING_ section... it's clear that anything with a
"date" register should use "r=o" for range searching, as YYYY-MM-DD has
internal hyphens which an r=r search would misread as range operators.

When using a "numeric" or "year" register, we can use r=r.

Personally, I'd rather we use r=o (e.g. X - X) than r=r (e.g. X-X), but we
already have existing documentation in the manual (and on the advanced search
page) saying that "X-X" is how you specify a date range.

I'd be open to hearing ideas about this one. r=o is the only possibility for
specifying date ranges for ISO formatted dates, and it works for YYYY dates as
well, whereas r=r will only work for non-negative integers.
