[Koha-devel] Searching and ILL (from: Searching Group Meeting Notes)

Sun Aug 7 10:32:43 CEST 2005

Summary:
1. Referring to practical problems as ideology looks like trying
to polarise the discussion, even if it's not.
2. CQL default for advanced search, att:val for simple search.
3. Implement as few search engines as possible, translate to/from
any other similar ones.
4. OpenSearch should be supported, but not with its own engine.

I'm sorry that Joshua has missed the points I'm trying
to make.  This is best shown by his describing some of them as
"ideological". If you think any of this is ideology, please ask
me to explain it further. I've no great love for namespaces
for their own sake. I love what they allow us to do and hate
the problems they solve.

Some specific comments:

Joshua Ferraro <jmf at liblime.com> wrote:
> [.quote.] As any good text based interface, CQL is intended to 'do 
> what you mean' for simple, every day queries, while allowing means to 
> express complex concepts when necessary." [...]

Unfortunately, CQL barfs on some everyday searches because it
reserves too many words and symbols ("exact" is a problem for
mathematicians, for example).  It seems a long way from "do what
you mean" for web-opac users.  For that reason alone, I think it
should not be the default for searching. Can it be the default
for "advanced search" only?

> The main problem I have with going with MJ's suggestion is that we've
> not found a well-defined syntax definition. So we're not really sure how
> to do things like proximity searching or other more complex search
> syntax types.

What would persuade you to accept it as "well-defined"? Here are
three definitions of the style:

http://www.altavista.com/help/search/syntax
http://www.gigablast.com/help.html
http://www.google.co.uk/help/basics.html

and here's a naive try at a BNF grammar for it:

    <search> ::= <term> [<whitespace> <search>]
      <term> ::= [<negation>] <simpleterm>
  <negation> ::= '-'
<simpleterm> ::= <value> | <attvalue>
  <attvalue> ::= <attribute> ':' <value>
 <attribute> ::= <pstring>
     <value> ::= <pstring> | <qstring>
   <pstring> ::= characters other than ':' or <whitespace> not starting '-' 
   <qstring> ::= '"' characters '"'

I'm sure a real computer scientist could fix my attempt easily.
Someone who knows searching better may recognise this as a type
that's already standardised, but I couldn't.
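
To make the grammar above a bit more concrete, here is a rough
Python sketch of a parser for it. It is only an illustration under
some assumptions that the grammar does not state: the names Term
and parse_search are made up, a <pstring> is not allowed to start
with a double quote (so it cannot be confused with a <qstring>),
a <qstring> cannot contain a double quote, and an empty search
gives an empty list rather than an error.

```python
import re
from typing import List, NamedTuple, Optional


class Term(NamedTuple):
    """One <term> of the grammar (hypothetical name, for illustration)."""
    negated: bool              # True if the term started with '-'
    attribute: Optional[str]   # None for a plain <value>
    value: str                 # value text, with any surrounding quotes removed


# <pstring>: no ':' or whitespace, not starting with '-'.
# Assumption: also not starting with '"', to keep it distinct from <qstring>.
_PSTRING = r'[^\s:"-][^\s:]*'
# <qstring>: '"' characters '"'. Assumption: no embedded double quotes.
_QSTRING = r'"[^"]*"'
# <term> ::= [<negation>] (<value> | <attribute> ':' <value>)
_TERM = re.compile(
    r'(?P<neg>-)?(?:(?P<attr>' + _PSTRING + r'):)?'
    r'(?P<value>' + _QSTRING + r'|' + _PSTRING + r')'
)


def parse_search(query: str) -> List[Term]:
    """Split a search string into terms, raising ValueError on bad input."""
    terms = []
    pos, end = 0, len(query)
    while True:
        # Skip the <whitespace> between terms.
        while pos < end and query[pos].isspace():
            pos += 1
        if pos == end:
            return terms
        match = _TERM.match(query, pos)
        # A term must be followed by whitespace or the end of the string.
        if match is None or (match.end() < end and not query[match.end()].isspace()):
            raise ValueError("cannot parse search at position %d: %r" % (pos, query[pos:]))
        value = match.group("value")
        if value.startswith('"'):
            value = value[1:-1]  # strip the quotes of a <qstring>
        terms.append(Term(match.group("neg") is not None, match.group("attr"), value))
        pos = match.end()


if __name__ == "__main__":
    for term in parse_search('subject:English subject:dialect -title:"old norse" cats'):
        print(term)
```

Anything the grammar does not cover, such as a:b:c or an unclosed
quote, is rejected with a ValueError rather than guessed at.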

Those other search types aren't directly expressible. I don't
think many ordinary users want them (feel free to prove me
wrong), as evidenced by the complaints in your reference:

[...]
> Finally, it's important to remember that although some users will use
> the syntax and advanced search, 99% probably won't. But that doesn't
> mean that it's not important to have the syntax. There's a (mostly) 
> nicely written article in Library Journal that brings up some good
> points regarding research and the weaknesses of the keyword method:
> http://www.libraryjournal.com/article/CA623006.html

That's largely independent of the default syntaxes, though. If
anything, I think offering a web-search-like subject:English
subject:dialect way in the default will make using catalogue
headings more popular. (I've already seen some searches which
check web page returns against VLib and the Open Directory
to suggest subject headings, which can be a big improvement.
It would be even better if we had more librarians, but wouldn't
lots of things?)

A syntax which can do all of the advanced searches is needed
for the advanced search, but please don't curse web-opac users
with poor usability. Give them an obvious syntax for today.
If it's easy to translate att:val to CQL, then yippee!
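
As a rough idea of what translating att:val to CQL might look
like, here is a Python sketch. It is not a tested Koha component
and it leans on several assumptions: the names Term and to_cql are
made up, attribute names are used directly as CQL index names
(a real installation would probably need a mapping), all terms are
ANDed together, and CQL wildcard characters in values are passed
through untouched. Because "not" in CQL is a binary and-not
operator, a search made only of negated terms is rejected here.

```python
from typing import List, NamedTuple, Optional


class Term(NamedTuple):
    """One already-parsed att:val term (hypothetical structure)."""
    negated: bool              # True for a term written with a leading '-'
    attribute: Optional[str]   # None for a bare value
    value: str


def _quote(value: str) -> str:
    # Always quote, escaping backslashes and double quotes, so that words
    # such as "and", "or" or "exact" are treated as plain search terms.
    return '"' + value.replace('\\', '\\\\').replace('"', '\\"') + '"'


def _clause(term: Term) -> str:
    if term.attribute is None:
        return _quote(term.value)
    # Assumption: the attribute name doubles as the CQL index name.
    return term.attribute + " = " + _quote(term.value)


def to_cql(terms: List[Term]) -> str:
    """Join positive terms with 'and', then append negated ones with 'not'."""
    positive = [_clause(t) for t in terms if not t.negated]
    negative = [_clause(t) for t in terms if t.negated]
    if not positive:
        raise ValueError("need at least one non-negated term to build a CQL query")
    cql = " and ".join(positive)
    for clause in negative:
        cql += " not " + clause
    return cql


if __name__ == "__main__":
    print(to_cql([
        Term(False, "subject", "English"),
        Term(False, "subject", "dialect"),
        Term(True, "title", "old norse"),
    ]))
```

Quoting every value is a deliberate choice: it sidesteps the
reserved-word problem for the simple search box, at the cost of
hiding CQL's own operators from those users.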

> [...] So I propose that Koha
> support all three of the major standards for record sharing to
> maximize the number of clients that can access the database.

I agree with that and I suggested a possible way to do this
with translators, which would avoid having to maintain three
more external interfaces to Koha's catalogue (and more later).
What do you think?

[...]
> OpenSearch may be flawed (RSS vs. RDF), the fact that it's so easy to
> implement (when compared to SRW/U for instance) means [..later..]

It's no easier to implement than doing the right thing (RDF),
which would help show Koha seriously supports the Semantic Web.

It's only marginally harder to implement the right thing and then
add OpenSearch's bugs. If they're as responsive as you suggest,
then we'll strip the bugs away over time and we should have
the benefit of being able to add support for other Semantic
Web searches easily. Can we do it that way?

[...]
> I grok the Perl analogy and I agree that RSS 2 namespaces aren't
> ideal. The problem is that OpenSearch is widely adopted and if
> we want to tap into those sources we'll need an OpenSearch 
> search and retrieval engine.

Not necessarily. We only need an OpenSearch interface, as above.
If we add an entire engine for each external search protocol,
we're probably going to drive developers insane.

> > do any other search engines use opensearch yet? 
> Yes ... lots. Peruse through the 'columns' section of the 
> opensearch.a9.com site and you'll see many many search engines
> that have adopted the standard. [...]

If you don't know, please just say you don't know. There's no
"columns" section on that site, but a little digging shows 253,
mostly specialised web search engines. Maybe useful after all.

[...]
> So ... I know that was long. I hope you made it this far. Please
> give me some feedback. I'm not trying to polarize the discussion
> so if you've got points to make please say them and I'll do my
> best to understand and then respond ...

I've tried. Thanks for reading,
-- 
MJ Ray (slef), K. Lynn, England, email see http://mjr.towers.org.uk/