[Koha-patches] [PATCH] _remove_stopwords in C4::Search had some issues
Galen Charlton
gmcharlt at gmail.com
Sun Aug 9 21:39:29 CEST 2009
Hi,
I've pushed this patch after testing, but I agree with Joe - this
bugfix really needs test cases. For that matter, the patch
description is lacking - there should at least have been an example of
the problem fixed by this patch.
Regards,
Galen
2009/7/23 Henri-Damien LAURENT <henridamien.laurent at biblibre.com>:
> For some reason, it did not do an exact match on stopwords and would also prune parts of other words
> ---
> C4/Search.pm | 23 ++++++++++-------------
> 1 files changed, 10 insertions(+), 13 deletions(-)
>
> diff --git a/C4/Search.pm b/C4/Search.pm
> index 467e6a1..ee54d56 100644
> --- a/C4/Search.pm
> +++ b/C4/Search.pm
> @@ -713,19 +713,16 @@ sub _remove_stopwords {
> # we use IsAlpha unicode definition, to deal correctly with diacritics.
> # otherwise, a French word like "leçon" woudl be split into "le" "çon", "le"
> # is a stopword, we'd get "çon" and wouldn't find anything...
> - foreach ( keys %{ C4::Context->stopwords } ) {
> - next if ( $_ =~ /(and|or|not)/ ); # don't remove operators
> - if ( $operand =~
> - /(\P{IsAlpha}$_\P{IsAlpha}|^$_\P{IsAlpha}|\P{IsAlpha}$_$|^$_$)/ )
> - {
> - $operand =~ s/\P{IsAlpha}$_\P{IsAlpha}/ /gi;
> - $operand =~ s/^$_\P{IsAlpha}/ /gi;
> - $operand =~ s/\P{IsAlpha}$_$/ /gi;
> - $operand =~ s/$1//gi;
> - push @stopwords_removed, $_;
> - }
> - }
> - }
> + foreach ( keys %{ C4::Context->stopwords } ) {
> + next if ( $_ =~ /(and|or|not)/ ); # don't remove operators
> + if ( my ($matched) = ($operand =~
> + /(\P{IsAlnum}\Q$_\E\P{IsAlnum}|^\Q$_\E\P{IsAlnum}|\P{IsAlnum}\Q$_\E$|^\Q$_\E$)/gi) )
> + {
> + $operand =~ s/\Q$matched\E/ /gi;
> + push @stopwords_removed, $_;
> + }
> + }
> + }
> return ( $operand, \@stopwords_removed );
> }
>
> --
> 1.6.0.4
>
>
> _______________________________________________
> Koha-patches mailing list
> Koha-patches at lists.koha.org
> http://lists.koha.org/mailman/listinfo/koha-patches
>
>
--
Galen Charlton
gmcharlt at gmail.com
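
For illustration only, here is a minimal sketch of one case the patch appears to address, inferred from the diff rather than confirmed by the patch author: the switch from \P{IsAlpha} to \P{IsAlnum} as the word-boundary class. The helper is_standalone, the stopword 'the', and the sample operands are all hypothetical. The real list comes from C4::Context->stopwords, and the real code performs substitutions rather than a simple test. The sketch also uses \z where the patch uses $.

```perl
use strict;
use warnings;

# Returns 1 if $word appears in $operand as a standalone token, where a token
# boundary is the start or end of the string or a character matching $boundary.
# (Hypothetical helper, not part of C4::Search.)
sub is_standalone {
    my ( $operand, $word, $boundary ) = @_;
    return $operand =~ /(?:^|$boundary)\Q$word\E(?:$boundary|\z)/i ? 1 : 0;
}

my $old_boundary = qr/\P{IsAlpha}/;    # boundary class used before the patch
my $new_boundary = qr/\P{IsAlnum}/;    # boundary class used after the patch
my $stopword     = 'the';              # hypothetical stopword

# Report, for each sample operand, whether each class treats the stopword
# as a standalone token (and hence as a candidate for removal).
foreach my $operand ( 'the cat', 'the2nd edition', 'bathe' ) {
    printf "%-16s old=%d new=%d\n", $operand,
        is_standalone( $operand, $stopword, $old_boundary ),
        is_standalone( $operand, $stopword, $new_boundary );
}
```

With the old class a digit counts as a boundary, so 'the' at the start of 'the2nd edition' is treated as a stopword and part of the token would be pruned. With the new class it is left alone. A test case along these lines would be a reasonable starting point for the coverage requested above.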