[Koha-patches] [PATCH] _remove_stopwords in C4::Search had some issues

Galen Charlton gmcharlt at gmail.com
Sun Aug 9 21:39:29 CEST 2009


Hi,

I've pushed this patch after testing, but I agree with Joe: this
bugfix really needs test cases.  For that matter, the patch
description is lacking; it should at least have included an example of
the problem that the patch fixes.
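
As a rough illustration of the kind of test case meant here, the sketch
below compares the two boundary classes that appear in the diff
(\P{IsAlpha} before the patch, \P{IsAlnum} after). It is a simplified
stand-in, not the code in C4::Search: the strip_stopwords helper and the
stopword list are hypothetical, and the truncation of alphanumeric tokens
is only one plausible reading of the "prune some other part of words"
symptom, since the patch description gives no example.

```perl
use strict;
use warnings;
use utf8;
use Test::More;

# Hypothetical stopword list; in Koha this would come from C4::Context->stopwords.
my @stopwords = qw(le of the);

# Hypothetical helper: remove whole-word stopwords from $operand.
# $boundary is the class treated as a word delimiter
# (\P{IsAlpha} before the patch, \P{IsAlnum} after it).
sub strip_stopwords {
    my ( $operand, $boundary ) = @_;
    my @removed;
    for my $sw (@stopwords) {
        # \Q...\E quotes the stopword so regex metacharacters in it are literal.
        my $re = qr/(?:^|${boundary})\Q$sw\E(?:${boundary}|\z)/i;
        push @removed, $sw if $operand =~ s/$re/ /g;
    }
    return ( $operand, \@removed );
}

# With an alpha-only boundary, the digit "4" counts as a delimiter,
# so "the" is stripped from the front of an alphanumeric token.
my ($old) = strip_stopwords( 'the4runner', qr/\P{IsAlpha}/ );
is( $old, ' runner', 'alpha-only boundary truncates an alphanumeric token' );

# With an alphanumeric boundary, the token is left alone.
my ( $new, $new_removed ) = strip_stopwords( 'the4runner', qr/\P{IsAlnum}/ );
is( $new, 'the4runner', 'alnum boundary leaves the token intact' );
is_deeply( $new_removed, [], 'nothing reported as removed' );

# The diacritics case from the code comment: "le" inside "leçon" must survive.
my ($fr) = strip_stopwords( 'the leçon', qr/\P{IsAlnum}/ );
is( $fr, ' leçon', 'a stopword inside a longer word is not removed' );

done_testing();
```

Real test cases would of course call _remove_stopwords itself with the
installation's stopword table rather than a stand-in like this.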

Regards,

Galen

2009/7/23 Henri-Damien LAURENT <henridamien.laurent at biblibre.com>:
> For some reason, _remove_stopwords did not do an exact match on stopwords and would also strip parts of other words
> ---
>  C4/Search.pm |   23 ++++++++++-------------
>  1 files changed, 10 insertions(+), 13 deletions(-)
>
> diff --git a/C4/Search.pm b/C4/Search.pm
> index 467e6a1..ee54d56 100644
> --- a/C4/Search.pm
> +++ b/C4/Search.pm
> @@ -713,19 +713,16 @@ sub _remove_stopwords {
>  #       we use IsAlpha unicode definition, to deal correctly with diacritics.
>  #       otherwise, a French word like "leçon" woudl be split into "le" "çon", "le"
>  #       is a stopword, we'd get "çon" and wouldn't find anything...
> -        foreach ( keys %{ C4::Context->stopwords } ) {
> -            next if ( $_ =~ /(and|or|not)/ );    # don't remove operators
> -            if ( $operand =~
> -                /(\P{IsAlpha}$_\P{IsAlpha}|^$_\P{IsAlpha}|\P{IsAlpha}$_$|^$_$)/ )
> -            {
> -                $operand =~ s/\P{IsAlpha}$_\P{IsAlpha}/ /gi;
> -                $operand =~ s/^$_\P{IsAlpha}/ /gi;
> -                $operand =~ s/\P{IsAlpha}$_$/ /gi;
> -                               $operand =~ s/$1//gi;
> -                push @stopwords_removed, $_;
> -            }
> -        }
> -    }
> +               foreach ( keys %{ C4::Context->stopwords } ) {
> +                       next if ( $_ =~ /(and|or|not)/ );    # don't remove operators
> +                       if ( my ($matched) = ($operand =~
> +                               /(\P{IsAlnum}\Q$_\E\P{IsAlnum}|^\Q$_\E\P{IsAlnum}|\P{IsAlnum}\Q$_\E$|^\Q$_\E$)/gi) )
> +                       {
> +                               $operand =~ s/\Q$matched\E/ /gi;
> +                               push @stopwords_removed, $_;
> +                       }
> +               }
> +       }
>     return ( $operand, \@stopwords_removed );
>  }
>
> --
> 1.6.0.4



-- 
Galen Charlton
gmcharlt at gmail.com
