[Koha-devel] Handling normalized phone number data

dcook at prosentient.com.au dcook at prosentient.com.au
Fri Sep 16 02:34:14 CEST 2022


I think that there could be merit in having the separate table, although that would require changes to a number of different patron touch points. I don’t know how it would work with the “Import patrons” or “Batch patron modification” or “Batch patron deletion and anonymization” tools (among others). We’d also need to modify patron entry, modification, and self-registration.

 

I was thinking about your example, and I wasn’t sure how DBIC works with a one-to-many relationship, so I came up with this example after consulting the wiki:

curl -u koha:koha --request GET 'http://127.0.0.1:8081/api/v1/patrons/51' --header "x-koha-embed: extended_attributes" 

 

I gave the patron multiple extended attributes and it did a fantastic job of including the extended attributes as objects in a list in the JSON output. Awesome! I think that could work really well with phones. (Having flags for “primary” and “sms” could be good in that table.)

 

My only hesitation would be with the time/energy that it would take to refactor Koha completely to use a separate phones table on my own :/. 

 

If it were a team effort, I’d be willing to help create and execute a plan though. 

 

It could also become a blueprint for how to normalize other data in the borrowers table (like email and address) into separate tables. 

 

David Cook

Senior Software Engineer

Prosentient Systems

Suite 7.03

6a Glen St

Milsons Point NSW 2061

Australia

 

Office: 02 9212 0899

Online: 02 8005 0595

 

From: Tomas Cohen Arazi <tomascohen at theke.io> 
Sent: Thursday, 15 September 2022 9:59 PM
To: dcook at prosentient.com.au
Cc: koha-devel <koha-devel at lists.koha-community.org>; Kyle Hall <kyle at bywatersolutions.com>; Martin Renvoize <martin.renvoize at ptfs-europe.com>
Subject: Re: Handling normalized phone number data

 

I insist you should think longer term, and add a separate table. That way you will end up with a simpler (and more performant) query as well:

 

GET /patrons

x-koha-embed: phones

x-koha-query: {"phones.normalized":{"-like":"123%"}}
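As a rough illustration of what a separate table could look like, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for Koha's database. The `phones` table, its columns (`number`, `normalized`, `is_primary`, `sms`) and the helper `patrons_by_phone_prefix` are all assumptions made up for this example, not an existing Koha schema or API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE borrowers (borrowernumber INTEGER PRIMARY KEY, surname TEXT);
-- Hypothetical phones table: one patron, many phone numbers.
CREATE TABLE phones (
    id INTEGER PRIMARY KEY,
    borrowernumber INTEGER NOT NULL REFERENCES borrowers(borrowernumber),
    number TEXT NOT NULL,       -- free text, as entered by staff
    normalized TEXT NOT NULL,   -- digits only, used for searching
    is_primary INTEGER NOT NULL DEFAULT 0,
    sms INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX phones_normalized ON phones (normalized);
""")
conn.execute("INSERT INTO borrowers VALUES (51, 'Example')")
conn.executemany(
    "INSERT INTO phones (borrowernumber, number, normalized, is_primary, sms) "
    "VALUES (?, ?, ?, ?, ?)",
    [(51, "(123) 456-7890", "1234567890", 1, 0),
     (51, "0404 123 456", "0404123456", 0, 1)],
)

def patrons_by_phone_prefix(conn, prefix):
    """Find patrons with any phone whose normalized form starts with prefix."""
    return conn.execute(
        "SELECT DISTINCT b.borrowernumber, b.surname FROM borrowers b "
        "JOIN phones p ON p.borrowernumber = b.borrowernumber "
        "WHERE p.normalized LIKE ? ORDER BY b.borrowernumber",
        (prefix + "%",),
    ).fetchall()

print(patrons_by_phone_prefix(conn, "123"))  # [(51, 'Example')]
```

Because the search runs against a plain column, no SQL function is needed in the WHERE clause, which is what lets the query be expressed in the JSON syntax shown above.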

 

 

 

On Thu, 15 Sept 2022 at 1:33, <dcook at prosentient.com.au> wrote:

At the moment, the only solution I can think of is to rewrite the DBIC query after it hits the REST API. I’ve figured out 3 ways to do it:

 

Current:

{"me.phone"=>{"like"=>"%020404123456%"}}

 

Option 1:

{"me.phone"=>{"in"=> \["SELECT phone FROM borrowers WHERE regexp_replace(phone,?,?) LIKE ?","[^0-9]","",$query->{'like'}]}}

 

Option 2:

\["regexp_replace(me.phone,?,?) LIKE ?","[^0-9]","","%12345678901%"]

 

Option 3:

Delete {"me.phone"=>{"like"=>"%020404123456%"}}

Inject the following into the WHERE via the $attributes: \["regexp_replace(me.phone,?,?) LIKE ?","[^0-9]","","%12345678901%"]

 

On a database of over 100,000 patrons, Option 1 takes 3 seconds to return while Option 2 takes 0.2 seconds. Option 3 is a bit harder to test but should also take about 0.2 seconds.
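The SQL behind Options 2 and 3 can be tried out in isolation. The sketch below uses Python's sqlite3 module purely as a stand-in; SQLite has no built-in regexp_replace, so a small Python function is registered under that name with the same (subject, pattern, replacement) argument order. The sample rows and the helper `search_by_phone` are assumptions for illustration, and nothing here says anything about timing.

```python
import re
import sqlite3

def regexp_replace(subject, pattern, replacement):
    """Stand-in for the database's regexp_replace(subject, pattern, replacement)."""
    if subject is None:
        return None
    return re.sub(pattern, replacement, subject)

conn = sqlite3.connect(":memory:")
conn.create_function("regexp_replace", 3, regexp_replace)
conn.execute(
    "CREATE TABLE borrowers (borrowernumber INTEGER PRIMARY KEY, phone TEXT)"
)
conn.executemany(
    "INSERT INTO borrowers (phone) VALUES (?)",
    [("(02) 0404-123 456",), ("02 9212 0899",), (None,)],
)

def search_by_phone(conn, user_input):
    """Normalize the search term, then compare against the normalized column."""
    needle = "%" + re.sub(r"[^0-9]", "", user_input) + "%"
    return conn.execute(
        "SELECT borrowernumber, phone FROM borrowers AS me "
        "WHERE regexp_replace(me.phone, ?, ?) LIKE ? "
        "ORDER BY borrowernumber",
        ("[^0-9]", "", needle),
    ).fetchall()

print(search_by_phone(conn, "0404 123 456"))  # [(1, '(02) 0404-123 456')]
```

The stored value keeps its original formatting; only the comparison sees the digits-only form.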

 

The problem with Option 2 is that it nukes the original hashref, so if that hashref contains other keys for an OR query, they’d be lost. 
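To make the difference concrete, here is a small sketch of the Option 3 idea in Python, with a plain dict standing in for the decoded query hashref. The function name `split_phone_condition` and the `me.surname` key are assumptions for illustration only; how the extracted condition is recombined with the rest of the query (AND versus OR) is deliberately left out.

```python
import re

def split_phone_condition(query):
    """Return (remaining_query, literal) without mutating the input.

    literal is a (sql, bind_values) pair, or None when there is no
    "me.phone" LIKE condition to rewrite.
    """
    cond = query.get("me.phone")
    if not isinstance(cond, dict) or "like" not in cond:
        return dict(query), None
    # Keep every other key, so sibling conditions are not lost.
    remaining = {k: v for k, v in query.items() if k != "me.phone"}
    digits = re.sub(r"[^0-9]", "", cond["like"])
    literal = (
        "regexp_replace(me.phone,?,?) LIKE ?",
        ["[^0-9]", "", "%" + digits + "%"],
    )
    return remaining, literal

query = {
    "me.phone": {"like": "%0204-0412 3456%"},
    "me.surname": {"like": "%Cook%"},  # hypothetical sibling condition
}
remaining, literal = split_phone_condition(query)
print(remaining)  # {'me.surname': {'like': '%Cook%'}}
print(literal)
```

Replacing the whole structure with the literal, as in Option 2, would drop the `me.surname` entry; removing only the phone key keeps it.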

 

I was leaning towards Option 1 until I figured out Option 3.

 

Fortunately, DataTables sends a particular HTTP header, so I’m able to do the rewriting just for DataTables requests to the API.

 

I’ll be posting the patch to https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23817 shortly.

 

Even if the community doesn’t go this way, I might use it locally anyway going forward. I’ve started rolling out a similar method for older Kohas because of the number of libraries flagging phone number searching as a problem. 

 

One way or another it’s a problem that needs to be solved…

 

David Cook


 

From: dcook at prosentient.com.au <dcook at prosentient.com.au>
Sent: Thursday, 15 September 2022 1:23 PM
To: 'Tomas Cohen Arazi' <tomascohen at theke.io>
Cc: 'koha-devel' <koha-devel at lists.koha-community.org>; 'Kyle Hall' <kyle at bywatersolutions.com>; 'Martin Renvoize' <martin.renvoize at ptfs-europe.com>
Subject: RE: Handling normalized phone number data

 

So you’d add it to the client side, eh…

 

I looked through “koha-tmpl/intranet-tmpl/prog/en/includes/patron-search.inc” and “koha-tmpl/intranet-tmpl/prog/js/datatables.js”, and found the DataTables integration with the REST API to be a bit… obscure. But now I see what you mean about adding an embed option via the kohaTable constructor. 

 

Although, like Kyle was saying, adding a phone number table wouldn’t solve this problem (unless we had a normalized phone number column).

 

That said, we wouldn’t need to normalize the phone number data if we could use SQL functions on the left-hand side of a comparison: https://metacpan.org/dist/DBIx-Class/view/lib/DBIx/Class/Manual/Cookbook.pod#Using-SQL-functions-on-the-left-hand-side-of-a-comparison

 

It’s doable from a DBIC level but I don’t see how we can do it with the DataTables + REST API integration since it requires Perl-specific syntax that JSON can’t handle. Handing DBIC-like syntax to the REST API makes this really hard. 

 

The only thing I can think to do is to rewrite the query after it hits the REST API. 

 

I’ll be writing about that again shortly…

 

David Cook


 

From: Tomas Cohen Arazi <tomascohen at theke.io>
Sent: Thursday, 15 September 2022 11:06 AM
To: David Cook <dcook at prosentient.com.au>
Cc: koha-devel <koha-devel at lists.koha-community.org>; Kyle Hall <kyle at bywatersolutions.com>; Martin Renvoize <martin.renvoize at ptfs-europe.com>
Subject: Re: Handling normalized phone number data

 

For the REST API you would just add an embed option. Correct.

 

On Wed, 14 Sept 2022 at 21:37, <dcook at prosentient.com.au> wrote:

I think at some point both the phone and email details should be moved into their own table(s), and then one would get a primary flag. 

 

With the DataTables and REST API, I suppose we could hard-code a join into Koha/REST/V1/Patrons.pm.

 

If we had a “phone” table, we could potentially store both free-text (potentially formatted) and normalized versions of the phone number. Then we’d just update the patron search to use the normalized version. 

 

But all of that sounds like a lot of effort :/

 

David Cook


 

From: Tomas Cohen Arazi <tomascohen at theke.io>
Sent: Wednesday, 14 September 2022 11:10 PM
To: dcook at prosentient.com.au
Cc: koha-devel <koha-devel at lists.koha-community.org>; Kyle Hall <kyle at bywatersolutions.com>; Martin Renvoize <martin.renvoize at ptfs-europe.com>
Subject: Re: Handling normalized phone number data

 

Should we also think of moving phones to their own table?

 

On Wed, 14 Sept 2022 at 3:43, <dcook at prosentient.com.au> wrote:

Hi all,

 

I’ve been working lately on https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23817 in an effort to let staff users easily search phone numbers in the Patrons module.

 

Currently, phone number searching doesn’t work if you have punctuation or other formatting inconsistent with your search query. 

 

So I’m proposing normalizing phone numbers. Normalizing the search query is easy, but normalizing the data in the database is harder. In older versions of Koha, I used SQL to normalize the phone column in the WHERE clause, but that’s seemingly impossible now that we’re using DataTables and the REST API.

 

So it seems to me the only way forward is to normalize the data in the database. I’ve attached a patch which provides a Koha::Patron->phone() setter method which normalizes the phone number before it’s saved in the database, but I don’t know what other Koha folk think about that. 
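For readers who want to see the idea rather than the patch itself, here is a minimal sketch of a normalizing setter, written in Python rather than Koha's Perl. The `Patron` class and `normalize_phone` helper are toy stand-ins invented for this example, and stripping everything except digits is an assumption based on the `[^0-9]` pattern used elsewhere in this thread.

```python
import re

def normalize_phone(raw):
    """Strip everything except digits; return None for empty input."""
    if raw is None:
        return None
    digits = re.sub(r"[^0-9]", "", raw)
    return digits or None

class Patron:
    """Toy stand-in for a patron object with a normalizing phone setter."""

    def __init__(self):
        self._phone = None

    @property
    def phone(self):
        return self._phone

    @phone.setter
    def phone(self, value):
        # Normalize on the way in, so the stored value is always digits only.
        self._phone = normalize_phone(value)

patron = Patron()
patron.phone = "(02) 0404-123 456"
print(patron.phone)  # 020404123456
```

The trade-off with this approach is that the original formatting is lost once the value is stored.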

 

Thoughts?

 

David Cook


 




 

-- 

Tomás Cohen Arazi

Theke Solutions (http://theke.io)
✆ +54 9351 3513384
GPG: B2F3C15F




 

