[Koha-bugs] [Bug 23487] New: Utf8mb4 collation causes problems with missing tables when upgrading the database

bugzilla-daemon at bugs.koha-community.org bugzilla-daemon at bugs.koha-community.org
Wed Aug 21 23:06:02 CEST 2019


https://bugs.koha-community.org/bugzilla3/show_bug.cgi?id=23487

            Bug ID: 23487
           Summary: Utf8mb4 collation causes problems with missing tables
                    when upgrading the database
 Change sponsored?: ---
           Product: Koha
           Version: unspecified
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5 - low
         Component: Architecture, internals, and plumbing
          Assignee: koha-bugs at lists.koha-community.org
          Reporter: andreas.hedstrom.mace at sub.su.se
        QA Contact: testopia at bugs.koha-community.org

Changing the default encoding to utf8mb4 and collation to utf8mb4_unicode_ci,
can create duplicate unique keys as it ignores differences in accented
characters vs non-accented characters. An example would be the Swedish letter
ö, which is treated like the letter o. More is discussed here:
https://stackoverflow.com/questions/47119794/mysql-mariadb-unable-to-handle-unique-keys-with-when-using-utf8mb4

When upgrading a large database to any version with the new encoding and
collation, there is a chance of duplicate unique keys crashing the collation
change of the table (or tables) and as a result corrupting InnoDB tablespace.
The final result is one or more tables that both exist and don't exist, and
thus can't be deleted or created again. We, not so fondly, called them
Schrödinger's tables.

For Stockholm University Library, we had about 30 duplicate unique keys in the
borrowers table (out of ~220K patrons), mostly in userid. Which crashed the
database every time we tried to upgrade until we found the cause. The
workaround solution was to change/remove these duplicates, but perhaps a better
solution for the community would be to change the default encoding/collation
for columns with unique keys? Perhaps utf8mb4_bin?

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are watching all bug changes.


More information about the Koha-bugs mailing list