2020-02-26 Special Data Migration Meeting Notes

Date

Attendees

Goals

  • List of items needed to facilitate data migration to be taken back to Tech Council.

Discussion items

Time | Item | Who | Notes
15m | Discuss data migration experiences | University of Chicago - Index Data

Charles Ledvina - Synchronous endpoints work well for instances, holdings, and items. Problems appear to be in SRS. After about 2M records, the process grinds to a halt.

Wayne Schneider - Strategy was to load inventory and SRS separately and link afterward. Dropping indices dramatically improved performance.

Theodor Tolstoy (One-Group.se) - Complexity of SRS affects performance.

Mark Veksler - What is acceptable performance for data load and subsequent data upgrades?

Ian Walls - My benchmark is 70 records/second.

Tod Olson - Batch endpoints for CRUD operations need to be fleshed out. Concerns about large data operations (bulk updates) and efficient reads (paging vs. streaming).

https://docs.google.com/document/d/10TiO1vn6D8U3f4GsgLCbr2_-405kFu5YydcrVu-TohI/edit#heading=h.9domz8d2r0ni
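
The paging side of the paging vs. streaming concern can be illustrated with a minimal sketch. It assumes a hypothetical `fetch_page(offset, limit)` callable that returns a list of records; this is a stand-in for whatever read endpoint is used, not an actual FOLIO API. The generator hides the page boundaries so the caller consumes records as a stream while only one page is held in memory at a time.

```python
from typing import Callable, Iterator, List


def iter_paged(
    fetch_page: Callable[[int, int], List[dict]],
    page_size: int = 1000,
) -> Iterator[dict]:
    """Yield records one at a time while fetching them page by page.

    fetch_page is a hypothetical callable (offset, limit) -> list of records.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return  # no more records
        yield from page
        if len(page) < page_size:
            return  # a short page means this was the last one
        offset += len(page)


# Stand-in data source for illustration only (not a real endpoint).
RECORDS = [{"id": i} for i in range(2500)]


def fake_fetch(offset: int, limit: int) -> List[dict]:
    return RECORDS[offset:offset + limit]


total = sum(1 for _ in iter_paged(fake_fetch, page_size=1000))
```

Each page still costs one request, so the page size trades memory use against the number of round trips.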

15m | Discuss data migration experiences | Texas A&M - William Welling

William Welling - First attempt used Camunda to page records out of an Oracle database and post them to the SRM endpoint. Ended up with duplicates in SRS. Took 10 days and never reached 4M records. Second attempt, going directly to the DB, resulted in errors in the UI.

Wayne Schneider - Data migration has historically been considered out of scope for data import.

William Welling - Is the community going to have a common practice for moving data into the system in a migration?

spampell - Imperative that we have a community standard practice for migrating data from an existing ILS into FOLIO.

10m | Discuss data migration experiences | Others

zeno.tajoli - Shared script used to do data migration. 2.6M records took 2h 45m. Using batch API, not using SRS, only inventory.

Theodor Tolstoy (One-Group.se) - 1.3M bibs and 1.7M items, happy with the batch APIs. Source records are loaded directly into the DB. Re-implementing business logic in migration tooling is not desirable. Tod Olson agrees.

Anne L. Highsmith - At times we receive errors that aren't very specific about which records are failing. Theodor Tolstoy (One-Group.se) has had similar experiences.
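
To put the figures reported in this meeting on a common scale, the arithmetic below converts them to records per second. It uses only numbers from these notes and assumes that "2.6m" and "4m" are record counts; the function names are illustrative.

```python
def records_per_second(records: int, seconds: float) -> float:
    """Average throughput over a whole load."""
    return records / seconds


def hours_to_load(records: int, rate: float) -> float:
    """Hours needed to load `records` at `rate` records/second."""
    return records / rate / 3600


# Shared script via batch API: 2.6M records in 2h 45m (9,900 s).
batch_rate = records_per_second(2_600_000, 2 * 3600 + 45 * 60)  # about 263 records/s

# Camunda to SRM: never reached 4M records in 10 days, so this is an upper bound.
camunda_upper_bound = records_per_second(4_000_000, 10 * 24 * 3600)  # about 4.6 records/s

# The same 2.6M records at the 70 records/second benchmark.
benchmark_hours = hours_to_load(2_600_000, 70)  # about 10.3 hours
```

On these numbers the batch API load ran at several times the 70 records/second benchmark, while the Camunda attempt stayed more than an order of magnitude below it.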


10m | Status of SRS module rewrite???

Theodor Tolstoy (One-Group.se) - Understands this has been pushed to Q2.

MODSOURCE-97 is the locus of discussion for the SRS redesign.


10m | Action items | Stephen Pampell

Action items

  • Develop Data Migration community best practice
  • Task force to work on SRS problems and workarounds for data migration
    • Task force should be able to directly interface with developers working on SRS
    • We would prefer the efficiency of using the existing data import endpoints to do migrations. If the endpoints are not robust enough, they should be re-worked to handle the amount of data inherent in large migrations.