2019-07-29 - Data Migration Subgroup Agenda and Notes

Date

2019-07-29 at 11 EST

Link to meeting: https://zoom.us/j/204980147

Discussion items

Time | Item | Who | Notes
5 minutes | Welcome and assign note taker | Dale

Welcome and request for someone to take notes, preferably someone who hasn't done it in a while.

35 minutes | Bulk API review | various

We'll take a look at where we are with the bulk APIs and what we need to do to move them along.

Some jiras of relevance:

Background and discussion from other sites (cited by Anatolii Starkov):

https://evertpot.com/http/207-multi-status
https://apihandyman.io/api-design-tips-and-tricks-getting-creating-updating-or-deleting-multiple-resources-in-one-api-call/
https://medium.com/paypal-engineering/batch-an-api-to-bundle-multiple-paypal-rest-operations-6af6006e002
https://developers.google.com/gmail/api/guides/batch
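
As a rough illustration of the multi-status pattern described in the first link above, a bulk endpoint can accept a batch and report a separate outcome for each record. This is only a sketch, not an existing FOLIO API: the function name, the response shape, and the in-memory `store` dict are all assumptions made for the example.

```python
from http import HTTPStatus


def bulk_create(records, store):
    """Try to create each record and report a per-record status.

    `store` is a plain dict standing in for a storage module (hypothetical).
    """
    results = []
    for record in records:
        record_id = record.get("id")
        if record_id is None:
            status, message = HTTPStatus.UNPROCESSABLE_ENTITY, "missing id"
        elif record_id in store:
            status, message = HTTPStatus.CONFLICT, "id already exists"
        else:
            store[record_id] = record
            status, message = HTTPStatus.CREATED, "created"
        results.append(
            {"id": record_id, "status": int(status), "message": message}
        )

    # 207 Multi-Status tells the caller to inspect each entry individually;
    # one failed record does not fail the whole batch.
    return int(HTTPStatus.MULTI_STATUS), {"results": results}


store = {}
code, body = bulk_create([{"id": "a"}, {"id": "a"}, {"title": "no id"}], store)
```

In this sketch a failed record does not roll back the records that succeeded. Whether a real bulk API should be all-or-nothing instead is a separate design decision.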

One complication with implementing bulk load APIs for all modules is that APIs call other APIs.
This is less of an issue with storage APIs, so loading exclusively through them may be the best approach. SRS/SRM is a potential exception because of how complex the interactions between its types of data are.
Apart from that exception, using storage APIs lets us avoid the business logic of other modules.
Ian pointed out that loading directly through storage APIs shifts the burden of implementing that logic to the migration tool.
Dale pointed out that relationships between different types of data need to be maintained outside of FOLIO anyway, so that repeated migration attempts stay consistent.
There was more discussion about how data links together, how these links are maintained, and the use of storage APIs versus business logic APIs for loading.
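
One possible way for a migration tool to keep links between record types stable across repeated attempts is to derive each FOLIO identifier deterministically from the legacy identifier. This technique was not specified in the meeting. It is a sketch under assumptions: the namespace URL, the `legacy_id` and `legacy_holdings_id` keys, and the helper names are hypothetical.

```python
import uuid

# Hypothetical fixed namespace for one migration project. Any constant UUID
# works, as long as it never changes between migration attempts.
MIGRATION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/migration")


def folio_id(record_type: str, legacy_id: str) -> str:
    """Return the same UUID every time for a given record type and legacy id."""
    return str(uuid.uuid5(MIGRATION_NAMESPACE, f"{record_type}:{legacy_id}"))


def build_item(legacy_item: dict) -> dict:
    """Build a minimal item whose link to its holdings record is reproducible."""
    return {
        "id": folio_id("item", legacy_item["legacy_id"]),
        "holdingsRecordId": folio_id("holdings", legacy_item["legacy_holdings_id"]),
    }


item = build_item({"legacy_id": "i42", "legacy_holdings_id": "h7"})
```

Because the identifiers are computed rather than stored, the tool needs no lookup table to reconnect items to holdings on a later run. The trade-off is that the legacy identifiers themselves must stay stable.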

10 minutes | Downtime during migration | various | Tod asked in the implementers Slack channel and will bring it up in the next implementers meeting.
Previous requirements discussions have assumed there can be no downtime. Migration would be much easier if some downtime were allowed.
Jon Miller pointed out that the downtime requirements were more about data load performance: no institution should have to be down for more than a day for migration.
Implementers are basically all planning to migrate during the summer because it's less busy.
5 minutes | Metadata fields in records | various

It needs to be possible to populate record metadata while loading data. Currently, metadata supplied with a record is usually overwritten on load.

This could be accomplished with a flag on the API call, or simply based on whether the metadata attribute is present in the loaded record.
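
The two options can be combined in a small sketch: a flag controls whether supplied metadata is honored at all, and the presence of the metadata attribute decides whether there is anything to preserve. The function name, the `preserve` flag, and the metadata field names used here are assumptions for illustration, not a description of current API behavior.

```python
from datetime import datetime, timezone


def resolve_metadata(incoming: dict, user_id: str, preserve: bool = True) -> dict:
    """Decide which metadata to store for a record being loaded.

    `preserve` is a hypothetical flag; the field names are illustrative.
    """
    now = datetime.now(timezone.utc).isoformat()
    generated = {
        "createdDate": now,
        "createdByUserId": user_id,
        "updatedDate": now,
        "updatedByUserId": user_id,
    }
    supplied = incoming.get("metadata")
    if preserve and supplied:
        # Keep what the legacy system provided; fill any gaps with fresh values.
        return {**generated, **supplied}
    # Default behavior: metadata is (re)generated at load time.
    return generated


legacy = {"id": "r1", "metadata": {"createdDate": "2001-05-04T00:00:00+00:00"}}
kept = resolve_metadata(legacy, "migration-user")
overwritten = resolve_metadata(legacy, "migration-user", preserve=False)
```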

5 minutes | Work prioritization | various | More discussion of prioritizing work: we need to be careful about what and how much we ask for, or it won't get done. Some discussion of who decides development priorities when everything is critical.

Link to Acquisitions Interface Fields
Link to FOLIO Record Data Elements (contains links to specific spreadsheets, but most of them are not up to date).

Action Items

Tod will ask in the implementers meeting if zero downtime is a real requirement for implementers, or if a reasonable amount (less than a day) will be acceptable.