2018-01-11 - System Operations and Management SIG Notes

Date

Attendees

Goals

Discussion items

Time | Item | Who | Notes
5 | Welcome | Ingolf

Ingolf will be the note taker today.

Introductions: new member Craig Boman.

30 | List of Integrations

Which issues need development and a decision by the PC?

Discuss the List of Integrations.

These kinds of issues are not yet represented in the Backlog, which so far focuses on UI application features. The List will be presented to the PC today (Ingolf, Chris).

Texas A&M has mostly custom-developed integrations.

15 | Early results on load timings / bulk user load | Tod

Early results on load timings for the user module and their implications for other forms of bulk loading.

  • discuss needs for optimization and acceptance criteria

Tod reports that UChicago (John) has done some testing in a cloud environment. The bulk user load took 2h 28m for 90,000 users. For a one-time load this is so-so, but extrapolated to bib loading (millions of records) it will be too slow. The bulk import calls the user model one at a time for each user. This could easily be optimized. The bulk user loader has not been brought back to the User Management SIG for user acceptance (Chris M. to follow up with Katalin, see Action Items). Deleting users was not discussed in the scope of the bulk loader, but it is important for testing. The Sprint Review reports a bulk load of 2.6 million bib records in 1h 15m for a raw load into a blank database. But what about merges and overloads?
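To make the extrapolation concrete, here is a rough back-of-the-envelope calculation using only the figures quoted above. It assumes, as a simplification, that a bib record would load at the same per-record rate as a user record; the function names are illustrative only.

```python
# Figures quoted in the notes.
USER_COUNT = 90_000
USER_LOAD_SECONDS = 2 * 3600 + 28 * 60     # 2h 28m
BIB_COUNT = 2_600_000
BIB_RAW_LOAD_SECONDS = 1 * 3600 + 15 * 60  # 1h 15m


def records_per_second(count, seconds):
    """Average throughput of a load."""
    return count / seconds


def hours_to_load(count, rate):
    """Time in hours to load `count` records at `rate` records per second."""
    return count / rate / 3600


user_rate = records_per_second(USER_COUNT, USER_LOAD_SECONDS)
bib_rate = records_per_second(BIB_COUNT, BIB_RAW_LOAD_SECONDS)

# Assumption: bib records load at the same per-record rate as users.
extrapolated_hours = hours_to_load(BIB_COUNT, user_rate)

print(f"user loader:  {user_rate:.1f} records/s")
print(f"raw bib load: {bib_rate:.1f} records/s")
print(f"2.6 million records at the user-loader rate: {extrapolated_hours:.0f} h")
```

At roughly 10 records per second, 2.6 million records would take about 71 hours, close to three days, whereas the raw load reported at the Sprint Review ran at well over 500 records per second.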

There should be a review at Culto, IndexData and the other developer groups to optimize database commits. Bottlenecks and performance issues must be eliminated.
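As an illustration of what optimizing database commits can mean, the sketch below contrasts one commit per record with one commit per batch. It uses an in-memory SQLite database purely as a stand-in; the `users` table, the batch size, and the function names are hypothetical and do not describe the actual storage layer.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    return conn


def load_one_at_a_time(conn, rows):
    """One INSERT and one commit per record (the pattern described above)."""
    for row in rows:
        conn.execute("INSERT INTO users (id, name) VALUES (?, ?)", row)
        conn.commit()


def load_in_batches(conn, rows, batch_size=1000):
    """Insert a whole batch per call and commit once per batch."""
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", batch)
        conn.commit()


rows = [(i, f"user-{i}") for i in range(5000)]

slow = make_db()
load_one_at_a_time(slow, rows)

fast = make_db()
load_in_batches(fast, rows)

slow_count = slow.execute("SELECT COUNT(*) FROM users").fetchone()[0]
fast_count = fast.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(slow_count, fast_count)
```

Both variants load the same data; the batched variant issues 5 commits instead of 5,000. On a disk-backed or networked database the per-commit cost is usually what dominates, which is why this is a natural first place to look.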

A best design practice for a similar look and feel across endpoints is desired. Our guess is that the endpoints are being developed quite independently by different teams. We need an emphasis on keeping this consistent, and this SIG should keep an eye on it. The methods and the way of calling them should be similar. One might call it API consistency.

(acceptance criteria) What is acceptable for patron loading? 10 minutes - yes. 2 hours - maybe. 3 days - no. It is hard to set hard limits here; it depends on a lot of variables, the external conditions and the institutional environment. Each institution has its own acceptance criteria.

Objection to the use of UUIDs as primary identifiers: Identifiers should be the same in the legacy system and in the new system. Creating UUIDs as new identifiers will add complexity to the migration. One will probably have to watch the migration time, and one has to take special care of data integrity. Apart from that, large loads will be a little slower than with integer counters. UUIDs are very bad as primary keys from a performance standpoint. Also, you will probably have to insert the row first and then read it back in order to get the UUID. You probably can't build the UUID first. This is really cumbersome for migration; we want to do the translation of IDs before we do the bulk load. - This should be presented specifically as a migration issue, with regard to maintaining referential data integrity.
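For reference, the kind of ID translation the group wants to do before the bulk load could look like the sketch below. It assumes the target system accepts caller-supplied UUIDs, which the discussion above leaves open. The namespace, record types, and sample data are hypothetical. The sketch derives name-based (version 5) UUIDs deterministically from legacy IDs, so references between records can be rewritten and checked before anything is loaded.

```python
import uuid

# Hypothetical namespace for this migration; any fixed UUID would do.
MIGRATION_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "migration.example.org")


def uuid_for_legacy_id(record_type, legacy_id):
    """Derive a stable UUID from a record type and its legacy integer ID."""
    return uuid.uuid5(MIGRATION_NAMESPACE, f"{record_type}:{legacy_id}")


# Hypothetical legacy data: loans refer to users by legacy user ID.
legacy_users = [{"id": 101, "name": "A"}, {"id": 102, "name": "B"}]
legacy_loans = [
    {"id": 9001, "user_id": 101},
    {"id": 9002, "user_id": 102},
    {"id": 9003, "user_id": 101},
]

# Translate primary keys and foreign keys with the same function.
new_users = [
    {"id": uuid_for_legacy_id("user", u["id"]), "name": u["name"]}
    for u in legacy_users
]
new_loans = [
    {
        "id": uuid_for_legacy_id("loan", l["id"]),
        "user_id": uuid_for_legacy_id("user", l["user_id"]),
    }
    for l in legacy_loans
]

# Referential integrity check before the bulk load.
user_ids = {u["id"] for u in new_users}
dangling = [l for l in new_loans if l["user_id"] not in user_ids]
print(f"{len(new_loans)} loans translated, {len(dangling)} dangling references")
```

Because the mapping is a pure function of the legacy ID, rerunning the translation yields the same UUIDs, and no lookup table has to be read back from the target database. Whether this is usable depends on the assumption stated above.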

We will also take the API consistency issue and the UUIDs issue to the SysOps Product Owner (whom we do not have yet).

10 | Conceptual Architectural Diagram | Wayne | deferred


20 | Data migration | Ingolf / group | deferred

Data migration is a big load of work for the SIG.

  • what aspects of data migration will the SIG be responsible for?
  • what should be delivered by the SIG?
  • how will the SIG involve Subject Matter Experts (SMEs) from other SIGs to work on data migration? (in what form and when will that happen?)

It seems to me (Ingolf) that just one hour per week is not enough to discuss this (data migration) and work on this further. Can we choose one of the following solutions:

  1. Build a subgroup. A limited number of experts meets on a regular basis, in addition to the regular SIG meetings, and works on questions of data migration.
  2. The entire SIG meets more often, say, twice a week.
  3. The SIG meetings take place weekly and cover data migration issues, but the SIG meetings will be longer (90 minutes).
5 | Next Meeting | Ingolf | topics for next meeting

Action items

  • Tod and Chris M. schedule a time to bash through the List of Integrations. Chris still has a list from the OLE migration. It would be useful to give some examples. Texas (Steve) will contribute and adapt as is appropriate.
  • Chris M. will reach out to Katalin and Cate (the PO for User Management) and will talk about our issues on bulk user loading, which are important for migration and testing.
  • Sharon Wiles-Young could present the API consistency issue (best practices for endpoints' look and feel) to the developers at the Madrid conference (Jan 22 - 24). Tod is going to write notes on that for her reference. Tod will also chat with Wayne. The UUIDs issue should be presented at the same venue. Whom will we ask to present the two issues at the meeting?