2017-11-27 Reporting SIG Notes

Date

Attendees

Goals

  • learn more about technical architecture design for Reporting in Folio

Discussion items

Time | Item | Who | Notes
Assign Notetaker, take attendance, add agenda items | Tod

Notetaker: Anne L. Highsmith

Convener: Tod Olson

Data Lakes and Data Warehouses | Peter, All

 Peter Murray to discuss Data Lakes and Data Warehouses (see discussion thread here)

Peter Murray (PM) showed part of a draft presentation designed to give a Deep Technical View of FOLIO architecture. The recording of the presentation can be found in the Reporting SIG's Google Drive Recordings folder at: https://drive.google.com/drive/folders/0B7G8S7WF6N20VE5Wa0I0STZGcjg?usp=sharing

When the presentation was complete, there was some discussion among committee members, including the following points:

  • Tod Olson (TO) said that the presentation raised some concerns. The first was referential integrity among the identifiers of a transaction, e.g. how the user/circ/inventory pieces of a circulation checkout must be preserved. Chicago's experience with other systems has been that maintaining such referential integrity is a problem. For example, research on a library's own data is one of the principal uses of reporting capability. Peter Murray (PM) responded that such integrity is maintained via the FOLIO APIs. The APIs guarantee it because the Okapi layer will not allow the installation of a microservice that violates data integrity and consistency among microservices.
  • Michael Winkler (MW) commented that PM's statements imply that an initial, massive load of data into the reporting system will be needed, since otherwise there will be no data to report on, and that this initial load probably won't happen through the Okapi layer because of the time it would take. TO added that the storage designer would have to provide an Extract, Transform, Load (ETL) process based on a standardized framework. PM responded that he disagreed, at least in part: the API has to have an openly published definition, and that definition will provide such a framework.
  • TO made the point that he didn't feel complete denormalization of codes to labels was necessary, since resolving codes is a function at which relational databases excel, and the reporting database should be left to do that. MW interjected that such denormalization is one of the functions of a data warehouse: one shouldn't have to do a table join to determine what a code means. He added that the data warehouse captures what is valid at a point in time; if the label for a location code changes over time, for instance, then the data warehouse should capture that.
  • TO raised the issue that different tenants will want the data refinery (the "landing place" in the reporting system for raw data before it's integrated into the warehouse) to reflect their data, which will require work with the storage designers of both the data refinery itself and the microservice modules. PM repeated that the emphasis should be on working with those who design storage in the reporting system; designers who work with the storage microservices in the operational system must be free to select whatever storage format is appropriate, e.g. a relational database in some cases or a document-oriented one in others.
  • Ingolf Kuss (IK) asked why the transactions are not optimized for reporting (a point PM had mentioned earlier). PM replied that it was very much an operational issue: the Okapi layer is optimized for speed of transactions.
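
The two positions on denormalization discussed above (TO: let the reporting database resolve codes with a join; MW: store the label in the warehouse so that it reflects what was valid at load time) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the table names, column names, and location values are hypothetical and are not taken from FOLIO, and an in-memory SQLite database stands in for whatever storage the reporting system ends up using.

```python
import sqlite3


def build_demo():
    """Create hypothetical tables: a code lookup, a normalized loan table,
    and a denormalized warehouse table that stores the label itself."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE location (code TEXT PRIMARY KEY, label TEXT);
        CREATE TABLE loan (id INTEGER PRIMARY KEY, location_code TEXT);
        CREATE TABLE warehouse_loan (
            id INTEGER, location_code TEXT, location_label TEXT);
        INSERT INTO location VALUES ('MAIN', 'Main Library');
        INSERT INTO loan VALUES (1, 'MAIN');
    """)
    return db


def load_warehouse(db):
    """Copy loans not yet in the warehouse, resolving each code to the
    label that is valid at load time (denormalization)."""
    db.execute("""
        INSERT INTO warehouse_loan
        SELECT loan.id, loan.location_code, location.label
        FROM loan JOIN location ON location.code = loan.location_code
        WHERE loan.id NOT IN (SELECT id FROM warehouse_loan)
    """)


def label_via_join(db, loan_id):
    """Resolve the code at query time: always returns the current label."""
    row = db.execute("""
        SELECT location.label
        FROM loan JOIN location ON location.code = loan.location_code
        WHERE loan.id = ?
    """, (loan_id,)).fetchone()
    return row[0]


def label_via_warehouse(db, loan_id):
    """Read the stored label: returns the label as of load time, no join."""
    row = db.execute(
        "SELECT location_label FROM warehouse_loan WHERE id = ?",
        (loan_id,)).fetchone()
    return row[0]


db = build_demo()
load_warehouse(db)
# The label for the location code changes after the warehouse load.
db.execute("UPDATE location SET label = 'Central Library' WHERE code = 'MAIN'")
print(label_via_join(db, 1))       # Central Library
print(label_via_warehouse(db, 1))  # Main Library
```

The join gives the label as it is now, while the warehouse row keeps the label that applied when the loan was loaded. Which behavior a given report needs is the question the discussion left open.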

CODEX Design | Vince, All

Vince Bareau to discuss how the design of the CODEX will impact reporting functionality

(Not addressed; Murray presentation & discussion lasted entire meeting).


Other topics?

Next Meeting: Mon Dec 4

Build Agenda for Dec 4 Meeting:

-introduce new participant Claudius Herkt

-George Stachokas to talk about how he has included Acquisitions data elements into the Reporting SIG Master Spreadsheet (or has captured this in another document)

-Sharon to consult SIG on Paradigm Shift subgroup approach

-lead for MARC Fields Subgroup needed

-possibility of continued discussion on Deep Technical View of FOLIO architecture with Peter Murray


Action items

Peter Murray will post a link to the recording on the Reporting SIG's space on http://discuss.folio.org and invite participants to comment, with a possible follow-up at next week's Reporting SIG.