2019-08-26 Reporting SIG Meeting notes

Date

2019-08-26
Attendees

Present? | Name | Organization
x        | Sharon Beltaine | Cornell University
x        | Sara Colglazier | Mount Holyoke College/Five Colleges
         | Elizabeth Berney | Duke University
         | Erin Nettifee | Duke University
         | Joyce Chapman | Duke University
         | Karen Newbery | Duke University
         | Elizabeth Edwards | University of Chicago
         | Tod Olson | University of Chicago
x        | Claudius Herkt-Januschek | SUB Hamburg
x        | Scott Perry | University of Chicago
x        | Doreen Herold | Lehigh University
         | Stefan Stadtherr | MPIL Heidelberg
x        | Anne L. Highsmith | Texas A&M
x        | Simona Tabacaru | Texas A&M
         | Harry Kaplanian | EBSCO
x        | Kevin Walker | The University of Alabama
         | Ingolf Kuss | hbz
         | Charlotte Whitt | Index Data
         | Lina Lakhia | SOAS
x        | Andi Bihler | Munich Technical University Library
         | Joanne Leary | Cornell University
         | Uschi Klute | GBV
x        | Michael Patrick | The University of Alabama
x        | Vandana Shah | Cornell University
x        | Nassib Nassar | Index Data
x        | Angela Zoss | Duke University
x        | Veit Köppen | University Magdeburg
         | Lisa DeCarolis | Smith College/Five Colleges
x        | Linda Miller | Cornell University
x        | Elena O'Malley | Emerson
x        | Matt Harrington | Duke University
         | Holly Mistlebauer | Cornell University

Discussion Items

Item | Who | Notes

Attendance | Sharon

Today's Attendance Taker: Linda Miller

Today's notetaker: Sharon Beltaine

Last week's notetaker: Angela Zoss

Labor Day meeting cancelled | Sharon

The Reporting SIG meeting on Monday, September 2, 2019 is cancelled in observance of Labor Day in the U.S.


Updates from Various Reporting-Related Groups and Efforts | Various

The Reporting SIG is using small work groups to address priorities and complete our work. Each week, we will provide updates to the Reporting SIG from these various reporting-related groups and efforts:

  • Community and coordination
    • working on plans to accommodate changes in the LDP roadmap
  • LDP Report Working Group
    • this group will be working on report prototypes and queries this fall 
  • LDP Data Privacy Working Group
    • Joyce: Our group has no update. At last Monday's Reporting SIG meeting, the Data Privacy WG was tasked with beginning to prepare a presentation for the Product Council. I had forgotten that all the other group members were on vacation, so we were not able to hold our Friday call last week. We'll have to begin when everyone is back, and will hold our call this coming Friday.
  • LDP SysOps Working Group
    • Jason Root (TAMU) worked on setting up LDP Loader software and containerizing it
  • Software development
    • see next topic
  • Others?


LDP Road Map | Nassib
  • Considerations in developing this first road map for LDP releases:
    • Key requirements:
      • Many reports have been ranked by institutions for "go live (MVP)", which means that they will have to be written and debugged within the next few months.
      • It is not recommended to do significant SQL reporting using FOLIO's internal operational database, as has often been done with many traditional ILS systems.  One reason is that there is no requirement that FOLIO modules use the same database, which means that cross-domain table joins on the operational database may break irreparably in the future.
      • Reporting analysts want an easy and familiar query model, and one that works with common reporting tools.
      • Reporting analysts will want queries to run efficiently.
      • Reporting SIG members have repeatedly communicated the need for access to all FOLIO data for reporting purposes.
    • Challenges and constraints:
      • The number of storage interfaces in FOLIO will soon exceed 100.  Although ETL for most of them is relatively simple, synchronizing such a large number of tables reliably on schema changes requires very active coordination with FOLIO, which so far has not proved to be possible.
      • Critical LDP dependencies on FOLIO core development, which were requested and flagged as critical beginning in late 2018, will likely not be addressed until mid-2020 or later, based on discussions with the FOLIO capacity planning and project management groups.
      • FOLIO is requesting that "go live (MVP)" features be completed by January 2020, to be released in Summer 2020.
      • Future developer resources appear likely to remain at roughly 1 FTE or less on average, spread across several part-time developers.
  • LDP 1.0 proposed core features, for "go live (MVP)":
    • Support for ad hoc, cross-domain queries for all, or a very large proportion, of FOLIO data extracted from storage modules.  We would ask this working group and the Reporting SIG to help us determine, as soon as possible, the definition of "all FOLIO data" required for inclusion in the LDP.
    • Include MARC records extracted from FOLIO SRS and transformed for easier querying.
    • Historical data will be retained in the LDP but not transformed into a single schema.
    • The LDP database is recommended to be refreshed once per day from the FOLIO operational database.
    • Support for optional anonymization of personal data.  The Data Privacy WG will propose requirements for this feature, in particular which fields should be anonymized.
    • Implementation guidelines (documentation) for local tables.
    • Support for PostgreSQL and Redshift database systems.
    • Proposed data model design for LDP 1.0 based on the original LDP Architecture proposal.  See LDP documentation at: https://github.com/folio-org/ldp
  • Schedule:  LDP Beta (feature complete) in January 2020, LDP 1.0 in Summer 2020.
  • LDP beyond 1.0:  Historical queries using a single schema, full ETL, and relational or star schema could in theory be implemented for later releases, but this is highly dependent on the identified critical dependencies and availability of developer resources.
  • Current support for query development:
    • The test database for the Report Prototype Working Group is now using the proposed data model design for LDP 1.0.  Please take a look and send feedback as soon as possible.
    • Also in the test database are data needed for the Circ Item Detail query, in the following tables:
      • groups
      • holdings
      • instance_types
      • instances
      • institutions
      • items
      • loans
      • locations
      • material_types
      • service_points
      • temp_loans (workaround for missing effective location attribute in FOLIO)
      • users

    • Additional tables will soon be added to support writing report queries as prioritized by the Report Prototype working group.
    • Propose that the LDP Report Prototype Working Group (RPWG) be refocused, as originally envisioned, to implement prototype reports/queries and to bring them before this SIG for comment.  We will offer the RPWG training videos and assistance in SQL at the basic level required for creating report queries.
    • We still urgently need test data that we can extract from running FOLIO installations.  Currently talking with Chicago about access to their data.
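
As an illustration of the kind of cross-domain query the test database tables listed above are meant to support, here is a minimal sketch in Python. It is not the actual Circ Item Detail query. It uses an in-memory SQLite database as a stand-in for PostgreSQL or Redshift, borrows only the table names loans, items, users, and material_types from the list above, and every column name and data value in it is a hypothetical assumption.

```python
import sqlite3

# Hypothetical query: joins circulation data (loans) with inventory data
# (items, material_types) and user data (users). Column names are assumed.
CIRC_DETAIL_SQL = """
SELECT l.id, l.loan_date, i.barcode, m.name, u.username
FROM loans AS l
JOIN items AS i ON i.id = l.item_id
JOIN material_types AS m ON m.id = i.material_type_id
JOIN users AS u ON u.id = l.user_id
ORDER BY l.loan_date
"""


def build_demo_db():
    """Create an in-memory database holding a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (id TEXT PRIMARY KEY, username TEXT);
        CREATE TABLE material_types (id TEXT PRIMARY KEY, name TEXT);
        CREATE TABLE items (id TEXT PRIMARY KEY, barcode TEXT,
                            material_type_id TEXT);
        CREATE TABLE loans (id TEXT PRIMARY KEY, user_id TEXT,
                            item_id TEXT, loan_date TEXT);
        """
    )
    conn.executemany("INSERT INTO users VALUES (?, ?)",
                     [("u1", "patron-a"), ("u2", "patron-b")])
    conn.executemany("INSERT INTO material_types VALUES (?, ?)",
                     [("m1", "book"), ("m2", "dvd")])
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)",
                     [("i1", "1001", "m1"), ("i2", "1002", "m2")])
    conn.executemany("INSERT INTO loans VALUES (?, ?, ?, ?)",
                     [("l1", "u1", "i1", "2019-08-01"),
                      ("l2", "u2", "i2", "2019-08-15"),
                      ("l3", "u1", "i2", "2019-08-20")])
    return conn


def circ_item_detail(conn):
    """Run the cross-domain join and return one tuple per loan."""
    return conn.execute(CIRC_DETAIL_SQL).fetchall()


if __name__ == "__main__":
    for row in circ_item_detail(build_demo_db()):
        print(row)
```

The point of the sketch is only that a single SQL statement can join tables from different FOLIO domains once they sit in one reporting database; the real column names should be taken from the test database itself.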


NOTES:

-the first version of the LDP was designed to compile multiple FOLIO database schemas into a single database with a star schema

-FOLIO core project developers cannot provide the support needed to resolve the reporting dependencies (blockers)

-the LDP roadmap requires adjustment to enable data analysts to provide high priority reports by Go Live

-deliverables must be ready by January 2020 for testing prior to Go Live

-instead of a star-based schema, the first schema delivered via the LDP will be a hybrid of relational and JSON nested data

-transformation of data will be automated

-data that are not nested will be available in a column

-data analysts will have access to all of the data in FOLIO

-we may have access to University of Chicago's data for limited testing purposes
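
One possible reading of the hybrid relational/JSON idea noted above, sketched in Python: top-level values that are not nested become ordinary columns, and nested values stay together as JSON. This is an assumption about the general approach, not the LDP's actual transformation. The function name to_hybrid_row, the "data" column name, and the sample record fields are all hypothetical.

```python
import json


def to_hybrid_row(record):
    """Split one JSON object into flat columns plus a JSON text column.

    Scalar top-level values become columns of their own. Nested objects
    and arrays are kept together, serialized as JSON, in a column named
    "data" (a hypothetical name; a record with its own top-level "data"
    field would need a different name to avoid a collision).
    """
    columns = {}
    nested = {}
    for key, value in record.items():
        if isinstance(value, (dict, list)):
            nested[key] = value      # stays nested, stored as JSON
        else:
            columns[key] = value     # not nested, gets its own column
    columns["data"] = json.dumps(nested, sort_keys=True)
    return columns


if __name__ == "__main__":
    # Made-up record loosely shaped like a loan; field names are assumed.
    loan = {
        "id": "l1",
        "userId": "u1",
        "loanDate": "2019-08-01",
        "status": {"name": "Open"},
    }
    print(to_hybrid_row(loan))
```

With a layout like this, simple filters can use the flat columns directly, while the nested parts remain available through the database's JSON functions.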


QUESTIONS:

-What can we expect for API documentation? 

  • The FOLIO project does not have the resources to work on API documentation right now
  • The FOLIO project would like the RPWG to prioritize the areas where documentation is needed soonest


Additional Topics? | All

Topics for Future Meetings | All

Review and update Topics for Future Reporting SIG Meetings 


Action items

  • Reporting Data Privacy to follow up with Jesse, Tod, and Cate, and Chair-Elect to determine approach to audit trails in LDP