2019-07-29 Reporting SIG Meeting notes

Date: 2019-07-29

Attendees

Present? | Name | Organization
x | Sharon Beltaine | Cornell University
x | Sara Colglazier | Mount Holyoke College/Five Colleges
  | Elizabeth Berney | Duke University
  | Erin Nettifee | Duke University
x | Joyce Chapman | Duke University
  | Karen Newbery | Duke University
  | Elizabeth Edwards | University of Chicago
x | Tod Olson | University of Chicago
  | Claudius Herkt-Januschek | SUB Hamburg
x | Scott Perry | University of Chicago
x | Doreen Herold | Lehigh University
  | Stefan Stadtherr | MPIL Heidelberg
x | Anne L. Highsmith | Texas A&M
  | Simona Tabacaru | Texas A&M
  | Harry Kaplanian | EBSCO
x | Kevin Walker | The University of Alabama
x | Ingolf Kuss | hbz
  | Charlotte Whitt | Index Data
  | Lina Lakhia | SOAS
  | Michael Winkler | OLE
  | Joanne Leary | Cornell University
  | Uschi Klute | GBV
x | Michael Patrick | The University of Alabama
  | Vandana Shah | Cornell University
x | Nassib Nassar | Index Data
x | Angela Zoss | Duke University
  | Veit Köppen | University Magdeburg
  | Lisa DeCarolis | Smith College/Five Colleges
x | Linda Miller | Cornell University
  | Elena O'Malley | Emerson
x | Matt Harrington | Duke University
  | Holly Mistlebauer | Cornell University
x | Andi Bihler | Munich Technical University Library

Discussion Items

Item

Who

Notes

Attendance (Who: Sharon)

Today's Attendance Taker: Linda Miller

Today's notetaker: Angela Zoss

Last week's notetaker: Anne Highsmith

Updates from Various Reporting Related Groups and Efforts (Who: Various)

The Reporting SIG is using small work groups to address priorities and complete our work. Each week, we will provide updates to the Reporting SIG from these various reporting-related groups and efforts:

  • Community and coordination
    • Planning for things to take a look at, like the blockers we'll review today
    • Looking for test data - Kevin working with Anne Highsmith, Sharon looking at getting demo site test list
    • Trying to ensure anonymization for test data we're getting
    • JIRA tickets under the epics we have, looking at timelines
  • LDP Report Working Group
    • Nothing new lately, didn't meet last week
  • LDP Data Privacy Working Group
    • Everyone helped review the flagged reports with privacy data; 26 of 35 do need to use private, personal information
    • Recommendation is still that private data don't get brought into the LDP, so big group probably needs to review the 26 reports to figure out if they should stay in-app or if there's some other need to bring the personal data in the LDP
    • Reports are in the same spreadsheet - 26 are highlighted in red
    • Maybe try to have functionality that allows libraries to turn on or off GDPR compliance in LDP, controls whether or not personal data are transferred
    • Whatever solution there is, it would need development
    • Yes, that solution is pretty much what the plan would be. The issue the Data Privacy group has been working on is that the GDPR requires data to be anonymized from day 1: as soon as you're using the data, it has to comply with GDPR. That's not mutually exclusive with the idea that if you didn't need the anonymization, you could do without it. It affects categorization of reports: if some libraries don't want personal data, they couldn't make use of reports that use those personal data. If we have reports requiring personal data that use the LDP instead of being in-app, all the European libraries would rely on us to follow GDPR, so either they couldn't use the reports or we would have to comply with all the GDPR tenets, and we'd need to build all of that before the libraries would be able to use those reports. If it's not unreasonable to make the reports in-app, that's simpler.
    • First decision might need to be whether these reports are logically in-app reports, before looking at whether they have resources
    • Next week's Reporting SIG meeting will focus on this
    • Who's talking about GDPR for broader FOLIO? Not totally clear right now, used to be a group...
    • Might want to look at these reports - how integral are they to operations? Do they need to be ready at go-live? Maybe we can take some time, don't need them right away
    • Sharon will look at the pending reports and figure out next steps, but probably going to take a look next week, maybe bring in POs
    • Could take the bigger GDPR question to Product Council
  • LDP SysOps Working Group
  • Software development
  • Others?
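
The on/off GDPR idea raised under the LDP Data Privacy Working Group above could look roughly like the sketch below. It is only an illustration: the field names, the `LoadConfig` class, and the `gdpr_mode` flag are hypothetical, and dropping fields before storage is a simple stand-in, not a statement of what GDPR compliance actually requires.

```python
from dataclasses import dataclass

# Hypothetical field names; the notes do not say which attributes are personal data.
PERSONAL_FIELDS = {"user_id", "name", "email", "barcode"}


@dataclass
class LoadConfig:
    # Assumed default: personal data are not transferred into the LDP.
    gdpr_mode: bool = True


def prepare_record(record: dict, config: LoadConfig) -> dict:
    """Return the record as it would be stored in the LDP."""
    if not config.gdpr_mode:
        return dict(record)
    # Strip personal fields before anything is stored ("from day 1").
    return {k: v for k, v in record.items() if k not in PERSONAL_FIELDS}


loan = {"loan_id": "L1", "user_id": "U42", "email": "a@example.org", "item_id": "I7"}
print(prepare_record(loan, LoadConfig(gdpr_mode=True)))
print(prepare_record(loan, LoadConfig(gdpr_mode=False)))
```

With the flag on, a report that needs `user_id` cannot run from the LDP, which is the categorization problem described above.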


Update on Reporting Issue Blockers (Who: Nassib)

We will review reporting issues in JIRA that are blocking progress on the completion of our Epics. After reviewing the details of these issues, we will discuss the potential impact on our FOLIO deliverables.

  • A blocker means that if these JIRA tickets are not completed, our LDP development cannot move forward, so work on these needs to be escalated

Blockers:

UXPROD-1438: Process for tracking data-related API schema changes

UXPROD-1431: Data attributes needed for reporting

UXPROD-1414: Complete documentation of data attributes in module APIs

FOLIO-1839: Enable incremental requests for all records since last request


Notes:

  • Documentation and tracking schema changes are related: can we keep the LDP synchronized with the FOLIO API, so that when FOLIO releases a new version the LDP will not break? The LDP needs to understand changes to the FOLIO data model, just like any client.
  • "Complete documentation": early on, looked like many FOLIO modules were not fully documented, and prototyping working group confirmed that as they worked through the documentation to find attributes.
  • "Versioning" issue: idea from beginning of FOLIO is that modules should have versioned APIs; idea is to have clear versions when there are changes. The documentation of what attributes mean should be part of the versioning; if the meaning changes, that's a change in the contract between client and server. Not having documentation included in versioning is a problem. This is the first blocker that came up. Nassib and Sharon spent a few months working with Product Council at the beginning of the year, but there wasn't a good process for that, so just put it aside for a bit and now need to come back.
  • Trying to schedule LDP releases and plan what will be included, so need to have some idea of when the dependencies will be completed
  • "Data attributes needed for reporting": this is maybe only one attribute, effective location. Created an umbrella issue in case others come up. Right now, effective location in circulation is computed at the time you request the data. In reporting (and maybe elsewhere), we need effective location as it was at the time of checkout, so it needs to be stored in the loan transaction. So this is basically a request for a new attribute. This is also related to documentation - documentation doesn't say when it is calculated, had to talk to people. This is a relatively small dependency, but without it we would have to duplicate the computation of the effective location, and that means there is the potential for the computation to become unsynchronized with FOLIO if it changes in the future. Would be nicer to have the FOLIO API store that. Unless there is a compelling reason to do it separately, it's creating a potential breakage or technical debt in the future.
  • "Enable incremental requests": When we extract data from FOLIO to bring into the LDP right now, we can't specify that we only want recent changes. Right now, the LDP load process asks FOLIO for data ("extraction") and merges it into the LDP database. FOLIO doesn't allow us to ask for incremental updates, like changes since yesterday. We have to ask for all data, every day. A large library with a lot of loan transactions, that's a lot of data to merge. Since it's not just a simple load (because LDP will have historical data), it isn't an easy process, will be wasteful and put a load on FOLIO. If we could get incremental data, this process could happen many times a day or continuously, would be much more lightweight for both FOLIO and LDP. Would like to be able to ask for all records since last request, or since a particular timestamp. Okay to have a bit of overlap of time periods, that would be fine as long as it's not everything. Issue doesn't have much info, but Nassib needs to talk with a technical manager and come up together with a model that might work.
  • Is it possible for us, next summer implementing the LDP at our institutions, to transfer all the data every night and use that until this is developed? Or does this have to be in place before an institution can go-live with the LDP?
    • It depends how much data you have; it's conceivable that a large library could take more than 24 hours to reprocess the data. Have been reworking the LDP data loading process. Don't have good estimates right now, but should be able to answer in the near future, in the next few weeks. There are limited tests but we can extrapolate a bit. Some libraries might be fine, but others might have trouble.
    • People who find it too slow are going to say the software is too slow. We'd like them to have a good experience when they use the software, and it's not really designed to reprocess the data every night. It's able to do that because of this dependency, but it's not designed for that. Wouldn't really like to have a real release while it's still like this, maybe just an Alpha release.
  • These four issues are probably radically different sizes in terms of amount of work required. Adding effective location is small, documentation is quite large (in progress, but don't know how it's going). Incremental requests could be not too much work, but it has to be rolled out across all modules, so it's not trivial. Implementation itself is pretty simple, I think. Having this feature will allow us to do incremental or even continuous updates without the need for message-queue-based streaming (though that's back on the table and we could implement that later).
  • We will have to work with the rest of the project to get proper resources to have these worked on and incorporate these updates into the LDP software
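
As a rough illustration of the "Enable incremental requests" blocker (FOLIO-1839) discussed above, the sketch below asks for records changed since the last request, with a small overlap, and merges them so that re-delivered records do no harm. Everything in it is an assumption: FOLIO does not offer such a request today, `fetch_since`, `incremental_load`, and the `id`/`updated` fields are hypothetical, and the merge simply overwrites records instead of keeping history as the LDP would.

```python
from datetime import datetime, timedelta

# Stand-in for one FOLIO module's records; "id" and "updated" are assumed field names.
SOURCE = [
    {"id": "a", "updated": datetime(2019, 7, 27, 9, 0), "status": "open"},
    {"id": "b", "updated": datetime(2019, 7, 28, 15, 0), "status": "closed"},
    {"id": "c", "updated": datetime(2019, 7, 29, 8, 0), "status": "open"},
]


def fetch_since(source, since):
    """Hypothetical incremental request: records changed at or after `since`."""
    return [r for r in source if r["updated"] >= since]


def incremental_load(source, warehouse, last_request, overlap=timedelta(minutes=5)):
    """Merge changed records into `warehouse` (a dict keyed by id).

    Returns the number of records fetched.
    """
    changed = fetch_since(source, last_request - overlap)
    for record in changed:
        current = warehouse.get(record["id"])
        # The overlap may re-deliver records, so only newer versions are applied.
        if current is None or record["updated"] > current["updated"]:
            warehouse[record["id"]] = record
    return len(changed)


# State after an earlier full load that contained only records "a" and "b".
warehouse = {r["id"]: r for r in SOURCE[:2]}
fetched = incremental_load(SOURCE, warehouse, last_request=datetime(2019, 7, 28, 15, 2))
print(fetched, sorted(warehouse))
```

Here only two of the three records are transferred: "b" falls inside the overlap window and is re-delivered without effect, and "c" is new.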


Reporting Features (Who: All)

This week, we will continue our review of the JIRA tickets for Reporting Features, gather more details and use cases, and work on prioritization. (Reporting features are different from reports in that they involve reporting functionality, not an actual report.) Reporting features now carry the ldp-platform label in JIRA.


LDP Open Feature Requests


First Pass START----------------------

UXPROD-1868: Allow HTTP request for results of report/query

UXPROD-1869: Ability to connect data warehouse to workflow engine for triggering reports based on logical conditions

UXPROD-1867: Build reports based on custom lists

UXPROD-1875: Build a FOLIO Data Dictionary for reporting

UXPROD-1864: Store Custom Fields from apps in LDP

UXPROD-1872: Exclude records that are suppressed in the catalog from reports

UXPROD-1863: Ability to report on data from MARC fields

UXPROD-1874: NCIP data format standards for reports

UXPROD-1870: SIP data format standards for reports

UXPROD-1865: LTI data format standards for course management systems for reports

UXPROD-1866: Create an LDP reporting dashboard with canned reports

UXPROD-1862: Use results of an LDP query to select items for batch edit in the FOLIO modules

UXPROD-1861: Document requirements and instructions for reporting applications to connect to LDP

First Pass END ----------------------


UXPROD-1871: Ability to create local tables that translate values of a particular field, usually into a smaller number of categories (e.g., all of these 10 holdings locations translate to this particular library).

UXPROD-1873: Support for writing ad hoc reports


- Where are the issues related to the eUsage prototype?

-one more....


Notes:

  • Custom tables in LDP? There is a schema in LDP called "local", and the LDP user has full permission to create tables there. This relates to another issue - http://folio-org.atlassian.net/browse/UXPROD-1884. So that's the general issue, and the specific need to translate a location to an aggregate might be a perfect use case for that.
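
The location-to-library use case above might look something like this sketch. It uses an in-memory SQLite database as a stand-in for the LDP database, and every table and column name is hypothetical; in the LDP the translation table would be one a library creates itself in the "local" schema.

```python
import sqlite3

# SQLite stand-in for the LDP database; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE holdings (holdings_id TEXT, location TEXT);
    CREATE TABLE local_location_groups (location TEXT PRIMARY KEY, library TEXT);

    INSERT INTO holdings VALUES
        ('h1', 'Main Stacks'), ('h2', 'Main Reference'), ('h3', 'Science Stacks');

    -- Locally maintained translation table: many locations map to one library.
    INSERT INTO local_location_groups VALUES
        ('Main Stacks', 'Main Library'),
        ('Main Reference', 'Main Library'),
        ('Science Stacks', 'Science Library');
""")

# Report holdings per library instead of per individual location.
rows = conn.execute("""
    SELECT g.library, COUNT(*)
    FROM holdings AS h
    JOIN local_location_groups AS g ON g.location = h.location
    GROUP BY g.library
    ORDER BY g.library
""").fetchall()
print(rows)
```

Keeping the mapping in a table lets a library change its groupings without rewriting the reports that join against it.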

Topics for Future Meetings (Who: All)

Review and update Topics for Future Reporting SIG Meetings 


Action items

  • Reporting Data Privacy Group to flag reports that may have user data
  • Reporting Data Privacy to follow up with Jesse, Tod, and Cate, and Chair-Elect to determine approach to audit trails in LDP
  • Reporting Data Privacy to email the Reporting SIG to ask for feedback on which reports contain personal info (Joyce Chapman)
  • Sharon Beltaine will talk to lm15@cornell.edu to get details about some of the local table examples that Cornell would want in their LDP
  • Sharon Beltaine will review Data Privacy red-flag reports, figure out how best to address in full Reporting SIG meeting