2018-04-30 Reporting SIG notes

Date

Attendees

Present? | Name | Organization
X | Sharon Beltaine | Cornell University
  | Peter Murray | Index Data
  | Elizabeth Berney | Duke University
  | Erin Nettifee | Duke University
  | Joyce Chapman | Duke University
X | Karen Newbery | Duke University
  | Elizabeth Edwards | University of Chicago
  | Tod Olson | University of Chicago
X | Claudius Herkt-Januschek | SUB Hamburg
X | Scott Perry | University of Chicago
X | Doreen Herold | Lehigh University
  | Robert Sass | Qulto
X | Anne L. Highsmith | Texas A&M
X | Simona Tabacaru | Texas A&M
  | Vince Bareau | EBSCO
  | Mark Veksler | EBSCO
  | Harry Kaplanian | EBSCO
X | Kevin Walker | The University of Alabama
  | Ingolf Kuss | hbz
X | Charlotte Whitt | Index Data
  | Lina Lakhia | SOAS
  | Michael Winkler | OLE
X | Joanne Leary | Cornell University
  | Christine Wise | SOAS
X | Michael Patrick | The University of Alabama
  | Holly Mistlebauer | Cornell University
  | Emma Boettcher | University of Chicago
  | Andrea Loigman | Duke
  | Uschi Klute | GBV
X | Lynn Whittenberger |

Additional guests: Christie Thomas (UChicago, MM SIG), Charlotte Whitt (Index Data, Inventory App Dev)


Discussion items

Item | Who | Notes
Assign Notetaker, Take Attendance, Review agenda

Sharon Beltaine

Previous Notetaker: Joanne Leary

Today's Notetaker: Claudius Herkt

Thanks to Tod Olson for convening the April 23 Reporting SIG Meeting!

WolfCon Update: At this point, the Monday 5/7/18 9am Reporting SIG meeting is CANCELED.

Reporting SIG members should be able to attend the WolfCon conference remotely via Zoom links for sessions. Stay tuned for more information on this soon.

Inventory and MARC Data for Reporting

Christie Thomas, Charlotte Whitt

Christie Thomas and Charlotte Whitt from Metadata Management will join us for a discussion of Inventory and MARC data required for reporting. Here are some topics/questions to consider:

  • (from Sharon) What is the current plan for MARC data in Folio? Will we be able to report on all local and standard MARC data?

(from Christie) It is important that everyone understand the difference between the Folio instance record and the source MARC record. The Folio instance record may contain data that is native to the Folio data model or data derived from a source record. The difference matters here because data that has historically been sourced from MARC may be better drawn from the Inventory record, especially once Folio supports source bibliographic metadata formats other than MARC. (Data pulled from the Inventory will already be somewhat normalized, so reports will not need to be created separately for individual formats such as MARC and BIBFRAME.) Note: If you are attending WolfCon, see the "How MARC cataloging app and inventory fit together (MM)" session on Tuesday at 2pm.
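
To illustrate the point about normalization, here is a minimal Python sketch. It assumes heavily simplified, hypothetical record shapes (the field names and nesting are illustrative only and are not taken from the Folio Inventory schema or any real MARC/BIBFRAME library). It contrasts a report that reads source records, which needs one branch per format, with a report that reads normalized instance records.

```python
# Hypothetical, simplified source records. Real MARC and BIBFRAME data
# are far richer; these shapes exist only for illustration.
source_records = [
    {"format": "MARC", "fields": {"245": {"a": "Moby Dick"}}},
    {"format": "BIBFRAME", "title": {"mainTitle": "Walden"}},
]

# Hypothetical instance records: the title is already normalized,
# whatever the source format was.
instance_records = [
    {"id": "i1", "title": "Moby Dick", "source": "MARC"},
    {"id": "i2", "title": "Walden", "source": "BIBFRAME"},
]


def title_from_source(record):
    """Extract a title from a source record; needs one branch per format."""
    if record["format"] == "MARC":
        return record["fields"]["245"]["a"]  # MARC 245 $a: title proper
    if record["format"] == "BIBFRAME":
        return record["title"]["mainTitle"]
    raise ValueError("unsupported format: " + record["format"])


def title_from_instance(record):
    """Extract a title from an instance record; one code path for all."""
    return record["title"]


source_titles = [title_from_source(r) for r in source_records]
instance_titles = [title_from_instance(r) for r in instance_records]
```

Both lists hold the same titles, but only the source-record version has to change when a new bibliographic format is added.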

  • (from Christie) How will we include data elements for holdings and item data?

MM SIG has developed a spreadsheet that contains the instance, holdings and items metadata for the inventory, listing the metadata already completed in alpha version and the missing elements (see MM Link below).

  • (Christie) Discussion about UChicago's local audit of their reports that rely on bibliographic / MARC, holdings, and item metadata (see link from Christie below)
  • (from Sharon) What is the most efficient and effective process for the Reporting SIG to use to verify the inclusion of data elements needed for reporting in Folio?

A list of the data elements used for certain reports, similar to UChicago's (see Christie's link below), would be helpful. It should be presented to all the SIGs.

  • (from Sharon) Have the MM SIG and/or Inventory developers discussed ways we can ensure referential integrity across Folio apps to support reporting functionality?

No. The Reporting SIG should write this down as a main requirement and communicate it to the developers. This should be done after the WolfCon session on this issue next week.

  • (from Sharon) What are the most important design considerations from an Inventory and MM perspective for building a data lake/warehouse environment for Folio?
  • Other questions/topics?

Related Links:

(from Christie) At Chicago we have started to select some representative reports and ad hoc queries and systematically identify the data that is required. You can see how we are doing this here: https://docs.google.com/spreadsheets/d/1dVAejtiqzc1E5nFLW85XZvHHrUER58EHrlwCK2ZrhTI/edit?usp=sharing

Reporting SIG Master Spreadsheet

https://docs.google.com/spreadsheets/d/1svUM74Dkg4KvTXLzKZK_2k_SxeukX-87NnYf8CaTrYQ/edit?usp=sharing

Data Migration Subgroup Google Drive folder

https://docs.google.com/spreadsheets/d/15khBvG7hsUbIIofh6ZyrH_-sl2kngXdcQ24sBswAe_k/edit?usp=sharing

MM SIG Working Groups - Inventory Metadata Elements (please do not edit!)

https://docs.google.com/spreadsheets/d/1kdYx63J0KoqR3-LUHuPAzERgj8WE0OQ08rzuCaJaHWs/edit#gid=952741439


WolfCon Reporting Topics
All

Review current plans for WolfCon and brainstorm ways Reporting SIG can make best use of the conference time and resources.

Important WolfCon Scheduling question: Reporting session on Data Confidence currently scheduled for Thursday morning, when some attendees may be catching flights back home. Shall we make a request to schedule this session for another time? If so, when is the best time? Wednesday morning?

Sharon will make a request for Wednesday morning or Tuesday afternoon.

KN: Open in Google Sheets to view the entire schedule.

WolfCon Topic Proposals: https://docs.google.com/spreadsheets/d/17JmVl-XUaALDYtyqEzPGGYyHAbH8JQM63XPdl9ZxKFU/edit?usp=sharing
Current WolfCon Schedule (View Only): https://docs.google.com/spreadsheets/d/1jmjRAKBXJN2i1qPLOjdbeqi5xndTPyKzrDKuB8MeeBQ/edit?usp=sharing

WolfCon organizers are working to make the May 7-10 conference sessions available via Zoom so all project participants may join WolfCon sessions remotely. More details on this are coming soon.

German Folio Days
Ingolf Kuss

April 25-26 in Göttingen. Ingolf Kuss gave a presentation about the FOLIO Reporting plans. Ingolf shares this slide from the presentation (translated into English): 2018-04-26-kuss-FOLIO-Reporting_Data_Lake.pptx

Here are all the other slides from the talks held at the conference (partially in German): https://www.folio-bib.org/?page_id=63

German FOLIO Day on Twitter: https://twitter.com/hashtag/FOLIODay?src=hash

There was strong interest in the Data Lake concept. The following points were raised:

  • the data in the Lake "gets old". The concern is that data in the Lake that are several years old can no longer be interpreted.
  • the data in the Lake lack metadata
  • suggestion: the Lake will be used for short-term and mid-term reports
  • the Lake is needed to get FOLIO going. Once FOLIO is running, it is wiser to import the data into a (more formal, more structured) data warehouse.
  • another concern raised about the Lake was referential integrity
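
The referential integrity concern can be made concrete with a minimal Python sketch. It assumes two hypothetical extracts (item records and holdings records) with illustrative field names that are not taken from any Folio schema; the check simply lists records whose reference points at nothing.

```python
# Hypothetical extracts from two apps; ids and field names are
# illustrative only.
items = [
    {"id": "item-1", "holdings_id": "hold-1"},
    {"id": "item-2", "holdings_id": "hold-9"},  # no such holdings record
]
holdings = [{"id": "hold-1"}, {"id": "hold-2"}]


def find_orphans(children, parents, foreign_key):
    """Return child records whose foreign key matches no parent id."""
    parent_ids = {parent["id"] for parent in parents}
    return [child for child in children if child[foreign_key] not in parent_ids]


orphans = find_orphans(items, holdings, "holdings_id")
orphan_ids = [record["id"] for record in orphans]
```

A report that joins items to holdings would silently drop or mislabel the orphaned record, which is why this kind of check matters for data collected from several apps.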

Questions to discuss in the SIG:

  • Will there be an additional data warehouse environment next to the FOLIO data lake?

Probably, every institution using FOLIO will be able to set up one or more additional data warehouses if needed.

  • Will the data lake infrastructure be part of the FOLIO core? Or will this only be true for the functionality that gets the data out of the system ("the river") and not the storage ("the lake")?

This is still an open question, but it is very important for the users (libraries), as most of them will not be familiar with setting up a data lake on their own. This should be discussed at the WolfCon Data Lake session.

Reporting Deliverables
All

***Saved for future meetings

(Kept in case it needs to be revisited or for reference.)

We will discuss strategies for translating reporting requirements into reporting deliverables

KN: See Reporting SIG master spreadsheet: https://docs.google.com/spreadsheets/d/1svUM74Dkg4KvTXLzKZK_2k_SxeukX-87NnYf8CaTrYQ/edit#gid=312878932

-thank you for adding names to each report (just a few missing at this point)

-review updated Column E ("Legacy System Data Element Sources") and new Column F ("Folio Data Element Sources Notes")

-Reporting SIG members and new Reporting PO to work with other POs and SIGs to identify data elements needed for each report

-Reporting SIG members and new Reporting PO to work with other POs and SIGs to identify candidates for "in app" reports

-please continue to add your Report Requirements Contacts in Reporting SIG Master Spreadsheet

-report contacts need to be reachable; if a contact is not part of the Folio project, please add the person's email address

Topics for Future Meetings
All

***Saved for future meetings

Review and update Topics for Future Reporting SIG Meetings

Action items

  •