2018-02-19 Reporting SIG Notes

Date

Attendees

Present? | Name                     | Organization              | Present? | Name                  | Organization
         | Vince Bareau             | EBSCO                     |          | Katalin Lovagne Szucs | Qulto
X        | Sharon Beltaine          | Cornell University        |          | John McDonald         | EBSCO
         | Elizabeth Berney         | Duke University           |          | Peter Murray          | Index Data
         | Ginny Boyer              | Duke University           |          | Erin Nettifee         | Duke University
         | Joyce Chapman            | Duke University           | X        | Karen Newbery         | Duke University
         | Elizabeth Edwards        | University of Chicago     | X        | Tod Olson             | University of Chicago
         | Claudius Herkt-Januschek | SUB Hamburg               |          | Scott Perry           | University of Chicago
X        | Doreen Herold            | Lehigh University         |          | Robert Sass           | Qulto
X        | Anne L. Highsmith        | Texas A&M                 | X        | Simona Tabacaru       | Texas A&M
         | Filip Jakobsen           | Index Data                | X        | Mark Veksler          | EBSCO
         | Harry Kaplanian          | EBSCO                     |          | Kevin Walker          | The University of Alabama
X        | Ingolf Kuss              | hbz                       |          | Charlotte Whitt       | Index Data
         | Lina Lakhia              | SOAS                      |          | Michael Winkler       | Cornell University
X        | Joanne Leary             | Cornell University        |          | Christine Wise        | SOAS
         | Michael Patrick          | The University of Alabama |



Goals

  • Discuss Data Lake Proof of Concept Project Design & Goals
  • Review Reporting Tools
  • Resource/Format working group needs rep(s) from Reporting SIG
  • Plan future topics

Discussion items

Item | Who | Notes
Assign Notetaker, Take Attendance, Review agenda | Sharon

Previous Notetaker: Doreen Herold

Today's Notetaker: Simona Tabacaru

POC/Data Lake Project | Sharon, Tod, Anne, Mark, Vince, Joanne, Doreen, Scott, Karen
  • structure of POC/Data Lake Project
  • current status of Tod's data loader Python script
    • Tod Olson has a script for creating loans on GitHub:
      https://github.com/todolson/folio-loan-tool.git

    • It's not complete, but it does the following:

      • 1. authenticates
      • 2. gets a list of users
      • 3. gets a list of items
      • 4. creates a list of user/item pairs to loan
      • 5. POSTs a loan request (not yet working)

    • It's on GitHub, so people can see how it works.
    • Each step is simplistic. A number of things are marked TODO, which suggests further work if we can get some help fleshing it out.
    • schedule another small group meeting?
  • important considerations for a data lake environment

questions?

Notes:

  • Sharon gave a short overview of the POC/Data Lake Project.
  • A small work group met to discuss the Proof of Concept (POC) for a Data Lake on 2/13/2018.
  • Notes and a recording of the meeting are available.
  • The POC/Data Lake is a 3-week project; we are in the 2nd week, and the deadline for completion is March 2nd.
  • The goal is to design a Data Lake environment. The group clarified what kind of data will go into the Data Lake and what type of report should be built. The report should include information from all three areas of FOLIO (patron, circulation and inventory).
  • Tod Olson will write a Python script to load the data.
  • The working group decided to use an open source tool, BIRT, as the reporting tool for this project. Chris Creswell will write the BIRT report.
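As a rough illustration of what a report spanning all three areas involves, the sketch below joins loan records (circulation) to users (patron) and items (inventory) and counts loans per patron group and item title. The record shapes and field names (`userId`, `itemId`, `patronGroup`, `title`) are simplified assumptions made for this example, not the actual Data Lake schema, and the group's report is to be built in BIRT rather than in Python.

```python
from collections import Counter


def loans_by_group_and_title(users, items, loans):
    """Count loans per (patron group, item title).

    All field names are hypothetical stand-ins for whatever the
    Data Lake extract actually provides.
    """
    users_by_id = {u["id"]: u for u in users}
    items_by_id = {i["id"]: i for i in items}
    counts = Counter()
    for loan in loans:
        user = users_by_id.get(loan["userId"])
        item = items_by_id.get(loan["itemId"])
        if user is None or item is None:
            # Skip loans that point at a missing user or item.
            continue
        counts[(user["patronGroup"], item["title"])] += 1
    return counts


# Made-up sample data, for illustration only.
users = [
    {"id": "u1", "patronGroup": "faculty"},
    {"id": "u2", "patronGroup": "undergrad"},
]
items = [
    {"id": "i1", "title": "Book A"},
    {"id": "i2", "title": "Book B"},
]
loans = [
    {"userId": "u1", "itemId": "i1"},
    {"userId": "u2", "itemId": "i1"},
    {"userId": "u2", "itemId": "i2"},
    {"userId": "u9", "itemId": "i2"},  # dangling user reference, skipped
]

report = loans_by_group_and_title(users, items, loans)
for (group, title), n in sorted(report.items()):
    print(f"{group}\t{title}\t{n}")
```

Loans that reference an unknown user or item are skipped here; a real report might instead flag them as data-integrity problems.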

Reporting will be done in 2 steps:

  • EBSCO will give Chris a data extract from the Data Lake
  • Chris will try to connect BIRT to the Data Lake

Tod Olson shared his notes about the Python script. The scope of the script is to create loans automatically. The idea is to:

  • pull users from user storage
  • pull items from item storage
  • make random loans
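The pairing step can be sketched as follows. This is a minimal illustration, not Tod's script: the function name `make_random_loans` is hypothetical, the user and item IDs are assumed to have been pulled already and are represented by plain in-memory lists, and it assumes that each item should be loaned to at most one user.

```python
import random


def make_random_loans(user_ids, item_ids, count, seed=None):
    """Return up to `count` random (user_id, item_id) pairs.

    Each item appears at most once (assumption: an item can only be on
    loan to one user at a time); a user may receive several items.
    """
    user_ids = list(user_ids)
    item_ids = list(item_ids)
    if not user_ids:
        return []
    rng = random.Random(seed)  # a seed makes the test data reproducible
    count = max(0, min(count, len(item_ids)))
    chosen_items = rng.sample(item_ids, count)
    return [(rng.choice(user_ids), item_id) for item_id in chosen_items]


# Made-up IDs standing in for records pulled from user and item storage.
users = ["u1", "u2", "u3"]
items = ["i1", "i2", "i3", "i4"]

for user_id, item_id in make_random_loans(users, items, count=3, seed=42):
    print(f"loan item {item_id} to user {user_id}")
```

Passing a fixed seed yields the same set of loans on every run, which may be useful when comparing report output across loads.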

Tod will connect with Matt Reno to work on this.

Someone from Texas A&M has some Python skills and could take a look at the script.

  • Sharon will help set up some meetings to help Tod with the Python script:
    • One meeting between Tod and the person from Texas A&M
    • One meeting between Tod and Matt Reno
    • One meeting between Matt Reno & Chris Creswell

Update on this project at the next meeting.


Current Reporting Tools | All

Review of Current Reporting Tools used by Reporting SIG participants

  • any additional tools?

Most Reporting SIG participants use SQL against a relational database system to generate reports:

  • Duke University uses a combination of SQL + Perl + IBM Cognos + some Ex Libris canned reports. Data source: Aleph
  • University of Alabama: same
  • University of Chicago: Access + Excel. The assessment librarian uses Tableau

Sharon: We don’t have experience with these open source tools. Do we need to do an analysis of the current reporting tools? We should create a short list of tools that will work well in this environment.

  • Is there someone from FOLIO/EBSCO/Index Data that can help us come up with the short list of tools?
  • We will probably be tasked to provide training on these tools. This will be part of our roles.

Mark: This request should go to the Product Council – they might be able to assign resources to help with the tool analysis request.

  • Our goal is to recommend either commercial reporting tools or open source reporting tools. This would likely be a decision made by each institution participating in FOLIO.
  • We need to determine what skills are needed to use the open source reporting tools.
  • We need to have a list of tools that we are recommending.

Questions to consider:

  • How does BIRT work?
  • Should we test each tool with the current set-up?
  • Do we need a list of current issues (top 3 or top 5 issues) known for each reporting tool? For example, one known problem with Cognos is that data goes through 2 transformations; for invoicing packages it's difficult to get item data.
  • Is it worth spending time compiling a list of issues, given that we are moving to a new environment?

Some issues are documented on the Master Spreadsheet. For example, the Metadata management tab discusses data integrity and consistency checking (a category of reports). We need to be able to produce these kinds of reports in FOLIO. We'll use the Master Spreadsheet as the source for the analysis.


Resource/Format Working Group | Sharon

update from 2/13/18 meeting

  • This group is working on an inventory set-up (how the data will be structured and how the data will look)
  • Sharon is attending as a placeholder, but we need someone from the Reporting SIG to attend their meetings
  • We’ll try to build a report using the structure that they are using
  • We need to look through those data elements

More information at the next meeting.

Wishlist | Sharon | Reporting SIG Master Spreadsheet
  • including separate tab for wishlist functionality (what you'd like but do not actually have yet)

Reporting SIG Master Spreadsheet/ Import-export tab

  • On the Import-Export tab, the column "Link to sample/Wishlist" has been renamed to "Link to sample".
  • The Wishlist has been moved to a separate spreadsheet/tab of the Master Spreadsheet.
Additional Topics? | All | Other topics? None suggested
Future Topics | Sharon

Topics for Future Reporting SIG Meetings

  • Email Sharon or add your topic directly on the wiki page – Topics for Future Reporting SIG Meetings
  • Ask around about good reporting tools that we can use.
  • Sharon will get in touch with Mark to see if someone is documenting the data architecture set-up.

Action items

  •