Discussion items



20 min: Demo Cornell Reports

Lydia shows Annex Moves, the reports used for moving books into long-term storage (50% of the current reporting activities; 3 million books were moved in the past 15 years, and another 1 million are expected to move in the next 5-8 years).

  • 40% of Cornell's stacks are in foreign languages (250+ different languages), so there is a tremendous need for diacritics support
  • Holdings statements can be quite long (more than one page), which makes for very large data sets

  • Lydia shows databases in Voyager Access.
  • Uses MARC tags 852, 866, 867, 868, 780, 785, ... for holdings data, and tag 880 for the Chinese character set. These tags are concatenated to create holdings.
  • Collection Development: Voyager Fund Selectors, managing about 100 funds. Open orders with commitments. Question: which items have been received and paid for?

    The report calculates amounts and subtotals in each section. Persons need to be linked as the owners of a fund. Each of the selectors has a copy of this database.

  • Circulation Summary

  • Circulation Statistics: 26 different queries are combined into one screen of circulation statistics.
Capture of screen dumps
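
To illustrate the point about concatenating MARC tags into holdings, here is a minimal sketch. It assumes a simplified, hypothetical record shape (a list of tag/text pairs) and invented sample values; it is not the Voyager Access query shown in the demo, and real MARC fields also carry indicators and subfields.

```python
# Hypothetical record shape: a list of (tag, text) pairs in record order.
# Only a subset of the tags named in the notes is used here.
HOLDINGS_TAGS = ("852", "866", "867", "868")


def build_holdings_statement(fields, separator=" ; "):
    """Concatenate the text of the selected MARC fields, keeping record order."""
    parts = [
        text.strip()
        for tag, text in fields
        if tag in HOLDINGS_TAGS and text.strip()
    ]
    return separator.join(parts)


# Invented sample data, for illustration only.
record = [
    ("852", "Main Library"),
    ("866", "v.1-12 (1990-2001)"),
    ("245", "Example title"),  # not a holdings tag, so it is skipped
    ("867", "Suppl. 1-3"),
]

statement = build_holdings_statement(record)
print(statement)
```

With many 866/867/868 fields per record, the resulting string can easily run to more than a page, which matches the remark about long holdings statements.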
10 min: Presentation - Reporting Database

Nassib Nassar (Index Data) presents Index Data's approach to the Reporting Service.

FOLIO will have a reporting service separate from modules that actually provide data. Data for reporting will be accessed by a report generating tool. This tool will have direct access to a database schema.

FOLIO generally reduces the dependency on a database schema, but in reporting there is no way of getting around this.

Two approaches:

1st approach: replicate the operational database; operational database = "aggregates", a collection of databases

2nd approach: make two distinct databases. 1. operational database 2. distinct database for reporting, "data warehouse"

Index Data favors the data warehouse approach; they want to have two different schemas.
For reporting: store data in a relational database in a de-normalized form which does not require views. The reporting database needs to be read-only. It communicates with the modules only via their documented service interfaces.

The 1st approach has points of failure, for example if the operational database schema is changed.

The reporting database is going to look quite different from the operational database.
The reporting database can nevertheless be synchronized in close to real time.
Anything which needs real-time data should use the operational database.
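
The data warehouse approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `fetch_loans` is a hypothetical stand-in for a call to a module's documented service interface, the record layout and the table name `loans_report` are invented, and SQLite stands in for whatever relational database would actually be used.

```python
import sqlite3


def fetch_loans():
    """Hypothetical stand-in for a module's documented service interface.

    The nested record layout is invented for illustration.
    """
    return [
        {"id": "loan-1", "item": {"barcode": "B001", "title": "Example A"}, "loanDate": "2016-03-01"},
        {"id": "loan-2", "item": {"barcode": "B002", "title": "Example B"}, "loanDate": "2017-01-15"},
    ]


def load_reporting_db(conn, loans):
    """Flatten nested records into one de-normalized table that needs no views."""
    conn.execute("DROP TABLE IF EXISTS loans_report")
    conn.execute(
        "CREATE TABLE loans_report ("
        "loan_id TEXT PRIMARY KEY, item_barcode TEXT, item_title TEXT, loan_date TEXT)"
    )
    conn.executemany(
        "INSERT INTO loans_report VALUES (?, ?, ?, ?)",
        [
            (loan["id"], loan["item"]["barcode"], loan["item"]["title"], loan["loanDate"])
            for loan in loans
        ],
    )
    conn.commit()


# The loader is the only writer; report tools would only read from this database.
conn = sqlite3.connect(":memory:")
load_reporting_db(conn, fetch_loans())
rows = conn.execute(
    "SELECT item_title, loan_date FROM loans_report ORDER BY loan_date"
).fetchall()
print(rows)
```

Because the loader only depends on the service interface, a change to the operational schema would not break the reporting tables in this sketch, which is the failure point noted for the 1st approach.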

20 min: Discussion re: Reporting DB

Ann has two remarks:
"1." A read-only database will not be convenient and will not fit our needs. We will want to build apps on top of the reporting database and store data in it.
"2." She does not see a reason why not all of the data should be in the reporting database.

Nassib replies:

A solution for "1." would be to replicate the reporting database.

"2.": the operational database also stores data which is not of interest for reporting (e.g. administrative metadata about the data sets).

Lydia: it is not unusual, in annex reports, to select bibs and items. One of the specifications might be: has this item circulated within the last 7 years? That is, it is not unusual to cross modules, e.g. cataloging data and circulation data. It is not clear to Lydia what "aggregating" the data means (in Nassib's talk).

Nassib: Aggregation = pulling the data from the different modules into a single (relational) database. But the tables in a relational database are virtually independent. Nassib will not use the term "aggregation" for this anymore.

Lydia: All the holdings should be pulled over to the reporting database.
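
Lydia's annex example (has this item circulated within the last 7 years?) shows why cataloging and circulation data need to sit in the same reporting database. A minimal sketch, assuming two hypothetical tables `items` and `loans` with invented columns and sample rows, and using SQLite purely for illustration:

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (item_id TEXT PRIMARY KEY, title TEXT);   -- from cataloging
CREATE TABLE loans (item_id TEXT, loan_date TEXT);           -- from circulation
INSERT INTO items VALUES ('i1', 'Example A'), ('i2', 'Example B'), ('i3', 'Example C');
INSERT INTO loans VALUES ('i1', '2005-06-01'), ('i2', '2016-02-10'), ('i2', '2004-01-01');
""")


def annex_candidates(conn, today, years=7):
    """Return items with no loan on or after the cutoff date."""
    # ISO date strings compare correctly as text, so the cutoff is built as a
    # string; this also avoids leap-day errors from date.replace().
    cutoff = f"{today.year - years:04d}-{today.month:02d}-{today.day:02d}"
    return conn.execute(
        """
        SELECT i.item_id, i.title
        FROM items AS i
        WHERE NOT EXISTS (
            SELECT 1 FROM loans AS l
            WHERE l.item_id = i.item_id AND l.loan_date >= ?
        )
        ORDER BY i.item_id
        """,
        (cutoff,),
    ).fetchall()


candidates = annex_candidates(conn, date(2017, 3, 1))
print(candidates)
```

Items that have never circulated (here `i3`) are included as candidates, which is a design choice of this sketch rather than something stated in the meeting.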

Nassib: The issue "diacritics" falls under "internationalization".
Charlotte: if all bibliographical data is in UTF-8 in the Voyager database, it needs to be in UTF-8 in FOLIO as well.
Michael: we want to go beyond that, e.g. support right-to-left written languages.
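
As a small illustration of why diacritics need attention even when everything is UTF-8, the sketch below uses Python's standard `unicodedata` module. The sample strings and the choice of NFC as the normalization form are assumptions made for this example; the meeting did not decide on a normalization form.

```python
import unicodedata


def normalize_title(text):
    """Normalize to NFC so composed and decomposed diacritics compare equal."""
    return unicodedata.normalize("NFC", text)


# The same name written two ways: combining marks vs. precomposed characters.
decomposed = "Dvor\u030ca\u0301k"  # r + combining caron, a + combining acute
composed = "Dvo\u0159\u00e1k"      # precomposed r-caron and a-acute

print(decomposed == composed)                                    # False
print(normalize_title(decomposed) == normalize_title(composed))  # True

# A UTF-8 round trip preserves the text exactly.
round_trip = composed.encode("utf-8").decode("utf-8")

# The bidirectional class identifies right-to-left characters such as Hebrew alef.
alef_class = unicodedata.bidirectional("\u05d0")
print(alef_class)  # R
```

Without a normalization step, a report that groups or searches by title could treat the two spellings above as different values.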

5 min: Planning for next session

20 minutes Reporting Demo by Ann Highsmith.

Lydia retires within the next few weeks; we are looking for a convener until Cornell hires someone to replace Lydia (also as a convener).


Action items