
Problem definition | Business impact | Steps (Proposed Solution) | Priority | Complexity | Existing Jira issues
1

DI relies on internal identifiers for SRS records
  • DI does not support differentiation of records based on external identifiers (ISBN or barcode numbers).
  • The criterion we use to determine whether a MARC record already exists in SRS is the UUID stored in the 999 ff field. If an incoming record has no 999 ff field, we treat it as new, save it, and assign a new UUID. If the 999 ff field is present, we increment the generation and save the record as the new, current version. The problem is that incoming records sometimes lack the 999 ff field even though they already exist in SRS and have corresponding inventory instances linked.
  • Gather requirements: which fields of the incoming MARC Bib contain external identifiers, and whether MARC Holdings and MARC Authority need the same changes
  • Design a new mechanism for versioning of records, based on external identifiers
  • Consider performance implications
  • Decide what to do with duplicates that already exist in SRS
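
The matching rule described above, and one possible extension to external identifiers, can be sketched as follows. This is a minimal illustration over an in-memory dictionary, not the SRS implementation: the `Record` shape, the `external_ids` field, the `use_external_ids` switch, and the function name `resolve_incoming` are all hypothetical. The handling of a 999 ff UUID that is present but unknown to the store is also an assumption.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Record:
    # Hypothetical, simplified stand-in for an SRS MARC record.
    match_id: str                                         # UUID carried in 999 ff
    generation: int = 0
    external_ids: Set[str] = field(default_factory=set)   # e.g. ISBNs, barcodes


def resolve_incoming(store: Dict[str, Record],
                     uuid_999ff: Optional[str],
                     external_ids: Set[str],
                     use_external_ids: bool = False) -> Record:
    """Return the stored record that the incoming record creates or updates."""
    existing = store.get(uuid_999ff) if uuid_999ff else None

    # Possible extension: fall back to external identifiers when 999 ff
    # does not identify an existing record.
    if existing is None and use_external_ids:
        matches = [r for r in store.values() if r.external_ids & external_ids]
        if len(matches) == 1:
            existing = matches[0]
        elif len(matches) > 1:
            raise ValueError("ambiguous match on external identifiers")

    if existing is None:
        # No match: treat the record as new. A new UUID is assigned when
        # 999 ff is absent (reusing an unknown 999 ff UUID is an assumption).
        record = Record(match_id=uuid_999ff or str(uuid.uuid4()),
                        external_ids=set(external_ids))
        store[record.match_id] = record
        return record

    # Match: increment the generation and keep this as the current version.
    existing.generation += 1
    existing.external_ids |= external_ids
    return existing
```

With `use_external_ids=False`, importing the same record twice without 999 ff yields two stored records, which is the duplicate problem described above. With the flag enabled, the second import resolves to the first record and only bumps its generation. The linear scan over the store also hints at the performance question: a real design would likely need an index on the chosen identifiers.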


Jira (FOLIO Issue Tracker): MODSOURMAN-848, MODSOURCE-530, MODSOURMAN-898

2

DI profile actions can sometimes lead to other actions that are implicit / disposable incoming records

  • There is no explicit action to save the SRS MARC record. The save is implicit and happens for almost every incoming file (a couple of exceptional cases were added later as “bug fixes”). The original design treated an incoming file as a new and valid record that should be saved before any other action and serve as the single source of truth. In practice, some incoming records should indeed be saved in SRS and referenced by the entities derived from them. However, there are also multiple use cases (usually updates or creates on Holdings and/or Item) where the incoming MARC record is disposable and might contain only partial data. Saving such a record either loses data (when the original record is overwritten) or breaks the links to the corresponding inventory entities (when the record is saved as a new one).
  • Consider making Create/Update SRS MARC Bib explicit: a separate step in the profile
  • Alternatively, add a check box when the profile is constructed that specifies whether the MARC Bib should be saved
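
The check box alternative can be sketched as a flag on the profile. The sketch below is a hedged illustration only: `JobProfile`, `save_srs_marc_bib`, `plan_steps`, and the step names are hypothetical and do not correspond to existing DI profile fields or action names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class JobProfile:
    # Hypothetical profile shape; `save_srs_marc_bib` models the proposed check box.
    actions: List[str]
    save_srs_marc_bib: bool = True


def plan_steps(profile: JobProfile) -> List[str]:
    """List the steps a job would run, with the SRS save made explicit."""
    steps = []
    if profile.save_srs_marc_bib:
        # The save becomes a visible step instead of an implicit side effect.
        steps.append("SAVE_SRS_MARC_BIB")
    steps.extend(profile.actions)
    return steps
```

A Holdings or Item update that carries a disposable, partial MARC record would set the flag to false, so the stored record is neither overwritten nor duplicated.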


Jira (FOLIO Issue Tracker): MODSOURMAN-891, MODSOURMAN-907, MODSOURMAN-819

3

Performance results in production environments differ from those on PTF

  • Slower imports for big files, especially for Updates
  • Gather information on configs and amount of allocated resources for DI modules, background activity, other factors
  • The difference might be caused by profile complexity or complex matching conditions. Get example profiles, and possibly files, to run on the PTF environment and compare with the results measured with our base cases



4

UI regression bugs

  • Incorrect names in Edit mode for mapping profiles
  • Missing associated profiles on editing screen
  • Issues with shortcuts
  • Include test cases for editing profiles in the critical path


Jira (FOLIO Issue Tracker): UIDATIMP-1302, UIDATIMP-1296, UIDATIMP-1233, UIDATIMP-1300, FAT-3438, FAT-3437

5

Intermittent failures of Karate tests

  • Some issues could be resolved earlier in the dev cycle
  • Kafka config adjusted
  • Added number of retries before test completion
  • Resolved issues related to reference data
  • Continue investigation for tests failing with incorrect job status


Jira (FOLIO Issue Tracker): FAT-3397, FAT-2302


Source code is missing when debugging data import modules.
  • Decreased maintainability
  • Add sources to Data Import packages

S

Reduce/remove the need for post processing in data import flows.
  • Increases the duration of data import jobs, i.e. decreases performance.
  • Makes data import flows more complex and prone to errors.





Remove the incomplete data import job monitoring process from mod-source-record-manager, or implement working monitoring if there is a business need. We are currently incurring the cost without the benefits.
  • Increases communication to the database for every Kafka event, which has downstream effects on data import performance, database utilization, and database storage.





Update the job_execution_progress table in mod-source-record-manager without doing a SELECT FOR UPDATE followed by an update. The row stays locked after the SELECT FOR UPDATE, which causes contention between multiple SRM instances.

  • Decreased data import job performance.
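
One way to avoid the read-then-write pattern is to express the change as a single UPDATE that increments the counters in place. The sketch below is an assumption-laden illustration: the column names (`job_execution_id`, `succeeded`, `failed`) and the `add_progress` helper are hypothetical, and SQLite is used only so the example runs on its own. The same statement shape applies in PostgreSQL, where the row lock is then held only for the single statement's transaction rather than across a separate SELECT and UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job_execution_progress ("
    " job_execution_id TEXT PRIMARY KEY,"
    " succeeded INTEGER NOT NULL DEFAULT 0,"
    " failed INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO job_execution_progress (job_execution_id) VALUES ('job-1')")
conn.commit()


def add_progress(conn, job_execution_id, succeeded_delta, failed_delta):
    """Apply counter deltas in one statement, without reading the row first."""
    cur = conn.execute(
        "UPDATE job_execution_progress"
        " SET succeeded = succeeded + ?, failed = failed + ?"
        " WHERE job_execution_id = ?",
        (succeeded_delta, failed_delta, job_execution_id),
    )
    conn.commit()
    # Number of rows changed; 0 means the job execution row does not exist.
    return cur.rowcount
```

Because each caller sends only a delta, two SRM instances updating the same job do not need to see each other's intermediate values. Whether this fits depends on what else the current code derives from the selected row, which the source does not describe.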





Data Import Processing Core needs to be refactored. The refactoring should provide a clear and concise API that lets FOLIO developers in other module areas hook into the data import system cleanly. For example, Inventory mapping should be stored in mod-inventory instead of Data Import Processing Core.






6

mod-data-import can only have one instance in a FOLIO cluster due to its interaction with file storage. As a result, responsibilities it might otherwise have had were moved to mod-source-record-manager.
  • Limited availability for API endpoints serviced by mod-data-import




7

mod-source-record-manager has too many responsibilities.
  • High resource requirement for SRM instances
  • Over 200 threads even in idle state, raising potential for instability.
  • Higher cognitive load since the module is managing more than source records.



Jira (FOLIO Issue Tracker): MODSOURMAN-851


Generic backend error messages are returned to the user upon failures in data import. Data Import should employ error codes and specific error messages for issues that occur frequently.
  • Troubleshooting is harder for data import users as well as developers.
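
A minimal sketch of what error codes with specific messages could look like. Everything here is hypothetical: the `ErrorCode` enum, the code values, the message texts, and the mapping from exception types in `to_user_error` are illustrative assumptions, not existing Data Import behaviour.

```python
from enum import Enum


class ErrorCode(Enum):
    # Hypothetical codes and messages, for illustration only.
    RECORD_PARSE_FAILED = ("DI-001", "The incoming record could not be parsed.")
    MATCH_AMBIGUOUS = ("DI-002", "More than one existing record matched the incoming record.")
    UNKNOWN = ("DI-999", "An unexpected error occurred during data import.")

    @property
    def code(self) -> str:
        return self.value[0]

    @property
    def message(self) -> str:
        return self.value[1]


def to_user_error(exc: Exception) -> dict:
    """Map a backend exception to a stable code and a specific message."""
    if isinstance(exc, ValueError):
        error = ErrorCode.RECORD_PARSE_FAILED
    elif isinstance(exc, LookupError):
        error = ErrorCode.MATCH_AMBIGUOUS
    else:
        # Anything unmapped still gets a code that users can report.
        error = ErrorCode.UNKNOWN
    return {"code": error.code, "message": error.message}
```

A stable code lets users report an issue precisely and lets developers search logs for it, while the generic fallback keeps unmapped failures visible.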