Overview
Data Import (DI) consumes a lot of resources when it runs, especially in the Data Import modules mod-source-record-manager (mod-srm) and mod-source-record-storage (mod-srs). mod-inventory and mod-inventory-storage also carry out Data Import's work of creating or updating inventory records, so when mod-srs and mod-srm work hard, these inventory modules work hard as well. As a result, other FOLIO workflows, such as circulation check-in/check-out and requests, suffer from resource starvation while Data Import runs in the background, because they also rely on the inventory modules to carry out their tasks. The solution outlined below is to create a second set of mod-inventory and mod-inventory-storage dedicated to Data Import work, leaving the original set free to handle all non-Data Import tasks.
Deployment Schemes
Current Deployment
All modules communicate with each other via Okapi, denoted by solid arrows. mod-inventory, mod-srs, and mod-srm also subscribe and publish messages to Kafka, denoted by dashed arrows.
Traffic Diversion
In this traffic diversion scheme, a new set of mod-inventory and mod-inventory-storage services is created. In the diagram below, they are depicted by the boxes labeled "mod-inventory-storage-x" and "mod-inventory-x" in blue font.
- These additional services do not register with Okapi, so no other modules in the ecosystem are aware of their existence and no messages are sent to them.
- These "x-services" publish and subscribe to the Kafka DI topics as usual, in place of the original mod-inventory and mod-inventory-storage services, which stop publishing and subscribing to Kafka so that they exclusively take requests from other modules via Okapi.
- mod-inventory-x now communicates with mod-inventory-storage-x directly to ask it to create or update records.
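As a rough illustration of the second point, the comment thread on this page names three mod-inventory settings that were set to 0 so that the original mod-inventory no longer starts its Data Import consumers. The Python sketch below only assembles those name/value pairs. The helper names (`di_consumer_overrides`, `as_env_lines`) are hypothetical, and how the values are actually injected into the module (container environment, JVM options, or otherwise) is an assumption that depends on the deployment.

```python
from typing import Dict, List

# Setting names as listed in the comment thread on this page.
DI_CONSUMER_SETTINGS = (
    "inventory.kafka.DataImportConsumerVerticle.instancesNumber",
    "inventory.kafka.MarcBibInstanceHridSetConsumerVerticle.instancesNumber",
    "inventory.kafka.QuickMarcConsumerVerticle.instancesNumber",
)


def di_consumer_overrides(instances: int = 0) -> Dict[str, str]:
    """Map each DI consumer setting to the requested instance count (hypothetical helper)."""
    if instances < 0:
        raise ValueError("instances must be non-negative")
    return {name: str(instances) for name in DI_CONSUMER_SETTINGS}


def as_env_lines(settings: Dict[str, str]) -> List[str]:
    """Render settings as NAME=VALUE lines; the injection mechanism is an assumption."""
    return [f"{name}={value}" for name, value in settings.items()]


if __name__ == "__main__":
    # Overrides for the original mod-inventory: zero DI consumer instances.
    for line in as_env_lines(di_consumer_overrides(0)):
        print(line)
```

Presumably mod-inventory-x keeps non-zero values for these settings so that it continues to consume DI messages; the page does not state the values used there.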
With the traffic diversion in place, check-in/check-out response times improve by about 60% in Lotus and nearly 100% in Kiwi. mod-inventory-x and mod-inventory-storage-x take on the brunt of DI traffic, leaving the resources of mod-inventory and mod-inventory-storage free to carry out check-in/check-out activities.
10 Comments
Ann-Marie Breaux
This is really helpful to know about, Martin Tran. Thank you for creating this page!
Jennifer Eustis
Agreed! This is super helpful in understanding what's happening and how this diversion helps. Martin Tran, is this solution going to be implemented, or are more tests needed?
Martin Tran
Jennifer Eustis This solution is already implemented for FSE customers after lots of testing in FSE environments.
Jennifer Eustis
Awesome. Thanks for the info.
Wayne Schneider
How do you configure mod-inventory-storage and mod-inventory not to communicate with Kafka? How does this work for circumstances in which those modules need to publish to Kafka (e.g. for informing mod-search or mod-remote-storage of data changes)?
Martin Tran
Wayne Schneider Only mod-inventory is configured not to communicate with Kafka. That said, a small amount of DI traffic is still handled by mod-inventory, and we do not know how to eliminate it completely. We set the following Data Import environment variables to 0:
inventory.kafka.DataImportConsumerVerticle.instancesNumber
inventory.kafka.MarcBibInstanceHridSetConsumerVerticle.instancesNumber
inventory.kafka.QuickMarcConsumerVerticle.instancesNumber
Marc Johnson
Martin Tran Ann-Marie Breaux
Does that mean that this configuration is going to be used for official distribution testing in BugFest?
And that this configuration is the officially recommended (by EBSCO and EPAM / Folijet) and supported way for hosting providers to run data import?
Martin Tran
Marc Johnson AFAIK there is no official recommendation on applying it to BugFest. In the future (Morning Glory?) DI will have a throttling mechanism so that it doesn't hog most of the resources, and there will be no need to do this.
Marc Johnson
Martin Tran Thank you for the guidance.
Given that this isn't intended to be official guidance or used in the official environments, and that it wasn't considered as an architectural decision (as the supporting changes were made prior to this documentation), what is the purpose of this documentation?
Martin Tran
Marc Johnson Folijet and Ann-Marie were curious about this implementation so I wrote this page to describe what we did.