Folijet - Morning Glory Snapshot Performance testing

The following resources are used:

  • 6 m4.large EC2 spot instances for the Kubernetes cluster;
  • 1 db.r5.xlarge instance for the RDS service (writer);
  • 1 m5.large instance per 2 availability zones for Kafka on MSK.

Previous performance testing results: Lotus Snapshot Performance testing


Modules:

Data Import Module (mod-data-import-2.5.0-SNAPSHOT.231)

Source Record Manager Module (mod-source-record-manager-3.4.0-SNAPSHOT.621)

Source Record Storage Module (mod-source-record-storage-5.4.0-SNAPSHOT.426)

Inventory Module (mod-inventory-18.2.0-SNAPSHOT.537)

Inventory Storage Module (mod-inventory-storage-23.1.0-SNAPSHOT.692)

Data Import Converter Storage (mod-data-import-converter-storage-1.14.0-SNAPSHOT.202)

Invoice business logic module (mod-invoice-5.4.0-SNAPSHOT.306)

Data Export Module (mod-data-export-4.5.0-SNAPSHOT.319)


Performance-optimized configuration:

Folio


MAX_REQUEST_SIZE = 4000000 (for all modules)

Kafka


2 tasks for all DI modules (except mod-data-import)

2 partitions for all DI Kafka topics

Please note: the environment should be configured so that every Kafka topic has as many partitions as there are instances of the module consuming from that topic.

Examples:

Delete the old topic:
./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --delete --topic perf-eks-folijet.Default.fs09000000.DI_ERROR


Recreate the topic with "--partitions 2 --replication-factor 1":
./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --create --topic perf-eks-folijet.Default.fs09000000.DI_ERROR --partitions 2 --replication-factor 1

Due to limitations in metric names, topics with a period ('.') or underscore ('_') could collide. To avoid issues it is best to use either, but not both.
Created topic perf-eks-folijet.Default.fs09000000.DI_ERROR.

Get topic info:
./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --describe --topic perf-eks-folijet.Default.fs09000000.DI_ERROR

Topic: perf-eks-folijet.Default.fs09000000.DI_ERROR PartitionCount: 2   ReplicationFactor: 1    Configs: min.insync.replicas=1,message.format.version=2.6-IV0,unclean.leader.election.enable=true
    Topic: perf-eks-folijet.Default.fs09000000.DI_ERROR Partition: 0    Leader: 1   Replicas: 1 Isr: 1
    Topic: perf-eks-folijet.Default.fs09000000.DI_ERROR Partition: 1   Leader: 2   Replicas: 2 Isr: 2
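The topic names in the commands above follow the pattern {environment}.{namespace}.{tenant}.{eventType}. When many DI topics need the same treatment, the delete/recreate commands can be generated from that pattern. The sketch below is illustrative: the helper names are hypothetical, and the event-type list is an example, not a complete inventory of DI topics.

```python
# Sketch: generate delete/recreate command pairs for DI Kafka topics.
# Topic pattern observed above: {env}.{namespace}.{tenant}.{eventType}.
# Helper names and the event-type list are illustrative assumptions.

def topic_name(env: str, tenant: str, event_type: str, namespace: str = "Default") -> str:
    """Build a DI topic name, e.g. perf-eks-folijet.Default.fs09000000.DI_ERROR."""
    return f"{env}.{namespace}.{tenant}.{event_type}"

def recreate_commands(env: str, tenant: str, event_types,
                      partitions: int = 2, replication: int = 1):
    """Yield (delete_cmd, create_cmd) shell command pairs for each topic."""
    for event_type in event_types:
        topic = topic_name(env, tenant, event_type)
        yield (
            f"./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --delete --topic {topic}",
            f"./kafka-topics.sh --bootstrap-server=<kafka-ip>:9092 --create --topic {topic} "
            f"--partitions {partitions} --replication-factor {replication}",
        )

for delete_cmd, create_cmd in recreate_commands("perf-eks-folijet", "fs09000000", ["DI_ERROR"]):
    print(delete_cmd)
    print(create_cmd)
```

The generated commands are the same delete/recreate pair shown above, so the snippet only saves typing when the partition count changes across many topics.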

JVM


mod-data-import: -XX:MaxRAMPercentage=85.0 -XX:+UseG1GC / cpu: 128m/192m | memory: 1Gi/1Gi

mod-source-record-manager: -XX:MaxRAMPercentage=65 -XX:MetaspaceSize=120M -XX:+UseG1GC / DB_MAXPOOLSIZE = 15 / DB_RECONNECTATTEMPTS = 3 / DB_RECONNECTINTERVAL = 1000 /  cpu: 512m/1024m | memory: 1844Mi / 2Gi

mod-source-record-storage: -XX:MaxRAMPercentage=65 -XX:MetaspaceSize=120M -XX:+UseG1GC / DB_MAXPOOLSIZE = 15 / cpu: 512m/1024m | memory: 1296Mi/1440Mi

mod-inventory: -XX:MaxRAMPercentage=80 -XX:MetaspaceSize=120M -XX:+UseG1GC -Dorg.folio.metadata.inventory.storage.type=okapi  / DB_MAXPOOLSIZE = 15 / cpu: 512m/1024m | memory: 2592Mi/2880Mi

mod-inventory-storage: -XX:MaxRAMPercentage=80 -XX:MetaspaceSize=120M -XX:+UseG1GC / DB_MAXPOOLSIZE = 15 / cpu: 512m/1024m | memory: 1024Mi/1200Mi
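-XX:MaxRAMPercentage ties the maximum heap size to the container memory limit, so the effective heap for each module can be estimated from the limits listed above. A rough sketch (it ignores the small reservations the JVM subtracts before applying the percentage, so real values are slightly lower):

```python
def max_heap_mib(memory_limit_mib: float, max_ram_percentage: float) -> float:
    """Approximate heap ceiling the JVM derives from -XX:MaxRAMPercentage
    and the container memory limit (simplified model)."""
    return memory_limit_mib * max_ram_percentage / 100.0

# mod-source-record-manager: 65% of a 2 Gi (2048 MiB) limit
print(round(max_heap_mib(2048, 65)))  # → 1331
# mod-inventory: 80% of a 2880 MiB limit
print(round(max_heap_mib(2880, 80)))  # → 2304
```

This is useful for checking that the heap ceiling stays comfortably below the container limit, leaving headroom for metaspace and off-heap buffers.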

Tests:

env | profile | records number | time in Morning Glory | time in Lotus | Kafka partition number | module instance number | CPU | description
--- | --- | --- | --- | --- | --- | --- | --- | ---
MG Perf Rancher | PTF Create - 2 | 5,000 | 7 min | 8 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+; 2022-06-10T07:44:27.576+00:00 → 2022-06-10T07:51:11.140+00:00
MG Perf Rancher | PTF Create - 2 | 5,000 | 7 min | 8 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ with -Ddi.flow.control.enable=false; 2022-06-14T10:24:44.093+00:00 → 2022-06-14T10:31:54.725+00:00
MG Perf Rancher | PTF Update - 1 | 5,000 | 11 min | 13 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+; 2022-06-20T06:18:46.748+00:00 → 2022-06-20T06:30:02.991+00:00
MG Perf Rancher | PTF Create - 2 | 10,000 | 16 min | 19 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+; 2022-06-10T07:54:23.720+00:00 → 2022-06-10T08:08:48.484+00:00
MG Perf Rancher | PTF Create - 2 | 10,000 | 16 min | 19 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+ with -Ddi.flow.control.enable=false; 2022-06-14T10:36:41.482+00:00 → 2022-06-14T10:53:03.556+00:00
MG Perf Rancher | PTF Update - 1 | 10,000 | 22 min | 25 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+; 2022-06-20T07:07:00.594+00:00 → 2022-06-20T07:28:54.905+00:00
MG Perf Rancher | PTF Create - 2 | 50,000 | 59 min | 1h 25min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+; 2022-06-10T08:12:29.178+00:00 → 2022-06-10T09:11:34.642+00:00
MG Perf Rancher | PTF Update - 1 | 50,000 | 1h 42min | 2h 17min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+; 2022-06-20T09:11:41.701+00:00 → 2022-06-20T10:54:29.378+00:00
MG Perf Rancher | PTF Create - 2 | 100,000 | 2h 20min | 2h 24min (22 errors) | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+; 2022-06-13T09:30:35.574+00:00 → 2022-06-13T12:26:52.484+00:00
MG Perf Rancher | PTF Update - 1 | 100,000 | 2h 49min | 4h 40min (Lotus test ran with 1 module instance and 1 partition) | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+; 2022-06-21T11:46:43.175+00:00 → 2022-06-21T14:36:05.532+00:00; 57 Inventory/Inventory-storage errors: io.netty.channel.StacklessClosedChannelException; io.vertx.core.impl.NoStackTraceThrowable: Connection is not active now, current status: CLOSED; io.vertx.core.impl.NoStackTraceThrowable: Timeout
MG Perf Rancher | PTF Create - 2 | 500,000 | 14h 46min (60 errors) | 15h 37min (31 errors) | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.659+; 2022-06-13T14:27:40.568+00:00 → 2022-06-14T05:14:27.458+00:00
MG Bugfest | Default Marc Bib Create | 5,000 | 4 min | - | 2 | 2 | 512/1024 | mod-source-record-manager Xmx = 2G; 2022-08-15T14:57:13.753+00:00 → 2022-08-15T15:01:29.365+00:00
MG Bugfest | Default Marc Bib Create | 10,000 | 10 min | - | 2 | 2 | 512/1024 | mod-source-record-manager Xmx = 2G; 2022-08-15T15:03:07.364+00:00 → 2022-08-15T15:13:28.827+00:00
MG Bugfest | Create SRS MARC Authority | 5,000 | 5 min | - | 2 | 2 | 512/1024 | mod-source-record-manager Xmx = 2G; 2022-08-16T00:15:55.396+00:00 → 2022-08-16T00:20:19.240+00:00
MG Bugfest | Create SRS MARC Authority | 10,000 | 8 min | - | 2 | 2 | 512/1024 | mod-source-record-manager Xmx = 2G; 2022-08-16T14:52:53.191+00:00 → 2022-08-16T15:00:12.723+00:00
MG Bugfest | Create SRS MARC Authority | 50,000 | 34 min | - | 2 | 2 | 512/1024 | mod-source-record-manager Xmx = 2G; 2022-08-16T15:01:15.360+00:00 → 2022-08-16T15:35:37.028+00:00

Results before flow control fix: MODSOURMAN-811

env | profile | records number | time | time in Lotus | Kafka partition number | module instance number | CPU | description
--- | --- | --- | --- | --- | --- | --- | --- | ---
MG Perf Rancher | PTF Create - 2 | 5,000 | 7 min | 8 min | 2 | 2 | 512/1024 | mod-source-record-manager-3.4.0-SNAPSHOT.621; 2022-05-27T12:58:30.331+00:00 → 2022-05-27T13:05:08.683+00:00
MG Perf Rancher | PTF Update - 1 | 5,000 | 10 min | 13 min | 2 | 2 | 512/1024 | 2022-05-27T13:22:35.123+00:00 → 2022-05-27T13:32:35.344+00:00
MG Perf Rancher | PTF Create - 2 | 10,000 | 21 min / 27 min | 19 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=false; 2022-05-30T09:51:13.876+00:00 → 2022-05-30T10:12:33.982+00:00 / 2022-05-31T18:13:05.977+00:00 → 2022-05-31T18:40:58.928+00:00
MG Perf Rancher | PTF Update - 1 | 10,000 | 30 min | 25 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=false; 2022-05-31T19:19:46.296+00:00 → 2022-05-31T19:49:59.651+00:00
MG Perf Rancher | PTF Create - 2 | 10,000 | 21 min | 19 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=true; 2022-05-31T20:02:06.368+00:00 → 2022-05-31T20:23:19.490+00:00
MG Perf Rancher | PTF Update - 1 | 10,000 | 31 min | 25 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=true; 2022-06-01T19:08:11.563+00:00 → 2022-06-01T19:39:58.803+00:00
MG Perf Rancher | PTF Create - 2 | 10,000 | 17 min | 19 min | 2 | 2 | 512/1024 | -Ddi.flow.control.enable=true -Ddi.flow.control.max.simultaneous.records=100 -Ddi.flow.control.records.threshold=50; 2022-06-03T09:20:07.654+00:00 → 2022-06-03T09:37:51.631+00:00
MG Perf Rancher | PTF Create - 2 | 30,000 | 1h 6min | 45 min | 2 | 2 | 512/1024 | 2022-05-27T13:37:12.980+00:00 → 2022-05-27T14:31:52.595+00:00
MG Perf Rancher | PTF Update - 1 | 30,000 | 1h 26min | - | 2 | 2 | 512/1024 | 2022-05-27T15:37:33.580+00:00 → 2022-05-27T17:03:15.702+00:00
MG Perf Rancher | PTF Create - 2 | 50,000 | 2h 37min | 1h 25min | 2 | 2 | 512/1024 | 3 errors: io.netty.channel.StacklessClosedChannelException; 2022-06-01T19:48:33.977+00:00 → 2022-06-01T22:25:59.700+00:00


60 errors (500K, PTF Create - 2):

Almost all errors came from mod-inventory-storage and were caused by insufficient memory for its instances (memory: 778Mi/846Mi). The mod-inventory-storage instances were restarted 2 times.


io.vertx.core.impl.NoStackTraceThrowable: {"errors":[{"message":"must not be null","type":"1","code":"javax.validation.constraints.NotNull.message","parameters":[{"key":"contributors[0].name","value":"null"}]}]}
io.vertx.core.impl.NoStackTraceThrowable: {"errors":[{"message":"must not be null","type":"1","code":"javax.validation.constraints.NotNull.message","parameters":[{"key":"contributors[2].name","value":"null"}]}]}

io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: Connection was closed: POST /holdings-storage/holdings
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: Connection was closed: POST /instance-storage/instances
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: finishConnect(..) failed: Connection refused: mod-inventory-storage.folijet.svc.cluster.local/172.20.250.48:80: POST /holdings-storage/holdings
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: finishConnect(..) failed: Connection refused: mod-inventory-storage.folijet.svc.cluster.local/172.20.250.48:80: POST /instance-storage/instances
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: readAddress(..) failed: Connection reset by peer: POST /holdings-storage/holdings
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: readAddress(..) failed: Connection reset by peer: POST /instance-storage/instances
io.vertx.core.impl.NoStackTraceThrowable: proxyClient failure: mod-inventory-storage-23.1.0-SNAPSHOT.692 http://mod-inventory-storage: readAddress(..) failed: Connection reset by peer: POST /item-storage/items

Performance testing of update item scenario after removal of index (MODDATAIMP-697):

In the scope of FOLIO-3388, the "item_status_name_idx_gin" index on the Item status field was removed from mod-inventory-storage. However, this index could potentially be used for matching by Item status during an Item update.

Job profile structure to update item and matching item by status field:

  • Job profile
  • Match profile (902$a to Item HRID)
    • For matches: sub match profile (Static match of "Available" to Item Loan and Availability Status)
      • For matches: action profile (action = update; Folio record type = Item)
        • Mapping profile (Folio record type = Item)

Test results:


description | records number | time (min, per run)
--- | --- | ---
Testing with index | 5,000 | 6, 40, 6, 6
Testing without index | 5,000 | 6, 11, 6, 6


Analysis of the query for matching item by status field:

Example of the CQL query built while processing the sub-match profile for matching by status: 

status.name == "Available" AND id == "4ae2603d-1f71-457f-b69a-3eed820d6cfb"

This CQL query is translated by mod-inventory-storage to the following SQL:

SELECT id, jsonb, creation_date, created_by, holdingsrecordid, permanentloantypeid, temporaryloantypeid, materialtypeid, permanentlocationid, temporarylocationid, effectivelocationid
FROM fs09000000_mod_inventory_storage.item
WHERE (
	CASE WHEN length(lower(f_unaccent('Available'))) <= 600 
		 THEN left(lower(f_unaccent(item.jsonb->'status'->>'name')),600) LIKE lower(f_unaccent('Available')) 
		 ELSE left(lower(f_unaccent(item.jsonb->'status'->>'name')),600) LIKE left(lower(f_unaccent('Available')),600) AND lower(f_unaccent(item.jsonb->'status'->>'name')) LIKE lower(f_unaccent('Available')) 
	END	
) AND (id='4ae2603d-1f71-457f-b69a-3eed820d6cfb')
LIMIT 2 OFFSET 0
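The CASE expression above performs a case-insensitive, accent-insensitive comparison truncated to 600 characters. A rough Python equivalent of that predicate may clarify the logic (this is an illustrative approximation: f_unaccent here is a simplified stand-in for the PostgreSQL function, and wildcard-free LIKE is treated as plain equality):

```python
import unicodedata

def f_unaccent(s: str) -> str:
    """Simplified stand-in for PostgreSQL's f_unaccent: strip combining marks."""
    return "".join(c for c in unicodedata.normalize("NFKD", s)
                   if not unicodedata.combining(c))

def status_matches(status_name: str, term: str) -> bool:
    """Mirror the truncated comparison from the generated SQL:
    compare the first 600 characters, falling back to a full comparison
    when the search term itself exceeds 600 characters."""
    lhs = f_unaccent(status_name or "").lower()
    rhs = f_unaccent(term).lower()
    if len(rhs) <= 600:
        return lhs[:600] == rhs
    return lhs[:600] == rhs[:600] and lhs == rhs

print(status_matches("Available", "Available"))    # → True
print(status_matches("Checked out", "Available"))  # → False
```

The 600-character truncation exists so the left-hand expression can be served by a b-tree-compatible functional index; for short values like item statuses the two branches are equivalent.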


For the particular case when matching by Item status is used as a sub-match profile, no index on the Item status field is used at all. Instead, the lookup is served by the primary-key index on the id field, which is more efficient.

Query plan:
"Limit  (cost=0.56..8.84 rows=1 width=1156) (actual time=0.038..0.040 rows=1 loops=1)"
"  ->  Index Scan using item_pkey on item  (cost=0.56..8.84 rows=1 width=1156) (actual time=0.038..0.039 rows=1 loops=1)"
"        Index Cond: (id = '4ae2603d-1f71-457f-b69a-3eed820d6cfb'::uuid)"
"        Filter: ("left"(lower(f_unaccent(((jsonb -> 'status'::text) ->> 'name'::text))), 600) ~~ 'available'::text)"
"Planning Time: 0.195 ms"
"Execution Time: 0.053 ms"

During testing of the item update scenario it was observed that deleting the "item_status_name_idx_gin" index does not impact the performance of matching an Item by status. According to the results of the analysis, this index is not used for matching an Item by the status field during data import.