IN PROGRESS
Overview
Within the scope of PERF-524, tests need to be run to answer the following questions:
- Can the current system accommodate an average load? The load model is described in "NLA load model investigation and creation".
- If not, see PERF-525.
- What happens at peak times, when all workflows are running at once?
- Typical KPIs:
- Service CPU
- Service Memory
- DB CPU
- DB Memory
- Response times
- Duration of long workflows
- Recommendations for scaling modules up/out to accommodate peak times
Summary
Recommendations & Jiras (Optional)
Jiras
Test Runs & Results
Test # | Configuration | Test duration | Comments |
---|---|---|---|
1 | All workflows started at the same time | 1 hour | Response 500 error during the Data Import process |
2 | Data import and data export started with a 15 min delay | 1 hour | Response 500 error during the Data Import process |
3 | Data import and data export started with a 1 min delay | 1 hour | Response 500 error during the Data Import process |
4 | Data import started with a 1 min delay and data export started with a 20 min delay | 35 min | Response 500 error during the Data Import process |
5 | Same Jenkins configuration as test 4, with DB_MAXPOOLSIZE increased to 200 for mod_srs and mod_srm | 30 min | Response 500 error during the Data Import process. Data import duration decreased by a factor of 2. |
6 | Test without Data import | 1 hour | No errors |
Test results from the 1st test run (results of the 1st, 2nd and 3rd test runs are similar). The four timing columns give the total time it takes to complete the workflow:
Test # | Workflow name | Avg with DI | 95th pct with DI | Avg no DI | 95th pct no DI | Comment |
---|---|---|---|---|---|---|
1 | Checkin | 1.591 | 1.054 | | | |
2 | Checkout | 1.948 | 1.650 | | | |
3 | View invoices | 0.763 | 0.913 | | | |
4 | Create invoices | 1.174 | 1.370 | | | |
5 | Edit invoices | 1.581 | 1.897 | | | |
6 | Delete invoices | 0.804 | 0.927 | | | |
7 | Approving Invoices | 1.453 | 1.940 | | | |
8 | View Authority records | 0.289 | 0.381 | | | |
9 | View MARC tag table | 0.987 | 1.284 | | | |
10 | View holdings records | 1.526 | 1.922 | | | |
11 | View Bib | 0.841 | 1.168 | | | |
12 | View patron records | 0.566 | 0.883 | | | |
13 | Delete patron records | 0.638 | 1.070 | | | |
14 | Update patron records | 1.043 | 1.625 | | | |
15 | Create patron records | 1.098 | 1.261 | | | |
16 | View Ledger | 0.050 | 0.088 | | | |
17 | Create ledger | 0.616 | 0.761 | | | |
18 | Edit ledger | 0.054 | 0.085 | | | |
19 | Delete a ledger | 0.046 | 0.080 | | | |
20 | Export bib "Default instances export job profile" | - | - | | | |
21 | Export holdings "Default holdings export job profile" | - | - | | | |
22 | Export authority records "Default authority export job profile" | - | - | | | |
 | DI "DISC HRID match" | - | - | | | |
 | DI "DS LA edeposit records update" | - | - | | | |
 | DI "DISC New edeposit records" | - | - | | | |
 | DI "DISC New NON edeposit records" | - | - | | | |
 | View item records | 1.289 | 1.649 | | | |
 | Update item records | 0.998 | 1.250 | | | |
 | Delete item records | 0.927 | 1.099 | | | |
 | Monitoring Pick Slips and Requests GET /circulation/requests | 0.359 | 0.480 | | | |
 | Monitoring Pick Slips and Requests GET /circulation/pick-slips/ | 0.112 | 0.256 | | | |
 | Monitoring Pick Slips and Requests | 0.303 | 0.303 | | | |
 | Users loan renewal | 1.467 | 1.661 | | | |
 | Item-level requests | 0.669 | 0.973 | | | |
 | View vendor records | 0.713 | 1.165 | | | |
 | Edit vendor records | 5.199 | 6.190 | | | |
 | Create vendor records | 1.064 | 1.200 | | | |
 | Delete vendor records | 0.412 | 0.522 | | | |
 | Create purchase orders | 1.625 | 1.733 | | | |
 | View purchase orders | 1.205 | 1.435 | | | |
 | Edit purchase orders | 2.076 | 2.984 | | | |
 | Delete purchase orders | 1.432 | 1.830 | | | |
 | Retrieving instances and holdings | 0.035 | 0.073 | | | |
 | Edit MARC tag table | 3.424 | 4.257 | | | |
 | Fiscal close - end of FY rollover | | | | | |
 | Blacklight: View an inventory record JMeter script | 0.821 | 1.042 | | | |
 | Blacklight: Create a Request JMeter script | 1.122 | 1.404 | | | |
 | Blacklight: Create a View Patron record JMeter script | 0.073 | 0.110 | | | |
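Average and 95th percentile figures of this kind can be derived from raw JMeter results. The following is a minimal sketch, assuming the results were saved in JMeter's CSV format with the default `label` and `elapsed` (milliseconds) columns. The nearest-rank percentile used here is an assumption and may differ slightly from the method used to produce the table; the sample rows are hypothetical and are not taken from these test runs.

```python
import csv
import io
import math
from collections import defaultdict


def summarize(csv_text):
    """Return {label: (avg_seconds, p95_seconds)} from JMeter CSV results."""
    samples = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # JMeter reports elapsed time in milliseconds; convert to seconds.
        samples[row["label"]].append(int(row["elapsed"]) / 1000.0)

    summary = {}
    for label, values in samples.items():
        values.sort()
        # Nearest-rank percentile: smallest value with at least 95% of
        # the samples at or below it (an assumed definition).
        rank = max(1, math.ceil(0.95 * len(values)))
        summary[label] = (sum(values) / len(values), values[rank - 1])
    return summary


# Hypothetical sample data for illustration only.
EXAMPLE = """timeStamp,elapsed,label,responseCode,success
1700000000000,1000,Checkin,200,true
1700000001000,2000,Checkin,200,true
1700000002000,3000,Checkin,200,true
1700000003000,500,Checkout,200,true
"""

if __name__ == "__main__":
    for label, (avg, p95) in summarize(EXAMPLE).items():
        print(f"{label}: avg={avg:.3f}s p95={p95:.3f}s")
```

Grouping by `label` keeps one row per workflow step, which matches the per-workflow layout of the table.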
Memory Utilization
Memory utilization:
- mod-source-record-manager
- mod-source-record-storage
- mod-inventory-storage
All other modules remained stable during Data Import.
*This test was performed after running 2 sets of the same jobs (1k, 5k, 10k, 22.7k, 50k records, twice).
Service CPU Utilization
*In the chart below, each small spike corresponds to one DI job.
**Some spikes are shorter than others because of differences in the number of records imported.
**Test #1 has higher CPU usage because it includes background activities (CICO 5 users + DI).
Most CPU-consuming modules:
- mod-quick-marc - 79%
- mod-source-record-storage - 74%
- mod-inventory - 69%
- mod-source-record-manager - 67%
- others - usage less than 30%
Instance CPU Utilization
RDS CPU Utilization
As expected, each DI job consumes a lot of DB CPU (each spike here corresponds to one DI job).
DB CPU usage is approximately 96%.
Appendix
Infrastructure
PTF environment: ncp3
- 9 m6i.2xlarge EC2 instances located in US East (N. Virginia), us-east-1
- 2 db.r6.xlarge database instances: one reader and one writer
- MSK ptf-kakfa-3
- 4 m5.2xlarge brokers in 2 zones
- Apache Kafka version 2.8.0
- EBS storage volume per broker: 300 GiB
- auto.create.topics.enable=true
- log.retention.minutes=480
- default.replication.factor=3
- Kafka topics partitioning:
- DI_RAW_RECORDS_CHUNK_READ: 2
- DI_RAW_RECORDS_CHUNK_PARSED: 2
- DI_PARSED_RECORDS_CHUNK_SAVED: 2
- DI_SRS_MARC_AUTHORITY_RECORD_CREATED: 2
- DI_COMPLETED: 2
Modules memory and CPU parameters
Modules | Version | Task Definition | Running Tasks | CPU | Memory | MemoryReservation | MaxMetaspaceSize | Xmx |
---|---|---|---|---|---|---|---|---|
mod-data-import | 2.7.1 | 8 | 1 | 256 | 2048 | 1844 | 512 | 1292 |
mod-di-converter-storage | 2.0.2 | 5 | 2 | 128 | 1024 | 896 | 128 | 768 |
mod-source-record-storage | 5.6.5 | 24 | 2 | 1024 | 4096 | 3688 | 512 | 3076 |
mod-source-record-manager | 3.6.2 | 14 | 2 | 1024 | 4096 | 3688 | 512 | 3076 |
mod-inventory-storage | 26.0.0 | 10 | 2 | 1024 | 2208 | 1952 | 384 | 1440 |
mod-inventory | 20.0.4 | 8 | 2 | 1024 | 2880 | 2592 | 512 | 1814 |
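One sanity check on these settings is whether the JVM heap (Xmx) plus MaxMetaspaceSize fits inside the container's MemoryReservation and Memory values. The following is a minimal sketch using the values copied from the table above. It assumes all four columns share the same unit, and it ignores other JVM memory such as thread stacks and direct buffers, so the headroom it reports is an upper bound rather than a full accounting.

```python
# (Memory, MemoryReservation, MaxMetaspaceSize, Xmx), copied from the table above.
MODULES = {
    "mod-data-import": (2048, 1844, 512, 1292),
    "mod-di-converter-storage": (1024, 896, 128, 768),
    "mod-source-record-storage": (4096, 3688, 512, 3076),
    "mod-source-record-manager": (4096, 3688, 512, 3076),
    "mod-inventory-storage": (2208, 1952, 384, 1440),
    "mod-inventory": (2880, 2592, 512, 1814),
}


def headroom(modules):
    """Return {module: (reservation_headroom, memory_headroom)}.

    Headroom is what remains after subtracting heap + metaspace,
    assuming all values use the same unit.
    """
    result = {}
    for name, (memory, reservation, metaspace, xmx) in modules.items():
        jvm_total = xmx + metaspace
        result[name] = (reservation - jvm_total, memory - jvm_total)
    return result


if __name__ == "__main__":
    for name, (soft, hard) in headroom(MODULES).items():
        print(f"{name}: reservation headroom={soft}, memory headroom={hard}")
```

A headroom at or near zero against MemoryReservation, as the sketch reports for mod-di-converter-storage, leaves little room for non-heap memory within the reservation.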
Methodology/Approach
JMeter scripts were used to test baseline DI and DI with 5 concurrent CICO users.
Multitenant testing
- Tests 1-5: DI on each tenant consecutively (5 jobs from 3 tenants = 15 test runs).
- Tests 6-8: DI jobs from two tenants simultaneously with a 1 min ramp-up.
- Test 9: DI jobs from 3 tenants simultaneously with a 1 min ramp-up.
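The count of 15 runs for tests 1-5 can be illustrated by enumerating the job and tenant combinations. This is a rough sketch only: the tenant names are hypothetical, and the five job sizes are assumed to be the 1k, 5k, 10k, 22.7k and 50k record jobs mentioned in the Memory Utilization note, with one job size per test.

```python
# Hypothetical tenant names; the report does not list them.
TENANTS = ["tenant_a", "tenant_b", "tenant_c"]
# Assumed job sizes, taken from the note in the Memory Utilization section.
JOB_SIZES = ["1k", "5k", "10k", "22.7k", "50k"]


def consecutive_runs(tenants, job_sizes):
    """Return {test_number: [(job_size, tenant), ...]} for tests 1..len(job_sizes).

    Each test runs one job size on every tenant, one tenant after another.
    """
    return {
        test_number: [(size, tenant) for tenant in tenants]
        for test_number, size in enumerate(job_sizes, start=1)
    }


if __name__ == "__main__":
    plan = consecutive_runs(TENANTS, JOB_SIZES)
    for test_number, runs in plan.items():
        print(test_number, runs)
    print("total runs:", sum(len(runs) for runs in plan.values()))
```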