Spike: Support large-scale bulk renewal overrides

Tickets:

Batch API

Existing API endpoints for which a batch alternative should be provided:

  • POST /circulation/renew-by-barcode
  • POST /circulation/renew-by-id

For bulk renewals, a single endpoint could be used:

  • POST /circulation/renew-by-id-batch

This endpoint would follow a design similar to those described here:

References:

The Design

The existing renewal process is too slow, but having a batch perform one operation at a time has also proven problematic. The suggested design is for the batch process to accrue a designated number of successful renewals and then submit them to the storage module as a sub-batch.

The business logic can perform a pre-processing pass that checks whether any of the renewals would fail for business logic reasons. Renewals that pass the business logic check are held, and once the sub-batch size is reached a request is made to the storage module using the sub-batch. This is potentially problematic in that if something fails within the sub-batch, the entire sub-batch might be considered a failure. If this is undesired, then a sub-batch size of 1 must be used. The sub-batch size will need to be determined; it could be configurable, or a property specified in the request if desired. This approach would require a bulk update endpoint to be available from the storage module.

Here, "sub-batch" refers to a subset of renewals that have passed the business logic processing.
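The sub-batch flow described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function names (`check_business_logic`, `submit_sub_batch`) are hypothetical placeholders for the real business logic validation and the storage module call, and the sub-batch size of 100 is an assumed value.

```python
SUB_BATCH_SIZE = 100  # assumed value; to be determined, configurable, or per-request

def process_renewals(renewals, check_business_logic, submit_sub_batch):
    """Pre-check each renewal, accrue passes, and flush sub-batches to storage."""
    results = {"success": [], "failure": []}
    pending = []

    def flush():
        if pending:
            submit_sub_batch(pending)           # one storage request per sub-batch
            results["success"].extend(pending)
            pending.clear()

    for renewal in renewals:
        if check_business_logic(renewal):
            pending.append(renewal)
            if len(pending) >= SUB_BATCH_SIZE:
                flush()
        else:
            results["failure"].append(renewal)

    flush()  # submit any remainder smaller than SUB_BATCH_SIZE
    return results
```

Note that this sketch treats every submitted sub-batch as successful; as discussed above, a real implementation must decide how to report a sub-batch whose storage request partially fails.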

To limit memory consumption there should be a maximum batch size of 10,000. When more than the maximum batch size is sent to the API endpoints, an HTTP 413 "Payload Too Large" should be returned immediately without processing anything.
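A minimal guard for this limit might look like the following sketch; the 10,000 limit comes from the text above, while the function name and return shape are illustrative only (the endpoint wiring is omitted).

```python
MAX_BATCH_SIZE = 10_000  # maximum batch size from the design above

def validate_batch_size(renewals):
    """Return an HTTP status/reason pair if the batch is too large, else None."""
    if len(renewals) > MAX_BATCH_SIZE:
        # Reject immediately; nothing in the batch is processed.
        return (413, "Payload Too Large")
    return None
```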

The PO wishes for the entire batch to be processed and for all successes and failures to be returned.

The resulting structure might look as follows:

{
  "success": [
    {
      "id": "ef642903-c1c3-45da-95d9-3e0e3db617ca",
      ...
    },
    {
      "id": "a0dcc225-12ed-4db9-badf-1b579bccacc7",
      ...
    },
    {
      "id": "b8d0d431-b535-4bc4-a1dd-5bee2ab2d49c",
      ...
    }
  ],
  "failure": [
    {
      "id": "f087206e-1622-4fa4-a317-9a53ec5b49c3",
      ...
    },
    {
      "id": "4a63f8af-3e9c-45f3-88d4-6a5515d62b18",
      ...
    }
  ],
  "error": [
    {
      "id": "f087206e-1622-4fa4-a317-9a53ec5b49c3",
      ...
    }
  ],
  "totalSuccess": 3,
  "totalFailure": 2,
  "totalError": 1
}

The "totalSuccess", "totalFailure", and "totalError" fields are not strictly needed and may not need to be implemented.

The "failure" array represents business logic failures.

The "error" array represents technical failures, such as an SQL failure.
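Assembling such a response from the three result lists could be as simple as the following sketch; the field names follow the structure above, and the function name is hypothetical.

```python
def build_response(success, failure, error):
    """Combine per-renewal outcomes into the response structure shown above.

    success -- renewals persisted by the storage module
    failure -- renewals rejected by business logic
    error   -- renewals that hit technical failures (e.g. an SQL failure)
    """
    return {
        "success": success,
        "failure": failure,
        "error": error,
        # The totals are redundant with the list lengths and may be omitted.
        "totalSuccess": len(success),
        "totalFailure": len(failure),
        "totalError": len(error),
    }
```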


JPG: circ-573.jpg

Example Multi-update SQL

PostgreSQL can update multiple rows in a single statement, for example:

Multi-Update SQL
UPDATE example_table AS t SET
  field1 = v.field1,
  field2 = v.field2
FROM (VALUES
  (1, 'field1 row1', 'field2 row1'),
  (2, 'field1 row2', 'field2 row2')
) AS v(id, field1, field2)
WHERE v.id = t.id;
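One way for the storage module to issue such a statement for a sub-batch is to generate the VALUES list from the renewal data. The sketch below builds a parameterized statement with `%s` placeholders (psycopg style) so the values are bound at execution time rather than interpolated into the SQL; the table and column names passed in any usage are hypothetical, and depending on the driver the placeholders may need explicit type casts.

```python
def build_multi_update(table, id_column, columns, row_count):
    """Build a parameterized multi-row UPDATE like the SQL example above.

    Each row in the VALUES list gets one placeholder for the id plus one
    per updated column; the caller supplies the flattened values on execute.
    """
    assignments = ", ".join(f"{c} = v.{c}" for c in columns)
    row = "(" + ", ".join(["%s"] * (1 + len(columns))) + ")"
    values = ", ".join([row] * row_count)
    return (
        f"UPDATE {table} AS t SET {assignments} "
        f"FROM (VALUES {values}) AS v({id_column}, {', '.join(columns)}) "
        f"WHERE v.{id_column} = t.{id_column};"
    )
```

For example, `build_multi_update("loan", "id", ["field1", "field2"], 2)` (with "loan" as a hypothetical table name) yields a statement of the same shape as the SQL above, with two three-placeholder rows in the VALUES list.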

Ticket FOLIO-1156 describes active work in this regard that might provide a way to do the above or something similar.