Page tree
Skip to end of metadata
Go to start of metadata

Requirements - https://issues.folio.org/browse/MODORDSTOR-34

In order to proceed with this story we need to come up with a solution on how to generate unique sequential numbers for PO Lines.
Namely, PO lines might be added in parallel by several users and we may have several instances of the module running. So implementing it without a mechanism to guarantee uniqueness in these conditions, there is a risk to end up with duplicate numbers. The things are further complicated by the fact that we need to keep separate sequence per Order.

Investigation

Postgres provides the SEQUENCE command, which can be used to provide a unique sequential number series akin to MySQL's AUTOINCREMENT feature.  Unlike AUTOINCREMENT, which is applied directly to table columns, a sequence is created outside of a table.

Example:

> CREATE SEQUENCE phonebook_id_seq;
> CREATE TABLE phonebook(id INTEGER DEFAULT NEXTVAL('phonebook_id_seq'), phone VARCHAR(32), firstname VARCHAR(32), lastname VARCHAR(32));

This example first creates a sequence, then uses that sequence to define the behaviour of the id column in the phonebook table.  This is typically how Postgres users recreate MySQL's AUTOINCREMENT functionality to generate unique object ids.  Since sequences exist as separate database entities from tables, they can be used independent of tables as well.


Example:

> CREATE SEQUENCE ponumber_seq;
> SELECT * FROM NEXTVAL('ponumber_seq');
>
> nextval
> ---------
> 2

Here the sequence is called directly as part of a query, with its current value being incremented by NEXTVAL() before it is returned.  When called this way, the result is guaranteed to be unique.


In order to verify that SEQUENCE is supported in the FOLIO environment, the following code was run from within a unit test:

PostgresClient dbClient = PostgresClient.getInstance(vertx, "test_tenant");
CompletableFuture<ResultSet> selectCompleted = new CompletableFuture<>();
String sql = "CREATE SEQUENCE ponumber_seq; SELECT * FROM SETVAL('ponumber_seq',13);SELECT * FROM NEXTVAL('ponumber_seq');";
dbClient.select(sql, result -> {
  if(result.succeeded()) {
    logger.info("--- mod-vendors-test: Result success!!! ");
    selectCompleted.complete(result.result());
  }
  else {
    logger.info("--- mod-vendors-test: Result failure!!!");
    selectCompleted.completeExceptionally(result.cause());
  }
});
ResultSet results = selectCompleted.get(5, TimeUnit.SECONDS);
JsonObject mismatchedRowCount = results.toJson();
logger.info("--- mod-vendors-test: Finished SQL call ... " + mismatchedRowCount);


The output:
INFO: --- mod-vendors-test: Finished SQL call ... {"columnNames":["nextval"],"numColumns":1,"numRows":1,"results":[[14]],"rows":[{"nextval":14}]}


Reference - Postgresql Sequences

https://www.ntchosting.com/encyclopedia/databases/postgresql/sequence/

Solution Proposal

  1. When a Purchase Order is created, create a corresponding sequence in the DB using the po_number as the name of the sequence
    1. CREATE SEQUENCE <po_number>_seq MINVALUE 1 MAXVALUE 999;


  2. When a PO Line is created, call the sequence with the corresponding po_number to get a unique suffix value
    1. SELECT * FROM NEXTVAL('<po_number>_seq');


  3. When a Purchase Order has a status change to "OPEN", delete the sequence from the DB
    1. DROP SEQUENCE <po_number>_seq;





  • No labels

7 Comments

  1. I'm not sure how common this would be, but what happens if the order is put back into PENDING state?  We'll have to ask Dennis Bridges if that's allowed, and how often it might occur.

    I'm also a little curious if there are any resource or performance implications associated with having a large number of these sequences... In theory we could wind up with hundreds or thousands of them for each tenant at any given point.  It may be worth doing a little research, or even some quick POC testing to determine if this is something we should be concerned about

  2. I think we may also need to incorporate the tenantID in the sequence name as the numbers only need to be unique with a tenant, not across all tenants, so we need to avoid collisions.  

    The postgresClient might take care of this for you if it creates the sequence under the tenant-specific schema (e.g. diku_mod-orders_storage)

  3. In theory, this should never happen, but I think it's technically a possibility.... We'll need to figure out what to do if a sequence already exists... if using the SQL above, the call would fail.  Do we want to handle this in storage module?  BL module?  or simply add IF NOT EXISTS to the SQL?  

  4. One more thing which should be taken into account that if user edits the PO Number the sequence should be renamed as part of PUT purchase_order step i.e.

    ALTER SEQUENCE [ IF EXISTS ] name RENAME TO new_name
    1. I suggest that we name the sequences with the UUID of the PO instead of the po_number.  This would avoid the scenario entirely since the PO's UUID will never change.

  5. We discussed how to handle a failure to remove the sequence when transitioning the PO from PENDING to OPEN.  I think the simplest approach would be to also check for, and remove the sequence when transitioning to CLOSED.  This would in theory be needed anyway for the case where an order goes from PENDING directly to CLOSED.

    Of course, the problem still exists... e.g. we fail to remove the sequence when transitioning to CLOSED, but I think that's something we can address later.  One though would be to have some periodic process that checks for and removes "zombie" sequences.

  6. I did a baseline selection using EXPLAIN ANALYZE on the query, then created 1000 sequences and did the same selection.  Then I created 10,000 sequences and ran the selection again.  Results were:


    0 Sequences

      Planning time: 0.055 ms
      Execution time: 0.038 ms


    1000 Sequences

      Planning time: 0.199 ms
      Execution time: 0.030 ms


    10000 Sequences

      Planning time: 0.249 ms
      Execution time: 0.030 ms


    We can see that while the query execution time remains relatively constant, that the planning time increases as the number of sequences in the system increases.  I did manage to find a post in Stackoverflow related to creating a large number of sequences:

    Each sequence will appear in the system catalog (pg_class) which also contains all of the tables, views, etc. Having 100k rows there is sure to slow the system down dramatically. The amount of memory required to hold all of the data structures associated with these sequence relations would be also be large.

    https://stackoverflow.com/questions/23889619/anything-wrong-about-having-many-sequences-in-postgres


    I suspect we would experience the same impact that having a large number of small tables in the DB would produce.






Write a comment…