How to implement a simple CRUD API

This article provides general guidance on implementing a CRUD API for an RMB-based module (mod-inventory-storage is used as a sample module). Usually, such an API is created for so-called reference data entities - supporting data that has a list of predefined values, but can be updated or extended by institutions in the future (e.g. instance statuses, holdings record sources, identifier types, patron groups, etc.).

The API development can be expressed in the following steps:

  1. Model API and domain objects.
  2. Declare the API in module descriptor.
  3. Define DB table for the new entity.
  4. Apply necessary DB constraints and indexes.
  5. Implement the API logic.
  6. Create system predefined data for the API.
  7. Prepare integration tests for the new API.
  8. Make sure the predefined data is loaded.

0. Feature overview

For this guidance, let's build a CRUD API for instance sources. An instance source defines the source of an instance - right now we have only MARC and FOLIO sources, but institutions can introduce their own custom sources. An instance source usually has an id, a name, and a source - which defines who created the record: either folio (a system-provided record) or local (added by a user). There is also the metadata property - a system property that holds technical information about when the record was created and updated, and which user created and updated it. The instance schema has to be updated with a new sourceId property that refers to the new instance-source.

1. API and domain objects modeling

FOLIO uses JSON schema to define the structure of a domain object; this definition is automatically converted to Java classes by a Maven plugin. RAML is used to define an API (request/response structure, query parameters, possible response statuses, endpoint names, etc.); it is also used to generate the API documentation that is published here.

Usually, a CRUD API needs the following endpoints:

  1. POST endpoint with the entity as the request body - to create an entity;
  2. GET by id endpoint;
  3. GET collection of entities by a CQL query supporting pagination;
  4. PUT endpoint - to update an entity by id;
  5. DELETE endpoint - to remove an entity by id.

Within core-functional modules, we usually use the following URI structure: /{domain-name}-storage/{domain-name} - e.g. /instance-sources-storage/instance-sources. In this case, the endpoints for the instance-sources API will have the following URIs:

  1. POST /instance-sources-storage/instance-sources
  2. GET /instance-sources-storage/instance-sources/{id}
  3. GET /instance-sources-storage/instance-sources?query=<a-query>&limit=<a-limit>&offset=<an-offset>
  4. PUT /instance-sources-storage/instance-sources/{id}
  5. DELETE /instance-sources-storage/instance-sources/{id}
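
For example, a search by name with paging could look like the request below (the query value is illustrative, and in practice the CQL expression has to be URL-encoded):

Sample search request
GET /instance-sources-storage/instance-sources?query=name=="FOLIO"&limit=10&offset=0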

1.1. Define instance-source schema

Let's define the JSON schema for the instance-source.

Instance sources
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "description": "...",
  "type": "object",
  "properties": {
    "id": {
      "type": "string",
      "description": "..."
    },
    "name": {
      "type": "string",
      "description": "..."
    },
    "source": {
      "type": "string",
      "description": "..."
    },
    "metadata": {
      "type": "object",
      "$ref": "raml-util/schemas/metadata.schema",
      "readonly": true
    }
  },
  "additionalProperties": false,
  "required": [
    "name",
    "source"
  ]
}


Please pay attention that the metadata property is read-only. There is also additionalProperties: false - it specifies that no other properties are allowed, otherwise validation fails. We have added name and source to the required list - this means that validation also fails if those properties are missing in the request.

Create a collection schema for the domain - it will be used as the response for the GET by CQL endpoint. You can use an existing domain as an example; the schema basically defines two properties - an array of the domain objects and totalRecords, an integer saying how many records match the criteria.
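
A minimal sketch of such a collection schema is shown below (the property and file names are illustrative; the array property name has to match the one referenced from the RAML collection resource type):

Instance sources collection (sketch)
{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "description": "A collection of instance sources",
  "type": "object",
  "properties": {
    "instanceSources": {
      "description": "List of instance sources",
      "type": "array",
      "items": {
        "type": "object",
        "$ref": "instance-source.json"
      }
    },
    "totalRecords": {
      "type": "integer",
      "description": "Total number of records matching the query"
    }
  },
  "additionalProperties": false,
  "required": [
    "instanceSources",
    "totalRecords"
  ]
}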

1.2. Define instance-source API

Next, we need to define RAML for the API.

instance-sources API
#%RAML 1.0
title: Instance Sources API
version: v1.0
protocols: [ HTTP, HTTPS ]
baseUri: http://localhost

documentation:
  - title: Instance Sources API
    content: This documents the API calls that can be made to query and manage instance sources

types:
  instanceSource: !include instance-source.json #name of the file with domain schema
  instanceSources: !include instance-sources.json #name of the file for get by query response
  errors: !include raml-util/schemas/errors.schema

traits:
  pageable: !include raml-util/traits/pageable.raml
  searchable: !include raml-util/traits/searchable.raml
  language: !include raml-util/traits/language.raml
  validate: !include raml-util/traits/validation.raml

resourceTypes:
  collection: !include raml-util/rtypes/collection.raml
  collection-item: !include raml-util/rtypes/item-collection.raml

/instance-sources-storage/instance-sources: # the common part of the URI
  type:
    collection: # the predefined type that generates GET (many) and POST
      exampleCollection: !include examples/<example-file-for-get-many-response>.json # replace with actual name 
      exampleItem: !include examples/<example-file-for-single-item>.json # replace with actual name of single item example
      schemaCollection: instanceSources # the one we defined in the types section
      schemaItem: instanceSource # the one we defined in the types section
  get:
    is: [
      searchable: {description: "with valid searchable fields", example: "name=aaa"}, # searchable here means to define "query" param
      pageable # pageable means to define "offset" and "limit" integer params with validation (more than 0 and less than max integer)
    ]
    description: Return a list of instance sources
  post:
    description: Create a new instance source
    is: [validate]
  /{id}: # mapping for /:id resources
    description: Pass in the instance source id
    type:
      collection-item: # This is predefined type, it will add GET by id, PUT by id, DELETE by id
        exampleItem: !include examples/<example-file-for-single-item>.json # replace with actual name
        schema: instanceSource # the type from the types section

As you can see, there are already some predefined resourceTypes - collection (generates the GET many and POST resources) and collection-item (generates the GET by id, PUT by id and DELETE by id resources).

2. Declare the new API in module descriptor

The module descriptor is a special file that contains information about which API endpoints (and their versions) the module provides, which permissions a consumer needs to call them, which APIs (and their versions) the module requires, and the launch descriptor used by OKAPI to start the module. This module descriptor is sent to OKAPI in order to make the module available.

For the new API we need to create permissions for each endpoint and declare the new interface that we expose. You can use the instance-preceding-succeeding-titles API as an example. Keep in mind that permission names usually end with item.<http-method> (e.g. inventory-storage.instance-sources.item.post) if the API deals with a single record (GET/PUT/DELETE by id, POST) and collection.<http-method> (e.g. inventory-storage.instance-sources.collection.get) if the endpoint returns/processes a list of items.
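
A trimmed sketch of the corresponding ModuleDescriptor-template.json entries is shown below - the interface version, handler list and permission descriptions are illustrative, and most handlers/permissions are omitted for brevity:

Module descriptor entries (sketch)
{
  "provides": [
    {
      "id": "instance-sources-storage",
      "version": "1.0",
      "handlers": [
        {
          "methods": ["GET"],
          "pathPattern": "/instance-sources-storage/instance-sources",
          "permissionsRequired": ["inventory-storage.instance-sources.collection.get"]
        },
        {
          "methods": ["POST"],
          "pathPattern": "/instance-sources-storage/instance-sources",
          "permissionsRequired": ["inventory-storage.instance-sources.item.post"]
        }
      ]
    }
  ],
  "permissionSets": [
    {
      "permissionName": "inventory-storage.instance-sources.collection.get",
      "displayName": "inventory storage - get instance source collection",
      "description": "Get collection of instance sources"
    },
    {
      "permissionName": "inventory-storage.instance-sources.item.post",
      "displayName": "inventory storage - create instance source",
      "description": "Create an instance source"
    }
  ]
}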

3. Update instance schema to support new property

As was mentioned in the overview, we need to be able to set the sourceId for an instance. For this, let's update the instance.json file and add the new property. Nothing special here - you can use a simple string property that is already present in the schema as an example.
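
The new property could look roughly like the snippet below (the description and the UUID pattern are illustrative - follow the convention used by the existing id-like properties in instance.json):

sourceId property (sketch)
"sourceId": {
  "type": "string",
  "description": "Id of the instance source (UUID)",
  "pattern": "^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
}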

When the schema for an API is changed, we need to update the API version to reflect that change. There are two kinds of API changes on the FOLIO project:

  • minor change (non-breaking) - a change where something new has been added to the schema and API consumers do not have to adapt to the new version of the API;
  • breaking change - the previous version of the API is not compatible with the new one, so API consumers need to change their logic to handle the change properly (a required property has been added, an existing property has been renamed, a constraint for a property has been added, etc.).

A breaking change is out of scope for this guide - it requires updating the major API version and updating all the API consumers to support the new schema. A minor change requires bumping only the minor API version (e.g. if the current instance-storage version is 7.5, we have to bump it to 7.6). The version should be changed in both the RAML file and the ModuleDescriptor-template.json file.
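
In the module descriptor, the bump is just the version of the interface in the provides section (assuming the version declared at the top of the RAML file is kept in sync with it); the other attributes of the entry stay as they are:

instance-storage interface version bump (sketch)
{
  "id": "instance-storage",
  "version": "7.6"
}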

4. Define DB table for the new entity and add necessary constraints

There is the schema.json file where all DB tables/scripts/indexes/constraints are defined. Here you can find more details. For this change we need the following:

  • Define DB table;
  • Define unique key for the name property (because we don't want to have duplicate sources);
  • Define foreign key for the instance.sourceId property.


instance_source table definition
{
   "tables": [
    ...
    {
       "tableName": "instance_source",
       "withMetadata": true, // auto-generate metadata - date record was last updated/created/etc.
       "withAuditing": false, // Do not create auditing table
       "uniqueIndex": [
         {
           "fieldName": "name",
           "tOps": "ADD"
         }
       ]
    }
   ]
}

Also, update the instance table definition and add the FK constraint to its foreignKeys array:

FK sourceId constraint
{
   "fieldName": "sourceId",
   "targetTable": "instance_source",
   "tOps": "ADD"
}

5. Implement the API logic

Before starting development, we have to run the mvn clean install -DskipTests command to let the Maven plugin generate domain objects and API definitions from the RAML and JSON schemas we've defined in the previous steps. Once it has completed, all we need to do is implement the generated API interface, which contains a method for each endpoint that we've defined in the RAML file (GET/PUT/DELETE by id, POST and GET many with a query).

Please also note that RMB already provides methods for all the common endpoints; the org.folio.rest.persist.PgUtil utility class has the following methods:

  • get(...) - implements GET many endpoint;
  • getById(...) - implements GET by id endpoint;
  • post(...) - implements POST endpoint;
  • put(...) - implements PUT by id endpoint;
  • deleteById(...) - implements DELETE by id endpoint.

For a regular CRUD API, all you need to do is delegate the call to the appropriate PgUtil method.
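
A minimal sketch of such an implementation is shown below. The generated interface name, the method signatures and the nested response classes all come from the RAML/JSON schema code generation, so the names used here are assumptions - check the generated sources for the real ones.

Instance sources API implementation (sketch)
package org.folio.rest.impl;

import java.util.Map;

import javax.ws.rs.core.Response;

import org.folio.rest.jaxrs.model.InstanceSource;
import org.folio.rest.persist.PgUtil;

import io.vertx.core.AsyncResult;
import io.vertx.core.Context;
import io.vertx.core.Handler;

// Implements the interface generated from the instance-sources RAML
// (interface and method names are assumptions - check the generated sources).
public class InstanceSourceApi implements org.folio.rest.jaxrs.resource.InstanceSourcesStorage {

  // Has to match the tableName declared in schema.json
  private static final String INSTANCE_SOURCE_TABLE = "instance_source";

  @Override
  public void postInstanceSourcesStorageInstanceSources(InstanceSource entity,
      Map<String, String> okapiHeaders, Handler<AsyncResult<Response>> asyncResultHandler,
      Context vertxContext) {
    // Delegate creation to RMB: it persists the entity and builds the 201/4xx/5xx responses
    PgUtil.post(INSTANCE_SOURCE_TABLE, entity, okapiHeaders, vertxContext,
        PostInstanceSourcesStorageInstanceSourcesResponse.class, asyncResultHandler);
  }

  @Override
  public void getInstanceSourcesStorageInstanceSourcesById(String id,
      Map<String, String> okapiHeaders, Handler<AsyncResult<Response>> asyncResultHandler,
      Context vertxContext) {
    // Delegate GET by id to RMB
    PgUtil.getById(INSTANCE_SOURCE_TABLE, InstanceSource.class, id, okapiHeaders, vertxContext,
        GetInstanceSourcesStorageInstanceSourcesByIdResponse.class, asyncResultHandler);
  }
}

The remaining endpoints (GET many with a CQL query, PUT and DELETE by id) are delegated in the same way to PgUtil.get(...), PgUtil.put(...) and PgUtil.deleteById(...).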

6. Create system predefined data

The system predefined data is usually called reference data. In the reference-data directory, create an instance-sources directory and add JSON files for the instance sources that have to be created on startup (FOLIO and MARC); the files should be valid against the defined schema.
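
One of those files could look roughly like this (the file name and the id are illustrative - the module will use its own fixed UUIDs):

folio.json (sketch)
{
  "id": "11111111-1111-1111-1111-111111111111",
  "name": "FOLIO",
  "source": "folio"
}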

Once the files are created, go to TenantRefAPI.java and register your reference data directory: before the sample data set-up, add the following code:

tl.add("instance-sources", "instance-sources-storage/instance-sources");

The first argument is the name of the directory with the ref-data files, and the second is the URI of the API to execute.

7. Prepare API integration tests

Create an API integration test class - usually it extends the TestBaseWithInventoryUtil class and verifies all success cases plus some validation cases, etc. Also, do not forget to add the new class to the StorageTestSuite suite class, otherwise the tests will not be executed on Jenkins.
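
A rough sketch of such a test is shown below. To keep it self-contained it uses the plain JDK HTTP client and hard-coded URL/tenant values, which are assumptions - the real tests should use the module's own test base and HTTP utilities instead:

Instance sources API test (sketch)
package org.folio.rest.api;

import static org.junit.Assert.assertEquals;

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

import org.junit.Test;

public class InstanceSourcesApiTest {

  // Assumed module URL and tenant - adjust to the test environment
  private static final String BASE_URL = "http://localhost:8081/instance-sources-storage/instance-sources";
  private static final String TENANT = "test_tenant";

  @Test
  public void canCreateAndSearchInstanceSource() throws Exception {
    HttpClient client = HttpClient.newHttpClient();

    // Create a new, locally defined instance source
    HttpRequest post = HttpRequest.newBuilder(URI.create(BASE_URL))
        .header("Content-Type", "application/json")
        .header("X-Okapi-Tenant", TENANT)
        .POST(HttpRequest.BodyPublishers.ofString(
            "{\"name\": \"Local source\", \"source\": \"local\"}"))
        .build();
    assertEquals(201, client.send(post, HttpResponse.BodyHandlers.ofString()).statusCode());

    // Search the collection by name (the CQL query is URL-encoded)
    HttpRequest get = HttpRequest.newBuilder(
            URI.create(BASE_URL + "?query=name%3D%3D%22Local%20source%22"))
        .header("X-Okapi-Tenant", TENANT)
        .GET()
        .build();
    assertEquals(200, client.send(get, HttpResponse.BodyHandlers.ofString()).statusCode());
  }
}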

8. Prepare reference data test case

We also need a test case to verify that the new reference data is correctly loaded during module startup. We usually add a test case to the org.folio.rest.api.ReferenceTablesTest class, but you can also add it to the class with the API tests.