
MODKBEKBJ-360 (status: IN REVIEW)

Problem: mod-agreements communicates with mod-kb-ebsco-java to retrieve information about resources. This interaction goes through an endpoint that accepts a single id parameter and returns a single entity. To load n resources, mod-agreements must therefore send n requests to mod-kb-ebsco-java, which may delay the display of information.

This spike proposes a way to reduce the number of requests to mod-kb-ebsco-java by introducing an endpoint that accepts a list of resources.

Findings

  • Option 1 - Introduce a new GET endpoint with a body

GET /resources/bulk

with body

{ "resources": ["0-1-12345", "0-123-12345", "01-23-34522", "0-12-234567"] }

Advantages:

The specification (RFC 7231, Section 4.3.1) does not forbid a payload in a GET request, although it warns that some implementations may reject one:

A payload within a GET request message has no defined semantics;
sending a payload body on a GET request might cause some existing
implementations to reject the request.



Disadvantages

  1. It is not a commonly used approach.
  • Option 2 - Use a GET endpoint with an additional 'id' parameter

GET /resources?id=resource_id_1,...,resource_id_n

or

GET /resources/bulk?id=resource_id_1&...&id=resource_id_n

Advantages

  1. No new endpoint needs to be created.

Disadvantages

  1. According to the specification (RFC 7230, Section 3.1.1), HTTP does not define a limit on URI length, but servers and browsers do. A common recommendation is to keep URIs under 2000 characters.
   HTTP does not place a predefined limit on the length of a
   request-line, as described in Section 2.5.  A server that receives a
   method longer than any that it implements SHOULD respond with a 501
   (Not Implemented) status code.  A server that receives a
   request-target longer than any URI it wishes to parse MUST respond
   with a 414 (URI Too Long) status code (see Section 6.5.12 of
   [RFC7231]).
   Various ad hoc limitations on request-line length are found in
   practice.  It is RECOMMENDED that all HTTP senders and recipients
   support, at a minimum, request-line lengths of 8000 octets.
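
If Option 2 were chosen, the caller would have to keep each URL under whatever length limit applies. The following is a minimal sketch of that batching, assuming the comma-separated `id` variant and the 2000-character recommendation quoted above. The function names and the `/resources` base URL are illustrative, not part of any existing module.

```python
from urllib.parse import quote

# Conservative budget taken from the recommendation above; an assumption,
# not a limit defined by the HTTP specification.
MAX_URL_LENGTH = 2000


def build_url(base_url, batch):
    """Build 'base_url?id=a,b,c' with each id percent-encoded."""
    return base_url + "?id=" + ",".join(quote(i, safe="") for i in batch)


def batch_ids_by_url_length(base_url, ids, max_length=MAX_URL_LENGTH):
    """Split ids into batches so that each resulting URL fits max_length.

    A single id that is longer than the budget still gets its own batch.
    """
    prefix_length = len(base_url + "?id=")
    batches, current, current_length = [], [], prefix_length
    for resource_id in ids:
        encoded_length = len(quote(resource_id, safe=""))
        # A comma is needed before every id except the first in a batch.
        extra = encoded_length + (1 if current else 0)
        if current and current_length + extra > max_length:
            batches.append(current)
            current, current_length = [], prefix_length
            extra = encoded_length
        current.append(resource_id)
        current_length += extra
    if current:
        batches.append(current)
    return batches
```

Each batch then still costs one request, so this variant reduces the number of calls without removing the dependency on URL length.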
  • Option 3 - Introduce a new POST endpoint

POST /resources/bulk

with body 

{
	"resources": ["0-1-12345", "0-123-12345", "01-23-34522", "0-12-234567"]
}

Advantages

  1. HTTP does not define a limit on the size of a POST body, although servers and browsers may impose one.

Disadvantages

  1. POST semantics are generally associated with creating entities rather than fetching them.
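
To make Option 3 more concrete, here is a minimal sketch of building the request body shown above and of resolving all requested ids in a single call. The handler, the `lookup` callable and the response shape (`included` / `failed`) are illustrative assumptions, not the actual mod-kb-ebsco-java implementation.

```python
import json


def build_bulk_request(resource_ids):
    """Serialize the body for the proposed POST /resources/bulk request."""
    return json.dumps({"resources": list(resource_ids)})


def handle_bulk_request(body, lookup):
    """Hypothetical server-side handling of one bulk request.

    `lookup` stands in for whatever the module uses to fetch one resource:
    it returns a dict for a known id and None for an unknown one.
    """
    requested = json.loads(body)["resources"]
    found, failed = [], []
    for resource_id in requested:
        resource = lookup(resource_id)
        if resource is None:
            failed.append(resource_id)
        else:
            found.append(resource)
    # The response shape is an assumption made for this sketch.
    return {"included": found, "failed": failed}


# Usage with an in-memory stand-in for the real data source.
store = {"0-1-12345": {"id": "0-1-12345", "type": "resources"}}
body = build_bulk_request(["0-1-12345", "0-123-12345"])
response = handle_bulk_request(body, store.get)
```

Reporting unresolved ids separately lets the caller tell a missing resource apart from a failed request, which a single-id endpoint would signal through the status code.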

Selected solution:

During the presentation of the spike, it was decided that the best way to solve the issue is to use the POST method to load resources. The sequence diagram is given below.

According to the mod-agreements file, mod-agreements is interested in the following attributes:

  • data.type
  • data.attributes.publicationType
  • data.attributes.name
  • data.attributes.providerName
  • data.attributes.titleCount
  • data.attributes.customCoverages
  • data.attributes.managedCoverages

Some of them relate to a package and some to a resource:

Package attributes:

  • data.type
  • data.attributes.name
  • data.attributes.providerName
  • data.attributes.titleCount
  • data.attributes.customCoverages

Resource attributes:

  • data.type
  • data.attributes.publicationType
  • data.attributes.name
  • data.attributes.providerName
  • data.attributes.customCoverages
  • data.attributes.managedCoverages
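
The dotted names above can be read as paths into the returned JSON document. The following sketch shows how a client could pull out only the resource-level attributes. The sample document is hypothetical and only mimics the shape implied by the paths.

```python
# Resource-level attributes of interest, taken from the list above.
RESOURCE_PATHS = [
    "data.type",
    "data.attributes.publicationType",
    "data.attributes.name",
    "data.attributes.providerName",
    "data.attributes.customCoverages",
    "data.attributes.managedCoverages",
]


def pick(document, dotted_path):
    """Follow a dotted path through nested dicts; return None if any key is missing."""
    current = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def extract(document, paths):
    """Return a flat {path: value} mapping for the requested paths."""
    return {path: pick(document, path) for path in paths}


# Hypothetical response body, for illustration only.
sample = {
    "data": {
        "type": "resources",
        "attributes": {
            "name": "Example Journal",
            "publicationType": "Journal",
            "customCoverages": [],
        },
    }
}
extracted = extract(sample, RESOURCE_PATHS)
```

Attributes that are absent from the document come back as None, so the caller can see at a glance which values a given response does not provide.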

We also assumed that using the holdings table could simplify loading, but some of the required properties are absent from it.

Here is an example of loaded holdings:

publicationTitle=Interpretation: A Journal of Bible and Theology, 
printIdentifier=0020-9643, 
onlineIdentifier=2159-340X, 
dateFirstIssueOnline=1994-01-01, 
numFirstVolOnline=, 
numFirstIssueOnline=, 
dateLastIssueOnline=2014-10-01, 
numLastVolOnline=, 
numLastIssueOnline=, 
titleUrl=https://search.proquest.com/publication/41487, 
titleId=968683, 
embargoInfo=, 
coverageDepth=, 
notes=, 
publisherName=SAGE Publications, 
publicationType=serial, 
dateMonographPublishedPrint=, 
dateMonographPublishedOnline=, 
monographVolume=, 
monographEdition=, 
firstEditor=, 
parentPublicationTitleId=, 
precedingPublicationTitleId=, 
accessType=P, 
packageName=Research Library, 
packageId=4643, 
vendorName=Proquest Info & Learning Co, 
vendorId=22, 
resourceType=Journal

publicationTitle=Advances in Computer Science, Intelligent System and Environment, 
printIdentifier=978-3-642-23776-8, 
onlineIdentifier=978-3-642-23777-5, 
dateFirstIssueOnline=, 
numFirstVolOnline=, 
numFirstIssueOnline=, 
dateLastIssueOnline=, 
numLastVolOnline=, 
numLastIssueOnline=, 
titleUrl=https://link.springer.com/10.1007/978-3-642-23777-5, 
titleId=968675, 
embargoInfo=, 
coverageDepth=, 
notes=, 
publisherName=Springer Berlin Heidelberg, 
publicationType=monograph, 
dateMonographPublishedPrint=2011, 
dateMonographPublishedOnline=2011, 
monographVolume=, 
monographEdition=, 
firstEditor=, 
parentPublicationTitleId=, 
precedingPublicationTitleId=, 
accessType=P, 
packageName=Springer eBooks (Engineering 2011), 
packageId=4769, 
vendorName=Springer Nature, 
vendorId=36, 
resourceType=Book

It seems that two different properties supply the managedCoverages dates, dateMonographPublishedPrint and dateFirstIssueOnline, and it is not clear which property would supply customCoverages, if any. The titleCount property is also absent. So the holdings table most likely cannot be used as a source of truth.
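
The gap described above can be checked mechanically. In the sketch below the holding field names are copied from the example, while the mapping from attributes to candidate holding fields is an assumption made for illustration. Only the managedCoverages, customCoverages and titleCount entries reflect what is stated above.

```python
# Field names present in the example holdings above.
HOLDING_FIELDS = {
    "publicationTitle", "printIdentifier", "onlineIdentifier",
    "dateFirstIssueOnline", "numFirstVolOnline", "numFirstIssueOnline",
    "dateLastIssueOnline", "numLastVolOnline", "numLastIssueOnline",
    "titleUrl", "titleId", "embargoInfo", "coverageDepth", "notes",
    "publisherName", "publicationType", "dateMonographPublishedPrint",
    "dateMonographPublishedOnline", "monographVolume", "monographEdition",
    "firstEditor", "parentPublicationTitleId", "precedingPublicationTitleId",
    "accessType", "packageName", "packageId", "vendorName", "vendorId",
    "resourceType",
}

# Candidate holding fields per attribute. Everything except the last three
# entries is a guess made for this sketch, not a confirmed mapping.
CANDIDATE_SOURCES = {
    "data.type": ["resourceType"],                      # assumption
    "data.attributes.publicationType": ["publicationType"],  # assumption
    "data.attributes.name": ["publicationTitle"],       # assumption
    "data.attributes.providerName": ["vendorName"],     # assumption
    "data.attributes.managedCoverages": [
        "dateFirstIssueOnline", "dateMonographPublishedPrint",
    ],
    "data.attributes.customCoverages": [],  # no obvious source field
    "data.attributes.titleCount": [],       # not present in holdings
}


def unmapped_attributes(candidates, available_fields):
    """Return attributes for which no candidate field exists in the holding."""
    return sorted(
        attribute
        for attribute, fields in candidates.items()
        if not any(field in available_fields for field in fields)
    )


missing = unmapped_attributes(CANDIDATE_SOURCES, HOLDING_FIELDS)
```

Under these assumptions, `missing` contains customCoverages and titleCount, which matches the conclusion that the holdings table alone is not sufficient.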

Jira issues created: MODKBEKBJ-385, MODKBEKBJ-386

 Sequence Diagram



2 Comments

  1. HTTP protocol does not define a limit for URI length according to the specification, but servers and browsers do. General recommendation is no longer than 2000 characters.

    For FOLIO, this typically means that a CQL query can only contain approximately 50 UUIDs before the limit is reached. (When circulation fetches many records, it separates the IDs into batches and issues a request for each one)

    Based on the specification there is no restriction on using a payload in a GET request

    One of the reasons why GET with a body is not commonly used is that the body is typically not included when deciding whether to cache responses.


  2. I would immediately rule out the GET with a body. This section https://tools.ietf.org/html/rfc2616#section-4.3 of the older spec states that:

    ... if the request method does not include defined semantics for an entity-body, then the message-body SHOULD be ignored when handling the request.

    This means that, because (as you rightly say) "A payload within a GET request message has no defined semantics", responses should not change based on the body contents of a GET request.

    Options 2 and 3 are the best fit IMO. There is, of course, also the option of comma-separating the ids under a single URL parameter, which saves character space compared with redefining the parameter for each entry, or a non-JSON POST (content type 'application/x-www-form-urlencoded').

    If this endpoint might be used by a browser in any way (either now or in the future) then I would definitely implement it as a POST of some kind, because of the URL length limitations that are sometimes imposed.