SPIKE: MODSOURMAN-526 Verify persist value in DB during parsing 004 field


Participants:
Solution Architect
Product Owner
Java Lead

Goal and requirements

Provide a solution to verify that the 004 value is the HRID of an Instance record that exists in the database.

Requirements

  • Ensure a MARC holdings record always has a 004 value AND only one such value
  • Ensure that the 004 value is an Instance record's HRID
  • Ensure the 004 value does not contain a subfield delimiter
  • Cannot have multiple 004 values on a MARC Holdings record
  • Ensure that if an invalid 004 value is set in the MARC Holdings record, an error message is returned and the record is not created/saved to SRS
  • Ensure that a valid 004 value links an Instance record to the MARC Holdings record as shown in the above screenshots
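The format-level requirements above (exactly one 004 value, not blank, no subfield delimiter) can be sketched as a small standalone check. This is only an illustration under assumptions: the class name, method name, and messages are hypothetical, and treating '$' as a typed form of the delimiter is an assumption. The check that the value is an existing Instance HRID needs a database lookup and is not covered here.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of SRM: format checks for the 004 values of one MARC Holdings record.
final class Holdings004Validator {
    // MARC 21 subfield delimiter (ASCII unit separator, hex 1F).
    private static final char SUBFIELD_DELIMITER = '\u001F';
    // Assumption: '$' is how the delimiter may be typed in an editor UI.
    private static final char DISPLAY_DELIMITER = '$';

    private Holdings004Validator() {
    }

    /** Returns an error message, or an empty Optional when the 004 values pass all format checks. */
    static Optional<String> validate(List<String> values004) {
        if (values004.isEmpty()) {
            return Optional.of("004 field is missing");
        }
        if (values004.size() > 1) {
            return Optional.of("more than one 004 field");
        }
        String value = values004.get(0);
        if (value == null || value.isBlank()) {
            return Optional.of("004 field is blank");
        }
        if (value.indexOf(SUBFIELD_DELIMITER) >= 0 || value.indexOf(DISPLAY_DELIMITER) >= 0) {
            return Optional.of("004 field contains a subfield delimiter");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(validate(List.of("in00000000313"))); // Optional.empty
        System.out.println(validate(List.of()).isPresent());    // true
    }
}
```

Returning an Optional message instead of throwing keeps the check easy to plug into code that builds an error record from the message.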

Example

  • A Holdings record represents the location where one will find a title (referred to in FOLIO as an instance)
    • Example: Book Title Harry Potter is held at the Main Library - Dekin Wing
      • Harry Potter is a FOLIO instance record
      • Main Library - Dekin Wing is a Holdings record in FOLIO
  • Every MARC Holdings record must have only one 004 value
  • The 004 value is the HRID of the Instance record that the Holdings record is linked to
  • Having a valid Instance record HRID in the 004 field is the only way that a user can view the Holdings record on FOLIO (see below examples)
    • Without a valid Instance record HRID, the Holdings record is not discoverable via FOLIO and it is a meaningless record if an instance is not linked

Create MARC bib record

First, create a MARC bib record. To initiate record parsing, send a POST request to /change-manager/jobExecutions/{jobExecutionId}/records with a RawRecordsDto body, whose "initialRecords" field contains the list of raw records. The list can contain records in different formats ("MARC_RAW", "MARC_JSON", "MARC_XML").
{jobExecutionId} is the JobExecution id, which can be retrieved from the response of the previous request.

  • POST request to create a MARC bib record

    POST /change-manager/jobExecutions/{jobExecutionId}/records
    curl -w '\n' -X POST -D - \
    	-H "Content-type: application/json" \
    	-H "Accept: text/plain, application/json" \
    	-H "x-okapi-tenant: diku" \
    	-H "x-okapi-token: eyJhbGciOiJIUzI1NiJ9.eyJzdWIiOiJkaWt1X2FkbWluIiwidXNlcl9pZCI6IjQwZDFiZDcxLWVhN2QtNTk4Ny1iZTEwLTEyOGUzODJiZDMwNyIsImNhY2hlX2tleSI6IjMyYTJhNDQ3LWE4MzQtNDE1Ni1iYmZjLTk4YTEyZWVhNzliMyIsImlhdCI6MTU1NzkyMzI2NSwidGVuYW50IjoiZGlrdSJ9.AgPDmXIOsudFB_ugWYvJCdyqq-1AQpsRWLNt9EvzCy0" \
    -d @rawRecordsDto.json \
    https://folio-testing-okapi.dev.folio.org:443/change-manager/jobExecutions/647c2dee-70a8-4ae8-aba4-81579ee17e58/records
    

     

  • Example of rawRecordsDto.json for parsing MARC records in JSON format (each chunk needs its own unique UUID in the "id" field):

    json format
    {
    	"id": "22fafcc3-f582-493d-88b0-3c538480cd83",
    	"recordsMetadata": {
    		"last": false,
    		"counter": 1,
    		"total": 1,
    		"contentType":"MARC_JSON"
    	},
    	"initialRecords": [
    		{
    		"record": "{\"leader\": \"00648cam a2200193 a 4500\",\r\n \"fields\": [\r\n {\r\n \"001\": \"FOLIOstorage\"\r\n },\r\n {\r\n \"008\": 	\"960521s1972\\\\\\\\\\\\\\\\se\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\0\\\\\\\\\\\\swe\\\\\\\\\"\r\n },\r\n {\r\n \"041\": {\r\n \"ind1\": \"1\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"swe\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"096\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"y\": \"Z\"\r\n },\r\n {\r\n \"b\": \"TAp Chalmers tekniska h\u00F6gskola. Inst. f\u00F6r byggnadsstatik. Skrift. 1972:4\"\r\n },\r\n {\r\n \"s\": \"g\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"100\": {\r\n \"ind1\": \"1\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"Sahlin, Sven\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"245\": {\r\n \"ind1\": \"0\",\r\n \"ind2\": \"0\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"P\u00E5lslagning\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"260\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"c\": \"1972\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"300\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"19 bl.\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"440\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"Skrift, Chalmers tekniska h\u00F6gskola, Institutionen f\u00F6r byggnadsstatik\"\r\n },\r\n {\r\n \"x\": \"9903909802 ;\"\r\n },\r\n {\r\n \"v\": \"72:4\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"907\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \".b11154585\"\r\n },\r\n {\r\n \"b\": \"hbib \"\r\n },\r\n {\r\n \"c\": \"s\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"902\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"190206\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"998\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n 
\"b\": \"0\"\r\n },\r\n {\r\n \"c\": \"990511\"\r\n },\r\n {\r\n \"d\": \"m\"\r\n },\r\n {\r\n \"e\": \"b \"\r\n },\r\n {\r\n \"f\": \"s\"\r\n },\r\n {\r\n \"g\": \"0\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"909\": {\r\n \"ind1\": \"0\",\r\n \"ind2\": \"0\",\r\n \"subfields\": [\r\n {\r\n \"a\": \"m\"\r\n },\r\n {\r\n \"c\": \"a\"\r\n },\r\n {\r\n \"d\": \"b\"\r\n }\r\n ]\r\n }\r\n },\r\n {\r\n \"945\": {\r\n \"ind1\": \"\\\\\",\r\n \"ind2\": \"\\\\\",\r\n \"subfields\": [\r\n {\r\n \"l\": \"hbib3\"\r\n },\r\n {\r\n \"a\": \"TAp Chalmers tekniska h\u00F6gskola.Inst. f\u00F6r byggnadsstatik. Skrift 72:4\"\r\n }\r\n ]\r\n }\r\n }\r\n ]\r\n }"
    		}
    	]
    }
    
  • Alternatively, load the record via the UI using the file 1record.mrc

Create MARC Holdings record with valid 004 field

When a MARC Holdings record with a 004 field is created, the SRM module runs a verification that sends a request with the 004 value to SRS. If the 004 value is the HRID of the Instance record that the Holdings record is linked to, and that record exists in the database, the MARC Holdings record is saved without any error.

Example of a raw MARC Holdings record:

MARC Holdings with 004 field
00182cx a22000851 450000100090000000400080000900500170001700800330003485200290006710245123992837120170607135730.01706072u 8 4001uu 09011280 bfinehN7433.3i.B87 2014

During testing on Rancher, a MARC Holdings record with a valid 004 field was loaded by uploading the file one marc holdings.mrc. As a result, the record was found in SRS and the MARC Holdings record was saved correctly:

Create MARC Holdings record without/with invalid 004 field

When a MARC Holdings record is created, the SRM module runs a verification that sends a request with the 004 value to SRS. If the 004 field is missing, or its value is not the HRID of an Instance record that exists in the database, the MARC Holdings record is not saved and an error is returned.

Example of a raw MARC Holdings record:

MARC Holdings without 004 field
00162cx a22000731 45000010009000000050017000090080033000268520029000591024512320170607135730.01706072u 8 4001uu 09011280 bfinehN7433.3i.B87 2014
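The difference between the two raw records can be read from the record directory, which follows the leader and consists of 12-character entries (3-character tag, 4-character field length, 5-character start position). The sketch below lists the tags from a directory string. The class and method names are hypothetical, and the two directory strings are copied from the sample records in this document.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads the field tags out of a MARC (ISO 2709) record directory.
final class MarcDirectoryTags {
    // Each directory entry is 3 (tag) + 4 (field length) + 5 (start position) characters long.
    private static final int ENTRY_LENGTH = 12;

    private MarcDirectoryTags() {
    }

    /** Returns the tags in directory order; the directory must not include the leader or the field terminator. */
    static List<String> tags(String directory) {
        if (directory.length() % ENTRY_LENGTH != 0) {
            throw new IllegalArgumentException("Directory length must be a multiple of " + ENTRY_LENGTH);
        }
        List<String> tags = new ArrayList<>();
        for (int i = 0; i < directory.length(); i += ENTRY_LENGTH) {
            tags.add(directory.substring(i, i + 3));
        }
        return tags;
    }

    /** Counts how many times the given tag occurs in the directory. */
    static long count(String directory, String tag) {
        return tags(directory).stream().filter(tag::equals).count();
    }

    public static void main(String[] args) {
        String with004 = "001000900000004000800009005001700017008003300034852002900067";
        String without004 = "001000900000005001700009008003300026852002900059";
        System.out.println(tags(with004));             // [001, 004, 005, 008, 852]
        System.out.println(tags(without004));          // [001, 005, 008, 852]
        System.out.println(count(without004, "004"));  // 0
    }
}
```

A count of exactly 1 for tag 004 corresponds to the "always has a 004 value AND only one such value" requirement.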

During testing on Rancher, the file CornellFOLIOExemplars_Holdings.mrc was used to load MARC Holdings whose 004 field contains an HRID that does not exist in the database. As a result, an error appears in the console:

When MARC Holdings are imported with both existing and non-existing MARC Bib ids, the following log is produced:
 

Module changes

SRM

  • Change the logic of ChangeEngineServiceImpl by adding an SRS client that retrieves the MARC bib record by its 001 field.

    Example of logic
    // Checks the 004 field of a MARC Holdings record against the MARC Bib 001 values stored in SRS.
    // The record is marked as an error when 004 is blank or no matching MARC Bib is found.
    // NOTE: the SRS callback runs asynchronously, so the record may still be modified after this method returns.
    private void postProcessMarcHoldingsRecord(Record record, InitialRecord rawRecord, OkapiConnectionParams okapiParams) {
    	var controlFieldValue = getControlFieldValue(record, TAG_004);
    	if (isBlank(controlFieldValue)) {
    		markAsInvalid004(record, rawRecord);
    		return;
    	}
    	SourceStorageStreamClient sourceStorageStreamClient = getSourceStorageStreamClient(okapiParams);
    	MarcRecordSearchRequest marcRecordSearchRequest = new MarcRecordSearchRequest();
    	// The 004 value is concatenated into the search expression without escaping.
    	marcRecordSearchRequest.setFieldsSearchExpression("001.value = '" + controlFieldValue + "'");
    	try {
    		sourceStorageStreamClient.postSourceStorageStreamMarcRecordIdentifiers(marcRecordSearchRequest, asyncResult -> {
    			if (asyncResult.succeeded()) {
    				var body = asyncResult.result().body();
    				LOGGER.info("Response from SRS with MARC Bib 001 field: {} and body: {}", controlFieldValue, body);
    				var records = new JsonObject(body).getJsonArray("records");
    				// No matching MARC Bib in SRS means the 004 value is invalid.
    				if (records == null || records.isEmpty()) {
    					markAsInvalid004(record, rawRecord);
    				}
    			} else {
    				LOGGER.error("Error during POST request to SRS", asyncResult.cause());
    			}
    		});
    	} catch (Exception e) {
    		// Log the exception itself; e.getCause() may be null.
    		LOGGER.error("Error during POST request to SRS", e);
    	}
    }
    
    // Helper extracted from the duplicated error handling: replaces the parsed record with an error record.
    private void markAsInvalid004(Record record, InitialRecord rawRecord) {
    	LOGGER.error(HOLDINGS_004_TAG_ERROR_MESSAGE);
    	record.setParsedRecord(null);
    	record.setErrorRecord(new ErrorRecord()
    		.withContent(rawRecord)
    		.withDescription(new JsonObject().put("message", HOLDINGS_004_TAG_ERROR_MESSAGE).encode()));
    }
    
    // Builds an SRS client from the Okapi connection parameters of the current request.
    private SourceStorageStreamClient getSourceStorageStreamClient(OkapiConnectionParams okapiParams) {
    	var token = okapiParams.getToken();
    	var okapiUrl = okapiParams.getOkapiUrl();
    	var tenantId = okapiParams.getTenantId();
    	return new SourceStorageStreamClient(okapiUrl, tenantId, token);
    }
  • Write tests to cover the new logic.

SRS

  • Implement a new endpoint for retrieving invalid MARC bib ids
    • Create a separate endpoint for searching for invalid MARC bib ids
    • Create a new DTO for the response
    • Extend the RAML file with the new endpoint
    • Create the service and DAO layers
    • Write tests for the new functionality
    • Query for retrieving invalid MARC bib ids from the database

      Query for retrieving invalid MARC bib ids
      SELECT marc.hrid FROM 
      (SELECT unnest(ARRAY['222222222222','in00000000313','111111111111','in00000000316']) as hrid) as marc
      LEFT JOIN diku_mod_source_record_storage.records_lb lb
      ON (lb.instance_hrid = marc.hrid and lb.record_type = 'MARC_BIB')
      WHERE lb.instance_hrid IS NULL
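The query above is an anti-join: it keeps only those candidate HRIDs that have no matching MARC_BIB row in records_lb. The same semantics can be sketched in memory as follows. The class name is hypothetical, and the set of existing HRIDs used in main is made-up sample data, not the real content of any database.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical in-memory illustration of the anti-join performed by the SQL query.
final class MissingHridFinder {
    private MissingHridFinder() {
    }

    /** Returns the candidate HRIDs that are absent from the set of existing instance HRIDs, in input order. */
    static List<String> findMissing(List<String> candidateHrids, Set<String> existingInstanceHrids) {
        return candidateHrids.stream()
            .filter(hrid -> !existingInstanceHrids.contains(hrid))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> candidates = List.of("222222222222", "in00000000313", "111111111111", "in00000000316");
        // Assumption for the example: only these two HRIDs exist as MARC_BIB records.
        Set<String> existing = Set.of("in00000000313", "in00000000316");
        System.out.println(findMissing(candidates, existing)); // [222222222222, 111111111111]
    }
}
```

Doing the comparison in the database, as the query does, avoids loading all existing HRIDs into memory; the in-memory version is only meant to show which values the endpoint would report as invalid.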

Testing process

The changes should be tested on the Rancher environment.


Problems

The main problem found during the investigation:

  • Suppose a MARC Bib is loaded with 001 field 366832. After the MARC Bib is loaded, the original 001 value is moved to 035 and 001 is replaced with a newly generated HRID, for example in00000012415. A MARC Holdings record is then loaded with 004 field 366832. In this case, the proposed approach searches SRS for the value 366832 and does not find the MARC Bib, because it was saved under the new HRID in00000012415. As a result, the MARC Holdings record is not loaded.
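The mismatch described above can be reproduced with a minimal in-memory model. This sketch is only an illustration: the StoredBib class and the findBy001 method are hypothetical stand-ins for the stored MARC Bib and the SRS search by 001, and the values are the ones from the example above.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model of the HRID replacement problem; not SRS code.
final class HridMismatchDemo {
    /** Simplified stored MARC Bib: the current 001 value and the original 001 value kept in 035. */
    static final class StoredBib {
        final String field001;
        final String field035;

        StoredBib(String field001, String field035) {
            this.field001 = field001;
            this.field035 = field035;
        }
    }

    private HridMismatchDemo() {
    }

    /** Mirrors the proposed lookup: match the Holdings 004 value against the 001 field only. */
    static Optional<StoredBib> findBy001(List<StoredBib> stored, String value004) {
        return stored.stream()
            .filter(bib -> bib.field001.equals(value004))
            .findFirst();
    }

    public static void main(String[] args) {
        // After loading, 001 holds the generated HRID and the original 001 value sits in 035.
        List<StoredBib> stored = List.of(new StoredBib("in00000012415", "366832"));
        System.out.println(findBy001(stored, "366832").isPresent());        // false
        System.out.println(findBy001(stored, "in00000012415").isPresent()); // true
    }
}
```

The lookup succeeds only when the Holdings 004 field carries the generated HRID rather than the original 001 value.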

Questions

  • Question: Which status should be returned after data import processes MARC Holdings with an invalid 004 field: Complete with errors or Failed?
    Answer (from Khalilah): Failed
  • Question: What if a MARC Holdings file contains both valid and invalid 004 fields? For example, a file has 3 records, one of which is correct and the other two are not. Will only the one record be saved, and which status will be returned: Complete with errors or Failed?
    Answer (from Khalilah): Completed with errors. One record will be saved into the database; the other two will not.

Data import must support a similar requirement today: it supports creating/updating Holdings records with source = FOLIO. We need to find out what currently links a Holdings record to an Instance record and/or MARC bib record, and whether any validation is in place.

Stories

  • MODSOURMAN-544: Validate MARC Holdings 004 field from MARC Bib HRID (high-level estimation: 5 story points)
  • MODSOURCE-351: Endpoint to verify invalid MARC Bib ids in the system (high-level estimation: 3 story points)