Notes on scalability of mod-data-import

MODDATAIMP-706


For general information on mod-data-import functionality, please refer to the module's documentation.

mod-data-import is responsible for uploading files and initiating their processing (which takes place in other modules). To start processing an uploaded file, a user chooses a Job Profile, which is essentially a list of instructions for what to do with the records in the file. However, choosing the Job Profile is separate from uploading the file and reaches the backend as a separate request.


In the current implementation of the module, an uploaded file is stored in LOCAL_STORAGE (the file system of the module instance). This has two implications:

  1. The request to initiate processing of a file can be handled only by the instance of the module that stores the uploaded file. Therefore, a mod-data-import deployment that uses only local storage cannot be scaled horizontally.
  2. The size of a file that can be uploaded is limited by the Java heap memory allocated to the module. The Java heap size should be set to the expected maximum file size plus 10%.
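The heap sizing rule in item 2 can be worked through as simple arithmetic. The following is a minimal sketch, not part of the module: the function names are hypothetical, sizes are taken in whole MiB, and the result is rounded up. The only external detail used is the standard JVM `-Xmx` option for the maximum heap size.

```python
HEADROOM_PERCENT = 10  # the "plus 10%" from the sizing rule


def required_heap_mib(max_file_mib: int) -> int:
    """Return the expected max file size plus 10%, rounded up to whole MiB."""
    if max_file_mib < 0:
        raise ValueError("file size must not be negative")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (max_file_mib * (100 + HEADROOM_PERCENT) + 99) // 100


def xmx_flag(max_file_mib: int) -> str:
    """Format the result as a JVM maximum heap option (hypothetical helper)."""
    return f"-Xmx{required_heap_mib(max_file_mib)}m"


if __name__ == "__main__":
    print(xmx_flag(500))   # a 500 MiB file needs 550 MiB of heap
    print(xmx_flag(1024))  # 1126.4 MiB is rounded up to 1127 MiB
```

The figure only applies the rule as stated; it does not account for any other memory the module may need.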

These issues can be resolved by configuring external storage. mod-data-import reads the following settings from mod-configuration (and uses defaults if they are not set):

  • data.import.storage.type - type of data storage used for uploaded files. The default value is LOCAL_STORAGE.
  • data.import.storage.path - path where uploaded files will be stored.
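The fallback behaviour for these two settings can be sketched as follows. This is an illustration only and not the module's code: `effective_settings` is a hypothetical helper, the `configured` dictionary stands in for whatever mod-configuration returns, and the default path shown is a placeholder, since this page does not state the real default.

```python
DEFAULTS = {
    # Default stated in the list above.
    "data.import.storage.type": "LOCAL_STORAGE",
    # Placeholder only: the real default path is not given on this page.
    "data.import.storage.path": "./storage/upload",
}


def effective_settings(configured: dict) -> dict:
    """Overlay configured values on the defaults, ignoring missing or empty ones."""
    settings = dict(DEFAULTS)
    for key in DEFAULTS:
        value = configured.get(key)
        if value:
            settings[key] = value
    return settings


if __name__ == "__main__":
    # Nothing configured: both defaults apply.
    print(effective_settings({}))
    # Only the path configured: the storage type falls back to its default.
    print(effective_settings({"data.import.storage.path": "/mnt/data-import"}))
```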

To allow deployment of multiple instances, the same persistent volume must be mounted on every instance at the mount point defined by the value of the data.import.storage.path property.
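Why a shared volume removes the single-instance restriction can be shown with a small simulation. This is a sketch under stated assumptions: the `Instance` class is a hypothetical stand-in for a running copy of the module, temporary directories stand in for mounted volumes, and the file name is made up.

```python
import tempfile
from pathlib import Path


class Instance:
    """Hypothetical stand-in for one running copy of the module."""

    def __init__(self, storage_path: str):
        self.storage_path = Path(storage_path)

    def upload(self, file_name: str, content: bytes) -> str:
        # Store the file under this instance's storage path.
        (self.storage_path / file_name).write_bytes(content)
        return file_name

    def start_processing(self, file_name: str) -> bytes:
        # Succeeds only if this instance can see the uploaded file.
        return (self.storage_path / file_name).read_bytes()


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as shared, \
            tempfile.TemporaryDirectory() as private:
        a = Instance(shared)   # receives the upload request
        b = Instance(shared)   # same volume mounted at the same path
        c = Instance(private)  # local storage only
        name = a.upload("records.mrc", b"data")
        shared_ok = b.start_processing(name) == b"data"
        try:
            c.start_processing(name)
            local_ok = True
        except FileNotFoundError:
            local_ok = False
        return shared_ok, local_ok


if __name__ == "__main__":
    print(demo())
```

Instance `b` can serve the processing request for a file uploaded through `a` because both resolve the same path on the same volume, while `c` cannot find the file.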

Related issues:

ARCH-19


MODDATAIMP-392