NOTICE
This decision has been migrated to the Technical Council's Decision Log as part of a consolidation effort. See: DR-000002 - Tenant Id and Module Name Restrictions
Truncation problem
PostgreSQL silently truncates identifiers after 63 bytes.
The PostgreSQL schema name is <tenant id>_<module name>. Truncation may result in the same schema name, for example for mod-inventory and mod-inventory-storage.
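The clash can be reproduced without a database. The following Python sketch mimics the 63-byte truncation. The helper names and the 49-character tenant id are made up for illustration, and hyphens are mapped to underscores in the same way the modules build the schema name.

```python
# Illustrative sketch: mimic PostgreSQL's 63-byte identifier truncation.
PG_MAX_IDENTIFIER_BYTES = 63  # NAMEDATALEN - 1 in a default PostgreSQL build


def schema_name(tenant_id: str, module_name: str) -> str:
    """Build <tenant id>_<module name> with hyphens turned into underscores."""
    return f"{tenant_id}_{module_name}".replace("-", "_")


def truncate_identifier(identifier: str) -> str:
    """Cut the identifier after 63 bytes (the names used here are ASCII-only)."""
    raw = identifier.encode("utf-8")[:PG_MAX_IDENTIFIER_BYTES]
    return raw.decode("utf-8", errors="ignore")


# Hypothetical 49-character tenant id, chosen so that the limit is reached.
tenant = "t" * 49
short = truncate_identifier(schema_name(tenant, "mod-inventory"))
long = truncate_identifier(schema_name(tenant, "mod-inventory-storage"))
print(short == long)  # both modules end up with the same 63-byte schema name
```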
Name clash problem
The PostgreSQL schema name is <tenant id>_<module name> with minus/hyphen converted to underscore.
Tenant id foo and module name bar-baz result in schema name foo_bar_baz.
Tenant id foo-bar and module name baz result in the same schema name foo_bar_baz, resulting in a name clash.
Module name uniqueness problem
An unquoted PostgreSQL schema name is case insensitive, and the module converts minus/hyphen to underscore because minus/hyphen is not allowed in an unquoted PostgreSQL schema name.
However, Okapi's module name is case sensitive and may contain lower and upper case letters. And it allows both minus/hyphen and underscore.
The module names mod-foo and Mod_Foo result in the same PostgreSQL schema name when created for the same tenant.
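A minimal Python sketch of how such collisions could be detected for a list of module names. The function names and the tenant id diku are illustrative assumptions, not part of Okapi. Lower-casing models PostgreSQL's folding of unquoted identifiers.

```python
from collections import defaultdict


def schema_name(tenant_id: str, module_name: str) -> str:
    """Hyphens become underscores; unquoted identifiers fold to lower case."""
    return f"{tenant_id}_{module_name}".replace("-", "_").lower()


def find_collisions(tenant_id: str, module_names: list[str]) -> dict[str, list[str]]:
    """Return every schema name that more than one module name maps to."""
    by_schema = defaultdict(list)
    for name in module_names:
        by_schema[schema_name(tenant_id, name)].append(name)
    return {schema: names for schema, names in by_schema.items() if len(names) > 1}


collisions = find_collisions("diku", ["mod-foo", "Mod_Foo", "mod-bar"])
print(collisions)  # {'diku_mod_foo': ['mod-foo', 'Mod_Foo']}
```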
Reserved key word problem
PostgreSQL has a few reserved key words with underscore:
CURRENT_CATALOG
CURRENT_DATE
CURRENT_ROLE
CURRENT_TIME
CURRENT_TIMESTAMP
CURRENT_USER
SESSION_USER
Trying to use them as a schema name results in a syntax error.
Example: If a module has the name user, then enabling it for tenant id current or session will fail.
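The check can be sketched in Python as follows. The set contains only the key words listed above, and the function name is a made-up helper for illustration.

```python
# Key words with underscore taken from the list above (not a complete
# list of all PostgreSQL reserved key words).
RESERVED_WITH_UNDERSCORE = {
    "CURRENT_CATALOG",
    "CURRENT_DATE",
    "CURRENT_ROLE",
    "CURRENT_TIME",
    "CURRENT_TIMESTAMP",
    "CURRENT_USER",
    "SESSION_USER",
}


def is_reserved_schema_name(tenant_id: str, module_name: str) -> bool:
    """True if <tenant id>_<module name> equals one of the key words above."""
    candidate = f"{tenant_id}_{module_name}".replace("-", "_").upper()
    return candidate in RESERVED_WITH_UNDERSCORE


print(is_reserved_schema_name("current", "user"))  # True
print(is_reserved_schema_name("session", "user"))  # True
print(is_reserved_schema_name("diku", "user"))     # False
```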
Kubernetes and AWS ECS label problem
Kubernetes label names must follow the DNS label standard:
- contain at most 63 characters
- contain only lowercase alphanumeric characters or '-'
- start with an alphabetic character
- end with an alphanumeric character
Labels like mod-inventory-storage-23-0-2 are commonly used and may exceed the maximum length if a long module name is combined with a long version number.
AWS ECS has these restrictions:
- Service Name: Up to 255 letters (uppercase and lowercase), numbers, hyphens, and underscores are allowed.
- TargetGroup Name: A maximum of 32 alphanumeric characters including hyphens are allowed, but the name must not begin or end with a hyphen.
The existing module name mod-data-import-converter-storage already exceeds this 32 character limit. Sysops have assigned a special TargetGroup Name for this module; such a workaround should be avoided.
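Both sets of rules can be expressed as regular expressions. The following Python sketch encodes the Kubernetes and TargetGroup rules exactly as listed above. The function names are illustrative, and module_label only assumes the label pattern shown in the example (module name, then the version with dots replaced by hyphens).

```python
import re

# At most 63 characters, lowercase alphanumerics or '-',
# starts with a letter, ends with an alphanumeric character.
K8S_DNS_LABEL = re.compile(r"[a-z]([a-z0-9-]{0,61}[a-z0-9])?")

# At most 32 alphanumeric characters or hyphens,
# must not begin or end with a hyphen.
ECS_TARGET_GROUP_NAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9-]{0,30}[A-Za-z0-9])?")


def module_label(module_name: str, version: str) -> str:
    """Build a label such as mod-inventory-storage-23-0-2."""
    return f"{module_name}-{version}".replace(".", "-")


def is_valid_k8s_label(label: str) -> bool:
    return K8S_DNS_LABEL.fullmatch(label) is not None


def is_valid_target_group_name(name: str) -> bool:
    return ECS_TARGET_GROUP_NAME.fullmatch(name) is not None


label = module_label("mod-inventory-storage", "23.0.2")
print(label, is_valid_k8s_label(label))  # mod-inventory-storage-23-0-2 True
print(is_valid_target_group_name("mod-data-import-converter-storage"))  # False
```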
Other restrictions
The problems stated above need to be solved.
More restrictions exist that are already enforced:
- Tenant id must not begin with a digit because PostgreSQL schema names must not begin with a digit. This is already enforced by PostgreSQL.
- Okapi restricts tenant id letters to be lower case and a-z. Accented letters and unicode characters are not allowed. (Okapi's regexp)
- Okapi doesn't allow a module name to contain a minus/hyphen followed by a digit because this starts the version suffix (Okapi's ModuleId parsing).
Solution
Proposed solution:
Limit tenant id to 31 bytes. Disallow underscore in tenant id. Regexp: [a-z][a-z0-9]{0,30}
Restrictions for back-end modules:
- Module name can contain only lowercase letters, digits and minus. Start with a letter. Disallow minus followed by a digit or a minus.
- Limit module name to 31 bytes. Disallow uppercase letters. Disallow underscore.
- Regexp:
[a-z]([a-z0-9]|-(?=[a-z])){0,30}
- Disallow these module names: catalog, date, role, time, timestamp, user
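The proposed restrictions can be sketched as a small Python validator. The two regular expressions and the disallowed names are taken from the proposal above. The function names are illustrative and not an existing Okapi API.

```python
import re

TENANT_ID = re.compile(r"[a-z][a-z0-9]{0,30}")
MODULE_NAME = re.compile(r"[a-z]([a-z0-9]|-(?=[a-z])){0,30}")
DISALLOWED_MODULE_NAMES = {"catalog", "date", "role", "time", "timestamp", "user"}


def is_valid_tenant_id(tenant_id: str) -> bool:
    """Lowercase letter first, then lowercase letters or digits, 31 bytes max."""
    return TENANT_ID.fullmatch(tenant_id) is not None


def is_valid_module_name(module_name: str) -> bool:
    """Apply the regexp, then reject the disallowed names."""
    if MODULE_NAME.fullmatch(module_name) is None:
        return False
    return module_name not in DISALLOWED_MODULE_NAMES


print(is_valid_tenant_id("foo_bar"))                              # False
print(is_valid_module_name("mod-inventory-storage"))              # True
print(is_valid_module_name("mod-data-import-converter-storage"))  # False
```

With both limits at 31 bytes, the longest possible schema name is 31 + 1 + 31 = 63 bytes, which fits within PostgreSQL's identifier limit. Because the lookahead requires a letter after each minus, the regexp also rejects a module name that ends with a minus.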
Migration
Some FOLIO installations use tenant ids with underscores, which are no longer allowed.
One back-end module fails the new module name restrictions:
- mod-data-import-converter-storage is 33 bytes long exceeding the length limit of 31 bytes.
Tenant name migration
To reduce the downtime in multi-tenant installations it must be possible to migrate one tenant at a time.
Okapi and the modules should provide APIs and/or scripts to do the migration.
mod-data-import-converter-storage rename
The shorter name for this repository and module will be mod-di-converter-storage with only 24 bytes (MODDICONV-259). An earlier suggestion was mod-data-import-conv-storage with 28 bytes.
Follow guide https://dev.folio.org/guides/rename-module/
When the renamed module executes the tenant upgrade it checks whether the schema name with the old module name still exists. If yes it is renamed, the PostgreSQL ROLE is renamed and a new ROLE password is assigned if needed.
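A rough Python sketch of the statements such an upgrade might issue. It assumes that the role name equals the schema name, which is an assumption and not true for every module. The function names and the tenant id diku are illustrative. The existence check and the password step are left out. One case where a new password is needed: PostgreSQL clears an MD5-encrypted password when the role is renamed.

```python
def schema_name(tenant_id: str, module_name: str) -> str:
    """Build <tenant id>_<module name> with hyphens turned into underscores."""
    return f"{tenant_id}_{module_name}".replace("-", "_")


def rename_statements(tenant_id: str, old_module: str, new_module: str) -> list[str]:
    """SQL to run only if the schema with the old module name still exists."""
    old = schema_name(tenant_id, old_module)
    new = schema_name(tenant_id, new_module)
    return [
        f"ALTER SCHEMA {old} RENAME TO {new}",
        # Assumption: the role carries the same name as the schema.
        f"ALTER ROLE {old} RENAME TO {new}",
    ]


statements = rename_statements(
    "diku", "mod-data-import-converter-storage", "mod-di-converter-storage"
)
for statement in statements:
    print(statement)
```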
13 Comments
Jason Root
I have hit the character limit on labels in Rancher/K8s already with mod-data-import-converter-storage and the length of my namespaces. Glad this is being addressed.
Jason Root
My thoughts/comments from Slack...
Re: the “mod-data-import-converter-storage rename” - will this change correspond to some Folio flower release?
At Tamu Libraries we don’t currently use underscores in our tenant IDs, and if the module’s repo is renamed, and the module name in the okapi-install.json/install.json is also renamed, should be good?
My K8s deployment script generates the module’s discovery entries for Okapi using the okapi-install.json file by the name of the module, what version it is, and what port is listed in the deployment descriptor section of the module descriptor - and just adds the dashes where appropriate.
I.e. mod-data-import-converter-storage-1.11.4 becomes http://mod-data-import-converter-storage-1-11-4:port
Since our tenant id is short, and our namespaces short, I’ve not hit the character limit yet in K8s. But I almost did in my testing with this hugely long, offensive module name. I imagine this is probably of great concern to hosting providers running multiple tenants in K8s.
mod-data-import-conv-stge would be even better. Why not rename all modules with storage in their name to stge?
David Crossley
Please do not generally encourage people to "rename a repository", especially just by following GitHub instructions. There are many ramifications when a module repository must be renamed if the module is in active use. The guidelines document for Naming conventions encourages people to consider this very carefully. That document links to another document for the various steps to follow when a repository must be renamed (including retaining the old repository).
David Crossley
I gather that it is possible for the module name to be different to the repository name. However normally they are the same. Being different could also have some other effects. So the full renaming procedure would be better.
Mike Taylor
I don't really agree with David here. Renaming a module is one big task; renaming a repository is another. Doing either one in no way necessitates doing the other, and we all need to be careful not to make unnecessary additional work for ourselves and (equally important!) each other.
Wayne Schneider
FWIW, I think David Crossley's (and my) concern has to do with CI tooling, which depends on particular conventions being followed.
Mike Taylor
I would like to know what those conventions are. To me it seems wrong-headed to impose constraints like this, and I would like to see them loosened.
Wayne Schneider
This is obviously a side issue that is worth discussing, perhaps just not here. Guidelines are actually fairly well documented in several documents maintained by David Crossley, e.g.:
Asking for constraints to be loosened is reasonable, recognizing that generalizing tooling that was developed with specific assumptions can be a larger effort than it may seem to someone not involved in maintaining the tools.
Wayne Schneider
I have some concerns about attempting to provide a tenant ID migration facility in Okapi, for a few reasons.
- ... skipSchemaRename option (query parameter?)
- ... tenant_mod_whatever naming convention, but other modules (mod-agreements, "Springway"-based modules) create only a schema and not a role.
I don't think Okapi is well-positioned to successfully execute this kind of migration. It may be able to orchestrate a migration if the modules provide a system interface (e.g. the _tenant interface) to manage their own migration (recognizing that this approach is considerably more work and harder to maintain).
Jason Root
Howdy all,
Apologies I was not trying to "encourage" anything, just asking what the process might look like from a timeline perspective, and if it's going to be done - to give some more headroom for bumping up against character limits.
Perhaps a better option might be some sort of validator tool then? If we do not want to be very prescriptive to the operator (but still have standards, as Wayne Schneider provided the links to) and have "loose" restrictions, we could ask (Okapi?) "Hey I want to validate my naming conventions for A,B,C (tenant id/module name/schema name) against X,Y,Z (K8s labels/reserved names/schema requirements)."
To hit on some of the points Wayne mentions re: Okapi tenant migration tool...
At Tamu for Dev and Test Folio envs we do use entirely separate (containerized) Postgres DB instances for Okapi. For Pre and Prod we use the same instance of (VM) Postgres, but a different DB within that. I agree that Okapi currently has no way of knowing anything about what is stored in my module database... Seems like a large ask.
One last way this could be mitigated, albeit through what lies between the chair and keyboard (we all know how reliable that tool is), is to have a giant red blurb in Okapi's readme documentation regarding tenant id/module/schema naming conventions that says "DON'T DO THIS!" Then make sure that is available/pointed out to operators and developers of Folio.
Marc Johnson
Does this mean that you (or other system operators) have tenants in production (or production like environments) today with tenant IDs longer than 31 characters?
Jason Root
Marc Johnson, that specific comment was more about the naming conventions for modules in an environment enabled for a tenant, and less about the tenant id itself. Currently, my tenant ids are all short, with no special characters.
Marc Johnson
Jason Root Thank you for clarifying that for me