The consensus for handling multilingual metadata fields seems to be to follow the way ckanext-fluent handles them.
For custom fields this looks uncontroversial:
{
...
"favourite_fruit": {
"en": "Peach",
"ca": "Préssec",
"es": "Melocotón"
}
...
}
There is an issue, though, with using this approach on existing core fields like title or notes. Up until this point these have always had a single string value, and changing them to return a dict on some occasions would break things, both in core (e.g. in templates) and in extensions.
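To make the breakage concrete, here is a minimal sketch in plain Python. The `render_heading` function is hypothetical and only stands in for any template helper or extension code that assumes `title` is a string; it is not taken from CKAN itself.

```python
def render_heading(dataset: dict) -> str:
    # Hypothetical consumer code that assumes `title` is a plain string.
    return dataset["title"].strip().upper()


plain = {"title": "Annual Budget"}
multilingual = {"title": {"en": "Annual Budget", "es": "Presupuesto Anual"}}

# Works as long as `title` is a string.
heading = render_heading(plain)

try:
    render_heading(multilingual)
    broke = False
except AttributeError:
    # A dict has no .strip(), so the unchanged consumer fails.
    broke = True
```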
-
To deal with this in the current API version, I'm proposing that we add a separate field for each of the core fields that need to be translated:
{
...
"title_lang": {
"en": "Annual Budget",
"ca": "Pressupost Anual",
"es": "Presupuesto Anual"
}
...
}
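As a rough sketch of the idea (not how ckanext-fluent actually implements it), an extension could attach the translations under the suffixed key and leave the original string alone. The helper name `add_translated_field`, the `LANG_SUFFIX` constant and the `default_locale` parameter are all made up for illustration.

```python
from typing import Dict, Mapping

# Hypothetical suffix; the actual name is still open (_lang, _trans, _i18n ...).
LANG_SUFFIX = "_lang"


def add_translated_field(
    record: Mapping[str, object],
    field: str,
    translations: Mapping[str, str],
    default_locale: str = "en",
) -> Dict[str, object]:
    """Return a copy of `record` with `<field>_lang` added.

    The original `field` stays a plain string, so existing templates
    and extensions that read it keep working.
    """
    out = dict(record)
    out[field + LANG_SUFFIX] = dict(translations)
    # Only fill in the plain string if it is missing; never replace it with a dict.
    if field not in out and default_locale in translations:
        out[field] = translations[default_locale]
    return out


dataset = {"name": "annual-budget", "title": "Annual Budget"}
result = add_translated_field(
    dataset,
    "title",
    {"en": "Annual Budget", "ca": "Pressupost Anual", "es": "Presupuesto Anual"},
)
```

Here `result["title"]` is still the string existing consumers expect, while multilingual-aware clients can read `result["title_lang"]`.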
I have no strong preference for what the _lang suffix is actually called (_trans, _i18n ...).
The list of fields that would need these is:
- Dataset title
- Dataset notes
- Resource name
- Resource description
- Resource format
- Tag name (these are likely to be handled separately)
Just to clarify, for now it would be ckanext-fluent (or any other extension) that creates these fields. No changes would be made to CKAN core.
-
Moving forward, on a new API version, we can decide whether the same pattern can be applied directly to core fields as well:
{
...
"title": {
"en": "Annual Budget",
"ca": "Pressupost Anual",
"es": "Presupuesto Anual"
}
...
}
The main issue I see is what happens with those CKAN instances (the majority) that don't handle multilingual metadata. Do we allow both string and dict values on the same field? (That doesn't sound like a good idea tbh.) Or do we always enforce a language key, even when there is only one, the instance's default locale? E.g.:
{
...
"title": {
"en": "Annual Budget"
}
...
}
or
{
...
"title": {
"es": "Presupuesto Anual"
}
...
}
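If we went with always enforcing a language key, clients would need some way to cope with both shapes during a transition. Below is a minimal sketch of that, in plain Python with no CKAN APIs. The function names (`normalize_translated`, `pick_language`) and the fallback order are my own assumptions, not something that has been agreed on.

```python
from typing import Dict, Union

Translated = Union[str, Dict[str, str]]


def normalize_translated(value: Translated, default_locale: str) -> Dict[str, str]:
    """Coerce a string-or-dict field value into a dict keyed by language."""
    if isinstance(value, str):
        # Monolingual instance: wrap the string under the default locale.
        return {default_locale: value}
    if isinstance(value, dict):
        return dict(value)
    raise TypeError(f"expected str or dict, got {type(value).__name__}")


def pick_language(value: Translated, lang: str, default_locale: str) -> str:
    """Pick the requested language, falling back to the default locale."""
    translations = normalize_translated(value, default_locale)
    if lang in translations:
        return translations[lang]
    if default_locale in translations:
        return translations[default_locale]
    # Last resort (an assumption): any available translation, else empty string.
    return next(iter(translations.values()), "")


wrapped = normalize_translated("Presupuesto Anual", default_locale="es")
title_ca = pick_language(
    {"en": "Annual Budget", "ca": "Pressupost Anual"}, "ca", default_locale="en"
)
title_fallback = pick_language({"en": "Annual Budget"}, "es", default_locale="en")
```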