[FEEDBACK] Interoperability concerns and normative-optional features #978
Comments
I don't really see what the user benefit might be of needing to explicitly opt-in to extensions that are supported by the formatter. Isn't that information equally well conveyed in the message source? In your code example above, your constructor call is missing the second required argument, i.e. the message source. Recall that that's provided initially to the constructor, and so its information (e.g. whether it includes a
Opting in to the extensions makes it clearer which messages your formatter is capable of processing. Enabling
I don't get it? I don't see a lot of benefit to making developers write into their code what optional-to-implement functionality is "enabled". Bear in mind that we would like to see user-defined and implementation-defined functions (ICU has a whole host of formatters/selectors that should be enabled in messages, for example, using the

While total message portability would be great, we envision wide adoption by many languages, some of which will be seriously constrained. The REQUIRED default functions are necessarily minimal, as a result. MF2 would be kind of a failure if we were limited to those functions, since the ability to do inflection, person name formatting, duration handling, list management, and so forth might never appear in the (insert environment name here, e.g.

Yes, this means that there will be "dialects" of MF2. But they should all work with translation tooling in a consistent way. Developing a machine-readable function description was a deliverable that we dropped along the way, but which could be resurrected in the post-47 world.

Full message portability isn't that common anyway. Your JS code doesn't typically use the same resource files as your Android app. The syntactical and conceptual consistency between frameworks/platforms is an important feature and the ability to port messages that use the most common features is important (it's the reason we have strong guidance to do these functions in a specific way). But there is some flexibility inherent in the design and that flexibility is a feature.
I wouldn't find it as elegant, but an alternative that aligns with my problem statement would be to basically have a MessageFormat Validator, like the W3C HTML Validator, that checks your syntax and returns which version / optional extensions are being used. There can be an ICU4X Profile that fails validation of messages that use optional extensions not implemented in ICU4X. Then if clients want to use ICU4X, they can validate that their messages are conformant ahead of time.
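A rough sketch of how such a profile-based validator might behave, in Python. Everything named here is hypothetical: the feature names, the `Profile` type, and the crude regex detection (a real validator would work on a parsed message, not on patterns over the source text). Which features count as optional is also only assumed for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical feature names mapped to crude detection patterns.
# MF2 does not define these names; a real tool would inspect a parsed message.
OPTIONAL_FEATURE_PATTERNS = {
    "datetime-calendar": re.compile(r":(?:datetime|date|time)\b[^}]*\bcalendar="),
    "unit-function": re.compile(r":unit\b"),
}


@dataclass(frozen=True)
class Profile:
    """A named set of optional features that one implementation supports."""
    name: str
    supported: frozenset


def features_used(message: str) -> set:
    """Return the names of the optional features detected in a message source."""
    return {
        name
        for name, pattern in OPTIONAL_FEATURE_PATTERNS.items()
        if pattern.search(message)
    }


def validate(message: str, profile: Profile) -> list:
    """Return the features a message uses that the profile does not support."""
    return sorted(features_used(message) - profile.supported)


# Hypothetical profiles; the feature sets are made up for this example.
minimal_profile = Profile("minimal", frozenset())
full_profile = Profile("full", frozenset(OPTIONAL_FEATURE_PATTERNS))
```

With these definitions, `validate("Today is {$d :datetime calendar=gregory}", minimal_profile)` reports the calendar feature as unsupported, while the same message passes against `full_profile`.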
@sffc Do you mean including such a validator somehow in the spec, or just having it exist in general, possibly even outside Unicode? I'm in the middle of refactoring Mozilla's Pontoon translation platform to internally store all source and target messages using an MF2-ish representation, and using that representation in the API between its server and client. As MF2 is more powerful than most of our supported formats, that'll require server-side validation that each new translation isn't using features that its source format doesn't support. That sounds an awful lot like the validator you're talking about? The Python implementation work for that will be in moz.l10n, so it won't be limited to just Pontoon's use cases.
Validator doesn't need to be in the spec, but it is important for it to exist and be utilized. Another thought: the message bundle format could have a header listing which optional features are required to consume the message bundle. Then the validator just needs to make sure the messages are consistent with the header, and ICU/ICU4X can check the header before parsing messages.
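To make the header idea concrete, here is a minimal Python sketch. The `Bundle` shape, the `requires` field, and the feature name are all assumptions made for illustration; no such bundle format or header is defined by MF2.

```python
from dataclasses import dataclass


class UnsupportedFeatures(Exception):
    """Raised when a bundle declares features the consumer does not implement."""


@dataclass(frozen=True)
class Bundle:
    requires: frozenset  # hypothetical header: optional features the bundle needs
    messages: dict       # message id -> MF2 source text


def load_bundle(bundle: Bundle, supported: frozenset) -> dict:
    """Check the header first; hand back messages only if every feature is supported."""
    missing = sorted(bundle.requires - supported)
    if missing:
        raise UnsupportedFeatures(f"bundle requires unsupported features: {missing}")
    return bundle.messages


# A bundle that declares one (made-up) optional feature in its header.
bundle = Bundle(
    requires=frozenset({"datetime-calendar"}),
    messages={"today": "Today is {$d :datetime calendar=gregory}"},
)
```

A consumer that supports `datetime-calendar` gets the messages back; one that does not fails at load time, before any message is parsed.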
I guess the main theme through these various solutions is that there should ideally be a naming scheme for optional features. We can't talk about optional features unless they have names, we can't list them in a message bundle header, we can't list them in an API, and we can't configure them in a validator. That naming scheme should, I think, live in MF2. Many of my concerns would be alleviated if, as an implementer, I could see a single centralized list of "these are the optional features, this is what they are called, and this is where you can read more about them."
We could use one of the 25 reserved single-letter namespaces for that, without inventing any new syntax. Say o:calendar, for example?
And an optional value could also have an o:... prefix
I went through the spec, and the following are the potentially testable OPTIONAL or RECOMMENDED features for implementations and function handlers that I could find:

Options for Implementations
- Implementations are not required to evaluate all parts of a message when parsing, processing, or formatting. In particular, an implementation MAY choose not to evaluate or resolve the value of a given expression until it is actually used by a selection or formatting process.
- Implementations MAY include additional fields in their formatting context.
- An implementation MAY perform additional processing when resolving the value of an expression that consists only of a variable.
- An implementation MAY pass additional arguments to the function handler, as long as reasonable precautions are taken to keep the function interface simple and minimal, and avoid introducing potential security vulnerabilities.
- An implementation MAY define its own functions and their handlers.
- An implementation MAY allow custom functions to be defined by users.
- Implementation-defined functions SHOULD use an implementation-defined namespace.
- Implementations MAY provide a mechanism for the function handler to provide additional detail about internal failures.
- Errors MAY be emitted [during option resolution], but such errors MUST NOT be fatal.
- When a message contains more than one error, or contains some error which leads to further errors, an implementation which does not emit all of the errors SHOULD prioritise Syntax Errors and Data Model Errors over others.
- Implementations SHOULD provide a way for function handlers to emit (or cause to be emitted) any of the types of error defined in this section.
- Implementations MAY also provide implementation-defined Message Function Error types.
- Such
- Implementations providing a formatting target other than a concatenated string SHOULD support [the
- Implementations SHOULD support [the

Options for Functions
- Specifically, if the cause of the failure was that the datatype, value, or format of the operand did not match that expected by the function, the function SHOULD cause a Bad Operand error to be emitted.
- A function handler MAY include resolved options in its resolved value.
- A function handler SHOULD emit a Bad Operand error for operands whose resolved value or type is not supported.
- If a formatted expression itself contains spans with differing directionality, its formatter SHOULD perform any necessary processing, such as inserting controls or isolating such parts to ensure that the formatted value displays correctly in a plain text context.
- Function handler access to the formatting context MUST be minimal and read-only, and execution time SHOULD be limited.

:number & :integer
- Implementations MAY define an upper limit on the resolved value of a digit size option consistent with that implementation's practical limits.
- Implementations MAY also accept implementation-defined types as the value [of a digit size option].
- If the option
- If the option

:currency
- Although currency codes are expected to be uppercase, implementations SHOULD treat them in a case-insensitive manner.
- Implementations MAY internally alias option values that they do not have data or a backing implementation for.

:unit
- The function
- The value of the operand's
- Implementations MAY support conversion to the locale's preferred units via the

:datetime, :date & :time
- ISO 8601 date and datetime values not matching the following regular expression MAY also be supported. Furthermore, matching this regular expression does not guarantee validity, given the variable number of days in each month.
  (?!0000)[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])(T([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]{1,3})?(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?)?
- When the time is not present, implementations SHOULD use
- The [option
- An implementation MAY emit a Bad Operand or Bad Option error (as appropriate) when a variable annotated directly or indirectly by a
- An implementation MAY emit a Bad Operand or Bad Option error (as appropriate) when a variable annotated directly or indirectly by a
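The caveat about the date regular expression quoted above (a match does not guarantee a valid date) can be shown with a short Python sketch. The helper names are made up for this example, and anchoring the pattern to the whole string with `fullmatch` is an assumption about how it is meant to be applied.

```python
import re
from datetime import date

# The regular expression quoted from the spec, applied to the whole string
# (the full-string anchoring is an assumption of this sketch).
ISO_DATETIME_RE = re.compile(
    r"(?!0000)[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])"
    r"(T([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]{1,3})?"
    r"(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?)?"
)


def matches_pattern(value: str) -> bool:
    """True if the whole string matches the quoted pattern."""
    return ISO_DATETIME_RE.fullmatch(value) is not None


def is_real_date(value: str) -> bool:
    """True if the string matches the pattern and names a real calendar day."""
    if not matches_pattern(value):
        return False
    year, month, day = int(value[0:4]), int(value[5:7]), int(value[8:10])
    try:
        date(year, month, day)  # raises ValueError for days like February 31
    except ValueError:
        return False
    return True
```

For example, `"2023-02-31"` satisfies `matches_pattern` but not `is_real_date`, so an implementation still needs a calendar check after the syntactic one.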
@macchiati noted:
and also:
Optional-to-implement options need no prefix. If an implementation doesn't implement the option, it is just ignored. (Such an option might cause a Bad Option if its value doesn't resolve properly)

Regarding optional functions, note that what is optional today might become required in the future. I'm not sure what the utility is in making developers, translators, and others have to decorate optional-to-implement functions all the time.

Yes, there is some danger that a feature available in one runtime is not available in another. To the extent that the developer/message originator chooses the functions to use in a message, they are well-informed about local capabilities (note that this problem applies to locally-installed functions also).

Translators have the problem that they rarely (never) work directly in the runtime environment and that messages all look pretty much the same to them. If they felt the need to add a function (such as, say, one pertaining to inflection) to a message, they wouldn't know for sure if it will work at runtime. Having

I think @eemeli's list looks comprehensive. There may be a couple of items we should visit for stronger normativity.
(split from #977)
I really want to avoid a situation where someone writes a message using some optional extensions, and the message works in ICU4C, and then when they go use an implementation that doesn't implement the same optional extensions (like ICU4X), their message doesn't work. Avoiding this kind of situation is, in my mind, the only job of a specification.
It would make me slightly happier if implementations with optional extensions required the client code to "opt in" to using them. For example, something like
And then messages with a :datetime calendar="..." option will work only if the stated extension is enabled. If someone tries building an ICU4X formatter with that extension, we'll know ahead of time in the constructor that the extension is not available, instead of waiting to see the message payloads.
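As an illustration only, the opt-in behaviour described above might look like the following Python sketch. The class names, the `extensions` parameter, and the `datetime-calendar` feature name are hypothetical and do not correspond to any real ICU4C or ICU4X API; the substring check for `calendar=` stands in for real message parsing.

```python
class ExtensionError(Exception):
    """Raised when an extension is unavailable, or is used without being enabled."""


class Formatter:
    # Extensions this (hypothetical) implementation provides.
    available = frozenset()

    def __init__(self, locale: str, source: str, extensions=()):
        # Fail in the constructor if a requested extension is not implemented.
        missing = sorted(set(extensions) - self.available)
        if missing:
            raise ExtensionError(f"not available in this implementation: {missing}")
        self.enabled = frozenset(extensions)
        # Crude stand-in for parsing: reject use of an extension that was not enabled.
        if "calendar=" in source and "datetime-calendar" not in self.enabled:
            raise ExtensionError("message uses calendar= but 'datetime-calendar' is not enabled")
        self.locale = locale
        self.source = source


class FullFormatter(Formatter):
    """Stands in for an implementation that supports the calendar option."""
    available = frozenset({"datetime-calendar"})


class MinimalFormatter(Formatter):
    """Stands in for an implementation without any optional extensions."""
    available = frozenset()
```

Here `FullFormatter("en", source, extensions=["datetime-calendar"])` succeeds, while the same call on `MinimalFormatter` raises in the constructor, before any message payload is formatted.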