[FEEDBACK] Interoperability concerns and normative-optional features #978

sffc opened this issue Jan 14, 2025 · 11 comments
Labels
Agenda+ Requested for upcoming teleconference Preview-Feedback Feedback gathered during the technical preview

Comments

@sffc
Member

sffc commented Jan 14, 2025

(split from #977)

I really want to avoid a situation where someone writes a message using some optional extensions, and the message works in ICU4C, and then when they go use an implementation that doesn't implement the same optional extensions (like ICU4X), their message doesn't work. Avoiding this kind of situation is, in my mind, the only job of a specification.

It would make me slightly happier if implementations with optional extensions required the client code to "opt in" to using them. For example, something like

let mf = new Intl.MessageFormat("de", { extensions: ["calendarOption"] })

And then messages with a :datetime calendar="..." option will work only if the stated extension is enabled. If someone tries building an ICU4X formatter with that extension, we'll know ahead of time in the constructor that the extension is not available, instead of waiting to see the message payloads.
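A minimal sketch of the opt-in idea, assuming a hypothetical `OptInFormatter` class and a hypothetical `SUPPORTED_EXTENSIONS` list. Neither is part of any real MessageFormat API. It mirrors the call shape above and only shows how an unavailable extension could be rejected in the constructor, before any message payload is seen.

```javascript
// Hypothetical sketch: none of these names come from a real MessageFormat API.
// Extensions this imaginary implementation was built with (assumed names).
const SUPPORTED_EXTENSIONS = new Set(["calendarOption"]);

class OptInFormatter {
  constructor(locale, { extensions = [] } = {}) {
    const missing = extensions.filter((name) => !SUPPORTED_EXTENSIONS.has(name));
    if (missing.length > 0) {
      // Fail at construction time, before any message payload is processed.
      throw new RangeError(`Unsupported extensions: ${missing.join(", ")}`);
    }
    this.locale = locale;
    this.extensions = new Set(extensions);
  }

  // True only if the caller explicitly opted in to the named extension.
  allows(name) {
    return this.extensions.has(name);
  }
}
```

With this shape, `new OptInFormatter("de", { extensions: ["calendarOption"] })` succeeds, while asking for an extension the implementation lacks throws a `RangeError` immediately.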

@sffc sffc added the Preview-Feedback Feedback gathered during the technical preview label Jan 14, 2025
@eemeli
Collaborator

eemeli commented Jan 14, 2025

I don't really see what the user benefit might be of needing to explicitly opt-in to extensions that are supported by the formatter. Isn't that information equally well conveyed in the message source?

In your code example above, your constructor call is missing the second required argument, i.e. the message source. Recall that that's provided initially to the constructor, and so its information (e.g. whether it includes a :datetime with a calendar option) is already known at that time.

@sffc
Member Author

sffc commented Jan 15, 2025

Opting in to extensions makes it clearer which messages your formatter is capable of processing. Enabling extensions: ["calendarOption"] does not mean that the messages being processed must use that extension. What it does mean is that if you migrate between implementations, both implementations need to support the same set of extensions.

@aphillips
Member

I don't get it?

I don't see a lot of benefit to making developers write into their code what optional-to-implement functionality is "enabled". Bear in mind that we would like to see user-defined and implementation-defined functions (ICU has a whole host of formatters/selectors that should be enabled in messages, for example, using the icu: namespace). Having to list these every time I use MF2 would be super inconvenient. (The inverse, the ability to query whether functionality is installed, might be an interesting implementation feature.)

While total message portability would be great, we envision wide adoption by many languages, some of which will be seriously constrained. The REQUIRED default functions are necessarily minimal, as a result. MF2 would be kind of a failure if we were limited to those functions, since the ability to do inflection, person name formatting, duration handling, list management, and so forth might never appear in the (insert environment name here, e.g. awk or Lua or PHP) version, but be seriously useful... and since the default function set already is available via MF1.

Yes, this means that there will be "dialects" of MF2. But they should all work with translation tooling in a consistent way. Developing a machine-readable function description was a deliverable that we dropped along the way, but which could be resurrected in the post-47 world.

Full message portability isn't that common anyway. Your JS code doesn't typically use the same resource files as your Android app. The syntactical and conceptual consistency between frameworks/platforms is an important feature and the ability to port messages that use the most common features is important (it's the reason we have strong guidance to do these functions in a specific way). But there is some flexibility inherent in the design and that flexibility is a feature.

@sffc
Member Author

sffc commented Jan 21, 2025

I wouldn't find it as elegant, but an alternative that aligns with my problem statement would be to basically have a MessageFormat Validator, like the W3C HTML Validator, that checks your syntax and returns which version / optional extensions are being used. There can be an ICU4X Profile that fails validation of messages that use optional extensions not implemented in ICU4X.

Then if clients want to use ICU4X, they can validate that their messages are conformant ahead of time.
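A rough sketch of such a profile-based check, under loud assumptions: the feature names, the `FEATURE_PATTERNS` table, and the `validate` helper are all hypothetical. A real validator would use a full MF2 parser instead of the regular expressions used here. A "profile" is modeled simply as the set of optional features an implementation supports.

```javascript
// Hypothetical sketch: a real validator would use a full MF2 parser.
// Feature names and the detection patterns below are assumptions.
const FEATURE_PATTERNS = {
  // A :datetime, :date, or :time annotation carrying a calendar option.
  calendarOption: /:(?:datetime|date|time)\b[^}]*\bcalendar=/,
  // Any use of the :unit function.
  unitFunction: /:unit\b/,
};

// Report which of the named optional features a message source appears to use.
function detectFeatures(source) {
  return Object.keys(FEATURE_PATTERNS).filter((name) =>
    FEATURE_PATTERNS[name].test(source)
  );
}

// A profile is the set of optional features an implementation supports.
function validate(source, profile) {
  const unsupported = detectFeatures(source).filter((name) => !profile.has(name));
  return { ok: unsupported.length === 0, unsupported };
}
```

For example, validating `Today is {$d :datetime calendar=buddhist}` against an empty profile reports `calendarOption` as unsupported, while the same message without the option passes.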

@eemeli
Collaborator

eemeli commented Jan 21, 2025

@sffc Do you mean including such a validator somehow in the spec, or just having it exist in general, possibly even outside Unicode?

I'm in the middle of refactoring Mozilla's Pontoon translation platform to internally store all source and target messages using an MF2-ish representation, and using that representation in the API between its server and client. As MF2 is more powerful than most of our supported formats, that'll require serverside validation that each new translation isn't using features that its source format doesn't support.

That sounds an awful lot like the validator you're talking about? The Python implementation work for that will be in moz.l10n, so it won't be limited to just Pontoon's use cases.

@sffc
Member Author

sffc commented Jan 21, 2025

A validator doesn't need to be in the spec, but it is important for it to exist and be utilized.

Another thought: the message bundle format could have a header listing which optional features are required to consume the message bundle. Then the validator just needs to make sure the messages are consistent with the header, and ICU/ICU4X can check the header before parsing messages.
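The header idea could look roughly like the following. The bundle shape, the `requires` field, and the feature names are assumptions made for illustration. No message bundle format with such a header is defined anywhere in this thread.

```javascript
// Hypothetical bundle shape: the "requires" header and feature names are assumptions.
const bundle = {
  requires: ["calendarOption"],
  messages: {
    today: "Today is {$d :datetime calendar=buddhist}",
  },
};

// Return the required features that the implementation does not support.
function missingFeatures(bundle, supported) {
  return (bundle.requires ?? []).filter((name) => !supported.has(name));
}

// Check the header first, and refuse the bundle before parsing any message.
function loadBundle(bundle, supported) {
  const missing = missingFeatures(bundle, supported);
  if (missing.length > 0) {
    throw new Error(`Bundle requires unsupported features: ${missing.join(", ")}`);
  }
  return bundle.messages;
}
```

The consumer only needs the header and its own list of supported features to decide whether to proceed. Checking that the messages really match the header is left to a separate validator.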

@sffc
Member Author

sffc commented Jan 21, 2025

I guess the main theme through these various solutions is that there should ideally be a naming scheme for optional features. We can't talk about optional features unless they have names, we can't list them in a message bundle header, we can't list them in an API, and we can't configure them in a validator. That naming scheme should, I think, live in MF2.

Many of my concerns would be alleviated if, as an implementer, I could see a single centralized list of "these are the optional features, this is what they are called, and this is where you can read more about them."

@macchiati
Member

We could use one of the 25 reserved single-letter namespaces for that, without inventing any new syntax.

Say o:calendar, for example?

@macchiati
Member

And an optional value could also have an o:... prefix

@eemeli
Collaborator

eemeli commented Feb 18, 2025

I went through the spec, and the following are the potentially testable OPTIONAL or RECOMMENDED features for implementations and function handlers that I could find:

Options for Implementations

Implementations are not required to evaluate all parts of a message when parsing, processing, or formatting. In particular, an implementation MAY choose not to evaluate or resolve the value of a given expression until it is actually used by a selection or formatting process.

Implementations MAY include additional fields in their formatting context.

An implementation MAY perform additional processing when resolving the value of an expression that consists only of a variable.

An implementation MAY pass additional arguments to the function handler, as long as reasonable precautions are taken to keep the function interface simple and minimal, and avoid introducing potential security vulnerabilities.

An implementation MAY define its own functions and their handlers.

An implementation MAY allow custom functions to be defined by users.

Implementation-defined functions SHOULD use an implementation-defined namespace.

Implementations MAY provide a mechanism for the function handler to provide additional detail about internal failures.

Errors MAY be emitted [during option resolution], but such errors MUST NOT be fatal.

When a message contains more than one error, or contains some error which leads to further errors, an implementation which does not emit all of the errors SHOULD prioritise Syntax Errors and Data Model Errors over others.

Implementations SHOULD provide a way for function handlers to emit (or cause to be emitted) any of the types of error defined in this section. Implementations MAY also provide implementation-defined Message Function Error types.

Such u: options MAY be removed from the resolved mapping of options.

Implementations providing a formatting target other than a concatenated string SHOULD support [the u:id] option.

Implementations SHOULD support [the u:dir] option.

Options for Functions

Specifically, if the cause of the failure was that the datatype, value, or format of the operand did not match that expected by the function, the function SHOULD cause a Bad Operand error to be emitted.

A function handler MAY include resolved options in its resolved value.

A function handler SHOULD emit a Bad Operand error for operands whose resolved value or type is not supported.

If a formatted expression itself contains spans with differing directionality, its formatter SHOULD perform any necessary processing, such as inserting controls or isolating such parts to ensure that the formatted value displays correctly in a plain text context.

Function handler access to the formatting context MUST be minimal and read-only, and execution time SHOULD be limited.

:number & :integer

Implementations MAY define an upper limit on the resolved value of a digit size option consistent with that implementation's practical limits.

Implementations MAY also accept implementation-defined types as the value [of a digit size option].

If the option select is set to plural, the rules applied to selection SHOULD be the CLDR plural rule data of type cardinal.

If the option select is set to ordinal, the rules applied to selection SHOULD be the CLDR plural rule data of type ordinal.

:currency

Although currency codes are expected to be uppercase, implementations SHOULD treat them in a case-insensitive manner.

Implementations MAY internally alias option values that they do not have data or a backing implementation for.

:unit

The function :unit is proposed to be a RECOMMENDED formatter for unitized values, that is, for numeric values associated with a unit of measurement.

The value of the operand's unit SHOULD be either a string containing a valid Unit Identifier or an implementation-defined unit type.

Implementations MAY support conversion to the locale's preferred units via the usage option. Implementing this option is optional. Not all usage values are compatible with a given unit. Implementations SHOULD emit an Unsupported Operation error if the requested conversion is not supported.

:datetime, :date & :time

ISO 8601 date and datetime values not matching the following regular expression MAY also be supported. Furthermore, matching this regular expression does not guarantee validity, given the variable number of days in each month.

(?!0000)[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])(T([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]{1,3})?(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?)?

When the time is not present, implementations SHOULD use 00:00:00 as the time. When the offset is not present, implementations SHOULD use a floating time type (such as Java's java.time.LocalDateTime) to represent the time value.

The [option calendar] and its values are RECOMMENDED to be available on the functions :datetime, :date, and :time.

An implementation MAY emit a Bad Operand or Bad Option error (as appropriate) when a variable annotated directly or indirectly by a :date annotation is used as an operand or an option value.

An implementation MAY emit a Bad Operand or Bad Option error (as appropriate) when a variable annotated directly or indirectly by a :time annotation is used as an operand or an option value.
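To make the date/time literal items above concrete, here is a small sketch that applies the quoted regular expression and the two SHOULD defaults (00:00:00 when no time is present, a floating time when no offset is present). The `parseLiteral` helper and its return shape are hypothetical, and the `^`/`$` anchors are added here so that the whole string must match.

```javascript
// The regular expression quoted from the spec, anchored so the whole string must match.
const DATE_TIME = new RegExp(
  "^(?!0000)[0-9]{4}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])" +
    "(T([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\\.[0-9]{1,3})?" +
    "(Z|[+-]((0[0-9]|1[0-3]):[0-5][0-9]|14:00))?)?$"
);

// Hypothetical helper: split a matching literal into date, time, and offset.
// Note that a match does not guarantee a valid calendar date (e.g. 2025-02-30).
function parseLiteral(value) {
  const m = DATE_TIME.exec(value);
  if (!m) return null;
  const [date, rest] = value.split("T");
  const hasTime = m[3] !== undefined; // group 3 is the optional "T..." part
  const offset = m[6] ?? null; // group 6 is the optional offset
  const time = hasTime
    ? rest.slice(0, rest.length - (offset ? offset.length : 0))
    : "00:00:00"; // default time when none is present
  // With no offset, the value is treated as a floating (zone-less) time.
  return { date, time, offset, floating: offset === null };
}
```

A date-only literal such as `2025-01-14` comes back with the default time and `floating: true`, while `2025-01-14T09:30:00Z` keeps its time and offset.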

@aphillips
Member

@macchiati noted:

"We could use one of the 25 reserved single-letter namespaces for that, without inventing any new syntax. Say o:calendar, for example?"

and also:

"And an optional value could also have an o:... prefix"

Optional-to-implement options need no prefix. If an implementation doesn't implement the option, it is just ignored. (Such an option might cause a Bad Option error if its value doesn't resolve properly.)

Regarding optional functions, note that what is optional today might become required in the future.

I'm not sure what the utility is in making developers, translators, and others have to decorate optional-to-implement functions all the time. Yes, there is some danger that a feature available in one runtime is not available in another. To the extent that the developer/message originator chooses the functions to use in a message, they are well-informed about local capabilities (note that this problem applies to locally-installed functions also).

Translators have the problem that they rarely (never) work directly in the runtime environment and that messages all look pretty much the same to them. If they felt the need to add a function (such as, say, one pertaining to inflection) to a message, they wouldn't know for sure if it will work at runtime. Having o: doesn't really solve that for the default function set any more than for custom functions.


I think @eemeli's list looks comprehensive. There may be a couple of items we should visit for stronger normativity.

@aphillips aphillips added the Agenda+ Requested for upcoming teleconference label Apr 14, 2025