Conversation

eemeli
Collaborator

@eemeli eemeli commented Sep 8, 2025

Adds an initial set of expression, markup, and message attribute definitions.

The proposed attributes are drawn from:

As noted in the text, this is not intended as a final list, but as a starting point. The text is not currently being proposed as normative, but we could change that later.

@eemeli eemeli added the Agenda+ Requested for upcoming teleconference label Sep 8, 2025
Member

@aphillips aphillips left a comment

Good start. Lots of nit-picky comments.

Maybe a good question is: should these be directly incorporated? Or should all of these XLIFFy things be namespaced? Some of what XLIFF does doesn't apply to UMF messages and some of it would be much better on a message resource level (instead of cluttering up the message itself).

Comment on lines +17 to +20
As all _attributes_ with _reserved identifiers_ are reserved,
definitions are provided here for common _attribute_ use cases.
Custom _attributes_ SHOULD use a _custom identifier_,
preferably one with an appropriate _namespace_.

Member

This paragraph feels weird? It's unclear if we're trying to say that the "definitions" found here are normative. This also tangles with the reserved/custom identifier bear in a novel way.

Perhaps:

Suggested change
As all _attributes_ with _reserved identifiers_ are reserved,
definitions are provided here for common _attribute_ use cases.
Custom _attributes_ SHOULD use a _custom identifier_,
preferably one with an appropriate _namespace_.
_Attributes_ defined by this specification use _reserved identifiers_.
Custom _attributes_ MUST use a _custom identifier_.
The use of a _namespace_ is RECOMMENDED for implementation-defined
or domain-specific _attributes_.

Collaborator Author

The MUST you propose is actually stronger than what we've currently in the spec, where the strongest language we use is

Implementers and authors of _functions_ and _messages_,
including _functions_, _options_, and _variables_,
SHOULD avoid creating _names_ that could produce confusion or harm usability
by choosing _names_ consistent with the following guidelines.

This paragraph is intended to match/recall what's already elsewhere in the spec, rather than adding new normative text. It could be dropped completely, if that would be clearer?

Member

The MUST is stronger, because it is effectively authoring advice. You're right that it probably should be a SHOULD. Perhaps something like:

Use a custom identifier for other attributes.


#### @translate

_Value:_ `yes` or `no`.

Member

Indicate that yes is default?

Is there a reason attributes don't follow a similar structure to functions and their options here?

Collaborator Author

I don't think we've agreement that yes is the default. In fact, for expressions, I would think that the general default might in fact be no to indicate that a translator is not expected to make any changes to the expression.

Considering this a bit more, maybe something like translate=input or translate=|input,minimumFractionDigits| would be better? That would indicate which parts are expected to be translatable.
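
To make the list-valued idea concrete, here is a minimal Python sketch of how such a value might be interpreted. It assumes the literal has already been unquoted (so `|input,minimumFractionDigits|` arrives as `input,minimumFractionDigits`); the function name `parse_translate` and the returned dictionary shape are hypothetical and not part of the proposed text.

```python
def parse_translate(value: str) -> dict:
    """Interpret a hypothetical @translate value.

    'yes' and 'no' keep their XLIFF-style meaning. Any other value is read
    as a comma-separated list of translatable parts, which is an assumption
    based on the suggestion in this thread, not on spec text.
    """
    value = value.strip()
    if value in ("yes", "no"):
        return {"translatable": value == "yes", "parts": []}
    # Split on commas and drop surrounding whitespace and empty entries.
    parts = [p.strip() for p in value.split(",") if p.strip()]
    if not parts:
        raise ValueError("empty @translate value")
    return {"translatable": True, "parts": parts}
```

With this reading, `translate=input` marks only the operand as translatable, while `translate=|input,minimumFractionDigits|` names the operand and one option.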

Member

@aphillips aphillips Sep 9, 2025

The default value is no when the attribute is not present, but yes when the attribute is present and has no value, right?

I don't like the values yes/no, but they are inherited from XLIFF (and its friends, such as ITS) and we should probably remain consistent with them (for portability at least)

Collaborator Author

Ah, that's a slightly different understanding of "default" than I'd had -- as in, the value that's applied if the attribute is not present at all.

I don't hate the yes/no as they're relatively legible and are perhaps easier to extend with other enum values than e.g. true/false would be. But as they're already in use by XLIFF, we should use the same values.
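
The two senses of "default" in this exchange can be told apart in code. The Python sketch below assumes the reading floated in this thread (absent means `no`, present without a value means `yes`), which has not been agreed. The function name `resolve_translate` and the `attrs` dictionary shape, where a valueless attribute maps to `None`, are hypothetical.

```python
from typing import Dict, Optional


def resolve_translate(attrs: Dict[str, Optional[str]]) -> str:
    """Return the effective @translate value for a placeholder.

    attrs maps attribute names to their values; an attribute written
    without a value is represented as None (an assumption of this sketch).
    """
    if "translate" not in attrs:
        # Attribute not present at all: the "absent" default.
        return "no"
    value = attrs["translate"]
    if value is None:
        # Attribute present but valueless: the "bare" default.
        return "yes"
    if value not in ("yes", "no"):
        raise ValueError(f"unsupported @translate value: {value!r}")
    return value
```

Keeping the two cases in separate branches makes it easy to change either default independently once the group settles on the semantics.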


Indicates whether or not the _markup_ and its contents can be re-ordered.

#### @comment

Member

Why not just permit the "global" attributes on markup?

Collaborator Author

I don't understand what this means.

Member

You're repeating attributes defined above. Why not make those like @comment global to both expressions and markup?

Collaborator Author

That seems like an editorial fix we could apply later, if it does hold that the annotations continue to match on expressions and markup.

Member

It would be a bad idea for identically-named attributes to diverge. The sets aren't identical, of course.


#### @max-length

_Value:_ A strictly positive integer, followed by a space, followed by one of the following:

Member

digit size option?

Collaborator Author

That's limited to max 99, and we need to allow for limits greater than that.

_Value:_ A strictly positive integer, followed by a space, followed by one of the following:
- `chars`
- `bytes`
- `lines`

Member

Good luck with this one.

Collaborator Author

As in, we should not include it?

Member

Measuring bytes will depend on some character encoding somewhere. Without an indication of the encoding (which this doesn't provide), there is no way to perform the measurement.

(FWIW, you're missing graphemes, which is another measurement (approximately "screen positions", but only approximately so).)

Lines depends on... font, font size, pixel width, line-breaking, hyphenation (insert more here) and are even harder to define than bytes.

Length limitations are a "fact of life" in localization, but badly defined mechanisms for them are not that helpful.
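
The encoding dependence is easy to demonstrate. The Python sketch below parses a value in the proposed `<integer> <unit>` shape and then measures one string in code points and in bytes under two encodings. The names `parse_max_length` and `measure` are hypothetical, and accepting leading zeros is an assumption, as the proposed text does not address them. Grapheme clusters are left out because the Python standard library has no segmentation API for them.

```python
import re

# A run of ASCII digits, one space, then one of the three proposed units.
_MAX_LENGTH = re.compile(r"([0-9]+) (chars|bytes|lines)")


def parse_max_length(value: str) -> tuple:
    """Parse a hypothetical @max-length value such as '120 chars'."""
    match = _MAX_LENGTH.fullmatch(value)
    if match is None:
        raise ValueError(f"malformed @max-length value: {value!r}")
    limit = int(match.group(1))
    if limit <= 0:
        raise ValueError("@max-length must be strictly positive")
    return limit, match.group(2)


def measure(text: str, encoding: str = "utf-8") -> dict:
    """Count code points and encoded bytes for the same text."""
    return {
        "code_points": len(text),
        "bytes": len(text.encode(encoding)),
    }


# 'e' followed by U+0301 COMBINING ACUTE ACCENT renders as one accented letter.
sample = "e\u0301"
utf8 = measure(sample)                # 2 code points, 3 bytes
utf16 = measure(sample, "utf-16-le")  # 2 code points, 4 bytes
```

A single user-perceived character here counts as 2 code points, 3 bytes in UTF-8, and 4 bytes in UTF-16, so a `bytes` limit is not checkable without knowing the encoding.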

Collaborator Author

One option would be to leave out the units, and to let the implementation figure out what the limit means, something in the overlap of characters/code points/graphemes.

Co-authored-by: Addison Phillips <[email protected]>
@eemeli
Collaborator Author

eemeli commented Sep 9, 2025

> Maybe a good question is: should these be directly incorporated? Or should all of these XLIFFy things be namespaced? Some of what XLIFF does doesn't apply to UMF messages and some of it would be much better on a message resource level (instead of cluttering up the message itself).

During yesterday's call, @mihnita also expressed concern regarding cluttering up a message with multiple attributes. His thought was that it would often be preferable to attach a u:id to an expression or markup, and refer to that from a separate message-level block to attach attribute-y metadata to the relevant placeholder(s).

To me, this speaks of a need to have that capability also be well defined, so that it can be ergonomically done across resource formats. In other words, I think we need a JavaDoc-y syntax for message-level attributes.

@eemeli eemeli requested review from aphillips and mihnita September 9, 2025 09:47