Refactor registry locale overrides #534

Merged
merged 8 commits into from
Dec 5, 2023

eemeli
Collaborator

@eemeli eemeli commented Nov 27, 2023

Closes #410

At the moment, locale-specific overrides work via the locales attribute list of identifiers on <formatSignature> and <matchSignature>. This is a bit clumsy, because it's mixing together two different concerns:

  1. Some functions may have options that are only available in specific locales.
  2. Some selectors may have their matching depend on the locale.

So I propose separating the two. Let's remove the locales attribute from the signatures, and rather:

  1. Allow for an <override locales="..."> element within a signature, which may include rules for inputs and options that override any set generally for the formatter.
  2. Allow for both <match> and <match locales="...">, and specify that only one such rule (first matching one) is used at a time.

This should make it easier to implement and manage locale-specific overrides, as there's much less need for repetition.
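
As a rough illustration of the first point, a locale-specific option might be expressed like this. This is only a sketch based on the description above: the function name `example`, the option `style`, and its values are hypothetical and not taken from the registry.

```xml
<!-- Hypothetical sketch: "example" and "style" are illustrative names only. -->
<function name="example">
  <formatSignature>
    <!-- Rules that apply in every locale. -->
    <input validationRule="anyNumber"/>
    <option name="style" values="short long"/>

    <!-- For these locales only, replace the general rule for "style". -->
    <override locales="en fi">
      <option name="style" values="short long narrow"/>
    </override>
  </formatSignature>
</function>
```

The general rules are written once, and the override repeats only the option that differs, which is where the reduced repetition would come from.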

If #532 is accepted, <alias> elements should also be allowed to include <override>s.

@eemeli eemeli added the functions Issue pertains to the default function set label Nov 27, 2023
@eemeli eemeli requested review from aphillips and stasm November 27, 2023 15:20
Member

@aphillips aphillips left a comment

Some thoughts below. We should discuss some of these concepts in the group.

Comment on lines +109 to +110
<match locales="en" values="one two few other" validationRule="anyNumber"/>
<match values="zero one two few many other" validationRule="anyNumber"/>
Member

other or *?


locales="en" looks odd. Can locales contain multiple values? Space separated, I guess?

This has always struck me as a wobbly mechanism. Many locales have identical plural rules to English. Are we going to list them? Why do we need to repeat CLDR data here? Perhaps we should have a referencing mechanism to CLDR instead of replicating data. Note that other things besides plurals depend on CLDR data, such as date patterns and such.

Collaborator Author

@eemeli eemeli Nov 27, 2023

Explicitly other, because these are literal values, and in source * would need to be matched as |*|. It is also included explicitly because a selector could differentiate the other and * cases, so that the former handles numeric input matching the other category while * handles non-numeric input. Furthermore, other must be supported in general: if it's left out, we can't properly deal with Eastern European languages like Polish, which would prefer to default to many instead of other.


locales is NMTOKENS, so it may include a space-separated list of locale identifiers.

I agree that specifically for cardinal plurals & ordinals the CLDR includes data that we would like to refer to. This PR is not providing a solution for that; it's about providing a friendlier way to define local locale-specific overrides.
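
To illustrate the space-separated form, a single rule could cover several locales that share the same categories. This is a sketch only: the grouping of `en`, `de`, and `nl` is an illustrative assumption (all three use the `one` and `other` cardinal categories in CLDR) and is not part of this PR's diff.

```xml
<!-- Illustrative grouping; locales is NMTOKENS, so identifiers are space-separated. -->
<match locales="en de nl" values="one other" validationRule="anyNumber"/>
<!-- Fallback rule for every locale not listed above. -->
<match values="zero one two few many other" validationRule="anyNumber"/>
```

This reduces repetition within the registry, but it still restates CLDR data rather than referencing it, which is the concern raised above.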

eemeli and others added 2 commits November 30, 2023 20:02
@eemeli eemeli requested a review from aphillips December 4, 2023 22:16
Member

@aphillips aphillips left a comment

Minor change needed. Otherwise good.

Co-authored-by: Addison Phillips <[email protected]>
Member

@aphillips aphillips left a comment

Merge looks clean.

Co-authored-by: Eemeli Aro <[email protected]>
@aphillips aphillips merged commit 58e8fff into unicode-org:main Dec 5, 2023
@eemeli eemeli deleted the better-locale-overrides branch December 5, 2023 18:23
Collaborator

stasm commented Dec 16, 2023

I'm sorry that I missed this conversation when it happened. I hope we can still reconsider the design choices made in this PR, or at least better substantiate them.

At the moment, locale-specific overrides work via the locales attribute list of identifiers on <formatSignature> and <matchSignature>. This is a bit clumsy, because it's mixing together two different concerns:

  1. Some functions may have options that are only available in specific locales.
  2. Some selectors may have their matching depend on the locale.

@eemeli Can you elaborate why you think these are two different concerns? In my mind, they are a single concern -- the selector function will be called in the context of a specific locale which impacts the function's input and output types, i.e. the function's signature.

I'd much prefer to build as flat a registry as possible, and then separately define how the data modelled by it should be used. So rather than nest overrides inside matchSignature elements, I'd propose that multiple matchSignature elements be used and that rules of specificity be applied to choose the right one.

<function name="number">
  <!-- A generic signature, common for all locales. -->
  <matchSignature>
    <input validationRule="anyNumber"/>
    <option name="type" values="cardinal ordinal"/>
    <option name="minimumIntegerDigits" validationRule="positiveInteger"/>
    <option name="minimumSignificantDigits" validationRule="positiveInteger"/>
    <option name="maximumSignificantDigits" validationRule="positiveInteger"/>
    <!-- Two match elements, because it's a union. -->
    <match validationRule="anyNumber"/>
    <match values="zero one two few many other"/>
  </matchSignature>

  <matchSignature locales="en">
    <!-- Narrow down the possible keys for the en locale. -->
    <match values="one other"/>
  </matchSignature>

  <matchSignature locales="pl">
    <!-- Narrow down the possible keys for the pl locale. -->
    <match values="one few many other"/>
  </matchSignature>
</function>
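
For comparison, the same narrowing might look roughly like this under the nested approach described in the PR. This is a sketch based on the PR description and the reviewed `<match>` lines, not a copy of the merged registry; combining `values` and `validationRule` on one `<match>` follows the reviewed diff lines, and the `en` and `pl` value lists are carried over from the example above.

```xml
<function name="number">
  <matchSignature>
    <input validationRule="anyNumber"/>
    <option name="type" values="cardinal ordinal"/>
    <!-- Locale-specific rules are listed first; only the first applicable rule is used. -->
    <match locales="en" values="one other" validationRule="anyNumber"/>
    <match locales="pl" values="one few many other" validationRule="anyNumber"/>
    <!-- General rule for all other locales. -->
    <match values="zero one two few many other" validationRule="anyNumber"/>
  </matchSignature>
</function>
```

The practical difference is where resolution happens: here, ordering inside a single signature decides which rule applies, while the flat form relies on specificity rules defined outside the registry data.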

Labels
functions Issue pertains to the default function set
Development

Successfully merging this pull request may close these issues.

Define locale-specific variants of function signatures in the registry
4 participants