Fixed #30439 - Translations issues on upgrade due to unexpected changes in plural forms #12280
docs/topics/i18n/translation.txt
This is inaccurate. Django uses whatever Transifex did to convert CLDR plurals to gettext plural forms. Gettext uses integers for the calculation, while CLDR specifies plurals for floats as well. That leads to plural forms that are defined in CLDR but unreachable in gettext. Transifex chose not to skip those, so there can be plurals which will never be used. There are other conversions which handle this case correctly (I'm aware of https://github.com/php-gettext/Languages, but there might be others as well).
I've noticed this for Czech, but the issue might be in other languages as well.
The plural form used by Transifex: nplurals=4; plural=(n == 1 && n % 1 == 0) ? 0 : (n >= 2 && n <= 4 && n % 1 == 0) ? 1: (n % 1 != 0 ) ? 2 : 3;
Plural form usually used by gettext: nplurals=3;plural=(n == 1) ? 0 : ((n >= 2 && n <= 4) ? 1 : 2)
You can see that the "many" category from CLDR is missing from the second formula. The Transifex formula has it, but it's unreachable because n % 1 != 0 is never true for an integer n, and thus the plural form never evaluates to 2. This brings inconsistency with the rest of the software using gettext, creates additional effort for translators, and brings exactly zero benefit.
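To make the reachability point concrete, here is a minimal Python sketch that transcribes the two Czech formulas quoted above and collects the plural indices each one can return. The function names are hypothetical (they are not part of gettext, Transifex, or Django), and the check only covers a finite range of integers, on the assumption that gettext only ever passes an integer n.

```python
def transifex_cs(n: int) -> int:
    """Transcription of the nplurals=4 formula quoted above."""
    if n == 1 and n % 1 == 0:
        return 0
    if 2 <= n <= 4 and n % 1 == 0:
        return 1
    if n % 1 != 0:  # never true when n is an integer
        return 2
    return 3


def gettext_cs(n: int) -> int:
    """Transcription of the nplurals=3 formula quoted above."""
    if n == 1:
        return 0
    if 2 <= n <= 4:
        return 1
    return 2


def reachable(plural, limit: int = 1000) -> set:
    """Collect every plural index the formula returns for 0 <= n < limit."""
    return {plural(n) for n in range(limit)}


transifex_reached = reachable(transifex_cs)
gettext_reached = reachable(gettext_cs)
print(sorted(transifex_reached))
print(sorted(gettext_reached))
```

Index 2 never shows up for the four-form formula, so a msgstr[2] entry written for it is never selected, while the three-form formula uses every index it declares.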
Django indeed uses plural forms from Transifex (and according to Transifex's documentation posted on the dev list, they follow Unicode). Following a standard does not mean implementing it well.
The statement is about setting a policy for plural forms (no one objected to it on the mailing list).
If the plural form is not an implementation of the CLDR specification, it should not be included in the catalogs distributed by Django.
Whether Django should maintain the plural forms independently of Transifex is, I think, another discussion. The PR is about ensuring consistency in plural forms and providing a way to modify them locally via LOCALE_ROOT.
changes in plural forms
- Modify the catalog merging policy to "no-merge" if plural forms differ in translations
- Add new LOCALE_ROOT setting
- Add new --comply-plural-forms option to makemessages
- Add new --collect-base-catalogs option to makemessages
- Fix inconsistencies with plural forms to comply with the new merging policy
@felixxm I can't reproduce the failure of timezones.tests.SerializationTests in my setup (sqlite3, python3.8), though they were passing in Jenkins some hours ago (see the report on the first push)
I'm not sure if you should invest more time in that direction. I'm personally not convinced; you should find support from other team members first.
Hi @math-a3k (@claudep) Thanks for the effort here.
I'm trying to get into the position where I can sensibly review this. I'm lacking the required domain knowledge. My issue is that I'm not at all sure whether the proposed solution (even if it works as expected) is correct? From the discussion it's hard to see that there's agreement that, yes, this is the way forward. I can spend more time, but it would be good to have an idea if we were on the right track.
At the very least, I think there must be some docs, release notes, or migration notes missing here. It's not at all clear what the new LOCALE_ROOT setting is for (or, more to the point, why it's needed); the same goes for the new options to makemessages.
Is there any way of breaking this down into smaller separate commits that can be reviewed individually? I'm struggling to see the essence of the fix here...
@math-a3k Can you rebase on the latest master? (Tests pass locally so I'm wondering if the DST issue was unrelated.)
W005 = Warning(
    'Inconsistent plural forms across catalogs for language {!r} (unmerged catalogs).',
    id='translation.W005',
Can you add a hint to tell me what I should do about this?
That's the warning at the system check level. You will see it when the checks are run and you have catalogs with inconsistent plural forms.
The system-check-level warning is needed to inform the user about inconsistencies in all the supported languages (set by the LANGUAGES setting). Because of how i18n works in Django, every translation is lazily loaded and cached, so only the default language is loaded at startup; the rest are loaded when something requests them.
The system check resets the translation cache and triggers the loading of all the supported languages to see if there are any run-time warnings (this is also needed when adding them "on-the-fly"). After that, it resets the cache again and leaves the situation as it was before.
Run-time warnings are raised when merging the catalogs, in django.utils.translation.trans_real.
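The comparison behind such a warning can be sketched outside Django as follows. This is not the PR's code and not Django's API: `normalize_plural_forms` and `inconsistent_languages` are hypothetical helpers, and the input is assumed to be the Plural-Forms header values already read from every catalog of each language.

```python
import re

# Matches "nplurals=N; plural=EXPR" with an optional trailing semicolon.
PLURAL_RE = re.compile(r"nplurals\s*=\s*(\d+)\s*;\s*plural\s*=\s*(.+?);?\s*$")


def normalize_plural_forms(header):
    """Return (nplurals, expression) with all whitespace removed from the expression."""
    match = PLURAL_RE.search(header.strip())
    if match is None:
        raise ValueError("not a Plural-Forms value: %r" % header)
    return int(match.group(1)), re.sub(r"\s+", "", match.group(2))


def inconsistent_languages(catalogs):
    """List the languages whose catalogs do not all share one plural form.

    `catalogs` maps a language code to the Plural-Forms values of its catalogs.
    """
    flagged = []
    for language, headers in sorted(catalogs.items()):
        forms = {normalize_plural_forms(header) for header in headers}
        if len(forms) > 1:
            flagged.append(language)
    return flagged


example = {
    "cs": [
        "nplurals=3; plural=(n == 1) ? 0 : ((n >= 2 && n <= 4) ? 1 : 2)",
        "nplurals=4; plural=(n == 1 && n % 1 == 0) ? 0 : (n >= 2 && n <= 4 && n % 1 == 0) ? 1: (n % 1 != 0 ) ? 2 : 3;",
    ],
    "de": [
        "nplurals=2; plural=(n != 1);",
        "nplurals=2; plural=( n != 1 )",
    ],
}
flagged = inconsistent_languages(example)
print(flagged)
```

The comparison here is purely textual after whitespace normalization, so two formulas that are logically equivalent but written differently would still be flagged; that is a simplification of this sketch.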
msgid "Jump to namespace"
msgstr ""

#: contrib/admindocs/templates/admin_doc/view_index.html:27
Why are the comments all removed?
It's the style of the .po files distributed with Django: almost all catalogs were generated with the "--no-location" option of makemessages. This one wasn't, and since its plural forms were updated and the file was being updated anyway, I removed them to make it like the others. Those comments record the places where each string was extracted; they don't affect the translations, so there is no need to distribute them.
:)
The release notes are missing indeed. Nothing here involves migrations. The LOCALE_ROOT setting is for keeping the translation files in your project tree, customizing them to your needs, and maintaining them in your own release cycle. It addresses situations like the one described by @nijel in his comment. If you wish to use the gettext plural forms for Czech with the current code base, you will have to:
This case is fairly simple, as it reduces the number of plurals, so there is no need to worry about missing translations. If you want to increase the number of plurals, you would have to modify all the bundled catalogs in the contrib apps to fill the gaps (addressed in the PR by "makemessages --comply-plural-forms"). This also won't persist across updates: the next time Django is updated, it has to be done again. With LOCALE_ROOT, you make your translation files independent of the Django release cycle. After you define it, you would run "makemessages --collect-base-catalogs --locale cs" and you will have all the bundled translations locally. You modify the catalog (the plural forms) and compile it, and the next time Django is updated, nothing breaks. You may only have to run the command again to collect any new messages in Django. You may also use the "makemessages --comply-plural-forms" option to align your catalogs with the new forms.
No, sadly I squashed them all, as that was the way of presenting PRs stated in the docs. But if you go to the "Files" tab of this PR on GitHub, there is a file filter, and you can show only the .py and .txt (doc) files. There are 8 .py files, 4 of which are tests. The code of the fix is in:
Hi @math-a3k. Thanks for the reply. I understand how to view particular files, but that's not my point. Rather, there's a large change proposed here in the way I might handle translation files. I'm inclined to say it justifies the more formal approach of a DEP, with the discussion that entails. The proposal here may be the way to go, but it's not brought out sufficiently in the docs changes, and (separately) it's not clear that there is agreement that this is the best response to the original issue, which was changes coming in from Transifex... I was hoping there was a smaller set of changes available that we might be able to look at independently.
:)
I understand, but I don't see the change as that large. If the merging policy is set to "merge", then there won't be any modifications in how Django handles translation files (besides the LOCALE_ROOT setting). Under that policy, if the warning is raised, all the catalogs bundled with Django will still need to be aligned; if not, that warning will make tests fail and users will have to fix it at a package level. This is where the tools for fixing it come in (--comply-plural-forms): a change that addresses an unequivocally bad situation (no one wants broken translations because of plural form inconsistencies) and that requires user intervention should come with a tool so that users can conveniently handle it. If no tools were provided, IMO, users would be left on their own. Most of the code in the PR is tooling that makes it easy for users to handle the change. Only a subset of users will be affected by this change, and all those users will have to do is run "--comply-plural-forms" once. This is why I don't see it as a "large change in the way of handling translation files", yet it will have a great positive impact on users who are affected by this issue. LOCALE_ROOT is a proper way of providing plural form customization (also discussed in the issue), and "--collect-base-catalogs" is the tool that makes it convenient.
I will give the docs another iteration, for the sake of completeness. I was waiting for others to weigh in, but as that is not happening, I will continue.
The problem that I see is that people who disagree don't state their reasons, so there is nothing I can do besides having an "intuition" about the disagreement.
What I can do, after the docs iteration, is create another branch and separate the code into commits (runtime warning, system checks, settings, tools, .po and .mo files). Will that be OK for you?
Hi, we use Django 2.1 in Speedy Net and this issue has prevented us from upgrading to Django 2.2. I'm trying to understand this PR, but it seems quite complicated to me. Will we be able to keep using 2 plural forms in Hebrew for our translations if this PR is accepted? Or will we have to convert our translations to 4 plural forms, as used in Django 2.2? For me it makes sense to use 2 plural forms (n==1, n!=1) because we don't need any complicated translations. I checked Django's translations, and in almost all (maybe 100%) of the Hebrew translations, 3 of the plural forms are identical. Do we have to convert our translations to 4 plural forms too? And if we want to keep using 2 plural forms in Hebrew, how do we do it while still using the latest version of Django? You can see my comments on #30439 and on the Django developers mailing list.
451c823 to 1d03bea
@carltongibson Here is the branch with the code separated into commits: #12357
Hi @math-a3k. I'm going to close this in favour of #12332 at this stage. There's a lot going on here that's beyond the scope of a narrow bugfix. Happy if you want to re-propose bits, in separate changes, if you think there's value. (For the suggestion of managing translations independently of Django, I suggest discussing on the list again, in a fresh thread, as it's a significant change from what we do now.) Thanks.