
danielrainer · 2025-10-10T22:37:02Z

Introduces Fluent localization. Refer to the commit messages for details.

TODOs:

Replace gettext calls by new API, at least for a few instances so we can see it work in action.
naming convention for message IDs
coordination for porting messages. If people modify PO files while we port messages to Fluent, we would have to redo some porting effort manually. To avoid this, we should either coordinate with translators to ensure no major changes happen to PO files while we port messages to Fluent, or we should add tooling which records translations and is then able to apply them to PO files to automatically port the new changes to Fluent. In total, we have a bit over 500 messages in the Rust sources, so porting them all is not trivial, but given decent tooling it should not take that long either.
How much ID sharing do we want? Fluent gives us the opportunity to define different IDs even if the English versions of the localized strings are identical. This would allow translators to have more possibilities to use correct grammar. I think this only really affects messages which have placeholders, plus messages which have the same English value, but non-identical semantics of the value, e.g. if the same word is used as a noun in one place and as a verb in another. One concrete example is the version string. Currently, the commit fluent: add first message ports the version string to Fluent, but only for the fish executable, not fish_indent, nor fish_key_reader. Should all three use the same ID, or do we want a different ID for each?

danielrainer · 2025-10-15T23:20:06Z

The new commits run cargo run --package fish-fluent-check in check.sh and CI. I added it as separate jobs because running cargo in a test started from the test driver is not ideal, for two reasons. First, cargo can use multiple threads, which might result in more timeouts in CI. Second, cargo prints to stderr, which is not ideal for checks, especially since the checks fail by panicking, which also prints to stderr. We could work around the latter issue by redirecting the output of the cargo command, checking cargo's exit status, printing the redirected output on error, and deleting the file we redirected to afterwards. (The last step might be handled by the test driver if we put the file into the test's temporary HOME.) Then, we should also exit with cargo's exit status, although the test driver doesn't care about that at the moment.
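The workaround described above could look roughly like the following sketch. It is not code from this PR: the function name `run_quietly` is made up for illustration, and it captures the output in memory via `Command::output` rather than redirecting to a file, so there is no temporary file to delete afterwards.

```rust
use std::io::{self, Write};
use std::process::{Command, ExitStatus};

/// Runs `program` with `args`, capturing stdout and stderr instead of
/// letting them reach the terminal. The captured output is replayed on
/// stderr only if the command fails. (Hypothetical helper, not from the PR.)
fn run_quietly(program: &str, args: &[&str]) -> io::Result<ExitStatus> {
    let output = Command::new(program).args(args).output()?;
    if !output.status.success() {
        let stderr = io::stderr();
        let mut handle = stderr.lock();
        handle.write_all(&output.stdout)?;
        handle.write_all(&output.stderr)?;
    }
    // Hand the status back so the caller can exit with the same code.
    Ok(output.status)
}
```

A caller wanting to exit with cargo's status could pass `status.code().unwrap_or(1)` to `std::process::exit`.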

krobelus · 2025-10-16T11:17:23Z

The new commits run `cargo run --package fish-fluent-check` in

Haven't looked into the code but any reason why we can't use "cargo test"? (I guess we'd need to declare the *.ftl files as inputs but we already do something similar for po/ in fish-gettext-maps)

I wonder if it's realistic to (long-term) move away from test_driver.py completely and do everything with `cargo test` (not sure if that's idiomatic.. handling subprocesses without a shell can be tricky).

I suppose we could use test_driver easier if we separated the "build" from the "run" step.

BTW we could also write an xtask as a common interface for running tests

`cargo test-fish tests/checks/abbr.fish`
`cargo test-fish tests/pexpects/bind.py`
`cargo test-fluent`

Just an idea, for discoverability.

`check.sh` and CI.

It probably wouldn't hurt to start sharing logic between the two. But of course that's unrelated.

danielrainer · 2025-10-16T15:11:29Z

any reason why we can't use "cargo test" ?

That's a good idea. I added a test which just runs main in crates/fluent-check/src/main.rs. Since main indicates errors via panics, no additional logic is needed. For check.sh, we can still pass in an env var indicating where to look for extracted Fluent IDs, so the test does not need to recompile fish. In CI, no such mechanism exists for now, so there the test will recompile fish, but that also happened in the previous implementation, the difference being that before it only happened once in a dedicated job, whereas now it happens in every job which runs the tests.

I wonder if it's realistic to (long-term) move away from test_driver.py completely and do everything with cargo test

I think replacing test_driver.py with Rust code should be doable without too much effort, at least if it continues to be a dedicated program, instead of separate cargo tests for every script file. The harder part would be replacing littlecheck and pexpect. (I haven't looked into the latter at all.) From memory, I think the main challenges with having a cargo test per test script would be:

sharing compilation of the test helper
setting up parametrization which automatically creates a test case for every relevant script file

we could also write an xtask as a common interface for running tests

I'm generally in favor of replacing (or at least wrapping) the various shell scripts we have with xtasks, to have a unified interface for running everything. Regarding the interface, I'm not sure if it's better to have several different cargo aliases (e.g. cargo test-fish), or use cargo xtask for everything and add subcommands to that as desired. I chose the latter approach for ensuring that the fish version env var is set correctly for every cargo invocation, but I think it makes sense in general. E.g., that would make it easy to have cargo xtask help, which would be difficult to implement with multiple cargo aliases. It also reduces the likelihood of choosing an alias which might clash with a future built-in cargo command.

danielrainer · 2025-10-16T21:44:11Z

I changed unic-langid to 0.9.5, since 0.9.6 requires Rust 1.82+. While it might not make much of a difference for the concrete features in this specific instance, not updating our MSRV makes it increasingly painful to manage dependencies. It limits our ability to update existing dependencies and we miss out on many improvements made in more recent Rust versions. I'd really appreciate it if we don't wait until we have a concrete, urgent need to update some dependency which would then force us into a rushed MSRV update. Instead, we should finally come up with a sensible policy of updating our MSRV that's not just "we'll stick with 1.70 indefinitely". See #11679

krobelus · 2025-10-16T21:56:18Z

I changed `unic-langid` to `0.9.5`, since `0.9.6` requires Rust 1.82+.

yeah let's update to Debian Stable's 1.85 (and see if someone complains). AFAIK this means we still support "macOS 10.12 Sierra (First released 2016)".

danielrainer · 2025-10-16T22:01:23Z

yeah let's update to Debian Stable's 1.85 (and see if someone complains).

Sounds good. That version also allows us to migrate to the 2024 Rust edition if we want that. I just checked some of the other distros, and it seems that Fedora keeps up with stable on all releases and Ubuntu has Rust 1.85 as the default in 25.10. Should I open a PR?

krobelus · 2025-10-16T22:46:34Z


sure, both upgrading to 1.85 and 2024 edition sounds good.

This means we can get rid of a bunch of lints with the `// for old clippy` comment and things like `// TODO: if-let-chains`.

Here's our poor man's renovatebot: #11960

danielrainer · 2025-10-16T22:54:50Z

both upgrading to 1.85 and 2024 edition sounds good

#11961

Changing the edition is not that straightforward and might require a fairly large commit, including manual updates, so for now I'll just address the things which have become available in Rust 2021 with the MSRV update.

faho · 2025-10-20T09:29:42Z

Parts of fluent are licensed apache2-only, which is incompatible with fish's GPLv2, so this is, as far as I can tell, legally unmergeable as-is.

cargo deny check licenses would catch that.

danielrainer · 2025-10-20T14:03:45Z

The code we use is from https://github.com/projectfluent/fluent-rs, which includes both an apache2 and a MIT license file. I'm no expert on the legal situation here, but it seems to me that using the software under the terms of the MIT License is allowed and that this license does not require adding copyright information to our binaries. AFAICT, we don't include copyright info for any of our dependencies, only for software where the fish repo itself contains code derived from that software.

danielrainer · 2025-10-20T14:05:18Z

Relevant issue: projectfluent/fluent-rs#31

faho · 2025-10-20T14:06:36Z

It's not about including copyright information, it's that some of the dependencies for fluent-rs are still apache2-only.

Apache2 is incompatible with GPLv2 because IIRC it includes a patent grant and the GPLv2 has a "no further restrictions" clause. So the combined product has a license that can't be followed.

danielrainer · 2025-10-20T14:08:54Z

If that's the case, why can fluent-rs be MIT-licensed, but we can't use it under that license?

faho · 2025-10-20T14:12:06Z

Because the MIT license and the Apache license don't conflict (neither of them is "viral" the way the GPL is). The GPLv2 and the Apache license do, though.

We can be using the fluent-rs crate under the MIT, but we can't be using some of its dependencies.

Edit: The offending dependencies are:

danielrainer · 2025-10-20T14:30:03Z

So MIT-licensed projects can use fluent-rs and its dependencies, including the apache-licensed ones, but we can't because fish is GPLv2 licensed?

Should we ask the two apache-licensed projects about making their projects available under a license that allows us to use their software in fish? fluent-langneg at least seems to be mostly written by people who also contribute to fluent-rs, so I would be surprised if they would object to dual-licensing. self_cell seems to be a fairly small project, both in terms of contributors and code size. If they are unwilling to use a compatible license, we could ask fluent-rs whether they would consider replacing the dependency.

Most of fluent-rs is already dual-licensed. This crate is not, which can make it harder for GPL-2-only projects to use fluent-rs. Fix that by allowing use under the MIT license. All (or almost all?) nontrivial contributions seem to be from the same author, so this should be easy?

Ref: projectfluent/fluent-rs#34
Ref: fish-shell/fish-shell#11928
Closes projectfluent#30

emilio · 2025-10-22T10:58:47Z

Isn't self_cell Apache OR GPLv2? If so wouldn't it be compatible?

Multiple gettext-extraction proc macro instances can run at the same time due to Rust's compilation model. In the previous implementation, where every instance appended to the same file, this has resulted in corruption of the file. This was reported and discussed in #11928 (comment) for the equivalent macro for Fluent message ID extraction. The underlying problem is the same. The best way we have found to avoid such a race condition is to write each entry to a new file, and concatenate them together before using them. It's not a beautiful approach, but it should be fairly robust and portable.

Closes #12125
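To illustrate the one-file-per-entry approach, here is a minimal sketch using only the standard library. The function names and the file naming scheme (process ID plus a counter) are assumptions made for this example and may differ from what the extraction macros actually do.

```rust
use std::fs::{self, OpenOptions};
use std::io::{self, Write};
use std::path::Path;
use std::sync::atomic::{AtomicU64, Ordering};

// Distinguishes entries written from within the same process.
static COUNTER: AtomicU64 = AtomicU64::new(0);

/// Writes one entry to its own file in `dir`. `create_new` fails if the
/// file already exists, so two writers can never share a file.
/// (Hypothetical naming scheme: "<pid>-<counter>".)
fn write_entry(dir: &Path, entry: &str) -> io::Result<()> {
    loop {
        let n = COUNTER.fetch_add(1, Ordering::Relaxed);
        let path = dir.join(format!("{}-{}", std::process::id(), n));
        match OpenOptions::new().write(true).create_new(true).open(&path) {
            Ok(mut file) => return file.write_all(entry.as_bytes()),
            // Name already taken (e.g. stale file): try the next counter value.
            Err(e) if e.kind() == io::ErrorKind::AlreadyExists => continue,
            Err(e) => return Err(e),
        }
    }
}

/// Concatenates all entry files in `dir`, sorted by file name so the
/// result does not depend on directory iteration order.
fn concat_entries(dir: &Path) -> io::Result<String> {
    let mut paths = fs::read_dir(dir)?
        .map(|entry| entry.map(|e| e.path()))
        .collect::<io::Result<Vec<_>>>()?;
    paths.sort();
    let mut combined = String::new();
    for path in paths {
        combined.push_str(&fs::read_to_string(path)?);
    }
    Ok(combined)
}
```

Since no file is ever opened by more than one writer, no locking is needed. The cost is many small files, which is why they are concatenated before use.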

danielrainer · 2025-12-25T23:04:35Z

Rebased on latest master and reworked a bit. See PR description for outstanding work.

Asan does not seem to like the intentional leaks for creating &'static strs. A similar approach is already in use for gettext, and I remember getting leak warnings previously when refactoring the gettext code as well. Not sure how to best address these failures.
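For readers unfamiliar with the pattern, this is one common way to obtain a `&'static str` by leaking on purpose. It is a generic sketch with a made-up function name, not necessarily the exact code used in fish.

```rust
/// Converts an owned `String` into a `&'static str` by leaking its
/// allocation. The memory is never freed, which is the kind of
/// allocation a leak detector may report.
fn leak_str(s: String) -> &'static str {
    Box::leak(s.into_boxed_str())
}
```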

krobelus · 2025-12-26T09:49:49Z

Maybe the first step is to resurrect docker/jammy-asan.Dockerfile (ideally making it work with moving ubuntu versions). Assuming this is more convenient than github actions.
If we're reasonably sure it's an asan bug, we can maybe suppress allocations from the relevant functions via build_tools/lsan_suppressions.txt

danielrainer · 2026-01-08T18:47:33Z

Updates:

The detected leak is ignored as a false positive now.
ToFluentValue trait added, which allows using wide-string types and chars with the Fluent macros without having to add a to_string() call. For char, it might make sense to add an upstream implementation (impl From<char> for FluentValue)
fluent_ids! macro added for declaring Fluent IDs to be used in multiple places. Similar to the localizable_consts! macro we have for gettext.
Some commits were squashed together.
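The idea behind a conversion trait like ToFluentValue can be sketched without the `fluent` crate. In the following example, `Value` is a simplified stand-in for `FluentValue`, and the `ToValue` trait and its impls are hypothetical. They only show how such a trait removes the need for `to_string()` at call sites and are not the definitions from this PR.

```rust
use std::borrow::Cow;

/// Simplified, hypothetical stand-in for the `fluent` crate's `FluentValue`.
#[derive(Debug, PartialEq)]
enum Value<'a> {
    Text(Cow<'a, str>),
    Number(i64),
}

/// Hypothetical conversion trait, so call sites can pass values directly.
trait ToValue {
    fn to_value(&self) -> Value<'_>;
}

impl ToValue for str {
    // Strings can be borrowed without allocating.
    fn to_value(&self) -> Value<'_> {
        Value::Text(Cow::Borrowed(self))
    }
}

impl ToValue for char {
    // A char has no string representation to borrow, so allocate one.
    fn to_value(&self) -> Value<'_> {
        Value::Text(Cow::Owned(self.to_string()))
    }
}

impl ToValue for i64 {
    fn to_value(&self) -> Value<'_> {
        Value::Number(*self)
    }
}
```

With impls like these, a macro can call `to_value()` on each argument, and the caller writes `'x'` instead of `'x'.to_string()`.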

danielrainer · 2026-01-16T17:28:52Z

Rebased on latest master. The only change to this PR is that unnecessary allocations for FTL file name strings have been removed. The lsan suppression is still necessary.

The extracted function takes the parts which are used by gettext-extract, as well as the upcoming fluent-extract, and puts it into its own crate. This will allow having simpler proc macros for both localization systems, since it minimizes duplicated code.

Add an implementation that allows using Fluent for localization in Rust. Fluent is significantly more expressive than gettext. It uses message IDs which, unlike in gettext, are not necessarily the default message string. This allows for proper support of messages which happen to be identical in English, but not in other languages. In gettext, this could be solved to some extent with contexts, but our gettext implementation does not support that. In Fluent, arguments to the message are specified as key-value pairs, which gives translators more semantic information and allows reordering the arguments in the translation, which is impossible with gettext. Fluent also allows for more complex grammatical features, such as different plural forms, grammatical cases, and adapting phrases to the correct gender.

This commit only introduces the infrastructure for using Fluent instead of gettext, with the goal of eventually replacing gettext for localization in Rust. Making use of the new infrastructure is left to follow-up commits.

To localize a message with Fluent, the new `localize!` macro should be used. Its first argument is a Fluent message ID. This can either be a string literal, or a constant defined via the `fluent_ids!` macro. The remaining arguments are key-value pairs, with the keys being Fluent argument/variable identifiers, and the values their corresponding values in the localization. Instead of passing individual key-value pairs, it is also possible to pass a reference to a single `FluentArgs` struct, which is defined in the `fluent` crate. This might be useful if repeated invocations of `localize!` with similar variable values are desired. The following example demonstrates the syntax:

`localize!("some-id", string_arg = "a string", number = 42)`

The result will be a `String`, formatted according to the rules in the relevant FTL file. On errors, this macro panics.

At runtime, Fluent will look up the message ID in a Fluent Translation List (FTL) file, according to the user's language settings. These files are stored in `localization/fluent`. There is one file per language. Language selection works the same as for our gettext implementation.

Because the source code does not contain a default version of the message, it is the developer's responsibility to add an entry for the message to the `en.ftl` file. Otherwise, localizing the message would fail at runtime. To prevent this, automated checks are added which extract the Fluent IDs defined in the source code and compare them to the ones defined in `en.ftl`. It is considered an error if these two sets of IDs are not identical. Checking this at build time allows us to rely on always having the message available in English.

Similar to gettext msgid extraction, there is a proc-macro defined in `fluent-extraction` which extracts message IDs into a directory specified via the `FISH_FLUENT_ID_DIR` environment variable if the `fluent-extract` feature is active. To avoid recompilation, `build_tools/check.sh` caches the extracted IDs. In CI, no corresponding caching mechanism exists, so there the test checking the IDs will invoke Cargo to build fish, extracting the IDs. The `fish-fluent-check` crate performs these ID checks.

`rust-embed` is used to make the FTL files available to the binary at runtime. Files will only be parsed if they are specified in the language precedence list, so users don't have to pay the parsing cost for languages they don't want to see. `en.ftl` is always parsed, since it is our implicit last fallback option.

Because the Fluent ecosystem currently lacks some tooling, we use our own. It is implemented in an external library crate (currently hosted as a personal repo), and made available via `cargo xtask fluent` subcommands. The currently supported commands are:

- `check`: Checks the FTL files, ensuring that they can be parsed without errors, that no duplicate IDs are specified, that they are formatted correctly, and that there are no extra IDs, i.e. IDs not present in `en.ftl`, which is expected to be complete. More rigorous checks could be added, such as checking whether the same set of variables is used for a certain ID in all languages. The complexity of Fluent's syntax makes this non-trivial, which is the reason it's not already implemented.
- `format`: Formats the specified FTL files (or all by default). Also has a mode suitable for editor integration to format files from the editor. Examples for setting that up in Vim are provided in the `CONTRIBUTING.rst` docs.
- `rename`: Renames IDs or associated variables across all FTL files.
- `show-missing`: Shows which IDs don't have a translation yet.

The external crate contains one additional tool for converting messages from gettext to Fluent. This is intentionally not added to fish, since it is only useful for the transition. Once we have ported all messages to Fluent we won't have a use for it anymore. If you are interested in using it to port messages, it's the `po-convert` binary in the `fluent-ftl-tools` package. The CLI is somewhat convoluted, but can be simplified by wrapping it with a script which hard-codes the path to the relevant PO and FTL file directories. Then, the remaining information which needs to be specified is:

- a line number in a PO file to identify the message to be ported
- the new message ID
- the name of each variable, in the order the formatting specifiers appear in the gettext msgid.

Specifying the line number and invoking the wrapper script can be partially automated by using a custom editor shortcut. The tool will port the msgstr for each language which has one defined, and always for English, where it can use the msgid if no msgstr exists. The tool does not edit Rust code, but suggests a Rust code snippet on stdout based on the specified message ID and variable names.

This tooling relies on features of the `fluent` package which are not exported by default, so we use a fork which changes that until our PR for adding it upstream is accepted.
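The ID check described in this commit message boils down to comparing two sets. The following sketch shows that comparison with the standard library only. The function names are hypothetical, and the FTL parsing is deliberately naive (a real check would use a proper Fluent parser), so treat it as an illustration rather than what `fish-fluent-check` does.

```rust
use std::collections::BTreeSet;

/// Collects message IDs from FTL source with a deliberately naive rule:
/// an unindented line that starts with an identifier followed by `=`
/// defines a message. Comments, terms, and attributes are skipped.
fn ftl_message_ids(ftl: &str) -> BTreeSet<String> {
    ftl.lines()
        .filter_map(|line| {
            let (id, _) = line.split_once('=')?;
            let id = id.trim_end();
            let mut chars = id.chars();
            let first = chars.next()?;
            let valid = first.is_ascii_alphabetic()
                && chars.all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_');
            valid.then(|| id.to_owned())
        })
        .collect()
}

/// Returns the IDs used in the source but missing from `en.ftl`,
/// followed by the IDs defined in `en.ftl` but unused in the source.
fn compare_ids(
    source: &BTreeSet<String>,
    en_ftl: &BTreeSet<String>,
) -> (Vec<String>, Vec<String>) {
    let missing = source.difference(en_ftl).cloned().collect();
    let unused = en_ftl.difference(source).cloned().collect();
    (missing, unused)
}
```

The check passes only if both returned lists are empty, matching the requirement that the two sets of IDs be identical.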

This migrates the fish version info message from gettext to Fluent. It can be used to see Fluent-based localization in action. Because this commit adds new FTL files, these languages show up in the Fluent language precedence, requiring an update to the corresponding tests.

Reword zh_CN as suggested in fish-shell#11833 (comment)

fish -c 'for LC_MESSAGES in fr zh_CN zh_TW; argparse h-; end'

danielrainer · 2026-02-28T02:45:30Z

This is a fairly significant update. Tooling is now integrated and documented in CONTRIBUTING.rst. I also ported some more messages and removed features which turned out to not be particularly useful, such as macros wrapping localize!. We might also want to drop support for passing &FluentArgs to localize!. I'm not sure yet if we have a use case for it.

I think now would be a good time for both developers and translators to check out the PR and test the functionality relevant to them. Reviewing the updated "Contributing Translations" section in CONTRIBUTING.rst should be a good start. Then, working with the relevant subcommands of cargo xtask fluent should provide an impression of the capabilities of the existing tooling.

krobelus mentioned this pull request Oct 13, 2025

Discussion about removing fuzzy translations #11940

Open

danielrainer force-pushed the fluent_localization branch from f8ca253 to fb6d7ad Compare October 15, 2025 23:10

danielrainer force-pushed the fluent_localization branch from fb6d7ad to 4346737 Compare October 16, 2025 14:38

danielrainer force-pushed the fluent_localization branch 2 times, most recently from 11d1099 to 2e9b78a Compare October 16, 2025 21:34

danielrainer mentioned this pull request Oct 18, 2025

mktemp does not exist on some edge devices #11972

Open

danielrainer force-pushed the fluent_localization branch from 2e9b78a to 1f55024 Compare October 18, 2025 17:51

danielrainer force-pushed the fluent_localization branch from 1f55024 to 00de838 Compare October 20, 2025 15:04

This was referenced Oct 21, 2025

GPLv2 compatible license projectfluent/fluent-langneg-rs#30

Closed

GPLv2 compatible license Voultapher/self_cell#69

Closed

krobelus mentioned this pull request Oct 21, 2025

Dual-license under Apache 2.0 and MIT projectfluent/fluent-langneg-rs#31

Merged

4 tasks

danielrainer force-pushed the fluent_localization branch from cd43fd7 to 0966c7c Compare December 25, 2025 22:42

danielrainer force-pushed the fluent_localization branch 3 times, most recently from cd13c9f to fbe929d Compare January 5, 2026 15:15

danielrainer mentioned this pull request Jan 5, 2026

check: add sanitize option #12282

Open

1 task

danielrainer force-pushed the fluent_localization branch 3 times, most recently from bd00c0c to 13f9192 Compare January 8, 2026 18:40

danielrainer force-pushed the fluent_localization branch from 13f9192 to d7e5239 Compare January 16, 2026 17:26

danielrainer force-pushed the fluent_localization branch from d7e5239 to 368f3ed Compare January 18, 2026 20:07

danielrainer force-pushed the fluent_localization branch from 368f3ed to 51b41a0 Compare February 5, 2026 23:25

danielrainer force-pushed the fluent_localization branch from 51b41a0 to fdba4c3 Compare February 28, 2026 02:22

Daniel Rainer and others added 5 commits February 28, 2026 03:32

l10n: add system tests for Fluent

eaa50ca

l10n: port an argparse message from gettext to Fluent

8effcf5


l10n: port more argparse messages to Fluent

f056b74

danielrainer force-pushed the fluent_localization branch from fdba4c3 to f056b74 Compare February 28, 2026 02:37
