Localisation of more strings #2902
Babel does not support f-strings, which prevents localising a few strings in the code, like "Usage: " and "Try". The changes in this commit rewrite f-strings into the format syntax. Please note some f-strings remain, as they are not expected to be translatable. This commit has to deactivate the UP032 rule on a few files. There seems to be an unreported bug in either Ruff or pyupgrade: the UP032 rule is not deactivated on multi-line commands, including multi-line strings.
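To illustrate why the rewrite matters at runtime as well as at extraction time, here is a minimal sketch. `DictTranslations` and the French template are hypothetical stand-ins for a compiled catalogue; they are not part of Click or Babel.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalogue, standing in for a compiled .mo file."""

    def __init__(self, catalog):
        super().__init__()
        self._messages = catalog

    def gettext(self, message):
        # Fall back to the original message when no translation exists.
        return self._messages.get(message, message)


# Assumed French translation of a "Usage" template; not taken from Click.
_ = DictTranslations({"Usage: {prog}": "Utilisation : {prog}"}).gettext

prog = "artefacts"

# Translatable: the literal template is the lookup key, formatting happens after.
translated = _("Usage: {prog}").format(prog=prog)

# Not translatable: the f-string is interpolated first, so the lookup key
# becomes "Usage: artefacts", which no catalogue contains.
untranslated = _(f"Usage: {prog}")

print(translated)
print(untranslated)
```

With the format syntax the msgid stays a constant string that a catalogue can contain; with the f-string the msgid changes with every value of `prog`.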
Please note Windows may require extra configuration (as it may not set the variables expected by gettext). Click does not perform this extra configuration for some reason, and doing so is deemed out of scope.
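As a rough sketch of what such extra configuration could look like in an application (not in Click): Python's `gettext` consults the `LANGUAGE`, `LC_ALL`, `LC_MESSAGES` and `LANG` environment variables, in that order, to pick a language. The helper name `ensure_gettext_language` and the way the language code is obtained are assumptions for illustration.

```python
GETTEXT_ENV_VARS = ("LANGUAGE", "LC_ALL", "LC_MESSAGES", "LANG")


def ensure_gettext_language(environ, detected_language):
    """Set LANGUAGE only when no gettext variable is already present.

    `environ` is a mutable mapping such as os.environ. `detected_language`
    is a POSIX-style code such as "fr_FR"; how an application detects it on
    Windows is left open here (hypothetical input).
    Returns True when the mapping was modified.
    """
    if any(environ.get(name) for name in GETTEXT_ENV_VARS):
        return False  # the user or the OS already made a choice
    if not detected_language:
        return False  # nothing sensible to set
    environ["LANGUAGE"] = detected_language
    return True


# Usage with plain dictionaries standing in for os.environ.
bare_env = {}
changed = ensure_gettext_language(bare_env, "fr_FR")

configured_env = {"LANG": "de_DE.UTF-8"}
unchanged = ensure_gettext_language(configured_env, "fr_FR")
```

In a real program the first argument would be `os.environ`, and the call would need to happen before the first translated message is looked up.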
Some f-strings changed to the format method in earlier commits do not need localisation. This commit restores them to reduce the number of changes.
|
I don't think we are willing to limit the use of f-strings in the project at this time. Thoughts @davidism? |
| # always force f-strings. The latter are unfortunately not supported yet | ||
| # by Babel, a localisation library. | ||
| # | ||
| # Note: Using `# noqa: UP032` on lines has not worked, so a file |
None of these noqa marks are needed. Ruff is not trying to autoupgrade anything if I remove the comments, inline or file-level.
Updated from Ruff 0.8.1 to the latest 0.14.10 and, as you reported, UP032 does not trigger anymore.
All files are cleaned of the comments, as well as the noqa marks.
| f"It is not possible to add the group {cmd_name!r} to another" | ||
| f" group {base_command.name!r} that is in chain mode." | ||
| ) | ||
| message = _( |
We're only translating user-facing messages at this time, not developer-facing. Please remove all such translation markings.
I have done a pass on all new translations and removed the ones that looked like dev-facing. How does it look now? I am not sure about a couple.
|
For user-facing messages, yes we can use the The We're only translating user-facing messages at this time, not developer-facing. Please remove all such translation markings. |
| else "(DEPRECATED)" | ||
| else _("(DEPRECATED)") | ||
| ) | ||
| text = _("{text} {deprecated_message}").format( |
This shouldn't have ever been marked for translation. It should be removed, since you've added the translation to the proper location above.
|
Most of the changes here need to be rolled back, they are not part of the user-facing output. |
|
Hey thanks for the review and feedback. A quick note to let you know I expect to improve the work here soon, most likely in November. Sorry for anyone watching and expecting this sooner. |
Co-authored-by: Carmen Bianca BAKKER <carmen@carmenbianca.eu>
* After a round of code review from the team.
* Possibly incomplete, as based on the author's evaluation of what is for UX and what is for developers.
AndreasBackx left a comment
This looks fine now. Though I'd love for someone else to give this a final eye because I've never really touched too much gettext stuff.
|
For the record, and to contribute to the review: we have been using this branch for a year in our CLI without noticeable problems so far. The CLI is for robotics engineering: (if you try the package, the output will differ as we are working on it, but the translation has been in use for a year already) |
|
@ic how hard would it be to add a couple of unit tests? At least to prove and lock down that Click is properly translating strings? With unit tests we can force future contributors to be aware of this support. Other than that this PR looks good for inclusion into 8.4.0. |
|
@kdeldycke Thank you for the round of review.
|
No worries, take your time! :) There are a couple of other PRs lined up for 8.4.0 that are going to take more time to land, so we're good here! :)
Your attention to punctuation and separation of concerns is the right one. See my comment on the review discussion at: #2902 (comment) . My proposal is to simply reduce the diff to make the merge to upstream even easier.
It's better to keep this PR self-contained. Now that this PR has both your attention and mine, I will keep tracking it and helping you so it lands in the upcoming 8.4.0. If you need more time, no worries. Just tell us with a quick comment here. If you find the merge too hard to do, let me know. I can take over this PR to clean it up and prepare it for merging.
He is right that only user-facing strings should be translated. Re-reviewing the ones I pointed out, yes, they should be translated as these are user-facing strings in the help screen: they're about options or parameters that are marked as deprecated. |
|
@kdeldycke Thank you for the detail. Proceeding accordingly. A thought on unit testing: A couple generic tests on the |
|
I don't really think this needs to be unit tested. Seems like a lot of work just to get this in. Maybe we can do that in another talk, but I'm not sure what those tests would reveal anyway. |
What I was thinking about is a kind of test just doing what @ic is demonstrating in their comment #2902 (comment) . And have that test called So @ic, no need to go to a complicated route like you proposed with |
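For reference, a test along those lines could be sketched with the standard library alone. This assumes Click resolves messages through the module-level `gettext.gettext` and the default `messages` domain; `write_mo` and `lookup_with_global_gettext` are hypothetical helpers, and the French string is made up. A real test would call a Click command instead of `gettext.gettext` directly.

```python
import gettext
import os
import struct
import tempfile
from pathlib import Path


def write_mo(path, messages):
    """Write a minimal little-endian GNU .mo file (no plurals, no hash table)."""
    # The empty msgid carries catalogue metadata; declaring UTF-8 lets
    # non-ASCII translations decode correctly.
    entries = {"": "Content-Type: text/plain; charset=UTF-8\n", **messages}
    keys = sorted(entries)
    originals = [key.encode("utf-8") for key in keys]
    translated = [entries[key].encode("utf-8") for key in keys]

    count = len(keys)
    originals_table = 7 * 4  # right after the 7-word header
    translations_table = originals_table + 8 * count
    data_start = translations_table + 8 * count

    index, blob = [], b""
    for raw in originals + translated:
        index += [len(raw), data_start + len(blob)]  # (length, offset) pairs
        blob += raw + b"\0"

    header = struct.pack(
        "<7I", 0x950412DE, 0, count, originals_table, translations_table, 0, 0
    )
    path.write_bytes(header + struct.pack(f"<{len(index)}I", *index) + blob)


def lookup_with_global_gettext(msgid, messages, language="fr"):
    """Resolve msgid through gettext.gettext with a temporary catalogue."""
    previous = os.environ.get("LANGUAGE")
    with tempfile.TemporaryDirectory() as localedir:
        mo_file = Path(localedir) / language / "LC_MESSAGES" / "messages.mo"
        mo_file.parent.mkdir(parents=True)
        write_mo(mo_file, messages)
        # Note: the binding outlives this function; a real test suite would
        # isolate this global state (for instance with a fixture).
        gettext.bindtextdomain("messages", localedir)
        os.environ["LANGUAGE"] = language
        try:
            return gettext.gettext(msgid)
        finally:
            if previous is None:
                del os.environ["LANGUAGE"]
            else:
                os.environ["LANGUAGE"] = previous


result = lookup_with_global_gettext("Usage:", {"Usage:": "Utilisation :"})
missing = lookup_with_global_gettext("Try", {"Usage:": "Utilisation :"})
print(result, "|", missing)
```

The second lookup shows the fallback behaviour: a msgid absent from the catalogue comes back unchanged.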
|
Starting to proceed with the changes.
Note: All tests pass according to CI here. They also pass on a local Linux machine. But on an Intel Mac, I get a single error with Python 3.14t and 3.13 stress, and I also tried the random environment in Tox. It looks unrelated, but for reference: |
The code looks clear from the context.
After review from core members.
Tentative approach, but looks fine.
| prompt_suffix, | ||
| show_default, | ||
| "y/n" if default is None else ("Y/n" if default else "y/N"), | ||
| _("y/n") if default is None else (_("Y/n") if default else _("y/N")), |
Maybe it's too much but why not split this into:
{_("y")/_("n")}
{_("y").upper()/_("n")}
{_("y")/_("n").upper()}
The idea is to reuse the same _("y") and _("n") items you translated below? Or is this too much splitting?
Do people typically want Y/N in this type of CLI translated? E.g. do French speakers really want O/N instead, or German speakers J/N? (I can tell you that as a German speaker I would never want this!)
I personally agree with you and find that awkward. But it is user-facing, so in the name of applying a blanket translatable policy, let's allow this to be translated and changed. In the end this is about taste. So at least with this we allow CLI developers to choose whether to apply the translation or not. The capability exists and is consistent; the rest is developers' preference. 🤷
But does a CLI developer get to choose? If they enabled translations in general, they'd also get this one. Which is also kind of a breaking change because it changes the behavior of the CLI. I would make it opt-in somehow.
I just tested in a Docker container with Debian and a German translation how the rm -i ... prompt behaves: it accepts j and ja as yes, but y and yes also work. Likewise with French, where o and oui are accepted, but also the English defaults.
So I strongly suggest we do the same here and at least always accept the standard values on top of additional ones coming from translations.
Oh I see. Sorry, I did not get your point in your first comment. Thanks for checking other CLIs and illustrating them. And so yes, I agree with you on that front and we should accept both the translated _("y") and _("yes") as well as the original y and yes as valid answers in the code below.
Oh I got caught very lazy on this one. Thank you both for the feedback. I was actually wondering how people were working in a different language... In non-alphabet languages, it seems to me developers (a tiny fraction of a population, right?) work with and expect y/n.
Interestingly, I asked Qwen (an LLM developed in China) how it is preferred there. It replied that they expect y/n. The examples it reported:
是否继续?(Y/n):
确认安装?(y/N):
删除文件?(Y/n):
Same for Japanese. It goes as far as to list "best practices" like: "Never translate y/N or Y/n: Maintains cross-language consistency and avoids terminal width issues."
Ok let's see that point in a future PR. Thanks for your feedback on the tests, we will see that topic later in a future PR. I just have a last comment on the |
|
Experimented with translations on Another issue is default format. Languages without a notion of capitalisation do not match the

# src/click/termui.py
# in the confirm function
def confirm(...):
    yes_mark = _("y")
    no_mark = _("n")
    def_mark = (
        "" if default is None else (f" ({yes_mark})" if default else f" ({no_mark})")
    )
    prompt = _build_prompt(
        text, prompt_suffix, show_default, f"{yes_mark}/{no_mark}{def_mark}"
    )
    while True:
        try:
            value = _readline_prompt(visible_prompt_func, prompt, err).lower().strip()
        except (KeyboardInterrupt, EOFError):
            raise Abort() from None
        if value in ("y", "yes", _("y"), _("yes")):  # <= here accepting y and translations
            rv = True
        elif value in ("n", "no", _("n"), _("no")):  # <= here accepting n and translations
            rv = False
        ...

This would lead to output like (chose round brackets for readability): As much as I'd like full localisation, this looks regular enough but not great. If Click users are essentially developers, y/n really looks the way to go. An option for the rare users who want customisation would be best, though perhaps out of scope of this PR (stashed the above, but need a mechanism to switch modes). |
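For anyone picking this up in a later PR, the matching and default-marker logic of the stashed experiment can be isolated into pure functions, which makes the edge cases easier to test. This is only a sketch: `build_choices`, `parse_answer` and the German stub catalogue are hypothetical, and none of it was merged.

```python
def build_choices(translate, default):
    """Render the choices, marking the default in brackets instead of by case."""
    yes_mark, no_mark = translate("y"), translate("n")
    if default is None:
        marker = ""
    else:
        marker = f" ({yes_mark if default else no_mark})"
    return f"{yes_mark}/{no_mark}{marker}"


def parse_answer(value, translate, default=None):
    """Return True/False for a recognised answer, `default` for empty input.

    English answers are always accepted on top of the translated ones.
    Returns None when the input is not recognised (or empty with no default).
    """
    value = value.strip().lower()
    if value in {"y", "yes", translate("y"), translate("yes")}:
        return True
    if value in {"n", "no", translate("n"), translate("no")}:
        return False
    if value == "":
        return default
    return None


# Assumed German catalogue, used only to exercise the functions.
GERMAN = {"y": "j", "yes": "ja", "n": "n", "no": "nein"}


def de(message):
    return GERMAN.get(message, message)


print(build_choices(de, True))
print(parse_answer("Ja", de), parse_answer("yes", de), parse_answer("nein", de))
```

Keeping the English answers in the accepted set matches the `rm -i` behaviour reported earlier in this thread.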
|
@kdeldycke Conclusion on the geeking out here: I propose to roll back the y/n commit (0b75b01) for now, and freeze the PR scope. The thinking:
|
Oh yes. That's full of edge-cases. Let's keep things simple for now. You can revert that commit, and I will merge after that. Thanks a lot for your deep tests! |
This reverts commit 0b75b01.
Rollback completed. |
|
Thanks @ic for your patience and for your work! Merged upstream and will be part of Click 8.4. |
Problem and proposal
Using Click in a multi-lingual package, we would like to localise more strings than currently possible.
Before this PR, our output looks like:
The `artefacts` command from the package we develop, based on Click, translates our strings fine, as well as a range of strings in Click wrapped with `gettext`.
With this PR, we get:
Which looks like possibly full coverage of meaningful strings in Click.
Note this PR addresses two issues:
gettext
Limitation, side-effects and discussion
It looks like part of this PR may be desired, as localisation is already in place---this mainly covers more of the strings (I'd say all the strings, but localisation does not always make sense).
However, in the current approach I had to "deal" with f-strings and PyUpgrade, which may be an undesired change for the project. On top of that, it seems there is a bug in either Ruff or PyUpgrade in accepting ignore rules on multi-line commands. So a bunch of files get a file-global deactivation of the UP032 rule, which means these files will not get checked for f-strings.
F-strings are presumably desired, but given (1) PyBabel does not and may not support them, and (2) Click is library code, the project may (have to) accept the format method everywhere localisation is needed.
Recent discussions mention f-strings may work in PyBabel from Python 3.12. We did not confirm it, as we want to support down to the oldest supported Python 3. So the changes proposed here may be transient for a couple of years (after which the PyUpgrade rule may be reactivated).
Related work
This PR only aims at localising more strings, so related to i18n issues.
On the way to this PR, we have considered a couple of alternatives, notably trying to use the class API of Python's `gettext`. Some elements of discussion here may be useful to:
In fact, Carmen's post helped solve an issue in using catalogues from different domains at runtime (thanks!).
Checklist on CONTRIBUTING
.. versionchanged:: entries in any relevant code docs.
At submission time, nothing is checked here, as I would first like to make sure this PR's target is acceptable (likely not as-is). The tox-based checks all pass, though (i.e. running the `tox` command returns all green, except the skipped tests).
This PR supersedes #2890, because of a problem with GitHub.