

Automatic code formatting #5387

pekkaklarck opened this issue Mar 27, 2025 · 12 comments

@pekkaklarck
Member

We should take automatic code formatting into use. I don't personally need it that much, but it would certainly make it easier for contributors to format their code and that saves my time in reviewing PRs.

I experimented with Black years ago, but back then it made some changes that I really disliked and that caused a huge amount of noise in the history. The situation is better nowadays in that regard, and the amount of unnecessary (in my opinion) and annoying (again, my opinion) changes is pretty small. The Ruff formatter is also so fast that formatting is instantaneous even with our relatively big project. Overall, the benefits of auto-formatting are nowadays bigger than the problems even in my opinion.

Adding auto-formatting to an existing project basically has two parts:

  • Format the code, adapt it if needed, and commit it.
  • Automate formatting and/or linting so that it happens on CI and with PRs.

This issue concentrates fully on the first part. My plan is basically this:

  • Create a simple ruff.toml and commit it to the project root.
  • Run ruff format.
  • Go through the changes and adapt formatting where needed. This is quite a big task, but needs to be done only once.
  • Commit formatting changes as a single commit.
  • Create --ignore-revs-file with the above commit and configure GitHub to use it.
  • Instruct in contribution guidelines that contributors should run ruff format before creating a PR.

Further automation can then be done later.

@LucianCrainic
Contributor

Hi Pekka,

with this, are you only looking to format Python code? What about the Robot code that we use for acceptance testing in the repo?

@pekkaklarck
Member Author

Yeah, this is only for Python code. The atest/robot side could be formatted with Robocop, but atest/testdata absolutely shouldn't, because it contains a lot of funky data on purpose.

@pekkaklarck
Member Author

My biggest annoyance with Black was that it removed the empty line after the class declaration in cases like this:

class Example:

    def __init__(self, arg):
        ...

Black has recently stopped doing that, but apparently Ruff hasn't. That means we'll use Black and possibly migrate to Ruff once its behavior changes in this regard.

@pekkaklarck
Member Author

I've been going through the changes made by Black and for the most part they are fine. There are, however, cases where I don't like the results too much. I'll go through the most important cases here as separate comments and also explain how I plan to handle them. In the end I plan to explain these in our contribution guidelines as well. Feel free to comment if you have other ideas on how to handle these cases. You can also just use 👍 to indicate that you agree with my reasoning and 👎 if you think I should just go with what Black does by default.

@pekkaklarck
Member Author

pekkaklarck commented Apr 14, 2025

Problem 1: Three different ways to format signatures

How Black formats function signatures depends on the signature length:

  1. If everything fits into a single line, the signature is just a single line:
    def example(first_arg: Iterable[int], second_arg: int = 0) -> int:
        ...
  2. If everything doesn't fit into a single line, but all arguments do, formatting is like this:
    def example(
        first_arg: Iterable[float], second_arg: float = 0.0, third_arg: bool = True
    ) -> int:
        ...
  3. If arguments don't fit into a single line on their own, they all get their own lines:
    def example(
        first_arg: Iterable[float],
        second_arg: float = 0.0,
        third_arg: bool = True,
        fourth_arg: bool = False,
    ) -> int:
        ...

The first and the last formats look good to me. I like simple signatures using just a single line, and I also like how all arguments of a long signature get their own lines. I don't, however, understand the middle formatting at all. It's hard to read with all those argument names, types and default values, and as soon as the signature gets a bit longer it is reformatted again, creating a large diff. Having three formatting approaches, possibly even in subsequent functions, also feels pretty inconsistent to me.

Unfortunately there isn't, AFAIK, a way to tell Black to automatically avoid the middle formatting. That can, however, be done manually by adding a comma after the last argument. The nice thing is that you can add a trailing comma to a single line signature and Black will automatically explode the signature when it is run again. Needing to add the comma manually is annoying nevertheless.
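As a minimal illustration of that "magic trailing comma" (the function body and values are made up for this example, not taken from the Robot Framework code base), the signature below would fit on a single line, but with Black's default settings the comma after the last argument should keep it exploded:

```python
from collections.abc import Iterable


# The comma after `second_arg` is the "magic trailing comma". Without it,
# Black would collapse this short signature back onto a single line.
def example(
    first_arg: Iterable[float],
    second_arg: float = 0.0,
) -> float:
    # Hypothetical body, only here to make the sketch runnable.
    return sum(first_arg) + second_arg
```

The comma has no effect on how the function is called, so `example([1.0, 2.0], 0.5)` works the same with either layout.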

@pekkaklarck
Member Author

pekkaklarck commented Apr 14, 2025

Problem 2: Inline comment handling

I don't like Black formatting inline comments used like

TypeHint = Union[
    type,                     # Actual type.
    str,                      # Type name or alias.
    UnionType,                # Union syntax (e.g. `int | float`).
    'tuple[TypeHint, ...]'    # Tuple of type hints. Behaves like a union.
]

to

TypeHint = Union[
    type,  # Actual type.
    str,  # Type name or alias.
    UnionType,  # Union syntax (e.g. `int | float`).
    'tuple[TypeHint, ...]',  # Tuple of type hints. Behaves like a union.
]

I really think Black should be clever enough to see that comments have been aligned, figure out that there's probably a reason for that, and leave them alone. Anyway, it doesn't do that so we need alternatives. This is what I've done:

  1. In some cases I've changed inline comments to full line comments above the code. For example, this
    a = 1             # Explanation for 'a'.
    bbbbbbbbb = 2     # Explanation for 'bbbbbbbbb'.
    ccc = 3           # Explanation for 'ccc'.
    became this:
    # Explanation for 'a'.
    a = 1
    # Explanation for 'bbbbbbbbb'.
    bbbbbbbbb = 2
    # Explanation for 'ccc'.
    ccc = 3
  2. In other cases I decided to just live with the changes.
  3. In a few cases, like with the TypeHint above, I disabled formatting by using # fmt: off and # fmt: on.
  4. I also submitted a style enhancement request to Black. I know they don't change their style lightly and thus don't have too high hopes for the proposal to be accepted, but at least it wasn't closed as "invalid" or "wontfix" immediately.
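For readers unfamiliar with the third option, here is a small sketch of where the two marker comments go. The `Number` alias and the `is_number` helper are hypothetical names invented for this illustration:

```python
from typing import Union

# fmt: off
Number = Union[
    int,      # Whole numbers.
    float,    # Floating point numbers.
    complex   # Complex numbers.
]
# fmt: on


# Code after `# fmt: on` is handled by the formatter normally again.
def is_number(value: object) -> bool:
    return isinstance(value, (int, float, complex))
```

Everything between the two markers is left untouched, so the aligned comments survive, but so would any accidental formatting mistakes in that region. That is one reason to keep such blocks small.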

@pekkaklarck
Member Author

pekkaklarck commented Apr 14, 2025

Problem 3: Exploding of imports

When importing multiple items from a module, the import often gets so long that the line needs to be split. This is a slightly simplified example from src/robot/model/body.py:

from typing import (Callable, cast, Generic, Iterable, Type, TYPE_CHECKING, 
                    TypeVar, Union)

if TYPE_CHECKING:
    from .control import (Break, Continue, Error, For, ForIteration, Group, If,
                          IfBranch, Return, Try, TryBranch, Var, While, WhileIteration)

I really don't like how Black explodes these imports and produces this:

from typing import (
    Callable,
    cast,
    Generic,
    Iterable,
    Type,
    TYPE_CHECKING,
    TypeVar,
    Union,
)

if TYPE_CHECKING:
    from .control import (
        Break,
        Continue,
        Error,
        For,
        ForIteration,
        Group,
        If,
        IfBranch,
        Return,
        Try,
        TryBranch,
        Var,
        While,
        WhileIteration,
    )

I don't mind the above format too much in an __init__.py file that only contains imports, but in a "normal" module with some actual code, wasting a huge amount of vertical space at the top is rather annoying. The only positive side I see is that adding a new import only adds one line and a diff is simple, but that is a pretty small convenience especially when diff viewers can highlight inline changes pretty well anyway.

There have been requests to change Black's behavior or to make it configurable (see e.g. this issue), but it's unlikely this changes. One alternative to handle this, as discussed in the aforementioned issue, is using isort that would also sort imports for us automatically. Unlike Black, isort is very configurable and we could configure it to produce, for example this output:

from typing import (
    Callable, Generic, Iterable, Type, TYPE_CHECKING, TypeVar, Union, cast
)

if TYPE_CHECKING:
    from .control import (
        Break, Continue, Error, For, ForIteration, Group, If, 
        IfBranch, Return, Try, TryBranch, Var, While, WhileIteration
    )

A problem with this is that there's at least one case where I actually prefer each imported item to be on its own line. There's a convention to specify the public API of a module/package using redundant import aliases, as in src/robot/api/parsing.py:

from robot.parsing import (
    get_tokens as get_tokens,
    get_resource_tokens as get_resource_tokens,
    get_init_tokens as get_init_tokens,
    get_model as get_model,
    get_resource_model as get_resource_model,
    get_init_model as get_init_model,
    Token as Token,
)

In this case combining multiple items to the same row creates unreadable results:

from robot.parsing import (
    Token as Token, get_init_model as get_init_model,
    get_init_tokens as get_init_tokens, get_model as get_model,
    get_resource_model as get_resource_model,
    get_resource_tokens as get_resource_tokens, get_tokens as get_tokens
)

I guess what I really want is the one-item-per-line mode for imports that use aliases with as, but the multiple-items-per-line mode otherwise. I'm not sure whether any tool supports something like that, and it could be considered inconsistent in general. My current thinking on how to proceed is:

  • Live with Black's default import formatting at least for now.
  • Use # fmt: skip with especially annoying cases if needed.
  • Return to this when thinking about automatic import sorting. Based on my initial testing that isn't as straightforward as I hoped either.

UPDATE: It turned out the desired import formatting can be accomplished like this:

  • First use Ruff to sort and organize imports so that all multi line imports are on their own lines.
  • Then use isort to organize multi line imports so that there can be multiple items on the same line, but exclude files containing redundant import aliases from this run. We have these aliased imports, that denote an explicit API, only in __init__.py files and in robot/api/parsing.py, so excluding is easy.

Isort supports multiple ways to organize multi line imports. In our case Ruff will use the Vertical Hanging Indent mode and isort is configured to use the Hanging Grid Grouped mode. See configuration in pyproject.toml and usage in tasks.py for more information once the changes have been committed.
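To make the target layout concrete, the sketch below emulates the grid-style wrapping with the standard library's `textwrap` module. It is not isort's implementation and not the project's configuration. The `grid_import` helper is a hypothetical name, and it only handles plain names without `as` aliases:

```python
import textwrap


def grid_import(module: str, names: list[str], width: int = 88) -> str:
    """Render `from module import (...)` with several names per line."""
    body = textwrap.fill(
        ", ".join(names),
        width=width,
        initial_indent="    ",
        subsequent_indent="    ",
        break_long_words=False,
        break_on_hyphens=False,
    )
    return f"from {module} import (\n{body}\n)"


names = [
    "Callable", "Generic", "Iterable", "Type", "TYPE_CHECKING", "TypeVar",
    "Union", "cast",
]  # fmt: skip
print(grid_import("typing", names))
```

With the default width of 88 this prints the same three-line `typing` import shown earlier in this comment. A narrower `width` wraps the names onto more lines, each starting at the same indentation.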

@pekkaklarck
Member Author

pekkaklarck commented Apr 25, 2025

Problem 4: Handling generic classes with multiple parameters

robot.running.TestSuite and robot.result.TestSuite structures have a common robot.model.TestSuite base structure. We've used generics heavily to get them typed well and as a result we have classes like this:

class Branches(BaseBranches['Keyword', 'For', 'While', 'Group', 'If', 'Try', 'Var',
                            'Return', 'Continue', 'Break', 'Message', 'Error', IT]):
    __slots__ = ()

When the above is formatted with Black, the result is this:

class Branches(
    BaseBranches[
        'Keyword',
        'For',
        'While',
        'Group',
        'If',
        'Try',
        'Var',
        'Return',
        'Continue',
        'Break',
        'Message',
        'Error',
        IT,
    ]
):
    __slots__ = ()

I understand why it's formatted like that, but the result is still rather odd. It doesn't even look like a class declaration and it also takes a lot of vertical space. My current thinking is keeping the current formatting and using # fmt: skip:

class Branches(BaseBranches['Keyword', 'For', 'While', 'Group', 'If', 'Try', 
                            'Var', 'Return', 'Continue', 'Break', 'Message', 
                            'Error', IT]):  # fmt: skip
    __slots__ = ()

UPDATE: In the end I decided to use this format:

class Branches(BaseBranches[
    "Keyword", "For", "While", "Group", "If", "Try", "Var", "Return", "Continue",
    "Break", "Message", "Error", IT
]):  # fmt: skip
    __slots__ = ()

The motivation was to have no items after the final opening bracket, because then items always start with consistent indentation regardless of the length of the class names. This looks consistent with the rest of the Black formatted code.

I decided to use the same format also in other cases where lists were long but individual list items were simple. For example, instead of

added_in_rf60 = {
    "bg",
    "bs",
    "cs",
    "de",
    "en",
    "es",
    "fi",
    "fr",
    "hi",
    "it",
    "nl",
    "pl",
    "pt",
    "pt-BR",
    "ro",
    "ru",
    "sv",
    "th",
    "tr",
    "uk",
    "zh-CN",
    "zh-TW",
}

I decided to use

added_in_rf60 = {
    "bg", "bs", "cs", "de", "en", "es", "fi", "fr", "hi", "it", "nl", "pl",
    "pt", "pt-BR", "ro", "ru", "sv", "th", "tr", "uk", "zh-CN", "zh-TW",
}  # fmt: skip

@pekkaklarck
Member Author

pekkaklarck commented Apr 27, 2025

Problem 5: Handling Boolean expressions

Black avoids parentheses around Boolean expressions and as a result formats code like

ext = (
    getattr(self.parser, 'EXTENSION', None)
    or getattr(self.parser, 'extension', None)
)

and

return (
    self._get_runner_from_resource_files(name)
    or self._get_runner_from_libraries(name)
)

to

ext = getattr(self.parser, 'EXTENSION', None) or getattr(
    self.parser, 'extension', None
)

and

return self._get_runner_from_resource_files(
    name
) or self._get_runner_from_libraries(name)

I consider the formatting in the last two examples horrible. There's an issue about changing this in Black's issue tracker, but no indication that the behavior would be changed. Luckily this doesn't happen too often and it's easy to use # fmt: skip.
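A sketch of what that could look like for the first example, with the marker placed after the closing parenthesis in the same way as in the class and set examples in Problem 4. The `Parser` class is a hypothetical stand-in so that the snippet runs on its own, and whether the marker is honoured on a multi-line statement may depend on the formatter and its version:

```python
class Parser:
    """Hypothetical stand-in for the parser object used in the thread."""

    extension = ".robot"


parser = Parser()

# The trailing marker asks the formatter to leave this statement as written.
ext = (
    getattr(parser, "EXTENSION", None)
    or getattr(parser, "extension", None)
)  # fmt: skip
```

Here `Parser` has no `EXTENSION` attribute, so the first `getattr` returns `None` and the `or` falls through to the lowercase `extension`.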

@pekkaklarck
Member Author

pekkaklarck commented Apr 27, 2025

I've now locally gone through all formatting changes by Black and handled annoyances either by refactoring code or by using # fmt: skip or # fmt: off/on. The remaining tasks include:

  • API docs to the newly added invoke format task.
  • Linting. Linting should be done automatically before formatting by invoke format and there should also be a separate invoke lint task. This also includes fixing all linting issues.
  • Adding explicit APIs to all packages to avoid linting errors from unused imports. I submitted a separate issue Add explicit APIs to robot root package and to all sub packages #5414 about that.
  • Deciding how to handle imports. See the Problem 3 above. Decision was to reorganize "normal" multiline imports with isort so that there are multiple items per row to avoid using too much vertical space.
  • Deciding quoting style. I'm starting to think we should switch from 'single quotes' to "double quotes". Decision was to switch to "double".

pekkaklarck added a commit that referenced this issue Apr 27, 2025
Use a dedicated method instead.

The main motivation is avoiding linting errors from unused imports
(see #5387), but this may also help with issues `pythonpathsetter` has
caused (#5384).
@pekkaklarck
Member Author

Formatting changes committed in d2cdcfa.

@pekkaklarck
Member Author

pekkaklarck commented May 2, 2025

Things to do:

  • Create --ignore-revs-file with the above commit and configure GitHub to use it.
  • Run Dialogs and Telnet library tests to make sure nothing got broken in the huge formatting commit. These libraries aren't tested automatically by our acceptance test system.
  • Update contribution guidelines. Most importantly, instruct contributors to run invoke format before creating a PR. Should also add instructions when to use the "magic comma" or # fmt: skip to avoid suboptimal automatic formatting, but this should be an optional task for contributors. Updating docs can wait until the release candidate is out.

pekkaklarck added a commit that referenced this issue May 2, 2025
GitHub will ignore commits in this file by default and Git can be
configured to ignore them locally as well.

Initially contains the code formatting commit done as part of #5387.
More commits, also past ones, can be added later if needed.
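
For context, an ignore-revs file is plain text with one full commit hash per line and optional `#` comments. The sketch below builds such content in Python. `render_ignore_revs` is a hypothetical helper, not part of the project's tooling, and the hash is an all-zero placeholder rather than the real formatting commit:

```python
def render_ignore_revs(entries: list[tuple[str, str]]) -> str:
    """Build the text of an ignore-revs file from (description, sha) pairs."""
    lines = []
    for description, sha in entries:
        # Git expects full, unabbreviated hashes in this file
        # (assumed here to be 40-character SHA-1 hashes).
        if len(sha) != 40 or any(c not in "0123456789abcdef" for c in sha):
            raise ValueError(f"Not a full commit hash: {sha!r}")
        lines += [f"# {description}", sha, ""]
    return "\n".join(lines)


# Placeholder hash, NOT the real formatting commit.
content = render_ignore_revs([("Code formatting (#5387)", "0" * 40)])
print(content)
```

GitHub's blame view picks the file up when it is named `.git-blame-ignore-revs` and lives in the repository root. Locally, `git config blame.ignoreRevsFile .git-blame-ignore-revs` makes `git blame` skip the listed commits as well.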