Conversation

@TimothyEDawson
Contributor

@TimothyEDawson TimothyEDawson commented Jul 14, 2025

Changes proposed in this pull request

  • Add Python stub (.pyi) files to the Cython interface which cover the public API for the Python code, including runtime-generated attributes.
  • Include the stub files in the installed Python package.
  • Add testing infrastructure to verify consistent typing and adequate coverage.

There was some discussion about the approach here, but in short I consider this to be a stepping stone toward adding static typing to the Cython interface. Some advantages of starting with stub files include:

  • The full public API is made immediately available for developers who utilize Cantera as a dependency in their Python projects (like me!).
  • Work on type hints is fully segregated from the actual Python code as many details are worked out and the testing infrastructure is set up.
  • It is substantially easier to add typing for the Python API without contending with Cython syntax.
  • It makes dynamically-generated parts of the API, such as pass-through attributes from a Solution to a Quantity or SolutionArray, trivial to type-hint, and may (?) be the correct way to do so.
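To illustrate the last point, here is a minimal sketch of why runtime-generated pass-through attributes are awkward to annotate inline. The Solution and Quantity classes below are hypothetical stand-ins, not Cantera's implementation; the declarations under TYPE_CHECKING play the role that a .pyi stub would play.

```python
from __future__ import annotations

from typing import TYPE_CHECKING


class Solution:
    """Hypothetical stand-in for a phase object."""

    def __init__(self, T: float, P: float) -> None:
        self.T = T
        self.P = P


class Quantity:
    """Hypothetical wrapper whose pass-through properties are created at runtime."""

    if TYPE_CHECKING:
        # Seen only by static type checkers; a .pyi stub could hold these lines.
        T: float
        P: float

    def __init__(self, phase: Solution, mass: float) -> None:
        self._phase = phase
        self.mass = mass


def _passthrough(name: str) -> property:
    # Build a read-only property that forwards to the wrapped phase.
    return property(lambda self: getattr(self._phase, name))


# The attributes only exist after this loop runs, so a static checker
# cannot discover them from the class body alone.
for _name in ("T", "P"):
    setattr(Quantity, _name, _passthrough(_name))

q = Quantity(Solution(T=300.0, P=101325.0), mass=2.0)
print(q.T, q.P)
```

In a stub file the two declarations would simply be listed in the class body, which keeps the runtime module free of typing-only code.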

There are some disadvantages, with the largest perhaps being the substantial increase in the amount of code to be maintained going forward. My hope is that the majority of the stub file content is gradually replaced with type hints within the source code, and I'm aware that I already have some redundant type hints in here.

If applicable, fill in the issue number this pull request is fixing

Potentially closes Cantera/enhancements#85.

Checklist

  • The pull request includes a clear description of this code change
  • Commit messages have short titles and reference relevant issues
  • Build passes (scons build & scons test) and unit tests address code coverage
  • Style & formatting of contributed code follows contributing guidelines
  • The pull request is ready for review

@TimothyEDawson
Contributor Author

Some notes:

I covered a good chunk of the public API already, but I am aware of some holes here and there which I plan to fill in as time permits. I have deliberately omitted the various file conversion scripts (e.g. ck2yaml.py), but am not opposed to including them here.

I still need to figure out a good way to add unit testing for the type hints. I expect that would look like liberal use of typing.assert_type, as used in scipy-stubs, but to get and maintain thorough coverage may require some thought on infrastructure.
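As a rough sketch of what an assert_type-based test could look like: the function below is a hypothetical example, not part of Cantera, and the fallback definition is only there so the snippet also runs on Python versions older than 3.11, where typing.assert_type is not available.

```python
from __future__ import annotations

import sys

if sys.version_info >= (3, 11):
    from typing import assert_type
else:
    def assert_type(val, typ):
        # Runtime fallback: like typing.assert_type, return the value unchanged.
        return val


def mean_molecular_weight(weights: list[float]) -> float:
    """Hypothetical function whose declared return type is under test."""
    return sum(weights) / len(weights)


# A static type checker reports an error here if the inferred type of the
# call is not exactly float. At runtime, assert_type does nothing.
result = assert_type(mean_molecular_weight([2.0, 32.0]), float)
print(result)
```

The check happens entirely in the type checker, so a test file like this only needs to be type-checked, not executed.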

Several decisions were a little impulsive and I'm sure there are many improvements and changes needed. Some examples:

  • I have defined numerous TypeAliases for the sake of convenience, many of which should probably start with an underscore so they are not included as part of the public API.
  • Any is used in several places where it may be possible to replace it with an Unpacked TypedDict or other option.
  • I started to add __all__ in some areas when that should probably be a separate pull request to add it to the source files.
  • __init__.pyi is quite cumbersome. I attempted to turn all of the implicit imports into explicit imports, then removed the exported symbols which originate from third-party libraries, but I'm sure there is a more elegant way to do this. I think that thorough use of __all__ might make the explicit imports fully redundant.
  • NumPy typing always feels like there are too many ways to do things, so I went with a very minimal approach of assuming all ndarrays are arbitrarily-sized np.typing.NDArray[np.float64] which I call Array (should probably be _ArrayFloat64 or something), and anything which is specifically going to be coerced into an ndarray is typed as np.typing.ArrayLike. I'm aware some packages like optype and NumType may provide more descriptive and flexible type hints, but I don't want to add any external dependencies unless absolutely necessary.
  • I did not use backwards-compatibility stuff like from typing_extensions import because I was under the impression those are largely intended for source code, but I need to look into it more to see if those imports should be modified.
  • The stubs simply assume you have the necessary optional dependencies to work, such as Pandas. I'm not certain whether that's the correct way to handle them.

And I'm sure there's more, I just wanted to start getting feedback now rather than keep postponing this pull request forever!

@ischoegl
Member

ischoegl commented Jul 15, 2025

Thanks for your efforts on this, @TimothyEDawson - this looks really promising! I am, however, somewhat concerned about maintaining parallel signatures in pyx and pyi files: for any change or addition, things need to be edited in two (usually large!) parallel files, which, imho, is not ideal.

Based on what you suggest here, I ran a couple of quick tests: specifically, I was interested in whether type hints are preserved if they are added directly in the pyx file. I ran some rudimentary tests on interfaces/cython/cantera/_utils.pyx (i.e., one of the shorter files), without adding AnyMap.

Based on successful tests, it appears that adding type hints directly to pyx may be more maintainable than keeping separate pyi files. However, this assumes that things can be combined, i.e., where static types are known, add it to pyx, whereas dynamic properties and methods may have to be added via pyi.

You can find my (very rudimentary) test here: https://github.com/ischoegl/cantera/tree/test-type-hints (just one commit) ... I simply copied over what you had in the pyi file.

@TimothyEDawson
Contributor Author

TimothyEDawson commented Jul 15, 2025

@ischoegl yes, we're on the same page. That was the same concern I highlighted in the pull request text.

There are no inherent issues with .pyx files, as they're generally treated the same as .py files as far as type hints go; that was not part of the motivation for this. The typing specification (https://typing.python.org/en/latest/spec/distributing.html#partial-stub-packages) defines an order of resolution for type hints. Because stub files take precedence over inline type hints, the path forward would essentially be to delete type hints from the stub files as they are added to the source code files.

I am open to whichever approach the Cantera developers wish to take. The two options which seem best to me are:

  • Merge in a full set of stubs with this pull request, then gradually work toward moving type hints into the inline code via subsequent pull requests.
  • Work towards inlining the type hints as much as possible within this pull request, stopping when it's either complete (no more .pyi files) or it gets too tricky (arbitrary).

I lean toward the first option because it gives users a working set of type hints while we work out details for the inline version. It also means I don't even need to touch any .pyx files. Maintainability is a concern, but mostly just for API changes, and any new functionality which is properly type-hinted inline won't need to be added to the .pyi files at all.

At the same time, realistically it will be trivial to move the majority of these type hints inline right now. I expect there will be some edge cases which might take a while to figure out what to do, but if that's the route we want to go, I'd be happy to try it.

@TimothyEDawson
Contributor Author

One issue I encounter when moving things into the .pyx files is that the language server extensions I use in VS Code (e.g. Pylance, Ruff, Pyrefly) don't recognize Cython syntax, and thus can't help me catch errors like they can in type stubs. That's mostly a convenience, so I'm willing to live with it if I can come up with a robust testing procedure which could catch such errors.

I'm going to focus on finishing up the stub files, as I'm already quite close, and designing some testing infrastructure to ensure the type hints match the runtime code. In its current state I'm already using it extensively on my other projects, and finding it to be very helpful (and also finding areas which need reworking).

I think there are also many good discussions to be had in the nuances of typing. For one example, I haven't found a way to represent attributes which are only sometimes available, e.g. how a SolutionArray based on a Solution object doesn't have Q related attributes, and one based on a PureFluid won't have the attributes associated with Kinetics. I did find that I can use generics in such a way that your IDE will flag that an attribute is unreachable, so I'm moving forward with implementing that now, but unreachable attributes will unfortunately still be visible in the intellisense/tab-completion and at runtime when calling dir.

@codecov

codecov bot commented Jul 31, 2025

Codecov Report

❌ Patch coverage is 93.43284% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 75.51%. Comparing base (6ae17f6) to head (bf60396).
⚠️ Report is 14 commits behind head on main.

Files with missing lines                    Patch %   Lines
interfaces/cython/cantera/ctml2yaml.py      93.75%    8 Missing and 3 partials ⚠️
interfaces/cython/cantera/liquidvapor.py    76.19%    5 Missing ⚠️
interfaces/cython/cantera/_types.py         95.45%    0 Missing and 3 partials ⚠️
interfaces/cython/cantera/yaml2ck.py        90.47%    1 Missing and 1 partial ⚠️
interfaces/cython/cantera/lxcat2yaml.py     97.50%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1926      +/-   ##
==========================================
+ Coverage   75.47%   75.51%   +0.04%     
==========================================
  Files         454      455       +1     
  Lines       56798    56928     +130     
  Branches     9356     9361       +5     
==========================================
+ Hits        42866    42990     +124     
- Misses      10764    10765       +1     
- Partials     3168     3173       +5     


Member

@bryanwweber bryanwweber left a comment

Thanks for starting this! Aside from the line comments, two other suggestions:

  1. Is there any way to add tests for these types? I'm thinking of both correctness relative to the implementation and completeness of the types: the former to catch changes in signatures, the latter to catch new functions.
  2. Can the imports of Cantera functions be made relative instead of absolute? I'm a little worried about having another Cantera installation on the PYTHONPATH and mixing up the hints

Comment on lines 18 to 29
from cantera._utils import __git_commit__ as __git_commit__
from cantera._utils import __version__ as __version__
from cantera._utils import hdf_support as hdf_support
Member

These appear to be unused here?

Contributor Author

While true, they are exported symbols which exist in onedim's namespace. You can verify that by executing:
import cantera as ct
print(ct.onedim.__git_commit__)

That being said, it's probably fine to remove them, as their existence here should be visible from the onedim.py file, and the relevant type information should be discernable from _onedim.pyi. Some of the odd code like this originated from MyPy's stubgen, which I used as a rough starting point.

Member

I think we should focus on documenting the intended public interface (here and elsewhere), as opposed to the details that aren't relevant to end users. These constants are imported here, sure, but that's essentially an implementation detail.

Contributor Author

Absolutely, though it's not always clear what is relevant and what isn't. Even attributes with an underscore are sometimes important to me, e.g. SolutionArray._phase.

Member

@bryanwweber bryanwweber Aug 2, 2025

My personal preference is to elide as many implementation details as necessary, including defining types that represent a combination of objects if that's necessary. The reason I feel this way is that I think it will be much easier to get something merged than to chase complete correctness. I'm not sure if that changes anything you're already doing though 😁

units: ApplicationRegistry
Q_: Quantity

def copy_doc(method: F) -> F: ...
Member

This is not meant to be a public function, only a convenience wrapper. If your checking works without including this, I think you can remove it

@TimothyEDawson
Contributor Author

Hey @bryanwweber , thank you for the review! I'll respond to all the comments soon.

Regarding testing, certainly, that's one of the things I'm working on. Any static type checker like Mypy will easily flag static functions which are missing types. Dynamic attributes might be tricky. mypy stubtest cantera was able to catch some that I missed.

Regarding absolute vs. relative imports, certainly we can swap it. I have a strong personal preference for absolute imports, though I don't know if there are very strong objective reasons to prefer one over the other. I would note that the way I generally avoid the issue you raised when working on a Python package is to perform an editable install (e.g. pip install -e .), though I am unsure if that is an option when using Scons and probably won't reflect Cython code changes.

There are a few high level items I'd love to discuss, some of which may go beyond the scope of this pull request. One is the public/private member distinction - I've noticed many symbols which are exported by Cantera which probably should not be, such as external packages (e.g. Numpy) and the copy_doc decorator function you called out. (Although a properly-typed copy_doc might be useful, I actually have an updated version of it to push at some point.)

One of my goals was to type everything which is exported, even if it wasn't really intended to be publicly accessible, or at the very least everything which isn't prefixed by an underscore (and using judgement for things which are). However, it quickly became clear to me that Cantera should be utilizing __all__ to control the exports of each module, and be a lot more restrictive. This would also solve some issues which I see have workarounds, such as the function composite._make_functions which has a comment saying it exists simply to avoid polluting the module namespace.

There's also a lot that could be added to pyproject.toml to control which options we want enforced in a given type checker (e.g. MyPy). I have a setup on my end which I haven't pushed. That might be a can of worms given how much can be done within pyproject.toml and how many static type-checkers (and linters) are available, so I was planning to defer that until after I've found a testing procedure I'm happy with.

And one last general note, any appearance of Any is a placeholder, and there's certainly a lot of work left to do with every *args, **kwargs and TypedDict. For implementations using args and kwargs I considered whether we could just specify the expected signature instead (i.e. only the inputs which will actually be used), but we should probably either a) still retain some kind of indication that the function will accept any arbitrary arguments, or b) rewrite the actual implementation so it does not do that anymore, if that's not a desired feature. Any thoughts?
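One possible way to replace an open-ended **kwargs: Any is an unpacked TypedDict (PEP 692). The sketch below is only an illustration under stated assumptions: the PlotOptions keys and the describe function are hypothetical, and Unpack is imported from typing_extensions under TYPE_CHECKING so that the snippet has no extra runtime dependency.

```python
from __future__ import annotations

from typing import TYPE_CHECKING, TypedDict

if TYPE_CHECKING:
    # Only needed by the type checker; never imported at runtime.
    from typing_extensions import Unpack


class PlotOptions(TypedDict, total=False):
    """Hypothetical set of optional keyword arguments."""

    color: str
    linewidth: float


def describe(x: float, **kwargs: Unpack[PlotOptions]) -> dict[str, object]:
    # A type checker rejects unknown keywords or wrongly typed values,
    # while the runtime signature still accepts arbitrary **kwargs.
    return {"x": x, **kwargs}


print(describe(1.0, color="red"))
```

This corresponds to option (a) above: the signature documents which keywords are expected without changing what the implementation accepts at runtime.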

@ischoegl
Member

[…] I've noticed many symbols which are exported by Cantera which probably should not be, such as external packages (e.g. Numpy) and the copy_doc decorator function you called out. […]

One of my goals was to type everything which is exported, even if it wasn't really intended to be publicly accessible, or at the very least everything which isn't prefixed by an underscore (and using judgement for things which are). However, it quickly became clear to me that Cantera should be utilizing __all__ to control the exports of each module, and be a lot more restrictive.

There was some prior discussion on this a long time ago, see #616

@TimothyEDawson
Contributor Author

TimothyEDawson commented Aug 1, 2025

@ischoegl I appreciate that background information! I strongly believe that it should be revisited. Making an extra step necessary to make a new class or module-level attribute exported, and thus opt-in, would be a good thing in my opinion. It's worth noting that developers are still free to directly import things which are not in the list, they just need to know it's there.

And to the point about dir(ct), yes, that's not common; however, if anyone tries from cantera import *, they will end up with a lot of extra stuff which could cause issues. Of course, star imports are generally discouraged for that very reason, but Cantera itself uses them a lot. All that namespace pollution also shows up within intellisense and autocomplete suggestions in modern IDEs and the Python REPL.
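The effect of __all__ on star imports can be shown with a small self-contained sketch. The module names and contents below are made up for the demonstration; the modules are built in memory so that nothing needs to exist on disk.

```python
import sys
import types

# Source for a throwaway module: one intended public function, one helper,
# and an imported third-party-like name (math) that leaks into the namespace.
BODY = '''
import math

def area(r):
    return math.pi * r ** 2

def helper():
    return "implementation detail"
'''


def star_import_names(module_name, source):
    """Create an in-memory module and return what `from module import *` binds."""
    module = types.ModuleType(module_name)
    exec(source, module.__dict__)
    sys.modules[module_name] = module
    namespace = {}
    exec(f"from {module_name} import *", namespace)
    return sorted(name for name in namespace if not name.startswith("__"))


without_all = star_import_names("demo_without_all", BODY)
with_all = star_import_names("demo_with_all", BODY + '\n__all__ = ["area"]\n')

print(without_all)  # ['area', 'helper', 'math']
print(with_all)     # ['area']
```

Names left out of __all__ remain importable explicitly (for example, from demo_with_all import helper); they are just no longer pulled in by a star import.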

I was planning to make a new issue to this effect, but I could just as well comment on the original thread.

@ischoegl
Member

ischoegl commented Aug 1, 2025

@ischoegl I appreciate that background information! I strongly believe that it should be revisited. […]

A lot has happened since, so I tend to agree. Regarding star imports, see #1791; that was for build scripts, but the issues are somewhat related.

@ischoegl
Member

ischoegl commented Aug 1, 2025

I was planning to make a new issue to this effect, but I could just as well comment on the original thread.

@TimothyEDawson ... Feel free to create a new issue while referencing #616. Your angle is sufficiently different and likely more convincing; some pros and cons for (internal) star imports are discussed in #1791.

@bryanwweber
Member

though I am unsure if that is an option when using Scons and probably won't reflect Cython code changes.

Indeed, this doesn't work as you suspected. It's also somewhat common as a dev to have PYTHONPATH set to control where the interface is imported from. In that case, relative imports ensure that the correct source code is picked up.

@bryanwweber
Member

bryanwweber commented Aug 2, 2025

I've noticed many symbols which are exported by Cantera which probably should not be

I'm not sure about "should" here, but I agree we could simplify things. I'd guess the high-level interface is pretty static at this point, so I could see a case where __init__.py only imports the intended public interface rather than * imports from the submodules. I don't think using __all__ makes a ton of sense, though 😕

One of my goals was to type everything which is exported

As I've said elsewhere, I think we should only type the intended exported interface for now. For two reasons, first it makes this PR simpler, and second because it gives us flexibility to change the unintended interface with a little more freedom.

There's also a lot that could be added to pyproject.toml to control which options we want enforced in a given type checker

I'd rather add a separate config file. To the extent we can support multiple type checkers, that would be good I think

@TimothyEDawson
Contributor Author

I'm quickly closing in on being able to run a moderately strict mypy within the interfaces/cython folder and having it return no errors! I only have 15 errors remaining across 2 files - ctml2yaml.pyi and lxcat2yaml.pyi. I'm hoping that won't be too difficult to incorporate into the testing architecture. I'm looking into how typeshed does its testing for inspiration: https://github.com/python/typeshed/tree/main/tests.

There are some tools for automatically merging stub files into the source code files which might be worth trying, but I assume they won't work for the .pyx files. I'm also not positive that Cython syntax supports the full breadth of typing features I'm currently employing.

@TimothyEDawson
Contributor Author

I'm not sure about "should" here, but I agree we could simplify things. I'd guess the high-level interface is pretty static at this point, so I could see a case where __init__.py only imports the intended public interface rather than * imports from the submodules. I don't think using __all__ makes a ton of sense, though 😕

Whenever I get around to opening that new issue regarding use of star imports and __all__, I'd definitely like to hear why! I have personally found __all__ to be incredibly convenient and useful in my own projects.

As I've said elsewhere, I think we should only type the intended exported interface for now. For two reasons, first it makes this PR simpler, and second because it gives us flexibility to change the unintended interface with a little more freedom.

So if we don't have types for everything exported to begin with, it makes the testing infrastructure much more complex. Right now it's very close to "does everything have a type? Good!" whereas if I start removing things I've already annotated, the tests will need to now know what all of the exceptions are in order to pass (which is doable, e.g. by adding # type: ignore directives). It's quite easy to just type things that are optional, as I already have.

However, a third option of using __all__ to substantially reduce the exported symbols would be great, and I could whip that up in an afternoon. Then anything not contained in those exports can and should be removed from these stubs.

It's also worth noting that once there's a generalized testing infrastructure in place, it would be pretty straightforward to enforce that new functions must be a) added to __all__ if they're intended to be part of the public interface, and b) typed in order to be accepted, while typing remains optional for implementation details.
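A rough sketch of how such a rule could be checked at runtime follows. This is not how stubtest works and not an existing Cantera test; the helper name and the demo module are hypothetical, and the check only looks at plain functions listed in __all__.

```python
import inspect
import types


def untyped_public_functions(module):
    """Return names in module.__all__ whose functions lack full annotations."""
    missing = []
    for name in getattr(module, "__all__", []):
        obj = getattr(module, name)
        if not inspect.isfunction(obj):
            continue  # classes, constants, etc. are ignored in this sketch
        sig = inspect.signature(obj)
        params_ok = all(
            p.annotation is not inspect.Parameter.empty
            for p in sig.parameters.values()
        )
        return_ok = sig.return_annotation is not inspect.Signature.empty
        if not (params_ok and return_ok):
            missing.append(name)
    return missing


# Hypothetical module: one typed public function, one untyped public
# function, and one private helper that is allowed to stay untyped.
demo = types.ModuleType("demo")
exec(
    '''
__all__ = ["typed", "untyped"]

def typed(x: float) -> float:
    return 2 * x

def untyped(x):
    return x

def _private(x):
    return x
''',
    demo.__dict__,
)

print(untyped_public_functions(demo))  # ['untyped']
```

A test built this way fails only for public names, which leaves implementation details free to remain unannotated.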

@TimothyEDawson
Contributor Author

TimothyEDawson commented Aug 2, 2025

I'd rather add a separate config file. To the extent we can support multiple type checkers, that would be good I think

Why would you do that? Anything that would go into mypy.ini can also go into a [tool.mypy] section within pyproject.toml, likewise for every modern static type checker and linter. What is the benefit to creating a bunch of config files instead of adding the lines to the one which is already present?

@ischoegl
Member

ischoegl commented Aug 2, 2025

[...] I'd guess the high-level interface is pretty static at this point, so I could see a case where __init__.py only imports the intended public interface rather than * imports from the submodules. I don't think using __all__ makes a ton of sense, though 😕

I knew that __all__ would be controversial! 😂

So if we don't have types for everything exported to begin with, it makes the testing infrastructure much more complex. Right now it's very close to "does everything have a type? Good!" whereas if I start removing things I've already annotated, the tests will need to now know what all of the exceptions are in order to pass (which is doable, e.g. by adding # type: ignore directives). It's quite easy to just type things that are optional, as I already have.

I think there's a case to be made to first tackle the public interface, and then finalize the typing? From my perspective, I am with @TimothyEDawson in his desire to create a clean public interface. In my own projects, I have used both __all__ and the __init__.py approach; both accomplish similar things while having their own pros and cons; I have long abandoned star imports as they are frowned upon for a reason. I'd suggest following @bryanwweber's __init__.py compromise, so we can avoid onerous exception handling for testing in this PR? Just my 2 cents, of course.

@ischoegl ischoegl mentioned this pull request Aug 16, 2025
5 tasks
@TimothyEDawson
Contributor Author

Wanted to highlight one of the new features I added a bit ago. By making SolutionArray a generic type, the phase object used as the input is visible in its type signature:

image

Which enables the language server to spot when you're trying to access a passthrough property which isn't available:

image

And to infer the type of the underlying phase object:

image
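Since the screenshots are not reproduced here, the following is a minimal sketch of one way a generic container can expose the wrapped phase type and restrict pass-through properties to certain phases. It is an assumption about the general technique (an explicit self-type on the property), not the code in this pull request, and the classes are simplified stand-ins.

```python
from __future__ import annotations

from typing import Generic, TypeVar


class Solution:
    """Stand-in phase without vapor-fraction support."""

    T = 300.0


class PureFluid(Solution):
    """Stand-in phase which additionally defines a vapor fraction Q."""

    Q = 0.5


P = TypeVar("P", bound=Solution)


class SolutionArray(Generic[P]):
    def __init__(self, phase: P) -> None:
        self._phase = phase

    @property
    def phase(self) -> P:
        # The checker infers the concrete phase type from the constructor call.
        return self._phase

    @property
    def T(self) -> float:
        return self._phase.T

    @property
    def Q(self: SolutionArray[PureFluid]) -> float:
        # The explicit self-type means a checker such as mypy or Pyright
        # should flag access to Q on, e.g., SolutionArray[Solution].
        return self._phase.Q


fluid_states = SolutionArray(PureFluid())  # inferred as SolutionArray[PureFluid]
gas_states = SolutionArray(Solution())     # inferred as SolutionArray[Solution]

print(fluid_states.Q)
print(hasattr(gas_states, "Q"))
```

As noted earlier in the thread, the restricted attribute still appears in tab-completion and in dir(); the restriction is only visible to the type checker.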

@TimothyEDawson
Contributor Author

TimothyEDawson commented Aug 17, 2025

I switched to relative imports and started merging stub files into the source code, and I'm not super happy with the results.

Relative imports cause stubtest to raise an error for any module which doesn't end up with a corresponding .py, .pyx, or .pyi file within the installed site-packages folder. E.g.

.venv/lib/python3.13/site-packages/cantera/__init__.pyi:26: error: Cannot find implementation or library stub for module named "cantera._cantera"  [import-not-found]
.venv/lib/python3.13/site-packages/cantera/__init__.pyi:26: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
.venv/lib/python3.13/site-packages/cantera/__init__.pyi:61: error: Cannot find implementation or library stub for module named "cantera.constants"  [import-not-found]
.venv/lib/python3.13/site-packages/cantera/__init__.pyi:76: error: Cannot find implementation or library stub for module named "cantera.delegator"  [import-not-found]
.venv/lib/python3.13/site-packages/cantera/__init__.pyi:142: error: Cannot find implementation or library stub for module named "cantera.reactionpath"  [import-not-found]

Putting the type annotations into the source code seems to also switch Mypy into a mode where, for .py files, it checks the internals and not just the function signatures, which obviously greatly expands the scope of this effort. Though there's probably an option to toggle somewhere to only check the signatures for correctness. E.g.

.venv/lib/python3.13/site-packages/cantera/data.py:19: error: Need type annotation for "data_files" (hint: "data_files: set[<type>] = ...")  [var-annotated]
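For reference, the var-annotated error above is the kind Mypy raises for an empty container whose element type it cannot infer. A minimal sketch of the usual fix is shown below; the element type str and the file name are assumptions for illustration, not taken from data.py.

```python
from __future__ import annotations

# Without the annotation, Mypy cannot infer the element type of the empty
# set and asks for one. Declaring it up front resolves the error.
data_files: set[str] = set()

# Illustrative entry only; the real contents are populated elsewhere.
data_files.add("gri30.yaml")
print(data_files)
```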

I'll keep these changes for now and keep working on other aspects, just wanted to leave a note about it here.

@TimothyEDawson
Contributor Author

Whoops, guess I messed something up. I can see the following error in Build docs:

Extension error:
Here is a summary of the problems encountered when running the examples:

Unexpected failing examples (1):

    ../samples/python/kinetics/custom_reactions.py failed leaving traceback:

    Traceback (most recent call last):
      File "/home/runner/work/cantera/cantera/build/doc/samples/python/kinetics/custom_reactions.py", line 74, in <module>
        @ct.extension(name="extensible-Arrhenius", data=ExtensibleArrheniusData)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    TypeError: Argument 'data' has incorrect type (expected cantera.reaction.ExtensibleRateData, got type)

So I see that my mistake was putting data: ExtensibleRateData | None instead of data: type[ExtensibleRateData] | None. I'll fix that really quick, and hopefully that's all.
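The distinction between the two annotations can be sketched as follows. The decorator here is a simplified, hypothetical stand-in for ct.extension (it just stores entries in a dictionary), and the classes are placeholders; only the annotation on data reflects the fix described above.

```python
from __future__ import annotations

from typing import Callable, TypeVar


class ExtensibleRateData:
    """Placeholder for the base data class."""


class MyRateData(ExtensibleRateData):
    """Placeholder user-defined data class."""


C = TypeVar("C", bound=type)

# Hypothetical registry, used only to make the sketch observable.
registry: dict[str, tuple[type, type[ExtensibleRateData] | None]] = {}


def extension(
    *, name: str, data: type[ExtensibleRateData] | None = None
) -> Callable[[C], C]:
    # `data` is the class itself, not an instance, hence type[...] in the hint.
    if data is not None and not (
        isinstance(data, type) and issubclass(data, ExtensibleRateData)
    ):
        raise TypeError("data must be a subclass of ExtensibleRateData")

    def decorator(cls: C) -> C:
        registry[name] = (cls, data)
        return cls

    return decorator


@extension(name="extensible-Arrhenius", data=MyRateData)
class MyRate:
    pass


print(registry["extensible-Arrhenius"])
```

With the hint written as ExtensibleRateData | None, a type checker would instead expect an instance such as MyRateData() and reject the class object passed in the decorator call.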

@TimothyEDawson
Contributor Author

TimothyEDawson commented Aug 17, 2025

Alright, I think the error causing "CI / ubuntu-22.04 with Python 3.10, Numpy latest, Cython ==0.29.31" to fail:

[ RUN      ] Reaction.PythonExtensibleRate
Traceback (most recent call last):
  File "/home/runner/work/cantera/cantera/build/python/cantera/__init__.py", line 4, in <module>
    from ._cantera import *
  File "build/python/cantera/_cantera.pyx", line 26, in init cantera._cantera
TypeError: 'ABCMeta' object is not subscriptable

where line 26 in _cantera.pyx is "_path: Sequence[str] | None,", is specific to Cython 0.29.31. I'm guessing that version just doesn't support the generic type syntax. I'm not sure why the various Python 3.14 tests are failing: they all pass for me locally, though I use 3.14.0-rc.1 and I see these are using 3.14.0-rc.2.

@ischoegl
Member

ischoegl commented Aug 19, 2025

Hi @TimothyEDawson ... while #1947 is probably moot, I wanted to leave a note here.

Putting the type annotations into the source code seems to also switch Mypy into a mode where, for .py files, it checks the internals and not just the function signatures, which obviously greatly expands the scope of this effort. Though there's probably an option to toggle somewhere to only check the signatures for correctness.

I'd expect a Mypy option to prevent this also. If there isn't, I'd be 👍 with leaving things in separate .pyi files. PS: I actually don't think there is a straightforward option for Mypy, but there may be other tools that are less restrictive.

Regarding the post-merge tests: it appears to be a single issue with HDF. On your machine, you won't see it unless you have HDF support enabled, but there could be other reasons also.

Other than that, could you rebase this PR on the current main to avoid inclusion of unrelated PR changes that were recently merged?

@TimothyEDawson
Contributor Author

Hey @ischoegl , I will work on rebasing soon, and might squash some of my commits while I'm at it.

Regarding #1947 , in my opinion you closed it prematurely - it was the opening to a conversation, not the end of it. At the same time, I wonder if it may be moot for a different reason - if we do utilize stub files with explicit imports (and optionally __all__ lists), we may end up achieving everything required without actually modifying the source files (even if speth is not happy with the long lists of imports to be maintained). We'll need to revisit this once I've marked this branch ready for review.

I've made a lot of progress on adding types directly into the file conversion modules, so hopefully I can get those pushed in the next week or two. In parallel I have been digging into the established testing infrastructure and Github Actions to figure out where the typing tests should be inserted. They're independent of pytest so I'm guessing it should be its own "Python type checking" step after the "Run Python tests", within the Ubuntu, Clang, MacOS, and Windows jobs. Though since type checking should be largely independent of the underlying code, I'm not sure if it really makes sense to add it to all of them. It also requires additional dependencies so I need to make sure I'm handling those properly.

I had considered putting it in a standalone type-checking job, but since stubtest does require a built Cantera I'm hesitant to add more builds just for that. (Static type checks using Mypy, Pyright, etc. don't require the code to be compiled). It does matter what version of Python you're running the tests with, too, so I at least need the 3.10 - 3.13 test matrix.

@TimothyEDawson
Contributor Author

Thinking about some of the earlier discussions and what I've found working with the .pyx files, I wonder if it might be a worthy follow-on effort to start converting the .pyx files into pure Python .py files using Cython's pure Python syntax. That way we could merge the stub files into the Cython files, which would both drastically reduce the maintenance burden and enable static type checking throughout the codebase.

I believe this is how major packages like NumPy do things, as I see only a handful of .pyx files in there confined to the random module. It could be a substantial undertaking, but as long as the test coverage is good there's low risk of breaking changes. I'd be more concerned about performance regressions, personally - are there currently any good ways to monitor for those?

@ischoegl
Member

> Thinking about some of the earlier discussions and what I've found working with the .pyx files, I wonder if it might be a worthy follow-on effort to start converting the .pyx files into pure Python .py files using Cython's pure Python syntax. That way we could merge the stub files into the Cython files, which would both drastically reduce the maintenance burden and enable static type checking throughout the codebase.
>
> I believe this is how major packages like NumPy do things, as I see only a handful of .pyx files in there confined to the random module. It could be a substantial undertaking, but as long as the test coverage is good there's low risk of breaking changes. I'd be more concerned about performance regressions, personally - are there currently any good ways to monitor for those?

Not sure. I'm afraid that we don't have the manpower for an undertaking of that scale. If we were ever to go that route, we should think about code generation (Cantera/enhancements#39). I've recently implemented this for our CLib interface, but the Python API is far more complex.

@speth
Member

speth commented Oct 22, 2025

> Thinking about some of the earlier discussions and what I've found working with the .pyx files, I wonder if it might be a worthy follow-on effort to start converting the .pyx files into pure Python .py files using Cython's pure Python syntax. That way we could merge the stub files into the Cython files, which would both drastically reduce the maintenance burden and enable static type checking throughout the codebase.

That's an interesting thought. I don't believe this was possible for our use of Cython until Cython 3.0, which introduced the from cython.cimports import ... mechanism for accessing C/C++ library methods. I think it would be worth a proof-of-concept to see how it interacts with other things. For one, I'm curious how it interacts with the type checkers, given that despite being a .py file, it's not actually an importable module in this case, and the "real" module is a compiled extension just like we have now. I'm also curious how well this works for Sphinx, VS Code / Pylance, interactive debugging, and other dev tools, where limited support for the pyx format has been an occasional annoyance. Of course, as Ingmar notes, this would be a pretty hefty undertaking, so we'd want to be sure that there were significant benefits.

@TimothyEDawson
Contributor Author

> That's an interesting thought. I don't believe this was possible for our use of Cython until Cython 3.0, which introduced the from cython.cimports import ... mechanism for accessing C/C++ library methods. I think it would be worth a proof-of-concept to see how it interacts with other things. For one, I'm curious how it interacts with the type checkers, given that despite being a .py file, it's not actually an importable module in this case, and the "real" module is a compiled extension just like we have now. I'm also curious how well this works for Sphinx, VS Code / Pylance, interactive debugging, and other dev tools, where limited support for the pyx format has been an occasional annoyance. Of course, as Ingmar notes, this would be a pretty hefty undertaking, so we'd want to be sure that there were significant benefits.

My thoughts exactly. I honestly have no idea how the tooling will behave. It would probably be best to try things out in a project like NumPy, see how they work there in practice, and then do a proof-of-concept with a couple of the simpler .pyx files here. I'd be happy to try it out once I'm done with this pull request.

@speth
Member

speth commented Oct 22, 2025

If you are interested in pursuing this, I'd suggest writing something up as an issue in our Enhancements repository, so we don't lose track of this discussion after this PR is merged. Certainly this should not be started until after we get through the next release (which I'd expect by the end of the year).

@TimothyEDawson
Contributor Author

I definitely will!

Adding stubs alongside the Python files to provide type hints for
external use. These are intended to cover the public interface,
including all dynamically-generated attributes.
Moved literal_type_guard to a new _types.py file and added a new
function add_args_to_signature. Made several minor typing improvements
in the course of updating yaml2ck.py.
Only numpy.typing.ArrayLike needs to be explicitly exported, which can
be achieved with the `from X import Y as Y` syntax.
TODO: Update add_args_to_signature to support TypeForm[T] instead of
only type[T] so it can properly accept e.g. Iterable[str].
Includes an initial whitelist to make stubtest pass located at
interfaces/cython/.mypyignore, which should be reduced to the minimal
set of items which cannot be correctly inferred at runtime (e.g.
__cinit__ signatures).
Member

@speth speth left a comment


Thanks for the updates so far, @TimothyEDawson. I had just a couple of additional comments based on reviewing the remaining files.

I'm ready to approve and merge this PR once these are resolved and if the suggestion to remove the "model" type literals is adopted.

Member

@ischoegl ischoegl left a comment


@TimothyEDawson ... overall, I believe that with @speth and @bryanwweber commenting, this is in good hands.

The only thing I'd request, either as part of this PR or as a follow-up, is to add a brief section describing stub files and how to debug them to our development guide, i.e., https://cantera.org/stable/develop/index.html#adding-new-features-to-cantera. Not all of our contributors are seasoned Python programmers, so having some pointers in place would go a long way.

Previously, these types attempted to hard-code all valid values for
models including thermo, transport, and kinetics. These have been
replaced with plain `str` types to avoid conflicts with user-defined
models.
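
A small sketch of the trade-off described in this commit message, assuming a made-up alias and function names (none of these identifiers come from the stubs):

```python
from typing import Literal, get_args

# Hypothetical closed set of built-in model names, for illustration only.
BuiltinTransportModel = Literal["mixture-averaged", "multicomponent"]


def set_transport_strict(model: BuiltinTransportModel) -> str:
    """Literal-typed parameter: a type checker rejects any name not listed above."""
    return model


def set_transport(model: str) -> str:
    """Plain str parameter: user-defined model names type-check without complaint."""
    return model


user_model = "my-custom-transport"

# The user-defined name is not among the hard-coded options, so a call like
# set_transport_strict(user_model) would be flagged by a static type checker
# even though it runs without error.
print(user_model in get_args(BuiltinTransportModel))
print(set_transport(user_model))
```

The `str` annotation gives up editor completion of the built-in names in exchange for not flagging valid user-defined models.
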
@TimothyEDawson
Contributor Author

Alright, I think I managed to address all of the review comments, including a couple from quite a while back which I had forgotten to come back to.

I do already have a working merged cti2yaml.py and am part way toward merging ck2yaml.py, but they're sizable changes and required some bug fixes along the way, so it may be better to table them for now. I did at least try to correct some mistakes in the existing stub files which I found along the way.

@ischoegl I'll definitely try to write something up for the docs, but I would prefer to do it as a follow-up PR if I may.

Member

@speth speth left a comment


Thanks for the updates, @TimothyEDawson. While there is still some work to be done on the type annotations, I think this is in a reasonable place, and I look forward to considering further changes in much smaller future PRs.

@speth speth merged commit 5e042dc into Cantera:main Oct 24, 2025
54 checks passed
Development

Successfully merging this pull request may close these issues.

Add type hinting to the Python module

5 participants