Conversation

@kervel
Collaborator

@kervel kervel commented Sep 22, 2025

This PR makes it easier to use rustgen to generate rust crates with a python API:

  • add the pyo3-stub-gen crate to the python3 generator, so that a python stub module is generated for a rust crate (this means the rust API will have python type hints on it, and you can use your IDE's code completion and type checker)

  • add an extra flag to rustgen that makes it friendlier for iterative development (I can now split the package into an autogenerated part and a handwritten part, and regenerate without worrying that my handwritten content will be lost)

  • add a demo function to load yaml using rust in the lib.rs handwritten part, that can be further tweaked (or removed if not needed) by the user

  • document the whole flow of creating a rust crate with python API.

@kervel kervel force-pushed the rustgen-pyo3-stubgen branch 2 times, most recently from 94a0b24 to 5a24a92 Compare September 22, 2025 13:51
@kervel kervel force-pushed the rustgen-pyo3-stubgen branch from 5a24a92 to 993c252 Compare September 29, 2025 08:14
@codecov

codecov bot commented Sep 29, 2025

Codecov Report

❌ Patch coverage is 80.70175% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.85%. Comparing base (ec19f23) to head (cc59180).
⚠️ Report is 1 commits behind head on main.

Files with missing lines | Patch % | Lines
linkml/generators/rustgen/rustgen.py | 75.64% | 8 Missing and 11 partials ⚠️
linkml/generators/rustgen/template.py | 93.75% | 0 Missing and 2 partials ⚠️
linkml/generators/rustgen/cli.py | 50.00% | 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2916      +/-   ##
==========================================
+ Coverage   80.43%   83.85%   +3.42%     
==========================================
  Files         141      141              
  Lines       15715    15807      +92     
  Branches     3187     3205      +18     
==========================================
+ Hits        12640    13255     +615     
+ Misses       2408     1838     -570     
- Partials      667      714      +47     

@kervel kervel changed the title from "[rustgen] generate stubs for python type hints" to "Rustgen: ability to easily generate rust packages with ergonomic python bindings" Sep 29, 2025
@kervel kervel force-pushed the rustgen-pyo3-stubgen branch from 993c252 to 1e4504a Compare September 29, 2025 09:46
@kervel kervel marked this pull request as ready for review September 29, 2025 09:50
@kervel kervel requested a review from tfliss September 29, 2025 09:50
@kervel kervel force-pushed the rustgen-pyo3-stubgen branch 2 times, most recently from 94da5cc to 9b15e71 Compare September 29, 2025 10:55
@kervel kervel changed the title from "Rustgen: ability to easily generate rust packages with ergonomic python bindings" to "Rustgen: ability to generate rust packages with ergonomic python bindings" Sep 29, 2025
@kervel kervel force-pushed the rustgen-pyo3-stubgen branch from 9b15e71 to 355ec22 Compare September 29, 2025 11:14
@kervel kervel mentioned this pull request Sep 30, 2025
Add --handwritten-lib flag that redirects generated sources into src/generated, leaves a reusable lib.rs shim, and exposes a YAML loading helper to Python. Update templates, stubgen binary, and rustgen tests accordingly, and document the end-to-end workflow including stub generation before building wheels.
@tfliss
Contributor

tfliss commented Oct 8, 2025

Ok, I read through all the code today, with an AI assist to explain some of the rust / crate bits. rustgen.py feels like it's getting a little long and I left a few other comments, probably not anything I'll hold up the PR on. Tomorrow I'll try running the tests and a cli example and maybe try it out in a colab notebook. For the most part this looks good and very useful. Thank you.

@kervel
Collaborator Author

kervel commented Oct 8, 2025

Hi @tfliss, is it possible you forgot to submit your PR review? I don't see the comments.


{% if root_struct_name %}
#[cfg(feature = "serde")]
/// Example helper that loads a YAML document into the root class. Edit or extend as needed.
Contributor

Is a 'yaml' feature needed here specifically in addition to regular serde? Maybe this being an example it isn't that important.

Collaborator Author

well the "serde" feature flag actually brings in a serde_yml dependency:

SERDE_IMPORTS = Imports(
    imports=[
        Import(
            module="serde",
            version="1.0",
            features=["derive"],
            objects=[
                ObjectImport(name="Serialize"),
                ObjectImport(name="Deserialize"),
                ObjectImport(name="de::IntoDeserializer"),
            ],
            feature_flag="serde",
        ),
        Import(module="serde-value", version="0.7.0", objects=[ObjectImport(name="Value")]),
        Import(module="serde_yml", version="0.0.12", feature_flag="serde", alias="_"),
        Import(module="serde_path_to_error", version="0.1.17", objects=[], feature_flag="serde"),
    ]
)

so i think that's fine..
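To make the point about the feature flag concrete, here is a minimal sketch of how imports tagged with a `feature_flag` could be grouped into a per-feature dependency list. The `Import` class below is a simplified stand-in and `features_table` is a hypothetical helper; neither is the actual rustgen code.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Import:
    """Simplified, hypothetical stand-in for the generator's Import model."""

    module: str
    version: str
    feature_flag: Optional[str] = None


def features_table(imports: list[Import]) -> dict[str, list[str]]:
    """Group optional dependencies by the feature flag that enables them."""
    table: dict[str, list[str]] = defaultdict(list)
    for imp in imports:
        # Imports without a flag are unconditional and belong to no feature.
        if imp.feature_flag is not None:
            table[imp.feature_flag].append(imp.module)
    return dict(table)


imports = [
    Import("serde", "1.0", feature_flag="serde"),
    Import("serde-value", "0.7.0"),
    Import("serde_yml", "0.0.12", feature_flag="serde"),
    Import("serde_path_to_error", "0.1.17", feature_flag="serde"),
]
features = features_table(imports)
print(features)
```

Under this assumption, enabling the single "serde" feature pulls in serde_yml as well, which is why no separate "yaml" feature would be needed.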

{% endif %}

#[cfg(feature = "stubgen")]
::pyo3_stub_gen::impl_stub_type!({{ enum_name }} = {{ struct_names | join(' | ') }});
Contributor

This might be an edge case, or even not allowed in linkml, but if the enum is empty the join will probably generate incorrect syntax. Maybe caught earlier? I saw it in a couple of places, and maybe also the case for some for loops. Won't hold up the PR on this, though. I'm curious if there's a compliance case for empty enums.

Collaborator Author

This is about the OrSubtype enums, not about regular linkml enums. These enums are only generated when a class has at least one subtype.

They are needed to model a slot range where a slot value can be something or a subtype of that something, because rust has no struct inheritance.

I'll address the other comments by changing or clarifying the code.
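The edge case raised above can be reproduced outside the template. The sketch below uses plain `str.join`, which, like Jinja's `join` filter, yields an empty string for an empty sequence. `render_stub_type` is a hypothetical helper and the names `PersonOrSubtype`, `Person`, and `Adult` are illustrative only.

```python
def render_stub_type(enum_name: str, struct_names: list[str]) -> str:
    """Render the stub-type macro line, or nothing when there are no variants."""
    if not struct_names:
        # Guard: an empty join would leave a dangling "=" in the macro call.
        return ""
    variants = " | ".join(struct_names)
    return f"::pyo3_stub_gen::impl_stub_type!({enum_name} = {variants});"


# Without a guard, an empty list leaves nothing after the "=".
unguarded = "PersonOrSubtype = " + " | ".join([])
print(repr(unguarded))

print(render_stub_type("PersonOrSubtype", ["Person", "Adult"]))
print(repr(render_stub_type("PersonOrSubtype", [])))
```

Since the OrSubtype enums are only emitted for classes with at least one subtype, the guard would be a defensive measure rather than a fix for a known failure.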

let mut issues = Vec::new();

for (name, module) in &stub.modules {
    let normalized_name = name.replace('-', "_");
Contributor

Is this name-cleaning going to be robust enough? Consider at least pulling the normalize into a helper function that is a clear place for future improvements.

Contributor

Should also check for e.g. * in names that are converted to paths.

Collaborator Author

OK, the helper is there now. It's in the rust code, not the python code, and it's a bit more involved than I originally thought, but it's there.
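As an illustration of what a dedicated normalization helper might look like, here is a minimal Python sketch. The merged helper lives in the rust stubgen code and is more involved; `normalize_module_name` and its rules below are assumptions made for illustration, not a port of that code.

```python
import keyword
import re

# A conservative identifier pattern: ASCII letters, digits, and underscores.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def normalize_module_name(name: str) -> str:
    """Turn a crate-style name into a dotted Python module path, or raise."""
    parts = []
    for part in name.split("."):
        candidate = part.replace("-", "_")
        # Reject anything that would not be a valid module segment,
        # including characters such as "*" or "/" that are unsafe in paths.
        if not _IDENT.fullmatch(candidate) or keyword.iskeyword(candidate):
            raise ValueError(f"unsupported module name segment: {part!r}")
        parts.append(candidate)
    return ".".join(parts)


print(normalize_module_name("my-crate"))
print(normalize_module_name("pkg.sub-mod"))
```

Keeping the rules in one function gives future improvements a single place to land, which is the main benefit the review comment asks for.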

bf.write(rendered_bin)

if self.handwritten_lib:
    shim_path = src_dir / "lib.rs"
Contributor

Can the path elements be checked for e.g. '..' and '/', and can the construction of the path, mkdir, etc. be generalized somewhat? It seems to be repeated. I actually need to look at this in panderagen soon, so I wonder if that could go into a shared module.

Collaborator Author

i have done something, not sure if it is what you wanted.
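One possible shape for the shared path helpers discussed here is sketched below. `safe_join` and `ensure_dir` are hypothetical names, and the set of rejected components is an assumption; the file system methods that were actually merged may differ.

```python
import tempfile
from pathlib import Path


def safe_join(base: Path, *parts: str) -> Path:
    """Join single path components onto base, rejecting traversal and separators."""
    for part in parts:
        if part in ("", ".", "..") or "/" in part or "\\" in part:
            raise ValueError(f"unsafe path component: {part!r}")
    return base.joinpath(*parts)


def ensure_dir(path: Path) -> Path:
    """Create the directory (and any parents) if needed and return it."""
    path.mkdir(parents=True, exist_ok=True)
    return path


with tempfile.TemporaryDirectory() as tmp:
    # Mirrors the layout from the PR: generated sources under src/generated,
    # with a handwritten lib.rs next to that directory.
    generated_dir = ensure_dir(safe_join(Path(tmp), "src", "generated"))
    shim_path = safe_join(generated_dir.parent, "lib.rs")
    created = generated_dir.is_dir()
    print(shim_path.relative_to(tmp))
```

Validating each component before joining keeps the check in one place, so the generator and other callers do not have to repeat it around every mkdir.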

import personinfo
# Update the path below to point at your data file.
container = personinfo.load_yaml_container("path/to/data.yaml")
Contributor

can you point this to say one of the tutorial person files or some other specific file? I got None when I used examples/tutorial/tutorial07/personinfo.yaml

I probably could find or make one to use, but a specific file would be easier for new users.

Contributor

Interesting, examples/PersonSchema/data/example_personinfo_data.yaml loads the file of course, but prints [<builtins.Person object at 0x10cabd110>, <builtins.Person object at 0x10cabd2c0>]
I guess this is ok for now, seems odd that Person is defined in builtins.

Contributor

Ok, I just changed my version of this script fragment to the following and to me it feels better for a quick-start example. Consider using something like this:

container = personinfo.load_yaml_container("examples/PersonSchema/data/example_personinfo_data.yaml")
print(container.persons[0].name)

I ran the generator and coverage and it all works fine. I'm marking my review 'changes requested' because you said you're making minor documentation / comment edits and code clarification. Please request a new review when that's done and I'll approve.

Collaborator Author

Interesting, examples/PersonSchema/data/example_personinfo_data.yaml loads the file of course, but prints [<builtins.Person object at 0x10cabd110>, <builtins.Person object at 0x10cabd2c0>]

I just found out that you can tweak the module name for generated classes (so that it's no longer builtins). However, it's a bit involved (every #[pyclass] has to change, as far as I can see), and then I would have hardcoded module names all over the place. So I'd rather leave it as is for now.

the "fix" would be to write:

#[pyclass(module = "mymodule")]

instead of just pyclass.. but then for instance renaming the module becomes next to impossible (having to manually edit generated code). so i'd argue that the fix is worse than the problem.

Contributor

@tfliss tfliss left a comment

Ok, marking 'request changes' since you said you're making minor edits based on review comments. Overall it looks great and the functionality is useful. Thank you.

@amc-corey-cox
Contributor

@kervel and @tfliss are these comments completed? Is this ready for me to review?

@kervel
Collaborator Author

kervel commented Oct 10, 2025

Not yet! Will try next week...

@kervel kervel requested a review from tfliss October 14, 2025 08:28
@kervel
Collaborator Author

kervel commented Oct 14, 2025

hello @tfliss , i think i implemented your feedback.

Contributor

@tfliss tfliss left a comment

@kervel I really appreciate your thoughtful work to implement the suggestions. This looks really great and I will look at re-using your file system methods.

@amc-corey-cox If you or another core dev are able to merge: I reviewed this PR extensively, used the python API successfully, and Frank went above and beyond implementing the comment suggestions.

@kevinschaper kevinschaper merged commit 551d2da into main Oct 16, 2025
23 checks passed
@kevinschaper kevinschaper deleted the rustgen-pyo3-stubgen branch October 16, 2025 17:19
@amc-corey-cox amc-corey-cox restored the rustgen-pyo3-stubgen branch October 16, 2025 21:03
@amc-corey-cox amc-corey-cox deleted the rustgen-pyo3-stubgen branch October 16, 2025 21:03

4 participants