Rustgen: ability to generate rust packages with ergonomic python bindings #2916
Codecov Report
@@            Coverage Diff             @@
##             main    #2916      +/-   ##
==========================================
+ Coverage   80.43%   83.85%   +3.42%
==========================================
  Files         141      141
  Lines       15715    15807      +92
  Branches     3187     3205      +18
==========================================
+ Hits        12640    13255     +615
+ Misses       2408     1838     -570
- Partials      667      714      +47
Add --handwritten-lib flag that redirects generated sources into src/generated, leaves a reusable lib.rs shim, and exposes a YAML loading helper to Python. Update templates, stubgen binary, and rustgen tests accordingly, and document the end-to-end workflow including stub generation before building wheels.
Ok, I read through all the code today, with an AI assist to explain some of the rust / crate bits. rustgen.py feels like it's getting a little long and I left a few other comments, probably not anything I'll hold up the PR on. Tomorrow I'll try running the tests and a cli example and maybe try it out in a colab notebook. For the most part this looks good and very useful. Thank you.
Hi @tfliss, is it possible you forgot to submit your PR review? I don't see the comments.
{% if root_struct_name %}
#[cfg(feature = "serde")]
/// Example helper that loads a YAML document into the root class. Edit or extend as needed.
Is a 'yaml' feature needed here specifically in addition to regular serde? Maybe this being an example it isn't that important.
well the "serde" feature flag actually brings in a serde_yml dependency:
SERDE_IMPORTS = Imports(
    imports=[
        Import(
            module="serde",
            version="1.0",
            features=["derive"],
            objects=[
                ObjectImport(name="Serialize"),
                ObjectImport(name="Deserialize"),
                ObjectImport(name="de::IntoDeserializer"),
            ],
            feature_flag="serde",
        ),
        Import(module="serde-value", version="0.7.0", objects=[ObjectImport(name="Value")]),
        Import(module="serde_yml", version="0.0.12", feature_flag="serde", alias="_"),
        Import(module="serde_path_to_error", version="0.1.17", objects=[], feature_flag="serde"),
    ]
)
so i think that's fine..
{% endif %}

#[cfg(feature = "stubgen")]
::pyo3_stub_gen::impl_stub_type!({{ enum_name }} = {{ struct_names | join(' | ') }});
This might be an edge case, or even not allowed in linkml, but if the enum is empty the join will probably generate incorrect syntax. Maybe caught earlier? I saw it in a couple of places, and maybe also the case for some for loops. Won't hold up the PR on this, though. I'm curious if there's a compliance case for empty enums.
This is about the OrSubtype enums, not about regular LinkML enums. These enums are only generated when a class has at least one subtype.
They are needed to model a slot range where a slot value can be something or a subtype of that something; Rust has no struct inheritance.
I'll address the other comments by changing / clarifying the code.
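For readers following the empty-join concern: joining an empty list yields an empty string, so the macro line would render as `impl_stub_type!(X = );`. The reply above explains why this cannot occur for OrSubtype enums, but a defensive guard on the Python side could look roughly like the sketch below. `render_stub_type` is a hypothetical helper invented for illustration; it only mirrors the template line quoted above and is not part of rustgen.

```python
def render_stub_type(enum_name: str, struct_names: list[str]) -> str:
    """Hypothetical helper mirroring the quoted template line (not rustgen's API)."""
    if not struct_names:
        # " | ".join([]) is "", which would render `Enum = );`, so emit nothing.
        return ""
    variants = " | ".join(struct_names)
    return f"::pyo3_stub_gen::impl_stub_type!({enum_name} = {variants});"


# Example with made-up names: a class with one subtype.
print(render_stub_type("PersonOrSubtype", ["Person", "Employee"]))
```

Returning an empty string keeps the generated file valid; raising an error instead would be a reasonable alternative if an empty variant list should never reach the template.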
let mut issues = Vec::new();

for (name, module) in &stub.modules {
    let normalized_name = name.replace('-', "_");
Is this name-cleaning going to be robust enough? Consider at least pulling the normalize into a helper function that is a clear place for future improvements.
Should also check for e.g. * in names that are converted to paths.
OK, the helper is there now. It's in the Rust code, not the Python code, and it's a bit more involved than I originally thought, but it's there.
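The helper added in this PR lives in the Rust code and is not shown in this thread. Purely as an illustration of the idea discussed here (one central place for name cleaning, also covering characters such as `*`), a Python sketch might look like this. The function name and the exact rules are assumptions, not the PR's implementation.

```python
import re


def normalize_module_name(name: str) -> str:
    """Hypothetical sketch: turn an arbitrary name into a safe identifier.

    Assumed rules (not taken from the PR): every character outside
    [0-9A-Za-z_] becomes "_", and a leading digit or empty result gets
    a "_" prefix so the result is a valid identifier.
    """
    cleaned = re.sub(r"[^0-9A-Za-z_]", "_", name)
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


for raw in ["person-info", "a*b", "1abc"]:
    print(raw, "->", normalize_module_name(raw))
```

Keeping the rules in one function means later refinements (for example, handling reserved words) need only one change.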
bf.write(rendered_bin)

if self.handwritten_lib:
    shim_path = src_dir / "lib.rs"
Can the path elements be checked for e.g. '..' and '/', and the construction of the path, mkdir, etc. be generalized somewhat? It seems to be repeated. I actually need to look at this in panderagen soon, so I wonder if that could go into a shared module.
i have done something, not sure if it is what you wanted.
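The thread does not show what was actually changed, so the following is only a minimal sketch of the kind of shared helper the review comment asks for: validate each path element, then build the path and create parent directories in one place. `safe_join` and `write_generated_file` are hypothetical names, and the rejection rules are assumptions.

```python
from pathlib import Path


def safe_join(base: Path, *parts: str) -> Path:
    """Hypothetical helper: join single path elements onto base, rejecting unsafe ones."""
    for part in parts:
        # Each element must be one plain name: no traversal, no separators.
        if part in ("", ".", "..") or "/" in part or "\\" in part:
            raise ValueError(f"unsafe path element: {part!r}")
    return base.joinpath(*parts)


def write_generated_file(base: Path, parts: list[str], content: str) -> Path:
    """Hypothetical helper: validate the path, create parent dirs, write the file."""
    target = safe_join(base, *parts)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target


print(safe_join(Path("src"), "generated", "lib.rs"))
```

A call such as `write_generated_file(src_dir, ["generated", "lib.rs"], rendered)` would then replace the repeated path construction and `mkdir` steps; `src_dir` and `rendered` here stand in for whatever the caller already has.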
docs/generators/rust.rst
Outdated
import personinfo
# Update the path below to point at your data file.
container = personinfo.load_yaml_container("path/to/data.yaml")
can you point this to say one of the tutorial person files or some other specific file? I got None when I used examples/tutorial/tutorial07/personinfo.yaml
I probably could find or make one to use, but a specific file would be easier for new users.
Interesting, examples/PersonSchema/data/example_personinfo_data.yaml loads the file of course, but prints [<builtins.Person object at 0x10cabd110>, <builtins.Person object at 0x10cabd2c0>]
I guess this is ok for now, seems odd that Person is defined in builtins.
Ok, I just changed my version of this script fragment to the following and to me it feels better for a quick-start example. Consider using something like this:
container = personinfo.load_yaml_container("examples/PersonSchema/data/example_personinfo_data.yaml")
print(container.persons[0].name)
I ran the generator and coverage and it all works fine. I'm marking my review 'changes requested' because you said you're making minor documentation / comment edits and code clarification. Please request a new review when that's done and I'll approve.
Interesting, examples/PersonSchema/data/example_personinfo_data.yaml loads the file of course, but prints [<builtins.Person object at 0x10cabd110>, <builtins.Person object at 0x10cabd2c0>]
I just found out that you can tweak the module name (so that it's no longer builtins.) for generated classes. However, it's a bit involved (change every #[pyclass], as far as I can see) and then I would have hardcoded module names all over the place, so I'd rather leave it as is for now.
The "fix" would be to write:
#[pyclass(module = "mymodule")]
instead of just pyclass. But then, for instance, renaming the module becomes next to impossible (having to manually edit generated code), so I'd argue that the fix is worse than the problem.
Ok, marking 'request changes' since you said you're making minor edits based on review comments. Overall it looks great and the functionality is useful. Thank you.
Not yet! Will try next week...
Hello @tfliss, I think I implemented your feedback.
@kervel I really appreciate your thoughtful work to implement the suggestions. This looks really great and I will look at re-using your file system methods.
@amc-corey-cox If you or another core dev are able to merge: I reviewed this PR extensively, used the Python API successfully, and Frank went above and beyond implementing the comment suggestions.
This PR makes it easier to use rustgen to generate rust crates with a python API:
- add the pyo3-stub-gen crate to the python3 generator, so that a python stub module is generated for a rust crate (this means the rust API will have python type hints on it, and you can use your IDE's code completion and type checker)
- add an extra flag to rustgen to make rustgen more friendly for iterative development (now I can split the package into an autogenerated part and a handwritten part, and regenerate without being afraid my handwritten content will be lost)
- add a demo function to load yaml using rust in the handwritten lib.rs part, which can be further tweaked (or removed if not needed) by the user
- document the whole flow of creating a rust crate with a python API.