
@xadupre xadupre commented Aug 7, 2025

onnx2 is a prototype that replicates the onnx API without using protobuf. It replaces the protobuf definitions of the ONNX classes (ModelProto, TensorProto, ...) and can load and save onnx files in the same format as protobuf.

This PR is a first step toward a discussion on whether we should remove the dependency on protobuf, and a way to show that doing so is possible.

The code is placed in a separate folder to let others experiment with it.

import onnx.onnx2 as onnx2

proto = onnx2.load("filename.onnx")
assert isinstance(proto, onnx2.ModelProto)
print(f"The model has {len(proto.graph.node)} nodes.")

The code is not perfect and can surely be improved, but it already supports serialization and parsing for all onnx classes (except TrainingInfoProto and Opaque). It is concrete enough to open a discussion on whether this could replace protobuf. Many tests were implemented to make sure this can be done (see the tests in onnx/test/test_onnx2_* and onnx/test/cpp/test_onnx2_*). One test file goes through all the onnx files in the backend tests and makes sure each model remains the same after it is deserialized and serialized again with onnx2: every model is loaded with onnx2, serialized with onnx2, then parsed with onnx and serialized with onnx. The final serialized string is then compared with the original one. Currently, the serialized strings have the same content; only the order of the attributes may change.
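
To illustrate what "same content, different attribute order" means at the byte level, here is a minimal sketch in plain Python. It is not the test code from this PR: `read_varint`, `group_fields` and `same_content` are hypothetical helpers, and the comparison only looks at the top-level fields of a message. It relies on the standard protobuf wire format, where each field starts with a varint key equal to `(field_number << 3) | wire_type`.

```python
from collections import defaultdict


def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    value, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def group_fields(buf):
    """Group the raw top-level fields of a message by field number.

    The relative order of entries sharing a field number is preserved,
    because that order is significant for repeated fields.
    """
    groups = defaultdict(list)
    pos = 0
    while pos < len(buf):
        start = pos
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 7
        if wire_type == 0:  # varint
            _, pos = read_varint(buf, pos)
        elif wire_type == 1:  # fixed 64-bit
            pos += 8
        elif wire_type == 2:  # length-delimited (string, bytes, sub-message)
            size, pos = read_varint(buf, pos)
            pos += size
        elif wire_type == 5:  # fixed 32-bit
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        groups[field_number].append(buf[start:pos])
    return dict(groups)


def same_content(a, b):
    """True if both buffers hold the same top-level fields, in any order."""
    return group_fields(a) == group_fields(b)


# Two encodings of the same message: field 1 = "ai" (string), field 2 = 18 (varint).
domain_first = b"\x0a\x02ai\x10\x12"
version_first = b"\x10\x12\x0a\x02ai"
print(same_content(domain_first, version_first))  # True
print(domain_first == version_first)  # False
```

A complete check would apply the same grouping recursively to sub-messages; the sketch stops at the first level to stay short.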

It currently provides the following features:

  • Compatibility with the protobuf format.
  • No dependency on protobuf, which should make any build easier.
  • No need to serialize/deserialize when switching from Python to C++ and back.
  • No limitation on file size (protobuf is limited to 2 GB). The code has yet to be tested with tensors bigger than 2 GB.
  • Writing and reading files with external data is handled directly in C++.
  • Parsing a file can be parallelized. When the parser walks through the file, it skips every big tensor and pushes the loading of the field raw_data into a queue. This queue is processed by as many threads as the user wants. This is the only strategy implemented so far, but with 4 threads a 1 GB model can be read twice as fast.
  • The current Python API is very close to the existing one: the functions in onnx.helper run without any change.
  • It is possible to skip loading the weights of a tensor even if the model is stored in a single file (no external weights).
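
The parallel parsing strategy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the C++ implementation: the file layout (an 8-byte length followed by the payload), the `THRESHOLD` value and the function names are all hypothetical. The point is the two-step pattern: one sequential walk that skips big payloads, and a pool of threads that loads them afterwards.

```python
import os
import struct
import tempfile
from concurrent.futures import ThreadPoolExecutor

THRESHOLD = 1024  # hypothetical cut-off above which a payload counts as "big"


def write_blobs(path, blobs):
    """Write each blob as an 8-byte little-endian length followed by its bytes."""
    with open(path, "wb") as f:
        for blob in blobs:
            f.write(struct.pack("<Q", len(blob)))
            f.write(blob)


def read_range(path, offset, size):
    """Read one payload; every task opens its own handle, so no lock is needed."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def load_blobs(path, num_threads=4):
    """Walk the file once, deferring big payloads to a pool of worker threads."""
    results, pending = [], []
    with open(path, "rb") as f, ThreadPoolExecutor(max_workers=num_threads) as pool:
        while header := f.read(8):
            (size,) = struct.unpack("<Q", header)
            if size > THRESHOLD:
                # Skip the payload now and queue its loading for a worker.
                future = pool.submit(read_range, path, f.tell(), size)
                pending.append((len(results), future))
                results.append(None)
                f.seek(size, os.SEEK_CUR)
            else:
                results.append(f.read(size))
        for index, future in pending:
            results[index] = future.result()
    return results


blobs = [b"small", os.urandom(5000), b"", os.urandom(3000)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "blobs.bin")
    write_blobs(path, blobs)
    loaded = load_blobs(path)
print(loaded == blobs)  # True
```

Leaving a deferred payload unloaded (never calling `future.result()`, or never submitting the task) is also how a loader could skip the weights entirely while still parsing the rest of the file.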

The features which are not implemented yet:

  • TrainingInfoProto and Opaque: easy to add, skipped because they are rarely used.
  • The Python API still lacks some infrequently used functions such as HasField (easy to fix).
  • Parallelized serialization is not implemented yet, but that's an easy fix.
  • The C++ API is close to the existing one, but this has not been tested yet.
  • oneof rule: the implementation is not complete yet and could be improved. Right now, the attributes of a oneof do not share the same space in memory.
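
For readers unfamiliar with the oneof rule, here is a minimal Python sketch of the intended semantics: all alternatives share a single storage slot, so setting one member implicitly clears the others. The class `OneofValue` and its methods are hypothetical and are not part of onnx2; in C++ the same idea would typically rely on a union or `std::variant`. The member names are borrowed from the `value` oneof of TypeProto.

```python
class OneofValue:
    """Holds at most one value among a fixed set of alternative members."""

    def __init__(self, *names):
        self._names = frozenset(names)
        self._active = None  # name of the member currently set, if any
        self._value = None  # single storage slot shared by all members

    def _check(self, name):
        if name not in self._names:
            raise AttributeError(f"unknown oneof member {name!r}")

    def set(self, name, value):
        self._check(name)
        # Overwriting the shared slot implicitly clears the previous member.
        self._active, self._value = name, value

    def has(self, name):
        self._check(name)
        return self._active == name

    def get(self, name):
        self._check(name)
        return self._value if self._active == name else None

    def which(self):
        """Return the name of the member currently set, or None."""
        return self._active


value = OneofValue("tensor_type", "sequence_type", "map_type")
value.set("tensor_type", {"elem_type": 1})
value.set("map_type", {"key_type": 7})
print(value.which())  # map_type
print(value.has("tensor_type"))  # False
```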

The features we could easily implement:

  • Since the library implements the parsing of a file or a string, we could easily implement a way to load the weights directly wherever the user wants them, which would sometimes avoid an unnecessary copy.
  • New serialization format: it could be done elsewhere, but this work offers a quick start that is easier to build on.
  • Partial loading or writing of models, for example to load different parts of a model on different machines. Even though a file has to be read entirely, it is easy to get to the desired parts quickly without parsing too much of the head of the file.
  • Research into a more compressed format (detecting duplicated strings, compressing strings, ...).
  • ...
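
As a rough illustration of partial loading, the sketch below indexes the length-delimited fields of a protobuf-encoded stream without reading their payloads, then loads only the field that was asked for. It is an assumption-laden sketch and not onnx2 code: `read_varint`, `index_fields` and `load_field` are hypothetical names, and the field numbers in the example are only illustrative.

```python
import io


def read_varint(stream):
    """Read one base-128 varint; return None on a clean end of stream."""
    value, shift = 0, 0
    while True:
        raw = stream.read(1)
        if not raw:
            if shift:
                raise EOFError("truncated varint")
            return None
        value |= (raw[0] & 0x7F) << shift
        if raw[0] < 0x80:
            return value
        shift += 7


def index_fields(stream):
    """Return (field_number, offset, size) for every length-delimited field.

    Payloads are skipped with seek(), so large tensors are never read here.
    """
    index = []
    while (key := read_varint(stream)) is not None:
        field_number, wire_type = key >> 3, key & 7
        if wire_type == 0:  # varint
            read_varint(stream)
        elif wire_type == 1:  # fixed 64-bit
            stream.seek(8, io.SEEK_CUR)
        elif wire_type == 2:  # length-delimited
            size = read_varint(stream)
            index.append((field_number, stream.tell(), size))
            stream.seek(size, io.SEEK_CUR)
        elif wire_type == 5:  # fixed 32-bit
            stream.seek(4, io.SEEK_CUR)
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
    return index


def load_field(stream, index, field_number):
    """Read only the payloads recorded for the requested field number."""
    payloads = []
    for number, offset, size in index:
        if number == field_number:
            stream.seek(offset)
            payloads.append(stream.read(size))
    return payloads


# field 1 = 10 (varint), field 2 = "demo" (string), field 7 = 300-byte payload
message = b"\x08\x0a" + b"\x12\x04demo" + b"\x3a\xac\x02" + bytes(300)
stream = io.BytesIO(message)
index = index_fields(stream)
print(index)  # [(2, 4, 4), (7, 11, 300)]
print(load_field(stream, index, 2))  # [b'demo']
```

The same index could be handed to several machines or threads, each reading only the byte ranges it needs.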

Proto Definition

The proto definitions can be found in onnx/onnx2/cpu/onnx2.h. Here is a simple example for the class OperatorSetIdProto. The current C++ implementation uses many macros instead of templates, which is something that could easily be improved.

BEGIN_PROTO(OperatorSetIdProto, "Defines a unique pair domain, opset version for a set of operators.")
FIELD_STR(
    domain,
    1,
    "The domain of the operator set being identified. The empty string (\"\") "
    "or absence of this field implies the operator set that is defined as part of the "
    "ONNX specification. This field MUST be present in this version of the IR when "
    "referring to any other operator set.")
FIELD_DEFAULT(
    int64_t,
    version,
    2,
    0,
    "The version of the operator set being identified. This field MUST be present in "
    "this version of the IR.")
END_PROTO()
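
To make the link with the protobuf format explicit, here is a small Python sketch of the bytes that such a definition corresponds to on the wire: `domain` is field 1 (a length-delimited string) and `version` is field 2 (an int64 encoded as a varint). The functions are hypothetical and only mirror the standard protobuf encoding. Skipping an empty `domain` while always writing `version` is an assumption of this sketch, not a statement about what the onnx2 macros do.

```python
def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def serialize_opset_id(domain, version):
    """Serialize domain (field 1, string) and version (field 2, int64)."""
    out = bytearray()
    if domain:  # assumption: an empty domain is simply not written
        data = domain.encode("utf-8")
        out += encode_varint((1 << 3) | 2)  # key: field 1, wire type 2
        out += encode_varint(len(data))
        out += data
    out += encode_varint((2 << 3) | 0)  # key: field 2, wire type 0
    # int64 values are written as 64-bit two's complement varints.
    out += encode_varint(version & 0xFFFFFFFFFFFFFFFF)
    return bytes(out)


print(serialize_opset_id("ai.onnx.ml", 3))  # b'\n\nai.onnx.ml\x10\x03'
print(serialize_opset_id("", 18))  # b'\x10\x12'
```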

The implementation is in onnx/onnx2/cpu/onnx2.cpp. It should be possible to refactor the code to avoid writing the list of all attributes four times (it is a prototype):

  • once to serialize,
  • once to parse,
  • once to compute the size of the serialized objects (this is important to speed up writing); this feature alone enables any future parallelization of the serialization,
  • once to print: this one could really be improved; it was more a way to experiment with an implementation based on templates rather than macros.

IMPLEMENT_PROTO(OperatorSetIdProto)
uint64_t OperatorSetIdProto::SerializeSize(utils::BinaryWriteStream& stream, SerializeOptions& options) const {
  uint64_t size = 0;
  SIZE_FIELD_EMPTY(size, options, stream, domain)
  SIZE_FIELD(size, options, stream, version)
  return size;
}
void OperatorSetIdProto::SerializeToStream(utils::BinaryWriteStream& stream, SerializeOptions& options) const {
  WRITE_FIELD_EMPTY(options, stream, domain)
  WRITE_FIELD(options, stream, version)
}
void OperatorSetIdProto::ParseFromStream(utils::BinaryStream& stream, ParseOptions& options) {
  READ_BEGIN(options, stream, OperatorSetIdProto)
  READ_FIELD(options, stream, domain)
  READ_FIELD(options, stream, version)
  READ_END(options, stream, OperatorSetIdProto)
}
std::vector<std::string> OperatorSetIdProto::PrintToVectorString(utils::PrintOptions& options) const {
  return write_proto_into_vector_string(options, NAME_EXIST_VALUE(domain), NAME_EXIST_VALUE(version));
}
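
The reason a size function opens the door to parallel serialization can be shown with a short Python sketch. Once the serialized size of every field is known, a prefix sum gives the offset of each field in the output, so independent workers can write into disjoint slices of one preallocated buffer. Everything here is hypothetical (`field_size`, `write_field`, `serialize_parallel`) and only handles length-delimited fields; in CPython the threads illustrate the layout rather than a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import accumulate


def encode_varint(value):
    """Encode a non-negative integer as a protobuf base-128 varint."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def varint_size(value):
    """Number of bytes a non-negative integer occupies as a varint."""
    size = 1
    while value >= 0x80:
        value >>= 7
        size += 1
    return size


def field_size(field_number, payload):
    """Size of a length-delimited field, computed without serializing it."""
    key = (field_number << 3) | 2
    return varint_size(key) + varint_size(len(payload)) + len(payload)


def write_field(view, offset, field_number, payload):
    """Write one length-delimited field at a precomputed offset."""
    key = (field_number << 3) | 2
    record = encode_varint(key) + encode_varint(len(payload)) + payload
    view[offset:offset + len(record)] = record


def serialize_parallel(fields, num_threads=4):
    """Serialize (field_number, payload) pairs into one buffer using threads."""
    sizes = [field_size(number, payload) for number, payload in fields]
    offsets = [0, *accumulate(sizes)]  # prefix sums: where each field starts
    buffer = bytearray(offsets[-1])
    with memoryview(buffer) as view, ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [
            pool.submit(write_field, view, offset, number, payload)
            for offset, (number, payload) in zip(offsets, fields)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    return bytes(buffer)


fields = [(1, b"abc"), (2, b"x" * 200), (3, b"")]
expected = b"\x0a\x03abc" + b"\x12\xc8\x01" + b"x" * 200 + b"\x1a\x00"
print(serialize_parallel(fields) == expected)  # True
```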

Everything not related to the onnx class definitions was placed in other files, so this work could be reused to define classes other than the ONNX ones.

More in onnx/onnx2/README.md.

[Image: a comparison of the Python bindings, giving an idea of the performance gain.]

Unit tests are currently passing. The style still needs to be addressed; I'll do that if this idea gains some traction.


@github-advanced-security github-advanced-security bot left a comment

lintrunner found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@xadupre xadupre changed the title ONNX2: prototype for a replacement of protobuf = same API, same format as ONNX but without protobuf [WIP] ONNX2: prototype for a replacement of protobuf = same API, same format as ONNX but without protobuf Aug 7, 2025
@xadupre xadupre marked this pull request as draft August 7, 2025 12:18

codecov bot commented Aug 7, 2025

Codecov Report

❌ Patch coverage is 83.49206% with 312 lines in your changes missing coverage. Please review.
✅ Project coverage is 55.98%. Comparing base (9c6b6c0) to head (890d160).
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| onnx/onnx2/helper.py | 56.16% | 122 Missing and 38 partials ⚠️ |
| onnx/test/test_onnx2_backend_onnx_vs_onnx2.py | 33.13% | 109 Missing and 4 partials ⚠️ |
| onnx/onnx2/io_helper.py | 69.76% | 7 Missing and 6 partials ⚠️ |
| onnx/test/test_onnx2_models.py | 91.11% | 8 Missing and 4 partials ⚠️ |
| onnx/test/test_onnx2_helper.py | 96.35% | 8 Missing and 1 partial ⚠️ |
| onnx/test/test_onnx2_onnx_vs_onnx2.py | 99.66% | 2 Missing and 1 partial ⚠️ |
| onnx/onnx2/pychecker.py | 88.88% | 1 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7208      +/-   ##
==========================================
+ Coverage   54.33%   55.98%   +1.64%     
==========================================
  Files         511      519       +8     
  Lines       31819    33708    +1889     
  Branches     2848     2995     +147     
==========================================
+ Hits        17290    18871    +1581     
- Misses      13753    14008     +255     
- Partials      776      829      +53     

@xadupre xadupre changed the title [WIP] ONNX2: prototype for a replacement of protobuf = same API, same format as ONNX but without protobuf ONNX2: prototype for a replacement of protobuf = same API, same format as ONNX but without protobuf Aug 7, 2025
xadupre and others added 17 commits August 31, 2025 22:45
Signed-off-by: xadupre <[email protected]>
### Description
Switch to pybind11 3.0.

---------

Signed-off-by: xadupre <[email protected]>
### Description
`BUILD_ONNX_PYTHON` has been deprecated since 1.18.
### Motivation and Context
Better code.

Signed-off-by: cyy <[email protected]>
Co-authored-by: Andreas Fehlner <[email protected]>
Signed-off-by: xadupre <[email protected]>
Remove deprecated methods that are scheduled to be removed in 1.20. Fixes
onnx#7213

---------

Signed-off-by: Justin Chu <[email protected]>
Signed-off-by: xadupre <[email protected]>
### Description
Currently, convInteger does not handle ``x_zero_points`` and
``w_zero_points`` correctly when they are not ``None``.
The ``None`` check on the arguments only worked when they were ``None``,
since calling ``bool`` on a NumPy array does not work.
Furthermore, the broadcasting for ``w_zero_points`` did not work
correctly, since ``w_zero_points`` needs to be applied
per output channel and thus needs manual dimension expansion.

### Motivation and Context
The problem arose when testing onnx-reference via the
[onnx-tests](https://github.com/cbourjau/onnx-tests) tool.

---------

Signed-off-by: Konstantin Pueckler <[email protected]>
Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Andreas Fehlner <[email protected]>
Co-authored-by: Justin Chu <[email protected]>
Co-authored-by: justinchuby <[email protected]>
Signed-off-by: xadupre <[email protected]>
andife and others added 22 commits August 31, 2025 22:45
Signed-off-by: Andreas Fehlner <[email protected]>
Signed-off-by: Justin Chu <[email protected]>
Co-authored-by: Justin Chu <[email protected]>
Signed-off-by: xadupre <[email protected]>
Signed-off-by: Andreas Fehlner <[email protected]>
Signed-off-by: xadupre <[email protected]>
Signed-off-by: Andreas Fehlner <[email protected]>
Signed-off-by: xadupre <[email protected]>
### Description
Clean up tests. Remove unnecessary skips and imports.

### Motivation and Context
As per issue onnx#7098, there are unnecessary skips and imports in some
tests. I found 17 tests that were marked as skipped a couple of years
ago; they now pass without warnings.

---------

Signed-off-by: Eetu Tunturi <[email protected]>
Signed-off-by: xadupre <[email protected]>
### Description
In the Attention op definition, update the inputs to Range to be
scalars as opposed to 1-element vectors, as required by the Range op
spec.

### Motivation and Context
See discussion
[here](onnx#7164 (comment)).

---------

Signed-off-by: Yuan Yao <[email protected]>
Signed-off-by: xadupre <[email protected]>
Signed-off-by: Andreas Fehlner <[email protected]>
Signed-off-by: xadupre <[email protected]>
This helps automatic license checkers like pip-licenses to identify the
right license for this project

This is also the recommendation at
https://packaging.python.org/en/latest/guides/writing-pyproject-toml/#license

Signed-off-by: Erik Cederstrand <[email protected]>
Signed-off-by: xadupre <[email protected]>
Signed-off-by: Andreas Fehlner <[email protected]>
Signed-off-by: xadupre <[email protected]>
…toml) (onnx#7246)

Signed-off-by: Andreas Fehlner <[email protected]>
Signed-off-by: xadupre <[email protected]>
cbourjau commented Sep 1, 2025

I realize that this is a draft PR, but I just wanted to say that, IMHO, a non-code RFC would be a more manageable basis for a discussion. The Rust RFC template is an excellent starting point for such an endeavor in my experience.

xadupre commented Sep 1, 2025

Not sure I'll continue that effort. I wrote the code quite fast. My point was to prove that removing protobuf is doable. When there is a need, this work will pop up again.

xadupre commented Sep 1, 2025

Issues we would not run into by using this: microsoft/onnxruntime#25825.
