Releases: huggingface/optimum
v2.0.0: Optimum ONNX, TF Lite, BetterTransformer
Breaking changes since v1.x
v2.0.0 introduces several breaking changes
1. ONNX integration moved to Optimum ONNX
ONNX export and ONNX Runtime inference-related integrations were moved to Optimum ONNX #2298
Installation
How to obtain the same behavior as v1.x
pip install "optimum-onnx[onnxruntime]"
or equivalently
pip install "optimum[onnxruntime]"
🚨 You shouldn't install optimum without an extra if you want to be able to export your model to ONNX; please follow the installation instructions from the documentation 🚨
ONNX Runtime Training
ONNX Runtime Training is officially deprecated; more information on this in the optimum-onnx v0.0.1 release notes
2. TF Lite export
TF Lite export officially deprecated #2340
3. BetterTransformer
BetterTransformer officially deprecated #2305
Improvements
Optimum pipelines
- Optimum pipelines by @IlyasMoutawwakil in #2343
General improvements
- Cleanup tasks manager by @IlyasMoutawwakil in #2346
- Remove legacy export by @IlyasMoutawwakil in #2359
- Native namespace packages (PEP 420) by @IlyasMoutawwakil in #2361
- Fix register loop by @IlyasMoutawwakil in #2364
New Contributors
- @openvino-dev-samples made their first contribution in #2335
Full Changelog: v1.27.0...v2.0.0
v1.27.0: Last release before v2, Transformers 4.53 support, SmolLM3, VisualBert...
🚀 Major Upgrades
- Transformers v4.53 support and SmolLM3 model addition by @IlyasMoutawwakil in #2326
- Batched inference support across all decoders by @IlyasMoutawwakil in #2319
- VisualBert support by @Abdennacer-Badaoui in #2303
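Batched inference for decoder-only models generally relies on left padding, so that every sequence in the batch ends at the same position and generation continues from the last real token. A minimal illustrative sketch (hypothetical helper, not optimum's actual API):

```python
# Hypothetical helper illustrating left padding for batched decoder inference.
# Sequences are padded on the left so they all end at the same position,
# which is what decoder-only generation expects.
def left_pad(batch, pad_id=0):
    width = max(len(seq) for seq in batch)
    return [[pad_id] * (width - len(seq)) + seq for seq in batch]
```

For example, `left_pad([[1, 2], [3, 4, 5]])` yields `[[0, 1, 2], [3, 4, 5]]`.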
🔧 Enhancements & Fixes
- Fix taskmanager by @echarlaix in #2296
- Add task onnx register by @echarlaix in #2291
- ExporterConfig refactorization by @echarlaix in #2157
- remove timm from exporters extra by @echarlaix in #2299
- No more forcing separators by @IlyasMoutawwakil in #2279
- Fix broken Trainer documentation link in README by @VolodymyrBg in #2304
- Propagate library_name parameter in from_pretrained to export by @tomaarsen in #2328
- Fix 'Block pattern could not be match. Pass block_name_to_quantize argument in quantize_model' while loading Qwen VL GPTQ model by @arunmadhusud in #2295
🧹 Deprecations & v2
- Deprecated support for TF Lite, BetterTransformer, and ONNX Runtime Training; these integrations will be fully removed in v2.
- TensorFlow model export will be removed in v2, consistent with the Transformers library dropping TF/JAX support.
- The ONNX and ONNX Runtime integrations will move into the new Optimum ONNX package.
New Contributors
- @dependabot[bot] made their first contribution in #2292
- @arunmadhusud made their first contribution in #2295
- @VolodymyrBg made their first contribution in #2304
Full Changelog: v1.26.1...v1.27.0
v1.26.1: Patch release
Add back from_transformers for base model by @echarlaix in #2288
v1.26.0: ColPali, D-FINE, InternLM2
ONNX export
- D-FINE support by @xenova in #2249
- ColPali support by @Balladie in #2251
- InternLM2 support by @gmf14 in #2244
- Chinese CLIP support by @xenova in #1591
- Qwen3 support by @IlyasMoutawwakil in #2278
New features & enhancements
- Add onnxslim support by @inisis in #2258
- Introduce ORTSessionMixin and enable general io binding by @IlyasMoutawwakil in #2234
- Fix and uniformize hub kwargs by @IlyasMoutawwakil in #2276
- Add compatibility with transformers 4.52 by @echarlaix in #2270
- Distribute and complete onnxruntime tests (decoder models) by @IlyasMoutawwakil in #2278
- Add ONNX Runtime optimization support for ModernBERT by @amas0 in #2208
New Contributors
v1.25.3: Patch release
- Fix ORT pipelines by @echarlaix in #2274
Full Changelog: v1.25.2...v1.25.3
v1.25.2: Patch release
What's Changed
- Upgrade optimum-intel in setup extras by @echarlaix in #2271
- Match transformers behavior with return_dict by @IlyasMoutawwakil in #2269
Full Changelog: v1.25.1...v1.25.2
v1.25.1: Patch release
What's Changed
- Updated readme/pypi page by @IlyasMoutawwakil in #2268
- Fix bug ORTModelForFeatureExtraction by @Abdennacer-Badaoui in #2267
- Fix doc TPU section by @echarlaix in #2265
Full Changelog: v1.25.0...v1.25.1
v1.25.0: ViTPose, RT-DETR, EfficientNet, Moonshine ONNX
🚀 New Features & Enhancements
- Add ONNX export support for ViTPose, RT-DETR, EfficientNet, Moonshine
- Infer if the model needs to be exported to ONNX during loading
```diff
  from optimum.onnxruntime import ORTModelForCausalLM

  model_id = "meta-llama/Llama-3.2-1B"
- model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
+ model = ORTModelForCausalLM.from_pretrained(model_id)
```
- Transformers v4.49, v4.50 and v4.51 compatibility
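The auto-detection above can be pictured as a simple check for existing ONNX files in the model directory; this is a hypothetical sketch of the idea, not optimum's actual implementation:

```python
from pathlib import Path

# Hypothetical sketch: loading can skip the export step when the model
# directory already contains an .onnx file anywhere in its tree.
def needs_onnx_export(model_dir):
    """Return True when no .onnx file is present and an export is required."""
    return not any(Path(model_dir).glob("**/*.onnx"))
```

In the real API this decision happens inside `from_pretrained`, so passing `export=True` is no longer necessary.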
🔥 New Contributors
A huge thank you to our first-time contributors:
- @ruidazeng
- @ariG23498
- @janak2
- @qubvel
- @zhxchen17
- @xieofxie
- @EFord36
- @Thas-Tayapongsak
- @hans00
- @Abdennacer-Badaoui
What's Changed
- Update ort training installation instructions by @echarlaix in #2173
- Dev version by @echarlaix in #2175
- Fixed All Typos in docs by @ruidazeng in #2185
- Remove deprecated ORTModel class by @echarlaix in #2187
- avoid library_name guessing if it is known in parameters standardization by @eaidova in #2179
- Infer whether a model needs to be exported to ONNX or not by @echarlaix in #2181
- Update optimum neuron extra by @dacorvo in #2190
- Add support for Moonshine ONNX export (& seq2seq models with non-legacy cache & Tensor.repeat_interleave) by @xenova in #2162
- ViTPose by @ariG23498 in #2183
- ViTPose export fix by @echarlaix in #2192
- Remove ORTTrainer code snippet from README by @echarlaix in #2194
- Remove README code snippets by @echarlaix in #2195
- Add transformers v4.49 support by @echarlaix in #2191
- Fix test benchmark suite by @echarlaix in #2199
- fix the onnx export custom model example; fix repo name; fix opset version; remove deprecated arg; by @janak2 in #2203
- Limit transformers version for bettertransformer support by @echarlaix in #2198
- Add ONNX config for RT-DETR (and RT-DETRv2) by @qubvel in #2201
- Remove deprecated notebook by @echarlaix in #2205
- Update CI runner to ubuntu 22.04 by @echarlaix in #2206
- Add executorch documentation section by @echarlaix in #2193
- Fix typo in exporters/onnx/utils.py by @zhxchen17 in #2210
- Link Optimum-ExecuTorch to parent Optimum on Hub by @guangy10 in #2222
- Fix CI and update Transformers (4.51.1) by @IlyasMoutawwakil in #2225
- Remove FP16_Optimizer patch for DeepSpeed by @Rohan138 in #2213
- Fix diffusers by @IlyasMoutawwakil in #2229
- Remove diffusers extra by @echarlaix in #2207
- TRT engine docs by @IlyasMoutawwakil in #1396
- Always use a default user agent by @IlyasMoutawwakil in #2230
- dedup _get_model_external_data_paths by @xieofxie in #2217
- Clean up workflows by @IlyasMoutawwakil in #2231
- reduce area of patch_everywhere for avoid unexpected replacements by @eaidova in #2237
- add dinov2 onnx optimizer support by @EFord36 in #2227
- Fix code quality test by @echarlaix in #2239
- Add onnx export for efficientnet by @Thas-Tayapongsak in #2214
- add loading image processor by @eaidova in #2254
- Fix `CLIPSdpaAttention` had dropped since v4.48 by @hans00 in #2245
- Increase clip opset by @echarlaix in #2256
- Add feature extraction support for image models by @Abdennacer-Badaoui in #2255
- adding token classification task for qwen2 by @Abdennacer-Badaoui in #2261
- upgrade min transformers version for phi3 by @echarlaix in #2263
v1.24.0: SD3 & Flux, DinoV2, Modernbert, GPTQModel, Transformers v4.48...
Release Notes: Optimum v1.24.0
We're excited to announce the release of Optimum v1.24.0. This update expands ONNX-based model capabilities and includes several improvements, bug fixes, and new contributions from the community.
🚀 New Features & Enhancements
- `ORTQuantizer` now supports models with ONNX subfolders.
- ONNX Runtime IO Binding support for all supported Transformers models (no models left behind).
- SD3 and Flux model support added to `ORTDiffusionPipeline`, enabling the latest diffusion-based models.
- Transformers v4.47 and v4.48 compatibility, ensuring seamless integration with the latest advancements in Hugging Face's ecosystem.
- ONNX export support extended to various models, including Decision Transformer, ModernBERT, Megatron-BERT, Dinov2, OLMo, and many more (see details).
🔧 Key Fixes & Optimizations
- Dropped support for Python 3.8
- Bug fixes in `ModelPatcher`, SDXL refiner export, and device checks for improved reliability.
🔥 New Contributors
A huge thank you to our first-time contributors:
Your contributions make Optimum better! 🚀
For a detailed list of all changes, please check out the full changelog.
🚀 Happy optimizing!
What's Changed
- Onnx granite by @gabe-l-hart in #2043
- Drop python 3.8 by @echarlaix in #2086
- Update Dockerfile base image by @echarlaix in #2089
- add transformers 4.36 tests by @echarlaix in #2085
- [fix] Allow ORTQuantizer over models with subfolder ONNX files by @tomaarsen in #2094
- SD3 and Flux support by @IlyasMoutawwakil in #2073
- Remove datasets as required dependency by @echarlaix in #2087
- Add ONNX Support for Decision Transformer Model by @ra9hur in #2038
- Generate guidance for flux by @IlyasMoutawwakil in #2104
- Unbundle inputs generated by `DummyTimestepInputGenerator` by @JingyaHuang in #2107
- Pass the revision to SentenceTransformer models by @bndos in #2105
- Rembert onnx support by @mlynatom in #2108
- fix bug `ModelPatcher` returns empty outputs by @LoSealL in #2109
- Fix workflow to mark issues as stale by @echarlaix in #2110
- Remove doc-build by @echarlaix in #2111
- Downgrade stale bot to v8 and fix permissions by @echarlaix in #2112
- Update documentation color from google tpu section by @echarlaix in #2113
- Fix workflow to mark PRs as stale by @echarlaix in #2116
- Enable transformers v4.47 support by @echarlaix in #2119
- Add ONNX export support for MGP-STR by @xenova in #2099
- Add ONNX export support for OLMo and OLMo2 by @xenova in #2121
- Pass on `model_kwargs` when exporting a SentenceTransformers model by @sjrl in #2126
- Add ONNX export support for DinoV2, Hiera, Maskformer, PVT, SigLIP, SwinV2, VitMAE, and VitMSN models by @xenova in #2001
- move check_dummy_inputs_allowed to common export utils by @eaidova in #2114
- Remove CI macos runners by @echarlaix in #2129
- Enable GPTQModel by @jiqing-feng in #2064
- Skip private model loading for external contributors by @echarlaix in #2130
- fix sdxl refiner export by @eaidova in #2133
- Export to ExecuTorch: Initial Integration by @guangy10 in #2090
- Fix AutoModel can't load gptq model due to module prefix mismatch vs AutoModelForCausalLM by @LRL-ModelCloud in #2146
- Update docker files by @echarlaix in #2102
- Limit diffusers version by @IlyasMoutawwakil in #2150
- Add ONNX export support for ModernBERT by @xenova in #2131
- Allow GPTQModel to auto select Marlin or faster kernels for inference only ops by @LRL-ModelCloud in #2138
- fix device check by @jiqing-feng in #2136
- Replace check_if_xxx_greater with is_xxx_version by @echarlaix in #2152
- Add tf available and version by @echarlaix in #2154
- Add ONNX export support for `PatchTST` by @xenova in #2101
- fix infer task from model_name if model from sentence transformer by @eaidova in #2151
- Unpin diffusers and pass onnx exporters tests by @IlyasMoutawwakil in #2153
- Uncomment modernbert config by @IlyasMoutawwakil in #2155
- Skip optimum-benchmark when loading namespace modules by @IlyasMoutawwakil in #2159
- Fix PR doc upload by @regisss in #2161
- Move executorch to optimum-executorch by @echarlaix in #2165
- Adding Onnx Support For Megatron-Bert by @pragyandev in #2169
- Transformers 4.48 by @IlyasMoutawwakil in #2158
- Update ort CIs (slow, gpu, train) by @IlyasMoutawwakil in #2024
v1.23.3: Patch release
- Add sentence-transformers and timm documentation example by @echarlaix in #2072
- Create token type ids when not provided by @echarlaix in #2081
- Add transformers v4.46 support by @echarlaix in #2078
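The token-type-ids fallback mentioned above can be sketched as follows (hypothetical helper name; the real change lives inside the ORT model classes):

```python
# Hypothetical sketch of the fallback: when a tokenizer does not return
# token_type_ids but the ONNX model expects them, a zero-filled input with
# the same shape as input_ids keeps the session's input signature satisfied.
def ensure_token_type_ids(inputs):
    if "token_type_ids" not in inputs:
        inputs["token_type_ids"] = [[0] * len(row) for row in inputs["input_ids"]]
    return inputs
```

Zeros match the default segment id, so single-segment inputs behave as if the tokenizer had produced the field itself.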