A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models such as YOLO, FastVLM, and more.

jamjamjon/usls

usls


📘 API Documentation | 🌟 Examples | 📦 Model Zoo


usls is a cross-platform Rust library powered by ONNX Runtime for efficient inference of SOTA vision and vision-language models (typically under 1B parameters).

🌟 Highlights

  • ⚡ High Performance: Multi-threading, SIMD, and CUDA-accelerated processing
  • 🌐 Cross-Platform: Linux, macOS, Windows with ONNX Runtime execution providers (CUDA, TensorRT, CoreML, OpenVINO, DirectML, etc.)
  • 🏗️ Unified API: Single Model trait inference with run()/forward()/encode_images()/encode_texts() and unified Y output
  • 📥 Auto-Management: Automatic model download (HuggingFace/GitHub), caching and path resolution
  • 📦 Multiple Inputs: Image, directory, video, webcam, stream and combinations
  • 🎯 Precision Support: FP32, FP16, INT8, UINT8, Q4, Q4F16, BNB4, and more
  • 🛠️ Full-Stack Suite: DataLoader, Annotator, and Viewer for complete workflows
  • 🌱 Model Ecosystem: 50+ SOTA vision and VLM models

🚀 Quick Start

Run the YOLO-Series demo to explore models with different tasks, precision and execution providers:

  • Tasks: detect, segment, pose, classify, obb
  • Versions: YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv9, YOLOv10, YOLO11, YOLOv12, YOLOv13, YOLO26
  • Scales: n, s, m, l, x
  • Precision: fp32, fp16, q8, q4, q4f16, bnb4
  • Execution Providers: CPU, CUDA, TensorRT, TensorRT-RTX, CoreML, OpenVINO, and more

Examples

# CPU: Object detection with YOLO26n (FP16)
cargo run -r --example yolo -- --task detect --ver 26 --scale n --dtype fp16

# CUDA model + CPU processor: Instance segmentation with YOLO11m
cargo run -r -F cuda --example yolo -- --task segment --ver 11 --scale m --device cuda:0 --processor-device cpu

# CUDA model + CUDA processor: Pose estimation with YOLOv8s
cargo run -r -F cuda-full --example yolo -- --task pose --ver 8 --scale s --device cuda:0 --processor-device cuda:0

# TensorRT model + CPU processor
cargo run -r -F tensorrt --example yolo -- --device tensorrt:0 --processor-device cpu

# TensorRT model + CUDA processor (CUDA 12.4)
cargo run -r -F tensorrt-cuda-12040 --example yolo -- --device tensorrt:0 --processor-device cuda:0

# TensorRT-RTX model + CUDA processor
cargo run -r -F nvrtx-full --example yolo -- --device nvrtx:0 --processor-device cuda:0

# TensorRT-RTX model + CPU processor
cargo run -r -F nvrtx --example yolo -- --device nvrtx:0

# Apple Silicon CoreML
cargo run -r -F coreml --example yolo -- --device coreml

# Intel OpenVINO (CPU/GPU/VPU)
cargo run -r -F openvino -F ort-load-dynamic --example yolo -- --device openvino:CPU

# Show all available options
cargo run -r --example yolo -- --help

See YOLO Examples for more details and use cases.

See Device Combination Guide for feature and device configurations.

Performance

Environment: NVIDIA RTX 3060Ti (TensorRT-10.11.0.33, CUDA 12.8, TensorRT-RTX-1.3.0.35) / Intel i5-12400F

Setup: YOLO26n, COCO2017 validation set (5,000 images), Resolution: 640x640, Conf thresholds: [0.35, 0.3, ..]

Results are for rough reference only.

| EP | Image Processor | DType | Batch | Preprocess | Inference | Postprocess | Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| TensorRT | CUDA | FP16 | 1 | ~233µs | ~1.3ms | ~14µs | ~1.55ms |
| TensorRT-RTX | CUDA | FP32 | 1 | ~233µs | ~2.0ms | ~10µs | ~2.24ms |
| TensorRT-RTX | CUDA | FP16 | 1 | ❓ | ❓ | ❓ | ❓ |
| CUDA | CUDA | FP32 | 1 | ~233µs | ~5.0ms | ~17µs | ~5.25ms |
| CUDA | CUDA | FP16 | 1 | ~233µs | ~3.6ms | ~17µs | ~3.85ms |
| CUDA | CPU | FP32 | 1 | ~800µs | ~6.5ms | ~14µs | ~7.31ms |
| CUDA | CPU | FP16 | 1 | ~800µs | ~5.0ms | ~14µs | ~5.81ms |
| CPU | CPU | FP32 | 1 | ~970µs | ~20.5ms | ~14µs | ~21.48ms |
| CPU | CPU | FP16 | 1 | ~970µs | ~25.0ms | ~14µs | ~25.98ms |
| TensorRT | CUDA | FP16 | 8 | ~1.2ms | ~6.0ms | ~55µs | ~7.26ms |
| TensorRT | CPU | FP16 | 8 | ~18.0ms | ~25.5ms | ~55µs | ~43.56ms |
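
The Total column in the table is the sum of the three stage timings for one batch. The sketch below reproduces that arithmetic for the first row (TensorRT, CUDA processor, FP16, batch 1) and derives a rough images-per-second figure from it. The `StageLatency` type and its methods are hypothetical names introduced for illustration and are not part of the usls API. The throughput figure assumes batches are processed strictly one after another, which the benchmark does not state.

```rust
use std::time::Duration;

/// Per-batch stage latencies, as listed in one row of the table.
/// Hypothetical helper type, not part of usls.
struct StageLatency {
    preprocess: Duration,
    inference: Duration,
    postprocess: Duration,
    batch: u32,
}

impl StageLatency {
    /// End-to-end latency for one batch: the sum of the three stages.
    fn total(&self) -> Duration {
        self.preprocess + self.inference + self.postprocess
    }

    /// Rough throughput implied by the total latency,
    /// assuming batches run back to back with no overlap.
    fn images_per_second(&self) -> f64 {
        f64::from(self.batch) / self.total().as_secs_f64()
    }
}

fn main() {
    // Values copied from the TensorRT / CUDA / FP16 / batch 1 row.
    let row = StageLatency {
        preprocess: Duration::from_micros(233),
        inference: Duration::from_micros(1_300),
        postprocess: Duration::from_micros(14),
        batch: 1,
    };
    println!(
        "total: {:?}, roughly {:.0} images/s",
        row.total(),
        row.images_per_second()
    );
}
```

For this row the stages add up to 1.547 ms, which matches the ~1.55ms reported in the table.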

📦 Model Zoo

Status: ✅ Supported | ❓ Unknown | ❌ Not Supported For Now

🔍 All ONNX models are available from the ONNX Models Repository

🔥 YOLO-Series

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| YOLOv5 | Image Classification, Object Detection, Instance Segmentation | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| YOLOv6 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| YOLOv7 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| YOLOv8 | Object Detection, Instance Segmentation, Image Classification, Oriented Object Detection, Keypoint Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| YOLO11 | Object Detection, Instance Segmentation, Image Classification, Oriented Object Detection, Keypoint Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| YOLOv9 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| YOLOv10 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |
| YOLOv12 | Image Classification, Object Detection, Instance Segmentation | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| YOLOv13 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| YOLO26 | Object Detection, Instance Segmentation, Image Classification, Oriented Object Detection, Keypoint Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

🏷️ Image Classification & Tagging
Model Task / Description Demo Dynamic Batch TensorRT FP32 FP16 Q8 Q4f16 BNB4
BEiT Image Classification demo βœ… βœ… βœ… βœ… ❌ ❌ ❌
ConvNeXt Image Classification demo βœ… βœ… βœ… βœ… ❌ ❌ ❌
FastViT Image Classification demo βœ… βœ… βœ… βœ… ❌ ❌ ❌
MobileOne Image Classification demo βœ… βœ… βœ… βœ… ❌ ❌ ❌
DeiT Image Classification demo βœ… βœ… βœ… βœ… ❌ ❌ ❌
RAM Image Tagging demo βœ… ❓ βœ… βœ… βœ… βœ… βœ…
RAM++ Image Tagging demo βœ… ❓ βœ… βœ… βœ… βœ… βœ…
🎯 Object Detection

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RT-DETRv1 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RT-DETRv2 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RT-DETRv4 | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RF-DETR | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| PP-PicoDet | Object Detection | demo | ❌ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |
| D-FINE | Object Detection | demo | ✅ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |
| DEIM | Object Detection | demo | ✅ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |
| DEIMv2 | Object Detection | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |

🎨 Image Segmentation

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SAM | Segment Anything | demo | ✅ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |
| SAM-HQ | Segment Anything | demo | ✅ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |
| MobileSAM | Segment Anything | demo | ✅ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |
| EdgeSAM | Segment Anything | demo | ✅ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |
| YOLOE-v8/11-Prompt-Free | Open-Set Detection And Segmentation | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| YOLOE-26-Prompt-Free | Open-Set Detection And Segmentation | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| FastSAM | Instance Segmentation | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| SAM2 | Segment Anything | demo | ✅ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |
| SAM3-Tracker | Segment Anything | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - COD | Camouflaged Object Detection | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - DIS | Dichotomous Image Segmentation | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - HRSOD | High-Resolution Salient Object Detection | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - Massive | Multi-Dataset Robust Segmentation | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |

✨ Background Removal

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RMBG | Image Segmentation, Background Removal | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BEN2 | Image Segmentation, Background Removal | demo | ✅ | ❓ | ✅ | ✅ | ❌ | ❌ | ❌ |

✂️ Image Matting & Portrait Segmentation

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MODNet | Image Matting | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ❌ | ❌ |
| MediaPipe Selfie | Image Segmentation | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ❌ | ❌ |
| BiRefNet - Portrait | Portrait Background Removal | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - Matting | Portrait Matting & Background Removal | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - HR Matting | High-Resolution Portrait Matting | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - General | General Purpose Segmentation | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - HR General | High-Resolution General Segmentation | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - Lite General | Lightweight General Segmentation (2K) | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| BiRefNet - General Tiny | Lightweight General Segmentation with Swin-V1-Tiny | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |

🗺️ Open-Set Detection & Segmentation

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GroundingDINO | Open-Set Detection With Language | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| MM-GDINO | Open-Set Detection With Language | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| LLMDet | Open-Set Detection With Language | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| OWLv2 | Open-Set Object Detection | demo | ✅ | ❓ | ✅ | ✅ | ❌ | ❌ | ❌ |
| YOLO-World | Open-Set Detection With Language | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| YOLOE-Prompt-Based | Open-Set Detection And Segmentation | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| YOLOE-26-Prompt-Based | Open-Set Detection And Segmentation | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| SAM3-Image | Open-Set Detection And Segmentation | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |

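🏃 Multi-Object Tracking

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ByteTrack | Multi-Object Tracking | demo | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
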
💎 Image Super-Resolution

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Swin2SR | Image Restoration | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| APISR | Anime Super-Resolution | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |

🤸 Pose Estimation

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTMPose | Keypoint Detection | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DWPose | Keypoint Detection | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RTMW | Keypoint Detection | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| RTMO | Keypoint Detection | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ❌ |

🔍 OCR & Document Understanding

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DB | Text Detection | demo | ✅ | ❓ | ✅ | ✅ | ❌ | ❌ | ❌ |
| FAST | Text Detection | demo | ✅ | ❓ | ✅ | ✅ | ❌ | ❌ | ❌ |
| LinkNet | Text Detection | demo | ✅ | ❓ | ✅ | ✅ | ❌ | ❌ | ❌ |
| SVTR | Text Recognition | demo | ✅ | ❓ | ✅ | ✅ | ❌ | ❌ | ❌ |
| TrOCR | Text Recognition | demo | ✅ | ❓ | ✅ | ✅ | ❌ | ❌ | ❌ |
| SLANet | Table Recognition | demo | ✅ | ❓ | ✅ | ✅ | ❌ | ❌ | ❌ |
| DocLayout-YOLO | Object Detection | demo | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | ❌ |

🧩 Vision-Language Models (VLM)

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BLIP | Image Captioning | demo | ✅ | ❓ | ✅ | ❓ | ❌ | ❌ | ❌ |
| Florence2 | A Variety of Vision Tasks | demo | ✅ | ❓ | ✅ | ✅ | ❌ | ❌ | ❌ |
| Moondream2 | Open-Set Object Detection, Open-Set Keypoints Detection, Image Captioning, Visual Question Answering | demo | ✅ | ❓ | ❌ | ❌ | ✅ | ✅ | ❌ |
| SmolVLM | Visual Question Answering | demo | ✅ | ❓ | ✅ | ❓ | ❓ | ❓ | ❓ |
| SmolVLM2 | Visual Question Answering | demo | ✅ | ❓ | ✅ | ❓ | ❓ | ❓ | ❓ |
| FastVLM | Vision Language Models | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |

🧬 Embedding Model

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CLIP | Vision-Language Embedding | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| jina-clip-v1 | Vision-Language Embedding | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| jina-clip-v2 | Vision-Language Embedding | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| mobileclip | Vision-Language Embedding | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DINOv2 | Vision Embedding | demo | ✅ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |
| DINOv3 | Vision Embedding | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |

Depth Estimation

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DepthAnything v1 | Monocular Depth Estimation | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DepthAnything v2 | Monocular Depth Estimation | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| DepthPro | Monocular Depth Estimation | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| Depth-Anything-3 | Monocular, Metric, Multi-View | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |

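🌌 Others

| Model | Task / Description | Demo | Dynamic Batch | TensorRT | FP32 | FP16 | Q8 | Q4f16 | BNB4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sapiens | Foundation for Human Vision Models | demo | ✅ | ❓ | ✅ | ✅ | ✅ | ✅ | ✅ |
| YOLOPv2 | Panoptic Driving | demo | ✅ | ❓ | ✅ | ❌ | ❌ | ❌ | ❌ |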

Documentation

🔧 Cargo Features

❕ Features in italics are enabled by default.

  • Core & Utilities

    • ort-download-binaries: Automatically download prebuilt ONNX Runtime binaries from pyke.
    • ort-load-dynamic: Manually link ONNX Runtime. Useful for custom builds or unsupported platforms. See Linking Guide for more details.
    • viewer: Real-time image/video visualization (similar to OpenCV imshow). Empowered by minifb.
    • video: Video I/O support for reading and writing video streams. Empowered by video-rs.
    • hf-hub: Download model files from Hugging Face Hub.
    • annotator: Annotation utilities for drawing bounding boxes, keypoints, and masks on images.
  • Image Formats

    Additional image format support (optional for faster compilation):

    • image-all-formats: Enable all additional image formats.
    • image-gif, image-bmp, image-ico, image-avif, image-tiff, image-dds, image-exr, image-ff, image-hdr, image-pnm, image-qoi, image-tga: Individual image format support.
  • Model Categories

    • vision: Core vision models (Detection, Segmentation, Classification, Pose, etc.).
    • vlm: Vision-Language Models (CLIP, BLIP, Florence2, etc.).
    • mot: Multi-Object Tracking utilities.
    • all-models: Enable all model categories.
  • Execution Providers

    Hardware acceleration for inference. Enable the one matching your hardware:

    • cuda: NVIDIA CUDA execution provider (pure model inference acceleration).
    • tensorrt: NVIDIA TensorRT execution provider (pure model inference acceleration).
    • nvrtx: NVIDIA NvTensorRT-RTX execution provider (pure model inference acceleration).
    • cuda-full: cuda + cuda-runtime-build (Model + Image Preprocessing acceleration).
    • tensorrt-full: tensorrt + cuda-runtime-build (Model + Image Preprocessing acceleration).
    • nvrtx-full: nvrtx + cuda-runtime-build (Model + Image Preprocessing acceleration).
    • coreml: Apple Silicon (macOS/iOS).
    • openvino: Intel CPU/GPU/VPU.
    • onednn: Intel Deep Neural Network Library.
    • directml: DirectML (Windows).
    • webgpu: WebGPU (Web/Chrome).
    • rocm: AMD GPU acceleration.
    • cann: Huawei Ascend NPU.
    • rknpu: Rockchip NPU.
    • xnnpack: Mobile CPU optimization.
    • acl: Arm Compute Library.
    • armnn: Arm Neural Network SDK.
    • azure: Azure ML execution provider.
    • migraphx: AMD MIGraphX.
    • nnapi: Android Neural Networks API.
    • qnn: Qualcomm AI Engine Direct (QNN).
    • tvm: Apache TVM.
    • vitis: Xilinx Vitis AI.
  • CUDA Support

    NVIDIA GPU acceleration with CUDA image processing kernels (requires cudarc):

    • cuda-full: Uses cuda-version-from-build-system (auto-detects via nvcc).
    • cuda-11040, cuda-11050, cuda-11060, cuda-11070, cuda-11080: CUDA 11.x versions (Model + Preprocess).
    • cuda-12000, cuda-12010, cuda-12020, cuda-12030, cuda-12040, cuda-12050, cuda-12060, cuda-12080, cuda-12090: CUDA 12.x versions (Model + Preprocess).
    • cuda-13000, cuda-13010: CUDA 13.x versions (Model + Preprocess).
  • TensorRT Support

    NVIDIA TensorRT execution provider with CUDA runtime libraries:

    • tensorrt-full: Uses cuda-version-from-build-system (auto-detects via nvcc).
    • tensorrt-cuda-11040, tensorrt-cuda-11050, tensorrt-cuda-11060, tensorrt-cuda-11070, tensorrt-cuda-11080: TensorRT + CUDA 11.x runtime.
    • tensorrt-cuda-12000, tensorrt-cuda-12010, tensorrt-cuda-12020, tensorrt-cuda-12030, tensorrt-cuda-12040, tensorrt-cuda-12050, tensorrt-cuda-12060, tensorrt-cuda-12080, tensorrt-cuda-12090: TensorRT + CUDA 12.x runtime.
    • tensorrt-cuda-13000, tensorrt-cuda-13010: TensorRT + CUDA 13.x runtime.

    Note: tensorrt-cuda-* features enable TensorRT execution provider with CUDA runtime libraries for image processing. The "cuda" in the name refers to cudarc dependency.

  • NVRTX Support

    NVIDIA NvTensorRT-RTX execution provider with CUDA runtime libraries:

    • nvrtx-full: Uses cuda-version-from-build-system (auto-detects via nvcc).
    • nvrtx-cuda-11040, nvrtx-cuda-11050, nvrtx-cuda-11060, nvrtx-cuda-11070, nvrtx-cuda-11080: NVRTX + CUDA 11.x runtime.
    • nvrtx-cuda-12000, nvrtx-cuda-12010, nvrtx-cuda-12020, nvrtx-cuda-12030, nvrtx-cuda-12040, nvrtx-cuda-12050, nvrtx-cuda-12060, nvrtx-cuda-12080, nvrtx-cuda-12090: NVRTX + CUDA 12.x runtime.
    • nvrtx-cuda-13000, nvrtx-cuda-13010: NVRTX + CUDA 13.x runtime.

    Note: nvrtx-cuda-* features enable NVRTX execution provider with CUDA runtime libraries for image processing. The "cuda" in the name refers to cudarc dependency.
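
The versioned feature names above end in a five-digit number. The Quick Start pairs tensorrt-cuda-12040 with CUDA 12.4, which suggests the number encodes major * 1000 + minor * 10. The sketch below decodes a feature name under that assumption. The function `cuda_version_from_feature` is a hypothetical helper for illustration and is not provided by usls or cudarc.

```rust
/// Decode a versioned feature name such as "cuda-12040" or
/// "tensorrt-cuda-12040" into (major, minor).
/// Assumption: the trailing number is major * 1000 + minor * 10.
/// Hypothetical helper, not part of usls.
fn cuda_version_from_feature(feature: &str) -> Option<(u32, u32)> {
    // Take the text after the last '-', e.g. "12040".
    let suffix = feature.rsplit('-').next()?;
    // Names like "cuda-full" carry no numeric version.
    if suffix.len() != 5 || !suffix.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let code: u32 = suffix.parse().ok()?;
    Some((code / 1000, (code % 1000) / 10))
}

fn main() {
    for feature in ["cuda-11080", "tensorrt-cuda-12040", "nvrtx-cuda-13010", "cuda-full"] {
        match cuda_version_from_feature(feature) {
            Some((major, minor)) => println!("{feature}: CUDA {major}.{minor}"),
            None => println!("{feature}: no explicit CUDA version in the name"),
        }
    }
}
```

A helper like this can be handy in a build script or setup check to confirm that the selected feature matches the CUDA toolkit installed on the machine.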


🚀 Device Combination Guide

| Scenario | Model Device (--device) | Processor Device (--processor-device) | Required Features (-F) |
| --- | --- | --- | --- |
| CPU Only | cpu | cpu | vision (default) |
| GPU Inference (Slow Preprocess) | cuda | cpu | cuda |
| GPU Inference (Fast Preprocess) | cuda | cuda | cuda-full or a versioned cuda-* feature (e.g. cuda-12040) |
| TensorRT (Slow Preprocess) | tensorrt | cpu | tensorrt |
| TensorRT (Fast Preprocess) | tensorrt | cuda | tensorrt-full or a versioned tensorrt-cuda-* feature (e.g. tensorrt-cuda-12040) |

⚠️ In multi-GPU environments (e.g., cuda:0, cuda:1), you MUST ensure that both --device and --processor-device use the SAME GPU ID.


❓ FAQ

  • ONNX Runtime Issues: For ONNX Runtime related errors, please check the ort issues or onnxruntime issues.
  • Other Issues: For other questions or bug reports, see issues or open a new discussion.

⚠️ Compatibility Note

If you encounter linking errors with __isoc23_strtoll or similar glibc symbols, use the dynamic loading feature:

cargo run -F ort-load-dynamic --example

Why no LM models?

This project focuses on vision and VLM models under 1B parameters for efficient inference.

Many high-performance inference engines, such as vLLM, already exist for LM/LLM models.

Pure text embedding models may be considered in future releases.

How fast is it?

Refer to YOLO performance benchmarks in the Performance section above.

This project uses multi-threading, SIMD, and CUDA hardware acceleration for optimization.

While vision models like YOLO and RF-DETR are optimized, other models may need further interface and post-processing optimization.

🀝 Contributing

This is a personal project maintained in spare time, so progress on performance optimization and new model support may vary.

We highly welcome PRs for model optimization! If you have expertise in specific models and can help optimize their interfaces or post-processing, your contributions would be invaluable. Feel free to open an issue or submit a pull request for suggestions, bug reports, or new features.

🙏 Acknowledgments

Thanks to all the open-source libraries and their maintainers that make this project possible. See Cargo.toml for a complete list of dependencies.

📜 License

This project is licensed under the terms in the LICENSE file.