Accelerate AI Inference
Q1 2024 | Updates are here.
Post your questions here.
Overview Cheat Sheet
Read the documentation here.
Bring AI everywhere with OpenVINO™: enabling developers to quickly optimize, deploy, and scale AI
applications across hardware device types with cutting-edge compression features and advanced performance
capabilities.
What is OpenVINO™?
OpenVINO is an open-source toolkit for optimizing and deploying deep learning models. Deploy AI across
devices (from PC to cloud) with automatic acceleration!
Documentation Get started Blog Examples
Use OpenVINO with…
PyTorch TensorFlow Hugging Face ONNX and more
Build, Optimize, Deploy
OpenVINO accelerates inference and simplifies deployment across hardware, with a “build once, deploy
everywhere” philosophy. To accomplish this, OpenVINO supports and integrates with frameworks (like PyTorch)
and offers advanced compression capabilities.
Build your model in the training
framework or grab a pre-trained
model from Hugging Face
Optimize your model for faster
responses & smaller memory
Deploy the same model across
hardware, leveraging automatic
performance enhancements
Leverage the hardware’s
AI acceleration by default
OpenVINO Installation
Linux install Windows install macOS install
PyPI example for Linux, macOS & Windows:
# set up a python venv first
python -m pip install openvino
The install table also has: APT, YUM, Conda, vcpkg, Homebrew, Docker, Conan, & npm
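To quickly verify the installation from Python, a minimal check:
import openvino as ov
core = ov.Core()               # creating a Core object confirms the runtime loads
print(core.available_devices)  # e.g. ['CPU']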
Interactive Notebook Examples
Test out 150+ interactive Jupyter notebooks with cutting-edge open-source models.
Includes model compression, pipeline details, interactive GUIs, and more.
Try out top models for a range of use cases, including:
LLMs YOLO-v9 Stable Diffusion CLIP Segment Anything Whisper
Setup: Windows Ubuntu macOS RedHat CentOS AzureML Docker SageMaker
Model Compression with NNCF
NNCF is OpenVINO’s deep learning model compression tool, offering cutting-edge AI compression
capabilities, including:
Quantization: reducing the bit-size of the weights while preserving accuracy
Weight Compression: easy post-training optimization for LLMs
Pruning for Sparsity: drops connections in the model that don’t add value
Model Distillation: a larger ‘teacher’ model trains a smaller ‘student’ model
Compression results in smaller and faster models that can be deployed across devices.
Easy install: pip install nncf
Documentation GitHub NNCF Notebooks NNCF + Hugging Face
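As a rough sketch of the post-training quantization flow (calibration_loader and transform_fn are placeholders you provide; any iterable of samples works):
import nncf
import openvino as ov

core = ov.Core()
ov_model = core.read_model("model.xml")   # FP32 OpenVINO model
# Wrap your calibration samples; transform_fn maps a sample to model inputs
calibration_dataset = nncf.Dataset(calibration_loader, transform_fn)
# 8-bit post-training quantization using a few hundred calibration samples
quantized_model = nncf.quantize(ov_model, calibration_dataset)
ov.save_model(quantized_model, "model_int8.xml")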
PyTorch + OpenVINO Options
PyTorch models can be directly converted within OpenVINO™:
import openvino as ov
import torch
model = torch.load("model.pt")  # load a model saved with torch.save
model.eval()
ov_model = ov.convert_model(model)  # convert the PyTorch model to OpenVINO
core = ov.Core()
compiled_model = core.compile_model(ov_model)  # compile model from memory
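Once compiled, inference is a direct call on the compiled model. A minimal sketch, where example_input is a placeholder that must match your model's expected input shape:
example_input = torch.rand(1, 3, 224, 224)  # placeholder; use your model's real input shape
results = compiled_model(example_input)     # runs a synchronous inference request
output = results[0]                         # first model output as a numpy array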
Or, you can use the OpenVINO backend for torch.compile:
import openvino.torch
import torch
# Compile the PyTorch model with the OpenVINO backend
opts = {"device": "CPU", "config": {"PERFORMANCE_HINT": "LATENCY"}}
compiled_model = torch.compile(model, backend="openvino", options=opts)
Direct conversion PyTorch Backend Examples Blog
Performance Features
OpenVINO applies automatic performance enhancements at runtime, customized to your hardware (while
preserving model accuracy), including:
Asynchronous execution, batch processing, tensor fusion, load balancing, dynamic inference parallelism,
automatic BF16 conversion, and more.
It also creates a smaller memory footprint for framework + model, improving edge deployments.
There are also optional security features, such as the ability to compute on an encrypted model.
Additional advanced performance features (a short usage sketch follows this list):
Automatic Device Selection (AUTO) selects the best available device
Multi-Device Execution (MULTI) parallelizes inference across devices
Heterogeneous Execution (HETERO) efficiently splits inference between cores
Automatic Batching ad-hoc groups inference requests for max memory/core utilization
Performance Hints auto-adjusts runtime parameters to prioritize latency or throughput
Dynamic Shapes reshapes models to accept arbitrarily-sized inputs, for data flexibility
Benchmark Tool characterizes model performance in various hardware and pipelines
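Many of these features are exposed as simple compile-time options. A minimal sketch (assuming ov_model is an already loaded or converted model) combining AUTO device selection with a performance hint:
import openvino as ov

core = ov.Core()
# Let AUTO pick the best available device and tune the runtime for throughput
compiled = core.compile_model(ov_model, "AUTO", {"PERFORMANCE_HINT": "THROUGHPUT"})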
Supported Hardware
OpenVINO supports CPU, GPU, and NPU. (Specifications)
The plugin architecture of OpenVINO enables developing and plugging in independent inference solutions
dedicated to different devices. Learn more about the Plugin, OpenVINO Plugin Library, and how to build one
with CMake.
Additional community-supported plugins for Nvidia, Java and Rust can be found here.
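To check how these map to an actual machine, a small sketch that enumerates the detected devices and their full names:
import openvino as ov

core = ov.Core()
for device in core.available_devices:                             # e.g. 'CPU', 'GPU', 'NPU'
    print(device, core.get_property(device, "FULL_DEVICE_NAME"))  # human-readable device name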
OpenVINO can Accelerate as a Backend
If you want to stay in another framework API, OpenVINO provides accelerating backends:
PyTorch:
import openvino.torch
# compile the PyTorch model as usual, with the OpenVINO backend
compiled_model = torch.compile(model, backend="openvino",
                               options={"device": "CPU"})

ONNX Runtime:
import onnx
import onnxruntime as ort
onnx_model = onnx.load("model.onnx")  # load/inspect the ONNX model if needed
# create a session that runs inference through the OpenVINO Execution Provider
sess = ort.InferenceSession("model.onnx",
                            providers=["OpenVINOExecutionProvider"])

Hugging Face:
from optimum.intel import OVModelForCausalLM
# define model_id, use a transformers tokenizer & pipeline
model = OVModelForCausalLM.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

Nvidia Triton:
$ docker run --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 \
    -v /path/to/model_repository:/models \
    nvcr.io/nvidia/tritonserver:<xx.yy>-py3 \
    tritonserver --model-repository=/models
Config file: name: "model_a"
             backend: "openvino"

LangChain:
ov_llm = HuggingFacePipeline.from_model_id(…, backend="openvino",
    model_kwargs={"device": "CPU", "ov_config": ov_config})
ov_chain = prompt | ov_llm
print(ov_chain.invoke({"question": "what is neurobiology?"}))
Hugging Face Integration
Hugging Face + Intel Optimum offers OpenVINO integration with Hugging Face models and pipelines. You can
grab pre-optimized models and use OpenVINO compression features & Runtime capabilities within the
Hugging Face API.
Here is an example with an LLM (from this notebook) on how to swap default Hugging Face code for optimized
OpenVINO-Hugging Face code:
-from transformers import AutoModelForCausalLM
+from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline
model_id = "togethercomputer/RedPajama-INCITE-Chat-3B-v1"
-model = AutoModelForCausalLM.from_pretrained(model_id)
+model = OVModelForCausalLM.from_pretrained(model_id, export=True)
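The rest of the code stays standard Hugging Face. A minimal continuation sketch (the prompt and generation length are illustrative only):
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("What is OpenVINO?", max_new_tokens=50)[0]["generated_text"])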
Inference Documentation Compression Documentation Reference Documentation Examples
OpenVINO™ Model Server (OVMS)
OVMS hosts models and makes them accessible to
software components over standard network protocols: a
client sends a request to the model server, which performs
model inference and sends a response back to the client.
OVMS is a high-performance system for serving models.
Implemented in C++ for scalability and optimized for
deployment on Intel architectures, the model server uses the
KServe standard API while applying OpenVINO for inference
execution. The inference service is provided via gRPC or REST
API, making it easy to deploy new models and experiments.
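As a rough client-side sketch, assuming a server is already running with gRPC on port 9000 and serving a model named "model_a" (the model name, input name, and dummy tensor below are placeholders), using the ovmsclient package (pip install ovmsclient):
import numpy as np
from ovmsclient import make_grpc_client

client = make_grpc_client("localhost:9000")                       # gRPC endpoint of the model server
inputs = {"input": np.zeros((1, 3, 224, 224), dtype=np.float32)}  # placeholder input tensor
results = client.predict(inputs=inputs, model_name="model_a")     # returns the model outputs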
Documentation QuickStart Guide Features Demos
Join the OpenVINO Community
We welcome code contributions and feedback! Submit on GitHub and engage on GitHub discussions or our
forum. Share your examples (via PR) to be featured here.
Notices & Disclaimers: Intel does not control or audit third-party data. You should consult other sources to evaluate accuracy. © Intel
Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may
be claimed as the property of others. Legal Notices and Disclaimers