{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Serving ONNX model with MXNet Model Server\n", "\n", "This tutorial dmeonstrates how to serve an ONNX model with MXNet Model Server. We'll use a pre-trained SqueezeNet model from ONNX model zoo. The same example can be easily applied to other ONNX models.\n", "\n", "Frameworks used in this tutorial:\n", "* [MXNet Model Server](https://github.com/awslabs/mxnet-model-server) that uses [MXNet](http://mxnet.io)\n", "* [ONNX](https://onnx.ai)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Installing pre-requisites:\n", "Next we'll install the pre-requisites:\n", "* [ONNX](https://github.com/onnx/onnx)\n", "* [MXNetModelServer](https://github.com/awslabs/mxnet-model-server)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!conda install -y -c conda-forge onnx\n", "!pip install mxnet-model-server" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Downloading a pre-trained ONNX model\n", "\n", "Let's go ahead and download a aqueezenet onnx model into a new directory." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "!mkdir squeezenet\n", "%cd squeezenet\n", "!curl -O https://s3.amazonaws.com/model-server/models/onnx-squeezenet/squeezenet.onnx" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inspecting the ONNX model\n", "\n", "Let's make sure the exported ONNX model is well formed" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import onnx\n", "\n", "# Load the ONNX model\n", "model = onnx.load(\"squeezenet.onnx\")\n", "\n", "# Check that the IR is well formed, identified issues will error out\n", "onnx.checker.check_model(model)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "## Packaging the ONNX model for serving with MXNet Model Server (MMS)\n", "\n", "To serve the ONNX model with MMS, we will first need to prepare some files that will be bundled into a **Model Archive**. \n", "The Model Archive containes everything MMS needs to setup endpoints and serve the model. \n", "\n", "The files needed are:\n", "* squeezenet.onnx - the ONNX model file\n", "* signature.json - defining the input and output of the model\n", "* synset.txt - defining the set of classes the model was trained on, and returned by the model" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's go ahead and download the files we need:\n", "!curl -O https://s3.amazonaws.com/model-server/models/onnx-squeezenet/signature.json\n", "!curl -O https://s3.amazonaws.com/model-server/models/onnx-squeezenet/synset.txt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's take a peek into the **signature.json** file:\n", "!cat signature.json" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's take a peek into the synset.txt file:\n", "!head -n 10 synset.txt" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Let's package everything up into a Model Archive bundle\n", "!mxnet-model-export --model-name squeezenet --model-path .\n", "!ls -l squeezenet.model" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Serving the Model Archive with MXNet Model Server\n", "Now that we have the **Model Archive**, we can start the server and ask it to setup HTTP endpoints to serve the model, emit real-time operational metrics and more.\n", "\n", "We'll also test the server's endpoints." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Spawning a new process to run the server\n", "import subprocess as sp\n", "server = sp.Popen(\"mxnet-model-server --models squeezenet=squeezenet.model\", shell=True)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Check out the health endpoint\n", "!curl http://127.0.0.1:8080/ping" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Download an image Trying out image classification\n", "!curl -O https://s3.amazonaws.com/model-server/inputs/kitten.jpg\n", "!curl -X POST http://127.0.0.1:8080/squeezenet/predict -F \"input_0=@kitten.jpg\"" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Lastly, we'll terminate the server\n", "server.terminate()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }