
Commit 51b57f1

point to service-streamer tutorial
1 parent 5859c0e commit 51b57f1

1 file changed

Lines changed: 15 additions & 0 deletions

File tree

intermediate_source/flask_rest_api_tutorial.py

@@ -326,6 +326,21 @@ def get_prediction(image_bytes):
 # Next steps
 # --------------
 #
+# Before putting the server into production, we need to solve two issues:
+#
+# - Requests are served one at a time, which is much slower than local batch prediction
+# - Large numbers of concurrent requests can cause a CUDA out-of-memory error on the GPU
+#
+# We can queue user requests into batches and schedule the prediction process.
+# By following the `service-streamer tutorial <https://github.com/ShannonAI/service-streamer/wiki/Vision-Recognition-Service-with-Flask-and-service-streamer>`_,
+# you can solve these issues with a few lines of code.
+#
+# .. note ::
+#    `service-streamer <https://github.com/ShannonAI/service-streamer>`_ is a middleware for web services
+#    of machine learning applications. Queued requests from users are sampled into mini-batches, and
+#    service-streamer can significantly enhance the overall performance of the web server by improving GPU utilization.
+#
+#
 # The server we wrote is quite trivial and may not do everything
 # you need for your production application. So, here are some things you
 # can do to make it better:
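The note added by this commit describes the core idea of service-streamer: queue individual web requests and run the model on mini-batches. As a rough illustration of that batching pattern only, here is a stdlib-only sketch; `MiniBatcher`, `batch_predict`, and `max_latency` are hypothetical names, not service-streamer's actual API.

```python
import threading
import queue


class MiniBatcher:
    """Collects individual requests into mini-batches and runs a batch
    prediction function on each batch (a sketch of the idea behind
    service-streamer, not its real interface)."""

    def __init__(self, batch_predict, batch_size=8, max_latency=0.05):
        self.batch_predict = batch_predict  # fn: list of inputs -> list of outputs
        self.batch_size = batch_size
        self.max_latency = max_latency      # max seconds to wait while filling a batch
        self.requests = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def predict(self, item):
        """Called from each web request; blocks until the batched result is ready."""
        slot = {"input": item, "event": threading.Event()}
        self.requests.put(slot)
        slot["event"].wait()
        return slot["output"]

    def _loop(self):
        while True:
            batch = [self.requests.get()]             # block until the first request arrives
            try:
                while len(batch) < self.batch_size:   # fill the batch, bounded by max_latency
                    batch.append(self.requests.get(timeout=self.max_latency))
            except queue.Empty:
                pass                                  # latency budget spent; run a partial batch
            outputs = self.batch_predict([s["input"] for s in batch])
            for slot, out in zip(batch, outputs):
                slot["output"] = out
                slot["event"].set()                   # wake the waiting request handler
```

In a Flask view, each request would call `batcher.predict(...)` instead of invoking the model directly, so concurrent requests share one GPU forward pass per mini-batch instead of issuing one pass each.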
