[Survey] Towards Efficient Large Language Model Serving: A Survey on System-Aware KV Cache Optimization
Updated Dec 15, 2025 - Python
ClearML - Model-Serving Orchestration and Repository Solution
SecretFlow-Serving is a serving system for privacy-preserving machine learning models.
MoDM is a cache-aware, hybrid serving system that accelerates image generation by dynamically combining small and large diffusion models for efficient, high-quality output.
An async ML service built with FastAPI, Celery, RabbitMQ, and Redis for efficient, scalable ML model serving
Implementation of ML model serving with Flask; the model is an LGBM trained on the Kaggle Titanic data.
Machine Learning (MLeap) Model Serving application for Scala
Dhruva is a full-fledged DPG platform for serving AI models at scale.
Predicting telemarketing results
A simple web application developed with Streamlit for serving a machine learning model