import-ai/asr-server
ASR Server

High-performance speech recognition service powered by FunASR with multi-worker parallel processing.

Features

  • Multi-language ASR (Chinese, English, Cantonese, Japanese, Korean)
  • Speaker diarization and timestamp annotation
  • Task queue with parallel worker processing
  • Flexible GPU/CPU deployment
  • RESTful API with FastAPI

Quick Start

Docker (Recommended)

docker compose up -d

Local Development

# Install dependencies
pip install -r requirements.txt

# Run server
python main.py

The server listens on http://localhost:8000

Configuration

Variable         Description                  Default
NUM_WORKERS      Number of parallel workers   1
DEVICE_TEMPLATE  Device allocation pattern    cuda:0
MODEL_DIR        ASR model directory          /model/SenseVoiceSmall
MAX_QUEUE_SIZE   Task queue capacity          1000
TASK_TIMEOUT     Task timeout (seconds)       300
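These settings are environment variables; a minimal sketch of how they might be read (names and defaults come from the table above, the parsing logic itself is an assumption, not the server's actual code):

```python
import os

# Defaults mirror the Configuration table; int() parsing is assumed.
NUM_WORKERS = int(os.environ.get("NUM_WORKERS", "1"))
DEVICE_TEMPLATE = os.environ.get("DEVICE_TEMPLATE", "cuda:0")
MODEL_DIR = os.environ.get("MODEL_DIR", "/model/SenseVoiceSmall")
MAX_QUEUE_SIZE = int(os.environ.get("MAX_QUEUE_SIZE", "1000"))
TASK_TIMEOUT = int(os.environ.get("TASK_TIMEOUT", "300"))
```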

Device Examples

# Single GPU
DEVICE_TEMPLATE="cuda:0"

# Multi-GPU (auto-assign: Worker 0 → GPU 0, Worker 1 → GPU 1, ...)
DEVICE_TEMPLATE="cuda:{worker_id}"

# CPU only
DEVICE_TEMPLATE="cpu"
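The `{worker_id}` placeholder suggests per-worker expansion via Python's `str.format`; a minimal sketch (the function name is illustrative, not the server's actual code):

```python
def resolve_device(template: str, worker_id: int) -> str:
    """Expand a DEVICE_TEMPLATE such as 'cuda:{worker_id}' for one worker."""
    return template.format(worker_id=worker_id)

# "cuda:0" and "cpu" contain no placeholder, so every worker gets the same
# device; "cuda:{worker_id}" assigns worker i to GPU i.
```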

API

Transcribe Audio

POST /api/v1/transcribe

Example:

curl -X POST "http://localhost:8000/api/v1/transcribe" \
  -F "file=@audio.wav" \
  -F "language=auto"

Response:

{
  "text": "Complete transcription",
  "sentence_info": [
    {
      "start_time": "00:00:00",
      "end_time": "00:00:03",
      "sentence": "First sentence",
      "speaker": 0
    }
  ]
}
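The same call can be made from Python; a sketch using the third-party `requests` library (the helper names are ours — only the endpoint and form fields come from the API above):

```python
import requests

ASR_URL = "http://localhost:8000/api/v1/transcribe"

def transcribe(audio_path: str, language: str = "auto") -> dict:
    """POST an audio file to /api/v1/transcribe and return the parsed JSON."""
    with open(audio_path, "rb") as f:
        resp = requests.post(
            ASR_URL,
            files={"file": f},
            data={"language": language},
        )
    resp.raise_for_status()
    return resp.json()

def format_sentence(entry: dict) -> str:
    """Render one sentence_info entry as '[start-end] speaker N: text'."""
    return (f"[{entry['start_time']}-{entry['end_time']}] "
            f"speaker {entry['speaker']}: {entry['sentence']}")
```

With a running server, `transcribe("audio.wav")` returns the JSON shown above, and each `sentence_info` entry can be printed with `format_sentence`.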

Queue Statistics

GET /api/v1/queue/stats

Returns queue size, worker status, and task counts.

Health Check

GET /api/v1/health

Testing

# Concurrent load test
python test_api.py audio.wav

Architecture

Request → TaskQueue → Worker 0 (GPU 0)
                    → Worker 1 (GPU 0)
                    → Worker N (GPU 0)

Each worker runs an independent model instance for parallel processing.
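The queue-and-workers pattern can be sketched with the standard library; everything here is an illustrative approximation rather than the server's actual implementation, with the model calls left as comments:

```python
import queue
import threading

MAX_QUEUE_SIZE = 1000  # default from the Configuration table
NUM_WORKERS = 2        # default is 1; 2 shown here to illustrate parallelism

task_queue: queue.Queue = queue.Queue(maxsize=MAX_QUEUE_SIZE)
processed = []         # stands in for per-task results

def worker(worker_id: int, device: str) -> None:
    """Each worker would load its own model instance on `device`, then drain the queue."""
    # model = load_model(MODEL_DIR, device=device)   # hypothetical loader
    while True:
        task = task_queue.get()
        if task is None:             # sentinel: shut this worker down
            task_queue.task_done()
            break
        # result = model.transcribe(task)            # hypothetical inference call
        processed.append(task)
        task_queue.task_done()

workers = [
    threading.Thread(target=worker, args=(i, f"cuda:{i}"), daemon=True)
    for i in range(NUM_WORKERS)
]
for t in workers:
    t.start()
```

Because each worker holds its own model instance, tasks on the queue are consumed concurrently up to `NUM_WORKERS` at a time.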

About

asr-server for omnibox