Lainbow is a music analysis engine designed to process local music libraries. It offers batch processing capabilities to extract deep learning embeddings (MERT, CLAP, MuQ, MuQ-MuLan) and various acoustic features from audio files. These features are then stored in a vector database to power an API for tasks like similarity-based music recommendations and natural language search.
The project is primarily intended for integration with the MPD client, Sola MPD.
Lainbow is built on a microservices architecture, with each component containerized using Docker. The system consists of four main services that work together:
- Web API Server: The main entry point for user requests. It handles API calls, interacts with the databases, and delegates heavy tasks to the batch server.
- Inference Server: A dedicated service that hosts the deep learning models and performs inference tasks (e.g., generating embeddings from audio or text).
- Batch Server: A Celery-based worker that processes long-running, asynchronous tasks, such as scanning the music library and analyzing songs.
- Databases: A set of databases for storing metadata and vector embeddings and for managing task queues:
  - PostgreSQL: Stores song metadata, features, and task information.
  - Milvus: A vector database for storing and searching high-dimensional embeddings.
  - RabbitMQ: A message broker that facilitates communication between the Web API and the Batch Server.
```mermaid
graph TD
    subgraph User
        U[User]
    end

    subgraph "api.yaml"
        Web_API[Web API Server]
    end

    subgraph "batch.yaml"
        Batch_Server[Batch Server]
    end

    subgraph "inference.yaml"
        Inference_Server["Inference Server (GPU Required)"]
    end

    subgraph "database.yaml"
        PostgreSQL[PostgreSQL]
        Milvus["Vector DB (Milvus)"]
        RabbitMQ["Message Queue (RabbitMQ)"]
    end

    %% Connections
    U --> Web_API
    Web_API -- Enqueue Task --> RabbitMQ
    Web_API -- CRUD --> PostgreSQL
    Web_API -- Search --> Milvus
    RabbitMQ -- Consume Task --> Batch_Server
    Batch_Server -- HTTP Request --> Inference_Server
    Batch_Server -- CRUD --> PostgreSQL
    Batch_Server -- Insert --> Milvus

    style U fill:#f9f,stroke:#333,stroke-width:2px
    style Web_API fill:#bbf,stroke:#333,stroke-width:2px
    style Inference_Server fill:#ffc,stroke:#333,stroke-width:2px
    style Batch_Server fill:#cdf,stroke:#333,stroke-width:2px
```
Before running the application, you need to download the required deep learning models.
- Install Dependencies: First, ensure you have the necessary Python packages installed:

  ```bash
  pip install -r requirements.txt
  ```

- Download Models: Run the provided script to download and place the models in the `./models` directory. The script will skip any models that are already downloaded:

  ```bash
  python download_models.py
  ```

  This script will download:

  - The MERT model (`m-a-p/MERT-v1-330M`)
  - The CLAP model (`laion/clap-htsat-unfused`)
  - The MuQ model (`OpenMuQ/MuQ-large-msd-iter`)
  - The MuQ-MuLan model (`OpenMuQ/MuQ-MuLan-large`)

Once the script completes, the application will be ready to run.
- Create `.env` file: Copy the template to create your own environment file.

  ```bash
  cp .env.template .env
  ```

- Edit `.env` file: Open the `.env` file and customize the variables. You must at least set `MUSIC_NAS_ROOT_DIR` to the absolute path of your music library. The default settings should work out of the box if you are running all components on a single machine with no port conflicts.
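For reference, a minimal edit could look like the following; the path shown is only an example:

```bash
# .env — MUSIC_NAS_ROOT_DIR must point at your music library (example path below)
MUSIC_NAS_ROOT_DIR=/mnt/nas/music
```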
The `docker/inference.Dockerfile` uses a PyTorch base image chosen for the author's GPU (an RTX 5070 Ti). You may need to modify the `FROM` instruction in this Dockerfile to use a base image compatible with your own GPU hardware.
Additionally, ensure that the NVIDIA Container Toolkit is properly installed and configured. This is required for Docker to access and utilize the GPU.
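To verify the toolkit before starting the stack, you can run NVIDIA's standard sanity check; if Docker can reach the GPU, `nvidia-smi` prints the device table:

```bash
# Runs nvidia-smi inside a throwaway container; the toolkit injects the host driver
docker run --rm --gpus all ubuntu nvidia-smi
```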
Start the services on your server(s). You can run each service on a separate server or on the same machine. Ensure that the `.env` file on each server is configured correctly to allow communication between the services.
Database Server

```bash
docker compose -f docker-compose.database.yaml up -d
```

Web API Server

```bash
docker compose -f docker-compose.api.yaml up -d
```

Inference API Server

```bash
docker compose -f docker-compose.inference.yaml up -d
```

Batch Server

```bash
docker compose -f docker-compose.batch.yaml up -d --scale batch-cpu=N
```

(Note: N is the number of parallel processes. A value around 4-6 is recommended.)
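After bringing a stack up, you can confirm its containers are running with standard Compose commands, for example:

```bash
# Show container states for one stack, then show its recent logs
docker compose -f docker-compose.api.yaml ps
docker compose -f docker-compose.api.yaml logs --tail=50
```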
- Summary: Health check endpoint.
- Response (200 OK):

  ```json
  { "status": "ok" }
  ```
- Summary: Get database statistics.
- Description: Retrieve statistics about the current state of the database, including song counts and task statuses.
- Response (200 OK):

  ```json
  {
    "total_songs": 1000,
    "songs_with_acoustic_features": 50,
    "songs_with_clap": 100,
    "songs_with_mert": 100,
    "songs_with_muq": 200,
    "songs_with_muq_mulan": 200,
    "pending_tasks": 5,
    "running_tasks": 2
  }
  ```
- Summary: Start a music library scan.
- Description: Triggers a background task to scan the music library path specified in `.env`.
- Response (200 OK): Returns a `TaskResult` object.

  ```json
  { "id": "...", "name": "scan", "status": "PENDING", "result": null }
  ```
- Summary: Run library vacuum.
- Description: Triggers a background task to remove database entries for songs that no longer exist on disk.
- Response (200 OK): Returns a `TaskResult` object.
- Summary: Run song analysis.
- Description: Enqueues a task to analyze all songs and generate embeddings for the specified models.
- Request Body (optional):

  ```json
  { "models": ["muq", "muq_mulan"] }
  ```

  If the body is omitted, it defaults to `["muq", "muq_mulan"]`.
- Response (200 OK): Returns a `TaskResult` object.
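For illustration, enqueueing an analysis with an explicit model list could look like this; the base URL and route are placeholders, while the JSON body follows the schema above:

```bash
# Hypothetical URL and route; the body matches the documented request schema
curl -s -X POST http://localhost:8000/analyze \
  -H "Content-Type: application/json" \
  -d '{"models": ["muq", "muq_mulan"]}'
```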
- Summary: Get task status.
- Description: Retrieves the current status and result of a specific background task.
- Path Parameters:
  - `task_id` (UUID, required): The ID of the task.
- Response (200 OK): Returns a `TaskResult` object with the current status.
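Putting the task endpoints together, a typical flow is to enqueue a job, capture the returned task ID, and poll it until it completes. The routes below are placeholders, and the sketch assumes `jq` is installed:

```bash
# Hypothetical routes — start a scan, extract the task id, then poll its status
TASK_ID=$(curl -s -X POST http://localhost:8000/scan | jq -r '.id')
curl -s "http://localhost:8000/tasks/${TASK_ID}"
# → { "id": "...", "name": "scan", "status": "PENDING", "result": null }
```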
- Summary: Search songs by natural language.
- Description: Searches for songs based on a natural language query, e.g., "a song for a summer evening".
- Query Parameters:
  - `q` (string, required): Natural language query.
  - `model_name` (string, optional, default: `muq_mulan`): The text embedding model to use. Can be `clap` or `muq_mulan`.
  - `limit` (integer, optional, default: 10): Maximum number of results.
- Response (200 OK): Returns a list of `SearchSong` objects.
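A sketch of a search request; the route is a placeholder, but `q`, `model_name`, and `limit` are the parameters documented above. `--data-urlencode` takes care of encoding the query text:

```bash
# Hypothetical route; --get turns the encoded fields into query parameters
curl -s --get http://localhost:8000/search \
  --data-urlencode "q=a song for a summer evening" \
  --data-urlencode "model_name=muq_mulan" \
  --data-urlencode "limit=5"
```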
- Summary: Get detailed song analysis.
- Description: Retrieves detailed analysis for a given song, including metadata, features, and embedding status.
- Path Parameters:
  - `file_path` (string, required): The URL-encoded file path of the song.
- Response (200 OK): Returns a `SongAnalysis` object.
- Summary: Find similar songs.
- Description: Finds songs acoustically similar to the given song using a specified vector embedding model.
- Path Parameters:
  - `file_path` (string, required): The URL-encoded file path of the song.
- Query Parameters:
  - `model_name` (string, optional, default: `muq`): The embedding model to use. Can be `acoustic_features`, `clap`, `mert`, `muq`, or `muq_mulan`.
  - `limit` (integer, optional, default: 10): Maximum number of results.
- Response (200 OK): Returns a list of `SearchSong` objects.
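Because `file_path` travels in the URL path, it must be URL-encoded, including its slashes. A sketch with a placeholder route and an invented file path, again assuming `jq` is available:

```bash
# Hypothetical route and example path; jq's @uri filter percent-encodes everything
FILE_PATH=$(printf '%s' 'Artist/Album/01 Track.flac' | jq -sRr '@uri')
curl -s "http://localhost:8000/similar/${FILE_PATH}?model_name=muq&limit=10"
```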