Darwin-36B-Opus MLX Text-Only 8-bit

This repository contains a text-only 8-bit MLX conversion of FINAL-Bench/Darwin-36B-Opus.

The model was converted with mlx-lm and is intended for efficient inference on Apple Silicon.

Model Details

  • Original model: FINAL-Bench/Darwin-36B-Opus
  • Format: MLX
  • Quantization: 8-bit
  • Modality: Text-only
  • Runtime: Apple Silicon
  • Recommended server: cubist38/mlx-openai-server

Run with mlx-openai-server

This model is designed to be served through my open-source OpenAI-compatible MLX server:

cubist38/mlx-openai-server

Install the server:

pip install mlx-openai-server
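
If the weights are not on disk yet, one option is the Hugging Face CLI (the repo id below matches this repository; point --local-dir wherever you want the files):

huggingface-cli download GiaHuy/Darwin-36B-Opus-mlx-text-only-8bit --local-dir Darwin-36B-Opus-mlx-text-only-8bit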

Then launch the model:

mlx-openai-server launch \
  --model-path Darwin-36B-Opus-mlx-text-only-8bit \
  --reasoning-parser qwen3_moe \
  --tool-call-parser qwen3_coder \
  --debug \
  --served-model-name Darwin-36B-Opus

The server exposes an OpenAI-compatible API, making it easy to use with existing OpenAI SDKs, agents, and tools.
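
A quick sanity check after launching is to list the served models. The /v1/models endpoint is part of the standard OpenAI API surface, so on a compatible server it should return the name set with --served-model-name:

curl http://localhost:8000/v1/models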

Example: OpenAI-Compatible Python Client

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="not-needed",
)

response = client.chat.completions.create(
    model="Darwin-36B-Opus",
    messages=[
        {
            "role": "user",
            "content": "Explain evolutionary model merging in simple terms.",
        }
    ],
    temperature=0.7,
    max_tokens=512,
)

print(response.choices[0].message.content)
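
For interactive use you can also stream tokens as they arrive. This reuses the client from above and the standard OpenAI SDK streaming interface, which the server should support given its OpenAI compatibility:

stream = client.chat.completions.create(
    model="Darwin-36B-Opus",
    messages=[
        {
            "role": "user",
            "content": "Summarize natural selection in two sentences.",
        }
    ],
    stream=True,
)

for chunk in stream:
    # Some servers send a final chunk with an empty choices list.
    if not chunk.choices:
        continue
    # Each chunk carries an incremental delta; content can be None on role/stop chunks.
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
print()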

Example: curl

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer not-needed" \
  -d '{
    "model": "Darwin-36B-Opus",
    "messages": [
      {
        "role": "user",
        "content": "What makes Darwin-36B-Opus interesting?"
      }
    ],
    "temperature": 0.7,
    "max_tokens": 512
  }'
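
Example: Direct Use with mlx-lm

Because the conversion was produced with mlx-lm, you can also skip the server and load the weights directly in Python. A minimal sketch, assuming a recent mlx-lm release and the model directory in the current working directory:

from mlx_lm import load, generate

# Load the 8-bit weights and tokenizer from the local conversion.
model, tokenizer = load("Darwin-36B-Opus-mlx-text-only-8bit")

# Apply the chat template so the prompt matches the model's expected format.
messages = [
    {"role": "user", "content": "Explain evolutionary model merging in simple terms."}
]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

print(generate(model, tokenizer, prompt=prompt, max_tokens=512))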

Notes

  • This is a text-only MLX conversion.
  • This is an 8-bit quantized version, so outputs may differ from the original checkpoint.
  • The recommended way to serve this model is through mlx-openai-server.
  • The launch command uses:
    • --reasoning-parser qwen3_moe
    • --tool-call-parser qwen3_coder (see the tool-calling sketch below)
    • --served-model-name Darwin-36B-Opus
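
Example: Tool Calling

Since the server is launched with --tool-call-parser qwen3_coder, function calling through the standard OpenAI tools interface should work. A minimal sketch reusing the Python client from above (the tool name and schema are hypothetical, for illustration only):

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool, for illustration
            "description": "Get the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

response = client.chat.completions.create(
    model="Darwin-36B-Opus",
    messages=[{"role": "user", "content": "What's the weather in Hanoi?"}],
    tools=tools,
)

# If the model chose to call the tool, the parsed call appears here.
print(response.choices[0].message.tool_calls)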

Attribution

All credit for the original model goes to the authors of FINAL-Bench/Darwin-36B-Opus.

This repository provides only an MLX text-only 8-bit conversion for Apple Silicon users.

License

Please refer to the original model repository for licensing and usage terms.
