Official Kotlin Multiplatform library for Cactus, a framework for running LLMs and speech-to-text locally in your app. Requires iOS 12.0+ and Android API 24+.

Add the GitHub Packages repository to your `settings.gradle.kts`:
```kotlin
// settings.gradle.kts
import java.io.File
import java.util.Properties

// Load GitHub credentials from local.properties, if present
val properties = Properties().apply {
    val localProps = File(rootDir, "local.properties")
    if (localProps.exists()) localProps.inputStream().use { load(it) }
}

dependencyResolutionManagement {
    repositories {
        maven {
            name = "GitHubPackagesCactus"
            url = uri("https://maven.pkg.github.com/cactus-compute/cactus-kotlin")
            credentials {
                username = properties.getProperty("github.username") ?: System.getenv("GITHUB_ACTOR")
                password = properties.getProperty("github.token") ?: System.getenv("GITHUB_TOKEN")
            }
        }
    }
}
```

Add your GitHub username and token to `local.properties`:

```properties
github.username=your-username
github.token=your-personal-access-token
```
You can generate a personal access token by following the instructions in GitHub's documentation; the token needs the `read:packages` scope. Alternatively, set the credentials as environment variables: `GITHUB_ACTOR` and `GITHUB_TOKEN`.
Then add the dependency to your shared module's `build.gradle.kts`:

```kotlin
kotlin {
    sourceSets {
        commonMain {
            dependencies {
                implementation("com.cactus:library:0.3-beta.5")
            }
        }
    }
}
```

On Android, add the following permissions to your `AndroidManifest.xml`:

```xml
<uses-permission android:name="android.permission.INTERNET" />      <!-- for model downloads -->
<uses-permission android:name="android.permission.RECORD_AUDIO" />  <!-- for transcription -->
```

Initialize the Cactus context in your Activity's `onCreate()` method before using any SDK functionality:
```kotlin
import android.os.Bundle
import androidx.activity.ComponentActivity
import com.cactus.CactusContextInitializer

class MainActivity : ComponentActivity() {
    override fun onCreate(savedInstanceState: Bundle?) {
        super.onCreate(savedInstanceState)
        // Initialize the Cactus context (required)
        CactusContextInitializer.initialize(this)
        // ... rest of your code
    }
}
```

Optionally, initialize telemetry for usage analytics:

```kotlin
import com.cactus.services.CactusTelemetry

CactusTelemetry.setTelemetryToken("your_token_here")
```

The `CactusLM` class provides text completion with high-performance local inference.
```kotlin
import com.cactus.CactusLM
import com.cactus.CactusInitParams
import com.cactus.CactusCompletionParams
import com.cactus.ChatMessage
import kotlinx.coroutines.runBlocking

runBlocking {
    val lm = CactusLM()
    try {
        // Download a model by slug (e.g., "qwen3-0.6", "gemma3-270m").
        // If no model is specified, it defaults to "qwen3-0.6".
        val downloadSuccess = lm.downloadModel("qwen3-0.6")

        // Initialize the model
        val initSuccess = lm.initializeModel(
            CactusInitParams(
                model = "qwen3-0.6",
                contextSize = 2048
            )
        )

        // Generate a completion with default parameters
        val result = lm.generateCompletion(
            messages = listOf(
                ChatMessage(content = "Hello, how are you?", role = "user")
            )
        )
        result?.let { response ->
            if (response.success) {
                println("Response: ${response.response}")
                println("Tokens per second: ${response.tokensPerSecond}")
                println("Time to first token: ${response.timeToFirstTokenMs}ms")
            }
        }
    } finally {
        // Clean up
        lm.unload()
    }
}
```

To stream tokens as they are generated, pass an `onToken` callback:

```kotlin
runBlocking {
    val lm = CactusLM()

    // Download the model (defaults to "qwen3-0.6" if the model parameter is omitted)
    lm.downloadModel()
    lm.initializeModel(CactusInitParams())

    // Get the streaming response
    val result = lm.generateCompletion(
        messages = listOf(ChatMessage(content = "Tell me a story", role = "user")),
        onToken = { token, tokenId ->
            print(token)
        }
    )

    // Final result after streaming is complete
    result?.let {
        if (it.success) {
            println("\nFinal response: ${it.response}")
            println("Tokens per second: ${it.tokensPerSecond}")
        }
    }
    lm.unload()
}
```

To list the available models and check their capabilities:

```kotlin
runBlocking {
    val lm = CactusLM()

    // Get the list of available models
    val models = lm.getModels()
    models.forEach { model ->
        println("Model: ${model.name}")
        println("  Slug: ${model.slug}")
        println("  Size: ${model.size_mb} MB")
        println("  Tool calling: ${model.supports_tool_calling}")
        println("  Vision: ${model.supports_vision}")
        println("  Downloaded: ${model.isDownloaded}")
    }
}
```

Models that support tool calling can request function invocations. Define your tools and pass them via `CactusCompletionParams`:

```kotlin
import com.cactus.models.CactusTool
import com.cactus.models.ToolParametersSchema
import com.cactus.models.ToolParameter
runBlocking {
    val lm = CactusLM()
    lm.downloadModel()
    lm.initializeModel(CactusInitParams())

    val tools = listOf(
        createTool(
            name = "get_weather",
            description = "Get current weather for a location",
            parameters = mapOf(
                "location" to ToolParameter(
                    type = "string",
                    description = "City name",
                    required = true
                )
            )
        )
    )

    val result = lm.generateCompletion(
        messages = listOf(ChatMessage(content = "What's the weather in New York?", role = "user")),
        params = CactusCompletionParams(
            tools = tools
        )
    )

    result?.toolCalls?.forEach { toolCall ->
        println("Tool: ${toolCall.name}")
        println("Arguments: ${toolCall.arguments}")
    }
    lm.unload()
}
```
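The SDK only reports which tool the model wants to call; executing it is your application's job. Below is a minimal, illustrative dispatcher; the `when` branches, the canned result, and the argument handling are hypothetical and not part of the Cactus API:

```kotlin
// Illustrative only: route a model-requested tool call to app logic.
// `arguments` is treated as opaque here because its runtime type may
// vary between SDK versions.
fun dispatchToolCall(name: String, arguments: Any?): String = when (name) {
    "get_weather" -> "Sunny, 22°C" // hypothetical canned result; call a real weather API here
    else -> "Unknown tool: $name"
}

fun main() {
    // In a real app you would iterate result?.toolCalls from generateCompletion
    // and pass each call's name and arguments through the dispatcher.
    println(dispatchToolCall("get_weather", mapOf("location" to "New York")))
}
```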
Generating embeddings:

```kotlin
runBlocking {
    val lm = CactusLM()
    lm.downloadModel()
    lm.initializeModel(CactusInitParams())

    val result = lm.generateEmbedding(
        text = "The quick brown fox jumps over the lazy dog"
    )
    result?.let {
        if (it.success) {
            println("Embedding dimension: ${it.dimension}")
            println("First 5 values: ${it.embeddings.take(5)}")
        }
    }
    lm.unload()
}
```
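The `embeddings` list in `CactusEmbeddingResult` can be compared across texts for semantic similarity. A minimal cosine-similarity helper in plain Kotlin (not part of the SDK):

```kotlin
import kotlin.math.sqrt

// Cosine similarity between two embedding vectors, e.g. the `embeddings`
// lists from two CactusEmbeddingResult values. Returns a value in [-1, 1].
fun cosineSimilarity(a: List<Double>, b: List<Double>): Double {
    require(a.size == b.size) { "Embeddings must have the same dimension" }
    val dot = a.zip(b).sumOf { (x, y) -> x * y }
    val normA = sqrt(a.sumOf { it * it })
    val normB = sqrt(b.sumOf { it * it })
    return dot / (normA * normB)
}
```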
`CactusLM` supports multiple inference modes for flexibility between on-device and cloud-based processing. This is controlled by the `mode` parameter in `CactusCompletionParams`:

- `InferenceMode.LOCAL` (default): Performs inference locally on the device.
- `InferenceMode.REMOTE`: Performs inference using a remote API. Requires `cactusToken`.
- `InferenceMode.LOCAL_FIRST`: Attempts local inference first; if it fails, falls back to the remote API.
- `InferenceMode.REMOTE_FIRST`: Attempts remote inference first; if it fails, falls back to the local model.
Example using local-first fallback:
```kotlin
val result = lm.generateCompletion(
    messages = listOf(ChatMessage(content = "Hello!", role = "user")),
    params = CactusCompletionParams(
        mode = InferenceMode.LOCAL_FIRST,
        cactusToken = "your_api_token"
    )
)
```

The full `CactusLM` API:

- `suspend fun downloadModel(model: String = "qwen3-0.6"): Boolean` - Download an LLM model by slug (e.g., "qwen3-0.6", "gemma3-270m").
- `suspend fun initializeModel(params: CactusInitParams): Boolean` - Initialize a model for inference.
- `suspend fun generateCompletion(messages: List<ChatMessage>, params: CactusCompletionParams = CactusCompletionParams(), onToken: CactusStreamingCallback? = null): CactusCompletionResult?` - Generate a text completion. Supports streaming via the `onToken` callback and different inference modes (local, remote, and fallbacks).
- `suspend fun generateEmbedding(text: String, modelName: String? = null): CactusEmbeddingResult?` - Generate embeddings for the given text.
- `suspend fun getModels(): List<CactusModel>` - Get a list of available models. Results are cached locally to reduce network requests.
- `fun unload()` - Unload the current model and free resources.
- `fun isLoaded(): Boolean` - Check if a model is currently loaded.
Data types:

- `CactusInitParams(model: String?, contextSize: Int?)` - Parameters for model initialization.
- `CactusCompletionParams(temperature: Double, topK: Int, topP: Double, maxTokens: Int, stopSequences: List<String>, tools: List<CactusTool>, mode: InferenceMode, cactusToken: String?, model: String?)` - Parameters for text completion.
- `CactusCompletionResult(success: Boolean, response: String?, timeToFirstTokenMs: Double?, totalTimeMs: Double?, tokensPerSecond: Double?, prefillTokens: Int?, decodeTokens: Int?, totalTokens: Int?, toolCalls: List<ToolCall>?)` - The result of a text completion.
- `CactusEmbeddingResult(success: Boolean, embeddings: List<Double>, dimension: Int?, errorMessage: String?)` - The result of embedding generation.
- `ChatMessage(content: String, role: String, timestamp: Long?)` - A chat message with a role (e.g., "user", "assistant").
- `CactusModel(slug: String, name: String, download_url: String, size_mb: Int, supports_tool_calling: Boolean, supports_vision: Boolean, isDownloaded: Boolean, quantization: Int)` - Information about an available model.
- `InferenceMode` - Enum for selecting the inference mode (`LOCAL`, `REMOTE`, `LOCAL_FIRST`, `REMOTE_FIRST`).
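As a rough illustration of how these pieces fit together, the sketch below picks the smallest already-downloaded model and runs a bounded completion. The selection logic and parameter values are illustrative choices, not SDK requirements:

```kotlin
import com.cactus.CactusLM
import com.cactus.CactusInitParams
import com.cactus.CactusCompletionParams
import com.cactus.ChatMessage
import kotlinx.coroutines.runBlocking

runBlocking {
    val lm = CactusLM()

    // Prefer the smallest model already on the device
    val model = lm.getModels()
        .filter { it.isDownloaded }
        .minByOrNull { it.size_mb }
        ?: return@runBlocking

    lm.initializeModel(CactusInitParams(model = model.slug, contextSize = 1024))
    if (lm.isLoaded()) {
        val result = lm.generateCompletion(
            messages = listOf(ChatMessage(content = "Summarize Kotlin in one sentence.", role = "user")),
            params = CactusCompletionParams(temperature = 0.2, maxTokens = 64)
        )
        println(result?.response)
    }
    lm.unload()
}
```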
The `CactusSTT` class provides speech recognition using on-device models from providers such as Vosk and Whisper.
You can select a transcription provider when initializing `CactusSTT`. The available providers are:

- `TranscriptionProvider.VOSK` (default): Uses Vosk for transcription.
- `TranscriptionProvider.WHISPER`: Uses Whisper for transcription.
```kotlin
import com.cactus.CactusSTT
import com.cactus.TranscriptionProvider

// Initialize with the VOSK provider (default)
val sttVosk = CactusSTT()

// Or explicitly initialize with the WHISPER provider
val sttWhisper = CactusSTT(TranscriptionProvider.WHISPER)
```

Transcribing from the microphone with Vosk:

```kotlin
import com.cactus.CactusSTT
import com.cactus.SpeechRecognitionParams
import kotlinx.coroutines.runBlocking
runBlocking {
    val stt = CactusSTT() // Defaults to the VOSK provider

    // Download an STT model (e.g., vosk-en-us)
    val downloadSuccess = stt.download("vosk-en-us")

    // Initialize the model
    val initSuccess = stt.init("vosk-en-us")

    // Transcribe from the microphone
    val result = stt.transcribe(
        SpeechRecognitionParams(
            maxSilenceDuration = 1000L,
            maxDuration = 30000L,
            sampleRate = 16000
        )
    )
    result?.let { transcription ->
        if (transcription.success) {
            println("Transcribed: ${transcription.text}")
            println("Processing time: ${transcription.processingTime}ms")
        }
    }

    // Stop transcription
    stt.stop()
}
```

Transcribing an audio file with Whisper:

```kotlin
import com.cactus.CactusSTT
import com.cactus.SpeechRecognitionParams
import com.cactus.TranscriptionProvider
import kotlinx.coroutines.runBlocking
runBlocking {
    val stt = CactusSTT(TranscriptionProvider.WHISPER)

    // Download a Whisper model (e.g., whisper-tiny)
    val downloadSuccess = stt.download("whisper-tiny")

    // Initialize the model
    val initSuccess = stt.init("whisper-tiny")

    // Transcribe from an audio file
    val fileResult = stt.transcribe(
        params = SpeechRecognitionParams(),
        filePath = "/path/to/audio.wav"
    )
    fileResult?.let { transcription ->
        if (transcription.success) {
            println("Transcribed: ${transcription.text}")
        }
    }

    // Stop transcription
    stt.stop()
}
```

`CactusSTT` supports multiple transcription modes for flexibility between on-device and cloud-based processing. This is controlled by the `mode` parameter of the `transcribe` function:
- `TranscriptionMode.LOCAL` (default): Performs transcription locally on the device.
- `TranscriptionMode.REMOTE`: Performs transcription using a remote API (e.g., Wispr). Requires `filePath` and `apiKey`.
- `TranscriptionMode.LOCAL_FIRST`: Attempts local transcription first; if it fails, falls back to the remote API.
- `TranscriptionMode.REMOTE_FIRST`: Attempts remote transcription first; if it fails, falls back to the local model.
Example using local-first fallback for a file:
```kotlin
// Transcribe from an audio file with remote fallback
val fileResult = stt.transcribe(
    params = SpeechRecognitionParams(),
    filePath = "/path/to/audio.wav",
    mode = TranscriptionMode.LOCAL_FIRST,
    apiKey = "your_wispr_api_key"
)
```

You can get a list of available models for the configured provider:

```kotlin
// For VOSK (default)
val voskModels = CactusSTT().getVoiceModels()

// For WHISPER
val whisperModels = CactusSTT().getVoiceModels(TranscriptionProvider.WHISPER)

// Check whether a model is downloaded
stt.isModelDownloaded("vosk-en-us")
```

The full `CactusSTT` API:

- `CactusSTT(provider: TranscriptionProvider = TranscriptionProvider.VOSK)` - Constructor; selects the transcription provider.
- `suspend fun download(model: String): Boolean` - Download an STT model (e.g., "vosk-en-us" or "whisper-tiny-en").
- `suspend fun init(model: String): Boolean` - Initialize an STT model for transcription.
- `suspend fun transcribe(params: SpeechRecognitionParams = SpeechRecognitionParams(), filePath: String? = null, mode: TranscriptionMode = TranscriptionMode.LOCAL, apiKey: String? = null): SpeechRecognitionResult?` - Transcribe speech from the microphone or a file. Supports different transcription modes.
- `suspend fun warmUpWispr(apiKey: String)` - Warm up the remote Wispr service for lower latency.
- `fun stop()` - Stop an ongoing transcription.
- `fun isReady(): Boolean` - Check if the STT service is initialized and ready.
- `suspend fun getVoiceModels(provider: TranscriptionProvider = TranscriptionProvider.VOSK): List<VoiceModel>` - Get a list of available voice models for the given provider.
- `suspend fun isModelDownloaded(modelName: String): Boolean` - Check if a specific model has been downloaded.
Data types:

- `TranscriptionProvider` - Enum for selecting the provider (`VOSK`, `WHISPER`).
- `SpeechRecognitionParams(maxSilenceDuration: Long, maxDuration: Long, sampleRate: Int)` - Parameters controlling speech recognition.
- `SpeechRecognitionResult(success: Boolean, text: String?, processingTime: Double?)` - The result of a transcription.
- `VoiceModel` - Information about an available voice model.
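Putting the API together, the sketch below downloads a Vosk model only when it is missing, then transcribes a file with remote fallback. The model name, file path, and parameter values are placeholders, and the `TranscriptionMode` import path is assumed to follow the same pattern as the other imports above:

```kotlin
import com.cactus.CactusSTT
import com.cactus.SpeechRecognitionParams
import com.cactus.TranscriptionMode
import kotlinx.coroutines.runBlocking

runBlocking {
    val stt = CactusSTT() // VOSK provider (default)

    // Download only if the model is not already on the device
    if (!stt.isModelDownloaded("vosk-en-us")) {
        stt.download("vosk-en-us")
    }
    stt.init("vosk-en-us")

    if (stt.isReady()) {
        val result = stt.transcribe(
            params = SpeechRecognitionParams(maxDuration = 60000L),
            filePath = "/path/to/audio.wav",
            mode = TranscriptionMode.LOCAL_FIRST,
            apiKey = "your_wispr_api_key"
        )
        println(result?.text)
    }
    stt.stop()
}
```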
On Android:

- Works automatically; native libraries are included
- Requires API 24+ (Android 7.0)
- ARM64 architecture supported

On iOS:

- Add the Cactus package dependency in Xcode
- Requires iOS 12.0+
- Supports ARM64 and Simulator ARM64
To build the library from source:
```bash
# Build the library and publish to localMaven
./build_library.sh
```
Check out the example app in the `example/` directory for a complete Kotlin Multiplatform implementation showing:
- Model discovery and fetching available models
- Model downloading with progress tracking
- Text completion with both regular and streaming modes
- Speech-to-text transcription with multiple provider support (Vosk and Whisper)
- Voice model management and provider switching
- Embedding generation
- Function calling capabilities
- Error handling and status management
- Compose Multiplatform UI integration
To run the example:
```bash
cd example

# For desktop
./gradlew :composeApp:run

# For Android/iOS, use Android Studio or Xcode
```

Performance tips:

- Model Selection: Choose smaller models for faster inference on mobile devices
- Context Size: Reduce the context size for lower memory usage (e.g., 1024 instead of 2048)
- Memory Management: Always call `unload()` when done with a model
- Batch Processing: Reuse an initialized model for multiple completions (see the sketch after this list)
- Model Caching: Use `getModels()` for efficient model discovery; results are cached locally to reduce network requests
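A sketch applying these tips together; the model slug and prompts are placeholders:

```kotlin
import com.cactus.CactusLM
import com.cactus.CactusInitParams
import com.cactus.ChatMessage
import kotlinx.coroutines.runBlocking

runBlocking {
    val lm = CactusLM()
    lm.downloadModel("qwen3-0.6") // smaller model for faster mobile inference
    // Reduced context size for lower memory usage
    lm.initializeModel(CactusInitParams(model = "qwen3-0.6", contextSize = 1024))

    // Reuse the initialized model for several completions
    val prompts = listOf("Define RAM.", "Define CPU.", "Define GPU.")
    for (prompt in prompts) {
        val result = lm.generateCompletion(
            messages = listOf(ChatMessage(content = prompt, role = "user"))
        )
        println(result?.response)
    }

    lm.unload() // free resources once, when finished
}
```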
- Documentation
- Discord Community
- Issues
- Models on Hugging Face