@Copilot Copilot AI commented Aug 22, 2025

This PR implements a new `preload_models` configuration option that loads models into memory when the LocalAI server starts, improving response times by eliminating the model-loading delay on the first request.

Changes Made

Configuration Enhancement:

  • Added preload_models boolean field to the InferenceConfig struct
  • Updated YAML parsing to support the new configuration option
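Roughly, the configuration change looks like the following stdlib-only Go sketch. `InferenceConfig` here is a stand-in (the actual AIKit struct and its YAML library usage differ), and `parseConfig` only mimics parsing of top-level boolean keys:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// InferenceConfig mirrors the shape of the AIKit config; only the new
// PreloadModels field comes from this PR, the rest are assumed names.
type InferenceConfig struct {
	Debug         bool
	PreloadModels bool
}

// parseConfig is a stdlib-only stand-in for the real YAML parsing:
// it only reads top-level "key: value" lines with boolean values.
func parseConfig(doc string) InferenceConfig {
	var cfg InferenceConfig
	for _, line := range strings.Split(doc, "\n") {
		key, val, ok := strings.Cut(line, ":")
		if !ok {
			continue
		}
		b, err := strconv.ParseBool(strings.TrimSpace(val))
		if err != nil {
			continue // non-boolean values (apiVersion, URLs, ...) are skipped
		}
		switch strings.TrimSpace(key) {
		case "debug":
			cfg.Debug = b
		case "preload_models":
			cfg.PreloadModels = b
		}
	}
	return cfg
}

func main() {
	cfg := parseConfig("apiVersion: v1alpha1\ndebug: true\npreload_models: true")
	fmt.Println(cfg.PreloadModels)
}
```

The field defaults to `false` when absent, which is what keeps existing configurations working unchanged.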

LocalAI Integration:

  • Implemented generateLocalAIConfig() function to create appropriate preload configuration for LocalAI
  • Modified container image configuration to automatically include --config-file=/config.yaml when preload is enabled
  • Generates proper preload configuration format with model IDs and filenames

Usage Example:

```yaml
apiVersion: v1alpha1
debug: true
preload_models: true
models:
  - name: llama-3.2-1b-instruct
    source: https://huggingface.co/MaziyarPanahi/Llama-3.2-1B-Instruct-GGUF/resolve/main/Llama-3.2-1B-Instruct.Q4_K_M.gguf
    sha256: "e4650dd6b45ef456066b11e4927f775eef4dd1e0e8473c3c0f27dd19ee13cc4e"
```

Generated LocalAI Configuration:

When `preload_models: true` is set, AIKit automatically generates:

```yaml
preload_models:
  - id: llama-3.2-1b-instruct
    name: Llama-3.2-1B-Instruct.Q4_K_M.gguf
    preload: true
```

Benefits

  • Faster Response Times: Models are loaded into memory on startup rather than on first request
  • Improved User Experience: Eliminates the "cold start" delay when making the first inference request
  • Production Ready: Works with all supported model sources (HTTP, OCI, local files)
  • Backward Compatible: Existing configurations continue to work unchanged

Testing

  • Added comprehensive unit tests for configuration parsing and LocalAI config generation
  • Updated existing tests to cover the new functionality
  • Added documentation with examples and usage guidance
  • All tests pass and linting is clean
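A table-driven check in the spirit of those unit tests might look like this; `fileFromSource` is a hypothetical helper illustrating the filename-derivation behavior, not the actual AIKit function:

```go
package main

import (
	"fmt"
	"strings"
)

// fileFromSource derives the model filename used in a preload entry;
// a hypothetical helper mirroring the behavior under test.
func fileFromSource(src string) string {
	parts := strings.Split(src, "/")
	return parts[len(parts)-1]
}

func main() {
	// Table-driven cases, as is idiomatic for Go unit tests.
	cases := []struct{ src, want string }{
		{"https://huggingface.co/org/repo/resolve/main/model.Q4_K_M.gguf", "model.Q4_K_M.gguf"},
		{"model.gguf", "model.gguf"},
	}
	for _, c := range cases {
		if got := fileFromSource(c.src); got != c.want {
			panic(fmt.Sprintf("fileFromSource(%q) = %q, want %q", c.src, got, c.want))
		}
	}
	fmt.Println("ok")
}
```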

Fixes #613.

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/repos/mudler/LocalAI/releases/tags/v3.4.0
    • Triggering command: curl -s REDACTED (http block)
  • localai.io
    • Triggering command: curl -s REDACTED (dns block)

If you need me to access, download, or install something from one of these locations, you can either:



@Copilot Copilot AI changed the title [WIP] [REQ] load to memory on startup Add preload_models configuration option to load models into memory on startup Aug 22, 2025
@Copilot Copilot AI requested a review from sozercan August 22, 2025 06:12
Copilot finished work on behalf of sozercan August 22, 2025 06:12