Add preload_models configuration option to load models into memory on startup #614
+187
−10
This PR implements a new `preload_models` configuration option that enables loading models into memory when the LocalAI server starts, improving response times by eliminating the initial model-loading delay on the first request.

## Changes Made

**Configuration Enhancement:**
- Added a `preload_models` boolean field to the `InferenceConfig` struct.

**LocalAI Integration:**
- Added a `generateLocalAIConfig()` function to create the appropriate preload configuration for LocalAI.
- Passed `--config-file=/config.yaml` to LocalAI when preload is enabled.

**Usage Example:**
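A minimal sketch of what enabling the option could look like in an AIKit file. The model name and the placement of the field are illustrative assumptions, not the confirmed schema; check the AIKit documentation for the exact layout:

```yaml
# aikitfile.yaml — illustrative sketch; field placement is an assumption
apiVersion: v1alpha1
models:
  - name: llama-3.2-1b
config:
  preload_models: true
```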
**Generated LocalAI Configuration:**

When `preload_models: true` is set, AIKit automatically generates a corresponding LocalAI configuration and passes it to the server via `--config-file=/config.yaml`.

## Benefits

Models are loaded at server startup, so the first request is served without the initial model-loading delay.

## Testing

Fixes #613.