This repository was archived by the owner on Jul 4, 2025. It is now read-only.

Hostfix: remove not needed params from load_model #2209

Merged
qnixsynapse merged 10 commits into dev from hostfix/remove_pooling on Jun 12, 2025

Conversation

qnixsynapse
Contributor

Describe Your Changes

  • Remove unneeded params from load_model and use llama.cpp defaults for most of the params

Fixes Issues

  • Closes #
  • Closes #

Self Checklist

  • Added relevant comments, esp in complex areas
  • Updated docs (for bug fixes / features)
  • Created issues for follow-up changes or refactoring needed

The --pooling flag was removed because mean pooling is not needed for chat models. This fixes the regression.
Adds support for the ctx_len parameter by appending --ctx-size with its value. Removed outdated parameter mappings from the kParamsMap to reflect current implementation details and ensure consistency.
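A minimal sketch of the ctx_len handling this commit message describes, not the code from the PR. The helper name `AppendCtxSize`, the argument-vector representation, and the rule that a non-positive value means "not set" are all assumptions made for illustration; only the `ctx_len` parameter and the `--ctx-size` flag come from the message above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (name and signature are assumptions, not from the PR).
// Appends "--ctx-size <value>" to the server arguments when ctx_len is set.
// Assumption: a non-positive ctx_len means "not set", so nothing is appended
// and the llama.cpp default context size applies.
void AppendCtxSize(std::vector<std::string>& args, int ctx_len) {
  if (ctx_len > 0) {
    args.push_back("--ctx-size");
    args.push_back(std::to_string(ctx_len));
  }
}
```

Passing the flag and its value as separate vector elements avoids the stray-whitespace problems that string concatenation can introduce.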
When the model path contains both "jan" and "nano" (case-insensitive), automatically add
speculative decoding parameters to adjust generation behavior. This improves
flexibility by enabling environment-specific configurations without manual
parameter tuning. Also includes necessary headers for string manipulation and
fixes whitespace in ctx_len handling.
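The case-insensitive path check described above could look roughly like the sketch below. The function names `ToLower` and `IsJanNanoModel` are hypothetical, and the sketch covers only the detection step; which parameters get appended afterwards is not shown because the message does not list them.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a lowercase copy of s.
// The cast to unsigned char avoids undefined behavior in std::tolower
// for negative char values.
std::string ToLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical predicate: true when the path contains both "jan" and "nano",
// ignoring case. This is a plain substring test, as the commit message implies.
bool IsJanNanoModel(const std::string& model_path) {
  const std::string lower = ToLower(model_path);
  return lower.find("jan") != std::string::npos &&
         lower.find("nano") != std::string::npos;
}
```

Because this is a substring match on the whole path, an unrelated directory name that happens to contain "jan" would also count, which is one reason such model-specific checks are fragile.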
Removed a redundant comment; the code's purpose is clear without it.
@qnixsynapse qnixsynapse enabled auto-merge (squash) June 12, 2025 06:59
qnixsynapse and others added 6 commits June 12, 2025 12:47
This commit introduces new configuration parameters and their corresponding command-line flags for the local engine. The changes include:
- Adding "flash_attn" to ignored parameters
- Mapping UI parameters to CLI flags (e.g., cpu_threads → --threads)
- Expanding support for various model configuration options

These additions enhance the flexibility of the local engine by enabling fine-grained control over performance and behavior through both UI and CLI interfaces.
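As a rough illustration of the mapping this commit describes, the sketch below converts UI parameters to CLI flags through a lookup table and skips ignored keys. Only the `kParamsMap` name, the `flash_attn` ignore entry, and the `cpu_threads` → `--threads` mapping come from this conversation; `kIgnoredParams`, `ToCliArgs`, the key/value-pair input, and the choice to drop unmapped keys silently are assumptions.

```cpp
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Parameters that are accepted from the UI but never forwarded.
// Only "flash_attn" is taken from the PR; the container name is hypothetical.
const std::unordered_set<std::string> kIgnoredParams = {"flash_attn"};

// UI parameter name -> CLI flag. Only this one entry is taken from the PR;
// the real table is described as covering more options.
const std::unordered_map<std::string, std::string> kParamsMap = {
    {"cpu_threads", "--threads"},
};

// Hypothetical converter: builds the argument list for the local engine.
// Assumption: keys without a mapping are dropped so the server default applies.
std::vector<std::string> ToCliArgs(
    const std::vector<std::pair<std::string, std::string>>& params) {
  std::vector<std::string> args;
  for (const auto& kv : params) {
    if (kIgnoredParams.count(kv.first) > 0) continue;
    const auto it = kParamsMap.find(kv.first);
    if (it == kParamsMap.end()) continue;
    args.push_back(it->second);
    args.push_back(kv.second);
  }
  return args;
}
```

For example, passing `flash_attn=true` and `cpu_threads=8` to this sketch yields the two arguments `--threads` and `8`.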
The condition was updated to include 'qwen' in the check for triggering specific parameters
('--temp', '--top-p', etc.), aligning it with the existing 'jan' and 'nano' validation logic. This allows
the same parameter configuration to apply to 'qwen' models as well as the original keywords.
Removed deprecated parameters such as "dynatemp_exponent" and "ctx_len" handling logic,
which were no longer needed. Added "flash_attn" back to the ignored parameters list.
Cleaned up the parameter conversion logic by removing conditional blocks for
specific model optimizations that are no longer required.
@qnixsynapse qnixsynapse merged commit 3a63826 into dev Jun 12, 2025
@qnixsynapse qnixsynapse deleted the hostfix/remove_pooling branch June 12, 2025 08:17
@github-project-automation github-project-automation bot moved this to QA in Jan Jun 12, 2025