
[pull] main from abetlen:main #61


Merged: 361 commits from abetlen:main into MZWNET:main on Feb 28, 2025

Conversation


@pull pull bot commented Apr 9, 2024

See Commits and Changes for more details.



abetlen and others added 28 commits May 10, 2024 09:44
Bumps [pypa/cibuildwheel](https://github.com/pypa/cibuildwheel) from 2.17.0 to 2.18.0.
- [Release notes](https://github.com/pypa/cibuildwheel/releases)
- [Changelog](https://github.com/pypa/cibuildwheel/blob/main/docs/changelog.md)
- [Commits](pypa/cibuildwheel@v2.17.0...v2.18.0)

---
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Special tokens are already mapped from metadata by llama.cpp
… (#1333)

* implement min_tokens

* set default to 0

* pass min_tokens

* fix

* remove copy

* implement MinTokensLogitsProcessor

* format

* fix condition
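
The min_tokens commits above describe a logits processor that blocks end-of-sequence sampling until a minimum number of tokens has been generated. Below is a minimal sketch of what `MinTokensLogitsProcessor` might look like, following llama-cpp-python's logits-processor convention (a callable over `input_ids` and `scores` arrays); the prompt-length bookkeeping on the first call is an assumption, not necessarily the exact shipped code:

```python
from typing import Optional

import numpy as np
import numpy.typing as npt


class MinTokensLogitsProcessor:
    """Sketch: suppress EOS until at least `min_tokens` have been generated."""

    def __init__(self, min_tokens: int, token_eos: int):
        self.min_tokens = min_tokens
        self.token_eos = token_eos
        self.prompt_tokens: Optional[int] = None

    def __call__(
        self,
        input_ids: npt.NDArray[np.intc],
        scores: npt.NDArray[np.single],
    ) -> npt.NDArray[np.single]:
        # Remember the prompt length on the first call (assumption: the
        # processor is first invoked before any tokens are generated).
        if self.prompt_tokens is None:
            self.prompt_tokens = len(input_ids)
        # Until min_tokens have been generated past the prompt, make the
        # EOS token impossible to sample.
        if len(input_ids) - self.prompt_tokens < self.min_tokens:
            scores[self.token_eos] = -np.inf
        return scores
```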
updated-dependencies:
- dependency-name: pypa/cibuildwheel
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrei <[email protected]>
feloy and others added 29 commits December 6, 2024 04:47
Add support for Cuda 12.6.1
Update version of Cuda 12.5.0 to 12.5.1

Co-authored-by: Andrei <[email protected]>
* fix: added missing exit_stack.close() to /v1/chat/completions

* fix: added missing exit_stack.close() to /v1/completions
fix: make use of asyncio to lock llama_proxy context (#1798)

* fix: make use of asyncio to lock llama_proxy context

* fix: use aclose instead of close for AsyncExitStack

* fix: don't call exit stack close in stream iterator as it will be called by finally from on_complete anyway

* fix: use anyio.Lock instead of asyncio.Lock

---------

Co-authored-by: Andrei <[email protected]>
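
A rough sketch of the locking pattern these commits describe, written as a FastAPI-style dependency: an `anyio.Lock` (which works under Starlette's anyio event loop regardless of backend, unlike `asyncio.Lock`) serializes access to the shared proxy, and the `AsyncExitStack` is shut down with `aclose()`. The `llama_proxy_context` stand-in is hypothetical, not the real server code:

```python
import anyio
from contextlib import AsyncExitStack, asynccontextmanager


@asynccontextmanager
async def llama_proxy_context():
    # Stand-in for the real proxy context manager (hypothetical).
    yield object()


llama_lock = anyio.Lock()


async def get_llama_proxy():
    # FastAPI-style dependency: one request at a time gets the proxy.
    exit_stack = AsyncExitStack()
    await llama_lock.acquire()
    try:
        proxy = await exit_stack.enter_async_context(llama_proxy_context())
        yield proxy
    finally:
        # aclose(), not close(): the stack may hold async context managers.
        await exit_stack.aclose()
        llama_lock.release()
```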
… (#1793)

- Replaced deprecated llama_sample_token with llama_sampler_sample
- Updated llama_token_to_piece signature to include lstrip and special arguments
- Changed llama_load_model_from_file to use positional argument for params (currently does not accept a keyword argument for it)
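
For illustration, a hedged sketch of calling the updated `llama_token_to_piece` binding with the two new arguments; the exact binding signature tracks llama.cpp and may differ between versions:

```python
import ctypes

import llama_cpp


def token_to_bytes(model, token: int, special: bool = False) -> bytes:
    # 32 bytes is enough for a single piece in practice; a negative
    # return value would indicate the buffer was too small.
    buf = ctypes.create_string_buffer(32)
    n = llama_cpp.llama_token_to_piece(
        model,      # llama_model pointer
        token,      # token id to detokenize
        buf,        # output buffer
        len(buf),   # buffer capacity
        0,          # lstrip: leading spaces to strip (new argument)
        special,    # whether to render special/control tokens (new argument)
    )
    return buf.raw[:n]
```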
* fix: chat API logprobs format

* Fix optional properties
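
For reference, the OpenAI chat API shapes logprobs as a `content` list of per-token entries, each with optional `top_logprobs`; a sketch of that structure follows (values are illustrative, and which optional fields the fix populates is an assumption):

```python
# Illustrative values only.
chat_logprobs = {
    "content": [
        {
            "token": "Hello",
            "logprob": -0.31,
            "bytes": [72, 101, 108, 108, 111],
            "top_logprobs": [
                {"token": "Hello", "logprob": -0.31},
                {"token": "Hi", "logprob": -1.42},
            ],
        },
    ],
}
```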
* fix: correct issue with handling lock during streaming

move locking for streaming into the get_event_publisher call so it is locked and unlocked in the correct task for the streaming response

* fix: simplify exit stack management for create_chat_completion and create_completion

* fix: correct missing `async with` and format code

* fix: remove unnecessary explicit use of AsyncExitStack

fix: correct type hints for body_model

---------

Co-authored-by: Andrei <[email protected]>
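
The fix above hinges on the lock being acquired and released in the same task that produces the stream (anyio locks must be released by the task that acquired them). A sketch of the shape this takes; `inner_lock` and `iterator_factory` are illustrative placeholders, not the exact server code:

```python
import anyio
from anyio.streams.memory import MemoryObjectSendStream

inner_lock = anyio.Lock()


async def get_event_publisher(
    send_chan: MemoryObjectSendStream,
    iterator_factory,
):
    # Acquire the lock inside the streaming task itself, so acquire and
    # release happen in the same task rather than in the request handler.
    async with inner_lock:
        async with send_chan:
            async for chunk in iterator_factory():
                await send_chan.send(chunk)
```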
* feat: Sync with llama.cpp

Add `no_perf` field to `llama_context_params` to optionally disable performance timing measurements.

* fix: Display performance metrics by default

---------

Co-authored-by: Andrei <[email protected]>
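
A short sketch of the new option, assuming the high-level `Llama` constructor forwards `no_perf` to `llama_context_params` the way it does other context fields (the model path is a placeholder):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="models/model.gguf",  # placeholder path
    no_perf=False,  # True skips llama.cpp's performance timing; metrics stay on by default
)
```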
@pull pull bot merged commit 710e19a into MZWNET:main Feb 28, 2025