Tags: lycying/llama.cpp

master-da5303c

bugfix: default should not be interactive (ggml-org#304)

master-d7def1a

Warn user if a context size greater than 2048 tokens is specified (ggml-org#274)

LLaMA doesn't support more than 2048 token context sizes, and going above that produces terrible results.
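
As an illustration of the check described above, here is a minimal Python sketch. The actual change lives in the project's own sources and is not shown on this page, so the function name `check_context_size`, the constant name, and the warning wording are all hypothetical; only the 2048-token limit comes from the commit message.

```python
import sys

# Limit stated in the commit message; the constant name is an assumption.
LLAMA_MAX_CONTEXT = 2048


def check_context_size(n_ctx, stream=sys.stderr):
    """Warn (without aborting) when n_ctx exceeds the supported context size.

    Returns True when the size is within the limit, False otherwise.
    """
    if n_ctx > LLAMA_MAX_CONTEXT:
        print(
            f"warning: context size {n_ctx} exceeds {LLAMA_MAX_CONTEXT} tokens; "
            "expect poor results",
            file=stream,
        )
        return False
    return True
```

The sketch only warns and lets the caller continue, which matches the commit title ("Warn user") rather than rejecting the value outright.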

master-c494ed5

ggerganov Georgi Gerganov
Fix off-by-one bug (ggml-org#115)

master-ad5fd5b

Improved quantize script (ggml-org#222)

* Improved quantize script

I improved the quantize script by adding error handling and allowing several models to be selected for quantization at once on the command line. I also converted it to Python to make it more general and extensible.

* Fixes and improvements based on Matt's observations

Fixed and improved many things in the script based on the reviews made by @mattsta. The parallelization suggestion is still to be revised, but code for it was still added (commented).

* Small fixes to the previous commit

* Corrected to use the original glob pattern

The original Bash script uses a glob pattern to match files that have endings such as ...bin.0, ...bin.1, etc. That has been translated correctly to Python now.

* Added support for Windows and updated README to use this script

New code to set the name of the quantize script binary depending on the platform has been added (quantize.exe if working on Windows) and the README.md file has been updated to use this script instead of the Bash one.

* Fixed a typo and removed shell=True in the subprocess.run call

Fixed a typo regarding the new filenames of the quantized models and removed the shell=True parameter in the subprocess.run call as it was conflicting with the list of parameters.

* Corrected previous commit

* Small tweak: changed the name of the program in argparse

This made the automatic help message show the program's usage literally as "$ Quantization Script [arguments]". It should now read something like "$ python3 quantize.py [arguments]".
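
The points in this commit message (glob matching of sharded files, a platform-dependent binary name, `subprocess.run` with an argument list and no `shell=True`, and the `argparse` program name) can be sketched roughly as follows. This is not the script from the commit: the helper names, the `ggml-model-f16.bin` file name, the `q4_0` output name, and the trailing `"2"` argument are assumptions made for illustration.

```python
import argparse
import glob
import os
import subprocess
import sys
from typing import List


def quantize_binary_name(platform: str = sys.platform) -> str:
    """Pick the executable name: quantize.exe on Windows, quantize elsewhere."""
    return "quantize.exe" if platform.startswith("win") else "quantize"


def find_model_parts(model_dir: str) -> List[str]:
    """Match the f16 model file and sharded parts ending in .bin.0, .bin.1, ...

    The base file name is an assumption, not taken from the commit.
    """
    pattern = os.path.join(model_dir, "ggml-model-f16.bin*")
    return sorted(glob.glob(pattern))


def build_command(binary: str, src: str) -> List[str]:
    """Build the argument list; output name and the "2" type id are assumed."""
    dst = src.replace("f16", "q4_0")
    return [binary, src, dst, "2"]


def run_quantize(cmd: List[str]) -> None:
    # Pass a list and leave shell=False (the default). With shell=True on
    # POSIX, only the first list element would be treated as the command.
    subprocess.run(cmd, check=True)


def build_parser() -> argparse.ArgumentParser:
    # Setting prog makes the generated usage line show an invocable command.
    parser = argparse.ArgumentParser(
        prog="python3 quantize.py",
        description="Quantize one or more models (illustrative sketch).",
    )
    parser.add_argument("models", nargs="+", help="model directories, e.g. 7B 13B")
    return parser
```

A caller would parse the arguments with `build_parser().parse_args()`, loop over `find_model_parts(...)` for each selected model, and hand each `build_command(...)` result to `run_quantize`.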

master-2456837

Support for multiple reverse prompts. (ggml-org#299)

Co-authored-by: Johnman <>
Co-authored-by: Johnman <tjohnman@github>

master-22213a1

ggerganov Georgi Gerganov
Change RMSNorm eps to 1e-6 (ggml-org#173)

I think this is what is used in the Python code
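
For context on where the epsilon enters, here is a minimal pure-Python sketch of RMS normalization with `eps = 1e-6`. It is an illustration only: the function name `rms_norm` is hypothetical, the learned per-channel weight that normally follows the normalization is omitted, and the project's own implementation is not shown on this page.

```python
import math


def rms_norm(x, eps=1e-6):
    """Scale x so its root mean square is about 1.

    eps is added to the mean square before the square root, which keeps the
    division stable when x is all zeros or very small.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_sq + eps)
    return [v * scale for v in x]
```

Because `eps` sits inside the square root, its value matters mostly for inputs with very small magnitude; for typical activations the output is nearly identical across small epsilon choices.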

master-368d0c8

Respect the maximum number of tokens in interactive. (ggml-org#298)

Co-authored-by: Johnman <johnman@github>
Co-authored-by: Georgi Gerganov

master-084e2f0

interactive mode: print '\n' in sigint_handler, this flush stdout thus ensure color reset. (ggml-org#283)

master-70f01cb

ggerganov Georgi Gerganov
Drop trailing new line from file prompts (ggml-org#80)

master-50fae10

Add --ignore-eos parameter (ggml-org#181)

Co-authored-by: Georgi Gerganov