Support for 4 bit Quantization #580

@vikigenius

Description

Language-model progress has been rapid recently, and with the LLaMA weights being released, a lot of progress is being made on the C++ side:

https://github.com/ggerganov/llama.cpp

I see that fp16 support is on the roadmap.

But it might also be a good idea to consider supporting 4-bit quantization and related techniques. Is that something that will be considered?
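For context, the scheme used in llama.cpp splits each weight tensor into small blocks and stores one scale per block plus a 4-bit integer per weight. Below is a simplified, hedged sketch of that idea in plain Python; the block size, scale formula, and function names are illustrative and do not reproduce llama.cpp's exact on-disk format.

```python
# Simplified sketch of block-wise symmetric 4-bit quantization,
# similar in spirit to (but not byte-identical with) llama.cpp's Q4_0.
# Each block of 32 floats is stored as one float scale plus 32 small ints.

BLOCK = 32  # illustrative block size

def quantize_q4(values):
    """Quantize floats to 4-bit ints per block; returns (scale, ints) pairs."""
    blocks = []
    for i in range(0, len(values), BLOCK):
        chunk = values[i:i + BLOCK]
        amax = max(abs(v) for v in chunk) or 1.0
        scale = amax / 7.0  # map [-amax, amax] onto the int range [-7, 7]
        q = [max(-8, min(7, round(v / scale))) for v in chunk]
        blocks.append((scale, q))
    return blocks

def dequantize_q4(blocks):
    """Reconstruct approximate floats from (scale, ints) blocks."""
    out = []
    for scale, q in blocks:
        out.extend(v * scale for v in q)
    return out

weights = [0.12, -0.5, 0.33, 0.9, -0.07] * 8  # 40 dummy weights
restored = dequantize_q4(quantize_q4(weights))
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"max abs error: {max_err:.4f}")
```

The round-trip error per weight is bounded by half the block scale, which is why storage drops roughly 8x versus fp32 while quality degrades only modestly.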
