Llama 2 on CPU, and Mac M1/M2 GPU

This is a fork of https://github.com/facebookresearch/llama that runs on CPU and Mac M1/M2 GPU (mps) if available.
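The fork's own source isn't reproduced here, but the "mps if available" behavior can be sketched with PyTorch's standard backend check (a minimal sketch, assuming a recent PyTorch with MPS support; `pick_device` is a hypothetical helper name, not the fork's API):

```python
import torch

def pick_device() -> str:
    """Return the device string to run inference on.

    Prefers Apple's Metal Performance Shaders (MPS) backend on
    M1/M2 Macs when available, otherwise falls back to CPU.
    """
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"

# Tensors and the model would then be moved to the chosen device,
# e.g. model.to(pick_device()).
```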

Installation and usage are identical to the official repository, so please refer to its instructions.


MacBook Pro M1 with 7B model:

  • MPS (default): ~4.3 words per second
  • CPU: ~0.67 words per second

During text generation, an extra message reports how many tokens have been generated so far and the speed at which they are being produced.
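The fork's exact reporting code isn't shown here; a minimal sketch of how such a token counter might work with only the standard library (`TokenRateMeter` is a hypothetical name introduced for illustration):

```python
import time

class TokenRateMeter:
    """Track how many tokens were generated and the rate in tokens/s."""

    def __init__(self) -> None:
        self.start = time.perf_counter()
        self.count = 0

    def tick(self, n: int = 1) -> None:
        """Record that n more tokens were generated."""
        self.count += n

    def rate(self) -> float:
        """Tokens generated per second since the meter was created."""
        elapsed = time.perf_counter() - self.start
        return self.count / elapsed if elapsed > 0 else 0.0

# In a generation loop one would call meter.tick() per token and
# periodically print f"{meter.count} tokens, {meter.rate():.1f} tok/s".
```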
