
rain-collector

Collect your memories and customize your very own personalized AI. Emphasis on privacy.

Installation

  1. Clone the repo

More instructions TBD


Usage

Main CLI:

python rc --help

Gradio apps

gradio vllm_gradio_chat_stream.py --watch-dirs ./backend/


Folders

  • frontend/ is where the JS frontend goes
  • backend/ is where the Python backend goes; the model is implemented and served from here
  • cookbook/ is for short scripts that show you how to do stuff

Downloading Llama 3.1

  1. Visit host site: https://huggingface.co/meta-llama/Meta-Llama-3.1-8B
  2. Request access to the model
  3. Generate a Huggingface token
  4. Download Llama 3.1:
  • huggingface-cli download --repo-type model --token $HFTOKEN meta-llama/Meta-Llama-3.1-8B-Instruct, or
    • Minor note: the default download location is ~/.cache/huggingface/hub/
  • huggingface-cli download meta-llama/Meta-Llama-3.1-8B-Instruct --token $HFTOKEN --local-dir ./models/Meta-Llama-3.1-8B-Instruct
  • huggingface-cli download meta-llama/Meta-Llama-3.1-8B-Instruct --include "original/*" --local-dir Meta-Llama-3.1-8B-Instruct --token $HFTOKEN
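
If you'd rather do this from Python, here's a minimal sketch using the huggingface_hub API (assumes pip install huggingface_hub and that your account has been granted access to the repo):

    # Sketch: download the model via the huggingface_hub Python API.
    # Assumes `pip install huggingface_hub` and an access-granted HF token.
    from huggingface_hub import snapshot_download

    snapshot_download(
        repo_id="meta-llama/Meta-Llama-3.1-8B-Instruct",
        token="hf_...",  # or set the HF_TOKEN environment variable instead
        local_dir="./models/Meta-Llama-3.1-8B-Instruct",
    )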

Ollama

  • ollama pull llama3.1
  • ollama run llama3.1:8b-instruct-fp16
    curl http://localhost:11434/api/chat -d '{
      "model": "llama3.1",
      "messages": [
        {
          "role": "user",
          "content": "who wrote the book godfather?"
        }
      ],
      "stream": false
    }'
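
A minimal Python equivalent of the curl call above (assumes pip install requests and Ollama serving on its default port):

    # Sketch: same chat request as the curl example, via the `requests` package.
    # Assumes a local Ollama server on the default port 11434.
    import requests

    resp = requests.post(
        "http://localhost:11434/api/chat",
        json={
            "model": "llama3.1",
            "messages": [{"role": "user", "content": "who wrote the book godfather?"}],
            "stream": False,
        },
    )
    print(resp.json()["message"]["content"])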

TTS

  • import nltk; nltk.download('averaged_perceptron_tagger_eng')

Notes

  • Loading a GGUF model requires GGUF to be installed: pip install gguf
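
For example, a minimal sketch of loading a GGUF checkpoint through transformers (the repo id and filename below are illustrative, not a project default):

    # Sketch: load a GGUF quantized checkpoint via transformers.
    # Requires `pip install gguf`; repo id and filename are illustrative.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "TheBloke/Llama-2-7B-GGUF"    # illustrative repo
    gguf_file = "llama-2-7b.Q4_K_M.gguf"     # illustrative file
    tokenizer = AutoTokenizer.from_pretrained(model_id, gguf_file=gguf_file)
    model = AutoModelForCausalLM.from_pretrained(model_id, gguf_file=gguf_file)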

Voice

  • fish-speech: easy voice cloning
  • MeloTTS: decent default voices, but no ability to tune them; fish is better for fast cloning
  • tacotron2: TTS quality isn't as strong as other competitors
  • parler: sounds good, but exits partway through long blocks of text
  • Coqui/XTTS: not worth pursuing since development has been discontinued
  • StyleTTS2: incredibly hard to get fast inference working for a proof of concept
  • GPT-SoVITS: extremely hacky repo, documented in Chinese first; not enough benefits over fish-speech to justify using it, so dropped
  • suno/bark: slow inference, no voice cloning; has SOME non-speech attributes like [laughter]
  • kokoro: could not get it to load from build_model() based on the usage instructions
  • piper/piper-tts: the pip package referenced in the installation instructions does not exist; aimed at Raspberry Pi 4 usage


ASR

  • Transcribing a 4.5 h podcast with Whisper Large takes about 30 min
    • That works out to about 1 minute of runtime per 9 minutes of audio

https://huggingface.co/openai/whisper-large-v3
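
A minimal transcription sketch via the transformers pipeline (assumes pip install transformers torch; the audio filename is illustrative):

    # Sketch: transcribe a long audio file with Whisper large-v3.
    # "podcast.mp3" is an illustrative filename.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v3",
    )
    # Long-form audio is handled by chunking it into 30 s windows.
    result = asr("podcast.mp3", chunk_length_s=30, return_timestamps=True)
    print(result["text"])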

FAQ

  • Q: How do I opt out of ChromaDB's opt-out telemetry?
  • A:
    import chromadb
    from chromadb.config import Settings

    client = chromadb.Client(Settings(anonymized_telemetry=False))
    # or if using PersistentClient
    client = chromadb.PersistentClient(path="/path/to/save/to", settings=Settings(anonymized_telemetry=False))
  • Q: I'm missing flash_attn, how do I install?
  • A: pip install flash-attn --no-build-isolation

https://zulko.github.io/moviepy/getting_started/updating_to_v2.html

https://github.com/meta-llama/llama-recipes/tree/main

Wikipedia

Get


Questions to answer:

  • Where does Ollama save models after pull?

  • Note: $HUGGINGFACE_HUB_CACHE controls where Hugging Face downloads are cached (default ~/.cache/huggingface/hub/)

Evaluations and MMLU leaderboard

https://github.com/huggingface/evaluation-guidebook
https://github.com/huggingface/lighteval/wiki/Use-VLLM-as-backend
Source: https://github.com/huggingface/blog/blob/main/open-llm-leaderboard-mmlu.md

  • Original MMLU: compare the probabilities of the possible answers; use the highest-probability option as the response
  • HELM implementation: the expectation is that the correct answer will have the highest probability, otherwise scored false
  • AI Harness: comparison of a long-form response to a long-form answer

  • Self note: probabilities are summed (and probably normalized). If unsure or value is
  • Docs note: "For numerical stability we gather them by summing the logarithm of the probabilities and we can decide (or not) to compute a normalization in which we divide the sum by the number of tokens to avoid giving too much advantage to longer answers "
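
A minimal sketch of that scoring rule (summed token log-probabilities, optionally length-normalized), with made-up numbers:

    def score_answer(token_logprobs, length_normalize=True):
        # Summing log-probabilities is the numerically stable equivalent of
        # multiplying probabilities; dividing by token count avoids giving
        # an advantage to longer answers.
        total = sum(token_logprobs)
        return total / len(token_logprobs) if length_normalize else total

    # Pick the candidate answer with the highest (normalized) score.
    candidates = {
        "A": [-0.2, -1.1],        # made-up per-token log-probs
        "B": [-0.1, -0.5, -2.3],
    }
    print(max(candidates, key=lambda k: score_answer(candidates[k])))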

Notes for accepting future calls

  • Must be within a time limit (20-60s? Depends on topic)
  • Must not be too loud
  • Must not have prohibited words
  • Must not have gratuitous curse words
  • (Must be on topic?)
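
A hypothetical sketch of how these checks might compose; none of these names or thresholds exist in the repo yet, they're placeholders:

    # Hypothetical sketch of the acceptance rules above. All names and
    # thresholds are placeholder assumptions, not existing project code.
    PROHIBITED_WORDS = {"example_banned_word"}

    def accept_call(duration_s: float, peak_db: float, transcript: str,
                    max_duration_s: float = 60.0, max_peak_db: float = -3.0) -> bool:
        if duration_s > max_duration_s:       # must be within a time limit
            return False
        if peak_db > max_peak_db:             # must not be too loud
            return False
        words = set(transcript.lower().split())
        if words & PROHIBITED_WORDS:          # must not have prohibited words
            return False
        # Gratuitous-curse and on-topic checks TBD.
        return True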

Other resources

https://github.com/langchain-ai/rag-from-scratch?tab=readme-ov-file

About

Collect your knowledge and memories, and let your personal AI leverage them to water your life.
